onnx 119

AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon Pro W7900 on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2408227-PTS-ONNX119192&grs.

onnx 119ProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionabcdeAMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads)System76 Thelio Major (FA Z5 BIOS)AMD Device 14a44 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA21000GB CT1000T700SSD5AMD Radeon Pro W7900AMD Device 14ccDELL P2415QAquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6EUbuntu 24.106.8.0-31-generic (x86_64)GNOME ShellX Server + Wayland4.6 Mesa 24.0.9-0ubuntu2 (LLVM 17.0.6 DRM 3.57)GCC 14.2.0ext41920x1200OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-F5tscv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-F5tscv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105Python Details- Python 3.12.5Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

onnx 119onnx: CaffeNet 12-int8 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Standardonnx: ZFNet-512 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Standardonnx: yolov4 - CPU - Parallelsvt-av1: Preset 13 - Bosphorus 1080ponnx: super-resolution-10 - CPU - Standardonnx: ZFNet-512 - CPU - Standardonnx: bertsquad-12 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallelonnx: yolov4 - CPU - Standardonnx: bertsquad-12 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Parallelonnx: T5 Encoder - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: GPT-2 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Parallelsvt-av1: Preset 8 - Bosphorus 1080psvt-av1: Preset 13 - Bosphorus 4Konnx: ResNet50 v1-12-int8 - CPU - Parallelsvt-av1: Preset 3 - Bosphorus 4Konnx: super-resolution-10 - CPU - Parallelsvt-av1: Preset 8 - Beauty 4K 10-bitonnx: ResNet101_DUC_HDC-12 - CPU - Standardsvt-av1: Preset 5 - Bosphorus 4Konnx: T5 Encoder - CPU - Parallelsvt-av1: Preset 8 - Bosphorus 4Konnx: GPT-2 - CPU - Parallelsvt-av1: Preset 5 - Bosphorus 1080psvt-av1: Preset 3 - Beauty 4K 10-bitsvt-av1: Preset 3 - Bosphorus 1080psvt-av1: Preset 13 - Beauty 4K 10-bitsvt-av1: Preset 5 - Beauty 4K 10-bitonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: super-resolution-10 - CPU - Standardonnx: super-resolution-10 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Parallelonnx: bertsquad-12 - CPU - Standardonnx: bertsquad-12 - CPU - Parallelonnx: T5 Encoder - CPU - Standardonnx: T5 Encoder - CPU - Parallelonnx: ZFNet-512 - CPU - Standardonnx: ZFNet-512 - CPU - Parallelonnx: yolov4 - CPU - Standardonnx: yolov4 - CPU - Parallelonnx: GPT-2 - CPU - Standardonnx: GPT-2 - CPU - Parallelabcde204.479454.72353.406838.34765.27246740.001100.592781.77526.76854201.10328.72598.7600712.673012.6039140.95844.5600105.0391.60987271.676231.11174.206113.398131.31310.8442.6824845.614335.50396.210160.706119.2211.86436.60519.5207.77522.441534.8288372.789621.2269.944977.614174.9718713.474826.080379.3400257.9903.92686908.2961.103582.198414.8892578.9209147.8147.094422.9793812.226818.7297114.152189.6779.518326.21628200.840460.01654.577936.96585.12789740.790100.683081.41586.86262201.42428.87668.7125112.737712.4756142.11444.4566105.7911.59999269.879233.24874.362413.350130.48910.8272.7014545.343338.92896.423161.407119.2291.85736.55319.4177.80122.491834.6450370.177625.0609.936547.662424.9643713.448227.058280.1594252.3354.00404913.5391.098252.173394.9949078.5721145.7847.036942.9493612.293918.3266114.827195.1289.451286.18911178.126442.11750.742138.78165.30486764.003100.87682.49536.9174194.34327.65269.0561913.179612.3908141.23144.0419105.591.61916266.006232.39673.120813.53130.76610.8522.6910545.662339.17596.06162.175119.7681.86736.66419.3967.81122.703236.1605371.599617.69.912847.64595.1448913.674625.783680.7021291.4443.43115913.4941.094692.261265.6123675.8716144.5597.080042.946912.118619.7054110.418188.59.468536.15989212.81419.42154.30439.21015.00717783.122105.02181.66737.10593203.83527.61398.7927912.997112.6112138.16445.2883105.1461.58526269.428228.83473.474913.381132.22110.8732.6696645.165335.85997.046161.862118.7981.87436.70719.4427.80922.078336.2111374.577630.8059.521657.56184.9050313.608825.501279.2916288.1243.47069873.4671.144862.383554.6974276.9365140.7247.237342.9760812.242618.4128113.726199.7079.508916.17185210.622440.35454.415536.685.11842749.25799.408378.15527.02121202.69227.75729.0878812.742812.845138.60544.1099107.7231.6101268.76228.9774.330213.413131.26910.9652.6941345.41338.70396.775162.245119.9231.86736.45619.4417.76522.66836.0242371.175621.07610.05927.616594.9328513.452227.260477.8483276.9233.61108889.2311.124562.270194.7462778.4717142.4227.214152.9512412.792318.3751110.033195.3669.281126.15756OpenBenchmarking.org

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallelabcde50100150200250SE +/- 1.28, N = 3SE +/- 3.47, N = 12204.48200.84178.13212.81210.621. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardabcde100200300400500SE +/- 1.43, N = 3SE +/- 3.83, N = 3454.72460.02442.12419.42440.351. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Parallelabcde1224364860SE +/- 0.75, N = 3SE +/- 0.73, N = 353.4154.5850.7454.3054.421. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardabcde918273645SE +/- 0.40, N = 3SE +/- 0.45, N = 338.3536.9738.7839.2136.681. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Parallelabcde1.19362.38723.58084.77445.968SE +/- 0.03706, N = 3SE +/- 0.03454, N = 155.272465.127895.304865.007175.118421. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Bosphorus 1080pabcde2004006008001000SE +/- 7.99, N = 4SE +/- 8.40, N = 4740.00740.79764.00783.12749.261. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardabcde20406080100SE +/- 0.94, N = 6SE +/- 1.11, N = 5100.59100.68100.88105.0299.411. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardabcde20406080100SE +/- 0.40, N = 3SE +/- 0.72, N = 1581.7881.4282.5081.6778.161. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Parallelabcde246810SE +/- 0.05485, N = 9SE +/- 0.07644, N = 56.768546.862626.917407.105937.021211. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardabcde4080120160200SE +/- 1.01, N = 3SE +/- 1.52, N = 3201.10201.42194.34203.84202.691. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallelabcde714212835SE +/- 0.21, N = 11SE +/- 0.32, N = 528.7328.8827.6527.6127.761. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardabcde3691215SE +/- 0.01899, N = 3SE +/- 0.10839, N = 48.760078.712519.056198.792799.087881. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardabcde3691215SE +/- 0.13, N = 3SE +/- 0.10, N = 1512.6712.7413.1813.0012.741. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelabcde3691215SE +/- 0.05, N = 3SE +/- 0.08, N = 312.6012.4812.3912.6112.851. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standardabcde306090120150SE +/- 0.88, N = 3SE +/- 1.08, N = 3140.96142.11141.23138.16138.611. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardabcde1020304050SE +/- 0.33, N = 3SE +/- 0.14, N = 344.5644.4644.0445.2944.111. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardabcde20406080100SE +/- 0.21, N = 3SE +/- 0.58, N = 3105.04105.79105.59105.15107.721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallelabcde0.36430.72861.09291.45721.8215SE +/- 0.01149, N = 3SE +/- 0.01116, N = 31.609871.599991.619161.585261.610101. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Bosphorus 1080pabcde60120180240300SE +/- 0.46, N = 3SE +/- 0.65, N = 3271.68269.88266.01269.43268.761. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Bosphorus 4Kabcde50100150200250SE +/- 1.99, N = 8SE +/- 2.31, N = 3231.11233.25232.40228.83228.971. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallelabcde20406080100SE +/- 0.25, N = 3SE +/- 0.72, N = 374.2174.3673.1273.4774.331. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Bosphorus 4Kabcde3691215SE +/- 0.02, N = 3SE +/- 0.04, N = 313.4013.3513.5313.3813.411. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Parallelabcde306090120150SE +/- 0.20, N = 3SE +/- 0.45, N = 3131.31130.49130.77132.22131.271. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Beauty 4K 10-bitabcde3691215SE +/- 0.04, N = 3SE +/- 0.05, N = 310.8410.8310.8510.8710.971. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standardabcde0.60781.21561.82342.43123.039SE +/- 0.00331, N = 3SE +/- 0.00886, N = 32.682482.701452.691052.669662.694131. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Bosphorus 4Kabcde1020304050SE +/- 0.18, N = 3SE +/- 0.22, N = 345.6145.3445.6645.1745.411. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Parallelabcde70140210280350SE +/- 0.69, N = 3SE +/- 1.56, N = 3335.50338.93339.18335.86338.701. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Bosphorus 4Kabcde20406080100SE +/- 0.26, N = 3SE +/- 0.79, N = 396.2196.4296.0697.0596.781. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Parallelabcde4080120160200SE +/- 0.25, N = 3SE +/- 0.37, N = 3160.71161.41162.18161.86162.251. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Bosphorus 1080pabcde306090120150SE +/- 0.52, N = 3SE +/- 0.35, N = 3119.22119.23119.77118.80119.921. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Beauty 4K 10-bitabcde0.42170.84341.26511.68682.1085SE +/- 0.002, N = 3SE +/- 0.004, N = 31.8641.8571.8671.8741.8671. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Bosphorus 1080pabcde816243240SE +/- 0.03, N = 3SE +/- 0.01, N = 336.6136.5536.6636.7136.461. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Beauty 4K 10-bitabcde510152025SE +/- 0.03, N = 3SE +/- 0.04, N = 319.5219.4219.4019.4419.441. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Beauty 4K 10-bitabcde246810SE +/- 0.029, N = 3SE +/- 0.008, N = 37.7757.8017.8117.8097.7651. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardabcde510152025SE +/- 0.16, N = 3SE +/- 0.07, N = 322.4422.4922.7022.0822.671. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallelabcde816243240SE +/- 0.26, N = 11SE +/- 0.39, N = 534.8334.6536.1636.2136.021. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standardabcde80160240320400SE +/- 0.46, N = 3SE +/- 1.22, N = 3372.79370.18371.60374.58371.181. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallelabcde140280420560700SE +/- 4.43, N = 3SE +/- 4.33, N = 3621.23625.06617.60630.81621.081. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardabcde3691215SE +/- 0.08968, N = 6SE +/- 0.10599, N = 59.944979.936549.912849.5216510.059201. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Parallelabcde246810SE +/- 0.01157, N = 3SE +/- 0.02614, N = 37.614177.662427.645907.561807.616591. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardabcde1.15762.31523.47284.63045.788SE +/- 0.02497, N = 3SE +/- 0.03731, N = 34.971874.964375.144894.905034.932851. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallelabcde48121620SE +/- 0.04, N = 3SE +/- 0.13, N = 313.4713.4513.6713.6113.451. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardabcde612182430SE +/- 0.27, N = 3SE +/- 0.33, N = 326.0827.0625.7825.5027.261. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelabcde20406080100SE +/- 0.34, N = 3SE +/- 0.51, N = 379.3480.1680.7079.2977.851. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardabcde60120180240300SE +/- 7.70, N = 15SE +/- 6.89, N = 15257.99252.34291.44288.12276.921. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardabcde0.90091.80182.70273.60364.5045SE +/- 0.12199, N = 15SE +/- 0.10773, N = 153.926864.004043.431153.470693.611081. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelabcde2004006008001000SE +/- 11.68, N = 15SE +/- 13.79, N = 15908.30913.54913.49873.47889.231. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelabcde0.25760.51520.77281.03041.288SE +/- 0.01462, N = 15SE +/- 0.01709, N = 151.103581.098251.094691.144861.124561. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardabcde0.53631.07261.60892.14522.6815SE +/- 0.00686, N = 3SE +/- 0.01799, N = 32.198412.173392.261262.383552.270191. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallelabcde1.26282.52563.78845.05126.314SE +/- 0.03038, N = 3SE +/- 0.09195, N = 124.889254.994905.612364.697424.746271. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardabcde20406080100SE +/- 0.83, N = 3SE +/- 0.63, N = 1578.9278.5775.8776.9478.471. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Parallelabcde306090120150SE +/- 1.16, N = 9SE +/- 1.59, N = 5147.81145.78144.56140.72142.421. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standardabcde246810SE +/- 0.04465, N = 3SE +/- 0.05397, N = 37.094427.036947.080047.237347.214151. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Parallelabcde0.67041.34082.01122.68163.352SE +/- 0.00618, N = 3SE +/- 0.01351, N = 32.979382.949362.946902.976082.951241. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardabcde3691215SE +/- 0.06, N = 3SE +/- 0.11, N = 1512.2312.2912.1212.2412.791. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Parallelabcde510152025SE +/- 0.27, N = 3SE +/- 0.25, N = 318.7318.3319.7118.4118.381. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardabcde306090120150SE +/- 0.25, N = 3SE +/- 1.43, N = 4114.15114.83110.42113.73110.031. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Parallelabcde4080120160200SE +/- 1.34, N = 3SE +/- 1.30, N = 15189.68195.13188.50199.71195.371. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardabcde3691215SE +/- 0.01924, N = 3SE +/- 0.05188, N = 39.518329.451289.468539.508919.281121. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Parallelabcde246810SE +/- 0.00960, N = 3SE +/- 0.01421, N = 36.216286.189116.159896.171856.157561. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt


Phoronix Test Suite v10.8.5