n1n1 ARMv8 Neoverse-N1 testing with a GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403174-NE-N1N13670960&gru .
n1n1 system details (runs: a, aa, b, c)
Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 800GB Micron_7450_MTFDKBA800TFS
Graphics: ASPEED
Monitor: VGA HDMI
Network: 2 x Intel I350
OS: Ubuntu 23.10
Kernel: 6.5.0-15-generic (aarch64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1024x768
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Details - Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Details - Python 3.11.6
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
n1n1 openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU svt-av1: Preset 4 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 12 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 4 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 12 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous 
Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream srsran: PDSCH Processor Benchmark, Throughput Total srsran: PUSCH Processor Benchmark, Throughput Total srsran: PDSCH Processor Benchmark, Throughput Thread srsran: PUSCH Processor Benchmark, Throughput Thread jpegxl: PNG - 80 jpegxl: PNG - 90 jpegxl: JPEG - 80 jpegxl: JPEG - 90 jpegxl: PNG - 100 jpegxl: JPEG - 100 jpegxl-decode: 1 jpegxl-decode: All stockfish: Chess Benchmark onednn: IP Shapes 1D - CPU onednn: IP Shapes 3D - CPU onednn: Convolution Batch Shapes Auto - CPU onednn: Deconvolution Batch shapes_1d - CPU onednn: Deconvolution Batch shapes_3d - CPU onednn: Recurrent Neural Network Training - CPU onednn: Recurrent Neural Network Inference - CPU draco: Lion draco: Church Facade openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU 
openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream 
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream build-linux-kernel: defconfig build-linux-kernel: allmodconfig compress-pbzip2: FreeBSD-13.0-RELEASE-amd64-memstick.img Compression primesieve: 1e12 primesieve: 1e13 encode-wavpack: WAV To WavPack a aa b c 2.652 24.945 74.682 74.896 8.914 56.897 265.743 364.399 14099.8 1602.1 175.8 46.7 43.097 39.249 39.268 37.591 29.603 31.665 27.237 558.569 59028775 94.273 2.84 14.77 14.73 222.86 2.74 676.59 65.60 89.35 293.47 333.15 34.90 40.11 217.95 204.69 164.82 163.95 142.60 1402.51 147.76 1462.94 2.644 24.927 74.469 74.900 8.925 57.135 264.978 363.354 33.4187 26.0871 1149.4724 132.1849 474.8976 133.5301 2678.2382 315.7450 2.2602 12.9298 476.3557 133.6218 202.6359 112.5334 345.1080 109.9523 46.6120 30.5961 438.7131 50.5976 33.5337 26.2531 13936.1 175.7 40.279 37.895 38.921 37.415 29.238 31.121 27.152 523.019 59449725 4.84065 2.15582 4.29470 20.9255 2.79626 3738.39 1460.94 7351 10100 10877.53 2150.30 2156.87 143.42 11232.43 47.28 486.11 357.86 108.97 95.98 913.41 794.06 146.71 156.22 193.84 194.88 224.22 22.80 216.18 21.86 1844.1246 38.3162 55.0261 7.5520 132.9525 7.4741 23.5039 3.1508 21332.8931 77.3026 132.5425 7.4691 310.5129 8.8709 182.8789 9.0819 1337.5918 32.6597 143.8317 19.7459 1840.3677 38.0726 92.760 348.018 2.413553 2.911 42.305 25.199 2.84 14.77 14.84 223.85 2.75 664.78 65.49 89.19 297.48 329.97 34.79 40.15 221.47 205.33 164.75 163.98 142.58 1402.97 147.08 1473.23 2.65 25.006 75.167 74.958 8.921 57.027 265.435 365.102 33.7125 25.9453 1144.8012 132.9867 475.8212 134.0016 2688.9567 312.4633 2.2754 12.8977 478.6418 133.6253 
202.1471 112.8291 346.6699 109.4784 46.715 30.7211 439.6023 50.7328 33.5843 26.3468 13999.6 41.309 39.669 37.766 37.79 29.494 31.621 27.417 564.893 51901853 4.88015 2.15178 4.28036 20.4308 2.78238 3737.15 1461 7320 9847 10891.93 2151.85 2140.2 142.79 11206.13 48.1 486.9 358.5 107.5 96.9 915.17 793.31 144.38 155.74 193.93 194.84 224.22 22.79 217.16 21.71 1834.8257 38.5246 55.2555 7.5061 132.7356 7.4484 23.4116 3.1835 21231.5641 77.4941 131.9529 7.4692 311.1676 8.8483 181.8956 9.1198 1335.3456 32.5272 143.4837 19.6933 1835.2572 37.9374 94.426 350.294 2.439338 2.872 42.441 25.205 2.84 14.73 14.8 222.78 2.75 670.19 65.6 89.3 294.58 331.77 34.88 40.21 219.27 207.24 164.79 164.13 142.54 1403.65 146.9 1460.72 2.65 24.952 75.015 74.604 8.926 56.789 264.28 363.612 33.6797 26.3315 1144.7727 131.4797 479.9901 133.4866 2630.334 316.347 2.2836 12.9605 478.3732 133.8853 201.0244 112.8521 339.9 111.1578 46.6799 30.6675 438.2501 50.649 33.6663 26.2002 41.354 39.251 39.315 35.843 29.544 31.624 27.396 542.103 53514996 4.88858 2.14878 4.28461 20.8925 2.80386 3738.53 1469.65 7332 9848 10876.7 2157.45 2146.07 143.48 11196.54 47.72 486.11 357.41 108.56 96.38 913.21 792 145.82 154.3 193.87 194.65 224.31 22.78 217.41 21.89 1833.4487 37.9597 55.2435 7.5924 131.5237 7.4767 23.9126 3.1449 21169.2953 77.119 131.8594 7.454 312.7909 8.8461 185.7196 8.982 1333.7177 32.5835 143.6025 19.7262 1836.793 38.1495 94.496 349.915 2.438631 2.893 42.294 25.2 OpenBenchmarking.org
OpenVINO 2024.0 - Device: CPU - FPS, More Is Better (results listed as aa / b / c)
Model: Face Detection FP16: 2.84 / 2.84 / 2.84 (SE +/- 0.01, N = 3)
Model: Person Detection FP16: 14.77 / 14.77 / 14.73 (SE +/- 0.01, N = 3)
Model: Person Detection FP32: 14.73 / 14.84 / 14.80 (SE +/- 0.02, N = 3)
Model: Vehicle Detection FP16: 222.86 / 223.85 / 222.78 (SE +/- 0.10, N = 3)
Model: Face Detection FP16-INT8: 2.74 / 2.75 / 2.75 (SE +/- 0.00, N = 3)
Model: Face Detection Retail FP16: 676.59 / 664.78 / 670.19 (SE +/- 8.52, N = 3)
Model: Road Segmentation ADAS FP16: 65.60 / 65.49 / 65.60 (SE +/- 0.12, N = 3)
Model: Vehicle Detection FP16-INT8: 89.35 / 89.19 / 89.30 (SE +/- 0.03, N = 3)
Model: Weld Porosity Detection FP16: 293.47 / 297.48 / 294.58 (SE +/- 0.30, N = 3)
Model: Face Detection Retail FP16-INT8: 333.15 / 329.97 / 331.77 (SE +/- 0.61, N = 3)
Model: Road Segmentation ADAS FP16-INT8: 34.90 / 34.79 / 34.88 (SE +/- 0.02, N = 3)
Model: Machine Translation EN To DE FP16: 40.11 / 40.15 / 40.21 (SE +/- 0.05, N = 3)
Model: Weld Porosity Detection FP16-INT8: 217.95 / 221.47 / 219.27 (SE +/- 0.18, N = 3)
Model: Person Vehicle Bike Detection FP16: 204.69 / 205.33 / 207.24 (SE +/- 0.66, N = 3)
Model: Noise Suppression Poconet-Like FP16: 164.82 / 164.75 / 164.79 (SE +/- 0.03, N = 3)
Model: Handwritten English Recognition FP16: 163.95 / 163.98 / 164.13 (SE +/- 0.06, N = 3)
Model: Person Re-Identification Retail FP16: 142.60 / 142.58 / 142.54 (SE +/- 0.34, N = 3)
Model: Age Gender Recognition Retail 0013 FP16: 1402.51 / 1402.97 / 1403.65 (SE +/- 3.07, N = 3)
Model: Handwritten English Recognition FP16-INT8: 147.76 / 147.08 / 146.90 (SE +/- 0.83, N = 3)
Model: Age Gender Recognition Retail 0013 FP16-INT8: 1462.94 / 1473.23 / 1460.72 (SE +/- 1.48, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 - Frames Per Second, More Is Better (results listed as a / aa / b / c)
Encoder Mode: Preset 4 - Input: Bosphorus 4K: 2.652 / 2.644 / 2.650 / 2.650 (SE +/- 0.004, N = 3)
Encoder Mode: Preset 8 - Input: Bosphorus 4K: 24.95 / 24.93 / 25.01 / 24.95 (SE +/- 0.01, N = 3)
Encoder Mode: Preset 12 - Input: Bosphorus 4K: 74.68 / 74.47 / 75.17 / 75.02 (SE +/- 0.28, N = 3)
Encoder Mode: Preset 13 - Input: Bosphorus 4K: 74.90 / 74.90 / 74.96 / 74.60 (SE +/- 0.19, N = 3)
Encoder Mode: Preset 4 - Input: Bosphorus 1080p: 8.914 / 8.925 / 8.921 / 8.926 (SE +/- 0.010, N = 3)
Encoder Mode: Preset 8 - Input: Bosphorus 1080p: 56.90 / 57.14 / 57.03 / 56.79 (SE +/- 0.06, N = 3)
Encoder Mode: Preset 12 - Input: Bosphorus 1080p: 265.74 / 264.98 / 265.44 / 264.28 (SE +/- 0.05, N = 3)
Encoder Mode: Preset 13 - Input: Bosphorus 1080p: 364.40 / 363.35 / 365.10 / 363.61 (SE +/- 0.57, N = 3)
1. (CXX) g++ options: -march=native
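One way to compare the four runs across all eight SVT-AV1 configurations at once is a per-run geometric mean. The sketch below is not part of the exported result; it only re-uses the FPS figures listed above, and the names `svt_av1_fps` and `summarize` are illustrative.

```python
from statistics import geometric_mean

# Frames per second per run, copied from the SVT-AV1 2.0 results above.
# Order: Preset 4, 8, 12, 13 on Bosphorus 4K, then Preset 4, 8, 12, 13 on 1080p.
svt_av1_fps = {
    "a":  [2.652, 24.95, 74.68, 74.90, 8.914, 56.90, 265.74, 364.40],
    "aa": [2.644, 24.93, 74.47, 74.90, 8.925, 57.14, 264.98, 363.35],
    "b":  [2.650, 25.01, 75.17, 74.96, 8.921, 57.03, 265.44, 365.10],
    "c":  [2.650, 24.95, 75.02, 74.60, 8.926, 56.79, 264.28, 363.61],
}


def summarize(results):
    """Return the geometric mean of each run's results (higher is better)."""
    return {run: geometric_mean(values) for run, values in results.items()}


summary = summarize(svt_av1_fps)
for run, gmean in summary.items():
    print(f"{run}: geometric mean {gmean:.2f} FPS")
```

A geometric mean is used here rather than an arithmetic mean so that the 1080p Preset 13 result (hundreds of FPS) does not swamp the 4K Preset 4 result (under 3 FPS).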
Neural Magic DeepSparse 1.7 - items/sec, More Is Better (results listed as aa / b / c)
Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream: 33.42 / 33.71 / 33.68 (SE +/- 0.02, N = 3)
Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream: 26.09 / 25.95 / 26.33 (SE +/- 0.11, N = 3)
Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream: 1149.47 / 1144.80 / 1144.77 (SE +/- 2.84, N = 3)
Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream: 132.18 / 132.99 / 131.48 (SE +/- 0.27, N = 3)
Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream: 474.90 / 475.82 / 479.99 (SE +/- 1.33, N = 3)
Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream: 133.53 / 134.00 / 133.49 (SE +/- 0.11, N = 3)
Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream: 2678.24 / 2688.96 / 2630.33 (SE +/- 6.53, N = 3)
Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream: 315.75 / 312.46 / 316.35 (SE +/- 0.75, N = 3)
Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream: 2.2602 / 2.2754 / 2.2836 (SE +/- 0.0074, N = 3)
Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream: 12.93 / 12.90 / 12.96 (SE +/- 0.02, N = 3)
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream: 476.36 / 478.64 / 478.37 (SE +/- 1.18, N = 3)
Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream: 133.62 / 133.63 / 133.89 (SE +/- 0.17, N = 3)
Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream: 202.64 / 202.15 / 201.02 (SE +/- 0.34, N = 3)
Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream: 112.53 / 112.83 / 112.85 (SE +/- 0.16, N = 3)
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream: 345.11 / 346.67 / 339.90 (SE +/- 0.25, N = 3)
Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream: 109.95 / 109.48 / 111.16 (SE +/- 0.86, N = 3)
Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream: 46.61 / 46.72 / 46.68 (SE +/- 0.11, N = 3)
Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream: 30.60 / 30.72 / 30.67 (SE +/- 0.01, N = 3)
Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream: 438.71 / 439.60 / 438.25 (SE +/- 0.42, N = 3)
Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream: 50.60 / 50.73 / 50.65 (SE +/- 0.08, N = 3)
Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream: 33.53 / 33.58 / 33.67 (SE +/- 0.04, N = 3)
Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream: 26.25 / 26.35 / 26.20 (SE +/- 0.02, N = 3)
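For the Synchronous Single-Stream scenario, throughput can be turned into a rough per-item latency. The sketch below assumes that only one item is in flight at a time, so latency is simply the reciprocal of items/sec; that assumption does not hold for the Asynchronous Multi-Stream scenario. It uses three of the run aa figures listed above, and the helper name `implied_latency_ms` is illustrative.

```python
# Synchronous Single-Stream throughput (items/sec) for run aa, copied from above.
sync_items_per_sec = {
    "NLP Document Classification, oBERT base uncased on IMDB": 26.09,
    "Llama2 Chat 7b Quantized": 12.93,
    "ResNet-50, Baseline": 133.53,
}


def implied_latency_ms(items_per_sec):
    """Latency in ms per item, assuming exactly one item is processed at a time."""
    if items_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return 1000.0 / items_per_sec


implied = {model: implied_latency_ms(rate) for model, rate in sync_items_per_sec.items()}
for model, latency in implied.items():
    print(f"{model}: about {latency:.2f} ms per item")
```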
srsRAN Project 23.10.1-20240219 - Mbps, More Is Better
Test: PDSCH Processor Benchmark, Throughput Total: a 14099.8, aa 13936.1, b 13999.6 (SE +/- 42.60, N = 3)
Test: PUSCH Processor Benchmark, Throughput Total: a 1602.1 (MIN: 947.2)
Test: PDSCH Processor Benchmark, Throughput Thread: a 175.8, aa 175.7 (SE +/- 0.03, N = 3)
Test: PUSCH Processor Benchmark, Throughput Thread: a 46.7 (MIN: 28.9)
1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
JPEG-XL libjxl 0.10.1 - MP/s, More Is Better (results listed as a / aa / b / c)
Input: PNG - Quality: 80: 43.10 / 40.28 / 41.31 / 41.35 (SE +/- 0.30, N = 3)
Input: PNG - Quality: 90: 39.25 / 37.90 / 39.67 / 39.25 (SE +/- 0.55, N = 15)
Input: JPEG - Quality: 80: 39.27 / 38.92 / 37.77 / 39.32 (SE +/- 0.12, N = 3)
Input: JPEG - Quality: 90: 37.59 / 37.42 / 37.79 / 35.84 (SE +/- 0.45, N = 15)
Input: PNG - Quality: 100: 29.60 / 29.24 / 29.49 / 29.54 (SE +/- 0.04, N = 3)
Input: JPEG - Quality: 100: 31.67 / 31.12 / 31.62 / 31.62 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL Decoding libjxl 0.10.1 - MP/s, More Is Better (results listed as a / aa / b / c)
CPU Threads: 1: 27.24 / 27.15 / 27.42 / 27.40 (SE +/- 0.01, N = 3)
CPU Threads: All: 558.57 / 523.02 / 564.89 / 542.10 (SE +/- 1.96, N = 3)
Stockfish 16.1 - Chess Benchmark - Nodes Per Second, More Is Better (results listed as a / aa / b / c)
Chess Benchmark: 59028775 / 59449725 / 51901853 / 53514996 (SE +/- 1497045.19, N = 12)
1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -flto -flto-partition=one -flto=jobserver
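The four Stockfish runs differ more visibly than most other results in this file. A minimal sketch of expressing each run as a percentage of the best run follows; it is not part of the exported result, and the function name `relative_to_best` is illustrative. The input values are the nodes-per-second figures listed above.

```python
# Nodes per second for the Stockfish 16.1 chess benchmark, copied from above.
stockfish_nps = {"a": 59028775, "aa": 59449725, "b": 51901853, "c": 53514996}


def relative_to_best(results, higher_is_better=True):
    """Return each run's result as a percentage of the best run (best = 100)."""
    if higher_is_better:
        best = max(results.values())
        return {run: 100.0 * value / best for run, value in results.items()}
    # For "Fewer Is Better" metrics the smallest value is the best one.
    best = min(results.values())
    return {run: 100.0 * best / value for run, value in results.items()}


percentages = relative_to_best(stockfish_nps)
for run, pct in percentages.items():
    print(f"{run}: {pct:.1f}% of best run")
```

Passing `higher_is_better=False` would apply the same comparison to the "Fewer Is Better" timings elsewhere in this result file.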
oneDNN 3.4, Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better): aa 4.84065 (MIN: 4.25), b 4.88015 (MIN: 4.23), c 4.88858 (MIN: 4.3); SE +/- 0.01022, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): aa 2.15582 (MIN: 2.06), b 2.15178 (MIN: 2.06), c 2.14878 (MIN: 2.06); SE +/- 0.00137, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): aa 4.29470 (MIN: 4.16), b 4.28036 (MIN: 4.17), c 4.28461 (MIN: 4.14); SE +/- 0.01638, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better): aa 20.93 (MIN: 19.34), b 20.43 (MIN: 19.32), c 20.89 (MIN: 19.81); SE +/- 0.20, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better): aa 2.79626 (MIN: 2.68), b 2.78238 (MIN: 2.72), c 2.80386 (MIN: 2.7); SE +/- 0.01912, N = 12; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better): aa 3738.39 (MIN: 3728.79), b 3737.15 (MIN: 3730.87), c 3738.53 (MIN: 3730.99); SE +/- 2.30, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better): aa 1460.94 (MIN: 1436.36), b 1461.00 (MIN: 1442.49), c 1469.65 (MIN: 1448.43); SE +/- 3.72, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
Google Draco 1.5.6, Model: Lion (ms, Fewer Is Better): aa 7351, b 7320, c 7332; SE +/- 1.86, N = 3; (CXX) g++ options: -O3
Google Draco 1.5.6, Model: Church Facade (ms, Fewer Is Better): aa 10100, b 9847, c 9848; SE +/- 6.24, N = 3; (CXX) g++ options: -O3
OpenVINO 2024.0, Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better): aa 10877.53 (MIN: 4104.89 / MAX: 18949.05), b 10891.93 (MIN: 3821.31 / MAX: 19031.99), c 10876.70 (MIN: 3255.92 / MAX: 18738.42); SE +/- 17.40, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): aa 2150.30 (MIN: 491.1 / MAX: 2996.72), b 2151.85 (MIN: 500.93 / MAX: 2975.2), c 2157.45 (MIN: 644.54 / MAX: 2962.51); SE +/- 1.19, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better): aa 2156.87 (MIN: 504.09 / MAX: 2990), b 2140.20 (MIN: 527.18 / MAX: 2951.37), c 2146.07 (MIN: 439.17 / MAX: 2969.83); SE +/- 2.39, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): aa 143.42 (MIN: 62.82 / MAX: 295.2), b 142.79 (MIN: 60 / MAX: 245.21), c 143.48 (MIN: 44.55 / MAX: 252.93); SE +/- 0.06, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 11232.43 (MIN: 6926.76 / MAX: 21113.44), b 11206.13 (MIN: 7011.32 / MAX: 20429.17), c 11196.54 (MIN: 7222.84 / MAX: 20603.63); SE +/- 9.32, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better): aa 47.28 (MIN: 10.17 / MAX: 121.04), b 48.10 (MIN: 9.92 / MAX: 115.12), c 47.72 (MIN: 9.97 / MAX: 99.86); SE +/- 0.60, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better): aa 486.11 (MIN: 118.22 / MAX: 849.31), b 486.90 (MIN: 119.18 / MAX: 852.49), c 486.11 (MIN: 171.7 / MAX: 813.73); SE +/- 0.88, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 357.86 (MIN: 301.59 / MAX: 522.85), b 358.50 (MIN: 300.19 / MAX: 528.83), c 357.41 (MIN: 204.13 / MAX: 519.56); SE +/- 0.11, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): aa 108.97 (MIN: 17.48 / MAX: 1207.62), b 107.50 (MIN: 57.15 / MAX: 1202.08), c 108.56 (MIN: 17.21 / MAX: 1188.34); SE +/- 0.11, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 95.98 (MIN: 71.43 / MAX: 140.32), b 96.90 (MIN: 70.14 / MAX: 141.32), c 96.38 (MIN: 69.36 / MAX: 140.93); SE +/- 0.18, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 913.41 (MIN: 742.17 / MAX: 1356.42), b 915.17 (MIN: 711.5 / MAX: 1350.07), c 913.21 (MIN: 718.49 / MAX: 1350.67); SE +/- 0.49, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): aa 794.06 (MIN: 604.52 / MAX: 1620.5), b 793.31 (MIN: 559.01 / MAX: 1581.54), c 792.00 (MIN: 568.74 / MAX: 1657.2); SE +/- 0.93, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 146.71 (MIN: 96.02 / MAX: 1572.43), b 144.38 (MIN: 96.65 / MAX: 1566.66), c 145.82 (MIN: 96.38 / MAX: 1563.28); SE +/- 0.12, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better): aa 156.22 (MIN: 44.3 / MAX: 240.55), b 155.74 (MIN: 48.23 / MAX: 240.13), c 154.30 (MIN: 44.57 / MAX: 239.56); SE +/- 0.49, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better): aa 193.84 (MIN: 183.19 / MAX: 407.14), b 193.93 (MIN: 182.93 / MAX: 402.18), c 193.87 (MIN: 182.85 / MAX: 406.51); SE +/- 0.04, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better): aa 194.88 (MIN: 185.7 / MAX: 356.13), b 194.84 (MIN: 185.09 / MAX: 355.83), c 194.65 (MIN: 185.45 / MAX: 358.03); SE +/- 0.08, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better): aa 224.22 (MIN: 29.21 / MAX: 400.61), b 224.22 (MIN: 36.4 / MAX: 368.76), c 224.31 (MIN: 31.77 / MAX: 351.21); SE +/- 0.53, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): aa 22.80 (MIN: 1.57 / MAX: 164.42), b 22.79 (MIN: 1.59 / MAX: 165.35), c 22.78 (MIN: 1.63 / MAX: 162.11); SE +/- 0.05, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 216.18 (MIN: 206.9 / MAX: 376.9), b 217.16 (MIN: 208.82 / MAX: 374.93), c 217.41 (MIN: 210.44 / MAX: 372.96); SE +/- 1.21, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): aa 21.86 (MIN: 2 / MAX: 157.1), b 21.71 (MIN: 2.05 / MAX: 156.88), c 21.89 (MIN: 2.07 / MAX: 156.71); SE +/- 0.02, N = 3; (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 1844.12, b 1834.83, c 1833.45; SE +/- 1.78, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 38.32, b 38.52, c 37.96; SE +/- 0.16, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 55.03, b 55.26, c 55.24; SE +/- 0.11, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 7.5520, b 7.5061, c 7.5924; SE +/- 0.0154, N = 3
Neural Magic DeepSparse 1.7, Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 132.95, b 132.74, c 131.52; SE +/- 0.39, N = 3
Neural Magic DeepSparse 1.7, Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 7.4741, b 7.4484, c 7.4767; SE +/- 0.0064, N = 3
Neural Magic DeepSparse 1.7, Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 23.50, b 23.41, c 23.91; SE +/- 0.06, N = 3
Neural Magic DeepSparse 1.7, Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 3.1508, b 3.1835, c 3.1449; SE +/- 0.0074, N = 3
Neural Magic DeepSparse 1.7, Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 21332.89, b 21231.56, c 21169.30; SE +/- 55.68, N = 3
Neural Magic DeepSparse 1.7, Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 77.30, b 77.49, c 77.12; SE +/- 0.12, N = 3
Neural Magic DeepSparse 1.7, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 132.54, b 131.95, c 131.86; SE +/- 0.34, N = 3
Neural Magic DeepSparse 1.7, Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 7.4691, b 7.4692, c 7.4540; SE +/- 0.0095, N = 3
Neural Magic DeepSparse 1.7, Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 310.51, b 311.17, c 312.79; SE +/- 0.51, N = 3
Neural Magic DeepSparse 1.7, Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 8.8709, b 8.8483, c 8.8461; SE +/- 0.0129, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 182.88, b 181.90, c 185.72; SE +/- 0.08, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 9.0819, b 9.1198, c 8.9820; SE +/- 0.0713, N = 3
Neural Magic DeepSparse 1.7, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 1337.59, b 1335.35, c 1333.72; SE +/- 3.19, N = 3
Neural Magic DeepSparse 1.7, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 32.66, b 32.53, c 32.58; SE +/- 0.01, N = 3
Neural Magic DeepSparse 1.7, Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 143.83, b 143.48, c 143.60; SE +/- 0.07, N = 3
Neural Magic DeepSparse 1.7, Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 19.75, b 19.69, c 19.73; SE +/- 0.03, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 1840.37, b 1835.26, c 1836.79; SE +/- 1.00, N = 3
Neural Magic DeepSparse 1.7, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): aa 38.07, b 37.94, c 38.15; SE +/- 0.04, N = 3
Timed Linux Kernel Compilation 6.8, Build: defconfig (Seconds, Fewer Is Better): a 94.27, aa 92.76, b 94.43, c 94.50; SE +/- 0.90, N = 3
Timed Linux Kernel Compilation 6.8, Build: allmodconfig (Seconds, Fewer Is Better): aa 348.02, b 350.29, c 349.92; SE +/- 0.68, N = 3
Parallel BZIP2 Compression 1.1.13, FreeBSD-13.0-RELEASE-amd64-memstick.img Compression (Seconds, Fewer Is Better): aa 2.413553, b 2.439338, c 2.438631; SE +/- 0.001512, N = 3; (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
Primesieve 12.1, Length: 1e12 (Seconds, Fewer Is Better): aa 2.911, b 2.872, c 2.893; SE +/- 0.003, N = 3; (CXX) g++ options: -O3
Primesieve 12.1, Length: 1e13 (Seconds, Fewer Is Better): aa 42.31, b 42.44, c 42.29; SE +/- 0.07, N = 3; (CXX) g++ options: -O3
WavPack Audio Encoding 5.7, WAV To WavPack (Seconds, Fewer Is Better): aa 25.20, b 25.21, c 25.20; SE +/- 0.00, N = 5
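To compare runs within a single result, the per-run values can be summarised with a little arithmetic. The Python sketch below is only an illustration and is not part of the Phoronix Test Suite: the helper names `relative_spread_pct` and `best_run` are hypothetical. It assumes the values in each entry are listed in the same order as the run labels (a, aa, b, c). The figures are copied from the Stockfish and Timed Linux Kernel Compilation (defconfig) entries above.

```python
from statistics import mean


def relative_spread_pct(values):
    """Return (max - min) / mean of the run values, as a percentage."""
    vals = list(values)
    return (max(vals) - min(vals)) / mean(vals) * 100.0


def best_run(results, higher_is_better):
    """Return the label of the best run for the given metric direction."""
    pick = max if higher_is_better else min
    return pick(results, key=results.get)


# Values copied from the entries above; the label-to-value pairing is an
# assumption (values taken to follow the order of the run labels).
stockfish = {"a": 59028775, "aa": 59449725, "b": 51901853, "c": 53514996}
kernel_defconfig = {"a": 94.27, "aa": 92.76, "b": 94.43, "c": 94.50}

if __name__ == "__main__":
    cases = [
        ("Stockfish (Nodes Per Second)", stockfish, True),
        ("Kernel defconfig (Seconds)", kernel_defconfig, False),
    ]
    for name, runs, higher_is_better in cases:
        spread = relative_spread_pct(runs.values())
        print(f"{name}: best={best_run(runs, higher_is_better)}, "
              f"spread={spread:.2f}%")
```

A spread of this kind is only a rough indicator. It may be worth reading it alongside the reported SE and N before treating a difference between runs as meaningful.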
Phoronix Test Suite v10.8.5