ddddx AMD Ryzen Threadripper PRO 5965WX 24-Cores testing with an ASUS Pro WS WRX80E-SAGE SE WIFI (1201 BIOS) and ASUS NVIDIA NV106 2GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403218-NE-DDDDX513530&grr&rdt .
ddddx
System configuration (runs: a, b, c, d)
  Processor: AMD Ryzen Threadripper PRO 5965WX 24-Cores @ 3.80GHz (24 Cores / 48 Threads)
  Motherboard: ASUS Pro WS WRX80E-SAGE SE WIFI (1201 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 8 x 16GB DDR4-2133MT/s Corsair CMK32GX4M2E3200C16
  Disk: 2048GB SOLIDIGM SSDPFKKW020X7
  Graphics: ASUS NVIDIA NV106 2GB
  Audio: AMD Starship/Matisse
  Monitor: VA2431
  Network: 2 x Intel X550 + Intel Wi-Fi 6 AX200
  OS: Ubuntu 23.10
  Kernel: 6.5.0-15-generic (x86_64)
  Desktop: GNOME Shell 45.0
  Display Server: X Server + Wayland
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 23.2.1-1ubuntu3
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  CPU Microcode: 0xa008205
Python Details
  Python 3.11.6
Security Details
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Not affected
  spec_rstack_overflow: Mitigation of safe RET no microcode
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
ddddx build-linux-kernel: allmodconfig brl-cad: VGR Performance Metric stockfish: Chess Benchmark ospray-studio: 3 - 4K - 32 - Path Tracer - CPU ospray-studio: 2 - 4K - 32 - Path Tracer - CPU ospray-studio: 1 - 4K - 32 - Path Tracer - CPU ospray: particle_volume/scivis/real_time ospray: particle_volume/pathtracer/real_time ospray: particle_volume/ao/real_time ospray-studio: 3 - 4K - 16 - Path Tracer - CPU jpegxl: PNG - 90 jpegxl: JPEG - 90 ospray-studio: 2 - 4K - 16 - Path Tracer - CPU ospray-studio: 1 - 4K - 16 - Path Tracer - CPU vvenc: Bosphorus 4K - Fast ospray-studio: 3 - 4K - 1 - Path Tracer - CPU ospray-studio: 2 - 4K - 1 - Path Tracer - CPU ospray-studio: 3 - 1080p - 16 - Path Tracer - CPU ospray-studio: 1 - 4K - 1 - Path Tracer - CPU primesieve: 1e13 onednn: Recurrent Neural Network Training - CPU v-ray: CPU onednn: Recurrent Neural Network Inference - CPU ospray: gravity_spheres_volume/dim_512/scivis/real_time ospray: gravity_spheres_volume/dim_512/ao/real_time ospray-studio: 2 - 1080p - 16 - Path Tracer - CPU deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream ospray-studio: 1 - 1080p - 16 - Path Tracer - CPU ospray-studio: 3 - 1080p - 1 - Path Tracer - CPU ospray-studio: 2 - 1080p - 1 - Path Tracer - CPU ospray-studio: 1 - 1080p - 1 - Path Tracer - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU ospray: gravity_spheres_volume/dim_512/pathtracer/real_time openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 
- CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU rocksdb: Rand Fill Sync openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU rocksdb: Update Rand rocksdb: Overwrite rocksdb: Read Rand Write Rand rocksdb: Rand Fill rocksdb: Read While Writing rocksdb: Rand Read ospray-studio: 3 - 1080p - 32 - Path Tracer - CPU build-linux-kernel: defconfig rocksdb: Seq Fill ospray-studio: 2 - 1080p - 32 - Path Tracer - CPU ospray-studio: 1 - 1080p - 32 - Path Tracer - CPU deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream jpegxl: PNG - 
80 deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream jpegxl: JPEG - 80 deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream vvenc: Bosphorus 4K - Faster deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream jpegxl-decode: 1 deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text 
Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream vvenc: Bosphorus 1080p - Fast svt-av1: Preset 4 - Bosphorus 4K onednn: Deconvolution Batch shapes_1d - CPU jpegxl-decode: All srsran: PUSCH Processor Benchmark, Throughput Total jpegxl: PNG - 100 jpegxl: JPEG - 100 vvenc: Bosphorus 1080p - Faster onednn: IP Shapes 1D - CPU srsran: PUSCH Processor Benchmark, Throughput Thread svt-av1: Preset 8 - Bosphorus 4K srsran: PDSCH Processor Benchmark, Throughput Total svt-av1: Preset 4 - Bosphorus 1080p onednn: IP Shapes 3D - CPU draco: Church Facade compress-pbzip2: FreeBSD-13.0-RELEASE-amd64-memstick.img Compression draco: Lion svt-av1: Preset 8 - Bosphorus 1080p onednn: Convolution Batch Shapes Auto - CPU primesieve: 1e12 svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 12 - Bosphorus 4K encode-wavpack: WAV To WavPack srsran: PDSCH Processor Benchmark, 
Throughput Thread onednn: Deconvolution Batch shapes_3d - CPU svt-av1: Preset 12 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p a b c d 597.034 430387 52607528 177445 152653 150375 10.1386 156.396 10.2835 90362 39.403 42.502 78788 78061 7.06 5330 4616 21449 4528 77.148 1254.68 44287 638.128 4.59882 4.90257 18543 13.1742 75.876 18203 1336 1158 1138 1558.81 7.6 5873.2701 1.8772 715.22 16.66 7.46614 171.22 70.01 171.92 69.73 136 88.13 13.38 895.25 27.99 428.43 11.54 1033.67 10 1198.64 70.48 170.1 56.87 421.72 63.46 377.84 45528 10.86 1104.03 3.71 3224.64 0.53 44518.2 19.94 600.94 14.52 1652.05 5.46 2192.09 0.97 24478.31 16.69 718.2 671121 781166 2879787 790603 5546801 145962891 47495 54.133 920430 41839 41167 165.3999 6.0448 44.748 5.1435 194.2938 17.5108 684.5651 46.584 35.8509 334.3887 448.7091 26.6926 446.8406 26.7411 54.4167 18.372 54.0673 18.4908 14.566 393.6606 30.4683 46.0582 21.7009 64.032 53.4812 224.1641 8.8921 112.3308 6.003 1994.389 79.4548 150.8283 10.0615 99.3344 39.0535 306.9974 39.0528 306.9702 6.3927 156.2099 6.3846 156.4322 1.2491 798.4747 19.624 6.647 5.51082 483.415 1888.2 27.65 27.528 41.283 1.31474 177.6 62.954 11710.3 19.206 3.52569 7023 3.230305 5328 131.721 2.73804 6.244 151.958 154.17 4.433 603.7 2.36511 489.431 587.48 597.898 424641 61008270 175496 153552 150040 10.1433 155.373 10.2466 90131 37.44 39.892 78304 76995 7.075 5342 4609 21398 4521 76.918 1255.67 44634 636.768 4.57706 4.88053 18471 13.1312 76.1222 18181 1338 1150 1138 1547.31 7.68 5904.2788 1.8677 713.72 16.73 7.44837 171.43 69.95 171.16 70.01 135.7 88.33 13.47 889.36 28 428.18 11.41 1046.3 9.98 1201.22 70.23 170.68 56.3 426.03 63.08 380.22 45974 10.83 1106.44 3.69 3241.35 0.53 44461.02 20.04 598.03 14.49 1654.72 5.45 2197.5 0.97 24434.51 16.65 719.93 666715 779236 2859339 783967 5586539 145893753 47827 54.208 908177 41598 41125 165.5519 6.0393 43.245 5.2115 191.7617 17.4858 685.4489 42.66 35.5391 337.4525 447.6535 26.6035 446.2375 26.8554 54.0139 18.509 53.9731 18.5232 
14.823 393.0761 30.4851 45.9601 21.7471 63.301 53.4714 224.1971 8.9317 111.8371 5.8898 2032.9567 78.8208 151.9811 10.0731 99.2228 39.0501 307.0188 39.0344 307.0893 6.4286 155.3413 6.4293 155.3135 1.2468 800.0474 19.489 6.787 5.33929 482.577 1889.2 27.572 27.323 40.461 1.31101 178.7 63.893 12063.4 18.916 3.52584 7034 3.314749 5365 131.928 2.71135 6.081 145.97 150.809 4.438 600.2 2.34602 499.14 602.968 596.506 420528 55237573 175767 152510 150381 10.1479 155.833 10.2804 90083 38.511 40.761 78919 77511 7.066 5341 4615 21409 4535 77.214 1256.43 44375 637.173 4.59266 4.88608 18579 13.1431 76.0575 18229 1333 1152 1138 1563.15 7.59 5896.5933 1.8705 716.16 16.66 7.43821 171.67 69.83 171.29 69.99 136.15 88.05 13.39 894.57 28.08 427.02 11.46 1041.67 9.97 1202.09 70.66 169.65 56.64 423.46 63.40 378.34 46701 10.82 1107.64 3.71 3228.19 0.53 44306.10 20.21 592.96 14.56 1647.13 5.45 2197.98 0.98 24329.21 16.76 715.28 660978 777804 2868681 774384 5592760 144921763 47686 52.812 908882 42050 41189 165.4651 6.0425 39.904 5.1787 192.9770 17.5229 684.0159 42.353 35.8092 334.7472 449.1187 26.6165 447.5829 26.6956 54.3292 18.4018 54.0787 18.4866 14.777 393.8206 30.3866 46.0763 21.6925 63.038 53.7553 223.0028 8.9189 112.0022 5.9888 1998.6073 79.2452 151.2559 10.1159 98.8001 39.0664 306.9159 39.0572 306.9464 6.4155 155.6803 6.4103 155.7940 1.2478 799.4151 19.460 6.711 5.37367 470.159 1886.9 27.487 27.321 40.915 1.32743 178.6 62.498 11723.5 19.170 3.53323 7092 3.285082 5363 133.270 2.73192 6.113 150.522 150.133 4.435 605.5 2.36138 487.658 606.360 597.827 425778 53082452 175129 151620 150694 10.1833 155.049 10.2985 90380 38.265 39.595 78354 77073 7.062 5331 4618 21385 4528 77.06 1254.43 44633 642.194 4.58081 4.89928 18488 13.2048 75.6991 18177 1330 1153 1134 1553.54 7.64 5897.2791 1.8701 713.64 16.74 7.4522 172.18 69.61 171.54 69.86 135.96 88.14 13.4 894.41 27.98 428.55 11.44 1043.82 9.97 1202.22 70.76 169.41 56.9 421.54 63.54 377.49 46299 10.86 1103.6 3.71 3230.25 0.53 44596.35 19.84 603.96 
14.51 1652.91 5.45 2198.11 0.97 24496.84 16.72 716.84 658520 762175 2814513 777115 5633613 145516874 47597 54.163 919148 41959 41111 165.6061 6.0373 42.176 5.1408 194.384 17.476 685.8159 43.502 35.8135 334.7181 448.927 26.6941 447.1743 26.8079 54.3402 18.3978 54.1018 18.479 14.859 393.394 30.488 46.0166 21.7206 62.817 53.6259 223.554 8.9625 111.4506 5.9417 2014.5766 79.5069 150.7344 10.0902 99.0547 39.0595 306.9436 39.0746 306.8997 6.4001 156.0415 6.4059 155.8985 1.2559 794.3902 19.529 6.66 5.03641 439.613 1857 27.338 27.319 41 1.31333 177.9 62.931 11654.9 19.318 3.53851 7096 3.221993 5347 133.784 2.73352 6.123 152.466 151.638 4.442 607.8 2.34346 488.521 587.971 OpenBenchmarking.org
Timed Linux Kernel Compilation 6.8 - Build: allmodconfig
  Seconds, Fewer Is Better (SE +/- 0.92, N = 3)
  a: 597.03 | b: 597.90 | c: 596.51 | d: 597.83
BRL-CAD 7.38.2 - VGR Performance Metric
  VGR Performance Metric, More Is Better
  a: 430387 | b: 424641 | c: 420528 | d: 425778
  (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
Stockfish 16.1 - Chess Benchmark
  Nodes Per Second, More Is Better (SE +/- 1129127.95, N = 15)
  a: 52607528 | b: 61008270 | c: 55237573 | d: 53082452
  (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
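The Stockfish result shows the largest run-to-run difference in this part of the export, so a quick arithmetic check is useful. The sketch below uses only the four values reported above. The helper names `percent_vs_baseline` and `relative_spread` are illustrative and are not part of the Phoronix Test Suite. Treating run a as the baseline is an arbitrary choice.

```python
# Stockfish 16.1 nodes-per-second results for runs a-d, copied from the result above.
runs = {"a": 52607528, "b": 61008270, "c": 55237573, "d": 53082452}


def percent_vs_baseline(results, baseline="a"):
    """Percent difference of every run relative to the chosen baseline run."""
    base = results[baseline]
    return {name: 100.0 * (value - base) / base for name, value in results.items()}


def relative_spread(results):
    """Gap between the best and worst run, as a percentage of the worst run."""
    lo, hi = min(results.values()), max(results.values())
    return 100.0 * (hi - lo) / lo


deltas = percent_vs_baseline(runs)
spread = relative_spread(runs)

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}% vs a")
print(f"spread across runs: {spread:.2f}%")
```

The reported standard error (1129127.95 over N = 15) is roughly 2% of the measured values, while the spread across the four runs is considerably larger. The export does not say which run the SE figure belongs to, so this comparison is only indicative.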
OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 268.88, N = 3)
  a: 177445 | b: 175496 | c: 175767 | d: 175129
OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 88.33, N = 3)
  a: 152653 | b: 153552 | c: 152510 | d: 151620
OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 64.70, N = 3)
  a: 150375 | b: 150040 | c: 150381 | d: 150694
OSPRay 3.1 - Benchmark: particle_volume/scivis/real_time
  Items Per Second, More Is Better (SE +/- 0.01, N = 3)
  a: 10.14 | b: 10.14 | c: 10.15 | d: 10.18
OSPRay 3.1 - Benchmark: particle_volume/pathtracer/real_time
  Items Per Second, More Is Better (SE +/- 0.22, N = 3)
  a: 156.40 | b: 155.37 | c: 155.83 | d: 155.05
OSPRay 3.1 - Benchmark: particle_volume/ao/real_time
  Items Per Second, More Is Better (SE +/- 0.01, N = 3)
  a: 10.28 | b: 10.25 | c: 10.28 | d: 10.30
OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 228.07, N = 3)
  a: 90362 | b: 90131 | c: 90083 | d: 90380
JPEG-XL libjxl 0.10.1 - Input: PNG - Quality: 90
  MP/s, More Is Better (SE +/- 0.27, N = 15)
  a: 39.40 | b: 37.44 | c: 38.51 | d: 38.27
  (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl 0.10.1 - Input: JPEG - Quality: 90
  MP/s, More Is Better (SE +/- 0.39, N = 15)
  a: 42.50 | b: 39.89 | c: 40.76 | d: 39.60
  (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 110.73, N = 3)
  a: 78788 | b: 78304 | c: 78919 | d: 78354
OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 155.86, N = 3)
  a: 78061 | b: 76995 | c: 77511 | d: 77073
VVenC 1.11 - Video Input: Bosphorus 4K - Video Preset: Fast
  Frames Per Second, More Is Better (SE +/- 0.031, N = 3)
  a: 7.060 | b: 7.075 | c: 7.066 | d: 7.062
  (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 4.04, N = 3)
  a: 5330 | b: 5342 | c: 5341 | d: 5331
OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 7.22, N = 3)
  a: 4616 | b: 4609 | c: 4615 | d: 4618
OSPRay Studio 1.0 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 35.14, N = 3)
  a: 21449 | b: 21398 | c: 21409 | d: 21385
OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 4.41, N = 3)
  a: 4528 | b: 4521 | c: 4535 | d: 4528
Primesieve 12.1 - Length: 1e13
  Seconds, Fewer Is Better (SE +/- 0.05, N = 3)
  a: 77.15 | b: 76.92 | c: 77.21 | d: 77.06
  (CXX) g++ options: -O3
oneDNN 3.4 - Harness: Recurrent Neural Network Training - Engine: CPU
  ms, Fewer Is Better (SE +/- 0.77, N = 3)
  a: 1254.68 | b: 1255.67 | c: 1256.43 | d: 1254.43
  MIN: a: 1250.31 | b: 1250.99 | c: 1249.44 | d: 1249.27
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Chaos Group V-RAY 6.0 - Mode: CPU
  vsamples, More Is Better (SE +/- 195.74, N = 3)
  a: 44287 | b: 44634 | c: 44375 | d: 44633
oneDNN 3.4 - Harness: Recurrent Neural Network Inference - Engine: CPU
  ms, Fewer Is Better (SE +/- 0.39, N = 3)
  a: 638.13 | b: 636.77 | c: 637.17 | d: 642.19
  MIN: a: 634.67 | b: 632.58 | c: 632.47 | d: 633.4
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OSPRay 3.1 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time
  Items Per Second, More Is Better (SE +/- 0.00963, N = 3)
  a: 4.59882 | b: 4.57706 | c: 4.59266 | d: 4.58081
OSPRay 3.1 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time
  Items Per Second, More Is Better (SE +/- 0.01074, N = 3)
  a: 4.90257 | b: 4.88053 | c: 4.88608 | d: 4.89928
OSPRay Studio 1.0 - Camera: 2 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 16.33, N = 3)
  a: 18543 | b: 18471 | c: 18579 | d: 18488
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream
  ms/batch, Fewer Is Better (SE +/- 0.02, N = 3)
  a: 13.17 | b: 13.13 | c: 13.14 | d: 13.20
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream
  items/sec, More Is Better (SE +/- 0.14, N = 3)
  a: 75.88 | b: 76.12 | c: 76.06 | d: 75.70
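The two DeepSparse figures for this scenario (ms/batch and items/sec) can be cross-checked against each other. The sketch below assumes a batch size of 1 and a single request in flight at a time, which the export does not state. Under that assumption throughput should be close to 1000 divided by the latency in milliseconds. The function name `implied_throughput` is illustrative only.

```python
# Reported values for BERT-Large, Sparse INT8, Synchronous Single-Stream (runs a-d).
latency_ms = {"a": 13.17, "b": 13.13, "c": 13.14, "d": 13.20}
items_per_sec = {"a": 75.88, "b": 76.12, "c": 76.06, "d": 75.70}


def implied_throughput(ms_per_batch, batch_size=1):
    """Items per second implied by a per-batch latency.

    batch_size=1 is an assumption, not something the export reports.
    """
    return batch_size * 1000.0 / ms_per_batch


# Relative gap between the implied and the reported throughput for each run.
gaps = {}
for run, ms in latency_ms.items():
    implied = implied_throughput(ms)
    gaps[run] = abs(implied - items_per_sec[run]) / items_per_sec[run]
    print(f"{run}: implied {implied:.2f} items/sec, reported {items_per_sec[run]:.2f}")
```

If the assumption holds, the implied and reported values should agree to within a small fraction of a percent. A large gap would suggest a different batch size or overlapping requests.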
OSPRay Studio 1.0 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 39.50, N = 3)
  a: 18203 | b: 18181 | c: 18229 | d: 18177
OSPRay Studio 1.0 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 4.41, N = 3)
  a: 1336 | b: 1338 | c: 1333 | d: 1330
OSPRay Studio 1.0 - Camera: 2 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 0.67, N = 3)
  a: 1158 | b: 1150 | c: 1152 | d: 1153
OSPRay Studio 1.0 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU
  ms, Fewer Is Better (SE +/- 1.00, N = 3)
  a: 1138 | b: 1138 | c: 1138 | d: 1134
OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.80, N = 3)
  a: 1558.81 | b: 1547.31 | c: 1563.15 | d: 1553.54
  MIN / MAX: a: 1416.22 / 1644.19 | b: 1403.59 / 1636.72 | c: 1369.79 / 1663.37 | d: 1365.71 / 1635.16
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU
  FPS, More Is Better (SE +/- 0.01, N = 3)
  a: 7.60 | b: 7.68 | c: 7.59 | d: 7.64
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
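For Face Detection FP16, the reported latency (about 1.55 s) is far larger than the reciprocal of the throughput (about 7.6 FPS), which suggests several inference requests are processed concurrently. A rough estimate follows from Little's law (average items in flight = throughput x average latency). The sketch below applies it to the values above. The export does not report the number of parallel requests, so the result is only an estimate, and the function name `estimated_in_flight` is illustrative.

```python
# Reported OpenVINO 2024.0 Face Detection FP16 (CPU) results for runs a-d.
latency_ms = {"a": 1558.81, "b": 1547.31, "c": 1563.15, "d": 1553.54}
fps = {"a": 7.60, "b": 7.68, "c": 7.59, "d": 7.64}


def estimated_in_flight(throughput_per_sec, latency_in_ms):
    """Little's law: average requests in flight = throughput * average latency."""
    return throughput_per_sec * (latency_in_ms / 1000.0)


# Estimated concurrency per run; an inference from the data, not a reported setting.
concurrency = {run: estimated_in_flight(fps[run], latency_ms[run]) for run in fps}

for run, value in concurrency.items():
    print(f"{run}: ~{value:.1f} requests in flight")
```

All four runs give an estimate of roughly 12 concurrent requests, which is consistent with the runs using the same execution setup.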
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream
  ms/batch, Fewer Is Better (SE +/- 9.69, N = 3)
  a: 5873.27 | b: 5904.28 | c: 5896.59 | d: 5897.28
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream
  items/sec, More Is Better (SE +/- 0.0031, N = 3)
  a: 1.8772 | b: 1.8677 | c: 1.8705 | d: 1.8701
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.26, N = 3)
  a: 715.22 | b: 713.72 | c: 716.16 | d: 713.64
  MIN / MAX: a: 664.62 / 729.04 | b: 658.61 / 738.08 | c: 661.6 / 732.11 | d: 667.56 / 731.59
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU
  FPS, More Is Better (SE +/- 0.01, N = 3)
  a: 16.66 | b: 16.73 | c: 16.66 | d: 16.74
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OSPRay 3.1 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time
  Items Per Second, More Is Better (SE +/- 0.00473, N = 3)
  a: 7.46614 | b: 7.44837 | c: 7.43821 | d: 7.45220
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.11, N = 3)
  a: 171.22 | b: 171.43 | c: 171.67 | d: 172.18
  MIN / MAX: a: 130.32 / 233.99 | b: 140.26 / 224.54 | c: 132.19 / 231.16 | d: 138.53 / 224.8
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU
  FPS, More Is Better (SE +/- 0.04, N = 3)
  a: 70.01 | b: 69.95 | c: 69.83 | d: 69.61
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.12, N = 3)
  a: 171.92 | b: 171.16 | c: 171.29 | d: 171.54
  MIN / MAX: a: 129.51 / 227.57 | b: 129.54 / 225.82 | c: 134.25 / 225.4 | d: 135.7 / 226.04
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU
  FPS, More Is Better (SE +/- 0.06, N = 3)
  a: 69.73 | b: 70.01 | c: 69.99 | d: 69.86
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.06, N = 3)
  a: 136.00 | b: 135.70 | c: 136.15 | d: 135.96
  MIN / MAX: a: 118.61 / 153.61 | b: 109.99 / 155.71 | c: 73.11 / 161.85 | d: 110.6 / 155.59
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU
  FPS, More Is Better (SE +/- 0.03, N = 3)
  a: 88.13 | b: 88.33 | c: 88.05 | d: 88.14
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.05, N = 3)
  a: 13.38 | b: 13.47 | c: 13.39 | d: 13.40
  MIN / MAX: a: 7.23 / 35.43 | b: 7.33 / 31.02 | c: 7.1 / 31.9 | d: 8.15 / 34.65
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU
  FPS, More Is Better (SE +/- 3.73, N = 3)
  a: 895.25 | b: 889.36 | c: 894.57 | d: 894.41
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.02, N = 3)
  a: 27.99 | b: 28.00 | c: 28.08 | d: 27.98
  MIN / MAX: a: 18.95 / 38.76 | b: 18.81 / 38.86 | c: 14.29 / 41.57 | d: 14.99 / 46.26
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU
  FPS, More Is Better (SE +/- 0.36, N = 3)
  a: 428.43 | b: 428.18 | c: 427.02 | d: 428.55
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU
  ms, Fewer Is Better (SE +/- 0.03, N = 3)
  a: 11.54 | b: 11.41 | c: 11.46 | d: 11.44
  MIN / MAX: a: 6.61 / 30.79 | b: 7.86 / 31.96 | c: 6.19 / 32.08 | d: 9.06 / 31.81
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU
  FPS, More Is Better (SE +/- 3.03, N = 3)
  a: 1033.67 | b: 1046.30 | c: 1041.67 | d: 1043.82
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU a b c d 3 6 9 12 15 SE +/- 0.01, N = 3 10.00 9.98 9.97 9.97 MIN: 6.87 / MAX: 16.1 MIN: 5.6 / MAX: 24.1 MIN: 5.91 / MAX: 41.34 MIN: 5.63 / MAX: 25.55 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU a b c d 300 600 900 1200 1500 SE +/- 0.87, N = 3 1198.64 1201.22 1202.09 1202.22 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU a b c d 16 32 48 64 80 SE +/- 0.02, N = 3 70.48 70.23 70.66 70.76 MIN: 43.7 / MAX: 128.66 MIN: 43.23 / MAX: 122.75 MIN: 24.84 / MAX: 127.05 MIN: 41.53 / MAX: 122.26 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU a b c d 40 80 120 160 200 SE +/- 0.04, N = 3 170.10 170.68 169.65 169.41 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b c d 13 26 39 52 65 SE +/- 0.01, N = 3 56.87 56.30 56.64 56.90 MIN: 35.77 / MAX: 72.74 MIN: 52.17 / MAX: 69.13 MIN: 36.21 / MAX: 73.18 MIN: 51.72 / MAX: 72.17 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b c d 90 180 270 360 450 SE +/- 0.10, N = 3 421.72 426.03 423.46 421.54 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU a b c d 14 28 42 56 70 SE +/- 0.11, N = 3 63.46 63.08 63.40 63.54 MIN: 45.07 / MAX: 85.2 MIN: 42.84 / MAX: 83.6 MIN: 37.56 / MAX: 90.19 MIN: 41.89 / MAX: 89.71 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU a b c d 80 160 240 320 400 SE +/- 0.65, N = 3 377.84 380.22 378.34 377.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
RocksDB 9.0 - Test: Random Fill Sync (Op/s, More Is Better): a: 45528, b: 45974, c: 46701, d: 46299. SE +/- 32.60, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): a: 10.86 (MIN: 7.32 / MAX: 25.15), b: 10.83 (MIN: 6.67 / MAX: 24.71), c: 10.82 (MIN: 5.79 / MAX: 27.55), d: 10.86 (MIN: 6.39 / MAX: 25.76). SE +/- 0.02, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): a: 1104.03, b: 1106.44, c: 1107.64, d: 1103.60. SE +/- 1.89, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better): a: 3.71 (MIN: 2.14 / MAX: 26.73), b: 3.69 (MIN: 2.28 / MAX: 15.95), c: 3.71 (MIN: 2.23 / MAX: 18), d: 3.71 (MIN: 2.38 / MAX: 15.94). SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better): a: 3224.64, b: 3241.35, c: 3228.19, d: 3230.25. SE +/- 3.42, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): a: 0.53 (MIN: 0.3 / MAX: 12.95), b: 0.53 (MIN: 0.3 / MAX: 12.53), c: 0.53 (MIN: 0.3 / MAX: 13.83), d: 0.53 (MIN: 0.31 / MAX: 12.66). SE +/- 0.00, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): a: 44518.20, b: 44461.02, c: 44306.10, d: 44596.35. SE +/- 27.92, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): a: 19.94 (MIN: 8.87 / MAX: 42.42), b: 20.04 (MIN: 12.81 / MAX: 37.51), c: 20.21 (MIN: 9.28 / MAX: 49.09), d: 19.84 (MIN: 11.38 / MAX: 34.24). SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better): a: 600.94, b: 598.03, c: 592.96, d: 603.96. SE +/- 0.39, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): a: 14.52 (MIN: 8.14 / MAX: 28.07), b: 14.49 (MIN: 8.36 / MAX: 28.02), c: 14.56 (MIN: 8.28 / MAX: 27.62), d: 14.51 (MIN: 8.58 / MAX: 29.8). SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better): a: 1652.05, b: 1654.72, c: 1647.13, d: 1652.91. SE +/- 0.87, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better): a: 5.46 (MIN: 2.82 / MAX: 21.64), b: 5.45 (MIN: 2.89 / MAX: 21.85), c: 5.45 (MIN: 2.8 / MAX: 22.36), d: 5.45 (MIN: 2.8 / MAX: 28.34). SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better): a: 2192.09, b: 2197.50, c: 2197.98, d: 2198.11. SE +/- 4.26, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): a: 0.97 (MIN: 0.66 / MAX: 15.15), b: 0.97 (MIN: 0.57 / MAX: 14.08), c: 0.98 (MIN: 0.53 / MAX: 16.95), d: 0.97 (MIN: 0.55 / MAX: 13.67). SE +/- 0.00, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): a: 24478.31, b: 24434.51, c: 24329.21, d: 24496.84. SE +/- 23.00, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): a: 16.69 (MIN: 13.14 / MAX: 33.56), b: 16.65 (MIN: 10 / MAX: 33.75), c: 16.76 (MIN: 8.74 / MAX: 34.17), d: 16.72 (MIN: 9.04 / MAX: 25.27). SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better): a: 718.20, b: 719.93, c: 715.28, d: 716.84. SE +/- 0.61, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
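The OpenVINO results above report both a latency (ms) and a throughput (FPS) for each model. As a rough cross-check, Little's law (requests in flight = throughput x latency) can be applied to a few run "a" pairs. This is only a sketch: it assumes the reported latency is a mean per-request latency measured in steady state, and the helper name `implied_concurrency` is hypothetical, not part of OpenVINO or the Phoronix Test Suite.

```python
# Little's law sketch: requests in flight ~= throughput (req/s) * latency (s).
# (latency_ms, fps) pairs are the run "a" values from the OpenVINO results above.
results_a = {
    "Person Vehicle Bike Detection FP16": (13.38, 895.25),
    "Road Segmentation ADAS FP16-INT8": (27.99, 428.43),
    "Face Detection Retail FP16-INT8": (3.71, 3224.64),
    "Age Gender Recognition Retail 0013 FP16-INT8": (0.53, 44518.20),
}


def implied_concurrency(latency_ms: float, fps: float) -> float:
    """Estimate the number of concurrent requests (assumes mean latency)."""
    return fps * latency_ms / 1000.0


for model, (latency_ms, fps) in results_a.items():
    estimate = implied_concurrency(latency_ms, fps)
    print(f"{model}: ~{estimate:.1f} requests in flight")
```

Under these assumptions the first three models come out near 12 requests in flight and the Age Gender model near 24, which would be plausible on a 24-core CPU but is not confirmed by the export itself.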
RocksDB 9.0 - Test: Update Random (Op/s, More Is Better): a: 671121, b: 666715, c: 660978, d: 658520. SE +/- 1458.06, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Overwrite (Op/s, More Is Better): a: 781166, b: 779236, c: 777804, d: 762175. SE +/- 4178.44, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Read Random Write Random (Op/s, More Is Better): a: 2879787, b: 2859339, c: 2868681, d: 2814513. SE +/- 5471.11, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Random Fill (Op/s, More Is Better): a: 790603, b: 783967, c: 774384, d: 777115. SE +/- 1329.24, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better): a: 5546801, b: 5586539, c: 5592760, d: 5633613. SE +/- 38492.46, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Random Read (Op/s, More Is Better): a: 145962891, b: 145893753, c: 144921763, d: 145516874. SE +/- 374266.35, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
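Since runs a to d were taken on the same system, it can help to see how far the four RocksDB results above differ from one another. The sketch below computes the spread, (max - min) / mean, as a percentage. The function name `spread_percent` and the choice of this particular spread metric are assumptions for illustration; the export itself only reports a standard error.

```python
from statistics import mean

# Op/s for runs a, b, c, d, copied from the RocksDB results above.
runs = {
    "Update Random": [671121, 666715, 660978, 658520],
    "Overwrite": [781166, 779236, 777804, 762175],
    "Read Random Write Random": [2879787, 2859339, 2868681, 2814513],
    "Random Fill": [790603, 783967, 774384, 777115],
    "Read While Writing": [5546801, 5586539, 5592760, 5633613],
    "Random Read": [145962891, 145893753, 144921763, 145516874],
}


def spread_percent(values):
    """Range of the values as a percentage of their mean."""
    return (max(values) - min(values)) / mean(values) * 100.0


for test, values in runs.items():
    print(f"{test}: spread {spread_percent(values):.2f}% across runs a-d")
```

A spread of a few percent between nominally identical runs gives a feel for the noise floor to keep in mind when comparing against other systems.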
OSPRay Studio 1.0 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better): a: 47495, b: 47827, c: 47686, d: 47597. SE +/- 129.36, N = 3.
Timed Linux Kernel Compilation 6.8 - Build: defconfig (Seconds, Fewer Is Better): a: 54.13, b: 54.21, c: 52.81, d: 54.16. SE +/- 0.59, N = 3.
RocksDB 9.0 - Test: Sequential Fill (Op/s, More Is Better): a: 920430, b: 908177, c: 908882, d: 919148. SE +/- 4090.17, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OSPRay Studio 1.0 - Camera: 2 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better): a: 41839, b: 41598, c: 42050, d: 41959. SE +/- 250.02, N = 3.
OSPRay Studio 1.0 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better): a: 41167, b: 41125, c: 41189, d: 41111. SE +/- 58.89, N = 3.
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 165.40, b: 165.55, c: 165.47, d: 165.61. SE +/- 0.11, N = 3.
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 6.0448, b: 6.0393, c: 6.0425, d: 6.0373. SE +/- 0.0038, N = 3.
JPEG-XL libjxl 0.10.1 - Input: PNG - Quality: 80 (MP/s, More Is Better): a: 44.75, b: 43.25, c: 39.90, d: 42.18. SE +/- 0.29, N = 3. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 5.1435, b: 5.2115, c: 5.1787, d: 5.1408. SE +/- 0.0247, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 194.29, b: 191.76, c: 192.98, d: 194.38. SE +/- 0.92, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 17.51, b: 17.49, c: 17.52, d: 17.48. SE +/- 0.01, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 684.57, b: 685.45, c: 684.02, d: 685.82. SE +/- 0.35, N = 3.
JPEG-XL libjxl 0.10.1 - Input: JPEG - Quality: 80 (MP/s, More Is Better): a: 46.58, b: 42.66, c: 42.35, d: 43.50. SE +/- 0.32, N = 3. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 35.85, b: 35.54, c: 35.81, d: 35.81. SE +/- 0.06, N = 3.
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 334.39, b: 337.45, c: 334.75, d: 334.72. SE +/- 0.53, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 448.71, b: 447.65, c: 449.12, d: 448.93. SE +/- 0.38, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 26.69, b: 26.60, c: 26.62, d: 26.69. SE +/- 0.04, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 446.84, b: 446.24, c: 447.58, d: 447.17. SE +/- 0.66, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 26.74, b: 26.86, c: 26.70, d: 26.81. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 54.42, b: 54.01, c: 54.33, d: 54.34. SE +/- 0.05, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 18.37, b: 18.51, c: 18.40, d: 18.40. SE +/- 0.02, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 54.07, b: 53.97, c: 54.08, d: 54.10. SE +/- 0.02, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 18.49, b: 18.52, c: 18.49, d: 18.48. SE +/- 0.01, N = 3.
VVenC 1.11 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): a: 14.57, b: 14.82, c: 14.78, d: 14.86. SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 393.66, b: 393.08, c: 393.82, d: 393.39. SE +/- 0.32, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 30.47, b: 30.49, c: 30.39, d: 30.49. SE +/- 0.09, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 46.06, b: 45.96, c: 46.08, d: 46.02. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 21.70, b: 21.75, c: 21.69, d: 21.72. SE +/- 0.01, N = 3.
JPEG-XL Decoding libjxl 0.10.1 - CPU Threads: 1 (MP/s, More Is Better): a: 64.03, b: 63.30, c: 63.04, d: 62.82. SE +/- 0.12, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 53.48, b: 53.47, c: 53.76, d: 53.63. SE +/- 0.05, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 224.16, b: 224.20, c: 223.00, d: 223.55. SE +/- 0.20, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 8.8921, b: 8.9317, c: 8.9189, d: 8.9625. SE +/- 0.0057, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 112.33, b: 111.84, c: 112.00, d: 111.45. SE +/- 0.07, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 6.0030, b: 5.8898, c: 5.9888, d: 5.9417. SE +/- 0.0186, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 1994.39, b: 2032.96, c: 1998.61, d: 2014.58. SE +/- 6.40, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 79.45, b: 78.82, c: 79.25, d: 79.51. SE +/- 0.16, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 150.83, b: 151.98, c: 151.26, d: 150.73. SE +/- 0.31, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 10.06, b: 10.07, c: 10.12, d: 10.09. SE +/- 0.01, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 99.33, b: 99.22, c: 98.80, d: 99.05. SE +/- 0.05, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 39.05, b: 39.05, c: 39.07, d: 39.06. SE +/- 0.00, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 307.00, b: 307.02, c: 306.92, d: 306.94. SE +/- 0.00, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a: 39.05, b: 39.03, c: 39.06, d: 39.07. SE +/- 0.00, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a: 306.97, b: 307.09, c: 306.95, d: 306.90. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 6.3927, b: 6.4286, c: 6.4155, d: 6.4001. SE +/- 0.0148, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 156.21, b: 155.34, c: 155.68, d: 156.04. SE +/- 0.37, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 6.3846, b: 6.4293, c: 6.4103, d: 6.4059. SE +/- 0.0061, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 156.43, b: 155.31, c: 155.79, d: 155.90. SE +/- 0.15, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 1.2491, b: 1.2468, c: 1.2478, d: 1.2559. SE +/- 0.0030, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 798.47, b: 800.05, c: 799.42, d: 794.39. SE +/- 1.90, N = 3.
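For the DeepSparse Synchronous Single-Stream results above, the two reported metrics should be near reciprocals of each other if one batch holds one item, since only one request is in flight at a time. The sketch below checks this for a few run "a" pairs. The batch size of 1 is an assumption not stated in the export, and the helper names are hypothetical.

```python
# (ms/batch, items/sec) for run "a", Synchronous Single-Stream scenario,
# copied from the DeepSparse results above.
pairs_run_a = {
    "Llama2 Chat 7b Quantized": (165.40, 6.0448),
    "BERT base uncased SST2, Sparse INT8": (5.1435, 194.29),
    "oBERT base uncased on IMDB": (54.42, 18.37),
    "DistilBERT mnli": (8.8921, 112.33),
    "ResNet-50, Sparse INT8": (1.2491, 798.47),
}


def implied_items_per_sec(ms_per_batch: float) -> float:
    """Throughput implied by latency, assuming batch size 1 (an assumption)."""
    return 1000.0 / ms_per_batch


def relative_gap(ms_per_batch: float, reported_items_per_sec: float) -> float:
    """Relative difference between implied and reported throughput."""
    implied = implied_items_per_sec(ms_per_batch)
    return abs(implied - reported_items_per_sec) / reported_items_per_sec


for model, (ms, ips) in pairs_run_a.items():
    print(f"{model}: implied {implied_items_per_sec(ms):.2f} items/sec, "
          f"reported {ips}, gap {relative_gap(ms, ips) * 100:.2f}%")
```

The same check does not apply to the Asynchronous Multi-Stream results, where several requests overlap and throughput exceeds 1000 / latency.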
VVenC Video Input: Bosphorus 1080p - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 1080p - Video Preset: Fast a b c d 5 10 15 20 25 SE +/- 0.03, N = 3 19.62 19.49 19.46 19.53 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 4K a b c d 2 4 6 8 10 SE +/- 0.005, N = 3 6.647 6.787 6.711 6.660 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN Harness: Deconvolution Batch shapes_1d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_1d - Engine: CPU a b c d 1.2399 2.4798 3.7197 4.9596 6.1995 SE +/- 0.03672, N = 3 5.51082 5.33929 5.37367 5.03641 MIN: 3.9 MIN: 3.86 MIN: 3.84 MIN: 3.91 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
JPEG-XL Decoding libjxl CPU Threads: All OpenBenchmarking.org MP/s, More Is Better JPEG-XL Decoding libjxl 0.10.1 CPU Threads: All a b c d 100 200 300 400 500 SE +/- 1.28, N = 3 483.42 482.58 470.16 439.61
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PUSCH Processor Benchmark, Throughput Total a b c d 400 800 1200 1600 2000 SE +/- 1.62, N = 3 1888.2 1889.2 1886.9 1857.0 MIN: 1136.3 MIN: 1137.6 MIN: 1135 / MAX: 1889.2 MIN: 1145.4 1. (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -O3 -fno-trapping-math -fno-math-errno -ldl
JPEG-XL libjxl Input: PNG - Quality: 100 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 100 a b c d 7 14 21 28 35 SE +/- 0.01, N = 3 27.65 27.57 27.49 27.34 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl Input: JPEG - Quality: 100 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: JPEG - Quality: 100 a b c d 6 12 18 24 30 SE +/- 0.03, N = 3 27.53 27.32 27.32 27.32 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
VVenC Video Input: Bosphorus 1080p - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 1080p - Video Preset: Faster a b c d 9 18 27 36 45 SE +/- 0.15, N = 3 41.28 40.46 40.92 41.00 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
oneDNN Harness: IP Shapes 1D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 1D - Engine: CPU a b c d 0.2987 0.5974 0.8961 1.1948 1.4935 SE +/- 0.00673, N = 3 1.31474 1.31101 1.32743 1.31333 MIN: 1.27 MIN: 1.27 MIN: 1.26 MIN: 1.27 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PUSCH Processor Benchmark, Throughput Thread a b c d 40 80 120 160 200 SE +/- 0.70, N = 3 177.6 178.7 178.6 177.9 MIN: 113.3 MIN: 112.2 MIN: 112.6 / MAX: 179.4 MIN: 110.9 1. (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -O3 -fno-trapping-math -fno-math-errno -ldl
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 62.95, b = 63.89, c = 62.50, d = 62.93. SE +/- 0.26, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
srsRAN Project 23.10.1-20240219, Test: PDSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): a = 11710.3, b = 12063.4, c = 11723.5, d = 11654.9. SE +/- 27.21, N = 3. (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -O3 -fno-trapping-math -fno-math-errno -ldl
SVT-AV1 2.0, Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 19.21, b = 18.92, c = 19.17, d = 19.32. SE +/- 0.04, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.4, Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): a = 3.52569 (MIN: 3.48), b = 3.52584 (MIN: 3.48), c = 3.53323 (MIN: 3.47), d = 3.53851 (MIN: 3.49). SE +/- 0.00386, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Google Draco 1.5.6, Model: Church Facade (ms, Fewer Is Better): a = 7023, b = 7034, c = 7092, d = 7096. SE +/- 9.91, N = 3. (CXX) g++ options: -O3
Parallel BZIP2 Compression 1.1.13, FreeBSD-13.0-RELEASE-amd64-memstick.img Compression (Seconds, Fewer Is Better): a = 3.230305, b = 3.314749, c = 3.285082, d = 3.221993. SE +/- 0.043942, N = 12. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
Google Draco 1.5.6, Model: Lion (ms, Fewer Is Better): a = 5328, b = 5365, c = 5363, d = 5347. SE +/- 15.72, N = 3. (CXX) g++ options: -O3
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 131.72, b = 131.93, c = 133.27, d = 133.78. SE +/- 0.72, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.4, Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): a = 2.73804 (MIN: 2.66), b = 2.71135 (MIN: 2.65), c = 2.73192 (MIN: 2.66), d = 2.73352 (MIN: 2.67). SE +/- 0.00421, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Primesieve 12.1, Length: 1e12 (Seconds, Fewer Is Better): a = 6.244, b = 6.081, c = 6.113, d = 6.123. SE +/- 0.056, N = 3. (CXX) g++ options: -O3
SVT-AV1 2.0, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 151.96, b = 145.97, c = 150.52, d = 152.47. SE +/- 1.34, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0, Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 154.17, b = 150.81, c = 150.13, d = 151.64. SE +/- 0.99, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
WavPack Audio Encoding 5.7, WAV To WavPack (Seconds, Fewer Is Better): a = 4.433, b = 4.438, c = 4.435, d = 4.442. SE +/- 0.000, N = 5.
srsRAN Project 23.10.1-20240219, Test: PDSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): a = 603.7, b = 600.2, c = 605.5, d = 607.8. SE +/- 1.71, N = 3. (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -O3 -fno-trapping-math -fno-math-errno -ldl
oneDNN 3.4, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better): a = 2.36511 (MIN: 2.28), b = 2.34602 (MIN: 2.28), c = 2.36138 (MIN: 2.25), d = 2.34346 (MIN: 2.25). SE +/- 0.00819, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SVT-AV1 2.0, Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 489.43, b = 499.14, c = 487.66, d = 488.52. SE +/- 6.37, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0, Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 587.48, b = 602.97, c = 606.36, d = 587.97. SE +/- 4.87, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
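Most of the results above differ between runs by only a percent or two, so the raw numbers are hard to compare by eye. The following is a minimal Python sketch, not part of the Phoronix Test Suite, that expresses each run as a percentage behind the best run of a test. It assumes the four values of each result are listed in run order a, b, c, d; the function name `gap_to_best_percent` and the dictionary layout are hypothetical and chosen only for illustration. The two sample results are the JPEG-XL Decoding (More Is Better) and Primesieve (Fewer Is Better) figures reported above.

```python
def gap_to_best_percent(values, higher_is_better=True):
    """Return how far each run trails the best run, in percent of the best value."""
    if higher_is_better:
        best = max(values.values())
        return {run: (best - v) / best * 100.0 for run, v in values.items()}
    best = min(values.values())
    return {run: (v - best) / best * 100.0 for run, v in values.items()}


# Values copied from the results above (assumed order: a, b, c, d).
jxl_decode = {"a": 483.42, "b": 482.58, "c": 470.16, "d": 439.61}  # MP/s, More Is Better
primesieve = {"a": 6.244, "b": 6.081, "c": 6.113, "d": 6.123}      # Seconds, Fewer Is Better

if __name__ == "__main__":
    for name, vals, hib in [
        ("JPEG-XL Decoding", jxl_decode, True),
        ("Primesieve 1e12", primesieve, False),
    ]:
        gaps = gap_to_best_percent(vals, higher_is_better=hib)
        print(name, {run: round(gap, 2) for run, gap in gaps.items()})
```

Under these assumptions, run d trails the best JPEG-XL decoding result by roughly 9 percent, which is the largest gap in this section. Because each test reports a single standard error, small gaps should be read against that SE before being treated as a real difference.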
Phoronix Test Suite v10.8.5