svt deepsparse
AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon Pro W7900 45GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403159-PTS-SVTDEEPS65&sro&gru.
Test configurations a, b, c, d:
Processor: AMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads)
Motherboard: System76 Thelio Major (FA Z5 BIOS)
Chipset: AMD Device 14a4
Memory: 4 x 32GB DRAM-4800MT/s Micron MTC20F1045S1RC48BA2
Disk: 1000GB CT1000T700SSD5
Graphics: AMD Radeon Pro W7900 45GB (1760/1124MHz)
Audio: AMD Device 14cc
Monitor: DELL P2415Q
Network: Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.10
Kernel: 6.5.0-25-generic (x86_64)
Desktop: GNOME Shell 45.2
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.2.1-1ubuntu3.1 (LLVM 15.0.7 DRM 3.54)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance); CPU Microcode: 0xa108105
Python Details: Python 3.11.6
Security Details:
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Not affected
  spec_rstack_overflow: Mitigation of Safe RET
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
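The "Security Details" entries above mirror the per-vulnerability status strings that the Linux kernel exposes under sysfs. A minimal Python sketch for reading them on a comparable system (the sysfs path is standard on modern kernels; this is illustrative only and not part of the Phoronix Test Suite tooling):

```python
# Print the same CPU vulnerability status strings summarized in Security Details.
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")

for entry in sorted(VULN_DIR.iterdir()):
    # Each file holds one line, e.g. "Not affected" or a mitigation description.
    print(f"{entry.name}: {entry.read_text().strip()}")
```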
Results Summary (values listed as a / b / c / d)

SVT-AV1 2.0 (Frames Per Second, More Is Better):
  Preset 4 - Bosphorus 4K: 9.892 / 9.793 / 9.889 / 9.866
  Preset 8 - Bosphorus 4K: 94.509 / 93.734 / 93.930 / 94.559
  Preset 12 - Bosphorus 4K: 188.558 / 195.907 / 194.065 / 195.858
  Preset 13 - Bosphorus 4K: 192.567 / 193.286 / 192.638 / 187.625
  Preset 4 - Bosphorus 1080p: 27.476 / 27.282 / 27.518 / 27.241
  Preset 8 - Bosphorus 1080p: 193.759 / 194.769 / 189.608 / 195.612
  Preset 12 - Bosphorus 1080p: 616.583 / 615.540 / 625.817 / 621.442
  Preset 13 - Bosphorus 1080p: 662.493 / 653.688 / 671.832 / 670.138

Neural Magic DeepSparse 1.7 throughput (items/sec, More Is Better):
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream: 21.8474 / 21.8225 / 21.8606 / 21.7950
  NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream: 17.2332 / 17.2332 / 17.2329 / 17.2130
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream: 887.4442 / 886.1847 / 884.6567 / 886.7758
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream: 252.3577 / 251.6102 / 251.2477 / 253.0264
  ResNet-50, Baseline - Asynchronous Multi-Stream: 274.2576 / 274.4692 / 274.4696 / 274.4604
  ResNet-50, Baseline - Synchronous Single-Stream: 170.3090 / 171.4934 / 171.4781 / 171.3561
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream: 2253.8602 / 2269.5068 / 2254.2228 / 2236.9169
  ResNet-50, Sparse INT8 - Synchronous Single-Stream: 1134.7704 / 1131.6373 / 1133.4956 / 1132.3151
  Llama2 Chat 7b Quantized - Asynchronous Multi-Stream: 3.4391 / 3.4406 / 3.4298 / 3.4284
  Llama2 Chat 7b Quantized - Synchronous Single-Stream: 11.1249 / 11.0159 / 11.0814 / 11.0145
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream: 274.5853 / 274.3872 / 274.2789 / 274.5981
  CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream: 170.9594 / 171.3099 / 171.2808 / 171.3647
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream: 129.4408 / 128.9560 / 129.1012 / 129.4552
  CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream: 87.3635 / 87.5496 / 87.6317 / 87.6213
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream: 188.3255 / 188.1505 / 188.2128 / 188.4415
  NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream: 113.4701 / 114.4072 / 113.8761 / 114.8252
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream: 39.5729 / 39.4685 / 39.4664 / 39.4525
  CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream: 29.7744 / 29.8179 / 29.7891 / 29.7882
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream: 396.5768 / 396.4446 / 397.5870 / 397.5652
  BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream: 89.7091 / 90.4983 / 89.8053 / 90.3875
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream: 21.9224 / 21.9027 / 21.9041 / 21.9527
  NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream: 17.2184 / 17.1960 / 17.1825 / 17.1981

Neural Magic DeepSparse 1.7 latency (ms/batch, Fewer Is Better):
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream: 365.3355 / 365.9832 / 365.5771 / 366.6823
  NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream: 58.0191 / 58.0188 / 58.0207 / 58.0879
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream: 8.9970 / 9.0094 / 9.0252 / 9.0040
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream: 3.9594 / 3.9707 / 3.9766 / 3.9490
  ResNet-50, Baseline - Asynchronous Multi-Stream: 29.1453 / 29.1215 / 29.1230 / 29.1232
  ResNet-50, Baseline - Synchronous Single-Stream: 5.8648 / 5.8243 / 5.8245 / 5.8285
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream: 3.5332 / 3.5087 / 3.5324 / 3.5604
  ResNet-50, Sparse INT8 - Synchronous Single-Stream: 0.8788 / 0.8810 / 0.8796 / 0.8806
  Llama2 Chat 7b Quantized - Asynchronous Multi-Stream: 2252.4627 / 2251.6125 / 2258.1407 / 2259.2648
  Llama2 Chat 7b Quantized - Synchronous Single-Stream: 89.8704 / 90.7521 / 90.2253 / 90.7667
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream: 29.1108 / 29.1324 / 29.1435 / 29.1083
  CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream: 5.8423 / 5.8302 / 5.8312 / 5.8281
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream: 61.7698 / 61.9978 / 61.9371 / 61.7579
  CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream: 11.4406 / 11.4159 / 11.4055 / 11.4067
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream: 42.4509 / 42.4853 / 42.4756 / 42.4217
  NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream: 8.8070 / 8.7347 / 8.7752 / 8.7026
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream: 202.0153 / 202.5056 / 202.3677 / 202.4630
  CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream: 33.5681 / 33.5194 / 33.5525 / 33.5532
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream: 20.1516 / 20.1585 / 20.1007 / 20.1014
  BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream: 11.1350 / 11.0383 / 11.1223 / 11.0513
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream: 363.5336 / 364.4283 / 363.7623 / 363.5857
  NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream: 58.0689 / 58.1442 / 58.1916 / 58.1383
SVT-AV1 2.0 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  a: 9.892 (SE +/- 0.009, N = 3); b: 9.793 (SE +/- 0.014, N = 3); c: 9.889 (SE +/- 0.028, N = 3); d: 9.866 (SE +/- 0.034, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
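Each per-test block in this report gives a mean value per configuration along with "SE +/- x, N = y". A minimal Python sketch of how such a figure can be derived from raw per-run results, assuming SE here denotes the standard error of the mean over the N runs; the three run values below are hypothetical, not taken from this result file:

```python
# Derive a "value (SE +/- x, N = 3)" line from raw per-run results.
import statistics

runs = [9.884, 9.892, 9.900]             # hypothetical per-run FPS values
n = len(runs)
mean = statistics.mean(runs)
se = statistics.stdev(runs) / n ** 0.5   # sample standard deviation / sqrt(N)
print(f"{mean:.3f} (SE +/- {se:.3f}, N = {n})")
```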
SVT-AV1 2.0 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  a: 94.51 (SE +/- 0.33, N = 3); b: 93.73 (SE +/- 0.31, N = 3); c: 93.93 (SE +/- 0.20, N = 3); d: 94.56 (SE +/- 0.60, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  a: 188.56 (SE +/- 2.09, N = 5); b: 195.91 (SE +/- 2.54, N = 3); c: 194.07 (SE +/- 1.67, N = 3); d: 195.86 (SE +/- 2.35, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  a: 192.57 (SE +/- 2.20, N = 3); b: 193.29 (SE +/- 2.02, N = 15); c: 192.64 (SE +/- 2.22, N = 3); d: 187.63 (SE +/- 1.36, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0 - Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  a: 27.48 (SE +/- 0.22, N = 3); b: 27.28 (SE +/- 0.10, N = 3); c: 27.52 (SE +/- 0.16, N = 3); d: 27.24 (SE +/- 0.12, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  a: 193.76 (SE +/- 2.31, N = 3); b: 194.77 (SE +/- 1.86, N = 3); c: 189.61 (SE +/- 0.78, N = 3); d: 195.61 (SE +/- 1.92, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0 - Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  a: 616.58 (SE +/- 8.04, N = 3); b: 615.54 (SE +/- 5.36, N = 3); c: 625.82 (SE +/- 4.75, N = 3); d: 621.44 (SE +/- 4.99, N = 3)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.0 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  a: 662.49 (SE +/- 3.32, N = 3); b: 653.69 (SE +/- 1.76, N = 3); c: 671.83 (SE +/- 5.18, N = 15); d: 670.14 (SE +/- 6.00, N = 15)
  (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
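The SVT-AV1 numbers above come from the Phoronix Test Suite's svt-av1 profile. As a rough point of reference, a single preset/input pair can be encoded directly with the reference encoder; a minimal Python sketch, assuming SvtAv1EncApp (SVT-AV1 2.0) is on PATH and noting that the clip and output file names below are hypothetical:

```python
# Run one SVT-AV1 encode comparable to the "Preset 8 - Bosphorus 4K" case above.
import subprocess

cmd = [
    "SvtAv1EncApp",
    "--preset", "8",                   # encoder preset, as in "Preset 8"
    "-i", "Bosphorus_3840x2160.y4m",   # hypothetical path to the 4K source clip
    "-b", "bosphorus_p8.ivf",          # output AV1 bitstream
]
subprocess.run(cmd, check=True)        # the encoder prints its speed summary when it finishes
```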
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 21.85 (SE +/- 0.08, N = 3); b: 21.82 (SE +/- 0.02, N = 3); c: 21.86 (SE +/- 0.04, N = 3); d: 21.80 (SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 17.23 (SE +/- 0.01, N = 3); b: 17.23 (SE +/- 0.02, N = 3); c: 17.23 (SE +/- 0.01, N = 3); d: 17.21 (SE +/- 0.02, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 887.44 (SE +/- 0.68, N = 3); b: 886.18 (SE +/- 0.69, N = 3); c: 884.66 (SE +/- 0.15, N = 3); d: 886.78 (SE +/- 0.73, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 252.36 (SE +/- 1.31, N = 3); b: 251.61 (SE +/- 0.19, N = 3); c: 251.25 (SE +/- 0.20, N = 3); d: 253.03 (SE +/- 1.88, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 274.26 (SE +/- 0.04, N = 3); b: 274.47 (SE +/- 0.15, N = 3); c: 274.47 (SE +/- 0.11, N = 3); d: 274.46 (SE +/- 0.20, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 170.31 (SE +/- 0.32, N = 3); b: 171.49 (SE +/- 0.01, N = 3); c: 171.48 (SE +/- 0.16, N = 3); d: 171.36 (SE +/- 0.19, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 2253.86 (SE +/- 6.75, N = 3); b: 2269.51 (SE +/- 8.66, N = 3); c: 2254.22 (SE +/- 9.51, N = 3); d: 2236.92 (SE +/- 15.99, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 1134.77 (SE +/- 2.80, N = 3); b: 1131.64 (SE +/- 7.00, N = 3); c: 1133.50 (SE +/- 3.06, N = 3); d: 1132.32 (SE +/- 5.41, N = 3)
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 3.4391 (SE +/- 0.0053, N = 3); b: 3.4406 (SE +/- 0.0006, N = 3); c: 3.4298 (SE +/- 0.0031, N = 3); d: 3.4284 (SE +/- 0.0032, N = 3)
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 11.12 (SE +/- 0.07, N = 3); b: 11.02 (SE +/- 0.00, N = 3); c: 11.08 (SE +/- 0.06, N = 3); d: 11.01 (SE +/- 0.00, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 274.59 (SE +/- 0.26, N = 3); b: 274.39 (SE +/- 0.04, N = 3); c: 274.28 (SE +/- 0.12, N = 3); d: 274.60 (SE +/- 0.36, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 170.96 (SE +/- 0.03, N = 3); b: 171.31 (SE +/- 0.40, N = 3); c: 171.28 (SE +/- 0.12, N = 3); d: 171.36 (SE +/- 0.12, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 129.44 (SE +/- 0.18, N = 3); b: 128.96 (SE +/- 0.15, N = 3); c: 129.10 (SE +/- 0.42, N = 3); d: 129.46 (SE +/- 0.18, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 87.36 (SE +/- 0.16, N = 3); b: 87.55 (SE +/- 0.07, N = 3); c: 87.63 (SE +/- 0.06, N = 3); d: 87.62 (SE +/- 0.08, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 188.33 (SE +/- 0.07, N = 3); b: 188.15 (SE +/- 0.02, N = 3); c: 188.21 (SE +/- 0.05, N = 3); d: 188.44 (SE +/- 0.00, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 113.47 (SE +/- 0.37, N = 3); b: 114.41 (SE +/- 0.52, N = 3); c: 113.88 (SE +/- 0.17, N = 3); d: 114.83 (SE +/- 0.34, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 39.57 (SE +/- 0.02, N = 3); b: 39.47 (SE +/- 0.04, N = 3); c: 39.47 (SE +/- 0.03, N = 3); d: 39.45 (SE +/- 0.04, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 29.77 (SE +/- 0.02, N = 3); b: 29.82 (SE +/- 0.01, N = 3); c: 29.79 (SE +/- 0.00, N = 3); d: 29.79 (SE +/- 0.02, N = 3)
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 396.58 (SE +/- 0.23, N = 3); b: 396.44 (SE +/- 0.87, N = 3); c: 397.59 (SE +/- 0.37, N = 3); d: 397.57 (SE +/- 0.27, N = 3)
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 89.71 (SE +/- 0.23, N = 3); b: 90.50 (SE +/- 0.29, N = 3); c: 89.81 (SE +/- 0.14, N = 3); d: 90.39 (SE +/- 0.05, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  a: 21.92 (SE +/- 0.02, N = 3); b: 21.90 (SE +/- 0.01, N = 3); c: 21.90 (SE +/- 0.01, N = 3); d: 21.95 (SE +/- 0.04, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  a: 17.22 (SE +/- 0.02, N = 3); b: 17.20 (SE +/- 0.01, N = 3); c: 17.18 (SE +/- 0.01, N = 3); d: 17.20 (SE +/- 0.01, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 365.34 (SE +/- 0.99, N = 3); b: 365.98 (SE +/- 0.38, N = 3); c: 365.58 (SE +/- 0.63, N = 3); d: 366.68 (SE +/- 0.31, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 58.02 (SE +/- 0.03, N = 3); b: 58.02 (SE +/- 0.05, N = 3); c: 58.02 (SE +/- 0.03, N = 3); d: 58.09 (SE +/- 0.07, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 8.9970 (SE +/- 0.0068, N = 3); b: 9.0094 (SE +/- 0.0071, N = 3); c: 9.0252 (SE +/- 0.0015, N = 3); d: 9.0040 (SE +/- 0.0073, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 3.9594 (SE +/- 0.0204, N = 3); b: 3.9707 (SE +/- 0.0031, N = 3); c: 3.9766 (SE +/- 0.0030, N = 3); d: 3.9490 (SE +/- 0.0293, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 29.15 (SE +/- 0.00, N = 3); b: 29.12 (SE +/- 0.02, N = 3); c: 29.12 (SE +/- 0.01, N = 3); d: 29.12 (SE +/- 0.02, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 5.8648 (SE +/- 0.0111, N = 3); b: 5.8243 (SE +/- 0.0004, N = 3); c: 5.8245 (SE +/- 0.0053, N = 3); d: 5.8285 (SE +/- 0.0066, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 3.5332 (SE +/- 0.0104, N = 3); b: 3.5087 (SE +/- 0.0132, N = 3); c: 3.5324 (SE +/- 0.0151, N = 3); d: 3.5604 (SE +/- 0.0258, N = 3)
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 0.8788 (SE +/- 0.0021, N = 3); b: 0.8810 (SE +/- 0.0054, N = 3); c: 0.8796 (SE +/- 0.0025, N = 3); d: 0.8806 (SE +/- 0.0042, N = 3)
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 2252.46 (SE +/- 3.18, N = 3); b: 2251.61 (SE +/- 0.34, N = 3); c: 2258.14 (SE +/- 1.96, N = 3); d: 2259.26 (SE +/- 1.96, N = 3)
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 89.87 (SE +/- 0.58, N = 3); b: 90.75 (SE +/- 0.01, N = 3); c: 90.23 (SE +/- 0.49, N = 3); d: 90.77 (SE +/- 0.02, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 29.11 (SE +/- 0.03, N = 3); b: 29.13 (SE +/- 0.00, N = 3); c: 29.14 (SE +/- 0.01, N = 3); d: 29.11 (SE +/- 0.04, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 5.8423 (SE +/- 0.0009, N = 3); b: 5.8302 (SE +/- 0.0134, N = 3); c: 5.8312 (SE +/- 0.0039, N = 3); d: 5.8281 (SE +/- 0.0043, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 61.77 (SE +/- 0.09, N = 3); b: 62.00 (SE +/- 0.07, N = 3); c: 61.94 (SE +/- 0.21, N = 3); d: 61.76 (SE +/- 0.07, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 11.44 (SE +/- 0.02, N = 3); b: 11.42 (SE +/- 0.01, N = 3); c: 11.41 (SE +/- 0.01, N = 3); d: 11.41 (SE +/- 0.01, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 42.45 (SE +/- 0.02, N = 3); b: 42.49 (SE +/- 0.01, N = 3); c: 42.48 (SE +/- 0.01, N = 3); d: 42.42 (SE +/- 0.00, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 8.8070 (SE +/- 0.0288, N = 3); b: 8.7347 (SE +/- 0.0397, N = 3); c: 8.7752 (SE +/- 0.0131, N = 3); d: 8.7026 (SE +/- 0.0260, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 202.02 (SE +/- 0.12, N = 3); b: 202.51 (SE +/- 0.18, N = 3); c: 202.37 (SE +/- 0.08, N = 3); d: 202.46 (SE +/- 0.10, N = 3)
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 33.57 (SE +/- 0.02, N = 3); b: 33.52 (SE +/- 0.01, N = 3); c: 33.55 (SE +/- 0.00, N = 3); d: 33.55 (SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 20.15 (SE +/- 0.01, N = 3); b: 20.16 (SE +/- 0.04, N = 3); c: 20.10 (SE +/- 0.02, N = 3); d: 20.10 (SE +/- 0.01, N = 3)
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 11.14 (SE +/- 0.03, N = 3); b: 11.04 (SE +/- 0.04, N = 3); c: 11.12 (SE +/- 0.02, N = 3); d: 11.05 (SE +/- 0.01, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  a: 363.53 (SE +/- 0.17, N = 3); b: 364.43 (SE +/- 0.43, N = 3); c: 363.76 (SE +/- 0.33, N = 3); d: 363.59 (SE +/- 1.03, N = 3)
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  a: 58.07 (SE +/- 0.06, N = 3); b: 58.14 (SE +/- 0.03, N = 3); c: 58.19 (SE +/- 0.05, N = 3); d: 58.14 (SE +/- 0.04, N = 3)
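The DeepSparse throughput (items/sec) and latency (ms/batch) figures above are produced by the Phoronix deepsparse test profile driving Neural Magic DeepSparse 1.7. A minimal sketch of a synchronous single-stream style measurement with DeepSparse's Python API, assuming deepsparse 1.7 is installed and a ResNet-50 ONNX model exists at the hypothetical path below; it only illustrates how the two metrics relate and is not the harness used for these results:

```python
# Time single-stream CPU inference and report items/sec and ms/batch.
import time
import numpy as np
from deepsparse import compile_model

MODEL_PATH = "resnet50.onnx"                      # hypothetical local ONNX file
engine = compile_model(MODEL_PATH, batch_size=1)  # compile the model for CPU inference

inputs = [np.random.rand(1, 3, 224, 224).astype(np.float32)]  # ImageNet-shaped input
latencies = []
for _ in range(100):
    start = time.perf_counter()
    engine.run(inputs)                            # one synchronous inference
    latencies.append(time.perf_counter() - start)

mean_s = sum(latencies) / len(latencies)
print(f"{1.0 / mean_s:.2f} items/sec, {mean_s * 1e3:.2f} ms/batch")
```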
Phoronix Test Suite v10.8.5