dddas AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1603 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2306242-NE-DDDAS565146&gru&sor .
dddas (runs: a, b)

Processor: AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH II EXTREME (1603 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: Samsung SSD 980 PRO 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 5.19.0-051900rc7-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.47)
Vulkan: 1.2.204
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830104d
Graphics Details - BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-D1820201-101
Python Details - Python 3.10.6
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
dddas stress-ng: Hash stress-ng: MMAP stress-ng: NUMA stress-ng: Pipe stress-ng: Poll stress-ng: Zlib stress-ng: Futex stress-ng: MEMFD stress-ng: Mutex stress-ng: Atomic stress-ng: Crypto stress-ng: Malloc stress-ng: Cloning stress-ng: Forking stress-ng: Pthread stress-ng: AVL Tree stress-ng: IO_uring stress-ng: SENDFILE stress-ng: CPU Cache stress-ng: CPU Stress stress-ng: Semaphores stress-ng: Matrix Math stress-ng: Vector Math stress-ng: Function Call stress-ng: Floating Point stress-ng: Matrix 3D Math stress-ng: Memory Copying stress-ng: Vector Shuffle stress-ng: Socket Activity stress-ng: Wide Vector Math stress-ng: Context Switching stress-ng: Fused Multiply-Add stress-ng: Vector Floating Point stress-ng: Glibc C String Functions stress-ng: Glibc Qsort Data Sorting stress-ng: System V Message Passing nekrs: Kershaw nekrs: TurboPipe Periodic dav1d: Chimera 1080p dav1d: Summer Nature 4K dav1d: Summer Nature 1080p dav1d: Chimera 1080p 10-bit xonotic: 1920 x 1080 - Low xonotic: 1920 x 1200 - Low xonotic: 2560 x 1440 - Low xonotic: 3840 x 2160 - Low xonotic: 1920 x 1080 - High xonotic: 1920 x 1200 - High xonotic: 2560 x 1440 - High xonotic: 3840 x 2160 - High xonotic: 1920 x 1080 - Ultra xonotic: 1920 x 1200 - Ultra xonotic: 2560 x 1440 - Ultra xonotic: 3840 x 2160 - Ultra xonotic: 1920 x 1080 - Ultimate xonotic: 1920 x 1200 - Ultimate xonotic: 2560 x 1440 - Ultimate xonotic: 3840 x 2160 - Ultimate embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer - Asian Dragon Obj embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj svt-av1: Preset 4 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 12 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 4 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 12 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p vvenc: Bosphorus 4K - Fast vvenc: Bosphorus 4K - Faster vvenc: Bosphorus 1080p - 
Fast vvenc: Bosphorus 1080p - Faster hpcg: 104 104 104 - 60 heffte: c2c - FFTW - double-long - 128 heffte: c2c - FFTW - double-long - 256 heffte: c2c - FFTW - double-long - 512 heffte: r2c - FFTW - double-long - 128 heffte: r2c - FFTW - double-long - 256 heffte: r2c - FFTW - double-long - 512 heffte: c2c - Stock - double-long - 128 heffte: c2c - Stock - double-long - 256 heffte: c2c - Stock - double-long - 512 heffte: r2c - Stock - double-long - 128 heffte: r2c - Stock - double-long - 256 heffte: r2c - Stock - double-long - 512 libxsmm: 128 libxsmm: 256 libxsmm: 32 libxsmm: 64 oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only oidn: RTLightmap.hdr.4096x4096 - CPU-Only ospray: particle_volume/ao/real_time ospray: particle_volume/scivis/real_time ospray: particle_volume/pathtracer/real_time ospray: gravity_spheres_volume/dim_512/ao/real_time ospray: gravity_spheres_volume/dim_512/scivis/real_time ospray: gravity_spheres_volume/dim_512/pathtracer/real_time deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP 
Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream laghos: Triple Point Problem laghos: Sedov Blast Wave, ube_922_hex.mesh leveldb: Fill Sync leveldb: Overwrite leveldb: Rand Fill leveldb: Seq Fill petsc: Streams palabos: 100 palabos: 400 palabos: 500 liquid-dsp: 1 - 256 - 32 liquid-dsp: 1 - 256 - 57 liquid-dsp: 2 - 256 - 32 liquid-dsp: 2 - 256 - 57 liquid-dsp: 4 - 256 - 32 liquid-dsp: 4 - 256 - 57 liquid-dsp: 8 - 256 - 32 liquid-dsp: 8 - 256 - 57 liquid-dsp: 1 - 256 - 512 liquid-dsp: 16 - 256 - 32 liquid-dsp: 16 - 256 - 57 liquid-dsp: 2 - 256 - 512 liquid-dsp: 32 - 256 - 32 liquid-dsp: 32 - 256 - 57 liquid-dsp: 4 - 256 - 512 liquid-dsp: 64 - 256 - 32 liquid-dsp: 64 - 256 - 57 liquid-dsp: 8 - 256 - 512 liquid-dsp: 16 - 256 - 512 liquid-dsp: 32 - 256 - 512 liquid-dsp: 64 - 256 - 512 kripke: leveldb: Hot Read leveldb: Fill Sync leveldb: Overwrite leveldb: Rand Fill leveldb: Rand Read leveldb: Seek Rand leveldb: Rand Delete leveldb: Seq Fill onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - 
CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream sqlite: 1 sqlite: 2 sqlite: 4 sqlite: 8 sqlite: 16 sqlite: 32 sqlite: 64 cp2k: H20-64 cp2k: Fayalite-FIST 
mocassin: Gas HII40 mocassin: Dust 2D tau100.0 remhos: Sample Remap Example z3: 1.smt2 z3: 2.smt2 encode-opus: WAV To Opus Encode espeak: Text-To-Speech Synthesis gpaw: Carbon Nanotube whisper-cpp: ggml-base.en - 2016 State of the Union whisper-cpp: ggml-small.en - 2016 State of the Union whisper-cpp: ggml-medium.en - 2016 State of the Union qmcpack: Li2_STO_ae qmcpack: simple-H2O qmcpack: FeCO6_b3lyp_gms qmcpack: FeCO6_b3lyp_gms a b 7627578.66 437.11 752.30 18809740.35 4084623.29 4517.78 4610857.40 395.11 18827346.28 480.06 78260.17 92853207.13 3354.40 51344.69 128353.64 283.41 439798.24 515575.47 1624118.54 82729.76 66510329.66 199178.68 224417.23 24278.34 11201.44 2806.09 10973.65 22825.44 3072.80 1501239.29 11409509.77 33507543.08 94803.76 33453867.32 942.22 10692419.88 2123046667 3444566667 398.39 222.52 597.02 374.79 671.4224194 671.9542910 673.1714914 670.0380428 561.4381904 561.0057748 560.9659015 467.6921769 518.6729034 521.4981114 520.7893156 420.7457662 386.9438277 384.7672496 384.0158631 311.3989499 38.4669 34.4087 41.5850 37.3900 39.3962 33.8311 3.756 54.148 126.423 127.136 10.868 85.308 308.249 360.924 5.44 10.931 13.875 24.877 10.9645 30.8024 13.7638 15.3464 56.4524 27.4329 27.7110 26.5517 13.8764 15.4082 51.8810 30.1443 30.0147 635.8 910.4 160.5 318.5 1.22 1.22 0.60 9.86893 9.74548 128.572 4.93554 4.62468 7.67668 28.8773 16.3453 255.9872 77.5785 86.1294 28.7279 148.9743 91.5663 323.0917 141.2991 235.8139 81.3373 33.0990 21.5675 119.0134 43.8828 28.6062 16.3383 220.46 264.34 0.6 27.0 26.9 27.8 58312.0964 121.931 139.299 143.850 45075000 51993333 89896333 103806667 178100000 206086667 354570000 409183333 10537667 690800000 795103333 20851667 1343200000 1506266667 41277667 2250733333 1836033333 82123667 160113333 313753333 506326667 148243333 43.137 10866.014 262.354 262.981 43.493 65.839 245.160 254.942 1.55099 4.26624 1.177229 0.948450 4.81893 5.69206 2.68566 5.76769 1.36630 1.57740 3252.18 976.467 3244.46 938.102 3235.41 935.611 552.2812 61.1727 
62.4431 12.8830 185.7387 34.8038 107.3698 10.9104 49.4917 7.0704 67.8249 12.2881 482.7469 46.3523 134.4071 22.7831 556.6090 61.1988 106.014 243.219 266.581 291.254 373.820 505.417 681.448 42.966 123.826 12.684 181.265 23.537 29.932 76.011 28.695 31.077 110.846 156.48335 395.70935 1018.28439 136.22 27.602 175.39 196.98 7624159.24 439.24 741.66 22201912.16 4101817.96 4518.88 4644715 394.5 18816044.49 480.51 78455.19 92812375.37 3360.52 51160.22 128387.57 282.42 440335.12 528847.31 1535034.64 82887.24 71041068.48 200423.6 224460.71 24275.23 11221.55 2795.85 10984.91 22200.86 9580.27 1496970.66 11620881.04 33539318.97 95693.93 33092079.25 943.84 10677047.79 2109640000 3441770000 398.2 222.24 597.19 374.13 669.8590201 676.0787756 687.4861314 675.8397046 563.7930829 567.9388648 567.977016 469.7486112 524.5527153 521.2002993 527.2114812 423.3297457 386.2753935 386.8054656 381.3165184 311.7695344 38.45 34.5263 41.782 37.4822 39.399 33.9593 3.721 54.556 127.675 127.68 10.845 85.496 305.73 364.28 5.387 10.892 13.767 24.801 11.0163 30.8579 13.79 15.3508 55.8515 27.2699 27.7303 26.8097 13.8536 15.4135 50.934 30.2863 30.036 635.4 907.4 160.7 318.7 1.22 1.23 0.60 9.89124 9.76771 128.663 4.98028 4.62434 7.70051 28.9177 16.4828 256.6381 78.4003 87.0186 29.8626 149.9629 91.9798 324.2731 146.1891 236.8228 83.4278 32.9066 21.4935 119.0879 43.0475 28.9881 16.4295 219.12 265.390123291 0.4 27 26.7 27.7 58276.7926 122.23 140.078 144.062 45023000 51814000 89686000 103260000 179170000 206140000 355110000 410090000 10560000 692130000 799490000 20989000 1350000000 1512100000 41475000 2269700000 1837000000 82224000 160130000 314560000 506050000 146215600 42.757 16348.371 262.283 264.868 43.547 64.562 245.182 255.488 1.30181 4.21682 1.08944 0.985098 4.83565 5.73181 2.69872 5.70574 1.43664 1.5719 3275.76 987.992 3227.02 958.843 3226.7 932.987 551.16 60.6621 62.3112 12.7473 183.8144 33.4783 106.6629 10.8616 49.319 6.833 67.5353 11.9799 485.9478 46.5115 134.323 23.2237 550.4325 60.8593 105 
237.187 262.624 284.865 374.33 502.766 680.811 42.112 122.975 12.598 180.727 23.654 29.898 76.122 28.83 31.488 110.949 151.04973 363.32431 1003.11362 132.76 27.484 174.82 191.15 OpenBenchmarking.org
Stress-NG 0.15.10 - Bogo Ops/s, More Is Better
Test: Hash - a: 7627578.66, b: 7624159.24; SE +/- 2470.62, N = 3
Test: MMAP - a: 437.11, b: 439.24; SE +/- 0.81, N = 3
Test: NUMA - a: 752.30, b: 741.66; SE +/- 5.01, N = 3
Test: Pipe - a: 18809740.35, b: 22201912.16; SE +/- 858971.94, N = 15
Test: Poll - a: 4084623.29, b: 4101817.96; SE +/- 1922.15, N = 3
Test: Zlib - a: 4517.78, b: 4518.88; SE +/- 2.96, N = 3
Test: Futex - a: 4610857.40, b: 4644715.00; SE +/- 56259.21, N = 4
Test: MEMFD - a: 395.11, b: 394.50; SE +/- 0.62, N = 3
Test: Mutex - a: 18827346.28, b: 18816044.49; SE +/- 22386.94, N = 3
Test: Atomic - a: 480.06, b: 480.51; SE +/- 0.46, N = 3
Test: Crypto - a: 78260.17, b: 78455.19; SE +/- 78.61, N = 3
Test: Malloc - a: 92853207.13, b: 92812375.37; SE +/- 44041.52, N = 3
Test: Cloning - a: 3354.40, b: 3360.52; SE +/- 2.74, N = 3
Test: Forking - a: 51344.69, b: 51160.22; SE +/- 291.28, N = 3
Test: Pthread - a: 128353.64, b: 128387.57; SE +/- 521.45, N = 3
Test: AVL Tree - a: 283.41, b: 282.42; SE +/- 0.22, N = 3
Test: IO_uring - a: 439798.24, b: 440335.12; SE +/- 726.45, N = 3
Test: SENDFILE - a: 515575.47, b: 528847.31; SE +/- 656.59, N = 3
Test: CPU Cache - a: 1624118.54, b: 1535034.64; SE +/- 19308.61, N = 3
Test: CPU Stress - a: 82729.76, b: 82887.24; SE +/- 76.38, N = 3
Test: Semaphores - a: 66510329.66, b: 71041068.48; SE +/- 919418.56, N = 3
Test: Matrix Math - a: 199178.68, b: 200423.60; SE +/- 476.06, N = 3
Test: Vector Math - a: 224417.23, b: 224460.71; SE +/- 22.96, N = 3
Test: Function Call - a: 24278.34, b: 24275.23; SE +/- 38.00, N = 3
Test: Floating Point - a: 11201.44, b: 11221.55; SE +/- 7.25, N = 3
Test: Matrix 3D Math - a: 2806.09, b: 2795.85; SE +/- 1.75, N = 3
Test: Memory Copying - a: 10973.65, b: 10984.91; SE +/- 6.50, N = 3
Test: Vector Shuffle - a: 22825.44, b: 22200.86; SE +/- 43.57, N = 3
Test: Socket Activity - a: 3072.80, b: 9580.27; SE +/- 1064.20, N = 15
Test: Wide Vector Math - a: 1501239.29, b: 1496970.66; SE +/- 3256.42, N = 3
Test: Context Switching - a: 11409509.77, b: 11620881.04; SE +/- 22031.57, N = 3
Test: Fused Multiply-Add - a: 33507543.08, b: 33539318.97; SE +/- 7490.49, N = 3
Test: Vector Floating Point - a: 94803.76, b: 95693.93; SE +/- 139.68, N = 3
Test: Glibc C String Functions - a: 33453867.32, b: 33092079.25; SE +/- 238790.03, N = 3
Test: Glibc Qsort Data Sorting - a: 942.22, b: 943.84; SE +/- 0.47, N = 3
Test: System V Message Passing - a: 10692419.88, b: 10677047.79; SE +/- 13654.50, N = 3
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
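Most Stress-NG results above differ by well under 1% between runs a and b, while a few (Pipe, Socket Activity, CPU Cache, Semaphores) differ by much more. A minimal Python sketch of one way to quantify that, using values copied from the results above: it computes the percent change of b relative to a and expresses the gap in multiples of the reported standard error. The helper names are illustrative, and the export does not state which run the SE belongs to, so treat the SE ratio as a rough noise indicator rather than a significance test.

```python
# Values copied from the Stress-NG results above (Bogo Ops/s, more is better).
# Each entry: test name -> (run a, run b, reported standard error).
results = {
    "Hash": (7627578.66, 7624159.24, 2470.62),
    "Pipe": (18809740.35, 22201912.16, 858971.94),
    "CPU Cache": (1624118.54, 1535034.64, 19308.61),
    "Semaphores": (66510329.66, 71041068.48, 919418.56),
    "Socket Activity": (3072.80, 9580.27, 1064.20),
}


def percent_change(a, b):
    """Relative change of run b versus run a, in percent."""
    return (b - a) / a * 100.0


def gap_in_se(a, b, se):
    """Absolute gap between the two runs, in multiples of the reported SE."""
    return abs(b - a) / se


for name, (a, b, se) in results.items():
    print(f"{name:16s} {percent_change(a, b):+8.2f}%  gap = {gap_in_se(a, b, se):6.2f} x SE")
```

Both Pipe and Socket Activity were reported with N = 15 rather than N = 3, and their standard errors are large relative to the mean, so their large a-to-b gaps should be read with that variance in mind.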
nekRS 23.0 - flops/rank, More Is Better
Input: Kershaw - a: 2123046667, b: 2109640000; SE +/- 3171604.92, N = 3
Input: TurboPipe Periodic - a: 3444566667, b: 3441770000; SE +/- 1942175.18, N = 3
1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi
dav1d 1.2.1 - FPS, More Is Better
Video Input: Chimera 1080p - a: 398.39, b: 398.20; SE +/- 0.11, N = 3
Video Input: Summer Nature 4K - a: 222.52, b: 222.24; SE +/- 0.24, N = 3
Video Input: Summer Nature 1080p - a: 597.02, b: 597.19; SE +/- 0.86, N = 3
Video Input: Chimera 1080p 10-bit - a: 374.79, b: 374.13; SE +/- 0.32, N = 3
1. (CC) gcc options: -pthread -lm
Xonotic 0.8.6 - Frames Per Second, More Is Better
Resolution: 1920 x 1080 - Effects Quality: Low - a: 671.42 (MIN: 430 / MAX: 1177), b: 669.86 (MIN: 439 / MAX: 1136); SE +/- 0.98, N = 3
Resolution: 1920 x 1200 - Effects Quality: Low - a: 671.95 (MIN: 431 / MAX: 1193), b: 676.08 (MIN: 427 / MAX: 1181); SE +/- 1.00, N = 3
Resolution: 2560 x 1440 - Effects Quality: Low - a: 673.17 (MIN: 426 / MAX: 1185), b: 687.49 (MIN: 439 / MAX: 1194); SE +/- 2.49, N = 3
Resolution: 3840 x 2160 - Effects Quality: Low - a: 670.04 (MIN: 387 / MAX: 1175), b: 675.84 (MIN: 413 / MAX: 1166); SE +/- 1.56, N = 3
Resolution: 1920 x 1080 - Effects Quality: High - a: 561.44 (MIN: 330 / MAX: 956), b: 563.79 (MIN: 337 / MAX: 945); SE +/- 1.66, N = 3
Resolution: 1920 x 1200 - Effects Quality: High - a: 561.01 (MIN: 341 / MAX: 967), b: 567.94 (MIN: 343 / MAX: 932); SE +/- 3.14, N = 3
Resolution: 2560 x 1440 - Effects Quality: High - a: 560.97 (MIN: 336 / MAX: 962), b: 567.98 (MIN: 347 / MAX: 923); SE +/- 0.81, N = 3
Resolution: 3840 x 2160 - Effects Quality: High - a: 467.69 (MIN: 222 / MAX: 635), b: 469.75 (MIN: 225 / MAX: 637); SE +/- 0.24, N = 3
Resolution: 1920 x 1080 - Effects Quality: Ultra - a: 518.67 (MIN: 259 / MAX: 910), b: 524.55 (MIN: 285 / MAX: 905); SE +/- 1.23, N = 3
Resolution: 1920 x 1200 - Effects Quality: Ultra - a: 521.50 (MIN: 282 / MAX: 935), b: 521.20 (MIN: 285 / MAX: 919); SE +/- 0.46, N = 3
Resolution: 2560 x 1440 - Effects Quality: Ultra - a: 520.79 (MIN: 272 / MAX: 931), b: 527.21 (MIN: 294 / MAX: 931); SE +/- 2.20, N = 3
Resolution: 3840 x 2160 - Effects Quality: Ultra - a: 420.75 (MIN: 194 / MAX: 579), b: 423.33 (MIN: 194 / MAX: 581); SE +/- 0.23, N = 3
Resolution: 1920 x 1080 - Effects Quality: Ultimate - a: 386.94 (MIN: 97 / MAX: 892), b: 386.28 (MIN: 101 / MAX: 871); SE +/- 2.17, N = 3
Resolution: 1920 x 1200 - Effects Quality: Ultimate - a: 384.77 (MIN: 102 / MAX: 919), b: 386.81 (MIN: 104 / MAX: 887); SE +/- 0.49, N = 3
Resolution: 2560 x 1440 - Effects Quality: Ultimate - a: 384.02 (MIN: 99 / MAX: 847), b: 381.32 (MIN: 106 / MAX: 824); SE +/- 1.62, N = 3
Resolution: 3840 x 2160 - Effects Quality: Ultimate - a: 311.40 (MIN: 97 / MAX: 487), b: 311.77 (MIN: 98 / MAX: 488); SE +/- 0.68, N = 3
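With sixteen Xonotic configurations, a single summary figure can be easier to read than sixteen pairs. One common choice for ratios is the geometric mean of the per-configuration b/a values. The Python sketch below is an illustration only: it uses the average FPS figures copied from the Xonotic results above, and the `geometric_mean` helper is defined here for the example rather than taken from the Phoronix Test Suite.

```python
import math

# Average FPS copied from the Xonotic 0.8.6 results above, ordered as
# Low, High, Ultra, Ultimate, each at 1920x1080, 1920x1200, 2560x1440, 3840x2160.
run_a = [671.42, 671.95, 673.17, 670.04,
         561.44, 561.01, 560.97, 467.69,
         518.67, 521.50, 520.79, 420.75,
         386.94, 384.77, 384.02, 311.40]
run_b = [669.86, 676.08, 687.49, 675.84,
         563.79, 567.94, 567.98, 469.75,
         524.55, 521.20, 527.21, 423.33,
         386.28, 386.81, 381.32, 311.77]


def geometric_mean(values):
    """Geometric mean computed via logarithms (all values must be positive)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-configuration speed ratio of run b over run a (> 1 means b was faster).
ratios = [b / a for a, b in zip(run_a, run_b)]
overall = geometric_mean(ratios)
print(f"geometric mean of b/a over {len(ratios)} configurations: {overall:.4f}")
```

The geometric mean is used here instead of the arithmetic mean because it treats a ratio and its inverse symmetrically, so the summary does not depend on which run is taken as the baseline.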
Embree 4.1 - Frames Per Second, More Is Better
Binary: Pathtracer - Model: Crown - a: 38.47 (MIN: 37.94 / MAX: 39.09), b: 38.45 (MIN: 38.09 / MAX: 38.98); SE +/- 0.08, N = 3
Binary: Pathtracer ISPC - Model: Crown - a: 34.41 (MIN: 33.95 / MAX: 35.09), b: 34.53 (MIN: 34.2 / MAX: 35.09); SE +/- 0.09, N = 3
Binary: Pathtracer - Model: Asian Dragon - a: 41.59 (MIN: 41.23 / MAX: 42.19), b: 41.78 (MIN: 41.55 / MAX: 42.41); SE +/- 0.05, N = 3
Binary: Pathtracer - Model: Asian Dragon Obj - a: 37.39 (MIN: 37.08 / MAX: 37.99), b: 37.48 (MIN: 37.25 / MAX: 38.16); SE +/- 0.05, N = 3
Binary: Pathtracer ISPC - Model: Asian Dragon - a: 39.40 (MIN: 39.15 / MAX: 40.06), b: 39.40 (MIN: 39.18 / MAX: 39.86); SE +/- 0.02, N = 3
Binary: Pathtracer ISPC - Model: Asian Dragon Obj - a: 33.83 (MIN: 33.53 / MAX: 34.4), b: 33.96 (MIN: 33.75 / MAX: 34.43); SE +/- 0.06, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 4 - Input: Bosphorus 4K - Frames Per Second, More Is Better - a: 3.756, b: 3.721 - SE +/- 0.010, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 8 - Input: Bosphorus 4K - Frames Per Second, More Is Better - b: 54.56, a: 54.15 - SE +/- 0.30, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 12 - Input: Bosphorus 4K - Frames Per Second, More Is Better - b: 127.68, a: 126.42 - SE +/- 1.42, N = 4
SVT-AV1 1.6 - Encoder Mode: Preset 13 - Input: Bosphorus 4K - Frames Per Second, More Is Better - b: 127.68, a: 127.14 - SE +/- 0.08, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 4 - Input: Bosphorus 1080p - Frames Per Second, More Is Better - a: 10.87, b: 10.85 - SE +/- 0.05, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p - Frames Per Second, More Is Better - b: 85.50, a: 85.31 - SE +/- 0.40, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 12 - Input: Bosphorus 1080p - Frames Per Second, More Is Better - a: 308.25, b: 305.73 - SE +/- 1.75, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p - Frames Per Second, More Is Better - b: 364.28, a: 360.92 - SE +/- 1.35, N = 3
SVT-AV1 1.6 - (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC 1.8 - Video Input: Bosphorus 4K - Video Preset: Fast - Frames Per Second, More Is Better - a: 5.440, b: 5.387 - SE +/- 0.015, N = 3
VVenC 1.8 - Video Input: Bosphorus 4K - Video Preset: Faster - Frames Per Second, More Is Better - a: 10.93, b: 10.89 - SE +/- 0.04, N = 3
VVenC 1.8 - Video Input: Bosphorus 1080p - Video Preset: Fast - Frames Per Second, More Is Better - a: 13.88, b: 13.77 - SE +/- 0.04, N = 3
VVenC 1.8 - Video Input: Bosphorus 1080p - Video Preset: Faster - Frames Per Second, More Is Better - a: 24.88, b: 24.80 - SE +/- 0.09, N = 3
VVenC 1.8 - (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
High Performance Conjugate Gradient 3.1 - X Y Z: 104 104 104 - RT: 60 - GFLOP/s, More Is Better - b: 11.02, a: 10.96 - SE +/- 0.02, N = 3 - (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 - GFLOP/s, More Is Better - b: 30.86, a: 30.80 - SE +/- 0.37, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 - GFLOP/s, More Is Better - b: 13.79, a: 13.76 - SE +/- 0.01, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 - GFLOP/s, More Is Better - b: 15.35, a: 15.35 - SE +/- 0.00, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 - GFLOP/s, More Is Better - a: 56.45, b: 55.85 - SE +/- 0.45, N = 15
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 - GFLOP/s, More Is Better - a: 27.43, b: 27.27 - SE +/- 0.06, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 - GFLOP/s, More Is Better - b: 27.73, a: 27.71 - SE +/- 0.01, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 - GFLOP/s, More Is Better - b: 26.81, a: 26.55 - SE +/- 0.17, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 - GFLOP/s, More Is Better - a: 13.88, b: 13.85 - SE +/- 0.01, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 - GFLOP/s, More Is Better - b: 15.41, a: 15.41 - SE +/- 0.01, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 - GFLOP/s, More Is Better - a: 51.88, b: 50.93 - SE +/- 0.72, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 - GFLOP/s, More Is Better - b: 30.29, a: 30.14 - SE +/- 0.11, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 - GFLOP/s, More Is Better - b: 30.04, a: 30.01 - SE +/- 0.04, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - (CXX) g++ options: -O3
libxsmm 2-1.17-3645 - M N K: 128 - GFLOPS/s, More Is Better - a: 635.8, b: 635.4 - SE +/- 0.22, N = 3
libxsmm 2-1.17-3645 - M N K: 256 - GFLOPS/s, More Is Better - a: 910.4, b: 907.4 - SE +/- 3.58, N = 3
libxsmm 2-1.17-3645 - M N K: 32 - GFLOPS/s, More Is Better - b: 160.7, a: 160.5 - SE +/- 0.07, N = 3
libxsmm 2-1.17-3645 - M N K: 64 - GFLOPS/s, More Is Better - b: 318.7, a: 318.5 - SE +/- 0.09, N = 3
libxsmm 2-1.17-3645 - (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Intel Open Image Denoise 2.0 - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only - Images / Sec, More Is Better - b: 1.22, a: 1.22 - SE +/- 0.00, N = 3
Intel Open Image Denoise 2.0 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only - Images / Sec, More Is Better - b: 1.23, a: 1.22 - SE +/- 0.00, N = 3
Intel Open Image Denoise 2.0 - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only - Images / Sec, More Is Better - b: 0.60, a: 0.60 - SE +/- 0.00, N = 3
OSPRay 2.12 - Benchmark: particle_volume/ao/real_time - Items Per Second, More Is Better - b: 9.89124, a: 9.86893 - SE +/- 0.00341, N = 3
OSPRay 2.12 - Benchmark: particle_volume/scivis/real_time - Items Per Second, More Is Better - b: 9.76771, a: 9.74548 - SE +/- 0.00531, N = 3
OSPRay 2.12 - Benchmark: particle_volume/pathtracer/real_time - Items Per Second, More Is Better - b: 128.66, a: 128.57 - SE +/- 0.05, N = 3
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time - Items Per Second, More Is Better - b: 4.98028, a: 4.93554 - SE +/- 0.00393, N = 3
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time - Items Per Second, More Is Better - a: 4.62468, b: 4.62434 - SE +/- 0.00170, N = 3
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time - Items Per Second, More Is Better - b: 7.70051, a: 7.67668 - SE +/- 0.01066, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 28.92, a: 28.88 - SE +/- 0.10, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 16.48, a: 16.35 - SE +/- 0.01, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 256.64, a: 255.99 - SE +/- 0.36, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 78.40, a: 77.58 - SE +/- 0.19, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 87.02, a: 86.13 - SE +/- 0.56, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 29.86, a: 28.73 - SE +/- 0.18, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 149.96, a: 148.97 - SE +/- 0.14, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 91.98, a: 91.57 - SE +/- 0.13, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 324.27, a: 323.09 - SE +/- 0.39, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 146.19, a: 141.30 - SE +/- 0.75, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 236.82, a: 235.81 - SE +/- 0.13, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 83.43, a: 81.34 - SE +/- 0.18, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - a: 33.10, b: 32.91 - SE +/- 0.16, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream - items/sec, More Is Better - a: 21.57, b: 21.49 - SE +/- 0.01, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 119.09, a: 119.01 - SE +/- 0.18, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream - items/sec, More Is Better - a: 43.88, b: 43.05 - SE +/- 0.26, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better - b: 28.99, a: 28.61 - SE +/- 0.06, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream - items/sec, More Is Better - b: 16.43, a: 16.34 - SE +/- 0.01, N = 3
Laghos 3.1 - Test: Triple Point Problem - Major Kernels Total Rate, More Is Better - a: 220.46, b: 219.12 - SE +/- 0.34, N = 3
Laghos 3.1 - Test: Sedov Blast Wave, ube_922_hex.mesh - Major Kernels Total Rate, More Is Better - b: 265.39, a: 264.34 - SE +/- 0.22, N = 3
Laghos 3.1 - (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
LevelDB 1.23 - Benchmark: Fill Sync - MB/s, More Is Better - a: 0.6, b: 0.4 - SE +/- 0.00, N = 3
LevelDB 1.23 - Benchmark: Overwrite - MB/s, More Is Better - b: 27.0, a: 27.0 - SE +/- 0.09, N = 3
LevelDB 1.23 - Benchmark: Random Fill - MB/s, More Is Better - a: 26.9, b: 26.7 - SE +/- 0.09, N = 3
LevelDB 1.23 - Benchmark: Sequential Fill - MB/s, More Is Better - a: 27.8, b: 27.7 - SE +/- 0.07, N = 3
LevelDB 1.23 - (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
PETSc 3.19 - Test: Streams - MB/s, More Is Better - a: 58312.10, b: 58276.79 - SE +/- 71.95, N = 3 - (CC) gcc options: -fPIC -O3 -O2 -lpthread -ludev -lpciaccess -lm
Palabos 2.3 - Grid Size: 100 - Mega Site Updates Per Second, More Is Better - b: 122.23, a: 121.93 - SE +/- 0.14, N = 3
Palabos 2.3 - Grid Size: 400 - Mega Site Updates Per Second, More Is Better - b: 140.08, a: 139.30 - SE +/- 0.57, N = 3
Palabos 2.3 - Grid Size: 500 - Mega Site Updates Per Second, More Is Better - b: 144.06, a: 143.85 - SE +/- 0.27, N = 3
Palabos 2.3 - (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - a: 45075000, b: 45023000 - SE +/- 21825.06, N = 3
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - a: 51993333, b: 51814000 - SE +/- 193694.20, N = 3
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - a: 89896333, b: 89686000 - SE +/- 85545.96, N = 3
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - a: 103806667, b: 103260000 - SE +/- 150591.43, N = 3
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - b: 179170000, a: 178100000 - SE +/- 81853.53, N = 3
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - b: 206140000, a: 206086667 - SE +/- 210502.05, N = 3
Liquid-DSP 1.6 - Threads: 8 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - b: 355110000, a: 354570000 - SE +/- 120138.81, N = 3
Liquid-DSP 1.6 - Threads: 8 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - b: 410090000, a: 409183333 - SE +/- 176099.72, N = 3
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - b: 10560000, a: 10537667 - SE +/- 21333.33, N = 3
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - b: 692130000, a: 690800000 - SE +/- 1128996.60, N = 3
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - b: 799490000, a: 795103333 - SE +/- 377903.57, N = 3
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - b: 20989000, a: 20851667 - SE +/- 21712.77, N = 3
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - b: 1350000000, a: 1343200000 - SE +/- 2211334.44, N = 3
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - b: 1512100000, a: 1506266667 - SE +/- 2366666.67, N = 3
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - b: 41475000, a: 41277667 - SE +/- 67087.84, N = 3
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better - b: 2269700000, a: 2250733333 - SE +/- 296273.15, N = 3
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better - b: 1837000000, a: 1836033333 - SE +/- 19718378.34, N = 3
Liquid-DSP 1.6 - Threads: 8 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - b: 82224000, a: 82123667 - SE +/- 37834.43, N = 3
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - b: 160130000, a: 160113333 - SE +/- 32829.53, N = 3
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - b: 314560000, a: 313753333 - SE +/- 399263.21, N = 3
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better - a: 506326667, b: 506050000 - SE +/- 148361.42, N = 3
Liquid-DSP 1.6 - (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Kripke 1.2.6 - Throughput FoM, More Is Better - a: 148243333, b: 146215600 - SE +/- 636875.17, N = 3 - (CXX) g++ options: -O3 -fopenmp -ldl
LevelDB 1.23 - Benchmark: Hot Read - Microseconds Per Op, Fewer Is Better - b: 42.76, a: 43.14 - SE +/- 0.21, N = 3
LevelDB 1.23 - Benchmark: Fill Sync - Microseconds Per Op, Fewer Is Better - a: 10866.01, b: 16348.37 - SE +/- 65.20, N = 3
LevelDB 1.23 - Benchmark: Overwrite - Microseconds Per Op, Fewer Is Better - b: 262.28, a: 262.35 - SE +/- 0.92, N = 3
LevelDB 1.23 - Benchmark: Random Fill - Microseconds Per Op, Fewer Is Better - a: 262.98, b: 264.87 - SE +/- 0.90, N = 3
LevelDB 1.23 - Benchmark: Random Read - Microseconds Per Op, Fewer Is Better - a: 43.49, b: 43.55 - SE +/- 0.19, N = 3
LevelDB 1.23 - Benchmark: Seek Random - Microseconds Per Op, Fewer Is Better - b: 64.56, a: 65.84 - SE +/- 0.19, N = 3
LevelDB 1.23 - Benchmark: Random Delete - Microseconds Per Op, Fewer Is Better - a: 245.16, b: 245.18 - SE +/- 0.46, N = 3
LevelDB 1.23 - Benchmark: Sequential Fill - Microseconds Per Op, Fewer Is Better - a: 254.94, b: 255.49 - SE +/- 0.72, N = 3
LevelDB 1.23 - (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
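The LevelDB Fill Sync latency is the one result in this group where runs a and b clearly diverge (10866.01 vs 16348.37 microseconds per op, SE +/- 65.20). The following is a minimal Python sketch of how that gap could be put in relative terms. The helper names are hypothetical and not part of the Phoronix Test Suite, and the export does not say which run the single reported SE belongs to, so the second figure is only a rough scale comparison, not a significance test.

```python
def percent_slower(fast_us: float, slow_us: float) -> float:
    """Latency increase of the slower run relative to the faster one, in percent."""
    return (slow_us - fast_us) / fast_us * 100.0


def gap_in_standard_errors(fast_us: float, slow_us: float, se_us: float) -> float:
    """Size of the latency gap expressed in multiples of the reported SE."""
    return (slow_us - fast_us) / se_us


# Values copied from the LevelDB 1.23 Fill Sync result (Microseconds Per Op).
run_a_us = 10866.01
run_b_us = 16348.37
reported_se_us = 65.20  # assumption: used only as a yardstick for run-to-run noise

print(f"run b is {percent_slower(run_a_us, run_b_us):.1f}% slower than run a")
print(f"gap is {gap_in_standard_errors(run_a_us, run_b_us, reported_se_us):.0f}x the reported SE")
```

The same two helpers apply to any "Fewer Is Better" result here; for "More Is Better" results the faster run is the one with the larger value, so the arguments would need to be swapped.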
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - b: 1.30181 (MIN: 1.19), a: 1.55099 (MIN: 1.33) - SE +/- 0.01212, N = 10
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - b: 4.21682 (MIN: 4.1), a: 4.26624 (MIN: 4.12) - SE +/- 0.01334, N = 3
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - b: 1.089440 (MIN: 0.97), a: 1.177229 (MIN: 0.89) - SE +/- 0.016952, N = 14
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - a: 0.948450 (MIN: 0.87), b: 0.985098 (MIN: 0.9) - SE +/- 0.010152, N = 3
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 4.81893 (MIN: 4.74), b: 4.83565 (MIN: 4.78) - SE +/- 0.01212, N = 3
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 5.69206 (MIN: 4.03), b: 5.73181 (MIN: 4.01) - SE +/- 0.03300, N = 3
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 2.68566 (MIN: 2.61), b: 2.69872 (MIN: 2.64) - SE +/- 0.01103, N = 3
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - b: 5.70574 (MIN: 5.64), a: 5.76769 (MIN: 5.69) - SE +/- 0.00320, N = 3
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - a: 1.36630 (MIN: 1.27), b: 1.43664 (MIN: 1.35) - SE +/- 0.01766, N = 3
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - b: 1.57190 (MIN: 1.5), a: 1.57740 (MIN: 1.49) - SE +/- 0.00305, N = 3
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 3252.18 (MIN: 3200.87), b: 3275.76 (MIN: 3269.28) - SE +/- 29.27, N = 3
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 976.47 (MIN: 961.99), b: 987.99 (MIN: 979.32) - SE +/- 3.39, N = 3
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - b: 3227.02 (MIN: 3219.36), a: 3244.46 (MIN: 3177.37) - SE +/- 29.09, N = 3
oneDNN 3.1 - (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU a b 200 400 600 800 1000 SE +/- 8.51, N = 3 938.10 958.84 MIN: 914.11 MIN: 951.92 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU b a 700 1400 2100 2800 3500 SE +/- 25.05, N = 3 3226.70 3235.41 MIN: 3215.86 MIN: 3194.45 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU b a 200 400 600 800 1000 SE +/- 7.63, N = 15 932.99 935.61 MIN: 924.9 MIN: 895.45 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Neural Magic DeepSparse 1.5, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 551.16, 552.28. SE +/- 1.59, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 60.66, 61.17. SE +/- 0.04, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 62.31, 62.44. SE +/- 0.10, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 12.75, 12.88. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 183.81, 185.74. SE +/- 1.22, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 33.48, 34.80. SE +/- 0.22, N = 3.
Neural Magic DeepSparse 1.5, Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 106.66, 107.37. SE +/- 0.10, N = 3.
Neural Magic DeepSparse 1.5, Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 10.86, 10.91. SE +/- 0.02, N = 3.
Neural Magic DeepSparse 1.5, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 49.32, 49.49. SE +/- 0.05, N = 3.
Neural Magic DeepSparse 1.5, Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 6.8330, 7.0704. SE +/- 0.0378, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 67.54, 67.82. SE +/- 0.04, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 11.98, 12.29. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.5, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (a, b): 482.75, 485.95. SE +/- 2.43, N = 3.
Neural Magic DeepSparse 1.5, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (a, b): 46.35, 46.51. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 134.32, 134.41. SE +/- 0.21, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (a, b): 22.78, 23.22. SE +/- 0.14, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. Results (b, a): 550.43, 556.61. SE +/- 0.31, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. Results (b, a): 60.86, 61.20. SE +/- 0.05, N = 3.
SQLite 3.41.2, Threads / Copies: 1. Seconds, Fewer Is Better. Results (b, a): 105.00, 106.01. SE +/- 0.30, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2, Threads / Copies: 2. Seconds, Fewer Is Better. Results (b, a): 237.19, 243.22. SE +/- 1.34, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2, Threads / Copies: 4. Seconds, Fewer Is Better. Results (b, a): 262.62, 266.58. SE +/- 2.89, N = 4. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2, Threads / Copies: 8. Seconds, Fewer Is Better. Results (b, a): 284.87, 291.25. SE +/- 2.25, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2, Threads / Copies: 16. Seconds, Fewer Is Better. Results (a, b): 373.82, 374.33. SE +/- 1.37, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2, Threads / Copies: 32. Seconds, Fewer Is Better. Results (b, a): 502.77, 505.42. SE +/- 1.08, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2, Threads / Copies: 64. Seconds, Fewer Is Better. Results (b, a): 680.81, 681.45. SE +/- 0.71, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
CP2K Molecular Dynamics 2023.1, Input: H20-64. Seconds, Fewer Is Better. Results (b, a): 42.11, 42.97. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
CP2K Molecular Dynamics 2023.1, Input: Fayalite-FIST. Seconds, Fewer Is Better. Results (b, a): 122.98, 123.83. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Monte Carlo Simulations of Ionised Nebulae 2.02.73.3, Input: Gas HII40. Seconds, Fewer Is Better. Results (b, a): 12.60, 12.68. SE +/- 0.05, N = 3. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Monte Carlo Simulations of Ionised Nebulae 2.02.73.3, Input: Dust 2D tau100.0. Seconds, Fewer Is Better. Results (b, a): 180.73, 181.27. SE +/- 0.15, N = 3. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Remhos 1.0, Test: Sample Remap Example. Seconds, Fewer Is Better. Results (a, b): 23.54, 23.65. SE +/- 0.04, N = 3. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Z3 Theorem Prover 4.12.1, SMT File: 1.smt2. Seconds, Fewer Is Better. Results (b, a): 29.90, 29.93. SE +/- 0.01, N = 3. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Z3 Theorem Prover 4.12.1, SMT File: 2.smt2. Seconds, Fewer Is Better. Results (a, b): 76.01, 76.12. SE +/- 0.12, N = 3. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Opus Codec Encoding 1.4, WAV To Opus Encode. Seconds, Fewer Is Better. Results (a, b): 28.70, 28.83. SE +/- 0.05, N = 5. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
eSpeak-NG Speech Engine 1.51, Text-To-Speech Synthesis. Seconds, Fewer Is Better. Results (a, b): 31.08, 31.49. SE +/- 0.34, N = 4. (CXX) g++ options: -O2
GPAW 23.6, Input: Carbon Nanotube. Seconds, Fewer Is Better. Results (a, b): 110.85, 110.95. SE +/- 0.26, N = 3. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
Whisper.cpp 1.4, Model: ggml-base.en - Input: 2016 State of the Union. Seconds, Fewer Is Better. Results (b, a): 151.05, 156.48. SE +/- 1.99, N = 3. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Whisper.cpp 1.4, Model: ggml-small.en - Input: 2016 State of the Union. Seconds, Fewer Is Better. Results (b, a): 363.32, 395.71. SE +/- 6.52, N = 9. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Whisper.cpp 1.4, Model: ggml-medium.en - Input: 2016 State of the Union. Seconds, Fewer Is Better. Results (b, a): 1003.11, 1018.28. SE +/- 11.84, N = 3. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
QMCPACK 3.16, Input: Li2_STO_ae. Total Execution Time - Seconds, Fewer Is Better. Results (b, a): 132.76, 136.22. SE +/- 0.40, N = 3. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK 3.16, Input: simple-H2O. Total Execution Time - Seconds, Fewer Is Better. Results (b, a): 27.48, 27.60. SE +/- 0.04, N = 3. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK 3.16, Input: FeCO6_b3lyp_gms. Total Execution Time - Seconds, Fewer Is Better. Results (b, a): 174.82, 175.39. SE +/- 0.14, N = 3. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK 3.16, Input: FeCO6_b3lyp_gms. Total Execution Time - Seconds, Fewer Is Better. Results (b, a): 191.15, 196.98. SE +/- 1.72, N = 3. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Phoronix Test Suite v10.8.5