dddas
AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1603 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2306247-NE-DDDAS346146&grs .
dddas: System Details (runs: a, b)
Processor: AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH II EXTREME (1603 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: Samsung SSD 980 PRO 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 5.19.0-051900rc7-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.47)
Vulkan: 1.2.204
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 3840x2160
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830104d
Graphics Details - BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-D1820201-101
Python Details - Python 3.10.6
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
dddas leveldb: Fill Sync leveldb: Fill Sync onednn: IP Shapes 1D - f32 - CPU whisper-cpp: ggml-small.en - 2016 State of the Union onednn: IP Shapes 1D - u8s8f32 - CPU stress-ng: Semaphores stress-ng: CPU Cache onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream onednn: IP Shapes 3D - u8s8f32 - CPU whisper-cpp: ggml-base.en - 2016 State of the Union deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream qmcpack: FeCO6_b3lyp_gms stress-ng: Vector Shuffle qmcpack: Li2_STO_ae stress-ng: SENDFILE deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream sqlite: 2 sqlite: 8 onednn: Recurrent Neural Network Inference - u8s8f32 - CPU xonotic: 2560 x 1440 - Low cp2k: H20-64 leveldb: Seek Rand deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream heffte: r2c - Stock - double-long - 128 stress-ng: Context Switching whisper-cpp: ggml-medium.en - 2016 State of the Union sqlite: 4 stress-ng: NUMA kripke: deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream espeak: Text-To-Speech Synthesis xonotic: 2560 x 1440 - High xonotic: 1920 x 1200 - High xonotic: 2560 x 1440 - Ultra onednn: Recurrent Neural Network Inference - f32 - CPU onednn: IP Shapes 3D - f32 - CPU xonotic: 1920 x 1080 - Ultra deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream stress-ng: Glibc C String Functions onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU heffte: r2c - FFTW - double-long - 128 
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream svt-av1: Preset 12 - Bosphorus 4K vvenc: Bosphorus 4K - Fast heffte: c2c - Stock - double-long - 128 sqlite: 1 svt-av1: Preset 4 - Bosphorus 4K stress-ng: Vector Floating Point svt-av1: Preset 13 - Bosphorus 1080p ospray: gravity_spheres_volume/dim_512/ao/real_time leveldb: Hot Read xonotic: 3840 x 2160 - Low liquid-dsp: 64 - 256 - 32 deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream svt-av1: Preset 12 - Bosphorus 1080p oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only vvenc: Bosphorus 1080p - Fast svt-av1: Preset 8 - Bosphorus 4K leveldb: Rand Fill stress-ng: Futex onednn: Recurrent Neural Network Training - f32 - CPU leveldb: Rand Fill xonotic: 2560 x 1440 - Ultimate onednn: Deconvolution Batch shapes_1d - f32 - CPU cp2k: Fayalite-FIST mocassin: Gas HII40 deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream liquid-dsp: 2 - 256 - 512 nekrs: Kershaw stress-ng: Matrix Math xonotic: 3840 x 2160 - Ultra xonotic: 1920 x 1200 - Low laghos: Triple Point Problem liquid-dsp: 4 - 256 - 32 heffte: r2c - FFTW - double-long - 256 deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream palabos: 400 deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased 
conll2003 - Synchronous Single-Stream liquid-dsp: 16 - 256 - 57 onednn: Recurrent Neural Network Training - u8s8f32 - CPU xonotic: 1920 x 1200 - Ultimate liquid-dsp: 2 - 256 - 57 sqlite: 32 liquid-dsp: 32 - 256 - 32 remhos: Sample Remap Example stress-ng: MMAP onednn: Deconvolution Batch shapes_3d - f32 - CPU liquid-dsp: 4 - 256 - 512 embree: Pathtracer - Asian Dragon hpcg: 104 104 104 - 60 heffte: r2c - Stock - double-long - 256 encode-opus: WAV To Opus Encode deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream xonotic: 3840 x 2160 - High qmcpack: simple-H2O deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream svt-av1: Preset 13 - Bosphorus 4K deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream stress-ng: Poll xonotic: 1920 x 1080 - High laghos: Sedov Blast Wave, ube_922_hex.mesh liquid-dsp: 32 - 256 - 57 embree: Pathtracer ISPC - Asian Dragon Obj stress-ng: Matrix 3D Math deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream leveldb: Seq Fill stress-ng: Forking vvenc: Bosphorus 4K - Faster stress-ng: AVL Tree deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU liquid-dsp: 1 - 256 - 57 deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream embree: Pathtracer ISPC - Crown libxsmm: 256 qmcpack: FeCO6_b3lyp_gms ospray: gravity_spheres_volume/dim_512/pathtracer/real_time vvenc: Bosphorus 1080p - Faster mocassin: Dust 2D tau100.0 stress-ng: Wide Vector Math onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU liquid-dsp: 32 - 256 - 512 deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base 
Uncased - Asynchronous Multi-Stream stress-ng: Crypto embree: Pathtracer - Asian Dragon Obj palabos: 100 liquid-dsp: 2 - 256 - 32 xonotic: 1920 x 1080 - Low ospray: particle_volume/scivis/real_time ospray: particle_volume/ao/real_time liquid-dsp: 8 - 256 - 57 svt-av1: Preset 8 - Bosphorus 1080p leveldb: Seq Fill svt-av1: Preset 4 - Bosphorus 1080p liquid-dsp: 1 - 256 - 512 deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream liquid-dsp: 16 - 256 - 32 stress-ng: CPU Stress heffte: c2c - FFTW - double-long - 256 stress-ng: Cloning heffte: c2c - FFTW - double-long - 128 stress-ng: Floating Point dav1d: Chimera 1080p 10-bit xonotic: 1920 x 1080 - Ultimate stress-ng: Glibc Qsort Data Sorting heffte: c2c - Stock - double-long - 256 stress-ng: MEMFD liquid-dsp: 8 - 256 - 32 palabos: 500 z3: 2.smt2 stress-ng: System V Message Passing deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream sqlite: 16 dav1d: Summer Nature 4K libxsmm: 32 leveldb: Rand Read liquid-dsp: 8 - 256 - 512 stress-ng: IO_uring xonotic: 3840 x 2160 - Ultimate liquid-dsp: 1 - 256 - 32 z3: 1.smt2 stress-ng: Memory Copying stress-ng: Fused Multiply-Add stress-ng: Atomic sqlite: 64 gpaw: Carbon Nanotube nekrs: TurboPipe Periodic heffte: r2c - Stock - double-long - 512 ospray: particle_volume/pathtracer/real_time heffte: r2c - FFTW - double-long - 512 libxsmm: 128 libxsmm: 64 deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream petsc: Streams stress-ng: Mutex xonotic: 1920 x 1200 - Ultra liquid-dsp: 64 - 256 - 512 liquid-dsp: 64 - 256 - 57 dav1d: Chimera 1080p stress-ng: Hash stress-ng: Malloc embree: Pathtracer - Crown heffte: c2c - Stock - double-long - 512 heffte: c2c - FFTW - double-long - 512 dav1d: 
Summer Nature 1080p leveldb: Overwrite stress-ng: Pthread liquid-dsp: 4 - 256 - 57 stress-ng: Zlib stress-ng: Vector Math stress-ng: Function Call liquid-dsp: 16 - 256 - 512 leveldb: Rand Delete ospray: gravity_spheres_volume/dim_512/scivis/real_time embree: Pathtracer ISPC - Asian Dragon oidn: RTLightmap.hdr.4096x4096 - CPU-Only oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only leveldb: Overwrite stress-ng: Socket Activity stress-ng: Pipe hpcg: 144 144 144 - 60 a b 10866.014 0.6 1.55099 395.70935 1.177229 66510329.66 1624118.54 1.36630 34.8038 28.7279 0.948450 156.48335 7.0704 141.2991 196.98 22825.44 136.22 515575.47 12.2881 81.3373 243.219 291.254 938.102 673.1714914 42.966 65.839 43.8828 22.7831 51.8810 11409509.77 1018.28439 266.581 752.30 148243333 28.6062 31.077 560.9659015 561.0057748 520.7893156 976.467 4.26624 518.6729034 556.6090 33453867.32 5.76769 56.4524 12.8830 77.5785 185.7387 86.1294 126.423 5.44 26.5517 106.014 3.756 94803.76 360.924 4.93554 43.137 670.0380428 2250733333 61.1727 16.3453 308.249 1.22 13.875 54.148 26.9 4610857.40 3252.18 262.981 384.0158631 5.69206 123.826 12.684 148.9743 482.7469 107.3698 20851667 2123046667 199178.68 420.7457662 671.9542910 220.46 178100000 27.4329 33.0990 139.299 16.3383 61.1988 795103333 3244.46 384.7672496 103806667 505.417 1343200000 23.537 437.11 2.68566 41277667 41.5850 10.9645 30.1443 28.695 91.5663 10.9104 467.6921769 27.602 67.8249 127.136 235.8139 4084623.29 561.4381904 264.34 1506266667 33.8311 2806.09 323.0917 27.8 51344.69 10.931 283.41 49.4917 1.57740 4.81893 51993333 21.5675 46.3523 34.4087 910.4 175.39 7.67668 24.877 181.265 1501239.29 935.611 3235.41 313753333 255.9872 78260.17 37.3900 121.931 89896333 671.4224194 9.74548 9.86893 409183333 85.308 254.942 10.868 10537667 62.4431 552.2812 690800000 82729.76 13.7638 3354.40 30.8024 11201.44 374.79 386.9438277 942.22 13.8764 395.11 354570000 143.850 76.011 10692419.88 28.8773 373.820 222.52 160.5 43.493 82123667 439798.24 311.3989499 45075000 29.932 10973.65 
33507543.08 480.06 681.448 110.846 3444566667 30.0147 128.572 27.7110 635.8 318.5 134.4071 119.0134 58312.0964 18827346.28 521.4981114 506326667 1836033333 398.39 7627578.66 92853207.13 38.4669 15.4082 15.3464 597.02 262.354 128353.64 206086667 4517.78 224417.23 24278.34 160113333 245.160 4.62468 39.3962 0.60 1.22 27.0 3072.80 18809740.35 16348.371 0.4 1.30181 363.32431 1.08944 71041068.48 1535034.64 1.43664 33.4783 29.8626 0.985098 151.04973 6.833 146.1891 191.15 22200.86 132.76 528847.31 11.9799 83.4278 237.187 284.865 958.843 687.4861314 42.112 64.562 43.0475 23.2237 50.934 11620881.04 1003.11362 262.624 741.66 146215600 28.9881 31.488 567.977016 567.9388648 527.2114812 987.992 4.21682 524.5527153 550.4325 33092079.25 5.70574 55.8515 12.7473 78.4003 183.8144 87.0186 127.675 5.387 26.8097 105 3.721 95693.93 364.28 4.98028 42.757 675.8397046 2269700000 60.6621 16.4828 305.73 1.23 13.767 54.556 26.7 4644715 3275.76 264.868 381.3165184 5.73181 122.975 12.598 149.9629 485.9478 106.6629 20989000 2109640000 200423.6 423.3297457 676.0787756 219.12 179170000 27.2699 32.9066 140.078 16.4295 60.8593 799490000 3227.02 386.8054656 103260000 502.766 1350000000 23.654 439.24 2.69872 41475000 41.782 11.0163 30.2863 28.83 91.9798 10.8616 469.7486112 27.484 67.5353 127.68 236.8228 4101817.96 563.7930829 265.390123291 1512100000 33.9593 2795.85 324.2731 27.7 51160.22 10.892 282.42 49.319 1.5719 4.83565 51814000 21.4935 46.5115 34.5263 907.4 174.82 7.70051 24.801 180.727 1496970.66 932.987 3226.7 314560000 256.6381 78455.19 37.4822 122.23 89686000 669.8590201 9.76771 9.89124 410090000 85.496 255.488 10.845 10560000 62.3112 551.16 692130000 82887.24 13.79 3360.52 30.8579 11221.55 374.13 386.2753935 943.84 13.8536 394.5 355110000 144.062 76.122 10677047.79 28.9177 374.33 222.24 160.7 43.547 82224000 440335.12 311.7695344 45023000 29.898 10984.91 33539318.97 480.51 680.811 110.949 3441770000 30.036 128.663 27.7303 635.4 318.7 134.323 119.0879 58276.7926 18816044.49 521.2002993 
506050000 1837000000 398.2 7624159.24 92812375.37 38.45 15.4135 15.3508 597.19 262.283 128387.57 206140000 4518.88 224460.71 24275.23 160130000 245.182 4.62434 39.399 0.60 1.22 27 9580.27 22201912.16 OpenBenchmarking.org
LevelDB 1.23 - Benchmark: Fill Sync (Microseconds Per Op, Fewer Is Better)
    a: 10866.01, b: 16348.37; SE +/- 65.20, N = 3
    (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
LevelDB 1.23 - Benchmark: Fill Sync (MB/s, More Is Better)
    a: 0.6, b: 0.4; SE +/- 0.00, N = 3
    (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
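The two Fill Sync results can be compared as a relative change of run b against run a. The following is a minimal sketch and not part of the Phoronix Test Suite: the helper names `relative_change` and `describe` are hypothetical, and treating run a as the baseline is an assumption. Whether a larger value is good depends on the chart label (Fewer Is Better or More Is Better).

```python
def relative_change(a: float, b: float) -> float:
    """Return (b - a) / a, the fractional change of run b relative to run a."""
    if a == 0:
        raise ValueError("baseline value must be non-zero")
    return (b - a) / a


def describe(a: float, b: float, fewer_is_better: bool) -> str:
    """Summarize whether run b is better or worse than run a (hypothetical helper)."""
    change = relative_change(a, b)
    if change == 0:
        verdict = "equal"
    elif (change < 0) == fewer_is_better:
        # Lower value on a fewer-is-better metric, or higher on more-is-better.
        verdict = "better"
    else:
        verdict = "worse"
    return f"b vs a: {change:+.1%} ({verdict})"


# Values taken from the LevelDB Fill Sync results in this export.
print(describe(10866.01, 16348.37, fewer_is_better=True))   # Microseconds Per Op
print(describe(0.6, 0.4, fewer_is_better=False))             # MB/s
```

Both metrics point the same way for this test: run b spends more time per operation and moves fewer MB/s than run a.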
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
    a: 1.55099 (MIN: 1.33), b: 1.30181 (MIN: 1.19); SE +/- 0.01212, N = 10
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Whisper.cpp 1.4 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
    a: 395.71, b: 363.32; SE +/- 6.52, N = 9
    (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    a: 1.177229 (MIN: 0.89), b: 1.089440 (MIN: 0.97); SE +/- 0.016952, N = 14
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Stress-NG 0.15.10 - Test: Semaphores (Bogo Ops/s, More Is Better)
    a: 66510329.66, b: 71041068.48; SE +/- 919418.56, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10 - Test: CPU Cache (Bogo Ops/s, More Is Better)
    a: 1624118.54, b: 1535034.64; SE +/- 19308.61, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    a: 1.36630 (MIN: 1.27), b: 1.43664 (MIN: 1.35); SE +/- 0.01766, N = 3
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
    a: 34.80, b: 33.48; SE +/- 0.22, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
    a: 28.73, b: 29.86; SE +/- 0.18, N = 3
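In a synchronous single-stream scenario, latency (ms/batch) and throughput (items/sec) should be roughly reciprocal. The sketch below converts the reported latency into an estimated throughput and prints it next to the reported figure. It assumes one item per batch, which this export does not state, and the function name is hypothetical. The unrounded inputs (34.8038 / 28.7279 for a, 33.4783 / 29.8626 for b) come from the summary table. A small gap between estimate and report is plausible, for example because averaging latencies and averaging throughputs are not equivalent operations.

```python
def items_per_sec_from_latency(ms_per_batch: float, items_per_batch: int = 1) -> float:
    """Estimate throughput from latency, assuming batches run back to back."""
    if ms_per_batch <= 0:
        raise ValueError("latency must be positive")
    return items_per_batch * 1000.0 / ms_per_batch


# (latency in ms/batch, reported items/sec) from the summary table of this export.
reported = {"a": (34.8038, 28.7279), "b": (33.4783, 29.8626)}

for run, (latency_ms, throughput) in reported.items():
    estimate = items_per_sec_from_latency(latency_ms)
    print(f"{run}: estimated {estimate:.2f} items/sec, reported {throughput:.2f}")
```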
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    a: 0.948450 (MIN: 0.87), b: 0.985098 (MIN: 0.9); SE +/- 0.010152, N = 3
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Whisper.cpp 1.4 - Model: ggml-base.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
    a: 156.48, b: 151.05; SE +/- 1.99, N = 3
    (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
    a: 7.0704, b: 6.8330; SE +/- 0.0378, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
    a: 141.30, b: 146.19; SE +/- 0.75, N = 3
QMCPACK 3.16 - Input: FeCO6_b3lyp_gms (Total Execution Time - Seconds, Fewer Is Better)
    a: 196.98, b: 191.15; SE +/- 1.72, N = 3
    (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Stress-NG 0.15.10 - Test: Vector Shuffle (Bogo Ops/s, More Is Better)
    a: 22825.44, b: 22200.86; SE +/- 43.57, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
QMCPACK 3.16 - Input: Li2_STO_ae (Total Execution Time - Seconds, Fewer Is Better)
    a: 136.22, b: 132.76; SE +/- 0.40, N = 3
    (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Stress-NG 0.15.10 - Test: SENDFILE (Bogo Ops/s, More Is Better)
    a: 515575.47, b: 528847.31; SE +/- 656.59, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
    a: 12.29, b: 11.98; SE +/- 0.03, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
    a: 81.34, b: 83.43; SE +/- 0.18, N = 3
SQLite 3.41.2 - Threads / Copies: 2 (Seconds, Fewer Is Better)
    a: 243.22, b: 237.19; SE +/- 1.34, N = 3
    (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite 3.41.2 - Threads / Copies: 8 (Seconds, Fewer Is Better)
    a: 291.25, b: 284.87; SE +/- 2.25, N = 3
    (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    a: 938.10 (MIN: 914.11), b: 958.84 (MIN: 951.92); SE +/- 8.51, N = 3
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Xonotic 0.8.6 - Resolution: 2560 x 1440 - Effects Quality: Low (Frames Per Second, More Is Better)
    a: 673.17 (MIN: 426 / MAX: 1185), b: 687.49 (MIN: 439 / MAX: 1194); SE +/- 2.49, N = 3
CP2K Molecular Dynamics 2023.1 - Input: H20-64 (Seconds, Fewer Is Better)
    a: 42.97, b: 42.11
    (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
LevelDB 1.23 - Benchmark: Seek Random (Microseconds Per Op, Fewer Is Better)
    a: 65.84, b: 64.56; SE +/- 0.19, N = 3
    (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
    a: 43.88, b: 43.05; SE +/- 0.26, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
    a: 22.78, b: 23.22; SE +/- 0.14, N = 3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
    a: 51.88, b: 50.93; SE +/- 0.72, N = 3
    (CXX) g++ options: -O3
Stress-NG 0.15.10 - Test: Context Switching (Bogo Ops/s, More Is Better)
    a: 11409509.77, b: 11620881.04; SE +/- 22031.57, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Whisper.cpp 1.4 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
    a: 1018.28, b: 1003.11; SE +/- 11.84, N = 3
    (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
SQLite 3.41.2 - Threads / Copies: 4 (Seconds, Fewer Is Better)
    a: 266.58, b: 262.62; SE +/- 2.89, N = 4
    (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
Stress-NG 0.15.10 - Test: NUMA (Bogo Ops/s, More Is Better)
    a: 752.30, b: 741.66; SE +/- 5.01, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Kripke 1.2.6 (Throughput FoM, More Is Better)
    a: 148243333, b: 146215600; SE +/- 636875.17, N = 3
    (CXX) g++ options: -O3 -fopenmp -ldl
Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    a: 28.61, b: 28.99; SE +/- 0.06, N = 3
eSpeak-NG Speech Engine 1.51 - Text-To-Speech Synthesis (Seconds, Fewer Is Better)
    a: 31.08, b: 31.49; SE +/- 0.34, N = 4
    (CXX) g++ options: -O2
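Many of the a/b gaps in this export are small, so it can help to compare a gap with the reported standard error before reading anything into it. The sketch below is a rough heuristic, not a statistical test. The function name, the default threshold of two standard errors, and applying the single reported SE to both runs are all assumptions (the export lists only one SE per test and does not say which run it belongs to).

```python
def exceeds_noise(a: float, b: float, standard_error: float, k: float = 2.0) -> bool:
    """Return True if |a - b| is larger than k times the reported standard error."""
    if standard_error < 0:
        raise ValueError("standard error must be non-negative")
    return abs(a - b) > k * standard_error


# eSpeak-NG Text-To-Speech Synthesis: 31.08 s vs 31.49 s, SE +/- 0.34
print(exceeds_noise(31.08, 31.49, 0.34))

# LevelDB Fill Sync: 10866.01 vs 16348.37 microseconds per op, SE +/- 65.20
print(exceeds_noise(10866.01, 16348.37, 65.20))
```

Under these assumptions the eSpeak-NG gap stays inside the threshold, while the LevelDB Fill Sync gap is far outside it.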
Xonotic 0.8.6 - Resolution: 2560 x 1440 - Effects Quality: High (Frames Per Second, More Is Better)
    a: 560.97 (MIN: 336 / MAX: 962), b: 567.98 (MIN: 347 / MAX: 923); SE +/- 0.81, N = 3
Xonotic 0.8.6 - Resolution: 1920 x 1200 - Effects Quality: High (Frames Per Second, More Is Better)
    a: 561.01 (MIN: 341 / MAX: 967), b: 567.94 (MIN: 343 / MAX: 932); SE +/- 3.14, N = 3
Xonotic 0.8.6 - Resolution: 2560 x 1440 - Effects Quality: Ultra (Frames Per Second, More Is Better)
    a: 520.79 (MIN: 272 / MAX: 931), b: 527.21 (MIN: 294 / MAX: 931); SE +/- 2.20, N = 3
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
    a: 976.47 (MIN: 961.99), b: 987.99 (MIN: 979.32); SE +/- 3.39, N = 3
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
    a: 4.26624 (MIN: 4.12), b: 4.21682 (MIN: 4.1); SE +/- 0.01334, N = 3
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, More Is Better)
    a: 518.67 (MIN: 259 / MAX: 910), b: 524.55 (MIN: 285 / MAX: 905); SE +/- 1.23, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
    a: 556.61, b: 550.43; SE +/- 0.31, N = 3
Stress-NG 0.15.10 - Test: Glibc C String Functions (Bogo Ops/s, More Is Better)
    a: 33453867.32, b: 33092079.25; SE +/- 238790.03, N = 3
    (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    a: 5.76769 (MIN: 5.69), b: 5.70574 (MIN: 5.64); SE +/- 0.00320, N = 3
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
    a: 56.45, b: 55.85; SE +/- 0.45, N = 15
    (CXX) g++ options: -O3
Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
    a: 12.88, b: 12.75; SE +/- 0.03, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
    a: 77.58, b: 78.40; SE +/- 0.19, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
    a: 185.74, b: 183.81; SE +/- 1.22, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    a: 86.13, b: 87.02; SE +/- 0.56, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
    a: 126.42, b: 127.68; SE +/- 1.42, N = 4
    (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC 1.8 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better)
    a: 5.440, b: 5.387; SE +/- 0.015, N = 3
    (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
    a: 26.55, b: 26.81; SE +/- 0.17, N = 3
    (CXX) g++ options: -O3
SQLite 3.41.2 - Threads / Copies: 1 (Seconds, Fewer Is Better): a = 106.01, b = 105.00; SE +/- 0.30, N = 3; (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SVT-AV1 1.6 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 3.756, b = 3.721; SE +/- 0.010, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stress-NG 0.15.10 - Test: Vector Floating Point (Bogo Ops/s, More Is Better): a = 94803.76, b = 95693.93; SE +/- 139.68, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
SVT-AV1 1.6 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 360.92, b = 364.28; SE +/- 1.35, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): a = 4.93554, b = 4.98028; SE +/- 0.00393, N = 3
LevelDB 1.23 - Benchmark: Hot Read (Microseconds Per Op, Fewer Is Better): a = 43.14, b = 42.76; SE +/- 0.21, N = 3; (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Xonotic 0.8.6 - Resolution: 3840 x 2160 - Effects Quality: Low (Frames Per Second, More Is Better): a = 670.04 (MIN: 387 / MAX: 1175), b = 675.84 (MIN: 413 / MAX: 1166); SE +/- 1.56, N = 3
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 2250733333, b = 2269700000; SE +/- 296273.15, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 61.17, b = 60.66; SE +/- 0.04, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 16.35, b = 16.48; SE +/- 0.01, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 308.25, b = 305.73; SE +/- 1.75, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Intel Open Image Denoise 2.0 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 1.22, b = 1.23; SE +/- 0.00, N = 3
VVenC 1.8 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): a = 13.88, b = 13.77; SE +/- 0.04, N = 3; (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
SVT-AV1 1.6 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 54.15, b = 54.56; SE +/- 0.30, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
LevelDB 1.23 - Benchmark: Random Fill (MB/s, More Is Better): a = 26.9, b = 26.7; SE +/- 0.09, N = 3; (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Stress-NG 0.15.10 - Test: Futex (Bogo Ops/s, More Is Better): a = 4610857.40, b = 4644715.00; SE +/- 56259.21, N = 4; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 3252.18 (MIN: 3200.87), b = 3275.76 (MIN: 3269.28); SE +/- 29.27, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
LevelDB 1.23 - Benchmark: Random Fill (Microseconds Per Op, Fewer Is Better): a = 262.98, b = 264.87; SE +/- 0.90, N = 3; (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Xonotic 0.8.6 - Resolution: 2560 x 1440 - Effects Quality: Ultimate (Frames Per Second, More Is Better): a = 384.02 (MIN: 99 / MAX: 847), b = 381.32 (MIN: 106 / MAX: 824); SE +/- 1.62, N = 3
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 5.69206 (MIN: 4.03), b = 5.73181 (MIN: 4.01); SE +/- 0.03300, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
CP2K Molecular Dynamics 2023.1 - Input: Fayalite-FIST (Seconds, Fewer Is Better): a = 123.83, b = 122.98; (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Gas HII40 (Seconds, Fewer Is Better): a = 12.68, b = 12.60; SE +/- 0.05, N = 3; (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 148.97, b = 149.96; SE +/- 0.14, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a = 482.75, b = 485.95; SE +/- 2.43, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a = 107.37, b = 106.66; SE +/- 0.10, N = 3
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 20851667, b = 20989000; SE +/- 21712.77, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
nekRS 23.0 - Input: Kershaw (flops/rank, More Is Better): a = 2123046667, b = 2109640000; SE +/- 3171604.92, N = 3; (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi
Stress-NG 0.15.10 - Test: Matrix Math (Bogo Ops/s, More Is Better): a = 199178.68, b = 200423.60; SE +/- 476.06, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Xonotic 0.8.6 - Resolution: 3840 x 2160 - Effects Quality: Ultra (Frames Per Second, More Is Better): a = 420.75 (MIN: 194 / MAX: 579), b = 423.33 (MIN: 194 / MAX: 581); SE +/- 0.23, N = 3
Xonotic 0.8.6 - Resolution: 1920 x 1200 - Effects Quality: Low (Frames Per Second, More Is Better): a = 671.95 (MIN: 431 / MAX: 1193), b = 676.08 (MIN: 427 / MAX: 1181); SE +/- 1.00, N = 3
Laghos 3.1 - Test: Triple Point Problem (Major Kernels Total Rate, More Is Better): a = 220.46, b = 219.12; SE +/- 0.34, N = 3; (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 178100000, b = 179170000; SE +/- 81853.53, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better): a = 27.43, b = 27.27; SE +/- 0.06, N = 3; (CXX) g++ options: -O3
Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 33.10, b = 32.91; SE +/- 0.16, N = 3
Palabos 2.3 - Grid Size: 400 (Mega Site Updates Per Second, More Is Better): a = 139.30, b = 140.08; SE +/- 0.57, N = 3; (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 16.34, b = 16.43; SE +/- 0.01, N = 3
Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 61.20, b = 60.86; SE +/- 0.05, N = 3
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 795103333, b = 799490000; SE +/- 377903.57, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 3244.46 (MIN: 3177.37), b = 3227.02 (MIN: 3219.36); SE +/- 29.09, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Xonotic 0.8.6 - Resolution: 1920 x 1200 - Effects Quality: Ultimate (Frames Per Second, More Is Better): a = 384.77 (MIN: 102 / MAX: 919), b = 386.81 (MIN: 104 / MAX: 887); SE +/- 0.49, N = 3
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 103806667, b = 103260000; SE +/- 150591.43, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
SQLite 3.41.2 - Threads / Copies: 32 (Seconds, Fewer Is Better): a = 505.42, b = 502.77; SE +/- 1.08, N = 3; (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 1343200000, b = 1350000000; SE +/- 2211334.44, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Remhos 1.0 - Test: Sample Remap Example (Seconds, Fewer Is Better): a = 23.54, b = 23.65; SE +/- 0.04, N = 3; (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Stress-NG 0.15.10 - Test: MMAP (Bogo Ops/s, More Is Better): a = 437.11, b = 439.24; SE +/- 0.81, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 2.68566 (MIN: 2.61), b = 2.69872 (MIN: 2.64); SE +/- 0.01103, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 41277667, b = 41475000; SE +/- 67087.84, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Embree 4.1 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): a = 41.59 (MIN: 41.23 / MAX: 42.19), b = 41.78 (MIN: 41.55 / MAX: 42.41); SE +/- 0.05, N = 3
High Performance Conjugate Gradient 3.1 - X Y Z: 104 104 104 - RT: 60 (GFLOP/s, More Is Better): a = 10.96, b = 11.02; SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better): a = 30.14, b = 30.29; SE +/- 0.11, N = 3; (CXX) g++ options: -O3
Opus Codec Encoding 1.4 - WAV To Opus Encode (Seconds, Fewer Is Better): a = 28.70, b = 28.83; SE +/- 0.05, N = 5; (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 91.57, b = 91.98; SE +/- 0.13, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 10.91, b = 10.86; SE +/- 0.02, N = 3
Xonotic 0.8.6 - Resolution: 3840 x 2160 - Effects Quality: High (Frames Per Second, More Is Better): a = 467.69 (MIN: 222 / MAX: 635), b = 469.75 (MIN: 225 / MAX: 637); SE +/- 0.24, N = 3
QMCPACK 3.16 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better): a = 27.60, b = 27.48; SE +/- 0.04, N = 3; (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a = 67.82, b = 67.54; SE +/- 0.04, N = 3
SVT-AV1 1.6 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 127.14, b = 127.68; SE +/- 0.08, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 235.81, b = 236.82; SE +/- 0.13, N = 3
Stress-NG 0.15.10 - Test: Poll (Bogo Ops/s, More Is Better): a = 4084623.29, b = 4101817.96; SE +/- 1922.15, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, More Is Better): a = 561.44 (MIN: 330 / MAX: 956), b = 563.79 (MIN: 337 / MAX: 945); SE +/- 1.66, N = 3
Laghos 3.1 - Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Total Rate, More Is Better): a = 264.34, b = 265.39; SE +/- 0.22, N = 3; (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 1506266667, b = 1512100000; SE +/- 2366666.67, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Embree 4.1 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 33.83 (MIN: 33.53 / MAX: 34.4), b = 33.96 (MIN: 33.75 / MAX: 34.43); SE +/- 0.06, N = 3
Stress-NG 0.15.10 - Test: Matrix 3D Math (Bogo Ops/s, More Is Better): a = 2806.09, b = 2795.85; SE +/- 1.75, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 323.09, b = 324.27; SE +/- 0.39, N = 3
LevelDB 1.23 - Benchmark: Sequential Fill (MB/s, More Is Better): a = 27.8, b = 27.7; SE +/- 0.07, N = 3; (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Stress-NG 0.15.10 - Test: Forking (Bogo Ops/s, More Is Better): a = 51344.69, b = 51160.22; SE +/- 291.28, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
VVenC 1.8 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): a = 10.93, b = 10.89; SE +/- 0.04, N = 3; (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Stress-NG 0.15.10 - Test: AVL Tree (Bogo Ops/s, More Is Better): a = 283.41, b = 282.42; SE +/- 0.22, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a = 49.49, b = 49.32; SE +/- 0.05, N = 3
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 1.57740 (MIN: 1.49), b = 1.57190 (MIN: 1.5); SE +/- 0.00305, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 4.81893 (MIN: 4.74), b = 4.83565 (MIN: 4.78); SE +/- 0.01212, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 51993333, b = 51814000; SE +/- 193694.20, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 21.57, b = 21.49; SE +/- 0.01, N = 3
Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 46.35, b = 46.51; SE +/- 0.03, N = 3
Embree 4.1 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): a = 34.41 (MIN: 33.95 / MAX: 35.09), b = 34.53 (MIN: 34.2 / MAX: 35.09); SE +/- 0.09, N = 3
libxsmm 2-1.17-3645 - M N K: 256 (GFLOPS/s, More Is Better): a = 910.4, b = 907.4; SE +/- 3.58, N = 3; (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
QMCPACK 3.16 - Input: FeCO6_b3lyp_gms (Total Execution Time - Seconds, Fewer Is Better): a = 175.39, b = 174.82; SE +/- 0.14, N = 3; (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): a = 7.67668, b = 7.70051; SE +/- 0.01066, N = 3
VVenC 1.8 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): a = 24.88, b = 24.80; SE +/- 0.09, N = 3; (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better): a = 181.27, b = 180.73; SE +/- 0.15, N = 3; (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Stress-NG 0.15.10 - Test: Wide Vector Math (Bogo Ops/s, More Is Better): a = 1501239.29, b = 1496970.66; SE +/- 3256.42, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 935.61 (MIN: 895.45), b = 932.99 (MIN: 924.9); SE +/- 7.63, N = 15; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 3235.41 (MIN: 3194.45), b = 3226.70 (MIN: 3215.86); SE +/- 25.05, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 313753333, b = 314560000; SE +/- 399263.21, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 255.99, b = 256.64; SE +/- 0.36, N = 3
Stress-NG 0.15.10 - Test: Crypto (Bogo Ops/s, More Is Better): a = 78260.17, b = 78455.19; SE +/- 78.61, N = 3; (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Embree 4.1 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 37.39 (MIN: 37.08 / MAX: 37.99), b = 37.48 (MIN: 37.25 / MAX: 38.16); SE +/- 0.05, N = 3
Palabos 2.3 - Grid Size: 100 (Mega Site Updates Per Second, More Is Better): a = 121.93, b = 122.23; SE +/- 0.14, N = 3; (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 89896333, b = 89686000; SE +/- 85545.96, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Low (Frames Per Second, More Is Better): a = 671.42 (MIN: 430 / MAX: 1177), b = 669.86 (MIN: 439 / MAX: 1136); SE +/- 0.98, N = 3
OSPRay 2.12 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): a = 9.74548, b = 9.76771; SE +/- 0.00531, N = 3
OSPRay 2.12 - Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better): a = 9.86893, b = 9.89124; SE +/- 0.00341, N = 3
Liquid-DSP 1.6 - Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 409183333, b = 410090000; SE +/- 176099.72, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
SVT-AV1 1.6 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 85.31, b = 85.50; SE +/- 0.40, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
LevelDB 1.23 - Benchmark: Sequential Fill (Microseconds Per Op, Fewer Is Better): a = 254.94, b = 255.49; SE +/- 0.72, N = 3; (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
SVT-AV1 1.6 - Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 10.87, b = 10.85; SE +/- 0.05, N = 3; (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 10537667, b = 10560000; SE +/- 21333.33, N = 3; (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 14 28 42 56 70 SE +/- 0.10, N = 3 62.44 62.31
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b 120 240 360 480 600 SE +/- 1.59, N = 3 552.28 551.16
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 32 a b 150M 300M 450M 600M 750M SE +/- 1128996.60, N = 3 690800000 692130000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Stress a b 20K 40K 60K 80K 100K SE +/- 76.38, N = 3 82729.76 82887.24 1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 a b 4 8 12 16 20 SE +/- 0.01, N = 3 13.76 13.79 1. (CXX) g++ options: -O3
Stress-NG Test: Cloning OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Cloning a b 700 1400 2100 2800 3500 SE +/- 2.74, N = 3 3354.40 3360.52 1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better): a = 30.80, b = 30.86; SE +/- 0.37, N = 3. (CXX) g++ options: -O3
Stress-NG 0.15.10, Test: Floating Point (Bogo Ops/s, More Is Better): a = 11201.44, b = 11221.55; SE +/- 7.25, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
dav1d 1.2.1, Video Input: Chimera 1080p 10-bit (FPS, More Is Better): a = 374.79, b = 374.13; SE +/- 0.32, N = 3. (CC) gcc options: -pthread -lm
Xonotic 0.8.6, Resolution: 1920 x 1080 - Effects Quality: Ultimate (Frames Per Second, More Is Better): a = 386.94 (MIN: 97 / MAX: 892), b = 386.28 (MIN: 101 / MAX: 871); SE +/- 2.17, N = 3.
Stress-NG 0.15.10, Test: Glibc Qsort Data Sorting (Bogo Ops/s, More Is Better): a = 942.22, b = 943.84; SE +/- 0.47, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better): a = 13.88, b = 13.85; SE +/- 0.01, N = 3. (CXX) g++ options: -O3
Stress-NG 0.15.10, Test: MEMFD (Bogo Ops/s, More Is Better): a = 395.11, b = 394.50; SE +/- 0.62, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Liquid-DSP 1.6, Threads: 8 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 354570000, b = 355110000; SE +/- 120138.81, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Palabos 2.3, Grid Size: 500 (Mega Site Updates Per Second, More Is Better): a = 143.85, b = 144.06; SE +/- 0.27, N = 3. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Z3 Theorem Prover 4.12.1, SMT File: 2.smt2 (Seconds, Fewer Is Better): a = 76.01, b = 76.12; SE +/- 0.12, N = 3. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Stress-NG 0.15.10, Test: System V Message Passing (Bogo Ops/s, More Is Better): a = 10692419.88, b = 10677047.79; SE +/- 13654.50, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse 1.5, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 28.88, b = 28.92; SE +/- 0.10, N = 3.
SQLite 3.41.2, Threads / Copies: 16 (Seconds, Fewer Is Better): a = 373.82, b = 374.33; SE +/- 1.37, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
dav1d 1.2.1, Video Input: Summer Nature 4K (FPS, More Is Better): a = 222.52, b = 222.24; SE +/- 0.24, N = 3. (CC) gcc options: -pthread -lm
libxsmm 2-1.17-3645, M N K: 32 (GFLOPS/s, More Is Better): a = 160.5, b = 160.7; SE +/- 0.07, N = 3. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
LevelDB 1.23, Benchmark: Random Read (Microseconds Per Op, Fewer Is Better): a = 43.49, b = 43.55; SE +/- 0.19, N = 3. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Liquid-DSP 1.6, Threads: 8 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 82123667, b = 82224000; SE +/- 37834.43, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG 0.15.10, Test: IO_uring (Bogo Ops/s, More Is Better): a = 439798.24, b = 440335.12; SE +/- 726.45, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Xonotic 0.8.6, Resolution: 3840 x 2160 - Effects Quality: Ultimate (Frames Per Second, More Is Better): a = 311.40 (MIN: 97 / MAX: 487), b = 311.77 (MIN: 98 / MAX: 488); SE +/- 0.68, N = 3.
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 45075000, b = 45023000; SE +/- 21825.06, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Z3 Theorem Prover 4.12.1, SMT File: 1.smt2 (Seconds, Fewer Is Better): a = 29.93, b = 29.90; SE +/- 0.01, N = 3. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Stress-NG 0.15.10, Test: Memory Copying (Bogo Ops/s, More Is Better): a = 10973.65, b = 10984.91; SE +/- 6.50, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10, Test: Fused Multiply-Add (Bogo Ops/s, More Is Better): a = 33507543.08, b = 33539318.97; SE +/- 7490.49, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10, Test: Atomic (Bogo Ops/s, More Is Better): a = 480.06, b = 480.51; SE +/- 0.46, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
SQLite 3.41.2, Threads / Copies: 64 (Seconds, Fewer Is Better): a = 681.45, b = 680.81; SE +/- 0.71, N = 3. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better): a = 110.85, b = 110.95; SE +/- 0.26, N = 3. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
nekRS 23.0, Input: TurboPipe Periodic (flops/rank, More Is Better): a = 3444566667, b = 3441770000; SE +/- 1942175.18, N = 3. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 (GFLOP/s, More Is Better): a = 30.01, b = 30.04; SE +/- 0.04, N = 3. (CXX) g++ options: -O3
OSPRay 2.12, Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): a = 128.57, b = 128.66; SE +/- 0.05, N = 3.
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 (GFLOP/s, More Is Better): a = 27.71, b = 27.73; SE +/- 0.01, N = 3. (CXX) g++ options: -O3
libxsmm 2-1.17-3645, M N K: 128 (GFLOPS/s, More Is Better): a = 635.8, b = 635.4; SE +/- 0.22, N = 3. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
libxsmm 2-1.17-3645, M N K: 64 (GFLOPS/s, More Is Better): a = 318.5, b = 318.7; SE +/- 0.09, N = 3. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Neural Magic DeepSparse 1.5, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a = 134.41, b = 134.32; SE +/- 0.21, N = 3.
Neural Magic DeepSparse 1.5, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 119.01, b = 119.09; SE +/- 0.18, N = 3.
PETSc 3.19, Test: Streams (MB/s, More Is Better): a = 58312.10, b = 58276.79; SE +/- 71.95, N = 3. (CC) gcc options: -fPIC -O3 -O2 -lpthread -ludev -lpciaccess -lm
Stress-NG 0.15.10, Test: Mutex (Bogo Ops/s, More Is Better): a = 18827346.28, b = 18816044.49; SE +/- 22386.94, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Xonotic 0.8.6, Resolution: 1920 x 1200 - Effects Quality: Ultra (Frames Per Second, More Is Better): a = 521.50 (MIN: 282 / MAX: 935), b = 521.20 (MIN: 285 / MAX: 919); SE +/- 0.46, N = 3.
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 506326667, b = 506050000; SE +/- 148361.42, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 1836033333, b = 1837000000; SE +/- 19718378.34, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
dav1d 1.2.1, Video Input: Chimera 1080p (FPS, More Is Better): a = 398.39, b = 398.20; SE +/- 0.11, N = 3. (CC) gcc options: -pthread -lm
Stress-NG 0.15.10, Test: Hash (Bogo Ops/s, More Is Better): a = 7627578.66, b = 7624159.24; SE +/- 2470.62, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10, Test: Malloc (Bogo Ops/s, More Is Better): a = 92853207.13, b = 92812375.37; SE +/- 44041.52, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Embree 4.1, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): a = 38.47 (MIN: 37.94 / MAX: 39.09), b = 38.45 (MIN: 38.09 / MAX: 38.98); SE +/- 0.08, N = 3.
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 (GFLOP/s, More Is Better): a = 15.41, b = 15.41; SE +/- 0.01, N = 3. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 (GFLOP/s, More Is Better): a = 15.35, b = 15.35; SE +/- 0.00, N = 3. (CXX) g++ options: -O3
dav1d 1.2.1, Video Input: Summer Nature 1080p (FPS, More Is Better): a = 597.02, b = 597.19; SE +/- 0.86, N = 3. (CC) gcc options: -pthread -lm
LevelDB 1.23, Benchmark: Overwrite (Microseconds Per Op, Fewer Is Better): a = 262.35, b = 262.28; SE +/- 0.92, N = 3. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Stress-NG 0.15.10, Test: Pthread (Bogo Ops/s, More Is Better): a = 128353.64, b = 128387.57; SE +/- 521.45, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Liquid-DSP 1.6, Threads: 4 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 206086667, b = 206140000; SE +/- 210502.05, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG 0.15.10, Test: Zlib (Bogo Ops/s, More Is Better): a = 4517.78, b = 4518.88; SE +/- 2.96, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10, Test: Vector Math (Bogo Ops/s, More Is Better): a = 224417.23, b = 224460.71; SE +/- 22.96, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10, Test: Function Call (Bogo Ops/s, More Is Better): a = 24278.34, b = 24275.23; SE +/- 38.00, N = 3. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Liquid-DSP 1.6, Threads: 16 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 160113333, b = 160130000; SE +/- 32829.53, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
LevelDB 1.23, Benchmark: Random Delete (Microseconds Per Op, Fewer Is Better): a = 245.16, b = 245.18; SE +/- 0.46, N = 3. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): a = 4.62468, b = 4.62434; SE +/- 0.00170, N = 3.
Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): a = 39.40 (MIN: 39.15 / MAX: 40.06), b = 39.40 (MIN: 39.18 / MAX: 39.86); SE +/- 0.02, N = 3.
Intel Open Image Denoise 2.0, Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better): a = 0.60, b = 0.60; SE +/- 0.00, N = 3.
Intel Open Image Denoise 2.0, Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 1.22, b = 1.22; SE +/- 0.00, N = 3.
LevelDB 1.23, Benchmark: Overwrite (MB/s, More Is Better): a = 27.0, b = 27.0; SE +/- 0.09, N = 3. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy -ltcmalloc
Stress-NG 0.15.10, Test: Socket Activity (Bogo Ops/s, More Is Better): a = 3072.80, b = 9580.27; SE +/- 1064.20, N = 15. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Stress-NG 0.15.10, Test: Pipe (Bogo Ops/s, More Is Better): a = 18809740.35, b = 22201912.16; SE +/- 858971.94, N = 15. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lmpfr -lpthread -lrt -lsctp -lz
Phoronix Test Suite v10.8.5