HP Zbook Intel Core i9-10885H testing with a HP 8736 (S91 Ver. 01.02.01 BIOS) and NVIDIA Quadro RTX 5000 with Max-Q Design 16GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101076-HA-HPZBOOK6247&grs&sor .
HP Zbook system details
Runs: r1, r2, r3
Processor: Intel Core i9-10885H @ 5.30GHz (8 Cores / 16 Threads)
Motherboard: HP 8736 (S91 Ver. 01.02.01 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 32GB
Disk: 2048GB KXG50PNV2T04 KIOXIA
Graphics: NVIDIA Quadro RTX 5000 with Max-Q Design 16GB (600/6000MHz)
Audio: Intel Comet Lake PCH cAVS
Network: Intel Wi-Fi 6 AX201
OS: Ubuntu 20.04
Kernel: 5.6.0-1034-oem (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: NVIDIA 450.80.02
OpenGL: 4.6.0
OpenCL: OpenCL 1.2 CUDA 11.0.228
Vulkan: 1.2.133
Compiler: GCC 9.3.0 + CUDA 10.1
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: intel_pstate powersave - CPU Microcode: 0xe0 - Thermald 1.9.1
OpenCL Details - GPU Compute Cores: 3072
Python Details - Python 3.8.3
Security Details - itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
HP Zbook result overview
Test | r1 | r2 | r3
clomp: Static OMP Speedup | 3.7 | 2.5 | 3.6
tnn: CPU - MobileNet v2 | 321.420 | 295.547 | 299.396
redis: GET | 3248596.08 | 3012560.83 | 3009326.75
viennacl: OpenCL LU Factorization | 68.2924 | 64.2335 | 65.9180
espeak: Text-To-Speech Synthesis | 26.474 | 27.178 | 27.713
rnnoise: | 22.084 | 21.316 | 22.044
astcenc: Fast | 5.44 | 5.59 | 5.63
plaidml: No - Inference - IMDB LSTM - OpenCL | 463.34 | 477.39 | 478.73
encode-ape: WAV To APE | 10.512 | 10.861 | 10.592
graphics-magick: Rotate | 902 | 875 | 900
cryptsetup: PBKDF2-sha512 | 1919349 | 1943008 | 1886103
tnn: CPU - SqueezeNet v1.1 | 272.907 | 264.948 | 272.676
compress-lz4: 3 - Compression Speed | 57.88 | 57.36 | 58.89
leveldb: Hot Read | 6.946 | 7.099 | 7.128
redis: LPUSH | 2041750.08 | 2094056.31 | 2083566.29
ncnn: CPU - regnety_400m | 19.16 | 18.91 | 19.38
redis: SET | 2375800.25 | 2413657.0 | 2433543.8
cryptsetup: PBKDF2-whirlpool | 816282 | 830020 | 810352
ncnn: CPU - blazeface | 2.54 | 2.60 | 2.57
compress-lz4: 9 - Compression Speed | 55.72 | 56.07 | 57.01
stockfish: Total Time | 9703133 | 9839292 | 9629353
hashcat: 7-Zip | 373667 | 370400 | 366433
leveldb: Fill Sync | 3361.777 | 3424.918 | 3386.084
onednn: IP Shapes 3D - u8s8f32 - CPU | 2.72558 | 2.77670 | 2.74874
cryptsetup: AES-XTS 256b Encryption | 4005.6 | 4080.5 | 4023.0
lczero: OpenCL | 13277 | 13173 | 13416
build-eigen: Time To Compile | 68.744 | 67.543 | 68.699
onednn: IP Shapes 1D - f32 - CPU | 7.16575 | 7.04404 | 7.14574
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 1.17 | 1.19 | 1.19
openvino: Age Gender Recognition Retail 0013 FP32 - CPU | 3363.55 | 3307.53 | 3347.93
openvino: Age Gender Recognition Retail 0013 FP32 - CPU | 1.21 | 1.23 | 1.22
mnn: inception-v3 | 62.568 | 63.180 | 63.563
ncnn: CPU - resnet50 | 37.81 | 37.30 | 37.22
ncnn: Vulkan GPU - resnet18 | 18.62 | 18.33 | 18.38
sqlite-speedtest: Timed Time - Size 1,000 | 49.547 | 50.268 | 50.261
betsy: ETC2 RGB - Highest | 8.016 | 7.912 | 7.903
clpeak: Single-Precision Float | 5940.64 | 5858.32 | 5892.70
graphics-magick: Sharpen | 72 | 72 | 73
cryptsetup: AES-XTS 512b Encryption | 3346.8 | 3381.9 | 3336.0
embree: Pathtracer ISPC - Asian Dragon | 9.1343 | 9.2596 | 9.1967
waifu2x-ncnn: 2x - 3 - Yes | 6.020 | 6.102 | 6.093
warsow: 1920 x 1080 | 955.6 | 967.9 | 968.6
ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - Multeasymap | 435.20 | 429.37 | 434.24
embree: Pathtracer ISPC - Crown | 7.0735 | 6.9794 | 6.9976
simdjson: Kostya | 0.76 | 0.75 | 0.75
onednn: IP Shapes 3D - f32 - CPU | 12.4721 | 12.4447 | 12.6089
cryptsetup: AES-XTS 256b Decryption | 4002.4 | 4055.1 | 4026.9
asmfish: 1024 Hash Memory, 26 Depth | 15984719 | 15974611 | 16180674
rodinia: OpenCL Particle Filter | 7.115 | 7.055 | 7.027
redis: SADD | 2660539.42 | 2628039.25 | 2634908.83
cryptsetup: AES-XTS 512b Decryption | 3348.3 | 3388.5 | 3362.9
unigine-super: 1920 x 1080 - Fullscreen - Ultra - OpenGL | 25.1 | 25.4 | 25.3
simdjson: PartialTweets | 0.86 | 0.87 | 0.86
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 3442.78 | 3403.45 | 3405.92
ncnn: CPU - googlenet | 19.98 | 20.01 | 20.21
gromacs: Water Benchmark | 0.617 | 0.610 | 0.614
simdjson: DistinctUserID | 0.89 | 0.88 | 0.88
betsy: ETC1 - Highest | 5.854 | 5.789 | 5.792
cryptsetup: Twofish-XTS 256b Encryption | 482.0 | 487.4 | 483.6
gegl: Reflect | 28.183 | 28.496 | 28.313
hashcat: TrueCrypt RIPEMD160 + XTS | 301233 | 301433 | 298133
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 8.96782 | 9.00692 | 9.06628
mnn: resnet-v2-50 | 58.161 | 58.530 | 58.786
mafft: Multiple Sequence Alignment - LSU RNA | 10.497 | 10.564 | 10.608
ncnn: CPU - yolov4-tiny | 35.95 | 35.59 | 35.66
phpbench: PHP Benchmark Suite | 837911 | 832417 | 829705
realsr-ncnn: 4x - Yes | 99.812 | 100.617 | 100.748
vkresample: 2x - Single | 24.992 | 25.190 | 25.225
node-web-tooling: | 13.06 | 13.17 | 13.18
crafty: Elapsed Time | 9497414 | 9584148 | 9560012
unigine-super: 1920 x 1080 - Fullscreen - High - OpenGL | 65.9 | 66.5 | 66.2
openvino: Person Detection 0106 FP16 - CPU | 4961.99 | 4978.25 | 5006.34
basis: UASTC Level 0 | 7.288 | 7.345 | 7.353
cryptsetup: Serpent-XTS 512b Encryption | 878.0 | 882.1 | 874.4
gegl: Crop | 8.900 | 8.839 | 8.826
cryptsetup: Serpent-XTS 256b Encryption | 874.1 | 881.4 | 874.1
openvino: Face Detection 0106 FP32 - CPU | 1.26 | 1.27 | 1.27
cryptsetup: Twofish-XTS 256b Decryption | 482.5 | 486.3 | 483.0
cryptsetup: Serpent-XTS 512b Decryption | 871.7 | 878.1 | 873.5
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 18.0080 | 17.9035 | 18.0326
tensorflow-lite: NASNet Mobile | 302594 | 304756 | 304079
arrayfire: Conjugate Gradient OpenCL | 2.549 | 2.531 | 2.548
cryptsetup: Twofish-XTS 512b Encryption | 483.0 | 486.4 | 483.8
dav1d: Chimera 1080p | 489.84 | 486.46 | 487.57
graphics-magick: Noise-Gaussian | 146 | 147 | 147
unpack-firefox: firefox-84.0.source.tar.xz | 16.028 | 16.137 | 16.103
vkfft: | 25820 | 25647 | 25683
gegl: Tile Glass | 28.243 | 28.242 | 28.055
astcenc: Thorough | 54.29 | 54.38 | 54.65
gegl: Scale | 6.954 | 6.973 | 7.000
clpeak: Integer Compute INT | 5504.35 | 5519.39 | 5540.44
cryptsetup: Serpent-XTS 256b Decryption | 872.3 | 876.6 | 870.9
dav1d: Summer Nature 4K | 112.75 | 112.03 | 112.65
cl-mem: Copy | 236.6 | 235.4 | 235.1
cryptsetup: Twofish-XTS 512b Decryption | 482.7 | 485.7 | 483.0
hashcat: SHA-512 | 1023100000 | 1020000000 | 1016800000
numpy: | 419.58 | 419.36 | 417.03
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.22103 | 0.22238 | 0.22171
gegl: Cartoon | 86.789 | 87.319 | 86.993
compress-lz4: 1 - Compression Speed | 8120.67 | 8127.78 | 8079.18
hashcat: SHA1 | 8585766667 | 8544500000 | 8535333333
ncnn: Vulkan GPU - alexnet | 15.44 | 15.53 | 15.50
embree: Pathtracer - Crown | 6.0806 | 6.0989 | 6.0641
hashcat: MD5 | 24334866667 | 24260200000 | 24196900000
unigine-heaven: 1920 x 1080 - Fullscreen - OpenGL | 139.126 | 139.905 | 139.184
blender: Fishy Cat - CUDA | 168.87 | 167.96 | 168.08
leveldb: Seq Fill | 37.5 | 37.4 | 37.3
realsr-ncnn: 4x - No | 14.734 | 14.656 | 14.694
rav1e: 10 | 3.422 | 3.404 | 3.420
blender: Classroom - NVIDIA OptiX | 116.76 | 116.15 | 116.26
leveldb: Seek Rand | 12.694 | 12.629 | 12.644
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.36381 | 4.37852 | 4.38535
ncnn: CPU - resnet18 | 18.62 | 18.71 | 18.66
build-linux-kernel: Time To Compile | 151.656 | 152.208 | 151.478
basis: UASTC Level 2 | 55.499 | 55.742 | 55.766
leveldb: Overwrite | 40.925 | 40.955 | 40.760
ncnn: CPU - squeezenet_ssd | 27.64 | 27.51 | 27.63
rav1e: 5 | 1.069 | 1.064 | 1.064
leveldb: Overwrite | 43.2 | 43.2 | 43.4
blender: Classroom - CUDA | 250.78 | 251.90 | 251.80
redshift: | 461 | 460 | 459
rawtherapee: Total Benchmark Time | 80.586 | 80.934 | 80.712
blender: Barbershop - CUDA | 734.81 | 731.67 | 733.02
indigobench: CPU - Bedroom | 0.939 | 0.938 | 0.935
astcenc: Exhaustive | 447.99 | 449.37 | 449.90
build2: Time To Compile | 210.051 | 210.712 | 210.945
indigobench: CPU - Supercar | 2.147 | 2.150 | 2.156
cl-mem: Write | 215.7 | 215.6 | 214.8
gegl: Rotate 90 Degrees | 37.697 | 37.541 | 37.691
basis: ETC1S | 57.824 | 58.063 | 58.062
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 9.77594 | 9.76468 | 9.73732
unigine-super: 1920 x 1080 - Fullscreen - Low - OpenGL | 177.7 | 178.1 | 177.4
gegl: Color Enhance | 54.114 | 54.312 | 54.101
tensorflow-lite: SqueezeNet | 354892 | 356034 | 356258
leveldb: Seq Fill | 47.235 | 47.287 | 47.415
darktable: Masskrug - CPU-only | 7.128 | 7.150 | 7.155
ncnn: CPU - mobilenet | 26.62 | 26.63 | 26.53
tensorflow-lite: Inception ResNet V2 | 4660197 | 4670567 | 4677473
ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.0 - Default - Multeasymap | 413.88 | 412.43 | 412.38
onednn: Convolution Batch Shapes Auto - f32 - CPU | 21.6871 | 21.6992 | 21.6210
compress-zstd: 19 | 28.8 | 28.7 | 28.8
inkscape: SVG Files To PNG | 20.996 | 21.048 | 21.068
leveldb: Rand Delete | 47.228 | 47.296 | 47.388
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 4.45564 | 4.47062 | 4.46656
mandelgpu: GPU | 251986408.7 | 252826584.8 | 252822614.4
brl-cad: VGR Performance Metric | 63909 | 63822 | 64033
compress-lz4: 3 - Decompression Speed | 9676.3 | 9653.7 | 9685.2
darktable: Boat - CPU-only | 15.914 | 15.870 | 15.863
ncnn: CPU - vgg16 | 72.09 | 71.91 | 71.86
deepspeech: CPU | 81.29517 | 81.07316 | 81.03983
compress-lz4: 9 - Decompression Speed | 9679.8 | 9664.8 | 9695.2
compress-lz4: 1 - Decompression Speed | 9823.2 | 9839.9 | 9810.0
openvino: Face Detection 0106 FP32 - CPU | 3202.53 | 3207.35 | 3212.10
tensorflow-lite: Inception V4 | 5163190 | 5168183 | 5178263
tensorflow-lite: Mobilenet Quant | 236716 | 237129 | 237406
dav1d: Chimera 1080p 10-bit | 86.08 | 85.83 | 85.95
vkresample: 2x - Double | 256.867 | 257.062 | 257.615
plaidml: Yes - Inference - Mobilenet - OpenCL | 1819.24 | 1823.06 | 1817.78
encode-opus: WAV To Opus Encode | 7.624 | 7.602 | 7.616
rav1e: 1 | 0.347 | 0.346 | 0.347
blender: Fishy Cat - NVIDIA OptiX | 60.35 | 60.18 | 60.25
rav1e: 6 | 1.444 | 1.440 | 1.443
mnn: mobilenet-v1-1.0 | 10.646 | 10.675 | 10.658
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 7140.50 | 7159.42 | 7151.58
coremark: CoreMark Size 666 - Iterations Per Second | 223414.807558 | 223304.983286 | 223892.444726
gegl: Wavelet Blur | 57.993 | 57.950 | 57.843
ncnn: CPU - alexnet | 15.50 | 15.46 | 15.49
graphics-magick: HWB Color Space | 775 | 774 | 776
gegl: Antialias | 36.556 | 36.557 | 36.646
ai-benchmark: Device Training Score | 816 | 814 | 814
blender: Barbershop - NVIDIA OptiX | 1192.96 | 1190.05 | 1192.80
ncnn: Vulkan GPU - resnet50 | 37.25 | 37.34 | 37.26
plaidml: No - Inference - Mobilenet - OpenCL | 1246.78 | 1244.95 | 1247.93
leveldb: Rand Fill | 43.1 | 43.1 | 43.2
unigine-super: 1920 x 1080 - Fullscreen - Medium - OpenGL | 90.4 | 90.6 | 90.5
ncnn: Vulkan GPU - squeezenet_ssd | 27.58 | 27.52 | 27.55
embree: Pathtracer - Asian Dragon | 7.5555 | 7.5656 | 7.5496
openvino: Person Detection 0106 FP32 - CPU | 5069.44 | 5079.89 | 5073.09
blender: BMW27 - CUDA | 91.00 | 90.82 | 90.93
ncnn: Vulkan GPU - vgg16 | 71.96 | 71.82 | 71.86
build-ffmpeg: Time To Compile | 100.257 | 100.397 | 100.203
onednn: Recurrent Neural Network Training - f32 - CPU | 7155.41 | 7159.48 | 7169.03
basis: UASTC Level 3 | 110.838 | 110.926 | 111.040
hpcg: | 3.96177 | 3.96068 | 3.95457
graphics-magick: Resizing | 552 | 551 | 551
tensorflow-lite: Mobilenet Float | 239119 | 239224 | 239537
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 3797.32 | 3799.45 | 3792.87
darktable: Server Room - CPU-only | 4.181 | 4.174 | 4.178
yquake2: Software CPU - 1920 x 1080 | 60.7 | 60.7 | 60.6
blender: Pabellon Barcelona - CUDA | 608.80 | 609.56 | 608.62
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 7144.23 | 7154.66 | 7147.09
compress-zstd: 3 | 2833.6 | 2831.0 | 2835.1
leveldb: Rand Fill | 41.037 | 41.027 | 40.981
ai-benchmark: Device AI Score | 1546 | 1544 | 1544
octanebench: Total Score | 189.085068 | 189.101719 | 189.316553
cl-mem: Read | 330.3 | 329.9 | 329.9
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 3795.81 | 3800.41 | 3798.12
basis: UASTC Level 2 + RDO Post-Processing | 840.347 | 840.319 | 841.228
blender: Pabellon Barcelona - NVIDIA OptiX | 196.21 | 196.28 | 196.41
dav1d: Summer Nature 1080p | 460.02 | 459.61 | 459.71
fahbench: | 186.4611 | 186.4777 | 186.6158
plaidml: No - Inference - DenseNet 201 - OpenCL | 110.07 | 109.98 | 109.99
ncnn: Vulkan GPU - mobilenet | 26.52 | 26.53 | 26.51
onednn: Recurrent Neural Network Inference - f32 - CPU | 3795.02 | 3797.05 | 3797.72
openvino: Face Detection 0106 FP16 - CPU | 3165.24 | 3166.57 | 3164.51
hmmer: Pfam Database Search | 105.526 | 105.572 | 105.505
clpeak: Global Memory Bandwidth | 324.63 | 324.58 | 324.78
ncnn: Vulkan GPU - yolov4-tiny | 35.52 | 35.51 | 35.53
clpeak: Double-Precision Double | 340.42 | 340.46 | 340.59
financebench: Black-Scholes OpenCL | 17.477667 | 17.476 | 17.476333
ai-benchmark: Device Inference Score | 730 | 730 | 730
openvino: Person Detection 0106 FP32 - CPU | 0.79 | 0.79 | 0.79
openvino: Person Detection 0106 FP16 - CPU | 0.80 | 0.80 | 0.80
openvino: Face Detection 0106 FP16 - CPU | 1.28 | 1.28 | 1.28
darktable: Server Rack - CPU-only | 0.181 | 0.181 | 0.181
graphics-magick: Enhanced | 115 | 115 | 115
graphics-magick: Swirl | 207 | 207 | 207
simdjson: LargeRand | 0.5 | 0.5 | 0.5
yquake2: OpenGL 3.x - 1920 x 1080 | 60 | 60 | 60
yquake2: OpenGL 1.x - 1920 x 1080 | 59.9 | 59.9 | 59.9
leveldb: Fill Sync | 0.5 | 0.5 | 0.5
neatbench: GPU | 27.5 | 27.1 | 27.6
blender: BMW27 - NVIDIA OptiX | 41.47 | 38.07 | 38.07
ncnn: Vulkan GPU - regnety_400m | 19.16 | 17.15 | 17.60
ncnn: Vulkan GPU - googlenet | 20.05 | 18.20 | 18.26
ncnn: Vulkan GPU - blazeface | 2.55 | 2.29 | 2.29
ncnn: Vulkan GPU - efficientnet-b0 | 10.01 | 9.02 | 8.99
ncnn: Vulkan GPU - mnasnet | 6.63 | 5.86 | 5.91
ncnn: Vulkan GPU - shufflenet-v2 | 7.92 | 6.98 | 7.05
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 5.74 | 5.73 | 5.81
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 7.23 | 7.22 | 7.19
ncnn: CPU - efficientnet-b0 | 10.00 | 9.05 | 9.06
ncnn: CPU - mnasnet | 6.67 | 5.96 | 5.96
ncnn: CPU - shufflenet-v2 | 7.93 | 6.95 | 7.03
ncnn: CPU-v3-v3 - mobilenet-v3 | 5.74 | 5.81 | 5.81
ncnn: CPU-v2-v2 - mobilenet-v2 | 7.31 | 7.22 | 7.23
mnn: MobileNetV2_224 | 5.239 | 5.291 | 5.285
mnn: SqueezeNetV1.0 | 8.899 | 8.982 | 8.944
redis: LPOP | 3394660.2 | 2104092.33 | 2809233.48
astcenc: Medium | 7.68 | 7.61 | 7.58
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 4.74772 | 4.71457 | 4.73728
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 9.87893 | 9.77701 | 9.81238
onednn: IP Shapes 1D - u8s8f32 - CPU | 3.17762 | 3.16769 | 3.11291
lammps: Rhodopsin Protein | 5.198 | 5.169 | 5.179
luxcorerender-cl: Rainbow Colors and Prism | 5.30 | 5.39 | 5.41
luxcorerender-cl: LuxCore Benchmark | 2.26 | 2.31 | 2.29
luxcorerender-cl: Food | 1.27 | 1.32 | 1.30
luxcorerender-cl: DLSC | 2.70 | 2.77 | 2.76
ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - RaiNyMore2 | 158.21 | 100.58 | 130.66
ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.0 - Default - RaiNyMore2 | 170.36 | 169.30 | 151.49
leveldb: Rand Read | 9.620 | 9.692 | 9.573
DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better): r1: Min 2 / Avg 2.3 / Max 10.06; r2: Min 2 / Avg 2.32 / Max 5.18; r3: Min 2 / Avg 2.32 / Max 8.68. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better): r3: Min 2 / Avg 2.39 / Max 7.28; r1: Min 2 / Avg 2.43 / Max 6.55; r2: Min 2 / Avg 2.46 / Max 6.5. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
CLOMP 1.2 Static OMP Speedup (Speedup, More Is Better): r1: 3.7 (SE +/- 0.03, N = 3); r3: 3.6 (SE +/- 0.03, N = 15); r2: 2.5 (SE +/- 0.03, N = 15). (CC) gcc options: -fopenmp -O3 -lm
TNN 0.2.3 Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better): r2: 295.55 (SE +/- 0.81, N = 3, MIN: 292.39 / MAX: 306.56); r3: 299.40 (SE +/- 0.36, N = 3, MIN: 297.92 / MAX: 315.55); r1: 321.42 (SE +/- 2.78, N = 8, MIN: 300.42 / MAX: 371.06). (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Redis 6.0.9 Test: GET (Requests Per Second, More Is Better): r1: 3248596.08 (SE +/- 41615.25, N = 3); r2: 3012560.83 (SE +/- 13828.40, N = 3); r3: 3009326.75 (SE +/- 8077.93, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
ViennaCL 1.4.2 OpenCL LU Factorization (GFLOPS, More Is Better): r1: 68.29 (SE +/- 0.36, N = 3); r3: 65.92 (SE +/- 0.44, N = 3); r2: 64.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -rdynamic -lOpenCL
eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis (Seconds, Fewer Is Better): r1: 26.47 (SE +/- 0.29, N = 4); r2: 27.18 (SE +/- 0.12, N = 4); r3: 27.71 (SE +/- 0.04, N = 4). (CC) gcc options: -O2 -std=c99 -lpthread -lm
RNNoise 2020-06-28 (Seconds, Fewer Is Better): r2: 21.32 (SE +/- 0.04, N = 3); r3: 22.04 (SE +/- 0.02, N = 3); r1: 22.08 (SE +/- 0.06, N = 3). (CC) gcc options: -O2 -pedantic -fvisibility=hidden -lm
ASTC Encoder 2.0 Preset: Fast (Seconds, Fewer Is Better): r1: 5.44 (SE +/- 0.05, N = 3); r2: 5.59 (SE +/- 0.08, N = 3); r3: 5.63 (SE +/- 0.04, N = 12). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
PlaidML FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better): r3: 478.73 (SE +/- 2.86, N = 3); r2: 477.39 (SE +/- 1.92, N = 3); r1: 463.34 (SE +/- 0.36, N = 3).
Monkey Audio Encoding 3.99.6 WAV To APE (Seconds, Fewer Is Better): r1: 10.51 (SE +/- 0.03, N = 5); r3: 10.59 (SE +/- 0.01, N = 5); r2: 10.86 (SE +/- 0.04, N = 5). (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
GraphicsMagick 1.3.33 Operation: Rotate (Iterations Per Minute, More Is Better): r1: 902 (SE +/- 2.52, N = 3); r3: 900 (SE +/- 1.86, N = 3); r2: 875 (SE +/- 3.18, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
Cryptsetup PBKDF2-sha512 (Iterations Per Second, More Is Better): r2: 1943008 (SE +/- 1201.00, N = 3); r1: 1919349 (SE +/- 7117.07, N = 3); r3: 1886103 (SE +/- 12877.64, N = 3).
TNN 0.2.3 Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): r2: 264.95 (SE +/- 0.11, N = 3, MIN: 264.07 / MAX: 268.01); r3: 272.68 (SE +/- 0.12, N = 3, MIN: 271.53 / MAX: 277.6); r1: 272.91 (SE +/- 1.46, N = 3, MIN: 264.43 / MAX: 277.05). (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
LZ4 Compression 1.9.3 Compression Level: 3 - Compression Speed (MB/s, More Is Better): r3: 58.89 (SE +/- 0.48, N = 3); r1: 57.88 (SE +/- 0.61, N = 5); r2: 57.36 (SE +/- 0.58, N = 3). (CC) gcc options: -O3
LevelDB 1.22 Benchmark: Hot Read (Microseconds Per Op, Fewer Is Better): r1: 6.946 (SE +/- 0.013, N = 3); r2: 7.099 (SE +/- 0.075, N = 3); r3: 7.128 (SE +/- 0.049, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
Redis 6.0.9 Test: LPUSH (Requests Per Second, More Is Better): r2: 2094056.31 (SE +/- 21753.96, N = 4); r3: 2083566.29 (SE +/- 8925.21, N = 3); r1: 2041750.08 (SE +/- 25221.07, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
NCNN 20201218 Target: CPU - Model: regnety_400m (ms, Fewer Is Better): r2: 18.91 (SE +/- 0.24, N = 3, MIN: 13.5 / MAX: 30.63); r1: 19.16 (SE +/- 0.06, N = 3, MIN: 18.07 / MAX: 22.36); r3: 19.38 (SE +/- 0.10, N = 3, MIN: 14.45 / MAX: 42.2). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Redis 6.0.9 Test: SET (Requests Per Second, More Is Better): r3: 2433543.80 (SE +/- 6859.51, N = 3); r2: 2413657.00 (SE +/- 3903.32, N = 3); r1: 2375800.25 (SE +/- 17218.21, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Cryptsetup PBKDF2-whirlpool (Iterations Per Second, More Is Better): r2: 830020 (SE +/- 2314.28, N = 3); r1: 816282 (SE +/- 4903.32, N = 3); r3: 810352 (SE +/- 2497.33, N = 3).
NCNN 20201218 Target: CPU - Model: blazeface (ms, Fewer Is Better): r1: 2.54 (SE +/- 0.00, N = 3, MIN: 2.35 / MAX: 2.74); r3: 2.57 (SE +/- 0.02, N = 3, MIN: 2.45 / MAX: 2.83); r2: 2.60 (SE +/- 0.05, N = 3, MIN: 2.45 / MAX: 10.37). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LZ4 Compression 1.9.3 Compression Level: 9 - Compression Speed (MB/s, More Is Better): r3: 57.01 (SE +/- 0.66, N = 3); r2: 56.07 (SE +/- 0.36, N = 3); r1: 55.72 (SE +/- 0.59, N = 5). (CC) gcc options: -O3
Stockfish 12 Total Time (Nodes Per Second, More Is Better): r2: 9839292 (SE +/- 85742.14, N = 3); r1: 9703133 (SE +/- 85083.98, N = 8); r3: 9629353 (SE +/- 67987.28, N = 12). (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
Hashcat 6.1.1 Benchmark: 7-Zip (H/s, More Is Better): r1: 373667 (SE +/- 1589.90, N = 3); r2: 370400 (SE +/- 1858.31, N = 3); r3: 366433 (SE +/- 3670.30, N = 3).
LevelDB 1.22 Benchmark: Fill Sync (Microseconds Per Op, Fewer Is Better): r1: 3361.78 (SE +/- 33.91, N = 3); r3: 3386.08 (SE +/- 60.32, N = 3); r2: 3424.92 (SE +/- 25.98, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
oneDNN 2.0 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1: 2.72558 (SE +/- 0.00400, N = 3, MIN: 2.54); r3: 2.74874 (SE +/- 0.00352, N = 3, MIN: 2.54); r2: 2.77670 (SE +/- 0.01530, N = 3, MIN: 2.56). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cryptsetup AES-XTS 256b Encryption (MiB/s, More Is Better): r2: 4080.5 (SE +/- 25.91, N = 3); r3: 4023.0 (SE +/- 20.10, N = 3); r1: 4005.6 (SE +/- 1.66, N = 3).
LeelaChessZero 0.26 Backend: OpenCL (Nodes Per Second, More Is Better): r3: 13416 (SE +/- 44.68, N = 3); r1: 13277 (SE +/- 160.45, N = 3); r2: 13173 (SE +/- 176.76, N = 3). (CXX) g++ options: -flto -pthread
Timed Eigen Compilation 3.3.9 Time To Compile (Seconds, Fewer Is Better): r2: 67.54 (SE +/- 0.30, N = 3); r3: 68.70 (SE +/- 0.22, N = 3); r1: 68.74 (SE +/- 0.16, N = 3).
oneDNN 2.0 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2: 7.04404 (SE +/- 0.11582, N = 12, MIN: 4.11); r3: 7.14574 (SE +/- 0.02993, N = 3, MIN: 5.45); r1: 7.16575 (SE +/- 0.05152, N = 3, MIN: 5.58). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenVINO 2021.1 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): r1: 1.17 (SE +/- 0.00, N = 3); r2: 1.19 (SE +/- 0.00, N = 4); r3: 1.19 (SE +/- 0.00, N = 6). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (FPS, More Is Better): r1: 3363.55 (SE +/- 35.01, N = 3); r3: 3347.93 (SE +/- 40.89, N = 4); r2: 3307.53 (SE +/- 33.23, N = 5). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (ms, Fewer Is Better): r1: 1.21 (SE +/- 0.00, N = 3); r3: 1.22 (SE +/- 0.00, N = 4); r2: 1.23 (SE +/- 0.00, N = 5). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Mobile Neural Network 2020-09-17 Model: inception-v3 (ms, Fewer Is Better): r1: 62.57 (SE +/- 0.15, N = 10, MIN: 60.82 / MAX: 96.05); r2: 63.18 (SE +/- 0.18, N = 11, MIN: 61.02 / MAX: 104.39); r3: 63.56 (SE +/- 0.22, N = 10, MIN: 60.92 / MAX: 102.85). (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20201218 Target: CPU - Model: resnet50 (ms, Fewer Is Better): r3: 37.22 (SE +/- 0.05, N = 3, MIN: 33.9 / MAX: 52.84); r2: 37.30 (SE +/- 0.03, N = 3, MIN: 33.91 / MAX: 56.28); r1: 37.81 (SE +/- 0.51, N = 3, MIN: 34.04 / MAX: 52.8). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): r2: 18.33 (SE +/- 0.34, N = 3, MIN: 14.43 / MAX: 32.39); r3: 18.38 (SE +/- 0.27, N = 3, MIN: 14.4 / MAX: 32.57); r1: 18.62 (SE +/- 0.00, N = 3, MIN: 17.13 / MAX: 20.97). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
SQLite Speedtest 3.30 Timed Time - Size 1,000 (Seconds, Fewer Is Better): r1: 49.55 (SE +/- 0.25, N = 3); r3: 50.26 (SE +/- 0.13, N = 3); r2: 50.27 (SE +/- 0.17, N = 3). (CC) gcc options: -O2 -ldl -lz -lpthread
Betsy GPU Compressor 1.1 Beta Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better): r3: 7.903 (SE +/- 0.023, N = 3); r2: 7.912 (SE +/- 0.018, N = 3); r1: 8.016 (SE +/- 0.064, N = 13). (CXX) g++ options: -O3 -O2 -lpthread -ldl
clpeak OpenCL Test: Single-Precision Float (GFLOPS, More Is Better): r1: 5940.64 (SE +/- 83.30, N = 15); r3: 5892.70 (SE +/- 47.53, N = 3); r2: 5858.32 (SE +/- 64.05, N = 3). (CXX) g++ options: -O3 -rdynamic -lOpenCL
GraphicsMagick 1.3.33 Operation: Sharpen (Iterations Per Minute, More Is Better): r3: 73 (SE +/- 0.67, N = 3); r2: 72 (SE +/- 0.58, N = 3); r1: 72 (SE +/- 0.33, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
Cryptsetup AES-XTS 512b Encryption (MiB/s, More Is Better): r2: 3381.9 (SE +/- 15.69, N = 3); r1: 3346.8 (SE +/- 3.15, N = 3); r3: 3336.0 (SE +/- 25.61, N = 3).
Embree 3.9.0 Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): r2: 9.2596 (SE +/- 0.0236, N = 3, MIN: 8.82 / MAX: 14.99); r3: 9.1967 (SE +/- 0.1308, N = 3, MIN: 8.85 / MAX: 15); r1: 9.1343 (SE +/- 0.0822, N = 3, MIN: 8.81 / MAX: 15.06).
Waifu2x-NCNN Vulkan 20200818 Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better): r1: 6.020 (SE +/- 0.004, N = 3); r3: 6.093 (SE +/- 0.011, N = 3); r2: 6.102 (SE +/- 0.007, N = 3).
Warsow 2.5 Beta Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r3: 968.6 (SE +/- 1.81, N = 3); r2: 967.9 (SE +/- 1.46, N = 3); r1: 955.6 (SE +/- 13.76, N = 12).
DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better): r1: 435.20 (SE +/- 0.25, N = 3, MIN: 99.45 / MAX: 499.75); r3: 434.24 (SE +/- 2.45, N = 3, MIN: 115.25 / MAX: 499.75); r2: 429.37 (SE +/- 2.73, N = 3, MIN: 112.88 / MAX: 499.75). (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
Embree 3.9.0 Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): r1: 7.0735 (SE +/- 0.0830, N = 3, MIN: 6.66 / MAX: 12.73); r3: 6.9976 (SE +/- 0.0756, N = 5, MIN: 6.56 / MAX: 12.56); r2: 6.9794 (SE +/- 0.0728, N = 4, MIN: 6.57 / MAX: 12.32).
simdjson 0.7.1 Throughput Test: Kostya (GB/s, More Is Better): r1: 0.76 (SE +/- 0.00, N = 3); r3: 0.75 (SE +/- 0.00, N = 3); r2: 0.75 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
oneDNN 2.0 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2: 12.44 (SE +/- 0.04, N = 3, MIN: 12.09); r1: 12.47 (SE +/- 0.01, N = 3, MIN: 12.08); r3: 12.61 (SE +/- 0.03, N = 3, MIN: 12.2). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cryptsetup AES-XTS 256b Decryption (MiB/s, More Is Better): r2: 4055.1 (SE +/- 17.20, N = 3); r3: 4026.9 (SE +/- 15.07, N = 3); r1: 4002.4 (SE +/- 4.92, N = 3).
asmFish 2018-07-23 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better): r3: 16180674 (SE +/- 142852.80, N = 3); r1: 15984719 (SE +/- 174263.56, N = 3); r2: 15974611 (SE +/- 148124.86, N = 3).
Rodinia 3.1 Test: OpenCL Particle Filter (Seconds, Fewer Is Better): r3: 7.027 (SE +/- 0.016, N = 3); r2: 7.055 (SE +/- 0.013, N = 3); r1: 7.115 (SE +/- 0.065, N = 3). (CXX) g++ options: -O2 -lOpenCL
Redis 6.0.9 Test: SADD (Requests Per Second, More Is Better): r1: 2660539.42 (SE +/- 28020.60, N = 3); r3: 2634908.83 (SE +/- 27994.25, N = 3); r2: 2628039.25 (SE +/- 23332.27, N = 15). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Cryptsetup AES-XTS 512b Decryption (MiB/s, More Is Better): r2 = 3388.5 (SE +/- 10.03, N = 3); r3 = 3362.9 (SE +/- 13.02, N = 3); r1 = 3348.3 (SE +/- 1.21, N = 3).
Unigine Superposition 1.0 Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Ultra - Renderer: OpenGL (Frames Per Second, More Is Better): r2 = 25.4 (SE +/- 0.03, N = 3; MAX: 29.4); r3 = 25.3 (SE +/- 0.03, N = 3; MAX: 29.7); r1 = 25.1 (SE +/- 0.06, N = 3; MAX: 29.3).
simdjson 0.7.1 Throughput Test: PartialTweets (GB/s, More Is Better): r2 = 0.87 (SE +/- 0.01, N = 3); r3 = 0.86 (SE +/- 0.00, N = 3); r1 = 0.86 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
OpenVINO 2021.1 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): r1 = 3442.78 (SE +/- 33.67, N = 3); r3 = 3405.92 (SE +/- 34.05, N = 6); r2 = 3403.45 (SE +/- 38.35, N = 4). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
NCNN 20201218 Target: CPU - Model: googlenet (ms, Fewer Is Better): r1 = 19.98 (SE +/- 0.01, N = 3; MIN: 18.95 / MAX: 23.24); r2 = 20.01 (SE +/- 0.08, N = 3; MIN: 18.96 / MAX: 24.67); r3 = 20.21 (SE +/- 0.06, N = 3; MIN: 19.11 / MAX: 32.7). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
GROMACS 2020.3 Water Benchmark (Ns Per Day, More Is Better): r1 = 0.617 (SE +/- 0.003, N = 3); r3 = 0.614 (SE +/- 0.002, N = 3); r2 = 0.610 (SE +/- 0.004, N = 3). (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
simdjson 0.7.1 Throughput Test: DistinctUserID (GB/s, More Is Better): r1 = 0.89 (SE +/- 0.00, N = 3); r3 = 0.88 (SE +/- 0.00, N = 3); r2 = 0.88 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
Betsy GPU Compressor 1.1 Beta Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better): r2 = 5.789 (SE +/- 0.008, N = 3); r3 = 5.792 (SE +/- 0.024, N = 3); r1 = 5.854 (SE +/- 0.068, N = 12). (CXX) g++ options: -O3 -O2 -lpthread -ldl
Cryptsetup Twofish-XTS 256b Encryption (MiB/s, More Is Better): r2 = 487.4 (SE +/- 1.08, N = 3); r3 = 483.6 (SE +/- 2.51, N = 3); r1 = 482.0 (SE +/- 0.75, N = 3).
GEGL Operation: Reflect (Seconds, Fewer Is Better): r1 = 28.18 (SE +/- 0.29, N = 3); r3 = 28.31 (SE +/- 0.22, N = 3); r2 = 28.50 (SE +/- 0.30, N = 3).
Hashcat 6.1.1 Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better): r2 = 301433 (SE +/- 851.14, N = 3); r1 = 301233 (SE +/- 1322.04, N = 3); r3 = 298133 (SE +/- 545.69, N = 3).
oneDNN 2.0 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 = 8.96782 (SE +/- 0.04374, N = 3; MIN: 8.14); r2 = 9.00692 (SE +/- 0.01715, N = 3; MIN: 8.15); r3 = 9.06628 (SE +/- 0.11418, N = 3; MIN: 8). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Mobile Neural Network 2020-09-17 Model: resnet-v2-50 (ms, Fewer Is Better): r1 = 58.16 (SE +/- 0.40, N = 10; MIN: 36.86 / MAX: 81.73); r2 = 58.53 (SE +/- 0.35, N = 11; MIN: 37.33 / MAX: 83.74); r3 = 58.79 (SE +/- 0.40, N = 10; MIN: 36.87 / MAX: 85.77). (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Timed MAFFT Alignment 7.471 Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better): r1 = 10.50 (SE +/- 0.08, N = 12); r2 = 10.56 (SE +/- 0.10, N = 15); r3 = 10.61 (SE +/- 0.10, N = 14). (CC) gcc options: -std=c99 -O3 -lm -lpthread
NCNN 20201218 Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): r2 = 35.59 (SE +/- 0.05, N = 3; MIN: 34.42 / MAX: 51.24); r3 = 35.66 (SE +/- 0.02, N = 3; MIN: 34.45 / MAX: 49.15); r1 = 35.95 (SE +/- 0.48, N = 3; MIN: 34.4 / MAX: 55.63). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
PHPBench 0.8.1 PHP Benchmark Suite (Score, More Is Better): r1 = 837911 (SE +/- 4346.11, N = 3); r2 = 832417 (SE +/- 2600.83, N = 3); r3 = 829705 (SE +/- 587.84, N = 3).
RealSR-NCNN 20200818 Scale: 4x - TAA: Yes (Seconds, Fewer Is Better): r1 = 99.81 (SE +/- 0.31, N = 3); r2 = 100.62 (SE +/- 0.48, N = 3); r3 = 100.75 (SE +/- 0.35, N = 3).
VkResample 1.0 Upscale: 2x - Precision: Single (ms, Fewer Is Better): r1 = 24.99 (SE +/- 0.02, N = 3); r2 = 25.19 (SE +/- 0.05, N = 3); r3 = 25.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -pthread
Node.js V8 Web Tooling Benchmark (runs/s, More Is Better): r3 = 13.18 (SE +/- 0.11, N = 3); r2 = 13.17 (SE +/- 0.11, N = 3); r1 = 13.06 (SE +/- 0.14, N = 3). Nodejs v10.19.0
Crafty 25.2 Elapsed Time (Nodes Per Second, More Is Better): r2 = 9584148 (SE +/- 7176.35, N = 3); r3 = 9560012 (SE +/- 16578.83, N = 3); r1 = 9497414 (SE +/- 45086.65, N = 3). (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
Unigine Superposition 1.0 Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL (Frames Per Second, More Is Better): r2 = 66.5 (SE +/- 0.12, N = 3; MAX: 80.8); r3 = 66.2 (SE +/- 0.09, N = 3; MAX: 80.3); r1 = 65.9 (SE +/- 0.19, N = 3; MAX: 81.6).
OpenVINO 2021.1 Model: Person Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better): r1 = 4961.99 (SE +/- 4.97, N = 3); r2 = 4978.25 (SE +/- 19.24, N = 3); r3 = 5006.34 (SE +/- 4.20, N = 3). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Basis Universal 1.12 Settings: UASTC Level 0 (Seconds, Fewer Is Better): r1 = 7.288 (SE +/- 0.079, N = 3); r2 = 7.345 (SE +/- 0.061, N = 3); r3 = 7.353 (SE +/- 0.095, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Cryptsetup Serpent-XTS 512b Encryption (MiB/s, More Is Better): r2 = 882.1 (SE +/- 0.87, N = 3); r1 = 878.0 (SE +/- 0.83, N = 3); r3 = 874.4 (SE +/- 4.25, N = 3).
GEGL Operation: Crop (Seconds, Fewer Is Better): r3 = 8.826 (SE +/- 0.077, N = 8); r2 = 8.839 (SE +/- 0.073, N = 9); r1 = 8.900 (SE +/- 0.065, N = 11).
Cryptsetup Serpent-XTS 256b Encryption (MiB/s, More Is Better): r2 = 881.4 (SE +/- 1.25, N = 3); r3 = 874.1 (SE +/- 2.67, N = 3); r1 = 874.1 (SE +/- 0.92, N = 3).
OpenVINO 2021.1 Model: Face Detection 0106 FP32 - Device: CPU (FPS, More Is Better): r3 = 1.27 (SE +/- 0.02, N = 3); r2 = 1.27 (SE +/- 0.02, N = 4); r1 = 1.26 (SE +/- 0.01, N = 3). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Cryptsetup Twofish-XTS 256b Decryption (MiB/s, More Is Better): r2 = 486.3 (SE +/- 1.43, N = 3); r3 = 483.0 (SE +/- 2.21, N = 3); r1 = 482.5 (SE +/- 0.34, N = 3).
Cryptsetup Serpent-XTS 512b Decryption (MiB/s, More Is Better): r2 = 878.1 (SE +/- 1.17, N = 3); r3 = 873.5 (SE +/- 4.24, N = 3); r1 = 871.7 (SE +/- 1.28, N = 3).
oneDNN 2.0 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r2 = 17.90 (SE +/- 0.08, N = 3; MIN: 17.18); r1 = 18.01 (SE +/- 0.02, N = 3; MIN: 17.22); r3 = 18.03 (SE +/- 0.02, N = 3; MIN: 17.24). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
TensorFlow Lite 2020-08-23 Model: NASNet Mobile (Microseconds, Fewer Is Better): r1 = 302594 (SE +/- 3140.84, N = 3); r3 = 304079 (SE +/- 1284.72, N = 3); r2 = 304756 (SE +/- 2025.87, N = 3).
ArrayFire 3.7 Test: Conjugate Gradient OpenCL (ms, Fewer Is Better): r2 = 2.531 (SE +/- 0.022, N = 3); r3 = 2.548 (SE +/- 0.018, N = 3); r1 = 2.549 (SE +/- 0.015, N = 3). (CXX) g++ options: -rdynamic
Cryptsetup Twofish-XTS 512b Encryption (MiB/s, More Is Better): r2 = 486.4 (SE +/- 0.97, N = 3); r3 = 483.8 (SE +/- 2.12, N = 3); r1 = 483.0 (SE +/- 0.30, N = 2).
dav1d 0.8.1 Video Input: Chimera 1080p (FPS, More Is Better): r1 = 489.84 (SE +/- 5.73, N = 14; MIN: 317.1 / MAX: 898.12); r3 = 487.57 (SE +/- 3.24, N = 13; MIN: 316.7 / MAX: 911.47); r2 = 486.46 (SE +/- 3.02, N = 14; MIN: 316.37 / MAX: 900.57). (CC) gcc options: -pthread
GraphicsMagick 1.3.33 Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): r3 = 147 (SE +/- 1.20, N = 3); r2 = 147 (SE +/- 1.00, N = 3); r1 = 146 (SE +/- 1.33, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
Unpacking Firefox 84.0 Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better): r1 = 16.03 (SE +/- 0.08, N = 4); r3 = 16.10 (SE +/- 0.14, N = 4); r2 = 16.14 (SE +/- 0.09, N = 4).
VkFFT 1.1.1 (Benchmark Score, More Is Better): r1 = 25820 (SE +/- 62.93, N = 3); r3 = 25683 (SE +/- 108.37, N = 3); r2 = 25647 (SE +/- 58.68, N = 3). (CXX) g++ options: -O3 -pthread
GEGL Operation: Tile Glass (Seconds, Fewer Is Better): r3 = 28.06 (SE +/- 0.39, N = 3); r2 = 28.24 (SE +/- 0.27, N = 3); r1 = 28.24 (SE +/- 0.36, N = 3).
ASTC Encoder 2.0 Preset: Thorough (Seconds, Fewer Is Better): r1 = 54.29 (SE +/- 0.54, N = 3); r2 = 54.38 (SE +/- 0.54, N = 3); r3 = 54.65 (SE +/- 0.42, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
GEGL Operation: Scale (Seconds, Fewer Is Better): r1 = 6.954 (SE +/- 0.055, N = 12); r2 = 6.973 (SE +/- 0.059, N = 13); r3 = 7.000 (SE +/- 0.056, N = 14).
clpeak OpenCL Test: Integer Compute INT (GIOPS, More Is Better): r3 = 5540.44 (SE +/- 81.16, N = 15); r2 = 5519.39 (SE +/- 81.08, N = 15); r1 = 5504.35 (SE +/- 71.93, N = 15). (CXX) g++ options: -O3 -rdynamic -lOpenCL
Cryptsetup Serpent-XTS 256b Decryption (MiB/s, More Is Better): r2 = 876.6 (SE +/- 1.50, N = 3); r1 = 872.3 (SE +/- 1.62, N = 3); r3 = 870.9 (SE +/- 4.03, N = 3).
dav1d 0.8.1 Video Input: Summer Nature 4K (FPS, More Is Better): r1 = 112.75 (SE +/- 1.06, N = 6; MIN: 99.69 / MAX: 158.99); r3 = 112.65 (SE +/- 1.07, N = 6; MIN: 99.62 / MAX: 158.58); r2 = 112.03 (SE +/- 1.08, N = 6; MIN: 99.17 / MAX: 157.08). (CC) gcc options: -pthread
cl-mem 2017-01-13 Benchmark: Copy (GB/s, More Is Better): r1 = 236.6 (SE +/- 0.22, N = 3); r2 = 235.4 (SE +/- 0.24, N = 3); r3 = 235.1 (SE +/- 0.27, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
Cryptsetup Twofish-XTS 512b Decryption (MiB/s, More Is Better): r2 = 485.7 (SE +/- 1.44, N = 3); r3 = 483.0 (SE +/- 2.34, N = 3); r1 = 482.7 (SE +/- 0.10, N = 3).
Hashcat 6.1.1 Benchmark: SHA-512 (H/s, More Is Better): r1 = 1023100000 (SE +/- 11546345.54, N = 15); r2 = 1020000000 (SE +/- 2594224.35, N = 3); r3 = 1016800000 (SE +/- 1852025.92, N = 3).
Numpy Benchmark (Score, More Is Better): r1 = 419.58 (SE +/- 1.54, N = 3); r2 = 419.36 (SE +/- 0.84, N = 3); r3 = 417.03 (SE +/- 0.70, N = 3).
NAMD CUDA 2.14 ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better): r1 = 0.22103 (SE +/- 0.00131, N = 3); r3 = 0.22171 (SE +/- 0.00272, N = 4); r2 = 0.22238 (SE +/- 0.00245, N = 5).
GEGL Operation: Cartoon (Seconds, Fewer Is Better): r1 = 86.79 (SE +/- 0.12, N = 3); r3 = 86.99 (SE +/- 0.09, N = 3); r2 = 87.32 (SE +/- 0.19, N = 3).
LZ4 Compression 1.9.3 Compression Level: 1 - Compression Speed (MB/s, More Is Better): r2 = 8127.78 (SE +/- 4.75, N = 3); r1 = 8120.67 (SE +/- 6.52, N = 3); r3 = 8079.18 (SE +/- 11.24, N = 3). (CC) gcc options: -O3
Hashcat 6.1.1 Benchmark: SHA1 (H/s, More Is Better): r1 = 8585766667 (SE +/- 31347213.24, N = 3); r2 = 8544500000 (SE +/- 17380832.35, N = 3); r3 = 8535333333 (SE +/- 18653000.95, N = 3).
NCNN 20201218 Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): r1 = 15.44 (SE +/- 0.03, N = 3; MIN: 14.41 / MAX: 26.42); r3 = 15.50 (SE +/- 0.05, N = 3; MIN: 14.41 / MAX: 26.23); r2 = 15.53 (SE +/- 0.04, N = 3; MIN: 14.41 / MAX: 25.62). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree 3.9.0 Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): r2 = 6.0989 (SE +/- 0.0737, N = 3; MIN: 5.88 / MAX: 10.98); r1 = 6.0806 (SE +/- 0.0766, N = 3; MIN: 5.86 / MAX: 11.02); r3 = 6.0641 (SE +/- 0.0667, N = 3; MIN: 5.86 / MAX: 10.95).
Hashcat 6.1.1 Benchmark: MD5 (H/s, More Is Better): r1 = 24334866667 (SE +/- 110495102.96, N = 3); r2 = 24260200000 (SE +/- 81107726.72, N = 3); r3 = 24196900000 (SE +/- 49256167.13, N = 3).
Unigine Heaven 4.0 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better): r2 = 139.91 (SE +/- 0.96, N = 3); r3 = 139.18 (SE +/- 0.56, N = 3); r1 = 139.13 (SE +/- 0.71, N = 3).
Blender 2.90 Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better): r2 = 167.96 (SE +/- 0.11, N = 3); r3 = 168.08 (SE +/- 0.05, N = 3); r1 = 168.87 (SE +/- 0.10, N = 3).
LevelDB 1.22 Benchmark: Sequential Fill (MB/s, More Is Better): r1 = 37.5 (SE +/- 0.44, N = 4); r2 = 37.4 (SE +/- 0.46, N = 4); r3 = 37.3 (SE +/- 0.39, N = 5). (CXX) g++ options: -O3 -lsnappy -lpthread
RealSR-NCNN 20200818 Scale: 4x - TAA: No (Seconds, Fewer Is Better): r2 = 14.66 (SE +/- 0.09, N = 3); r3 = 14.69 (SE +/- 0.11, N = 3); r1 = 14.73 (SE +/- 0.01, N = 3).
rav1e 0.4 Alpha Speed: 10 (Frames Per Second, More Is Better): r1 = 3.422 (SE +/- 0.044, N = 3); r3 = 3.420 (SE +/- 0.027, N = 3); r2 = 3.404 (SE +/- 0.035, N = 3).
Blender 2.90 Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 = 116.15 (SE +/- 0.23, N = 3); r3 = 116.26 (SE +/- 0.13, N = 3); r1 = 116.76 (SE +/- 0.13, N = 3).
LevelDB 1.22 Benchmark: Seek Random (Microseconds Per Op, Fewer Is Better): r2 = 12.63 (SE +/- 0.10, N = 15); r3 = 12.64 (SE +/- 0.11, N = 14); r1 = 12.69 (SE +/- 0.11, N = 15). (CXX) g++ options: -O3 -lsnappy -lpthread
oneDNN 2.0 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 = 4.36381 (SE +/- 0.00310, N = 3; MIN: 4.23); r2 = 4.37852 (SE +/- 0.00806, N = 3; MIN: 4.25); r3 = 4.38535 (SE +/- 0.00559, N = 3; MIN: 4.25). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 Target: CPU - Model: resnet18 (ms, Fewer Is Better): r1 = 18.62 (SE +/- 0.02, N = 3; MIN: 17.08 / MAX: 32.57); r3 = 18.66 (SE +/- 0.02, N = 3; MIN: 17.05 / MAX: 30.94); r2 = 18.71 (SE +/- 0.04, N = 3; MIN: 17.06 / MAX: 33.58). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed Linux Kernel Compilation 5.4 Time To Compile (Seconds, Fewer Is Better): r3 = 151.48 (SE +/- 0.75, N = 3); r1 = 151.66 (SE +/- 0.33, N = 3); r2 = 152.21 (SE +/- 0.24, N = 3).
Basis Universal 1.12 Settings: UASTC Level 2 (Seconds, Fewer Is Better): r1 = 55.50 (SE +/- 0.55, N = 3); r2 = 55.74 (SE +/- 0.41, N = 3); r3 = 55.77 (SE +/- 0.58, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LevelDB 1.22 Benchmark: Overwrite (Microseconds Per Op, Fewer Is Better): r3 = 40.76 (SE +/- 0.04, N = 3); r1 = 40.93 (SE +/- 0.15, N = 3); r2 = 40.96 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
NCNN 20201218 Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): r2 = 27.51 (SE +/- 0.03, N = 3; MIN: 26.93 / MAX: 43.6); r3 = 27.63 (SE +/- 0.05, N = 3; MIN: 27.02 / MAX: 46.56); r1 = 27.64 (SE +/- 0.14, N = 3; MIN: 27 / MAX: 40.14). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
rav1e 0.4 Alpha Speed: 5 (Frames Per Second, More Is Better): r1 = 1.069 (SE +/- 0.005, N = 3); r3 = 1.064 (SE +/- 0.003, N = 3); r2 = 1.064 (SE +/- 0.004, N = 3).
LevelDB 1.22 Benchmark: Overwrite (MB/s, More Is Better): r3 = 43.4 (SE +/- 0.03, N = 3); r2 = 43.2 (SE +/- 0.07, N = 3); r1 = 43.2 (SE +/- 0.15, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
Blender 2.90 Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better): r1 = 250.78 (SE +/- 0.03, N = 3); r3 = 251.80 (SE +/- 0.05, N = 3); r2 = 251.90 (SE +/- 0.04, N = 3).
RedShift Demo 3.0 (Seconds, Fewer Is Better): r3 = 459; r2 = 460; r1 = 461. Only two standard errors are listed for the three runs: SE +/- 0.33, N = 3 and SE +/- 0.88, N = 3.
RawTherapee Total Benchmark Time (Seconds, Fewer Is Better): r1 = 80.59 (SE +/- 0.53, N = 3); r3 = 80.71 (SE +/- 0.45, N = 3); r2 = 80.93 (SE +/- 0.46, N = 3). RawTherapee, version 5.8, command line.
Blender 2.90 Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better): r2 = 731.67 (SE +/- 0.26, N = 3); r3 = 733.02 (SE +/- 0.41, N = 3); r1 = 734.81 (SE +/- 0.24, N = 3).
IndigoBench 4.4 Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better): r1 = 0.939 (SE +/- 0.002, N = 3); r2 = 0.938 (SE +/- 0.000, N = 3); r3 = 0.935 (SE +/- 0.001, N = 3).
ASTC Encoder 2.0 Preset: Exhaustive (Seconds, Fewer Is Better): r1 = 447.99 (SE +/- 0.52, N = 3); r2 = 449.37 (SE +/- 0.81, N = 3); r3 = 449.90 (SE +/- 0.54, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Build2 0.13 Time To Compile (Seconds, Fewer Is Better): r1 = 210.05 (SE +/- 0.40, N = 3); r2 = 210.71 (SE +/- 0.49, N = 3); r3 = 210.95 (SE +/- 0.85, N = 3).
IndigoBench 4.4 Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better): r3 = 2.156 (SE +/- 0.002, N = 3); r2 = 2.150 (SE +/- 0.001, N = 3); r1 = 2.147 (SE +/- 0.002, N = 3).
cl-mem 2017-01-13 Benchmark: Write (GB/s, More Is Better): r1 = 215.7 (SE +/- 0.47, N = 3); r2 = 215.6 (SE +/- 0.26, N = 3); r3 = 214.8 (SE +/- 0.50, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
GEGL Operation: Rotate 90 Degrees (Seconds, Fewer Is Better): r2 = 37.54 (SE +/- 0.36, N = 3); r3 = 37.69 (SE +/- 0.43, N = 3); r1 = 37.70 (SE +/- 0.31, N = 3).
Basis Universal 1.12 Settings: ETC1S (Seconds, Fewer Is Better): r1 = 57.82 (SE +/- 0.38, N = 3); r3 = 58.06 (SE +/- 0.56, N = 3); r2 = 58.06 (SE +/- 0.15, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
oneDNN 2.0 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r3 = 9.73732 (SE +/- 0.03582, N = 3; MIN: 8.75); r2 = 9.76468 (SE +/- 0.03928, N = 3; MIN: 8.72); r1 = 9.77594 (SE +/- 0.04555, N = 3; MIN: 8.77). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Unigine Superposition 1.0 Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Low - Renderer: OpenGL (Frames Per Second, More Is Better): r2 = 178.1 (SE +/- 0.71, N = 3; MAX: 259.4); r1 = 177.7 (SE +/- 0.23, N = 3; MAX: 260.1); r3 = 177.4 (SE +/- 0.52, N = 3; MAX: 263.9).
GEGL Operation: Color Enhance (Seconds, Fewer Is Better): r3 = 54.10 (SE +/- 0.28, N = 3); r1 = 54.11 (SE +/- 0.22, N = 3); r2 = 54.31 (SE +/- 0.04, N = 3).
TensorFlow Lite 2020-08-23 Model: SqueezeNet (Microseconds, Fewer Is Better): r1 = 354892 (SE +/- 2566.21, N = 3); r2 = 356034 (SE +/- 2576.61, N = 3); r3 = 356258 (SE +/- 2539.06, N = 3).
LevelDB 1.22 Benchmark: Sequential Fill (Microseconds Per Op, Fewer Is Better): r1 = 47.24 (SE +/- 0.54, N = 4); r2 = 47.29 (SE +/- 0.58, N = 4); r3 = 47.42 (SE +/- 0.48, N = 5). (CXX) g++ options: -O3 -lsnappy -lpthread
Darktable 3.0.1 Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better): r1 = 7.128 (SE +/- 0.097, N = 12); r2 = 7.150 (SE +/- 0.096, N = 12); r3 = 7.155 (SE +/- 0.099, N = 12).
NCNN 20201218 Target: CPU - Model: mobilenet (ms, Fewer Is Better): r3 = 26.53 (SE +/- 0.02, N = 3; MIN: 25.78 / MAX: 41.25); r1 = 26.62 (SE +/- 0.17, N = 3; MIN: 25.69 / MAX: 38.05); r2 = 26.63 (SE +/- 0.01, N = 3; MIN: 25.7 / MAX: 41.21). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
TensorFlow Lite 2020-08-23 Model: Inception ResNet V2 (Microseconds, Fewer Is Better): r1 = 4660197 (SE +/- 8775.31, N = 3); r2 = 4670567 (SE +/- 8796.49, N = 3); r3 = 4677473 (SE +/- 8398.83, N = 3).
DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better): r1 = 413.88 (SE +/- 0.79, N = 3; MIN: 119.86 / MAX: 499.75); r2 = 412.43 (SE +/- 2.87, N = 3; MIN: 103.17 / MAX: 499.75); r3 = 412.38 (SE +/- 4.35, N = 3; MIN: 127.91 / MAX: 499.75). (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
oneDNN 2.0 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r3 = 21.62 (SE +/- 0.02, N = 3; MIN: 21.51); r1 = 21.69 (SE +/- 0.06, N = 3; MIN: 21.47); r2 = 21.70 (SE +/- 0.06, N = 3; MIN: 21.48). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Zstd Compression 1.4.5 Compression Level: 19 (MB/s, More Is Better): r3 = 28.8 (SE +/- 0.06, N = 3); r1 = 28.8 (SE +/- 0.07, N = 3); r2 = 28.7 (SE +/- 0.06, N = 3). (CC) gcc options: -O3 -pthread -lz -llzma
Inkscape Operation: SVG Files To PNG (Seconds, Fewer Is Better): r1 = 21.00 (SE +/- 0.04, N = 3); r2 = 21.05 (SE +/- 0.02, N = 3); r3 = 21.07 (SE +/- 0.03, N = 3). Inkscape 0.92.5 (2060ec1f9f, 2020-04-08)
LevelDB 1.22 Benchmark: Random Delete (Microseconds Per Op, Fewer Is Better): r1 = 47.23 (SE +/- 0.49, N = 5); r2 = 47.30 (SE +/- 0.57, N = 4); r3 = 47.39 (SE +/- 0.56, N = 4). (CXX) g++ options: -O3 -lsnappy -lpthread
oneDNN 2.0 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 = 4.45564 (SE +/- 0.00967, N = 3; MIN: 4.02); r3 = 4.46656 (SE +/- 0.00726, N = 3; MIN: 4.01); r2 = 4.47062 (SE +/- 0.01661, N = 3; MIN: 4.02). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
MandelGPU 1.3pts1 OpenCL Device: GPU (Samples/sec, More Is Better): r2 = 252826584.8 (SE +/- 157365.45, N = 3); r3 = 252822614.4 (SE +/- 1449538.54, N = 3); r1 = 251986408.7 (SE +/- 1032565.22, N = 3). (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
BRL-CAD 7.30.8 VGR Performance Metric (VGR Performance Metric, More Is Better): r3 = 64033; r1 = 63909; r2 = 63822. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
LZ4 Compression 1.9.3 Compression Level: 3 - Decompression Speed (MB/s, More Is Better): r3 = 9685.2 (SE +/- 0.67, N = 3); r1 = 9676.3 (SE +/- 1.84, N = 5); r2 = 9653.7 (SE +/- 16.28, N = 3). (CC) gcc options: -O3
Darktable 3.0.1 Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better): r3 = 15.86 (SE +/- 0.03, N = 3); r2 = 15.87 (SE +/- 0.04, N = 3); r1 = 15.91 (SE +/- 0.02, N = 3).
NCNN 20201218 Target: CPU - Model: vgg16 (ms, Fewer Is Better): r3 = 71.86 (SE +/- 0.12, N = 3; MIN: 70.48 / MAX: 88); r2 = 71.91 (SE +/- 0.03, N = 3; MIN: 70.43 / MAX: 92.47); r1 = 72.09 (SE +/- 0.20, N = 3; MIN: 70.5 / MAX: 88.28). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
DeepSpeech 0.6 Acceleration: CPU (Seconds, Fewer Is Better): r3 = 81.04 (SE +/- 0.04, N = 3); r2 = 81.07 (SE +/- 0.04, N = 3); r1 = 81.30 (SE +/- 0.21, N = 3).
LZ4 Compression 1.9.3 Compression Level: 9 - Decompression Speed (MB/s, More Is Better): r3 = 9695.2 (SE +/- 0.78, N = 3); r1 = 9679.8 (SE +/- 1.80, N = 5); r2 = 9664.8 (SE +/- 15.38, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3 Compression Level: 1 - Decompression Speed (MB/s, More Is Better): r2 = 9839.9 (SE +/- 2.38, N = 3); r1 = 9823.2 (SE +/- 3.96, N = 3); r3 = 9810.0 (SE +/- 10.11, N = 3). (CC) gcc options: -O3
OpenVINO 2021.1 Model: Face Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better): r1 = 3202.53 (SE +/- 2.58, N = 3); r2 = 3207.35 (SE +/- 1.22, N = 4); r3 = 3212.10 (SE +/- 2.51, N = 3). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
TensorFlow Lite 2020-08-23 Model: Inception V4 (Microseconds, Fewer Is Better): r1 = 5163190 (SE +/- 5618.75, N = 3); r2 = 5168183 (SE +/- 7685.69, N = 3); r3 = 5178263 (SE +/- 8609.77, N = 3).
TensorFlow Lite 2020-08-23 Model: Mobilenet Quant (Microseconds, Fewer Is Better): r1 = 236716 (SE +/- 1686.36, N = 3); r2 = 237129 (SE +/- 1668.46, N = 3); r3 = 237406 (SE +/- 1810.35, N = 3).
dav1d 0.8.1 Video Input: Chimera 1080p 10-bit (FPS, More Is Better): r1 = 86.08 (SE +/- 0.99, N = 4; MIN: 54.34 / MAX: 256.39); r3 = 85.95 (SE +/- 1.03, N = 4; MIN: 54.21 / MAX: 255.72); r2 = 85.83 (SE +/- 1.05, N = 4; MIN: 54.27 / MAX: 257.58). (CC) gcc options: -pthread
VkResample 1.0 Upscale: 2x - Precision: Double (ms, Fewer Is Better): r1 = 256.87 (SE +/- 0.20, N = 3); r2 = 257.06 (SE +/- 0.11, N = 3); r3 = 257.62 (SE +/- 0.20, N = 3). (CXX) g++ options: -O3 -pthread
PlaidML FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better): r2 = 1823.06 (SE +/- 3.54, N = 3); r1 = 1819.24 (SE +/- 7.57, N = 3); r3 = 1817.78 (SE +/- 8.53, N = 3).
Opus Codec Encoding 1.3.1 WAV To Opus Encode (Seconds, Fewer Is Better): r2 = 7.602 (SE +/- 0.004, N = 5); r3 = 7.616 (SE +/- 0.008, N = 5); r1 = 7.624 (SE +/- 0.009, N = 5). (CXX) g++ options: -fvisibility=hidden -logg -lm
rav1e 0.4 Alpha Speed: 1 (Frames Per Second, More Is Better): r3 = 0.347 (SE +/- 0.003, N = 3); r1 = 0.347 (SE +/- 0.002, N = 3); r2 = 0.346 (SE +/- 0.002, N = 3).
Blender 2.90 Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 = 60.18 (SE +/- 0.04, N = 3); r3 = 60.25 (SE +/- 0.12, N = 3); r1 = 60.35 (SE +/- 0.03, N = 3).
rav1e Speed: 6 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Alpha Speed: 6 r1 r3 r2 0.3249 0.6498 0.9747 1.2996 1.6245 SE +/- 0.010, N = 3 SE +/- 0.012, N = 3 SE +/- 0.006, N = 3 1.444 1.443 1.440
Mobile Neural Network Model: mobilenet-v1-1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: mobilenet-v1-1.0 r1 r3 r2 3 6 9 12 15 SE +/- 0.01, N = 10 SE +/- 0.01, N = 10 SE +/- 0.01, N = 11 10.65 10.66 10.68 MIN: 10.33 / MAX: 34.53 MIN: 10.33 / MAX: 32.25 MIN: 10.35 / MAX: 33.35 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU r1 r3 r2 1500 3000 4500 6000 7500 SE +/- 2.95, N = 3 SE +/- 6.73, N = 3 SE +/- 4.70, N = 3 7140.50 7151.58 7159.42 MIN: 7021.68 MIN: 7027.2 MIN: 7041.4 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Coremark CoreMark Size 666 - Iterations Per Second OpenBenchmarking.org Iterations/Sec, More Is Better Coremark 1.0 CoreMark Size 666 - Iterations Per Second r3 r1 r2 50K 100K 150K 200K 250K SE +/- 2209.16, N = 3 SE +/- 2532.07, N = 3 SE +/- 1894.03, N = 3 223892.44 223414.81 223304.98 1. (CC) gcc options: -O2 -lrt -lrt
GEGL Operation: Wavelet Blur OpenBenchmarking.org Seconds, Fewer Is Better GEGL Operation: Wavelet Blur r3 r2 r1 13 26 39 52 65 SE +/- 0.25, N = 3 SE +/- 0.39, N = 3 SE +/- 0.25, N = 3 57.84 57.95 57.99
NCNN Target: CPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: alexnet r2 r3 r1 4 8 12 16 20 SE +/- 0.04, N = 3 SE +/- 0.03, N = 3 SE +/- 0.08, N = 3 15.46 15.49 15.50 MIN: 14.35 / MAX: 27.24 MIN: 14.41 / MAX: 24.83 MIN: 14.41 / MAX: 55.15 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
GraphicsMagick Operation: HWB Color Space OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: HWB Color Space r3 r1 r2 200 400 600 800 1000 SE +/- 4.51, N = 3 SE +/- 5.03, N = 3 SE +/- 5.70, N = 3 776 775 774 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GEGL Operation: Antialias OpenBenchmarking.org Seconds, Fewer Is Better GEGL Operation: Antialias r1 r2 r3 8 16 24 32 40 SE +/- 0.45, N = 3 SE +/- 0.35, N = 3 SE +/- 0.38, N = 3 36.56 36.56 36.65
AI Benchmark Alpha Device Training Score OpenBenchmarking.org Score, More Is Better AI Benchmark Alpha 0.1.2 Device Training Score r1 r3 r2 200 400 600 800 1000 816 814 814
Blender Blend File: Barbershop - Compute: NVIDIA OptiX OpenBenchmarking.org Seconds, Fewer Is Better Blender 2.90 Blend File: Barbershop - Compute: NVIDIA OptiX r2 r3 r1 300 600 900 1200 1500 SE +/- 0.85, N = 3 SE +/- 2.01, N = 3 SE +/- 0.44, N = 3 1190.05 1192.80 1192.96
NCNN Target: Vulkan GPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: resnet50 r1 r3 r2 9 18 27 36 45 SE +/- 0.03, N = 3 SE +/- 0.01, N = 3 SE +/- 0.06, N = 3 37.25 37.26 37.34 MIN: 34.07 / MAX: 48.19 MIN: 33.79 / MAX: 52.48 MIN: 33.97 / MAX: 56.32 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
PlaidML FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL OpenBenchmarking.org FPS, More Is Better PlaidML FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL r3 r1 r2 300 600 900 1200 1500 SE +/- 4.92, N = 3 SE +/- 3.10, N = 3 SE +/- 2.03, N = 3 1247.93 1246.78 1244.95
LevelDB Benchmark: Random Fill OpenBenchmarking.org MB/s, More Is Better LevelDB 1.22 Benchmark: Random Fill r3 r2 r1 10 20 30 40 50 SE +/- 0.07, N = 3 SE +/- 0.19, N = 3 SE +/- 0.21, N = 3 43.2 43.1 43.1 1. (CXX) g++ options: -O3 -lsnappy -lpthread
Unigine Superposition Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Medium - Renderer: OpenGL OpenBenchmarking.org Frames Per Second, More Is Better Unigine Superposition 1.0 Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Medium - Renderer: OpenGL r2 r3 r1 20 40 60 80 100 SE +/- 0.15, N = 3 SE +/- 0.15, N = 3 SE +/- 0.15, N = 3 90.6 90.5 90.4 MAX: 114.4 MAX: 113 MAX: 114.5
NCNN Target: Vulkan GPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: squeezenet_ssd r2 r3 r1 6 12 18 24 30 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 27.52 27.55 27.58 MIN: 26.95 / MAX: 42.6 MIN: 26.92 / MAX: 41.99 MIN: 26.94 / MAX: 43.23 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree Binary: Pathtracer - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 3.9.0 Binary: Pathtracer - Model: Asian Dragon r2 r1 r3 2 4 6 8 10 SE +/- 0.0719, N = 3 SE +/- 0.0643, N = 3 SE +/- 0.0754, N = 3 7.5656 7.5555 7.5496 MIN: 7.18 / MAX: 12.51 MIN: 7.18 / MAX: 12.55 MIN: 7.19 / MAX: 12.66
OpenVINO Model: Person Detection 0106 FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2021.1 Model: Person Detection 0106 FP32 - Device: CPU r1 r3 r2 1100 2200 3300 4400 5500 SE +/- 15.43, N = 3 SE +/- 14.45, N = 5 SE +/- 9.68, N = 9 5069.44 5073.09 5079.89 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Blender Blend File: BMW27 - Compute: CUDA OpenBenchmarking.org Seconds, Fewer Is Better Blender 2.90 Blend File: BMW27 - Compute: CUDA r2 r3 r1 20 40 60 80 100 SE +/- 0.16, N = 3 SE +/- 0.10, N = 3 SE +/- 0.14, N = 3 90.82 90.93 91.00
NCNN Target: Vulkan GPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: vgg16 r2 r3 r1 16 32 48 64 80 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 SE +/- 0.05, N = 3 71.82 71.86 71.96 MIN: 70.37 / MAX: 86.67 MIN: 70.4 / MAX: 88.5 MIN: 70.52 / MAX: 88.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed FFmpeg Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed FFmpeg Compilation 4.2.2 Time To Compile r3 r1 r2 20 40 60 80 100 SE +/- 0.30, N = 3 SE +/- 0.78, N = 3 SE +/- 0.39, N = 3 100.20 100.26 100.40
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU r1 r2 r3 1500 3000 4500 6000 7500 SE +/- 12.55, N = 3 SE +/- 1.75, N = 3 SE +/- 6.55, N = 3 7155.41 7159.48 7169.03 MIN: 7025.22 MIN: 7040.61 MIN: 7046.49 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal Settings: UASTC Level 3 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 3 r1 r2 r3 20 40 60 80 100 SE +/- 0.55, N = 3 SE +/- 0.55, N = 3 SE +/- 0.53, N = 3 110.84 110.93 111.04 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
High Performance Conjugate Gradient OpenBenchmarking.org GFLOP/s, More Is Better High Performance Conjugate Gradient 3.1 r1 r2 r3 0.8914 1.7828 2.6742 3.5656 4.457 SE +/- 0.00082, N = 3 SE +/- 0.00692, N = 3 SE +/- 0.01196, N = 3 3.96177 3.96068 3.95457 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
GraphicsMagick Operation: Resizing OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Resizing r1 r3 r2 120 240 360 480 600 SE +/- 2.73, N = 3 SE +/- 5.36, N = 3 SE +/- 5.00, N = 3 552 551 551 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
TensorFlow Lite Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Mobilenet Float r1 r2 r3 50K 100K 150K 200K 250K SE +/- 1996.41, N = 3 SE +/- 1820.00, N = 3 SE +/- 1638.46, N = 3 239119 239224 239537
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU r3 r1 r2 800 1600 2400 3200 4000 SE +/- 1.33, N = 3 SE +/- 1.61, N = 3 SE +/- 1.20, N = 3 3792.87 3797.32 3799.45 MIN: 3672.83 MIN: 3686.53 MIN: 3692.97 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Darktable Test: Server Room - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.0.1 Test: Server Room - Acceleration: CPU-only r2 r3 r1 0.9407 1.8814 2.8221 3.7628 4.7035 SE +/- 0.004, N = 3 SE +/- 0.006, N = 3 SE +/- 0.010, N = 3 4.174 4.178 4.181
yquake2 Renderer: Software CPU - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: Software CPU - Resolution: 1920 x 1080 r2 r1 r3 14 28 42 56 70 SE +/- 0.07, N = 3 SE +/- 0.07, N = 3 SE +/- 0.09, N = 3 60.7 60.7 60.6 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Blender Blend File: Pabellon Barcelona - Compute: CUDA OpenBenchmarking.org Seconds, Fewer Is Better Blender 2.90 Blend File: Pabellon Barcelona - Compute: CUDA r3 r1 r2 130 260 390 520 650 SE +/- 0.06, N = 3 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 608.62 608.80 609.56
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU r1 r3 r2 1500 3000 4500 6000 7500 SE +/- 3.89, N = 3 SE +/- 2.23, N = 3 SE +/- 0.92, N = 3 7144.23 7147.09 7154.66 MIN: 7028.46 MIN: 7033.98 MIN: 7035.88 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Zstd Compression Compression Level: 3 OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.4.5 Compression Level: 3 r3 r1 r2 600 1200 1800 2400 3000 SE +/- 4.18, N = 3 SE +/- 7.25, N = 3 SE +/- 8.65, N = 3 2835.1 2833.6 2831.0 1. (CC) gcc options: -O3 -pthread -lz -llzma
LevelDB Benchmark: Random Fill OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.22 Benchmark: Random Fill r3 r2 r1 9 18 27 36 45 SE +/- 0.07, N = 3 SE +/- 0.20, N = 3 SE +/- 0.19, N = 3 40.98 41.03 41.04 1. (CXX) g++ options: -O3 -lsnappy -lpthread
AI Benchmark Alpha Device AI Score OpenBenchmarking.org Score, More Is Better AI Benchmark Alpha 0.1.2 Device AI Score r1 r3 r2 300 600 900 1200 1500 1546 1544 1544
OctaneBench Total Score OpenBenchmarking.org Score, More Is Better OctaneBench 2020.1 Total Score r3 r2 r1 40 80 120 160 200 189.32 189.10 189.09
cl-mem Benchmark: Read OpenBenchmarking.org GB/s, More Is Better cl-mem 2017-01-13 Benchmark: Read r1 r3 r2 70 140 210 280 350 SE +/- 0.18, N = 3 SE +/- 0.03, N = 3 SE +/- 0.09, N = 3 330.3 329.9 329.9 1. (CC) gcc options: -O2 -flto -lOpenCL
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU r1 r3 r2 800 1600 2400 3200 4000 SE +/- 6.76, N = 3 SE +/- 3.22, N = 3 SE +/- 4.34, N = 3 3795.81 3798.12 3800.41 MIN: 3687.23 MIN: 3685.27 MIN: 3681.23 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal Settings: UASTC Level 2 + RDO Post-Processing OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 2 + RDO Post-Processing r2 r1 r3 200 400 600 800 1000 SE +/- 0.35, N = 3 SE +/- 0.74, N = 3 SE +/- 0.62, N = 3 840.32 840.35 841.23 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Blender Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX OpenBenchmarking.org Seconds, Fewer Is Better Blender 2.90 Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX r1 r2 r3 40 80 120 160 200 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.08, N = 3 196.21 196.28 196.41
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 0.8.1 Video Input: Summer Nature 1080p r1 r3 r2 100 200 300 400 500 SE +/- 3.60, N = 14 SE +/- 3.80, N = 13 SE +/- 3.46, N = 13 460.02 459.71 459.61 MIN: 375.05 / MAX: 590.01 MIN: 374.63 / MAX: 587.93 MIN: 374.03 / MAX: 582.97 1. (CC) gcc options: -pthread
FAHBench OpenBenchmarking.org Ns Per Day, More Is Better FAHBench 2.3.2 r3 r2 r1 40 80 120 160 200 SE +/- 0.11, N = 3 SE +/- 0.14, N = 3 SE +/- 0.23, N = 3 186.62 186.48 186.46
PlaidML FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL OpenBenchmarking.org FPS, More Is Better PlaidML FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL r1 r3 r2 20 40 60 80 100 SE +/- 0.19, N = 3 SE +/- 0.40, N = 3 SE +/- 0.42, N = 3 110.07 109.99 109.98
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: mobilenet r3 r1 r2 6 12 18 24 30 SE +/- 0.07, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 26.51 26.52 26.53 MIN: 25.69 / MAX: 45.35 MIN: 25.69 / MAX: 43.81 MIN: 25.76 / MAX: 43.91 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU r1 r2 r3 800 1600 2400 3200 4000 SE +/- 2.45, N = 3 SE +/- 2.65, N = 3 SE +/- 3.77, N = 3 3795.02 3797.05 3797.72 MIN: 3682.24 MIN: 3673.18 MIN: 3684.19 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenVINO Model: Face Detection 0106 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2021.1 Model: Face Detection 0106 FP16 - Device: CPU r3 r1 r2 700 1400 2100 2800 3500 SE +/- 7.78, N = 3 SE +/- 4.35, N = 3 SE +/- 3.88, N = 3 3164.51 3165.24 3166.57 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Timed HMMer Search Pfam Database Search OpenBenchmarking.org Seconds, Fewer Is Better Timed HMMer Search 3.3.1 Pfam Database Search r3 r1 r2 20 40 60 80 100 SE +/- 0.02, N = 3 SE +/- 0.06, N = 3 SE +/- 0.04, N = 3 105.51 105.53 105.57 1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
clpeak OpenCL Test: Global Memory Bandwidth OpenBenchmarking.org GBPS, More Is Better clpeak OpenCL Test: Global Memory Bandwidth r3 r1 r2 70 140 210 280 350 SE +/- 0.28, N = 3 SE +/- 0.32, N = 3 SE +/- 0.28, N = 3 324.78 324.63 324.58 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
NCNN Target: Vulkan GPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: yolov4-tiny r2 r1 r3 8 16 24 32 40 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.05, N = 3 35.51 35.52 35.53 MIN: 33.05 / MAX: 50.05 MIN: 34.38 / MAX: 51.44 MIN: 32.99 / MAX: 52.01 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
clpeak OpenCL Test: Double-Precision Double OpenBenchmarking.org GFLOPS, More Is Better clpeak OpenCL Test: Double-Precision Double r3 r2 r1 70 140 210 280 350 SE +/- 3.74, N = 3 SE +/- 3.68, N = 3 SE +/- 3.78, N = 3 340.59 340.46 340.42 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
FinanceBench Benchmark: Black-Scholes OpenCL OpenBenchmarking.org ms, Fewer Is Better FinanceBench 2016-06-06 Benchmark: Black-Scholes OpenCL r2 r3 r1 4 8 12 16 20 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 17.48 17.48 17.48 1. (CXX) g++ options: -O3 -lOpenCL
AI Benchmark Alpha Device Inference Score OpenBenchmarking.org Score, More Is Better AI Benchmark Alpha 0.1.2 Device Inference Score r3 r2 r1 160 320 480 640 800 730 730 730
OpenVINO Model: Person Detection 0106 FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2021.1 Model: Person Detection 0106 FP32 - Device: CPU r3 r2 r1 0.1778 0.3556 0.5334 0.7112 0.889 SE +/- 0.01, N = 5 SE +/- 0.01, N = 9 SE +/- 0.01, N = 3 0.79 0.79 0.79 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO Model: Person Detection 0106 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2021.1 Model: Person Detection 0106 FP16 - Device: CPU r3 r2 r1 0.18 0.36 0.54 0.72 0.9 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 0.80 0.80 0.80 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO Model: Face Detection 0106 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2021.1 Model: Face Detection 0106 FP16 - Device: CPU r3 r2 r1 0.288 0.576 0.864 1.152 1.44 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 1.28 1.28 1.28 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Darktable Test: Server Rack - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.0.1 Test: Server Rack - Acceleration: CPU-only r1 r2 r3 0.0407 0.0814 0.1221 0.1628 0.2035 SE +/- 0.000, N = 3 SE +/- 0.000, N = 3 SE +/- 0.000, N = 3 0.181 0.181 0.181
GraphicsMagick Operation: Enhanced OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Enhanced r3 r2 r1 30 60 90 120 150 SE +/- 0.67, N = 3 SE +/- 0.67, N = 3 SE +/- 0.67, N = 3 115 115 115 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: Swirl OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Swirl r3 r2 r1 50 100 150 200 250 SE +/- 1.72, N = 8 SE +/- 1.60, N = 10 SE +/- 1.72, N = 8 207 207 207 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
simdjson Throughput Test: LargeRandom OpenBenchmarking.org GB/s, More Is Better simdjson 0.7.1 Throughput Test: LargeRandom r3 r2 r1 0.1125 0.225 0.3375 0.45 0.5625 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.5 0.5 0.5 1. (CXX) g++ options: -O3 -pthread
yquake2 Renderer: OpenGL 3.x - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: OpenGL 3.x - Resolution: 1920 x 1080 r3 r2 r1 13 26 39 52 65 60 60 60 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 Renderer: OpenGL 1.x - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: OpenGL 1.x - Resolution: 1920 x 1080 r3 r2 r1 13 26 39 52 65 SE +/- 0.03, N = 3 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 59.9 59.9 59.9 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
LevelDB Benchmark: Fill Sync OpenBenchmarking.org MB/s, More Is Better LevelDB 1.22 Benchmark: Fill Sync r3 r2 r1 0.1125 0.225 0.3375 0.45 0.5625 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.5 0.5 0.5 1. (CXX) g++ options: -O3 -lsnappy -lpthread
NeatBench Acceleration: GPU OpenBenchmarking.org FPS, More Is Better NeatBench 5 Acceleration: GPU r3 r1 r2 6 12 18 24 30 SE +/- 0.60, N = 15 SE +/- 0.57, N = 15 SE +/- 0.47, N = 15 27.6 27.5 27.1
Blender Blend File: BMW27 - Compute: NVIDIA OptiX OpenBenchmarking.org Seconds, Fewer Is Better Blender 2.90 Blend File: BMW27 - Compute: NVIDIA OptiX r2 r3 r1 9 18 27 36 45 SE +/- 0.02, N = 3 SE +/- 0.05, N = 3 SE +/- 3.33, N = 15 38.07 38.07 41.47
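The r1 figure for this Blender test carries a far larger standard error (3.33 over 15 runs) than r2 and r3 (0.02 and 0.05 over 3 runs each). As a rough reading aid, and assuming the reported SE is the standard error of the mean (sample standard deviation divided by the square root of N), the sketch below recovers the implied standard deviation and the relative standard error. The function names are illustrative and are not part of the Phoronix Test Suite.

```python
import math

def implied_std_dev(se, n):
    """Sample standard deviation implied by a standard error of the mean.

    Assumes SE = SD / sqrt(N); this is an assumption about the export.
    """
    return se * math.sqrt(n)

def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean

# (mean in seconds, SE, N) copied from the Blender BMW27 / NVIDIA OptiX row
runs = {
    "r2": (38.07, 0.02, 3),
    "r3": (38.07, 0.05, 3),
    "r1": (41.47, 3.33, 15),
}

for name, (mean, se, n) in runs.items():
    sd = implied_std_dev(se, n)
    rel = relative_se_percent(mean, se)
    print(f"{name}: implied SD {sd:.2f} s, relative SE {rel:.2f}%")
```

Under that assumption, the higher r1 mean looks more like a noisy run than a consistent slowdown, although the export alone cannot confirm the cause.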
NCNN Target: Vulkan GPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: regnety_400m r2 r3 r1 5 10 15 20 25 SE +/- 1.83, N = 3 SE +/- 1.77, N = 3 SE +/- 0.09, N = 3 17.15 17.60 19.16 MIN: 13.3 / MAX: 38.12 MIN: 13.79 / MAX: 32.97 MIN: 17.94 / MAX: 21.24 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: googlenet r2 r3 r1 5 10 15 20 25 SE +/- 1.77, N = 3 SE +/- 1.84, N = 3 SE +/- 0.06, N = 3 18.20 18.26 20.05 MIN: 14.26 / MAX: 31.74 MIN: 14.28 / MAX: 36.09 MIN: 18.94 / MAX: 32.96 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: blazeface r2 r3 r1 0.5738 1.1476 1.7214 2.2952 2.869 SE +/- 0.26, N = 3 SE +/- 0.25, N = 3 SE +/- 0.02, N = 3 2.29 2.29 2.55 MIN: 1.68 / MAX: 8.91 MIN: 1.69 / MAX: 12.73 MIN: 2.43 / MAX: 2.76 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: efficientnet-b0 r3 r2 r1 3 6 9 12 15 SE +/- 0.94, N = 3 SE +/- 0.95, N = 3 SE +/- 0.10, N = 3 8.99 9.02 10.01 MIN: 6.99 / MAX: 13.79 MIN: 7 / MAX: 19.29 MIN: 9.44 / MAX: 29.57 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: mnasnet r2 r3 r1 2 4 6 8 10 SE +/- 0.71, N = 3 SE +/- 0.76, N = 3 SE +/- 0.00, N = 3 5.86 5.91 6.63 MIN: 4.3 / MAX: 15.47 MIN: 4.32 / MAX: 7.94 MIN: 6.21 / MAX: 8.85 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: shufflenet-v2 r2 r3 r1 2 4 6 8 10 SE +/- 0.96, N = 3 SE +/- 0.93, N = 3 SE +/- 0.07, N = 3 6.98 7.05 7.92 MIN: 4.98 / MAX: 27.09 MIN: 5.04 / MAX: 20.37 MIN: 7.27 / MAX: 20.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: mobilenet-v3 r2 r1 r3 1.3073 2.6146 3.9219 5.2292 6.5365 SE +/- 0.65, N = 3 SE +/- 0.62, N = 3 SE +/- 0.64, N = 3 5.73 5.74 5.81 MIN: 4.33 / MAX: 10.47 MIN: 4.43 / MAX: 9.64 MIN: 4.41 / MAX: 25.12 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: mobilenet-v2 r3 r2 r1 2 4 6 8 10 SE +/- 0.73, N = 3 SE +/- 0.79, N = 3 SE +/- 0.74, N = 3 7.19 7.22 7.23 MIN: 5.52 / MAX: 9.67 MIN: 5.41 / MAX: 20.72 MIN: 5.54 / MAX: 9.59 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: efficientnet-b0 r2 r3 r1 3 6 9 12 15 SE +/- 0.96, N = 3 SE +/- 0.96, N = 3 SE +/- 0.05, N = 3 9.05 9.06 10.00 MIN: 6.99 / MAX: 21.76 MIN: 7.04 / MAX: 12.38 MIN: 9.46 / MAX: 24.32 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mnasnet r2 r3 r1 2 4 6 8 10 SE +/- 0.75, N = 3 SE +/- 0.74, N = 3 SE +/- 0.02, N = 3 5.96 5.96 6.67 MIN: 4.32 / MAX: 14.32 MIN: 4.33 / MAX: 28.21 MIN: 5.99 / MAX: 21.18 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: shufflenet-v2 r2 r3 r1 2 4 6 8 10 SE +/- 0.94, N = 3 SE +/- 0.95, N = 3 SE +/- 0.03, N = 3 6.95 7.03 7.93 MIN: 5.01 / MAX: 9.68 MIN: 5.04 / MAX: 20.64 MIN: 7.52 / MAX: 16.61 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mobilenet-v3 r1 r2 r3 1.3073 2.6146 3.9219 5.2292 6.5365 SE +/- 0.65, N = 3 SE +/- 0.65, N = 3 SE +/- 0.62, N = 3 5.74 5.81 5.81 MIN: 4.3 / MAX: 7.75 MIN: 4.43 / MAX: 17.76 MIN: 4.48 / MAX: 10.59 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mobilenet-v2 r2 r3 r1 2 4 6 8 10 SE +/- 0.73, N = 3 SE +/- 0.73, N = 3 SE +/- 0.67, N = 3 7.22 7.23 7.31 MIN: 5.54 / MAX: 12.03 MIN: 5.55 / MAX: 12.3 MIN: 5.51 / MAX: 16.43 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Mobile Neural Network Model: MobileNetV2_224 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: MobileNetV2_224 r1 r3 r2 1.1905 2.381 3.5715 4.762 5.9525 SE +/- 0.210, N = 10 SE +/- 0.209, N = 10 SE +/- 0.185, N = 11 5.239 5.285 5.291 MIN: 3.19 / MAX: 26.27 MIN: 3.27 / MAX: 26.82 MIN: 3.3 / MAX: 27.38 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: SqueezeNetV1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: SqueezeNetV1.0 r1 r3 r2 3 6 9 12 15 SE +/- 0.373, N = 10 SE +/- 0.373, N = 10 SE +/- 0.316, N = 11 8.899 8.944 8.982 MIN: 4.96 / MAX: 31.21 MIN: 5.01 / MAX: 31.89 MIN: 5.05 / MAX: 31.35 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Redis Test: LPOP OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: LPOP r1 r3 r2 700K 1400K 2100K 2800K 3500K SE +/- 36042.05, N = 3 SE +/- 181152.66, N = 12 SE +/- 3702.86, N = 3 3394660.20 2809233.48 2104092.33 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
ASTC Encoder Preset: Medium OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Medium r3 r2 r1 2 4 6 8 10 SE +/- 0.16, N = 15 SE +/- 0.11, N = 15 SE +/- 0.14, N = 15 7.58 7.61 7.68 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU r2 r3 r1 1.0682 2.1364 3.2046 4.2728 5.341 SE +/- 0.06823, N = 15 SE +/- 0.07477, N = 15 SE +/- 0.10403, N = 12 4.71457 4.73728 4.74772 MIN: 3.29 MIN: 3.29 MIN: 3.29 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU r2 r3 r1 3 6 9 12 15 SE +/- 0.15643, N = 15 SE +/- 0.22537, N = 12 SE +/- 0.23621, N = 12 9.77701 9.81238 9.87893 MIN: 6.67 MIN: 6.65 MIN: 6.66 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU r3 r2 r1 0.715 1.43 2.145 2.86 3.575 SE +/- 0.06527, N = 12 SE +/- 0.02081, N = 3 SE +/- 0.01732, N = 3 3.11291 3.16769 3.17762 MIN: 1.86 MIN: 2.39 MIN: 2.58 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
LAMMPS Molecular Dynamics Simulator Model: Rhodopsin Protein OpenBenchmarking.org ns/day, More Is Better LAMMPS Molecular Dynamics Simulator 29Oct2020 Model: Rhodopsin Protein r1 r3 r2 1.1696 2.3392 3.5088 4.6784 5.848 SE +/- 0.111, N = 15 SE +/- 0.110, N = 15 SE +/- 0.109, N = 15 5.198 5.179 5.169 1. (CXX) g++ options: -O3 -pthread -lm
LuxCoreRender OpenCL Scene: Rainbow Colors and Prism OpenBenchmarking.org M samples/sec, More Is Better LuxCoreRender OpenCL 2.3 Scene: Rainbow Colors and Prism r3 r2 r1 1.2173 2.4346 3.6519 4.8692 6.0865 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 SE +/- 0.12, N = 12 5.41 5.39 5.30 MIN: 4.58 / MAX: 5.7 MIN: 4.6 / MAX: 5.67 MIN: 1.66 / MAX: 5.7
LuxCoreRender OpenCL Scene: LuxCore Benchmark OpenBenchmarking.org M samples/sec, More Is Better LuxCoreRender OpenCL 2.3 Scene: LuxCore Benchmark r2 r3 r1 0.5198 1.0396 1.5594 2.0792 2.599 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.04, N = 12 2.31 2.29 2.26 MIN: 0.27 / MAX: 2.63 MIN: 0.27 / MAX: 2.64 MIN: 0.14 / MAX: 2.63
LuxCoreRender OpenCL Scene: Food OpenBenchmarking.org M samples/sec, More Is Better LuxCoreRender OpenCL 2.3 Scene: Food r2 r3 r1 0.297 0.594 0.891 1.188 1.485 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.04, N = 12 1.32 1.30 1.27 MIN: 0.29 / MAX: 1.57 MIN: 0.26 / MAX: 1.57 MIN: 0.13 / MAX: 1.57
LuxCoreRender OpenCL Scene: DLSC OpenBenchmarking.org M samples/sec, More Is Better LuxCoreRender OpenCL 2.3 Scene: DLSC r2 r3 r1 0.6233 1.2466 1.8699 2.4932 3.1165 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.06, N = 12 2.77 2.76 2.70 MIN: 2.57 / MAX: 2.84 MIN: 2.56 / MAX: 2.84 MIN: 0.69 / MAX: 2.81
DDraceNetwork Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2 OpenBenchmarking.org Frames Per Second, More Is Better DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2 r1 r3 r2 30 60 90 120 150 SE +/- 9.86, N = 15 SE +/- 13.14, N = 12 158.21 130.66 100.58 MIN: 7.02 / MAX: 449.03 MIN: 6.67 / MAX: 498.75 MIN: 6.72 / MAX: 493.34 1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: RaiNyMore2 OpenBenchmarking.org Frames Per Second, More Is Better DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: RaiNyMore2 r1 r2 r3 40 80 120 160 200 SE +/- 9.09, N = 15 SE +/- 9.59, N = 15 SE +/- 11.09, N = 15 170.36 169.30 151.49 MIN: 2.43 / MAX: 499.5 MIN: 2.38 / MAX: 499.5 MIN: 2.37 / MAX: 499.75 1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
LevelDB Benchmark: Random Read OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.22 Benchmark: Random Read r3 r1 r2 3 6 9 12 15 SE +/- 0.214, N = 15 SE +/- 0.250, N = 12 SE +/- 0.206, N = 15 9.573 9.620 9.692 1. (CXX) g++ options: -O3 -lsnappy -lpthread
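Most rows in this section agree across r1, r2 and r3 to within a fraction of a percent, while a few (Redis LPOP, DDraceNetwork) differ widely. The following is a minimal sketch of one way to flag such rows. The helper `spread_percent` and the 5% threshold are hypothetical choices made only for illustration; the values are copied from the rows above.

```python
def spread_percent(values):
    """Gap between the highest and lowest run, as a percentage of the lowest."""
    lo, hi = min(values), max(values)
    return 100.0 * (hi - lo) / lo

# Per-run means copied from the corresponding result rows
rows = {
    "Redis LPOP (req/s)": [3394660.20, 2809233.48, 2104092.33],
    "DDraceNetwork OpenGL 3.3 (FPS)": [158.21, 130.66, 100.58],
    "Blender BMW27 OptiX (s)": [38.07, 38.07, 41.47],
    "Zstd level 3 (MB/s)": [2835.1, 2833.6, 2831.0],
}

THRESHOLD = 5.0  # assumed cut-off in percent, not a Phoronix Test Suite setting

flagged = [name for name, vals in rows.items() if spread_percent(vals) > THRESHOLD]

for name, vals in rows.items():
    print(f"{name}: spread {spread_percent(vals):.2f}%")
print("flagged:", flagged)
```

A large spread does not identify a cause by itself; the SE and N columns of each row indicate how noisy the individual runs were.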
Phoronix Test Suite v10.8.5