HP Zbook Intel Core i9-10885H testing with a HP 8736 (S91 Ver. 01.02.01 BIOS) and NVIDIA Quadro RTX 5000 with Max-Q Design 16GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101076-HA-HPZBOOK6247&sor .
HP Zbook system configuration (runs r1, r2, r3)
Processor: Intel Core i9-10885H @ 5.30GHz (8 Cores / 16 Threads)
Motherboard: HP 8736 (S91 Ver. 01.02.01 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 32GB
Disk: 2048GB KXG50PNV2T04 KIOXIA
Graphics: NVIDIA Quadro RTX 5000 with Max-Q Design 16GB (600/6000MHz)
Audio: Intel Comet Lake PCH cAVS
Network: Intel Wi-Fi 6 AX201
OS: Ubuntu 20.04
Kernel: 5.6.0-1034-oem (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: NVIDIA 450.80.02
OpenGL: 4.6.0
OpenCL: OpenCL 1.2 CUDA 11.0.228
Vulkan: 1.2.133
Compiler: GCC 9.3.0 + CUDA 10.1
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: intel_pstate powersave - CPU Microcode: 0xe0 - Thermald 1.9.1
OpenCL Details - GPU Compute Cores: 3072
Python Details - Python 3.8.3
Security Details - itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
HP Zbook leveldb: Hot Read leveldb: Fill Sync leveldb: Fill Sync leveldb: Overwrite leveldb: Overwrite leveldb: Rand Fill leveldb: Rand Fill leveldb: Rand Read leveldb: Seek Rand leveldb: Rand Delete leveldb: Seq Fill leveldb: Seq Fill realsr-ncnn: 4x - No realsr-ncnn: 4x - Yes waifu2x-ncnn: 2x - 3 - Yes vkfft: hashcat: MD5 hashcat: SHA1 hashcat: 7-Zip hashcat: SHA-512 hashcat: TrueCrypt RIPEMD160 + XTS financebench: Black-Scholes OpenCL viennacl: OpenCL LU Factorization cl-mem: Copy cl-mem: Read cl-mem: Write namd-cuda: ATPase Simulation - 327,506 Atoms betsy: ETC1 - Highest betsy: ETC2 RGB - Highest vkresample: 2x - Double vkresample: 2x - Single ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.0 - Default - RaiNyMore2 ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - RaiNyMore2 ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.0 - Default - Multeasymap ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - Multeasymap unigine-heaven: 1920 x 1080 - Fullscreen - OpenGL unigine-super: 1920 x 1080 - Fullscreen - Low - OpenGL unigine-super: 1920 x 1080 - Fullscreen - High - OpenGL unigine-super: 1920 x 1080 - Fullscreen - Ultra - OpenGL unigine-super: 1920 x 1080 - Fullscreen - Medium - OpenGL warsow: 1920 x 1080 yquake2: OpenGL 1.x - 1920 x 1080 yquake2: OpenGL 3.x - 1920 x 1080 yquake2: Software CPU - 1920 x 1080 octanebench: Total Score redshift: luxcorerender-cl: DLSC luxcorerender-cl: Food luxcorerender-cl: LuxCore Benchmark luxcorerender-cl: Rainbow Colors and Prism fahbench: hpcg: lczero: OpenCL rodinia: OpenCL Particle Filter clomp: Static OMP Speedup hmmer: Pfam Database Search mafft: Multiple Sequence Alignment - LSU RNA lammps: Rhodopsin Protein simdjson: Kostya simdjson: LargeRand simdjson: PartialTweets simdjson: DistinctUserID compress-lz4: 1 - Compression Speed compress-lz4: 1 - Decompression Speed compress-lz4: 3 - Compression Speed compress-lz4: 3 - Decompression Speed compress-lz4: 9 - Compression Speed compress-lz4: 9 - Decompression Speed 
compress-zstd: 3 compress-zstd: 19 crafty: Elapsed Time arrayfire: Conjugate Gradient OpenCL graphics-magick: Swirl graphics-magick: Rotate graphics-magick: Sharpen graphics-magick: Enhanced graphics-magick: Resizing graphics-magick: Noise-Gaussian graphics-magick: HWB Color Space onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU dav1d: Chimera 1080p dav1d: Summer Nature 4K dav1d: Summer Nature 1080p dav1d: Chimera 1080p 10-bit embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer ISPC - Asian Dragon rav1e: 1 rav1e: 5 rav1e: 6 rav1e: 10 coremark: CoreMark Size 666 - Iterations Per Second stockfish: Total Time asmfish: 1024 Hash Memory, 26 Depth build-ffmpeg: Time To Compile build-linux-kernel: Time To Compile build2: Time To Compile numpy: build-eigen: Time To Compile deepspeech: CPU encode-ape: WAV To APE encode-opus: WAV To Opus Encode espeak: Text-To-Speech Synthesis rnnoise: node-web-tooling: cryptsetup: PBKDF2-sha512 cryptsetup: PBKDF2-whirlpool cryptsetup: AES-XTS 256b Encryption cryptsetup: AES-XTS 256b Decryption cryptsetup: Serpent-XTS 
256b Encryption cryptsetup: Serpent-XTS 256b Decryption cryptsetup: Twofish-XTS 256b Encryption cryptsetup: Twofish-XTS 256b Decryption cryptsetup: AES-XTS 512b Encryption cryptsetup: AES-XTS 512b Decryption cryptsetup: Serpent-XTS 512b Encryption cryptsetup: Serpent-XTS 512b Decryption cryptsetup: Twofish-XTS 512b Decryption cryptsetup: Twofish-XTS 512b Encryption gromacs: Water Benchmark tensorflow-lite: SqueezeNet tensorflow-lite: Inception V4 tensorflow-lite: NASNet Mobile tensorflow-lite: Mobilenet Float tensorflow-lite: Mobilenet Quant tensorflow-lite: Inception ResNet V2 astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive basis: ETC1S basis: UASTC Level 0 basis: UASTC Level 2 basis: UASTC Level 3 basis: UASTC Level 2 + RDO Post-Processing sqlite-speedtest: Timed Time - Size 1,000 darktable: Boat - CPU-only darktable: Masskrug - CPU-only darktable: Server Rack - CPU-only darktable: Server Room - CPU-only gegl: Crop gegl: Scale gegl: Cartoon gegl: Reflect gegl: Antialias gegl: Tile Glass gegl: Wavelet Blur gegl: Color Enhance gegl: Rotate 90 Degrees inkscape: SVG Files To PNG rawtherapee: Total Benchmark Time redis: LPOP redis: SADD redis: LPUSH redis: GET redis: SET mnn: SqueezeNetV1.0 mnn: resnet-v2-50 mnn: MobileNetV2_224 mnn: mobilenet-v1-1.0 mnn: inception-v3 ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPU - yolov4-tiny ncnn: CPU - squeezenet_ssd ncnn: CPU - regnety_400m ncnn: Vulkan GPU - mobilenet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - 
alexnet ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - regnety_400m tnn: CPU - MobileNet v2 tnn: CPU - SqueezeNet v1.1 plaidml: No - Inference - IMDB LSTM - OpenCL plaidml: No - Inference - Mobilenet - OpenCL plaidml: Yes - Inference - Mobilenet - OpenCL plaidml: No - Inference - DenseNet 201 - OpenCL openvino: Face Detection 0106 FP16 - CPU openvino: Face Detection 0106 FP16 - CPU openvino: Face Detection 0106 FP32 - CPU openvino: Face Detection 0106 FP32 - CPU openvino: Person Detection 0106 FP16 - CPU openvino: Person Detection 0106 FP16 - CPU openvino: Person Detection 0106 FP32 - CPU openvino: Person Detection 0106 FP32 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP32 - CPU openvino: Age Gender Recognition Retail 0013 FP32 - CPU indigobench: CPU - Bedroom indigobench: CPU - Supercar blender: BMW27 - CUDA blender: Classroom - CUDA blender: Fishy Cat - CUDA blender: Barbershop - CUDA blender: BMW27 - NVIDIA OptiX blender: Classroom - NVIDIA OptiX blender: Fishy Cat - NVIDIA OptiX blender: Barbershop - NVIDIA OptiX blender: Pabellon Barcelona - CUDA blender: Pabellon Barcelona - NVIDIA OptiX mandelgpu: GPU clpeak: Integer Compute INT clpeak: Single-Precision Float clpeak: Double-Precision Double clpeak: Global Memory Bandwidth neatbench: GPU ai-benchmark: Device Inference Score ai-benchmark: Device Training Score ai-benchmark: Device AI Score phpbench: PHP Benchmark Suite unpack-firefox: firefox-84.0.source.tar.xz brl-cad: VGR Performance Metric r1 r2 r3 6.946 0.5 3361.777 43.2 40.925 43.1 41.037 9.620 12.694 47.228 37.5 47.235 14.734 99.812 6.020 25820 24334866667 8585766667 373667 1023100000 301233 17.477667 68.2924 236.6 330.3 215.7 0.22103 5.854 8.016 256.867 24.992 170.36 158.21 413.88 435.20 139.126 177.7 65.9 25.1 90.4 955.6 59.9 60 60.7 189.085068 461 2.70 1.27 2.26 5.30 
186.4611 3.96177 13277 7.115 3.7 105.526 10.497 5.198 0.76 0.5 0.86 0.89 8120.67 9823.2 57.88 9676.3 55.72 9679.8 2833.6 28.8 9497414 2.549 207 902 72 115 552 146 775 7.16575 12.4721 3.17762 2.72558 21.6871 9.77594 9.87893 18.0080 8.96782 4.74772 7155.41 3795.02 7140.50 3795.81 4.36381 7144.23 3797.32 4.45564 489.84 112.75 460.02 86.08 6.0806 7.0735 7.5555 9.1343 0.347 1.069 1.444 3.422 223414.807558 9703133 15984719 100.257 151.656 210.051 419.58 68.744 81.29517 10.512 7.624 26.474 22.084 13.06 1919349 816282 4005.6 4002.4 874.1 872.3 482.0 482.5 3346.8 3348.3 878.0 871.7 482.7 483.0 0.617 354892 5163190 302594 239119 236716 4660197 5.44 7.68 54.29 447.99 57.824 7.288 55.499 110.838 840.347 49.547 15.914 7.128 0.181 4.181 8.900 6.954 86.789 28.183 36.556 28.243 57.993 54.114 37.697 20.996 80.586 3394660.2 2660539.42 2041750.08 3248596.08 2375800.25 8.899 58.161 5.239 10.646 62.568 26.62 7.31 5.74 7.93 6.67 10.00 2.54 19.98 72.09 18.62 15.50 37.81 35.95 27.64 19.16 26.52 7.23 5.74 7.92 6.63 10.01 2.55 20.05 71.96 18.62 15.44 37.25 35.52 27.58 19.16 321.420 272.907 463.34 1246.78 1819.24 110.07 1.28 3165.24 1.26 3202.53 0.80 4961.99 0.79 5069.44 3442.78 1.17 3363.55 1.21 0.939 2.147 91.00 250.78 168.87 734.81 41.47 116.76 60.35 1192.96 608.80 196.21 251986408.7 5504.35 5940.64 340.42 324.63 27.5 730 816 1546 837911 16.028 63909 7.099 0.5 3424.918 43.2 40.955 43.1 41.027 9.692 12.629 47.296 37.4 47.287 14.656 100.617 6.102 25647 24260200000 8544500000 370400 1020000000 301433 17.476 64.2335 235.4 329.9 215.6 0.22238 5.789 7.912 257.062 25.190 169.30 100.58 412.43 429.37 139.905 178.1 66.5 25.4 90.6 967.9 59.9 60 60.7 189.101719 460 2.77 1.32 2.31 5.39 186.4777 3.96068 13173 7.055 2.5 105.572 10.564 5.169 0.75 0.5 0.87 0.88 8127.78 9839.9 57.36 9653.7 56.07 9664.8 2831.0 28.7 9584148 2.531 207 875 72 115 551 147 774 7.04404 12.4447 3.16769 2.77670 21.6992 9.76468 9.77701 17.9035 9.00692 4.71457 7159.48 3797.05 7159.42 3800.41 4.37852 7154.66 3799.45 4.47062 486.46 
112.03 459.61 85.83 6.0989 6.9794 7.5656 9.2596 0.346 1.064 1.440 3.404 223304.983286 9839292 15974611 100.397 152.208 210.712 419.36 67.543 81.07316 10.861 7.602 27.178 21.316 13.17 1943008 830020 4080.5 4055.1 881.4 876.6 487.4 486.3 3381.9 3388.5 882.1 878.1 485.7 486.4 0.610 356034 5168183 304756 239224 237129 4670567 5.59 7.61 54.38 449.37 58.063 7.345 55.742 110.926 840.319 50.268 15.870 7.150 0.181 4.174 8.839 6.973 87.319 28.496 36.557 28.242 57.950 54.312 37.541 21.048 80.934 2104092.33 2628039.25 2094056.31 3012560.83 2413657.0 8.982 58.530 5.291 10.675 63.180 26.63 7.22 5.81 6.95 5.96 9.05 2.60 20.01 71.91 18.71 15.46 37.30 35.59 27.51 18.91 26.53 7.22 5.73 6.98 5.86 9.02 2.29 18.20 71.82 18.33 15.53 37.34 35.51 27.52 17.15 295.547 264.948 477.39 1244.95 1823.06 109.98 1.28 3166.57 1.27 3207.35 0.80 4978.25 0.79 5079.89 3403.45 1.19 3307.53 1.23 0.938 2.150 90.82 251.90 167.96 731.67 38.07 116.15 60.18 1190.05 609.56 196.28 252826584.8 5519.39 5858.32 340.46 324.58 27.1 730 814 1544 832417 16.137 63822 7.128 0.5 3386.084 43.4 40.760 43.2 40.981 9.573 12.644 47.388 37.3 47.415 14.694 100.748 6.093 25683 24196900000 8535333333 366433 1016800000 298133 17.476333 65.9180 235.1 329.9 214.8 0.22171 5.792 7.903 257.615 25.225 151.49 130.66 412.38 434.24 139.184 177.4 66.2 25.3 90.5 968.6 59.9 60 60.6 189.316553 459 2.76 1.30 2.29 5.41 186.6158 3.95457 13416 7.027 3.6 105.505 10.608 5.179 0.75 0.5 0.86 0.88 8079.18 9810.0 58.89 9685.2 57.01 9695.2 2835.1 28.8 9560012 2.548 207 900 73 115 551 147 776 7.14574 12.6089 3.11291 2.74874 21.6210 9.73732 9.81238 18.0326 9.06628 4.73728 7169.03 3797.72 7151.58 3798.12 4.38535 7147.09 3792.87 4.46656 487.57 112.65 459.71 85.95 6.0641 6.9976 7.5496 9.1967 0.347 1.064 1.443 3.420 223892.444726 9629353 16180674 100.203 151.478 210.945 417.03 68.699 81.03983 10.592 7.616 27.713 22.044 13.18 1886103 810352 4023.0 4026.9 874.1 870.9 483.6 483.0 3336.0 3362.9 874.4 873.5 483.0 483.8 0.614 356258 5178263 304079 239537 237406 
4677473 5.63 7.58 54.65 449.90 58.062 7.353 55.766 111.040 841.228 50.261 15.863 7.155 0.181 4.178 8.826 7.000 86.993 28.313 36.646 28.055 57.843 54.101 37.691 21.068 80.712 2809233.48 2634908.83 2083566.29 3009326.75 2433543.8 8.944 58.786 5.285 10.658 63.563 26.53 7.23 5.81 7.03 5.96 9.06 2.57 20.21 71.86 18.66 15.49 37.22 35.66 27.63 19.38 26.51 7.19 5.81 7.05 5.91 8.99 2.29 18.26 71.86 18.38 15.50 37.26 35.53 27.55 17.60 299.396 272.676 478.73 1247.93 1817.78 109.99 1.28 3164.51 1.27 3212.10 0.80 5006.34 0.79 5073.09 3405.92 1.19 3347.93 1.22 0.935 2.156 90.93 251.80 168.08 733.02 38.07 116.26 60.25 1192.80 608.62 196.41 252822614.4 5540.44 5892.70 340.59 324.78 27.6 730 814 1544 829705 16.103 64033 OpenBenchmarking.org
LevelDB 1.22 - Benchmark: Hot Read (Microseconds Per Op, Fewer Is Better): r1: 6.946 (SE +/- 0.013, N = 3); r2: 7.099 (SE +/- 0.075, N = 3); r3: 7.128 (SE +/- 0.049, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Fill Sync (MB/s, More Is Better): r3: 0.5 (SE +/- 0.00, N = 3); r2: 0.5 (SE +/- 0.00, N = 3); r1: 0.5 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Fill Sync (Microseconds Per Op, Fewer Is Better): r1: 3361.78 (SE +/- 33.91, N = 3); r3: 3386.08 (SE +/- 60.32, N = 3); r2: 3424.92 (SE +/- 25.98, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Overwrite (MB/s, More Is Better): r3: 43.4 (SE +/- 0.03, N = 3); r2: 43.2 (SE +/- 0.07, N = 3); r1: 43.2 (SE +/- 0.15, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Overwrite (Microseconds Per Op, Fewer Is Better): r3: 40.76 (SE +/- 0.04, N = 3); r1: 40.93 (SE +/- 0.15, N = 3); r2: 40.96 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Random Fill (MB/s, More Is Better): r3: 43.2 (SE +/- 0.07, N = 3); r2: 43.1 (SE +/- 0.19, N = 3); r1: 43.1 (SE +/- 0.21, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Random Fill (Microseconds Per Op, Fewer Is Better): r3: 40.98 (SE +/- 0.07, N = 3); r2: 41.03 (SE +/- 0.20, N = 3); r1: 41.04 (SE +/- 0.19, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Random Read (Microseconds Per Op, Fewer Is Better): r3: 9.573 (SE +/- 0.214, N = 15); r1: 9.620 (SE +/- 0.250, N = 12); r2: 9.692 (SE +/- 0.206, N = 15). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Seek Random (Microseconds Per Op, Fewer Is Better): r2: 12.63 (SE +/- 0.10, N = 15); r3: 12.64 (SE +/- 0.11, N = 14); r1: 12.69 (SE +/- 0.11, N = 15). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Random Delete (Microseconds Per Op, Fewer Is Better): r1: 47.23 (SE +/- 0.49, N = 5); r2: 47.30 (SE +/- 0.57, N = 4); r3: 47.39 (SE +/- 0.56, N = 4). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Sequential Fill (MB/s, More Is Better): r1: 37.5 (SE +/- 0.44, N = 4); r2: 37.4 (SE +/- 0.46, N = 4); r3: 37.3 (SE +/- 0.39, N = 5). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22 - Benchmark: Sequential Fill (Microseconds Per Op, Fewer Is Better): r1: 47.24 (SE +/- 0.54, N = 4); r2: 47.29 (SE +/- 0.58, N = 4); r3: 47.42 (SE +/- 0.48, N = 5). (CXX) g++ options: -O3 -lsnappy -lpthread
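As a reading aid for the "SE +/-" figures in the LevelDB results above, the following minimal Python sketch converts a reported standard error back into a sample standard deviation and expresses the SE as a percentage of the mean. It assumes that the reported SE is the usual standard error of the mean (standard deviation divided by the square root of N), which this export does not state explicitly. The function names are hypothetical and are not part of the Phoronix Test Suite.

```python
import math


def std_dev_from_se(se, n):
    """Recover the sample standard deviation, assuming se = sd / sqrt(n)."""
    return se * math.sqrt(n)


def relative_se_percent(mean, se):
    """Express the standard error as a percentage of the reported mean."""
    return 100.0 * se / mean


# LevelDB Hot Read, run r2: 7.099 microseconds per op, SE +/- 0.075, N = 3
hot_read_sd = std_dev_from_se(0.075, 3)
hot_read_rel = relative_se_percent(7.099, 0.075)
print(f"sd = {hot_read_sd:.3f} us/op, SE = {hot_read_rel:.2f}% of the mean")
```

With only 3 to 15 samples per result, these figures are coarse; they are best used to judge whether a gap between r1, r2 and r3 is large relative to the run-to-run noise.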
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better): r2: 14.66 (SE +/- 0.09, N = 3); r3: 14.69 (SE +/- 0.11, N = 3); r1: 14.73 (SE +/- 0.01, N = 3).
RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better): r1: 99.81 (SE +/- 0.31, N = 3); r2: 100.62 (SE +/- 0.48, N = 3); r3: 100.75 (SE +/- 0.35, N = 3).
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better): r1: 6.020 (SE +/- 0.004, N = 3); r3: 6.093 (SE +/- 0.011, N = 3); r2: 6.102 (SE +/- 0.007, N = 3).
VkFFT 1.1.1 (Benchmark Score, More Is Better): r1: 25820 (SE +/- 62.93, N = 3); r3: 25683 (SE +/- 108.37, N = 3); r2: 25647 (SE +/- 58.68, N = 3). (CXX) g++ options: -O3 -pthread
Hashcat 6.1.1 - Benchmark: MD5 (H/s, More Is Better): r1: 24334866667 (SE +/- 110495102.96, N = 3); r2: 24260200000 (SE +/- 81107726.72, N = 3); r3: 24196900000 (SE +/- 49256167.13, N = 3).
Hashcat 6.1.1 - Benchmark: SHA1 (H/s, More Is Better): r1: 8585766667 (SE +/- 31347213.24, N = 3); r2: 8544500000 (SE +/- 17380832.35, N = 3); r3: 8535333333 (SE +/- 18653000.95, N = 3).
Hashcat 6.1.1 - Benchmark: 7-Zip (H/s, More Is Better): r1: 373667 (SE +/- 1589.90, N = 3); r2: 370400 (SE +/- 1858.31, N = 3); r3: 366433 (SE +/- 3670.30, N = 3).
Hashcat 6.1.1 - Benchmark: SHA-512 (H/s, More Is Better): r1: 1023100000 (SE +/- 11546345.54, N = 15); r2: 1020000000 (SE +/- 2594224.35, N = 3); r3: 1016800000 (SE +/- 1852025.92, N = 3).
Hashcat 6.1.1 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better): r2: 301433 (SE +/- 851.14, N = 3); r1: 301233 (SE +/- 1322.04, N = 3); r3: 298133 (SE +/- 545.69, N = 3).
FinanceBench 2016-06-06 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better): r2: 17.48 (SE +/- 0.00, N = 3); r3: 17.48 (SE +/- 0.00, N = 3); r1: 17.48 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -lOpenCL
ViennaCL 1.4.2 - OpenCL LU Factorization (GFLOPS, More Is Better): r1: 68.29 (SE +/- 0.36, N = 3); r3: 65.92 (SE +/- 0.44, N = 3); r2: 64.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -rdynamic -lOpenCL
cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better): r1: 236.6 (SE +/- 0.22, N = 3); r2: 235.4 (SE +/- 0.24, N = 3); r3: 235.1 (SE +/- 0.27, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better): r1: 330.3 (SE +/- 0.18, N = 3); r3: 329.9 (SE +/- 0.03, N = 3); r2: 329.9 (SE +/- 0.09, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better): r1: 215.7 (SE +/- 0.47, N = 3); r2: 215.6 (SE +/- 0.26, N = 3); r3: 214.8 (SE +/- 0.50, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better): r1: 0.22103 (SE +/- 0.00131, N = 3); r3: 0.22171 (SE +/- 0.00272, N = 4); r2: 0.22238 (SE +/- 0.00245, N = 5).
Betsy GPU Compressor 1.1 Beta - Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better): r2: 5.789 (SE +/- 0.008, N = 3); r3: 5.792 (SE +/- 0.024, N = 3); r1: 5.854 (SE +/- 0.068, N = 12). (CXX) g++ options: -O3 -O2 -lpthread -ldl
Betsy GPU Compressor 1.1 Beta - Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better): r3: 7.903 (SE +/- 0.023, N = 3); r2: 7.912 (SE +/- 0.018, N = 3); r1: 8.016 (SE +/- 0.064, N = 13). (CXX) g++ options: -O3 -O2 -lpthread -ldl
VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better): r1: 256.87 (SE +/- 0.20, N = 3); r2: 257.06 (SE +/- 0.11, N = 3); r3: 257.62 (SE +/- 0.20, N = 3). (CXX) g++ options: -O3 -pthread
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better): r1: 24.99 (SE +/- 0.02, N = 3); r2: 25.19 (SE +/- 0.05, N = 3); r3: 25.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -pthread
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: RaiNyMore2 (Frames Per Second, More Is Better): r1: 170.36 (SE +/- 9.09, N = 15; MIN: 2.43 / MAX: 499.5); r2: 169.30 (SE +/- 9.59, N = 15; MIN: 2.38 / MAX: 499.5); r3: 151.49 (SE +/- 11.09, N = 15; MIN: 2.37 / MAX: 499.75). (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2 (Frames Per Second, More Is Better): r1: 158.21 (MIN: 7.02 / MAX: 449.03); r3: 130.66 (MIN: 6.67 / MAX: 498.75); r2: 100.58 (MIN: 6.72 / MAX: 493.34). SE listed for only two of the three runs: SE +/- 9.86, N = 15; SE +/- 13.14, N = 12. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better): r1: 413.88 (SE +/- 0.79, N = 3; MIN: 119.86 / MAX: 499.75); r2: 412.43 (SE +/- 2.87, N = 3; MIN: 103.17 / MAX: 499.75); r3: 412.38 (SE +/- 4.35, N = 3; MIN: 127.91 / MAX: 499.75). (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better): r3: Min: 2 / Avg: 2.39 / Max: 7.28; r1: Min: 2 / Avg: 2.43 / Max: 6.55; r2: Min: 2 / Avg: 2.46 / Max: 6.5. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better): r1: 435.20 (SE +/- 0.25, N = 3; MIN: 99.45 / MAX: 499.75); r3: 434.24 (SE +/- 2.45, N = 3; MIN: 115.25 / MAX: 499.75); r2: 429.37 (SE +/- 2.73, N = 3; MIN: 112.88 / MAX: 499.75). (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better): r1: Min: 2 / Avg: 2.3 / Max: 10.06; r2: Min: 2 / Avg: 2.32 / Max: 5.18; r3: Min: 2 / Avg: 2.32 / Max: 8.68. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
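The three runs share one configuration, so the gap between the best and worst run gives a rough sense of how repeatable each test is. The sketch below is a hypothetical helper (not part of the Phoronix Test Suite) that computes that gap for "More Is Better" metrics, using the DDraceNetwork OpenGL 3.3 frame rates reported above as inputs.

```python
def spread_percent(results):
    """Gap between the best and worst run, as a percentage of the best run.

    Intended for "More Is Better" metrics such as frames per second.
    """
    best = max(results.values())
    worst = min(results.values())
    return 100.0 * (best - worst) / best


# Frames per second, DDraceNetwork 15.2.3, OpenGL 3.3, 1920 x 1080
rainymore2_gl33 = {"r1": 158.21, "r3": 130.66, "r2": 100.58}
multeasymap_gl33 = {"r1": 435.20, "r3": 434.24, "r2": 429.37}

for name, runs in [("RaiNyMore2", rainymore2_gl33),
                   ("Multeasymap", multeasymap_gl33)]:
    print(f"{name}: {spread_percent(runs):.1f}% spread across runs")
```

On these inputs the RaiNyMore2 demo shows a far larger spread than Multeasymap, which fits the larger standard errors and higher sample counts reported for it.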
Unigine Heaven 4.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better): r2: 139.91 (SE +/- 0.96, N = 3); r3: 139.18 (SE +/- 0.56, N = 3); r1: 139.13 (SE +/- 0.71, N = 3).
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Low - Renderer: OpenGL (Frames Per Second, More Is Better): r2: 178.1 (SE +/- 0.71, N = 3; MAX: 259.4); r1: 177.7 (SE +/- 0.23, N = 3; MAX: 260.1); r3: 177.4 (SE +/- 0.52, N = 3; MAX: 263.9).
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL (Frames Per Second, More Is Better): r2: 66.5 (SE +/- 0.12, N = 3; MAX: 80.8); r3: 66.2 (SE +/- 0.09, N = 3; MAX: 80.3); r1: 65.9 (SE +/- 0.19, N = 3; MAX: 81.6).
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Ultra - Renderer: OpenGL (Frames Per Second, More Is Better): r2: 25.4 (SE +/- 0.03, N = 3; MAX: 29.4); r3: 25.3 (SE +/- 0.03, N = 3; MAX: 29.7); r1: 25.1 (SE +/- 0.06, N = 3; MAX: 29.3).
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Medium - Renderer: OpenGL (Frames Per Second, More Is Better): r2: 90.6 (SE +/- 0.15, N = 3; MAX: 114.4); r3: 90.5 (SE +/- 0.15, N = 3; MAX: 113); r1: 90.4 (SE +/- 0.15, N = 3; MAX: 114.5).
Warsow 2.5 Beta - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r3: 968.6 (SE +/- 1.81, N = 3); r2: 967.9 (SE +/- 1.46, N = 3); r1: 955.6 (SE +/- 13.76, N = 12).
yquake2 7.45 - Renderer: OpenGL 1.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r3: 59.9 (SE +/- 0.03, N = 3); r2: 59.9 (SE +/- 0.07, N = 3); r1: 59.9 (SE +/- 0.03, N = 3). (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 7.45 - Renderer: OpenGL 3.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r3: 60; r2: 60; r1: 60. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 7.45 - Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r2: 60.7 (SE +/- 0.07, N = 3); r1: 60.7 (SE +/- 0.07, N = 3); r3: 60.6 (SE +/- 0.09, N = 3). (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
OctaneBench 2020.1 - Total Score (Score, More Is Better): r3: 189.32; r2: 189.10; r1: 189.09.
RedShift Demo 3.0 (Seconds, Fewer Is Better): r3: 459; r2: 460; r1: 461. SE listed for only two of the three runs: SE +/- 0.33, N = 3; SE +/- 0.88, N = 3.
LuxCoreRender OpenCL 2.3 - Scene: DLSC (M samples/sec, More Is Better): r2: 2.77 (SE +/- 0.00, N = 3; MIN: 2.57 / MAX: 2.84); r3: 2.76 (SE +/- 0.00, N = 3; MIN: 2.56 / MAX: 2.84); r1: 2.70 (SE +/- 0.06, N = 12; MIN: 0.69 / MAX: 2.81).
LuxCoreRender OpenCL 2.3 - Scene: Food (M samples/sec, More Is Better): r2: 1.32 (SE +/- 0.01, N = 3; MIN: 0.29 / MAX: 1.57); r3: 1.30 (SE +/- 0.02, N = 3; MIN: 0.26 / MAX: 1.57); r1: 1.27 (SE +/- 0.04, N = 12; MIN: 0.13 / MAX: 1.57).
LuxCoreRender OpenCL 2.3 - Scene: LuxCore Benchmark (M samples/sec, More Is Better): r2: 2.31 (SE +/- 0.01, N = 3; MIN: 0.27 / MAX: 2.63); r3: 2.29 (SE +/- 0.01, N = 3; MIN: 0.27 / MAX: 2.64); r1: 2.26 (SE +/- 0.04, N = 12; MIN: 0.14 / MAX: 2.63).
LuxCoreRender OpenCL 2.3 - Scene: Rainbow Colors and Prism (M samples/sec, More Is Better): r3: 5.41 (SE +/- 0.02, N = 3; MIN: 4.58 / MAX: 5.7); r2: 5.39 (SE +/- 0.02, N = 3; MIN: 4.6 / MAX: 5.67); r1: 5.30 (SE +/- 0.12, N = 12; MIN: 1.66 / MAX: 5.7).
FAHBench 2.3.2 (Ns Per Day, More Is Better): r3: 186.62 (SE +/- 0.11, N = 3); r2: 186.48 (SE +/- 0.14, N = 3); r1: 186.46 (SE +/- 0.23, N = 3).
High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better): r1: 3.96177 (SE +/- 0.00082, N = 3); r2: 3.96068 (SE +/- 0.00692, N = 3); r3: 3.95457 (SE +/- 0.01196, N = 3). (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
LeelaChessZero 0.26 - Backend: OpenCL (Nodes Per Second, More Is Better): r3: 13416 (SE +/- 44.68, N = 3); r1: 13277 (SE +/- 160.45, N = 3); r2: 13173 (SE +/- 176.76, N = 3). (CXX) g++ options: -flto -pthread
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better): r3: 7.027 (SE +/- 0.016, N = 3); r2: 7.055 (SE +/- 0.013, N = 3); r1: 7.115 (SE +/- 0.065, N = 3). (CXX) g++ options: -O2 -lOpenCL
CLOMP 1.2, Static OMP Speedup (Speedup, More Is Better): r1 3.7 (SE +/- 0.03, N = 3); r3 3.6 (SE +/- 0.03, N = 15); r2 2.5 (SE +/- 0.03, N = 15). (CC) gcc options: -fopenmp -O3 -lm
Timed HMMer Search 3.3.1, Pfam Database Search (Seconds, Fewer Is Better): r3 105.51 (SE +/- 0.02, N = 3); r1 105.53 (SE +/- 0.06, N = 3); r2 105.57 (SE +/- 0.04, N = 3). (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Timed MAFFT Alignment 7.471, Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better): r1 10.50 (SE +/- 0.08, N = 12); r2 10.56 (SE +/- 0.10, N = 15); r3 10.61 (SE +/- 0.10, N = 14). (CC) gcc options: -std=c99 -O3 -lm -lpthread
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better): r1 5.198 (SE +/- 0.111, N = 15); r3 5.179 (SE +/- 0.110, N = 15); r2 5.169 (SE +/- 0.109, N = 15). (CXX) g++ options: -O3 -pthread -lm
simdjson 0.7.1, Throughput Test: Kostya (GB/s, More Is Better): r1 0.76 (SE +/- 0.00, N = 3); r3 0.75 (SE +/- 0.00, N = 3); r2 0.75 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
simdjson 0.7.1, Throughput Test: LargeRandom (GB/s, More Is Better): r3 0.5 (SE +/- 0.00, N = 3); r2 0.5 (SE +/- 0.00, N = 3); r1 0.5 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
simdjson 0.7.1, Throughput Test: PartialTweets (GB/s, More Is Better): r2 0.87 (SE +/- 0.01, N = 3); r3 0.86 (SE +/- 0.00, N = 3); r1 0.86 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
simdjson 0.7.1, Throughput Test: DistinctUserID (GB/s, More Is Better): r1 0.89 (SE +/- 0.00, N = 3); r3 0.88 (SE +/- 0.00, N = 3); r2 0.88 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
LZ4 Compression 1.9.3, Compression Level: 1 - Compression Speed (MB/s, More Is Better): r2 8127.78 (SE +/- 4.75, N = 3); r1 8120.67 (SE +/- 6.52, N = 3); r3 8079.18 (SE +/- 11.24, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3, Compression Level: 1 - Decompression Speed (MB/s, More Is Better): r2 9839.9 (SE +/- 2.38, N = 3); r1 9823.2 (SE +/- 3.96, N = 3); r3 9810.0 (SE +/- 10.11, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3, Compression Level: 3 - Compression Speed (MB/s, More Is Better): r3 58.89 (SE +/- 0.48, N = 3); r1 57.88 (SE +/- 0.61, N = 5); r2 57.36 (SE +/- 0.58, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3, Compression Level: 3 - Decompression Speed (MB/s, More Is Better): r3 9685.2 (SE +/- 0.67, N = 3); r1 9676.3 (SE +/- 1.84, N = 5); r2 9653.7 (SE +/- 16.28, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3, Compression Level: 9 - Compression Speed (MB/s, More Is Better): r3 57.01 (SE +/- 0.66, N = 3); r2 56.07 (SE +/- 0.36, N = 3); r1 55.72 (SE +/- 0.59, N = 5). (CC) gcc options: -O3
LZ4 Compression 1.9.3, Compression Level: 9 - Decompression Speed (MB/s, More Is Better): r3 9695.2 (SE +/- 0.78, N = 3); r1 9679.8 (SE +/- 1.80, N = 5); r2 9664.8 (SE +/- 15.38, N = 3). (CC) gcc options: -O3
Zstd Compression 1.4.5, Compression Level: 3 (MB/s, More Is Better): r3 2835.1 (SE +/- 4.18, N = 3); r1 2833.6 (SE +/- 7.25, N = 3); r2 2831.0 (SE +/- 8.65, N = 3). (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.5, Compression Level: 19 (MB/s, More Is Better): r3 28.8 (SE +/- 0.06, N = 3); r1 28.8 (SE +/- 0.07, N = 3); r2 28.7 (SE +/- 0.06, N = 3). (CC) gcc options: -O3 -pthread -lz -llzma
Crafty 25.2, Elapsed Time (Nodes Per Second, More Is Better): r2 9584148 (SE +/- 7176.35, N = 3); r3 9560012 (SE +/- 16578.83, N = 3); r1 9497414 (SE +/- 45086.65, N = 3). (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
ArrayFire 3.7, Test: Conjugate Gradient OpenCL (ms, Fewer Is Better): r2 2.531 (SE +/- 0.022, N = 3); r3 2.548 (SE +/- 0.018, N = 3); r1 2.549 (SE +/- 0.015, N = 3). (CXX) g++ options: -rdynamic
GraphicsMagick 1.3.33, Operation: Swirl (Iterations Per Minute, More Is Better): r3 207 (SE +/- 1.72, N = 8); r2 207 (SE +/- 1.60, N = 10); r1 207 (SE +/- 1.72, N = 8). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, More Is Better): r1 902 (SE +/- 2.52, N = 3); r3 900 (SE +/- 1.86, N = 3); r2 875 (SE +/- 3.18, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: Sharpen (Iterations Per Minute, More Is Better): r3 73 (SE +/- 0.67, N = 3); r2 72 (SE +/- 0.58, N = 3); r1 72 (SE +/- 0.33, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: Enhanced (Iterations Per Minute, More Is Better): r3 115 (SE +/- 0.67, N = 3); r2 115 (SE +/- 0.67, N = 3); r1 115 (SE +/- 0.67, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, More Is Better): r1 552 (SE +/- 2.73, N = 3); r3 551 (SE +/- 5.36, N = 3); r2 551 (SE +/- 5.00, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): r3 147 (SE +/- 1.20, N = 3); r2 147 (SE +/- 1.00, N = 3); r1 146 (SE +/- 1.33, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, More Is Better): r3 776 (SE +/- 4.51, N = 3); r1 775 (SE +/- 5.03, N = 3); r2 774 (SE +/- 5.70, N = 3). (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2 7.04404 (SE +/- 0.11582, N = 12; MIN: 4.11); r3 7.14574 (SE +/- 0.02993, N = 3; MIN: 5.45); r1 7.16575 (SE +/- 0.05152, N = 3; MIN: 5.58). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2 12.44 (SE +/- 0.04, N = 3; MIN: 12.09); r1 12.47 (SE +/- 0.01, N = 3; MIN: 12.08); r3 12.61 (SE +/- 0.03, N = 3; MIN: 12.2). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r3 3.11291 (SE +/- 0.06527, N = 12; MIN: 1.86); r2 3.16769 (SE +/- 0.02081, N = 3; MIN: 2.39); r1 3.17762 (SE +/- 0.01732, N = 3; MIN: 2.58). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 2.72558 (SE +/- 0.00400, N = 3; MIN: 2.54); r3 2.74874 (SE +/- 0.00352, N = 3; MIN: 2.54); r2 2.77670 (SE +/- 0.01530, N = 3; MIN: 2.56). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r3 21.62 (SE +/- 0.02, N = 3; MIN: 21.51); r1 21.69 (SE +/- 0.06, N = 3; MIN: 21.47); r2 21.70 (SE +/- 0.06, N = 3; MIN: 21.48). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r3 9.73732 (SE +/- 0.03582, N = 3; MIN: 8.75); r2 9.76468 (SE +/- 0.03928, N = 3; MIN: 8.72); r1 9.77594 (SE +/- 0.04555, N = 3; MIN: 8.77). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2 9.77701 (SE +/- 0.15643, N = 15; MIN: 6.67); r3 9.81238 (SE +/- 0.22537, N = 12; MIN: 6.65); r1 9.87893 (SE +/- 0.23621, N = 12; MIN: 6.66). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r2 17.90 (SE +/- 0.08, N = 3; MIN: 17.18); r1 18.01 (SE +/- 0.02, N = 3; MIN: 17.22); r3 18.03 (SE +/- 0.02, N = 3; MIN: 17.24). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 8.96782 (SE +/- 0.04374, N = 3; MIN: 8.14); r2 9.00692 (SE +/- 0.01715, N = 3; MIN: 8.15); r3 9.06628 (SE +/- 0.11418, N = 3; MIN: 8). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r2 4.71457 (SE +/- 0.06823, N = 15; MIN: 3.29); r3 4.73728 (SE +/- 0.07477, N = 15; MIN: 3.29); r1 4.74772 (SE +/- 0.10403, N = 12; MIN: 3.29). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 7155.41 (SE +/- 12.55, N = 3; MIN: 7025.22); r2 7159.48 (SE +/- 1.75, N = 3; MIN: 7040.61); r3 7169.03 (SE +/- 6.55, N = 3; MIN: 7046.49). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 3795.02 (SE +/- 2.45, N = 3; MIN: 3682.24); r2 3797.05 (SE +/- 2.65, N = 3; MIN: 3673.18); r3 3797.72 (SE +/- 3.77, N = 3; MIN: 3684.19). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 7140.50 (SE +/- 2.95, N = 3; MIN: 7021.68); r3 7151.58 (SE +/- 6.73, N = 3; MIN: 7027.2); r2 7159.42 (SE +/- 4.70, N = 3; MIN: 7041.4). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 3795.81 (SE +/- 6.76, N = 3; MIN: 3687.23); r3 3798.12 (SE +/- 3.22, N = 3; MIN: 3685.27); r2 3800.41 (SE +/- 4.34, N = 3; MIN: 3681.23). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 4.36381 (SE +/- 0.00310, N = 3; MIN: 4.23); r2 4.37852 (SE +/- 0.00806, N = 3; MIN: 4.25); r3 4.38535 (SE +/- 0.00559, N = 3; MIN: 4.25). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1 7144.23 (SE +/- 3.89, N = 3; MIN: 7028.46); r3 7147.09 (SE +/- 2.23, N = 3; MIN: 7033.98); r2 7154.66 (SE +/- 0.92, N = 3; MIN: 7035.88). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r3 3792.87 (SE +/- 1.33, N = 3; MIN: 3672.83); r1 3797.32 (SE +/- 1.61, N = 3; MIN: 3686.53); r2 3799.45 (SE +/- 1.20, N = 3; MIN: 3692.97). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 4.45564 (SE +/- 0.00967, N = 3; MIN: 4.02); r3 4.46656 (SE +/- 0.00726, N = 3; MIN: 4.01); r2 4.47062 (SE +/- 0.01661, N = 3; MIN: 4.02). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1, Video Input: Chimera 1080p (FPS, More Is Better): r1 489.84 (SE +/- 5.73, N = 14; MIN: 317.1 / MAX: 898.12); r3 487.57 (SE +/- 3.24, N = 13; MIN: 316.7 / MAX: 911.47); r2 486.46 (SE +/- 3.02, N = 14; MIN: 316.37 / MAX: 900.57). (CC) gcc options: -pthread
dav1d 0.8.1, Video Input: Summer Nature 4K (FPS, More Is Better): r1 112.75 (SE +/- 1.06, N = 6; MIN: 99.69 / MAX: 158.99); r3 112.65 (SE +/- 1.07, N = 6; MIN: 99.62 / MAX: 158.58); r2 112.03 (SE +/- 1.08, N = 6; MIN: 99.17 / MAX: 157.08). (CC) gcc options: -pthread
dav1d 0.8.1, Video Input: Summer Nature 1080p (FPS, More Is Better): r1 460.02 (SE +/- 3.60, N = 14; MIN: 375.05 / MAX: 590.01); r3 459.71 (SE +/- 3.80, N = 13; MIN: 374.63 / MAX: 587.93); r2 459.61 (SE +/- 3.46, N = 13; MIN: 374.03 / MAX: 582.97). (CC) gcc options: -pthread
dav1d 0.8.1, Video Input: Chimera 1080p 10-bit (FPS, More Is Better): r1 86.08 (SE +/- 0.99, N = 4; MIN: 54.34 / MAX: 256.39); r3 85.95 (SE +/- 1.03, N = 4; MIN: 54.21 / MAX: 255.72); r2 85.83 (SE +/- 1.05, N = 4; MIN: 54.27 / MAX: 257.58). (CC) gcc options: -pthread
Embree 3.9.0, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): r2 6.0989 (SE +/- 0.0737, N = 3; MIN: 5.88 / MAX: 10.98); r1 6.0806 (SE +/- 0.0766, N = 3; MIN: 5.86 / MAX: 11.02); r3 6.0641 (SE +/- 0.0667, N = 3; MIN: 5.86 / MAX: 10.95).
Embree 3.9.0, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): r1 7.0735 (SE +/- 0.0830, N = 3; MIN: 6.66 / MAX: 12.73); r3 6.9976 (SE +/- 0.0756, N = 5; MIN: 6.56 / MAX: 12.56); r2 6.9794 (SE +/- 0.0728, N = 4; MIN: 6.57 / MAX: 12.32).
Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): r2 7.5656 (SE +/- 0.0719, N = 3; MIN: 7.18 / MAX: 12.51); r1 7.5555 (SE +/- 0.0643, N = 3; MIN: 7.18 / MAX: 12.55); r3 7.5496 (SE +/- 0.0754, N = 3; MIN: 7.19 / MAX: 12.66).
Embree 3.9.0, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): r2 9.2596 (SE +/- 0.0236, N = 3; MIN: 8.82 / MAX: 14.99); r3 9.1967 (SE +/- 0.1308, N = 3; MIN: 8.85 / MAX: 15); r1 9.1343 (SE +/- 0.0822, N = 3; MIN: 8.81 / MAX: 15.06).
rav1e 0.4 Alpha, Speed: 1 (Frames Per Second, More Is Better): r3 0.347 (SE +/- 0.003, N = 3); r1 0.347 (SE +/- 0.002, N = 3); r2 0.346 (SE +/- 0.002, N = 3).
rav1e 0.4 Alpha, Speed: 5 (Frames Per Second, More Is Better): r1 1.069 (SE +/- 0.005, N = 3); r3 1.064 (SE +/- 0.003, N = 3); r2 1.064 (SE +/- 0.004, N = 3).
rav1e 0.4 Alpha, Speed: 6 (Frames Per Second, More Is Better): r1 1.444 (SE +/- 0.010, N = 3); r3 1.443 (SE +/- 0.012, N = 3); r2 1.440 (SE +/- 0.006, N = 3).
rav1e 0.4 Alpha, Speed: 10 (Frames Per Second, More Is Better): r1 3.422 (SE +/- 0.044, N = 3); r3 3.420 (SE +/- 0.027, N = 3); r2 3.404 (SE +/- 0.035, N = 3).
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better): r3 223892.44 (SE +/- 2209.16, N = 3); r1 223414.81 (SE +/- 2532.07, N = 3); r2 223304.98 (SE +/- 1894.03, N = 3). (CC) gcc options: -O2 -lrt -lrt
Stockfish 12, Total Time (Nodes Per Second, More Is Better): r2 9839292 (SE +/- 85742.14, N = 3); r1 9703133 (SE +/- 85083.98, N = 8); r3 9629353 (SE +/- 67987.28, N = 12). (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
asmFish 2018-07-23, 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better): r3 16180674 (SE +/- 142852.80, N = 3); r1 15984719 (SE +/- 174263.56, N = 3); r2 15974611 (SE +/- 148124.86, N = 3).
Timed FFmpeg Compilation 4.2.2, Time To Compile (Seconds, Fewer Is Better): r3 100.20 (SE +/- 0.30, N = 3); r1 100.26 (SE +/- 0.78, N = 3); r2 100.40 (SE +/- 0.39, N = 3).
Timed Linux Kernel Compilation 5.4, Time To Compile (Seconds, Fewer Is Better): r3 151.48 (SE +/- 0.75, N = 3); r1 151.66 (SE +/- 0.33, N = 3); r2 152.21 (SE +/- 0.24, N = 3).
Build2 0.13, Time To Compile (Seconds, Fewer Is Better): r1 210.05 (SE +/- 0.40, N = 3); r2 210.71 (SE +/- 0.49, N = 3); r3 210.95 (SE +/- 0.85, N = 3).
Numpy Benchmark (Score, More Is Better): r1 419.58 (SE +/- 1.54, N = 3); r2 419.36 (SE +/- 0.84, N = 3); r3 417.03 (SE +/- 0.70, N = 3).
Timed Eigen Compilation 3.3.9, Time To Compile (Seconds, Fewer Is Better): r2 67.54 (SE +/- 0.30, N = 3); r3 68.70 (SE +/- 0.22, N = 3); r1 68.74 (SE +/- 0.16, N = 3).
DeepSpeech 0.6, Acceleration: CPU (Seconds, Fewer Is Better): r3 81.04 (SE +/- 0.04, N = 3); r2 81.07 (SE +/- 0.04, N = 3); r1 81.30 (SE +/- 0.21, N = 3).
Monkey Audio Encoding 3.99.6, WAV To APE (Seconds, Fewer Is Better): r1 10.51 (SE +/- 0.03, N = 5); r3 10.59 (SE +/- 0.01, N = 5); r2 10.86 (SE +/- 0.04, N = 5). (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better): r2 7.602 (SE +/- 0.004, N = 5); r3 7.616 (SE +/- 0.008, N = 5); r1 7.624 (SE +/- 0.009, N = 5). (CXX) g++ options: -fvisibility=hidden -logg -lm
eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better): r1 26.47 (SE +/- 0.29, N = 4); r2 27.18 (SE +/- 0.12, N = 4); r3 27.71 (SE +/- 0.04, N = 4). (CC) gcc options: -O2 -std=c99 -lpthread -lm
RNNoise 2020-06-28 (Seconds, Fewer Is Better): r2 21.32 (SE +/- 0.04, N = 3); r3 22.04 (SE +/- 0.02, N = 3); r1 22.08 (SE +/- 0.06, N = 3). (CC) gcc options: -O2 -pedantic -fvisibility=hidden -lm
Node.js V8 Web Tooling Benchmark (runs/s, More Is Better): r3 13.18 (SE +/- 0.11, N = 3); r2 13.17 (SE +/- 0.11, N = 3); r1 13.06 (SE +/- 0.14, N = 3). Nodejs v10.19.0
Cryptsetup, PBKDF2-sha512 (Iterations Per Second, More Is Better): r2 1943008 (SE +/- 1201.00, N = 3); r1 1919349 (SE +/- 7117.07, N = 3); r3 1886103 (SE +/- 12877.64, N = 3).
Cryptsetup, PBKDF2-whirlpool (Iterations Per Second, More Is Better): r2 830020 (SE +/- 2314.28, N = 3); r1 816282 (SE +/- 4903.32, N = 3); r3 810352 (SE +/- 2497.33, N = 3).
Cryptsetup, AES-XTS 256b Encryption (MiB/s, More Is Better): r2 4080.5 (SE +/- 25.91, N = 3); r3 4023.0 (SE +/- 20.10, N = 3); r1 4005.6 (SE +/- 1.66, N = 3).
Cryptsetup, AES-XTS 256b Decryption (MiB/s, More Is Better): r2 4055.1 (SE +/- 17.20, N = 3); r3 4026.9 (SE +/- 15.07, N = 3); r1 4002.4 (SE +/- 4.92, N = 3).
Cryptsetup, Serpent-XTS 256b Encryption (MiB/s, More Is Better): r2 881.4 (SE +/- 1.25, N = 3); r3 874.1 (SE +/- 2.67, N = 3); r1 874.1 (SE +/- 0.92, N = 3).
Cryptsetup, Serpent-XTS 256b Decryption (MiB/s, More Is Better): r2 876.6 (SE +/- 1.50, N = 3); r1 872.3 (SE +/- 1.62, N = 3); r3 870.9 (SE +/- 4.03, N = 3).
Cryptsetup, Twofish-XTS 256b Encryption (MiB/s, More Is Better): r2 487.4 (SE +/- 1.08, N = 3); r3 483.6 (SE +/- 2.51, N = 3); r1 482.0 (SE +/- 0.75, N = 3).
Cryptsetup, Twofish-XTS 256b Decryption (MiB/s, More Is Better): r2 486.3 (SE +/- 1.43, N = 3); r3 483.0 (SE +/- 2.21, N = 3); r1 482.5 (SE +/- 0.34, N = 3).
Cryptsetup, AES-XTS 512b Encryption (MiB/s, More Is Better): r2 3381.9 (SE +/- 15.69, N = 3); r1 3346.8 (SE +/- 3.15, N = 3); r3 3336.0 (SE +/- 25.61, N = 3).
Cryptsetup, AES-XTS 512b Decryption (MiB/s, More Is Better): r2 3388.5 (SE +/- 10.03, N = 3); r3 3362.9 (SE +/- 13.02, N = 3); r1 3348.3 (SE +/- 1.21, N = 3).
Cryptsetup, Serpent-XTS 512b Encryption (MiB/s, More Is Better): r2 882.1 (SE +/- 0.87, N = 3); r1 878.0 (SE +/- 0.83, N = 3); r3 874.4 (SE +/- 4.25, N = 3).
Cryptsetup, Serpent-XTS 512b Decryption (MiB/s, More Is Better): r2 878.1 (SE +/- 1.17, N = 3); r3 873.5 (SE +/- 4.24, N = 3); r1 871.7 (SE +/- 1.28, N = 3).
Cryptsetup, Twofish-XTS 512b Decryption (MiB/s, More Is Better): r2 485.7 (SE +/- 1.44, N = 3); r3 483.0 (SE +/- 2.34, N = 3); r1 482.7 (SE +/- 0.10, N = 3).
Cryptsetup, Twofish-XTS 512b Encryption (MiB/s, More Is Better): r2 486.4 (SE +/- 0.97, N = 3); r3 483.8 (SE +/- 2.12, N = 3); r1 483.0 (SE +/- 0.30, N = 2).
GROMACS 2020.3, Water Benchmark (Ns Per Day, More Is Better): r1 0.617 (SE +/- 0.003, N = 3); r3 0.614 (SE +/- 0.002, N = 3); r2 0.610 (SE +/- 0.004, N = 3). (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
TensorFlow Lite 2020-08-23, Model: SqueezeNet (Microseconds, Fewer Is Better): r1 354892 (SE +/- 2566.21, N = 3); r2 356034 (SE +/- 2576.61, N = 3); r3 356258 (SE +/- 2539.06, N = 3).
TensorFlow Lite 2020-08-23, Model: Inception V4 (Microseconds, Fewer Is Better): r1 5163190 (SE +/- 5618.75, N = 3); r2 5168183 (SE +/- 7685.69, N = 3); r3 5178263 (SE +/- 8609.77, N = 3).
TensorFlow Lite 2020-08-23, Model: NASNet Mobile (Microseconds, Fewer Is Better): r1 302594 (SE +/- 3140.84, N = 3); r3 304079 (SE +/- 1284.72, N = 3); r2 304756 (SE +/- 2025.87, N = 3).
TensorFlow Lite 2020-08-23, Model: Mobilenet Float (Microseconds, Fewer Is Better): r1 239119 (SE +/- 1996.41, N = 3); r2 239224 (SE +/- 1820.00, N = 3); r3 239537 (SE +/- 1638.46, N = 3).
TensorFlow Lite 2020-08-23, Model: Mobilenet Quant (Microseconds, Fewer Is Better): r1 236716 (SE +/- 1686.36, N = 3); r2 237129 (SE +/- 1668.46, N = 3); r3 237406 (SE +/- 1810.35, N = 3).
TensorFlow Lite 2020-08-23, Model: Inception ResNet V2 (Microseconds, Fewer Is Better): r1 4660197 (SE +/- 8775.31, N = 3); r2 4670567 (SE +/- 8796.49, N = 3); r3 4677473 (SE +/- 8398.83, N = 3).
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better): r1 5.44 (SE +/- 0.05, N = 3); r2 5.59 (SE +/- 0.08, N = 3); r3 5.63 (SE +/- 0.04, N = 12). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better): r3 7.58 (SE +/- 0.16, N = 15); r2 7.61 (SE +/- 0.11, N = 15); r1 7.68 (SE +/- 0.14, N = 15). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better): r1 54.29 (SE +/- 0.54, N = 3); r2 54.38 (SE +/- 0.54, N = 3); r3 54.65 (SE +/- 0.42, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Exhaustive (Seconds, Fewer Is Better): r1 447.99 (SE +/- 0.52, N = 3); r2 449.37 (SE +/- 0.81, N = 3); r3 449.90 (SE +/- 0.54, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal 1.12, Settings: ETC1S (Seconds, Fewer Is Better): r1 57.82 (SE +/- 0.38, N = 3); r3 58.06 (SE +/- 0.56, N = 3); r2 58.06 (SE +/- 0.15, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 0 (Seconds, Fewer Is Better): r1 7.288 (SE +/- 0.079, N = 3); r2 7.345 (SE +/- 0.061, N = 3); r3 7.353 (SE +/- 0.095, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 (Seconds, Fewer Is Better): r1 55.50 (SE +/- 0.55, N = 3); r2 55.74 (SE +/- 0.41, N = 3); r3 55.77 (SE +/- 0.58, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 3 (Seconds, Fewer Is Better): r1 110.84 (SE +/- 0.55, N = 3); r2 110.93 (SE +/- 0.55, N = 3); r3 111.04 (SE +/- 0.53, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better): r2 840.32 (SE +/- 0.35, N = 3); r1 840.35 (SE +/- 0.74, N = 3); r3 841.23 (SE +/- 0.62, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
SQLite Speedtest 3.30, Timed Time - Size 1,000 (Seconds, Fewer Is Better): r1 49.55 (SE +/- 0.25, N = 3); r3 50.26 (SE +/- 0.13, N = 3); r2 50.27 (SE +/- 0.17, N = 3). (CC) gcc options: -O2 -ldl -lz -lpthread
Darktable 3.0.1, Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better): r3 15.86 (SE +/- 0.03, N = 3); r2 15.87 (SE +/- 0.04, N = 3); r1 15.91 (SE +/- 0.02, N = 3).
Darktable 3.0.1, Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better): r1 7.128 (SE +/- 0.097, N = 12); r2 7.150 (SE +/- 0.096, N = 12); r3 7.155 (SE +/- 0.099, N = 12).
Darktable 3.0.1, Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better): r1 0.181 (SE +/- 0.000, N = 3); r2 0.181 (SE +/- 0.000, N = 3); r3 0.181 (SE +/- 0.000, N = 3).
Darktable 3.0.1, Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better): r2 4.174 (SE +/- 0.004, N = 3); r3 4.178 (SE +/- 0.006, N = 3); r1 4.181 (SE +/- 0.010, N = 3).
GEGL, Operation: Crop (Seconds, Fewer Is Better): r3 8.826 (SE +/- 0.077, N = 8); r2 8.839 (SE +/- 0.073, N = 9); r1 8.900 (SE +/- 0.065, N = 11)
GEGL, Operation: Scale (Seconds, Fewer Is Better): r1 6.954 (SE +/- 0.055, N = 12); r2 6.973 (SE +/- 0.059, N = 13); r3 7.000 (SE +/- 0.056, N = 14)
GEGL, Operation: Cartoon (Seconds, Fewer Is Better): r1 86.79 (SE +/- 0.12, N = 3); r3 86.99 (SE +/- 0.09, N = 3); r2 87.32 (SE +/- 0.19, N = 3)
GEGL, Operation: Reflect (Seconds, Fewer Is Better): r1 28.18 (SE +/- 0.29, N = 3); r3 28.31 (SE +/- 0.22, N = 3); r2 28.50 (SE +/- 0.30, N = 3)
GEGL, Operation: Antialias (Seconds, Fewer Is Better): r1 36.56 (SE +/- 0.45, N = 3); r2 36.56 (SE +/- 0.35, N = 3); r3 36.65 (SE +/- 0.38, N = 3)
GEGL, Operation: Tile Glass (Seconds, Fewer Is Better): r3 28.06 (SE +/- 0.39, N = 3); r2 28.24 (SE +/- 0.27, N = 3); r1 28.24 (SE +/- 0.36, N = 3)
GEGL, Operation: Wavelet Blur (Seconds, Fewer Is Better): r3 57.84 (SE +/- 0.25, N = 3); r2 57.95 (SE +/- 0.39, N = 3); r1 57.99 (SE +/- 0.25, N = 3)
GEGL, Operation: Color Enhance (Seconds, Fewer Is Better): r3 54.10 (SE +/- 0.28, N = 3); r1 54.11 (SE +/- 0.22, N = 3); r2 54.31 (SE +/- 0.04, N = 3)
GEGL, Operation: Rotate 90 Degrees (Seconds, Fewer Is Better): r2 37.54 (SE +/- 0.36, N = 3); r3 37.69 (SE +/- 0.43, N = 3); r1 37.70 (SE +/- 0.31, N = 3)
Inkscape, Operation: SVG Files To PNG (Seconds, Fewer Is Better): r1 21.00 (SE +/- 0.04, N = 3); r2 21.05 (SE +/- 0.02, N = 3); r3 21.07 (SE +/- 0.03, N = 3)
1. Inkscape 0.92.5 (2060ec1f9f, 2020-04-08)
RawTherapee, Total Benchmark Time (Seconds, Fewer Is Better): r1 80.59 (SE +/- 0.53, N = 3); r3 80.71 (SE +/- 0.45, N = 3); r2 80.93 (SE +/- 0.46, N = 3)
1. RawTherapee, version 5.8, command line.
Redis 6.0.9, Test: LPOP (Requests Per Second, More Is Better): r1 3394660.20 (SE +/- 36042.05, N = 3); r3 2809233.48 (SE +/- 181152.66, N = 12); r2 2104092.33 (SE +/- 3702.86, N = 3)
Redis 6.0.9, Test: SADD (Requests Per Second, More Is Better): r1 2660539.42 (SE +/- 28020.60, N = 3); r3 2634908.83 (SE +/- 27994.25, N = 3); r2 2628039.25 (SE +/- 23332.27, N = 15)
Redis 6.0.9, Test: LPUSH (Requests Per Second, More Is Better): r2 2094056.31 (SE +/- 21753.96, N = 4); r3 2083566.29 (SE +/- 8925.21, N = 3); r1 2041750.08 (SE +/- 25221.07, N = 3)
Redis 6.0.9, Test: GET (Requests Per Second, More Is Better): r1 3248596.08 (SE +/- 41615.25, N = 3); r2 3012560.83 (SE +/- 13828.40, N = 3); r3 3009326.75 (SE +/- 8077.93, N = 3)
Redis 6.0.9, Test: SET (Requests Per Second, More Is Better): r3 2433543.80 (SE +/- 6859.51, N = 3); r2 2413657.00 (SE +/- 3903.32, N = 3); r1 2375800.25 (SE +/- 17218.21, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
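The Redis runs differ far more between r1, r2 and r3 than most other tests in this file. As a rough way to tell run-to-run noise from a real gap, the sketch below compares the best and worst run of a test against their combined standard error. The figures are copied from the Redis results above; the function name `spread` and the cut-off of two combined standard errors are assumptions made for this illustration, not something the Phoronix Test Suite reports.

```python
import math

# Redis 6.0.9 results copied from the lines above: run -> (mean requests/s, SE).
redis = {
    "LPOP": {"r1": (3394660.20, 36042.05), "r3": (2809233.48, 181152.66), "r2": (2104092.33, 3702.86)},
    "SADD": {"r1": (2660539.42, 28020.60), "r3": (2634908.83, 27994.25), "r2": (2628039.25, 23332.27)},
    "SET": {"r3": (2433543.80, 6859.51), "r2": (2413657.00, 3903.32), "r1": (2375800.25, 17218.21)},
}


def spread(runs, k=2.0):
    """Compare the best and worst run of one test.

    Returns (relative_gap, exceeds_noise). exceeds_noise is True when the gap
    is larger than k combined standard errors (k = 2 is an assumed cut-off).
    """
    best = max(runs.values())   # tuples compare by mean first
    worst = min(runs.values())
    gap = best[0] - worst[0]
    combined_se = math.hypot(best[1], worst[1])
    return gap / worst[0], gap > k * combined_se


for test, runs in redis.items():
    rel, beyond_noise = spread(runs)
    print(f"{test}: best run is {rel:.1%} above worst; beyond noise: {beyond_noise}")
```

With these inputs the LPOP and SET gaps exceed the assumed cut-off while the SADD gap does not, so only the former two look like genuine differences between runs.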
Mobile Neural Network 2020-09-17, Model: SqueezeNetV1.0 (ms, Fewer Is Better): r1 8.899 (SE +/- 0.373, N = 10, MIN: 4.96 / MAX: 31.21); r3 8.944 (SE +/- 0.373, N = 10, MIN: 5.01 / MAX: 31.89); r2 8.982 (SE +/- 0.316, N = 11, MIN: 5.05 / MAX: 31.35)
Mobile Neural Network 2020-09-17, Model: resnet-v2-50 (ms, Fewer Is Better): r1 58.16 (SE +/- 0.40, N = 10, MIN: 36.86 / MAX: 81.73); r2 58.53 (SE +/- 0.35, N = 11, MIN: 37.33 / MAX: 83.74); r3 58.79 (SE +/- 0.40, N = 10, MIN: 36.87 / MAX: 85.77)
Mobile Neural Network 2020-09-17, Model: MobileNetV2_224 (ms, Fewer Is Better): r1 5.239 (SE +/- 0.210, N = 10, MIN: 3.19 / MAX: 26.27); r3 5.285 (SE +/- 0.209, N = 10, MIN: 3.27 / MAX: 26.82); r2 5.291 (SE +/- 0.185, N = 11, MIN: 3.3 / MAX: 27.38)
Mobile Neural Network 2020-09-17, Model: mobilenet-v1-1.0 (ms, Fewer Is Better): r1 10.65 (SE +/- 0.01, N = 10, MIN: 10.33 / MAX: 34.53); r3 10.66 (SE +/- 0.01, N = 10, MIN: 10.33 / MAX: 32.25); r2 10.68 (SE +/- 0.01, N = 11, MIN: 10.35 / MAX: 33.35)
Mobile Neural Network 2020-09-17, Model: inception-v3 (ms, Fewer Is Better): r1 62.57 (SE +/- 0.15, N = 10, MIN: 60.82 / MAX: 96.05); r2 63.18 (SE +/- 0.18, N = 11, MIN: 61.02 / MAX: 104.39); r3 63.56 (SE +/- 0.22, N = 10, MIN: 60.92 / MAX: 102.85)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20201218, Target: CPU - Model: mobilenet (ms, Fewer Is Better): r3 26.53 (SE +/- 0.02, N = 3, MIN: 25.78 / MAX: 41.25); r1 26.62 (SE +/- 0.17, N = 3, MIN: 25.69 / MAX: 38.05); r2 26.63 (SE +/- 0.01, N = 3, MIN: 25.7 / MAX: 41.21)
NCNN 20201218, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): r2 7.22 (SE +/- 0.73, N = 3, MIN: 5.54 / MAX: 12.03); r3 7.23 (SE +/- 0.73, N = 3, MIN: 5.55 / MAX: 12.3); r1 7.31 (SE +/- 0.67, N = 3, MIN: 5.51 / MAX: 16.43)
NCNN 20201218, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): r1 5.74 (SE +/- 0.65, N = 3, MIN: 4.3 / MAX: 7.75); r2 5.81 (SE +/- 0.65, N = 3, MIN: 4.43 / MAX: 17.76); r3 5.81 (SE +/- 0.62, N = 3, MIN: 4.48 / MAX: 10.59)
NCNN 20201218, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): r2 6.95 (SE +/- 0.94, N = 3, MIN: 5.01 / MAX: 9.68); r3 7.03 (SE +/- 0.95, N = 3, MIN: 5.04 / MAX: 20.64); r1 7.93 (SE +/- 0.03, N = 3, MIN: 7.52 / MAX: 16.61)
NCNN 20201218, Target: CPU - Model: mnasnet (ms, Fewer Is Better): r2 5.96 (SE +/- 0.75, N = 3, MIN: 4.32 / MAX: 14.32); r3 5.96 (SE +/- 0.74, N = 3, MIN: 4.33 / MAX: 28.21); r1 6.67 (SE +/- 0.02, N = 3, MIN: 5.99 / MAX: 21.18)
NCNN 20201218, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): r2 9.05 (SE +/- 0.96, N = 3, MIN: 6.99 / MAX: 21.76); r3 9.06 (SE +/- 0.96, N = 3, MIN: 7.04 / MAX: 12.38); r1 10.00 (SE +/- 0.05, N = 3, MIN: 9.46 / MAX: 24.32)
NCNN 20201218, Target: CPU - Model: blazeface (ms, Fewer Is Better): r1 2.54 (SE +/- 0.00, N = 3, MIN: 2.35 / MAX: 2.74); r3 2.57 (SE +/- 0.02, N = 3, MIN: 2.45 / MAX: 2.83); r2 2.60 (SE +/- 0.05, N = 3, MIN: 2.45 / MAX: 10.37)
NCNN 20201218, Target: CPU - Model: googlenet (ms, Fewer Is Better): r1 19.98 (SE +/- 0.01, N = 3, MIN: 18.95 / MAX: 23.24); r2 20.01 (SE +/- 0.08, N = 3, MIN: 18.96 / MAX: 24.67); r3 20.21 (SE +/- 0.06, N = 3, MIN: 19.11 / MAX: 32.7)
NCNN 20201218, Target: CPU - Model: vgg16 (ms, Fewer Is Better): r3 71.86 (SE +/- 0.12, N = 3, MIN: 70.48 / MAX: 88); r2 71.91 (SE +/- 0.03, N = 3, MIN: 70.43 / MAX: 92.47); r1 72.09 (SE +/- 0.20, N = 3, MIN: 70.5 / MAX: 88.28)
NCNN 20201218, Target: CPU - Model: resnet18 (ms, Fewer Is Better): r1 18.62 (SE +/- 0.02, N = 3, MIN: 17.08 / MAX: 32.57); r3 18.66 (SE +/- 0.02, N = 3, MIN: 17.05 / MAX: 30.94); r2 18.71 (SE +/- 0.04, N = 3, MIN: 17.06 / MAX: 33.58)
NCNN 20201218, Target: CPU - Model: alexnet (ms, Fewer Is Better): r2 15.46 (SE +/- 0.04, N = 3, MIN: 14.35 / MAX: 27.24); r3 15.49 (SE +/- 0.03, N = 3, MIN: 14.41 / MAX: 24.83); r1 15.50 (SE +/- 0.08, N = 3, MIN: 14.41 / MAX: 55.15)
NCNN 20201218, Target: CPU - Model: resnet50 (ms, Fewer Is Better): r3 37.22 (SE +/- 0.05, N = 3, MIN: 33.9 / MAX: 52.84); r2 37.30 (SE +/- 0.03, N = 3, MIN: 33.91 / MAX: 56.28); r1 37.81 (SE +/- 0.51, N = 3, MIN: 34.04 / MAX: 52.8)
NCNN 20201218, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): r2 35.59 (SE +/- 0.05, N = 3, MIN: 34.42 / MAX: 51.24); r3 35.66 (SE +/- 0.02, N = 3, MIN: 34.45 / MAX: 49.15); r1 35.95 (SE +/- 0.48, N = 3, MIN: 34.4 / MAX: 55.63)
NCNN 20201218, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): r2 27.51 (SE +/- 0.03, N = 3, MIN: 26.93 / MAX: 43.6); r3 27.63 (SE +/- 0.05, N = 3, MIN: 27.02 / MAX: 46.56); r1 27.64 (SE +/- 0.14, N = 3, MIN: 27 / MAX: 40.14)
NCNN 20201218, Target: CPU - Model: regnety_400m (ms, Fewer Is Better): r2 18.91 (SE +/- 0.24, N = 3, MIN: 13.5 / MAX: 30.63); r1 19.16 (SE +/- 0.06, N = 3, MIN: 18.07 / MAX: 22.36); r3 19.38 (SE +/- 0.10, N = 3, MIN: 14.45 / MAX: 42.2)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): r3 26.51 (SE +/- 0.07, N = 3, MIN: 25.69 / MAX: 45.35); r1 26.52 (SE +/- 0.02, N = 3, MIN: 25.69 / MAX: 43.81); r2 26.53 (SE +/- 0.02, N = 3, MIN: 25.76 / MAX: 43.91)
NCNN 20201218, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): r3 7.19 (SE +/- 0.73, N = 3, MIN: 5.52 / MAX: 9.67); r2 7.22 (SE +/- 0.79, N = 3, MIN: 5.41 / MAX: 20.72); r1 7.23 (SE +/- 0.74, N = 3, MIN: 5.54 / MAX: 9.59)
NCNN 20201218, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): r2 5.73 (SE +/- 0.65, N = 3, MIN: 4.33 / MAX: 10.47); r1 5.74 (SE +/- 0.62, N = 3, MIN: 4.43 / MAX: 9.64); r3 5.81 (SE +/- 0.64, N = 3, MIN: 4.41 / MAX: 25.12)
NCNN 20201218, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): r2 6.98 (SE +/- 0.96, N = 3, MIN: 4.98 / MAX: 27.09); r3 7.05 (SE +/- 0.93, N = 3, MIN: 5.04 / MAX: 20.37); r1 7.92 (SE +/- 0.07, N = 3, MIN: 7.27 / MAX: 20.3)
NCNN 20201218, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): r2 5.86 (SE +/- 0.71, N = 3, MIN: 4.3 / MAX: 15.47); r3 5.91 (SE +/- 0.76, N = 3, MIN: 4.32 / MAX: 7.94); r1 6.63 (SE +/- 0.00, N = 3, MIN: 6.21 / MAX: 8.85)
NCNN 20201218, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): r3 8.99 (SE +/- 0.94, N = 3, MIN: 6.99 / MAX: 13.79); r2 9.02 (SE +/- 0.95, N = 3, MIN: 7 / MAX: 19.29); r1 10.01 (SE +/- 0.10, N = 3, MIN: 9.44 / MAX: 29.57)
NCNN 20201218, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): r2 2.29 (SE +/- 0.26, N = 3, MIN: 1.68 / MAX: 8.91); r3 2.29 (SE +/- 0.25, N = 3, MIN: 1.69 / MAX: 12.73); r1 2.55 (SE +/- 0.02, N = 3, MIN: 2.43 / MAX: 2.76)
NCNN 20201218, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): r2 18.20 (SE +/- 1.77, N = 3, MIN: 14.26 / MAX: 31.74); r3 18.26 (SE +/- 1.84, N = 3, MIN: 14.28 / MAX: 36.09); r1 20.05 (SE +/- 0.06, N = 3, MIN: 18.94 / MAX: 32.96)
NCNN 20201218, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): r2 71.82 (SE +/- 0.04, N = 3, MIN: 70.37 / MAX: 86.67); r3 71.86 (SE +/- 0.02, N = 3, MIN: 70.4 / MAX: 88.5); r1 71.96 (SE +/- 0.05, N = 3, MIN: 70.52 / MAX: 88.3)
NCNN 20201218, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): r2 18.33 (SE +/- 0.34, N = 3, MIN: 14.43 / MAX: 32.39); r3 18.38 (SE +/- 0.27, N = 3, MIN: 14.4 / MAX: 32.57); r1 18.62 (SE +/- 0.00, N = 3, MIN: 17.13 / MAX: 20.97)
NCNN 20201218, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): r1 15.44 (SE +/- 0.03, N = 3, MIN: 14.41 / MAX: 26.42); r3 15.50 (SE +/- 0.05, N = 3, MIN: 14.41 / MAX: 26.23); r2 15.53 (SE +/- 0.04, N = 3, MIN: 14.41 / MAX: 25.62)
NCNN 20201218, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): r1 37.25 (SE +/- 0.03, N = 3, MIN: 34.07 / MAX: 48.19); r3 37.26 (SE +/- 0.01, N = 3, MIN: 33.79 / MAX: 52.48); r2 37.34 (SE +/- 0.06, N = 3, MIN: 33.97 / MAX: 56.32)
NCNN 20201218, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better): r2 35.51 (SE +/- 0.01, N = 3, MIN: 33.05 / MAX: 50.05); r1 35.52 (SE +/- 0.03, N = 3, MIN: 34.38 / MAX: 51.44); r3 35.53 (SE +/- 0.05, N = 3, MIN: 32.99 / MAX: 52.01)
NCNN 20201218, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): r2 27.52 (SE +/- 0.02, N = 3, MIN: 26.95 / MAX: 42.6); r3 27.55 (SE +/- 0.02, N = 3, MIN: 26.92 / MAX: 41.99); r1 27.58 (SE +/- 0.02, N = 3, MIN: 26.94 / MAX: 43.23)
NCNN 20201218, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better): r2 17.15 (SE +/- 1.83, N = 3, MIN: 13.3 / MAX: 38.12); r3 17.60 (SE +/- 1.77, N = 3, MIN: 13.79 / MAX: 32.97); r1 19.16 (SE +/- 0.09, N = 3, MIN: 17.94 / MAX: 21.24)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
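The NCNN results are reported once for the CPU target and once for the Vulkan GPU target. A small sketch of how the two can be set side by side for one run is given below. The latencies are the r2 averages copied from the results above for four of the models; the helper name `speedup` is hypothetical and chosen only for this example.

```python
# r2 average latency in ms, copied from the NCNN results above:
# model -> (Target: CPU, Target: Vulkan GPU).
r2_ms = {
    "mobilenet": (26.63, 26.53),
    "googlenet": (20.01, 18.20),
    "vgg16": (71.91, 71.82),
    "regnety_400m": (18.91, 17.15),
}


def speedup(cpu_ms, gpu_ms):
    """Latency ratio; values above 1.0 mean the Vulkan GPU target was faster."""
    return cpu_ms / gpu_ms


ratios = {model: speedup(*pair) for model, pair in r2_ms.items()}
for model, ratio in ratios.items():
    print(f"{model}: {ratio:.2f}x")
```

For these four models the two targets stay within roughly 10% of each other in run r2.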
TNN 0.2.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better): r2 295.55 (SE +/- 0.81, N = 3, MIN: 292.39 / MAX: 306.56); r3 299.40 (SE +/- 0.36, N = 3, MIN: 297.92 / MAX: 315.55); r1 321.42 (SE +/- 2.78, N = 8, MIN: 300.42 / MAX: 371.06)
TNN 0.2.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): r2 264.95 (SE +/- 0.11, N = 3, MIN: 264.07 / MAX: 268.01); r3 272.68 (SE +/- 0.12, N = 3, MIN: 271.53 / MAX: 277.6); r1 272.91 (SE +/- 1.46, N = 3, MIN: 264.43 / MAX: 277.05)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
PlaidML, FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better): r3 478.73 (SE +/- 2.86, N = 3); r2 477.39 (SE +/- 1.92, N = 3); r1 463.34 (SE +/- 0.36, N = 3)
PlaidML, FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better): r3 1247.93 (SE +/- 4.92, N = 3); r1 1246.78 (SE +/- 3.10, N = 3); r2 1244.95 (SE +/- 2.03, N = 3)
PlaidML, FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better): r2 1823.06 (SE +/- 3.54, N = 3); r1 1819.24 (SE +/- 7.57, N = 3); r3 1817.78 (SE +/- 8.53, N = 3)
PlaidML, FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better): r1 110.07 (SE +/- 0.19, N = 3); r3 109.99 (SE +/- 0.40, N = 3); r2 109.98 (SE +/- 0.42, N = 3)
OpenVINO 2021.1, Model: Face Detection 0106 FP16 - Device: CPU (FPS, More Is Better): r3 1.28 (SE +/- 0.01, N = 3); r2 1.28 (SE +/- 0.01, N = 3); r1 1.28 (SE +/- 0.01, N = 3)
OpenVINO 2021.1, Model: Face Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better): r3 3164.51 (SE +/- 7.78, N = 3); r1 3165.24 (SE +/- 4.35, N = 3); r2 3166.57 (SE +/- 3.88, N = 3)
OpenVINO 2021.1, Model: Face Detection 0106 FP32 - Device: CPU (FPS, More Is Better): r3 1.27 (SE +/- 0.02, N = 3); r2 1.27 (SE +/- 0.02, N = 4); r1 1.26 (SE +/- 0.01, N = 3)
OpenVINO 2021.1, Model: Face Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better): r1 3202.53 (SE +/- 2.58, N = 3); r2 3207.35 (SE +/- 1.22, N = 4); r3 3212.10 (SE +/- 2.51, N = 3)
OpenVINO 2021.1, Model: Person Detection 0106 FP16 - Device: CPU (FPS, More Is Better): r3 0.80 (SE +/- 0.01, N = 3); r2 0.80 (SE +/- 0.01, N = 3); r1 0.80 (SE +/- 0.00, N = 3)
OpenVINO 2021.1, Model: Person Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better): r1 4961.99 (SE +/- 4.97, N = 3); r2 4978.25 (SE +/- 19.24, N = 3); r3 5006.34 (SE +/- 4.20, N = 3)
OpenVINO 2021.1, Model: Person Detection 0106 FP32 - Device: CPU (FPS, More Is Better): r3 0.79 (SE +/- 0.01, N = 5); r2 0.79 (SE +/- 0.01, N = 9); r1 0.79 (SE +/- 0.01, N = 3)
OpenVINO 2021.1, Model: Person Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better): r1 5069.44 (SE +/- 15.43, N = 3); r3 5073.09 (SE +/- 14.45, N = 5); r2 5079.89 (SE +/- 9.68, N = 9)
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): r1 3442.78 (SE +/- 33.67, N = 3); r3 3405.92 (SE +/- 34.05, N = 6); r2 3403.45 (SE +/- 38.35, N = 4)
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): r1 1.17 (SE +/- 0.00, N = 3); r2 1.19 (SE +/- 0.00, N = 4); r3 1.19 (SE +/- 0.00, N = 6)
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (FPS, More Is Better): r1 3363.55 (SE +/- 35.01, N = 3); r3 3347.93 (SE +/- 40.89, N = 4); r2 3307.53 (SE +/- 33.23, N = 5)
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (ms, Fewer Is Better): r1 1.21 (SE +/- 0.00, N = 3); r3 1.22 (SE +/- 0.00, N = 4); r2 1.23 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
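Each OpenVINO model is reported twice, once as throughput (FPS) and once as latency (ms), and the two are not simple reciprocals: 1.28 FPS would imply about 781 ms per frame, yet the reported latency is over 3100 ms. One way to reconcile them is Little's law (average requests in flight = throughput × latency). The sketch below applies it to the r1 FP16 figures copied from the results above. The export does not state how many inference requests the benchmark keeps in flight, so the result is an inference from the numbers, and the function name `in_flight` is hypothetical.

```python
# r1 figures copied from the OpenVINO results above: model -> (FPS, latency in ms).
r1 = {
    "Face Detection 0106 FP16": (1.28, 3165.24),
    "Person Detection 0106 FP16": (0.80, 4961.99),
    "Age Gender Recognition Retail 0013 FP16": (3442.78, 1.17),
}


def in_flight(fps, latency_ms):
    """Little's law: average requests in flight = throughput * latency."""
    return fps * latency_ms / 1000.0


concurrency = {model: in_flight(*pair) for model, pair in r1.items()}
for model, value in concurrency.items():
    print(f"{model}: about {value:.2f} requests in flight")
```

All three models come out close to four, which would be consistent with the benchmark running about four inference requests concurrently on this CPU.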
IndigoBench 4.4, Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better): r1 0.939 (SE +/- 0.002, N = 3); r2 0.938 (SE +/- 0.000, N = 3); r3 0.935 (SE +/- 0.001, N = 3)
IndigoBench 4.4, Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better): r3 2.156 (SE +/- 0.002, N = 3); r2 2.150 (SE +/- 0.001, N = 3); r1 2.147 (SE +/- 0.002, N = 3)
Blender 2.90, Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better): r2 90.82 (SE +/- 0.16, N = 3); r3 90.93 (SE +/- 0.10, N = 3); r1 91.00 (SE +/- 0.14, N = 3)
Blender 2.90, Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better): r1 250.78 (SE +/- 0.03, N = 3); r3 251.80 (SE +/- 0.05, N = 3); r2 251.90 (SE +/- 0.04, N = 3)
Blender 2.90, Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better): r2 167.96 (SE +/- 0.11, N = 3); r3 168.08 (SE +/- 0.05, N = 3); r1 168.87 (SE +/- 0.10, N = 3)
Blender 2.90, Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better): r2 731.67 (SE +/- 0.26, N = 3); r3 733.02 (SE +/- 0.41, N = 3); r1 734.81 (SE +/- 0.24, N = 3)
Blender 2.90, Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 38.07 (SE +/- 0.02, N = 3); r3 38.07 (SE +/- 0.05, N = 3); r1 41.47 (SE +/- 3.33, N = 15)
Blender 2.90, Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 116.15 (SE +/- 0.23, N = 3); r3 116.26 (SE +/- 0.13, N = 3); r1 116.76 (SE +/- 0.13, N = 3)
Blender 2.90, Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 60.18 (SE +/- 0.04, N = 3); r3 60.25 (SE +/- 0.12, N = 3); r1 60.35 (SE +/- 0.03, N = 3)
Blender 2.90, Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 1190.05 (SE +/- 0.85, N = 3); r3 1192.80 (SE +/- 2.01, N = 3); r1 1192.96 (SE +/- 0.44, N = 3)
Blender 2.90, Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better): r3 608.62 (SE +/- 0.06, N = 3); r1 608.80 (SE +/- 0.04, N = 3); r2 609.56 (SE +/- 0.02, N = 3)
Blender 2.90, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r1 196.21 (SE +/- 0.02, N = 3); r2 196.28 (SE +/- 0.03, N = 3); r3 196.41 (SE +/- 0.08, N = 3)
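Every Blender scene above is rendered with both the CUDA and the NVIDIA OptiX back ends, so the two can be compared per scene. The following sketch does this for run r2, using render times copied from the results above; the dictionary layout and variable names are assumptions made for the example.

```python
# r2 render times in seconds, copied from the Blender 2.90 results above:
# scene -> (Compute: CUDA, Compute: NVIDIA OptiX).
r2_seconds = {
    "BMW27": (90.82, 38.07),
    "Classroom": (251.90, 116.15),
    "Fishy Cat": (167.96, 60.18),
    "Barbershop": (731.67, 1190.05),
    "Pabellon Barcelona": (609.56, 196.28),
}

# Ratio above 1.0 means OptiX finished sooner than CUDA for that scene.
ratios = {scene: cuda / optix for scene, (cuda, optix) in r2_seconds.items()}

for scene, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    verdict = "OptiX faster" if ratio > 1.0 else "CUDA faster"
    print(f"{scene}: CUDA/OptiX = {ratio:.2f} ({verdict})")
```

In run r2, OptiX is more than twice as fast as CUDA on four of the five scenes; Barbershop is the exception, where the OptiX render takes longer than the CUDA one.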
MandelGPU 1.3pts1, OpenCL Device: GPU (Samples/sec, More Is Better): r2 252826584.8 (SE +/- 157365.45, N = 3); r3 252822614.4 (SE +/- 1449538.54, N = 3); r1 251986408.7 (SE +/- 1032565.22, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
clpeak, OpenCL Test: Integer Compute INT (GIOPS, More Is Better): r3 5540.44 (SE +/- 81.16, N = 15); r2 5519.39 (SE +/- 81.08, N = 15); r1 5504.35 (SE +/- 71.93, N = 15)
clpeak, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better): r1 5940.64 (SE +/- 83.30, N = 15); r3 5892.70 (SE +/- 47.53, N = 3); r2 5858.32 (SE +/- 64.05, N = 3)
clpeak, OpenCL Test: Double-Precision Double (GFLOPS, More Is Better): r3 340.59 (SE +/- 3.74, N = 3); r2 340.46 (SE +/- 3.68, N = 3); r1 340.42 (SE +/- 3.78, N = 3)
clpeak, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better): r3 324.78 (SE +/- 0.28, N = 3); r1 324.63 (SE +/- 0.32, N = 3); r2 324.58 (SE +/- 0.28, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
NeatBench 5, Acceleration: GPU (FPS, More Is Better): r3 27.6 (SE +/- 0.60, N = 15); r1 27.5 (SE +/- 0.57, N = 15); r2 27.1 (SE +/- 0.47, N = 15)
AI Benchmark Alpha 0.1.2, Device Inference Score (Score, More Is Better): r3 730; r2 730; r1 730
AI Benchmark Alpha 0.1.2, Device Training Score (Score, More Is Better): r1 816; r3 814; r2 814
AI Benchmark Alpha 0.1.2, Device AI Score (Score, More Is Better): r1 1546; r3 1544; r2 1544
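The three AI Benchmark Alpha scores appear to be related: in every run the Device AI Score equals the Device Inference Score plus the Device Training Score. The export itself does not say how the AI Score is derived, so the check below only confirms that the reported numbers are consistent with a simple sum; the variable names are assumptions made for the example.

```python
# Scores copied from the AI Benchmark Alpha results above:
# run -> (inference score, training score, AI score).
scores = {
    "r1": (730, 816, 1546),
    "r2": (730, 814, 1544),
    "r3": (730, 814, 1544),
}

# True for a run when inference + training reproduces the reported AI score.
consistent = {
    run: inference + training == total
    for run, (inference, training, total) in scores.items()
}

for run, ok in consistent.items():
    print(f"{run}: inference + training == AI score -> {ok}")
```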
PHPBench 0.8.1, PHP Benchmark Suite (Score, More Is Better): r1 837911 (SE +/- 4346.11, N = 3); r2 832417 (SE +/- 2600.83, N = 3); r3 829705 (SE +/- 587.84, N = 3)
Unpacking Firefox 84.0, Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better): r1 16.03 (SE +/- 0.08, N = 4); r3 16.10 (SE +/- 0.14, N = 4); r2 16.14 (SE +/- 0.09, N = 4)
BRL-CAD 7.30.8, VGR Performance Metric (VGR Performance Metric, More Is Better): r3 64033; r1 63909; r2 63822
1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Phoronix Test Suite v10.8.5