xeon-platinum-8380-2p-smoke-run 2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2105012-IB-XEONPLATI04&grr .
xeon-platinum-8380-2p-smoke-run
Runs: r1, r1a, r2, r2a, r2b, r3, r4, r5

Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Device 0998
Memory: 16 x 32 GB DDR4-3200MT/s Hynix HMA84GR7CJR4N-XN
Disk: 2 x 7682GB INTEL SSDPF2KX076TZ + 2 x 800GB INTEL SSDPF21Q800GB + 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB INTEL SSDSC2KG96
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 20.04
Kernel: 5.11.0-051100-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080 / 1024x768

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details -
  r1: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
  r1a: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
  r2: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
  r2a: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r2b: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r3: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r4: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r5: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
Python Details - Python 2.7.18 + Python 3.8.5
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
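The Processor Details above record which scaling governor each run used (performance for r1, r1a and r2; powersave for the others). As a minimal sketch of how that setting can be checked on a Linux host before a run, the snippet below reads the standard cpufreq sysfs files. The function name `read_governors` and its `cpufreq_root` parameter are illustrative assumptions, not part of the Phoronix Test Suite.

```python
from pathlib import Path


def read_governors(cpufreq_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq in sysfs.

    `cpufreq_root` is a parameter only so the function can be pointed at a
    test directory; on a real Linux host the default path is the usual one.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(cpufreq_root).glob(pattern)):
        # path = <root>/cpuN/cpufreq/scaling_governor, so cpuN is two levels up
        governors[path.parent.parent.name] = path.read_text().strip()
    return governors


if __name__ == "__main__":
    found = read_governors()
    # Print the distinct governors in use, or a note when cpufreq is absent
    print(sorted(set(found.values())) or "no cpufreq information available")
```

Recording the distinct governors alongside each result makes it easier to tell a configuration change apart from ordinary run-to-run variation.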
xeon-platinum-8380-2p-smoke-run hammerdb-mariadb: 128 - 250 hammerdb-mariadb: 128 - 250 hammerdb-mariadb: 128 - 500 hammerdb-mariadb: 128 - 500 hammerdb-mariadb: 64 - 250 hammerdb-mariadb: 64 - 250 hammerdb-mariadb: 32 - 250 hammerdb-mariadb: 32 - 250 hammerdb-mariadb: 16 - 500 hammerdb-mariadb: 16 - 500 hammerdb-mariadb: 32 - 500 hammerdb-mariadb: 32 - 500 mysqlslap: 256 mysqlslap: 512 mysqlslap: 128 hammerdb-mariadb: 64 - 500 hammerdb-mariadb: 64 - 500 incompact3d: X3D-benchmarking input.i3d gnuradio: Hilbert Transform gnuradio: FM Deemphasis Filter gnuradio: IIR Filter gnuradio: FIR Filter gnuradio: Signal Source (Cosine) gnuradio: Five Back to Back FIR Filters mysqlslap: 64 aom-av1: Speed 4 Two-Pass - Bosphorus 4K cp2k: Fayalite-FIST hammerdb-mariadb: 16 - 250 hammerdb-mariadb: 16 - 250 luaradio: Complex Phase luaradio: Hilbert Transform luaradio: FM Deemphasis Filter luaradio: Five Back to Back FIR Filters hammerdb-mariadb: 8 - 250 hammerdb-mariadb: 8 - 250 hammerdb-mariadb: 8 - 500 hammerdb-mariadb: 8 - 500 aom-av1: Speed 6 Two-Pass - Bosphorus 4K mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: resnet-v2-50 mnn: SqueezeNetV1.0 mysqlslap: 1 securemark: SecureMark-TLS aom-av1: Speed 0 Two-Pass - Bosphorus 4K mysqlslap: 32 build-llvm: Unix Makefiles aom-av1: Speed 4 Two-Pass - Bosphorus 1080p luxcorerender: Orange Juice - CPU mysqlslap: 16 build-erlang: Time To Compile luxcorerender: DLSC - CPU intel-mlc: Max Bandwidth - Stream-Triad Like intel-mlc: Max Bandwidth - 1:1 Reads-Writes intel-mlc: Max Bandwidth - 2:1 Reads-Writes intel-mlc: Max Bandwidth - 3:1 Reads-Writes intel-mlc: Max Bandwidth - All Reads mysqlslap: 8 build-llvm: Ninja aom-av1: Speed 6 Realtime - Bosphorus 4K onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU mysqlslap: 4 aom-av1: Speed 8 Realtime - Bosphorus 4K gmpbench: Total Time blender: Barbershop - CPU-Only build-nodejs: Time To Compile viennacl: CPU BLAS - dGEMM-TT viennacl: CPU BLAS - dGEMM-TN viennacl: CPU 
BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - dGEMV-N viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dCOPY viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - sAXPY viennacl: CPU BLAS - sCOPY xmrig: Monero - 1M aom-av1: Speed 6 Two-Pass - Bosphorus 1080p build-linux-kernel: Time To Compile sysbench: CPU blender: Pabellon Barcelona - CPU-Only build-wasmer: Time To Compile onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU aom-av1: Speed 9 Realtime - Bosphorus 4K blender: Classroom - CPU-Only toktx: UASTC 4 + Zstd Compression 19 onednn: Deconvolution Batch shapes_1d - f32 - CPU luxcorerender: Danish Mood - CPU luxcorerender: LuxCore Benchmark - CPU avifenc: 0 aom-av1: Speed 0 Two-Pass - Bosphorus 1080p luxcorerender: Rainbow Colors and Prism - CPU aom-av1: Speed 6 Realtime - Bosphorus 1080p blender: Fishy Cat - CPU-Only onednn: IP Shapes 1D - bf16bf16bf16 - CPU vosk: stockfish: Total Time avifenc: 6, Lossless srslte: PHY_DL_Test srslte: PHY_DL_Test avifenc: 6 srslte: OFDM_Test sysbench: RAM / Memory avifenc: 2 botan: AES-256 - Decrypt botan: AES-256 basis: ETC1S basis: UASTC Level 0 avifenc: 10, Lossless aom-av1: Speed 9 Realtime - Bosphorus 1080p blender: BMW27 - CPU-Only botan: ChaCha20Poly1305 - Decrypt botan: ChaCha20Poly1305 botan: Blowfish - Decrypt botan: Blowfish botan: Twofish - Decrypt botan: Twofish botan: CAST-256 - Decrypt botan: CAST-256 botan: KASUMI - Decrypt botan: KASUMI toybrot: TBB xmrig: Wownero - 1M intel-mlc: Peak Injection Bandwidth - Stream-Triad Like intel-mlc: Peak Injection Bandwidth - 1:1 Reads-Writes intel-mlc: Peak Injection Bandwidth - 2:1 Reads-Writes intel-mlc: Peak Injection Bandwidth - 3:1 
Reads-Writes intel-mlc: Peak Injection Bandwidth - All Reads onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU helsing: 14 digit tjbench: Decompression Throughput astcenc: Thorough astcenc: Exhaustive avifenc: 10 astcenc: Medium build-mesa: Time To Compile onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU svt-hevc: 1 - Bosphorus 1080p aom-av1: Speed 8 Realtime - Bosphorus 1080p liquid-dsp: 160 - 256 - 57 liquid-dsp: 128 - 256 - 57 liquid-dsp: 64 - 256 - 57 liquid-dsp: 32 - 256 - 57 liquid-dsp: 16 - 256 - 57 liquid-dsp: 8 - 256 - 57 liquid-dsp: 4 - 256 - 57 liquid-dsp: 2 - 256 - 57 liquid-dsp: 1 - 256 - 57 toktx: Zstd Compression 19 basis: UASTC Level 3 onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU svt-vp9: VMAF Optimized - Bosphorus 1080p toktx: UASTC 3 onednn: IP Shapes 3D - f32 - CPU intel-mlc: Idle Latency onednn: IP Shapes 1D - f32 - CPU basis: UASTC Level 2 incompact3d: input.i3d 193 Cells Per Direction toktx: UASTC 3 + Zstd Compression 19 onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU incompact3d: input.i3d 129 Cells Per Direction toktx: Zstd Compression 9 onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU toybrot: C++ Tasks draco: Church Facade onednn: IP Shapes 3D - bf16bf16bf16 - CPU draco: Lion toybrot: OpenMP toybrot: C++ Threads svt-vp9: Visual Quality Optimized - Bosphorus 1080p svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU svt-hevc: 10 - Bosphorus 1080p svt-hevc: 7 - Bosphorus 1080p r1 r1a r2 r2a r2b r3 r4 r5 167809 55415 173288 57190 191397 63279 209254 69054 195258 64477 208419 68818 194684 64298 313.920451 459.3 734.0 610.6 
603.0 2183.5 1024.3 192913 63757 546.8 80.3 410.0 1094.8 290082 95768 285984 94379 7.37 225412 216.323 14.36 114.550 9.70 325766.94 439496.74 459455.38 426148.96 357285.28 145.717 15.09 804.392 29.20 4642.1 101.101 76.3 76.0 75.6 73.5 719 72.3 720 1058 843 620 1003 1834 19299.5 24.382 62.160 801.409 792.831 1.21594 447.971 445.144 445.519 33.07 7.49467 7.42 7.84 57.975 17.04 2.96135 35.918 181644819 32.113 76.9 183.4 13.247 120300000 31.539 5663.055 5669.700 8.852 619.458 623.494 363.255 363.038 292.736 289.126 116.074 115.972 74.320 77.287 6850 48051.5 324377.2 442422.3 459038.6 425933.7 356476.2 0.338327 0.215115 77.872 161.634619 5.477 20.952 3.53026 0.398282 36.91 3144800000 3415933333 3267133333 1735100000 885320000 441953333 217643333 110713333 57792000 0.239989 386.29 1.24809 35.1 0.918568 11.3586022 0.210919 0.593042 2.74370996 3.57247 0.864164 7879 1.80046 7318 7018 327.87 401.29 1.10991 0.877815 2.07944 499.23 290.67 173228 57242 188761 62311 311.960785 459.1 727.4 609.5 604.8 2175.3 1015.2 4.17 548.2 80.3 409.6 1094.5 7.55 225366 0.19 215.760 6.89 14.26 113.800 9.61 325184.58 441408.09 456629.89 424612.62 358364.56 145.550 15.19 793.363 28.99 4642.8 100.446 77.2 77.4 76.8 72.3 319 63.6 371 392 335 277 370 504 19452.0 21.25 24.360 61.930 804.323 791.927 1.22278 447.308 446.936 447.436 32.51 7.50059 7.55 8.04 57.710 0.51 13.34 28.66 2.96857 35.009 186263552 31.624 77.3 184.2 13.328 120133333 31.479 5663.612 5670.809 8.812 125.25 619.538 623.198 363.326 363.615 292.374 288.852 116.069 115.970 74.288 77.310 6964 50166.1 323924.2 442843.2 456260.3 424096.6 358385.5 0.341663 0.213643 78.159 156.969016 5.505 20.379 3.54367 0.395588 37.34 103.92 3162066667 3352733333 3263700000 1736800000 890273333 0.240122 393.46 1.25267 33.0 0.912279 11.2727114 0.210728 0.595661 2.73859096 3.57662 0.863214 7724 1.79881 7308 6980 329.53 408.24 1.12224 0.879137 2.08532 493.51 288.99 67.5 1374.663 325260.41 442460.05 456545.88 424818.83 358456.09 323826.9 442144.2 456408.6 
424077.3 358269.7 32.5 160 166 192 307.622108 357.4 645.8 498.2 470.0 1684.4 111.2 403 2.01 458.7 78.2 370.1 804.5 3.22 53.073 3.213 4.078 48.732 7.174 3336 225343 0.14 885 226.440 3.30 14.28 1264 191.746 9.27 325409.99 441732.77 459226.53 425997.22 357774.43 1413 148.484 5.97 791.695 1614 12.03 4524.5 110.02 110.930 54.7 62.3 59.8 61.9 389.9 62.3 447.65 507.1 422.2 349 474 691 19311.1 7.45 27.997 214210.83 88.57 71.928 808.289 789.836 1.23796 446.389 447.287 447.701 14.30 71.78 56.660 28.4023 5.73 5.84 64.971 0.32 13.42 10.39 46.38 3.00464 36.424 181554218 38.395 75.0 181.6 16.065 120733333 12510.56 38.372 5662.763 5606.967 34.237 11.251 10.282 43.26 29.56 612.438 615.806 363.196 362.926 292.396 288.562 116.080 114.663 74.275 76.286 6984 49908.3 324209.8 440454.7 459309.8 425925.6 357742.9 0.341893 0.216806 78.33 160.262559 9.2907 16.3621 6.656 7.1887 21.575 3.53121 0.403409 27.80 36.20 3131866667 3400066667 3227433333 1699333333 862890000 428100000 213203333 110173333 56230333 19.781 17.163 0.243026 182.26 5.664 1.25313 0.943624 13.979 11.5617158 10.011 0.210324 0.602122 3.02281992 3.470 3.64232 0.874080 8050 7001 1.81774 6126 7412 7149 164.32 182.17 1.11874 0.869978 2.11712 234.51 158.16 189 386.390001 408.0 621.0 487.4 502.0 1723.9 580.5 404 2.05 458.2 78.2 370.3 662.8 3.20 3458 225291 0.15 887 226.199 3.36 13.89 1262 192.245 9.24 325218.50 440939.22 457141.24 424925.84 358268.00 1420 147.163 5.97 793.916 1580 11.94 4504.5 111.790 61.7 66.9 68.9 66.4 647 64.3 713.47 1024.2 913 532 862 1135 20652.9 7.38 28.018 71.130 796.689 793.080 1.24508 450.648 447.144 446.917 14.06 28.1815 5.65 5.92 65.960 0.33 16.47 10.39 3.00929 35.581 189214499 38.590 76.1 181.6 16.615 120833333 38.313 5662.342 5593.366 10.088 43.42 612.149 616.501 363.314 359.452 292.827 286.180 115.723 114.517 74.309 76.407 7003 49813.4 324227.4 449554.1 457190.5 424904.5 358463.7 0.341955 0.216586 78.079 159.187038 6.597 21.369 3.56224 0.406877 28.22 36.06 3143300000 3411000000 3232700000 1704500000 
865410000 432170000 215343333 111510000 57197667 0.243308 185.53 1.24176 67.6 0.936941 14.5982965 0.218349 0.602314 3.56592774 3.64033 0.874968 8048 1.84339 7439 7203 164.51 181.52 1.14578 0.901823 2.10841 234.39 157.83 389.698280 373.8 622.0 487.7 515.6 1619.2 487.9 2.10 452.7 78.4 368.0 706.1 3.23 52.227 3.362 4.100 48.041 7.170 222747 0.14 224.290 3.36 13.94 193.839 9.25 325314.62 440315.41 458790.96 425848.09 357925.98 146.909 6.00 811.941 12.10 4525.7 109.96 111.673 63.7 67.6 72.4 70.8 647 70.2 765 1158 936 535 855 1167 20574.6 7.43 28.094 214241.34 88.68 70.758 792.296 792.049 1.24116 446.536 448.906 447.958 14.73 72.29 56.770 28.4613 5.68 5.87 65.888 0.33 14.79 10.54 46.73 3.00907 35.503 186013261 38.507 78.3 183.7 16.211 120666667 12553.44 37.796 5650.139 5611.995 34.420 11.226 10.208 42.37 29.69 615.975 619.638 363.279 359.573 292.610 286.004 116.070 114.646 74.292 76.403 7016 49937.3 324112.8 446396.0 458941.9 425822.1 358110.5 0.340243 0.215085 78.539 159.237752 9.3091 16.3729 6.746 7.1472 21.313 3.54783 0.402919 28.01 36.35 3140266667 3398800000 3245666667 1697500000 860046667 432013333 216773333 109430000 55251667 20.082 17.185 0.242450 184.07 5.562 1.24222 67.8 0.940714 14.159 14.6577489 10.029 0.217941 0.602038 3.57278153 3.697 3.64319 0.876227 8037 7082 1.81913 6170 7429 7141 162.21 179.13 1.11811 0.875421 2.10837 233.96 156.26 325312.30 440205.22 458756.46 425467.51 357550.82 324234.5 448800.1 458830.6 425508.1 357722.7 68.1 OpenBenchmarking.org
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 250
  Transactions Per Minute, More Is Better
  r1: 167809 (SE +/- 2616.54, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 250
  New Orders Per Minute, More Is Better
  r1: 55415 (SE +/- 857.30, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 500
  Transactions Per Minute, More Is Better
  r1: 173288 (SE +/- 2691.06, N = 9)
  r1a: 173228 (SE +/- 1389.03, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 500
  New Orders Per Minute, More Is Better
  r1: 57190 (SE +/- 891.59, N = 9)
  r1a: 57242 (SE +/- 484.29, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 250
  Transactions Per Minute, More Is Better
  r1: 191397 (SE +/- 2831.11, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 250
  New Orders Per Minute, More Is Better
  r1: 63279 (SE +/- 937.55, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 250
  Transactions Per Minute, More Is Better
  r1: 209254 (SE +/- 3390.81, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 250
  New Orders Per Minute, More Is Better
  r1: 69054 (SE +/- 1078.76, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 500
  Transactions Per Minute, More Is Better
  r1: 195258 (SE +/- 3159.46, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 500
  New Orders Per Minute, More Is Better
  r1: 64477 (SE +/- 1031.07, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 500
  Transactions Per Minute, More Is Better
  r1: 208419 (SE +/- 2885.40, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 500
  New Orders Per Minute, More Is Better
  r1: 68818 (SE +/- 921.11, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
MariaDB 10.5.2 - Clients: 256
  Queries Per Second, More Is Better
  r2b: 160 (SE +/- 0.22, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
MariaDB 10.5.2 - Clients: 512
  Queries Per Second, More Is Better
  r2b: 166 (SE +/- 0.87, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
MariaDB 10.5.2 - Clients: 128
  Queries Per Second, More Is Better
  r2b: 192 (SE +/- 0.65, N = 3)
  r3: 189 (SE +/- 0.35, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 500
  Transactions Per Minute, More Is Better
  r1: 194684 (SE +/- 2149.33, N = 3)
  r1a: 188761 (SE +/- 2084.32, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 500
  New Orders Per Minute, More Is Better
  r1: 64298 (SE +/- 620.04, N = 3)
  r1a: 62311 (SE +/- 730.55, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
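Each result is reported as a mean with a standard error (SE) over N trials. A rough way to judge whether two runs with the same configuration, such as r1 and r1a, really differ is to express the gap between their means in units of the combined standard error. The sketch below does this for the Transactions Per Minute result with 64 virtual users and 500 warehouses. It assumes the two runs are independent and that a normal approximation is acceptable, which is questionable at N = 3; the helper name `z_score` is illustrative.

```python
import math


def z_score(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of its combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


# Means and standard errors copied from the 64 users / 500 warehouses TPM result
r1_tpm, r1_se = 194684, 2149.33
r1a_tpm, r1a_se = 188761, 2084.32

z = z_score(r1_tpm, r1_se, r1a_tpm, r1a_se)
print(f"r1 vs r1a: difference = {r1_tpm - r1a_tpm} TPM, z = {z:.2f}")
```

A value near 2 means the gap is about two combined standard errors wide, so with these trial counts it is best read as borderline rather than as a clear difference.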
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d
  Seconds, Fewer Is Better
  r1: 313.92 (SE +/- 0.46, N = 3)
  r1a: 311.96 (SE +/- 0.12, N = 3)
  r2b: 307.62 (SE +/- 2.73, N = 9)
  r3: 386.39 (SE +/- 4.39, N = 9)
  r4: 389.70 (SE +/- 3.91, N = 9)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
GNU Radio - Test: Hilbert Transform
  MiB/s, More Is Better
  r1: 459.3 (SE +/- 2.02, N = 3)
  r1a: 459.1 (SE +/- 1.66, N = 3)
  r2b: 357.4 (SE +/- 47.90, N = 3)
  r3: 408.0 (SE +/- 17.46, N = 9)
  r4: 373.8 (SE +/- 24.71, N = 9)
  1. 3.8.1.0
GNU Radio - Test: FM Deemphasis Filter
  MiB/s, More Is Better
  r1: 734.0 (SE +/- 1.94, N = 3)
  r1a: 727.4 (SE +/- 1.04, N = 3)
  r2b: 645.8 (SE +/- 53.33, N = 3)
  r3: 621.0 (SE +/- 31.57, N = 9)
  r4: 622.0 (SE +/- 32.02, N = 9)
  1. 3.8.1.0
GNU Radio - Test: IIR Filter
  MiB/s, More Is Better
  r1: 610.6 (SE +/- 0.38, N = 3)
  r1a: 609.5 (SE +/- 0.46, N = 3)
  r2b: 498.2 (SE +/- 45.07, N = 3)
  r3: 487.4 (SE +/- 26.49, N = 9)
  r4: 487.7 (SE +/- 25.67, N = 9)
  1. 3.8.1.0
GNU Radio - Test: FIR Filter
  MiB/s, More Is Better
  r1: 603.0 (SE +/- 1.45, N = 3)
  r1a: 604.8 (SE +/- 0.20, N = 3)
  r2b: 470.0 (SE +/- 44.41, N = 3)
  r3: 502.0 (SE +/- 16.19, N = 9)
  r4: 515.6 (SE +/- 11.25, N = 9)
  1. 3.8.1.0
GNU Radio - Test: Signal Source (Cosine)
  MiB/s, More Is Better
  r1: 2183.5 (SE +/- 0.93, N = 3)
  r1a: 2175.3 (SE +/- 2.24, N = 3)
  r2b: 1684.4 (SE +/- 168.17, N = 3)
  r3: 1723.9 (SE +/- 72.44, N = 9)
  r4: 1619.2 (SE +/- 82.03, N = 9)
  1. 3.8.1.0
GNU Radio - Test: Five Back to Back FIR Filters
  MiB/s, More Is Better
  r1: 1024.3 (SE +/- 2.54, N = 3)
  r1a: 1015.2 (SE +/- 2.30, N = 3)
  r2b: 111.2 (SE +/- 1.12, N = 3)
  r3: 580.5 (SE +/- 39.63, N = 9)
  r4: 487.9 (SE +/- 48.36, N = 9)
  1. 3.8.1.0
MariaDB 10.5.2 - Clients: 64
  Queries Per Second, More Is Better
  r2b: 403 (SE +/- 0.62, N = 3)
  r3: 404 (SE +/- 0.16, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  r1a: 4.17 (SE +/- 0.03, N = 3)
  r2b: 2.01 (SE +/- 0.03, N = 3)
  r3: 2.05 (SE +/- 0.02, N = 9)
  r4: 2.10 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
CP2K Molecular Dynamics 8.1 - Input: Fayalite-FIST
  Seconds, Fewer Is Better
  r2a: 1374.66
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 250
  Transactions Per Minute, More Is Better
  r1: 192913 (SE +/- 2649.02, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 250
  New Orders Per Minute, More Is Better
  r1: 63757 (SE +/- 880.35, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
LuaRadio 0.9.1 - Test: Complex Phase
  MiB/s, More Is Better
  r1: 546.8 (SE +/- 0.25, N = 3)
  r1a: 548.2 (SE +/- 0.71, N = 3)
  r2b: 458.7 (SE +/- 3.61, N = 9)
  r3: 458.2 (SE +/- 4.31, N = 6)
  r4: 452.7 (SE +/- 4.50, N = 6)
LuaRadio 0.9.1 - Test: Hilbert Transform
  MiB/s, More Is Better
  r1: 80.3 (SE +/- 0.00, N = 3)
  r1a: 80.3 (SE +/- 0.00, N = 3)
  r2b: 78.2 (SE +/- 0.41, N = 9)
  r3: 78.2 (SE +/- 0.47, N = 6)
  r4: 78.4 (SE +/- 0.61, N = 6)
LuaRadio 0.9.1 - Test: FM Deemphasis Filter
  MiB/s, More Is Better
  r1: 410.0 (SE +/- 0.21, N = 3)
  r1a: 409.6 (SE +/- 1.40, N = 3)
  r2b: 370.1 (SE +/- 5.30, N = 9)
  r3: 370.3 (SE +/- 4.83, N = 6)
  r4: 368.0 (SE +/- 1.19, N = 6)
LuaRadio 0.9.1 - Test: Five Back to Back FIR Filters
  MiB/s, More Is Better
  r1: 1094.8 (SE +/- 2.24, N = 3)
  r1a: 1094.5 (SE +/- 0.62, N = 3)
  r2b: 804.5 (SE +/- 22.87, N = 9)
  r3: 662.8 (SE +/- 74.31, N = 6)
  r4: 706.1 (SE +/- 73.21, N = 6)
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 250
  Transactions Per Minute, More Is Better
  r1: 290082 (SE +/- 2006.72, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 250
  New Orders Per Minute, More Is Better
  r1: 95768 (SE +/- 675.05, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 500
  Transactions Per Minute, More Is Better
  r1: 285984 (SE +/- 2338.98, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 500
  New Orders Per Minute, More Is Better
  r1: 94379 (SE +/- 693.36, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
AOM AV1 3.0 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  r1: 7.37 (SE +/- 0.09, N = 15)
  r1a: 7.55 (SE +/- 0.06, N = 3)
  r2b: 3.22 (SE +/- 0.03, N = 9)
  r3: 3.20 (SE +/- 0.04, N = 3)
  r4: 3.23 (SE +/- 0.03, N = 5)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Mobile Neural Network 1.1.3 - Model: inception-v3
  ms, Fewer Is Better
  r2b: 53.07 (SE +/- 1.54, N = 3; MIN: 49.59 / MAX: 69.62)
  r4: 52.23 (SE +/- 0.75, N = 12; MIN: 47.47 / MAX: 94.69)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: mobilenet-v1-1.0
  ms, Fewer Is Better
  r2b: 3.213 (SE +/- 0.089, N = 3; MIN: 2.8 / MAX: 6.7)
  r4: 3.362 (SE +/- 0.021, N = 12; MIN: 2.98 / MAX: 6.66)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: MobileNetV2_224
  ms, Fewer Is Better
  r2b: 4.078 (SE +/- 0.333, N = 3; MIN: 2.9 / MAX: 13.17)
  r4: 4.100 (SE +/- 0.135, N = 12; MIN: 2.97 / MAX: 12.98)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: resnet-v2-50
  ms, Fewer Is Better
  r2b: 48.73 (SE +/- 2.59, N = 3; MIN: 43.19 / MAX: 69.59)
  r4: 48.04 (SE +/- 1.07, N = 12; MIN: 42.13 / MAX: 145.2)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: SqueezeNetV1.0
  ms, Fewer Is Better
  r2b: 7.174 (SE +/- 0.002, N = 3; MIN: 6.95 / MAX: 7.88)
  r4: 7.170 (SE +/- 0.078, N = 12; MIN: 6.38 / MAX: 9.97)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
MariaDB 10.5.2 - Clients: 1
  Queries Per Second, More Is Better
  r2b: 3336 (SE +/- 73.97, N = 15)
  r3: 3458 (SE +/- 61.33, N = 12)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
SecureMark 1.0.4 - Benchmark: SecureMark-TLS
  marks, More Is Better
  r1: 225412 (SE +/- 234.37, N = 3)
  r1a: 225366 (SE +/- 236.12, N = 3)
  r2b: 225343 (SE +/- 84.15, N = 3)
  r3: 225291 (SE +/- 267.95, N = 3)
  r4: 222747 (SE +/- 2769.20, N = 3)
  1. (CC) gcc options: -pedantic -O3
AOM AV1 3.0 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  r1a: 0.19 (SE +/- 0.00, N = 5)
  r2b: 0.14 (SE +/- 0.00, N = 12)
  r3: 0.15 (SE +/- 0.00, N = 3)
  r4: 0.14 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
MariaDB 10.5.2 - Clients: 32
  Queries Per Second, More Is Better
  r2b: 885 (SE +/- 0.26, N = 3)
  r3: 887 (SE +/- 1.83, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Timed LLVM Compilation 12.0 - Build System: Unix Makefiles
  Seconds, Fewer Is Better
  r1: 216.32 (SE +/- 0.91, N = 3)
  r1a: 215.76 (SE +/- 0.80, N = 3)
  r2b: 226.44 (SE +/- 0.77, N = 3)
  r3: 226.20 (SE +/- 1.24, N = 3)
  r4: 224.29 (SE +/- 0.43, N = 3)
AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  r1a: 6.89 (SE +/- 0.02, N = 3)
  r2b: 3.30 (SE +/- 0.03, N = 3)
  r3: 3.36 (SE +/- 0.04, N = 5)
  r4: 3.36 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
LuxCoreRender 2.5 - Scene: Orange Juice - Acceleration: CPU
  M samples/sec, More Is Better
  r1: 14.36 (SE +/- 0.13, N = 3; MIN: 11.58 / MAX: 19.44)
  r1a: 14.26 (SE +/- 0.21, N = 3; MIN: 11.6 / MAX: 19.3)
  r2b: 14.28 (SE +/- 0.18, N = 3; MIN: 11.93 / MAX: 17.73)
  r3: 13.89 (SE +/- 0.12, N = 15; MIN: 11.08 / MAX: 17.77)
  r4: 13.94 (SE +/- 0.13, N = 15; MIN: 11.06 / MAX: 17.84)
MariaDB 10.5.2 - Clients: 16
  Queries Per Second, More Is Better
  r2b: 1264 (SE +/- 1.85, N = 3)
  r3: 1262 (SE +/- 3.49, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Timed Erlang/OTP Compilation 23.2 - Time To Compile
  Seconds, Fewer Is Better
  r1: 114.55 (SE +/- 0.18, N = 3)
  r1a: 113.80 (SE +/- 0.37, N = 3)
  r2b: 191.75 (SE +/- 1.08, N = 3)
  r3: 192.25 (SE +/- 0.31, N = 3)
  r4: 193.84 (SE +/- 1.56, N = 3)
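Because some tests report "More Is Better" and others "Fewer Is Better", comparing runs is easier with a single signed percentage. The following sketch is an illustrative helper (the name `percent_change` is an assumption, not a Phoronix Test Suite function) applied to two results listed above, using r1 (performance governor) as the baseline and r2b (powersave governor) as the candidate. The governor is the only difference the export records between these runs; other unrecorded differences cannot be ruled out.

```python
def percent_change(baseline, candidate, higher_is_better):
    """Signed percent change versus baseline; positive means candidate is better."""
    change = (candidate - baseline) / baseline * 100.0
    # For time-like metrics a larger value is worse, so flip the sign
    return change if higher_is_better else -change


# Timed Erlang/OTP Compilation, seconds (fewer is better): r1 vs r2b
erlang = percent_change(114.55, 191.75, higher_is_better=False)

# AOM AV1 Speed 6 Two-Pass, Bosphorus 4K, frames per second: r1 vs r2b
aom = percent_change(7.37, 3.22, higher_is_better=True)

print(f"Erlang/OTP build: {erlang:+.1f}%")
print(f"AOM AV1 Speed 6 Two-Pass 4K: {aom:+.1f}%")
```

Both values come out negative, meaning r2b is worse than r1 on these two tests under this sign convention.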
LuxCoreRender 2.5 - Scene: DLSC - Acceleration: CPU
  M samples/sec, More Is Better
  r1: 9.70 (SE +/- 0.09, N = 3; MIN: 8.98 / MAX: 12.22)
  r1a: 9.61 (SE +/- 0.09, N = 15; MIN: 8 / MAX: 12.27)
  r2b: 9.27 (SE +/- 0.08, N = 15; MIN: 8.31 / MAX: 11.98)
  r3: 9.24 (SE +/- 0.10, N = 3; MIN: 8.74 / MAX: 11.37)
  r4: 9.25 (SE +/- 0.09, N = 3; MIN: 8.59 / MAX: 11.4)
Intel Memory Latency Checker, Test: Max Bandwidth - Stream-Triad Like (MB/s, More Is Better). r1: 325766.94 (SE +/- 25.05, N = 3); r1a: 325184.58 (SE +/- 11.61, N = 3); r2a: 325260.41 (SE +/- 53.08, N = 3); r2b: 325409.99 (SE +/- 50.20, N = 3); r3: 325218.50 (SE +/- 50.80, N = 3); r4: 325314.62 (SE +/- 7.71, N = 3); r5: 325312.30 (SE +/- 22.58, N = 3).
Intel Memory Latency Checker, Test: Max Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better). r1: 439496.74 (SE +/- 821.19, N = 3); r1a: 441408.09 (SE +/- 1093.30, N = 3); r2a: 442460.05 (SE +/- 1844.14, N = 3); r2b: 441732.77 (SE +/- 3117.58, N = 3); r3: 440939.22 (SE +/- 276.68, N = 3); r4: 440315.41 (SE +/- 2322.32, N = 3); r5: 440205.22 (SE +/- 1051.98, N = 3).
Intel Memory Latency Checker, Test: Max Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better). r1: 459455.38 (SE +/- 33.49, N = 3); r1a: 456629.89 (SE +/- 129.26, N = 3); r2a: 456545.88 (SE +/- 54.98, N = 3); r2b: 459226.53 (SE +/- 51.02, N = 3); r3: 457141.24 (SE +/- 89.89, N = 3); r4: 458790.96 (SE +/- 8.60, N = 3); r5: 458756.46 (SE +/- 53.22, N = 3).
Intel Memory Latency Checker, Test: Max Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better). r1: 426148.96 (SE +/- 105.41, N = 3); r1a: 424612.62 (SE +/- 465.24, N = 3); r2a: 424818.83 (SE +/- 392.90, N = 3); r2b: 425997.22 (SE +/- 71.38, N = 3); r3: 424925.84 (SE +/- 109.66, N = 3); r4: 425848.09 (SE +/- 67.02, N = 3); r5: 425467.51 (SE +/- 133.64, N = 3).
Intel Memory Latency Checker, Test: Max Bandwidth - All Reads (MB/s, More Is Better). r1: 357285.28 (SE +/- 67.01, N = 3); r1a: 358364.56 (SE +/- 142.76, N = 3); r2a: 358456.09 (SE +/- 107.35, N = 3); r2b: 357774.43 (SE +/- 83.63, N = 3); r3: 358268.00 (SE +/- 59.61, N = 3); r4: 357925.98 (SE +/- 83.70, N = 3); r5: 357550.82 (SE +/- 46.23, N = 3).
MariaDB 10.5.2, Clients: 8 (Queries Per Second, More Is Better). r2b: 1413 (SE +/- 10.97, N = 3); r3: 1420 (SE +/- 3.56, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Timed LLVM Compilation 12.0, Build System: Ninja (Seconds, Fewer Is Better). r1: 145.72 (SE +/- 0.52, N = 3); r1a: 145.55 (SE +/- 0.75, N = 3); r2b: 148.48 (SE +/- 1.12, N = 3); r3: 147.16 (SE +/- 0.32, N = 3); r4: 146.91 (SE +/- 0.56, N = 3).
AOM AV1 3.0, Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). r1: 15.09 (SE +/- 0.05, N = 3); r1a: 15.19 (SE +/- 0.03, N = 3); r2b: 5.97 (SE +/- 0.06, N = 3); r3: 5.97 (SE +/- 0.07, N = 12); r4: 6.00 (SE +/- 0.01, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 804.39 (SE +/- 7.01, N = 3; MIN: 763.49); r1a: 793.36 (SE +/- 1.56, N = 3; MIN: 765.14); r2b: 791.70 (SE +/- 0.61, N = 3; MIN: 769.61); r3: 793.92 (SE +/- 0.83, N = 3; MIN: 769); r4: 811.94 (SE +/- 16.86, N = 14; MIN: 761.61). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
MariaDB 10.5.2, Clients: 4 (Queries Per Second, More Is Better). r2b: 1614 (SE +/- 16.07, N = 3); r3: 1580 (SE +/- 7.20, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
AOM AV1 3.0, Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). r1: 29.20 (SE +/- 0.19, N = 3); r1a: 28.99 (SE +/- 0.29, N = 5); r2b: 12.03 (SE +/- 0.08, N = 15); r3: 11.94 (SE +/- 0.12, N = 15); r4: 12.10 (SE +/- 0.17, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
GNU GMP GMPbench 6.2.1, Total Time (GMPbench Score, More Is Better). r1: 4642.1; r1a: 4642.8; r2b: 4524.5; r3: 4504.5; r4: 4525.7. (CC) gcc options: -O3 -fomit-frame-pointer -lm
Blender 2.92, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 110.02 (SE +/- 0.18, N = 3); r4: 109.96 (SE +/- 0.59, N = 3).
Timed Node.js Compilation 15.11, Time To Compile (Seconds, Fewer Is Better). r1: 101.10 (SE +/- 0.27, N = 3); r1a: 100.45 (SE +/- 0.29, N = 3); r2b: 110.93 (SE +/- 0.50, N = 3); r3: 111.79 (SE +/- 0.68, N = 3); r4: 111.67 (SE +/- 0.78, N = 3).
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better). r1: 76.3 (SE +/- 1.45, N = 13); r1a: 77.2 (SE +/- 0.90, N = 3); r2b: 54.7 (SE +/- 1.75, N = 15); r3: 61.7 (SE +/- 2.33, N = 15); r4: 63.7 (SE +/- 2.94, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better). r1: 76.0 (SE +/- 1.67, N = 13); r1a: 77.4 (SE +/- 0.69, N = 3); r2b: 62.3 (SE +/- 2.02, N = 15); r3: 66.9 (SE +/- 1.88, N = 15); r4: 67.6 (SE +/- 2.43, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better). r1: 75.6 (SE +/- 1.88, N = 13); r1a: 76.8 (SE +/- 1.01, N = 3); r2b: 59.8 (SE +/- 1.14, N = 15); r3: 68.9 (SE +/- 1.99, N = 15); r4: 72.4 (SE +/- 1.98, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better). r1: 73.5 (SE +/- 1.42, N = 14); r1a: 72.3 (SE +/- 3.11, N = 3); r2b: 61.9 (SE +/- 2.06, N = 15); r3: 66.4 (SE +/- 2.18, N = 15); r4: 70.8 (SE +/- 1.95, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T (GB/s, More Is Better). r1: 719.0 (SE +/- 2.46, N = 13); r1a: 319.0 (SE +/- 5.04, N = 3); r2b: 389.9 (SE +/- 27.49, N = 15); r3: 647.0 (SE +/- 2.02, N = 15); r4: 647.0 (SE +/- 3.20, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-N (GB/s, More Is Better). r1: 72.3 (SE +/- 0.36, N = 14); r1a: 63.6 (SE +/- 2.90, N = 3); r2b: 62.3 (SE +/- 3.75, N = 15); r3: 64.3 (SE +/- 3.93, N = 15); r4: 70.2 (SE +/- 0.25, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dDOT (GB/s, More Is Better). r1: 720.00 (SE +/- 6.43, N = 14); r1a: 371.00 (SE +/- 34.44, N = 3); r2b: 447.65 (SE +/- 34.40, N = 14); r3: 713.47 (SE +/- 50.57, N = 15); r4: 765.00 (SE +/- 2.76, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dAXPY (GB/s, More Is Better). r1: 1058.0 (SE +/- 20.63, N = 14); r1a: 392.0 (SE +/- 23.02, N = 3); r2b: 507.1 (SE +/- 40.80, N = 15); r3: 1024.2 (SE +/- 82.34, N = 15); r4: 1158.0 (SE +/- 5.62, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dCOPY (GB/s, More Is Better). r1: 843.0 (SE +/- 25.47, N = 14); r1a: 335.0 (SE +/- 29.90, N = 3); r2b: 422.2 (SE +/- 35.11, N = 15); r3: 913.0 (SE +/- 26.97, N = 15); r4: 936.0 (SE +/- 9.73, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sDOT (GB/s, More Is Better). r1: 620 (SE +/- 2.34, N = 14); r1a: 277 (SE +/- 11.67, N = 3); r2b: 349 (SE +/- 5.60, N = 15); r3: 532 (SE +/- 2.55, N = 15); r4: 535 (SE +/- 2.45, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sAXPY (GB/s, More Is Better). r1: 1003 (SE +/- 6.62, N = 14); r1a: 370 (SE +/- 15.25, N = 3); r2b: 474 (SE +/- 10.36, N = 15); r3: 862 (SE +/- 8.11, N = 15); r4: 855 (SE +/- 11.35, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sCOPY (GB/s, More Is Better). r1: 1834 (SE +/- 16.63, N = 14); r1a: 504 (SE +/- 4.10, N = 3); r2b: 691 (SE +/- 22.07, N = 15); r3: 1135 (SE +/- 51.32, N = 15); r4: 1167 (SE +/- 54.62, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
Xmrig 6.12.1, Variant: Monero - Hash Count: 1M (H/s, More Is Better). r1: 19299.5 (SE +/- 23.28, N = 3); r1a: 19452.0 (SE +/- 20.55, N = 3); r2b: 19311.1 (SE +/- 151.73, N = 3); r3: 20652.9 (SE +/- 245.77, N = 3); r4: 20574.6 (SE +/- 243.31, N = 15). (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
AOM AV1 3.0, Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better). r1a: 21.25 (SE +/- 0.17, N = 3); r2b: 7.45 (SE +/- 0.01, N = 3); r3: 7.38 (SE +/- 0.06, N = 3); r4: 7.43 (SE +/- 0.05, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Timed Linux Kernel Compilation 5.10.20, Time To Compile (Seconds, Fewer Is Better). r1: 24.38 (SE +/- 0.30, N = 4); r1a: 24.36 (SE +/- 0.28, N = 4); r2b: 28.00 (SE +/- 0.32, N = 14); r3: 28.02 (SE +/- 0.41, N = 14); r4: 28.09 (SE +/- 0.37, N = 14).
Sysbench 1.0.20, Test: CPU (Events Per Second, More Is Better). r2b: 214210.83 (SE +/- 247.29, N = 3); r4: 214241.34 (SE +/- 269.51, N = 3). (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Blender 2.92, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 88.57 (SE +/- 0.08, N = 3); r4: 88.68 (SE +/- 0.28, N = 3).
Timed Wasmer Compilation 1.0.2, Time To Compile (Seconds, Fewer Is Better). r1: 62.16 (SE +/- 0.22, N = 3); r1a: 61.93 (SE +/- 0.62, N = 3); r2b: 71.93 (SE +/- 0.42, N = 3); r3: 71.13 (SE +/- 0.66, N = 7); r4: 70.76 (SE +/- 0.51, N = 3). (CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil
oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 801.41 (SE +/- 7.46, N = 3; MIN: 767.38); r1a: 804.32 (SE +/- 4.49, N = 3; MIN: 765.37); r2b: 808.29 (SE +/- 9.76, N = 3; MIN: 767.97); r3: 796.69 (SE +/- 1.09, N = 3; MIN: 771.28); r4: 792.30 (SE +/- 2.67, N = 3; MIN: 763.96). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 792.83 (SE +/- 2.07, N = 3; MIN: 763.76); r1a: 791.93 (SE +/- 3.65, N = 3; MIN: 765.01); r2b: 789.84 (SE +/- 1.48, N = 3; MIN: 767.03); r3: 793.08 (SE +/- 2.18, N = 3; MIN: 768.2); r4: 792.05 (SE +/- 1.96, N = 3; MIN: 765.9). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 1.21594 (SE +/- 0.01080, N = 15; MIN: 0.84); r1a: 1.22278 (SE +/- 0.01126, N = 15; MIN: 0.85); r2b: 1.23796 (SE +/- 0.01174, N = 15; MIN: 0.87); r3: 1.24508 (SE +/- 0.01066, N = 15; MIN: 0.89); r4: 1.24116 (SE +/- 0.00891, N = 15; MIN: 0.85). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 447.97 (SE +/- 0.58, N = 3; MIN: 433.22); r1a: 447.31 (SE +/- 0.90, N = 3; MIN: 432.33); r2b: 446.39 (SE +/- 0.78, N = 3; MIN: 432.04); r3: 450.65 (SE +/- 2.40, N = 3; MIN: 432.96); r4: 446.54 (SE +/- 1.10, N = 3; MIN: 429.71). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 445.14 (SE +/- 0.58, N = 3; MIN: 431.52); r1a: 446.94 (SE +/- 1.79, N = 3; MIN: 430.47); r2b: 447.29 (SE +/- 0.65, N = 3; MIN: 433.06); r3: 447.14 (SE +/- 1.24, N = 3; MIN: 432.42); r4: 448.91 (SE +/- 3.51, N = 3; MIN: 431.33). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 445.52 (SE +/- 0.85, N = 3; MIN: 431.18); r1a: 447.44 (SE +/- 2.18, N = 3; MIN: 429.4); r2b: 447.70 (SE +/- 1.13, N = 3; MIN: 433.04); r3: 446.92 (SE +/- 0.04, N = 3; MIN: 433.64); r4: 447.96 (SE +/- 2.63, N = 3; MIN: 429.99). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
AOM AV1 3.0, Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). r1: 33.07 (SE +/- 0.28, N = 3); r1a: 32.51 (SE +/- 0.28, N = 3); r2b: 14.30 (SE +/- 0.15, N = 15); r3: 14.06 (SE +/- 0.18, N = 4); r4: 14.73 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 71.78 (SE +/- 0.08, N = 3); r4: 72.29 (SE +/- 0.13, N = 3).
KTX-Software toktx 4.0, Settings: UASTC 4 + Zstd Compression 19 (Seconds, Fewer Is Better). r2b: 56.66 (SE +/- 0.68, N = 4); r4: 56.77 (SE +/- 0.74, N = 3).
oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 7.49467 (SE +/- 0.02080, N = 3; MIN: 6.98); r1a: 7.50059 (SE +/- 0.01835, N = 3; MIN: 6.91); r2b: 28.40230 (SE +/- 0.31773, N = 13; MIN: 14.66); r3: 28.18150 (SE +/- 0.30585, N = 15; MIN: 14.34); r4: 28.46130 (SE +/- 0.38629, N = 12; MIN: 14.76). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LuxCoreRender 2.5, Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better). r1: 7.42 (SE +/- 0.08, N = 3; MIN: 3.2 / MAX: 8.74); r1a: 7.55 (SE +/- 0.10, N = 3; MIN: 3.28 / MAX: 8.86); r2b: 5.73 (SE +/- 0.04, N = 3; MIN: 1.3 / MAX: 7.65); r3: 5.65 (SE +/- 0.07, N = 3; MIN: 1.24 / MAX: 7.63); r4: 5.68 (SE +/- 0.04, N = 3; MIN: 1.26 / MAX: 7.6).
LuxCoreRender 2.5, Scene: LuxCore Benchmark - Acceleration: CPU (M samples/sec, More Is Better). r1: 7.84 (SE +/- 0.05, N = 3; MIN: 3.44 / MAX: 9.2); r1a: 8.04 (SE +/- 0.01, N = 3; MIN: 3.51 / MAX: 9.33); r2b: 5.84 (SE +/- 0.02, N = 3; MIN: 1.16 / MAX: 7.97); r3: 5.92 (SE +/- 0.01, N = 3; MIN: 1.15 / MAX: 7.98); r4: 5.87 (SE +/- 0.03, N = 3; MIN: 1.15 / MAX: 7.95).
libavif avifenc 0.9.0, Encoder Speed: 0 (Seconds, Fewer Is Better). r1: 57.98 (SE +/- 0.21, N = 3); r1a: 57.71 (SE +/- 0.24, N = 3); r2b: 64.97 (SE +/- 0.22, N = 3); r3: 65.96 (SE +/- 0.20, N = 3); r4: 65.89 (SE +/- 0.68, N = 3). (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 3.0, Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better). r1a: 0.51 (SE +/- 0.00, N = 3); r2b: 0.32 (SE +/- 0.00, N = 3); r3: 0.33 (SE +/- 0.00, N = 3); r4: 0.33 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
LuxCoreRender 2.5, Scene: Rainbow Colors and Prism - Acceleration: CPU (M samples/sec, More Is Better). r1: 17.04 (SE +/- 1.05, N = 15; MIN: 11.27 / MAX: 22.05); r1a: 13.34 (SE +/- 0.47, N = 15; MIN: 10.32 / MAX: 17.45); r2b: 13.42 (SE +/- 0.87, N = 13; MIN: 8.28 / MAX: 21.15); r3: 16.47 (SE +/- 1.13, N = 12; MIN: 10.39 / MAX: 21.43); r4: 14.79 (SE +/- 0.79, N = 12; MIN: 9.85 / MAX: 20.95).
AOM AV1 3.0, Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better). r1a: 28.66 (SE +/- 0.06, N = 3); r2b: 10.39 (SE +/- 0.03, N = 3); r3: 10.39 (SE +/- 0.01, N = 3); r4: 10.54 (SE +/- 0.05, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 46.38 (SE +/- 0.15, N = 3); r4: 46.73 (SE +/- 0.25, N = 3).
oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 2.96135 (SE +/- 0.00128, N = 3; MIN: 2.84); r1a: 2.96857 (SE +/- 0.00276, N = 3; MIN: 2.84); r2b: 3.00464 (SE +/- 0.02287, N = 13; MIN: 2.84); r3: 3.00929 (SE +/- 0.02478, N = 14; MIN: 2.84); r4: 3.00907 (SE +/- 0.02449, N = 14; MIN: 2.84). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
VOSK Speech Recognition Toolkit 0.3.21 (Seconds, Fewer Is Better). r1: 35.92 (SE +/- 0.32, N = 3); r1a: 35.01 (SE +/- 0.29, N = 8); r2b: 36.42 (SE +/- 0.43, N = 3); r3: 35.58 (SE +/- 0.43, N = 3); r4: 35.50 (SE +/- 0.32, N = 3).
Stockfish 13, Total Time (Nodes Per Second, More Is Better). r1: 181644819 (SE +/- 1585265.68, N = 15); r1a: 186263552 (SE +/- 2404481.41, N = 3); r2b: 181554218 (SE +/- 1982639.48, N = 3); r3: 189214499 (SE +/- 1924842.52, N = 3); r4: 186013261 (SE +/- 2183262.34, N = 4). (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver
libavif avifenc 0.9.0, Encoder Speed: 6, Lossless (Seconds, Fewer Is Better). r1: 32.11 (SE +/- 0.04, N = 3); r1a: 31.62 (SE +/- 0.09, N = 3); r2b: 38.40 (SE +/- 0.24, N = 3); r3: 38.59 (SE +/- 0.35, N = 3); r4: 38.51 (SE +/- 0.36, N = 6). (CXX) g++ options: -O3 -fPIC -lm
srsLTE 20.10.1, Test: PHY_DL_Test (UE Mb/s, More Is Better). r1: 76.9 (SE +/- 0.76, N = 3); r1a: 77.3 (SE +/- 1.16, N = 3); r2b: 75.0 (SE +/- 0.38, N = 3); r3: 76.1 (SE +/- 1.14, N = 3); r4: 78.3 (SE +/- 0.62, N = 3). (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
srsLTE 20.10.1, Test: PHY_DL_Test (eNb Mb/s, More Is Better). r1: 183.4 (SE +/- 1.15, N = 3); r1a: 184.2 (SE +/- 0.36, N = 3); r2b: 181.6 (SE +/- 1.23, N = 3); r3: 181.6 (SE +/- 2.42, N = 3); r4: 183.7 (SE +/- 0.58, N = 3). (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
libavif avifenc 0.9.0, Encoder Speed: 6 (Seconds, Fewer Is Better). r1: 13.25 (SE +/- 0.05, N = 3); r1a: 13.33 (SE +/- 0.08, N = 3); r2b: 16.07 (SE +/- 0.23, N = 3); r3: 16.62 (SE +/- 0.13, N = 15); r4: 16.21 (SE +/- 0.12, N = 15). (CXX) g++ options: -O3 -fPIC -lm
srsLTE 20.10.1, Test: OFDM_Test (Samples / Second, More Is Better). r1: 120300000 (SE +/- 611010.09, N = 3); r1a: 120133333 (SE +/- 240370.09, N = 3); r2b: 120733333 (SE +/- 366666.67, N = 3); r3: 120833333 (SE +/- 600925.21, N = 3); r4: 120666667 (SE +/- 233333.33, N = 3). (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
Sysbench 1.0.20, Test: RAM / Memory (MiB/sec, More Is Better). r2b: 12510.56 (SE +/- 125.16, N = 15); r4: 12553.44 (SE +/- 118.72, N = 15). (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
libavif avifenc 0.9.0, Encoder Speed: 2 (Seconds, Fewer Is Better). r1: 31.54 (SE +/- 0.10, N = 3); r1a: 31.48 (SE +/- 0.04, N = 3); r2b: 38.37 (SE +/- 0.40, N = 3); r3: 38.31 (SE +/- 0.20, N = 3); r4: 37.80 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -fPIC -lm
Botan 2.17.3, Test: AES-256 - Decrypt (MiB/s, More Is Better). r1: 5663.06 (SE +/- 1.20, N = 3); r1a: 5663.61 (SE +/- 0.12, N = 3); r2b: 5662.76 (SE +/- 0.94, N = 3); r3: 5662.34 (SE +/- 1.10, N = 3); r4: 5650.14 (SE +/- 12.66, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: AES-256 (MiB/s, More Is Better). r1: 5669.70 (SE +/- 0.92, N = 3); r1a: 5670.81 (SE +/- 0.28, N = 3); r2b: 5606.97 (SE +/- 55.60, N = 3); r3: 5593.37 (SE +/- 42.23, N = 3); r4: 5612.00 (SE +/- 51.03, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Basis Universal 1.13, Settings: ETC1S (Seconds, Fewer Is Better). r2b: 34.24 (SE +/- 0.21, N = 3); r4: 34.42 (SE +/- 0.42, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.13, Settings: UASTC Level 0 (Seconds, Fewer Is Better). r2b: 11.25 (SE +/- 0.08, N = 15); r4: 11.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
libavif avifenc 0.9.0, Encoder Speed: 10, Lossless (Seconds, Fewer Is Better). r1: 8.852 (SE +/- 0.036, N = 3); r1a: 8.812 (SE +/- 0.016, N = 3); r2b: 10.282 (SE +/- 0.154, N = 15); r3: 10.088 (SE +/- 0.130, N = 15); r4: 10.208 (SE +/- 0.157, N = 15). (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 3.0, Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better). r1a: 125.25 (SE +/- 0.82, N = 15); r2b: 43.26 (SE +/- 0.49, N = 3); r3: 43.42 (SE +/- 0.31, N = 15); r4: 42.37 (SE +/- 0.28, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 29.56 (SE +/- 0.08, N = 3); r4: 29.69 (SE +/- 0.32, N = 3).
Botan 2.17.3, Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better). r1: 619.46 (SE +/- 0.40, N = 3); r1a: 619.54 (SE +/- 0.57, N = 3); r2b: 612.44 (SE +/- 3.49, N = 3); r3: 612.15 (SE +/- 3.74, N = 3); r4: 615.98 (SE +/- 2.81, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: ChaCha20Poly1305 (MiB/s, More Is Better). r1: 623.49 (SE +/- 0.03, N = 3); r1a: 623.20 (SE +/- 0.17, N = 3); r2b: 615.81 (SE +/- 3.48, N = 3); r3: 616.50 (SE +/- 3.19, N = 3); r4: 619.64 (SE +/- 2.98, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: Blowfish - Decrypt (MiB/s, More Is Better). r1: 363.26 (SE +/- 0.05, N = 3); r1a: 363.33 (SE +/- 0.06, N = 3); r2b: 363.20 (SE +/- 0.04, N = 3); r3: 363.31 (SE +/- 0.03, N = 3); r4: 363.28 (SE +/- 0.07, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: Blowfish (MiB/s, More Is Better). r1: 363.04 (SE +/- 0.56, N = 3); r1a: 363.62 (SE +/- 0.05, N = 3); r2b: 362.93 (SE +/- 0.11, N = 3); r3: 359.45 (SE +/- 3.73, N = 3); r4: 359.57 (SE +/- 3.51, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: Twofish - Decrypt (MiB/s, More Is Better). r1: 292.74 (SE +/- 0.14, N = 3); r1a: 292.37 (SE +/- 0.11, N = 3); r2b: 292.40 (SE +/- 0.12, N = 3); r3: 292.83 (SE +/- 0.06, N = 3); r4: 292.61 (SE +/- 0.04, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: Twofish (MiB/s, More Is Better). r1: 289.13 (SE +/- 0.14, N = 3); r1a: 288.85 (SE +/- 0.14, N = 3); r2b: 288.56 (SE +/- 0.11, N = 3); r3: 286.18 (SE +/- 2.66, N = 3); r4: 286.00 (SE +/- 2.83, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: CAST-256 - Decrypt (MiB/s, More Is Better). r1: 116.07 (SE +/- 0.01, N = 3); r1a: 116.07 (SE +/- 0.01, N = 3); r2b: 116.08 (SE +/- 0.01, N = 3); r3: 115.72 (SE +/- 0.35, N = 3); r4: 116.07 (SE +/- 0.01, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: CAST-256 (MiB/s, More Is Better). r1: 115.97 (SE +/- 0.01, N = 3); r1a: 115.97 (SE +/- 0.01, N = 3); r2b: 114.66 (SE +/- 1.15, N = 3); r3: 114.52 (SE +/- 1.33, N = 3); r4: 114.65 (SE +/- 1.17, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: KASUMI - Decrypt (MiB/s, More Is Better). r1: 74.32 (SE +/- 0.01, N = 3); r1a: 74.29 (SE +/- 0.01, N = 3); r2b: 74.28 (SE +/- 0.03, N = 3); r3: 74.31 (SE +/- 0.01, N = 3); r4: 74.29 (SE +/- 0.02, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3, Test: KASUMI (MiB/s, More Is Better). r1: 77.29 (SE +/- 0.02, N = 3); r1a: 77.31 (SE +/- 0.04, N = 3); r2b: 76.29 (SE +/- 1.01, N = 3); r3: 76.41 (SE +/- 0.77, N = 3); r4: 76.40 (SE +/- 0.87, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
toyBrot Fractal Generator 2020-11-18, Implementation: TBB (ms, Fewer Is Better). r1: 6850 (SE +/- 59.06, N = 15); r1a: 6964 (SE +/- 80.68, N = 3); r2b: 6984 (SE +/- 73.83, N = 15); r3: 7003 (SE +/- 69.20, N = 15); r4: 7016 (SE +/- 81.70, N = 15). (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
Xmrig 6.12.1, Variant: Wownero - Hash Count: 1M (H/s, More Is Better). r1: 48051.5 (SE +/- 425.40, N = 7); r1a: 50166.1 (SE +/- 588.34, N = 3); r2b: 49908.3 (SE +/- 238.38, N = 3); r3: 49813.4 (SE +/- 358.18, N = 3); r4: 49937.3 (SE +/- 235.04, N = 3). (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Intel Memory Latency Checker, Test: Peak Injection Bandwidth - Stream-Triad Like (MB/s, More Is Better). r1: 324377.2 (SE +/- 177.93, N = 3); r1a: 323924.2 (SE +/- 38.10, N = 3); r2a: 323826.9 (SE +/- 34.05, N = 3); r2b: 324209.8 (SE +/- 12.95, N = 3); r3: 324227.4 (SE +/- 32.03, N = 3); r4: 324112.8 (SE +/- 60.42, N = 3); r5: 324234.5 (SE +/- 55.81, N = 3).
Intel Memory Latency Checker, Test: Peak Injection Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better). r1: 442422.3 (SE +/- 1187.16, N = 3); r1a: 442843.2 (SE +/- 148.63, N = 3); r2a: 442144.2 (SE +/- 212.40, N = 3); r2b: 440454.7 (SE +/- 314.54, N = 3); r3: 449554.1 (SE +/- 138.13, N = 3); r4: 446396.0 (SE +/- 1601.80, N = 3); r5: 448800.1 (SE +/- 847.23, N = 3).
Intel Memory Latency Checker, Test: Peak Injection Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better). r1: 459038.6 (SE +/- 274.15, N = 3); r1a: 456260.3 (SE +/- 130.28, N = 3); r2a: 456408.6 (SE +/- 115.55, N = 3); r2b: 459309.8 (SE +/- 64.32, N = 3); r3: 457190.5 (SE +/- 73.04, N = 3); r4: 458941.9 (SE +/- 36.24, N = 3); r5: 458830.6 (SE +/- 12.06, N = 3).
Intel Memory Latency Checker, Test: Peak Injection Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better). r1: 425933.7 (SE +/- 163.24, N = 3); r1a: 424096.6 (SE +/- 94.95, N = 3); r2a: 424077.3 (SE +/- 236.99, N = 3); r2b: 425925.6 (SE +/- 25.04, N = 3); r3: 424904.5 (SE +/- 88.34, N = 3); r4: 425822.1 (SE +/- 23.30, N = 3); r5: 425508.1 (SE +/- 23.30, N = 3).
Intel Memory Latency Checker, Test: Peak Injection Bandwidth - All Reads (MB/s, More Is Better). r1: 356476.2 (SE +/- 709.43, N = 3); r1a: 358385.5 (SE +/- 14.58, N = 3); r2a: 358269.7 (SE +/- 37.47, N = 3); r2b: 357742.9 (SE +/- 14.54, N = 3); r3: 358463.7 (SE +/- 24.95, N = 3); r4: 358110.5 (SE +/- 26.62, N = 3); r5: 357722.7 (SE +/- 23.85, N = 3).
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU r1 r1a r2b r3 r4 0.0769 0.1538 0.2307 0.3076 0.3845 SE +/- 0.000853, N = 3 SE +/- 0.002562, N = 3 SE +/- 0.003448, N = 5 SE +/- 0.003372, N = 6 SE +/- 0.004121, N = 3 0.338327 0.341663 0.341893 0.341955 0.340243 MIN: 0.3 MIN: 0.31 MIN: 0.3 MIN: 0.31 MIN: 0.3 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.215115 (SE +/- 0.000867, N = 3; MIN: 0.19); r1a: 0.213643 (SE +/- 0.000781, N = 3; MIN: 0.19); r2b: 0.216806 (SE +/- 0.001893, N = 8; MIN: 0.19); r3: 0.216586 (SE +/- 0.002019, N = 7; MIN: 0.19); r4: 0.215085 (SE +/- 0.001544, N = 12; MIN: 0.19). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Helsing 1.0-beta Digit Range: 14 digit (Seconds, Fewer Is Better): r1: 77.87; r1a: 78.16; r2b: 78.33; r3: 78.08; r4: 78.54. 1. (CC) gcc options: -O2 -pthread -lcrypto
libjpeg-turbo tjbench 2.1.0 Test: Decompression Throughput (Megapixels/sec, More Is Better): r1: 161.63 (SE +/- 0.15, N = 3); r1a: 156.97 (SE +/- 0.39, N = 3); r2b: 160.26 (SE +/- 0.07, N = 3); r3: 159.19 (SE +/- 1.04, N = 3); r4: 159.24 (SE +/- 0.47, N = 3). 1. (CC) gcc options: -O3 -rdynamic
ASTC Encoder 2.4 Preset: Thorough (Seconds, Fewer Is Better): r2b: 9.2907 (SE +/- 0.0796, N = 8); r4: 9.3091 (SE +/- 0.0879, N = 7). 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 2.4 Preset: Exhaustive (Seconds, Fewer Is Better): r2b: 16.36 (SE +/- 0.00, N = 3); r4: 16.37 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
libavif avifenc 0.9.0 Encoder Speed: 10 (Seconds, Fewer Is Better): r1: 5.477 (SE +/- 0.038, N = 3); r1a: 5.505 (SE +/- 0.014, N = 3); r2b: 6.656 (SE +/- 0.116, N = 15); r3: 6.597 (SE +/- 0.145, N = 15); r4: 6.746 (SE +/- 0.130, N = 15). 1. (CXX) g++ options: -O3 -fPIC -lm
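The gap between runs in a result like the libavif one above can be weighed against the reported standard errors. The following is a minimal sketch, assuming the SE figures are listed in the same order as the runs and that the two runs are independent; the function name `gap_in_standard_errors` is hypothetical and not part of the Phoronix Test Suite.

```python
import math

# Mean encode times (seconds) and standard errors copied from the
# libavif avifenc "Encoder Speed: 10" result above (r1 and r2b).
r1_mean, r1_se = 5.477, 0.038
r2b_mean, r2b_se = 6.656, 0.116


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error.

    The combined standard error of a difference of two independent means
    is sqrt(se_a^2 + se_b^2). Independence is an assumption made here.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


gap = gap_in_standard_errors(r1_mean, r1_se, r2b_mean, r2b_se)
print(f"r1 vs r2b differ by {gap:.1f} combined standard errors")
```

A ratio well above 2 or 3 suggests the difference is unlikely to be run-to-run noise, although with N as small as 3 this is only a rough screen and not a formal test.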
ASTC Encoder 2.4 Preset: Medium (Seconds, Fewer Is Better): r2b: 7.1887 (SE +/- 0.0906, N = 15); r4: 7.1472 (SE +/- 0.0290, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
Timed Mesa Compilation 21.0 Time To Compile (Seconds, Fewer Is Better): r1: 20.95 (SE +/- 0.02, N = 3); r1a: 20.38 (SE +/- 0.12, N = 3); r2b: 21.58 (SE +/- 0.04, N = 3); r3: 21.37 (SE +/- 0.15, N = 3); r4: 21.31 (SE +/- 0.11, N = 3)
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1: 3.53026 (SE +/- 0.00193, N = 3; MIN: 3.38); r1a: 3.54367 (SE +/- 0.00732, N = 3; MIN: 3.38); r2b: 3.53121 (SE +/- 0.00854, N = 3; MIN: 3.37); r3: 3.56224 (SE +/- 0.01280, N = 3; MIN: 3.39); r4: 3.54783 (SE +/- 0.00650, N = 3; MIN: 3.37). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.398282 (SE +/- 0.001135, N = 3; MIN: 0.37); r1a: 0.395588 (SE +/- 0.001124, N = 3; MIN: 0.36); r2b: 0.403409 (SE +/- 0.004259, N = 4; MIN: 0.36); r3: 0.406877 (SE +/- 0.003204, N = 10; MIN: 0.37); r4: 0.402919 (SE +/- 0.002415, N = 14; MIN: 0.36). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC 1.5.0 Tuning: 1 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1: 36.91 (SE +/- 0.29, N = 3); r1a: 37.34 (SE +/- 0.24, N = 3); r2b: 27.80 (SE +/- 0.09, N = 3); r3: 28.22 (SE +/- 0.14, N = 3); r4: 28.01 (SE +/- 0.31, N = 3). 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
AOM AV1 3.0 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1a: 103.92 (SE +/- 1.01, N = 15); r2b: 36.20 (SE +/- 0.19, N = 3); r3: 36.06 (SE +/- 0.26, N = 3); r4: 36.35 (SE +/- 0.27, N = 3). 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Liquid-DSP 2021.01.31 Threads: 160 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 3144800000 (SE +/- 17047384.94, N = 3); r1a: 3162066667 (SE +/- 2062630.47, N = 3); r2b: 3131866667 (SE +/- 14685858.66, N = 3); r3: 3143300000 (SE +/- 14901789.60, N = 3); r4: 3140266667 (SE +/- 16411005.79, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 3415933333 (SE +/- 8088331.79, N = 3); r1a: 3352733333 (SE +/- 38975091.76, N = 3); r2b: 3400066667 (SE +/- 14312737.14, N = 3); r3: 3411000000 (SE +/- 6896617.53, N = 3); r4: 3398800000 (SE +/- 16537936.19, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 3267133333 (SE +/- 5206513.02, N = 3); r1a: 3263700000 (SE +/- 2150193.79, N = 3); r2b: 3227433333 (SE +/- 17049079.48, N = 3); r3: 3232700000 (SE +/- 14893734.70, N = 3); r4: 3245666667 (SE +/- 12876378.03, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 1735100000; r1a: 1736800000; r2b: 1699333333; r3: 1704500000; r4: 1697500000. Standard errors as exported (only four are listed for five runs, so they are not paired with runs here): SE +/- 3951371.07, N = 3; SE +/- 2515949.13, N = 3; SE +/- 10121648.97, N = 3; SE +/- 6582552.70, N = 3. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 885320000 (SE +/- 691953.76, N = 3); r1a: 890273333 (SE +/- 669162.00, N = 3); r2b: 862890000 (SE +/- 3620722.76, N = 3); r3: 865410000 (SE +/- 859903.10, N = 3); r4: 860046667 (SE +/- 10609570.10, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 441953333 (SE +/- 422150.58, N = 3); r2b: 428100000 (SE +/- 2458908.97, N = 3); r3: 432170000 (SE +/- 1240739.03, N = 3); r4: 432013333 (SE +/- 2739929.03, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 4 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 217643333 (SE +/- 1090112.12, N = 3); r2b: 213203333 (SE +/- 824809.74, N = 3); r3: 215343333 (SE +/- 1663583.82, N = 3); r4: 216773333 (SE +/- 1956802.95, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 2 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 110713333 (SE +/- 729984.78, N = 3); r2b: 110173333 (SE +/- 907677.13, N = 3); r3: 111510000 (SE +/- 430348.70, N = 3); r4: 109430000 (SE +/- 132035.35, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): r1: 57792000 (SE +/- 173700.89, N = 3); r2b: 56230333 (SE +/- 613156.95, N = 3); r3: 57197667 (SE +/- 550708.74, N = 3); r4: 55251667 (SE +/- 534784.17, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
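The Liquid-DSP results above cover thread counts from 1 to 160, which allows a quick look at how throughput scales. The following is a minimal sketch using only the r1 figures from those results; the helper name `scaling_table` is hypothetical, and "efficiency" here simply means speedup divided by thread count.

```python
# r1 throughput (samples/s) by thread count, copied from the Liquid-DSP
# results above (Buffer Length: 256, Filter Length: 57).
r1_throughput = {
    1: 57792000,
    2: 110713333,
    4: 217643333,
    8: 441953333,
    16: 885320000,
    32: 1735100000,
    64: 3267133333,
    128: 3415933333,
    160: 3144800000,
}


def scaling_table(throughput):
    """Return (threads, speedup, efficiency) tuples relative to 1 thread."""
    base = throughput[1]
    rows = []
    for threads in sorted(throughput):
        speedup = throughput[threads] / base
        rows.append((threads, speedup, speedup / threads))
    return rows


table = scaling_table(r1_throughput)
for threads, speedup, efficiency in table:
    print(f"{threads:>3} threads: speedup {speedup:6.2f}, efficiency {efficiency:6.1%}")
```

On these figures the speedup keeps rising through 128 threads and drops slightly at 160, while efficiency falls off sharply once the thread count exceeds 64.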
KTX-Software toktx 4.0 Settings: Zstd Compression 19 (Seconds, Fewer Is Better): r2b: 19.78 (SE +/- 0.22, N = 3); r4: 20.08 (SE +/- 0.20, N = 3)
Basis Universal 1.13 Settings: UASTC Level 3 (Seconds, Fewer Is Better): r2b: 17.16 (SE +/- 0.02, N = 3); r4: 17.19 (SE +/- 0.01, N = 3). 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.239989 (SE +/- 0.000856, N = 3; MIN: 0.22); r1a: 0.240122 (SE +/- 0.000662, N = 3; MIN: 0.23); r2b: 0.243026 (SE +/- 0.003187, N = 3; MIN: 0.22); r3: 0.243308 (SE +/- 0.002507, N = 5; MIN: 0.22); r4: 0.242450 (SE +/- 0.002245, N = 7; MIN: 0.22). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-VP9 0.3 Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1: 386.29 (SE +/- 15.40, N = 12); r1a: 393.46 (SE +/- 16.03, N = 12); r2b: 182.26 (SE +/- 4.05, N = 12); r3: 185.53 (SE +/- 1.57, N = 3); r4: 184.07 (SE +/- 0.65, N = 3). 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
KTX-Software toktx 4.0 Settings: UASTC 3 (Seconds, Fewer Is Better): r2b: 5.664 (SE +/- 0.053, N = 15); r4: 5.562 (SE +/- 0.008, N = 3)
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1: 1.24809 (SE +/- 0.00180, N = 3; MIN: 1.2); r1a: 1.25267 (SE +/- 0.01592, N = 15; MIN: 1.19); r2b: 1.25313 (SE +/- 0.00964, N = 3; MIN: 1.2); r3: 1.24176 (SE +/- 0.01211, N = 3; MIN: 1.18); r4: 1.24222 (SE +/- 0.01282, N = 3; MIN: 1.19). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Intel Memory Latency Checker Test: Idle Latency (ns, Fewer Is Better): r1: 35.1 (SE +/- 0.10, N = 3); r1a: 33.0 (SE +/- 0.39, N = 3); r2: 67.5 (SE +/- 0.09, N = 3); r2a: 32.5 (SE +/- 0.28, N = 8); r3: 67.6 (SE +/- 0.12, N = 3); r4: 67.8 (SE +/- 0.12, N = 3); r5: 68.1 (SE +/- 0.09, N = 3)
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.918568 (SE +/- 0.002101, N = 3; MIN: 0.85); r1a: 0.912279 (SE +/- 0.002111, N = 3; MIN: 0.86); r2b: 0.943624 (SE +/- 0.011253, N = 3; MIN: 0.86); r3: 0.936941 (SE +/- 0.007264, N = 3; MIN: 0.85); r4: 0.940714 (SE +/- 0.008450, N = 3; MIN: 0.86). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal 1.13 Settings: UASTC Level 2 (Seconds, Fewer Is Better): r2b: 13.98 (SE +/- 0.18, N = 3); r4: 14.16 (SE +/- 0.15, N = 3). 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better): r1: 11.36 (SE +/- 0.02, N = 3); r1a: 11.27 (SE +/- 0.03, N = 3); r2b: 11.56 (SE +/- 0.04, N = 3); r3: 14.60 (SE +/- 0.03, N = 3); r4: 14.66 (SE +/- 0.02, N = 3). 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
KTX-Software toktx 4.0 Settings: UASTC 3 + Zstd Compression 19 (Seconds, Fewer Is Better): r2b: 10.01 (SE +/- 0.06, N = 3); r4: 10.03 (SE +/- 0.11, N = 5)
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.210919 (SE +/- 0.002205, N = 15; MIN: 0.19); r1a: 0.210728 (SE +/- 0.001109, N = 3; MIN: 0.2); r2b: 0.210324 (SE +/- 0.004449, N = 15; MIN: 0.18); r3: 0.218349 (SE +/- 0.003384, N = 15; MIN: 0.19); r4: 0.217941 (SE +/- 0.004970, N = 15; MIN: 0.19). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1: 0.593042 (SE +/- 0.001703, N = 3; MIN: 0.56); r1a: 0.595661 (SE +/- 0.000780, N = 3; MIN: 0.56); r2b: 0.602122 (SE +/- 0.004180, N = 3; MIN: 0.56); r3: 0.602314 (SE +/- 0.004400, N = 3; MIN: 0.56); r4: 0.602038 (SE +/- 0.003648, N = 3; MIN: 0.56). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better): r1: 2.74370996 (SE +/- 0.00774937, N = 3); r1a: 2.73859096 (SE +/- 0.01532048, N = 3); r2b: 3.02281992 (SE +/- 0.02799890, N = 3); r3: 3.56592774 (SE +/- 0.03072276, N = 15); r4: 3.57278153 (SE +/- 0.02850005, N = 15). 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
KTX-Software toktx 4.0 Settings: Zstd Compression 9 (Seconds, Fewer Is Better): r2b: 3.470 (SE +/- 0.003, N = 3); r4: 3.697 (SE +/- 0.064, N = 15)
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1: 3.57247 (SE +/- 0.00924, N = 3; MIN: 3.53); r1a: 3.57662 (SE +/- 0.00795, N = 3; MIN: 3.5); r2b: 3.64232 (SE +/- 0.05421, N = 14; MIN: 3.51); r3: 3.64033 (SE +/- 0.05675, N = 14; MIN: 3.47); r4: 3.64319 (SE +/- 0.05617, N = 14; MIN: 3.5). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.864164 (SE +/- 0.002419, N = 3; MIN: 0.84); r1a: 0.863214 (SE +/- 0.002055, N = 3; MIN: 0.84); r2b: 0.874080 (SE +/- 0.008361, N = 14; MIN: 0.83); r3: 0.874968 (SE +/- 0.007890, N = 14; MIN: 0.84); r4: 0.876227 (SE +/- 0.007461, N = 14; MIN: 0.84). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
toyBrot Fractal Generator 2020-11-18 Implementation: C++ Tasks (ms, Fewer Is Better): r1: 7879 (SE +/- 43.45, N = 3); r1a: 7724 (SE +/- 80.44, N = 4); r2b: 8050 (SE +/- 102.03, N = 3); r3: 8048 (SE +/- 93.55, N = 4); r4: 8037 (SE +/- 85.46, N = 4). 1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
Google Draco 1.4.1 Model: Church Facade (ms, Fewer Is Better): r2b: 7001 (SE +/- 20.01, N = 3); r4: 7082 (SE +/- 3.33, N = 3). 1. (CXX) g++ options: -O3
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1: 1.80046 (SE +/- 0.00580, N = 3; MIN: 1.68); r1a: 1.79881 (SE +/- 0.00121, N = 3; MIN: 1.69); r2b: 1.81774 (SE +/- 0.01382, N = 3; MIN: 1.69); r3: 1.84339 (SE +/- 0.02043, N = 3; MIN: 1.67); r4: 1.81913 (SE +/- 0.00968, N = 3; MIN: 1.68). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Google Draco 1.4.1 Model: Lion (ms, Fewer Is Better): r2b: 6126 (SE +/- 25.21, N = 3); r4: 6170 (SE +/- 21.15, N = 3). 1. (CXX) g++ options: -O3
toyBrot Fractal Generator 2020-11-18 Implementation: OpenMP (ms, Fewer Is Better): r1: 7318 (SE +/- 5.13, N = 3); r1a: 7308 (SE +/- 0.88, N = 3); r2b: 7412 (SE +/- 101.59, N = 3); r3: 7439 (SE +/- 85.45, N = 4); r4: 7429 (SE +/- 91.12, N = 4). 1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
toyBrot Fractal Generator 2020-11-18 Implementation: C++ Threads (ms, Fewer Is Better): r1: 7018 (SE +/- 49.12, N = 3); r1a: 6980 (SE +/- 29.96, N = 3); r2b: 7149 (SE +/- 89.67, N = 3); r3: 7203 (SE +/- 98.76, N = 3); r4: 7141 (SE +/- 76.94, N = 4). 1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
SVT-VP9 0.3 Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1: 327.87 (SE +/- 1.20, N = 3); r1a: 329.53 (SE +/- 1.10, N = 3); r2b: 164.32 (SE +/- 1.13, N = 3); r3: 164.51 (SE +/- 1.63, N = 3); r4: 162.21 (SE +/- 1.59, N = 3). 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 0.3 Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1: 401.29 (SE +/- 1.44, N = 3); r1a: 408.24 (SE +/- 0.66, N = 3); r2b: 182.17 (SE +/- 0.90, N = 3); r3: 181.52 (SE +/- 2.25, N = 3); r4: 179.13 (SE +/- 0.47, N = 3). 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1: 1.10991 (SE +/- 0.00274, N = 3; MIN: 1.02); r1a: 1.12224 (SE +/- 0.00124, N = 3; MIN: 1.02); r2b: 1.11874 (SE +/- 0.00330, N = 3; MIN: 1.02); r3: 1.14578 (SE +/- 0.00975, N = 3; MIN: 1.04); r4: 1.11811 (SE +/- 0.01182, N = 3; MIN: 1.02). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1: 0.877815 (SE +/- 0.006225, N = 3; MIN: 0.82); r1a: 0.879137 (SE +/- 0.003986, N = 3; MIN: 0.83); r2b: 0.869978 (SE +/- 0.004902, N = 3; MIN: 0.82); r3: 0.901823 (SE +/- 0.006631, N = 3; MIN: 0.84); r4: 0.875421 (SE +/- 0.005244, N = 3; MIN: 0.82). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1: 2.07944 (SE +/- 0.00138, N = 3; MIN: 2.03); r1a: 2.08532 (SE +/- 0.00168, N = 3; MIN: 2.03); r2b: 2.11712 (SE +/- 0.01980, N = 3; MIN: 2.03); r3: 2.10841 (SE +/- 0.01943, N = 3; MIN: 2.03); r4: 2.10837 (SE +/- 0.01801, N = 3; MIN: 2.03). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC 1.5.0 Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1: 499.23 (SE +/- 3.80, N = 3); r1a: 493.51 (SE +/- 4.78, N = 3); r2b: 234.51 (SE +/- 2.64, N = 4); r3: 234.39 (SE +/- 1.80, N = 10); r4: 233.96 (SE +/- 1.14, N = 3). 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC 1.5.0 Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r1: 290.67 (SE +/- 1.68, N = 3); r1a: 288.99 (SE +/- 1.37, N = 3); r2b: 158.16 (SE +/- 1.76, N = 5); r3: 157.83 (SE +/- 1.64, N = 3); r4: 156.26 (SE +/- 1.22, N = 3). 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
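The video encoder results in this section show the largest gaps between runs. The following is a minimal sketch that expresses the r2b frame rate as a percentage of r1 for the SVT-HEVC and SVT-VP9 tests. The processor details in the header list r1 with the performance governor and r2b with powersave, but the sketch only does the arithmetic and does not establish that the governor alone explains the gap. The helper name `percent_of_baseline` is hypothetical.

```python
# Frames per second as (r1, r2b) pairs, copied from the SVT-HEVC and
# SVT-VP9 results in this section (Input: Bosphorus 1080p).
fps = {
    "SVT-HEVC Tuning 1": (36.91, 27.80),
    "SVT-HEVC Tuning 7": (290.67, 158.16),
    "SVT-HEVC Tuning 10": (499.23, 234.51),
    "SVT-VP9 VMAF Optimized": (386.29, 182.26),
    "SVT-VP9 Visual Quality Optimized": (327.87, 164.32),
    "SVT-VP9 PSNR/SSIM Optimized": (401.29, 182.17),
}


def percent_of_baseline(baseline, other):
    """Return `other` as a percentage of `baseline`."""
    return 100.0 * other / baseline


relative = {
    name: percent_of_baseline(r1, r2b) for name, (r1, r2b) in fps.items()
}
for name, pct in relative.items():
    print(f"{name}: r2b reaches {pct:.1f}% of r1")
```

On these figures r2b lands at roughly half of r1 for every test except SVT-HEVC Tuning 1, where it reaches about three quarters.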
Phoronix Test Suite v10.8.5