xeon-platinum-8380-2p-smoke-run
2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED graphics on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2105012-IB-XEONPLATI04&sro&grs .
xeon-platinum-8380-2p-smoke-run system details
Runs: r1, r1a, r2, r2a, r2b, r3, r4, r5

Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Device 0998
Memory: 16 x 32 GB DDR4-3200MT/s Hynix HMA84GR7CJR4N-XN
Disk: 2 x 7682GB INTEL SSDPF2KX076TZ + 2 x 800GB INTEL SSDPF21Q800GB + 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB INTEL SSDSC2KG96
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 20.04
Kernel: 5.11.0-051100-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080, 1024x768

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- r1: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
- r1a: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
- r2: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
- r2a: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
- r2b: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
- r3: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
- r4: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
- r5: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270

Python Details - Python 2.7.18 + Python 3.8.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

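The kernel, processor, and security details above correspond to values that Linux exposes through sysfs. The following is a minimal sketch, assuming a Linux host, of how the same fields could be read back when repeating a run. The helper names (`read_text`, `active_choice`, `collect_details`) are hypothetical, and this is not a description of how the Phoronix Test Suite itself gathers the data, which this export does not show.

```python
from pathlib import Path

# Standard Linux sysfs locations; they may be absent on other systems or in containers.
SYSFS_CPU = Path("/sys/devices/system/cpu")
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def active_choice(text):
    """Pick the bracketed entry from a choice list such as 'always [madvise] never'."""
    if text is None:
        return None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return text


def collect_details():
    """Gather scaling driver/governor (cpu0 only), THP mode, and mitigation status."""
    cpufreq = SYSFS_CPU / "cpu0" / "cpufreq"
    vulns = SYSFS_CPU / "vulnerabilities"
    vuln_status = {}
    if vulns.is_dir():
        vuln_status = {p.name: read_text(p) for p in sorted(vulns.glob("*"))}
    return {
        "scaling_driver": read_text(cpufreq / "scaling_driver"),
        "scaling_governor": read_text(cpufreq / "scaling_governor"),
        "transparent_hugepage": active_choice(read_text(THP_ENABLED)),
        "vulnerabilities": vuln_status,
    }


if __name__ == "__main__":
    for key, value in collect_details().items():
        print(key, "=", value)
```

Only `cpu0` is inspected here for brevity; on a 160-thread system it may be worth checking every `cpuN/cpufreq` directory, since governors can be set per policy.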
xeon-platinum-8380-2p-smoke-run onednn: Deconvolution Batch shapes_1d - f32 - CPU aom-av1: Speed 9 Realtime - Bosphorus 1080p aom-av1: Speed 8 Realtime - Bosphorus 1080p aom-av1: Speed 6 Two-Pass - Bosphorus 1080p aom-av1: Speed 6 Realtime - Bosphorus 1080p aom-av1: Speed 6 Realtime - Bosphorus 4K aom-av1: Speed 8 Realtime - Bosphorus 4K aom-av1: Speed 6 Two-Pass - Bosphorus 4K aom-av1: Speed 9 Realtime - Bosphorus 4K svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p svt-hevc: 10 - Bosphorus 1080p intel-mlc: Idle Latency aom-av1: Speed 4 Two-Pass - Bosphorus 1080p aom-av1: Speed 4 Two-Pass - Bosphorus 4K svt-vp9: Visual Quality Optimized - Bosphorus 1080p svt-hevc: 7 - Bosphorus 1080p build-erlang: Time To Compile aom-av1: Speed 0 Two-Pass - Bosphorus 1080p luxcorerender: LuxCore Benchmark - CPU aom-av1: Speed 0 Two-Pass - Bosphorus 4K svt-hevc: 1 - Bosphorus 1080p luxcorerender: Danish Mood - CPU incompact3d: input.i3d 129 Cells Per Direction incompact3d: input.i3d 193 Cells Per Direction incompact3d: X3D-benchmarking input.i3d avifenc: 6 avifenc: 6, Lossless avifenc: 2 luaradio: Complex Phase avifenc: 10, Lossless build-wasmer: Time To Compile build-linux-kernel: Time To Compile avifenc: 0 luaradio: FM Deemphasis Filter build-nodejs: Time To Compile xmrig: Monero - 1M build-mesa: Time To Compile luxcorerender: DLSC - CPU build-llvm: Unix Makefiles mnn: mobilenet-v1-1.0 liquid-dsp: 1 - 256 - 57 xmrig: Wownero - 1M srslte: PHY_DL_Test toybrot: C++ Tasks stockfish: Total Time vosk: onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU liquid-dsp: 16 - 256 - 57 onednn: IP Shapes 1D - f32 - CPU luxcorerender: Orange Juice - CPU liquid-dsp: 8 - 256 - 57 onednn: Convolution Batch Shapes Auto - f32 - CPU toybrot: C++ Threads hammerdb-mariadb: 64 - 500 hammerdb-mariadb: 64 - 500 gmpbench: Total Time tjbench: Decompression Throughput onednn: IP Shapes 3D - u8s8f32 - CPU luaradio: Hilbert Transform onednn: IP Shapes 3D - bf16bf16bf16 - CPU toybrot: TBB onednn: IP Shapes 1D - 
u8s8f32 - CPU liquid-dsp: 32 - 256 - 57 mysqlslap: 4 liquid-dsp: 4 - 256 - 57 intel-mlc: Peak Injection Bandwidth - 1:1 Reads-Writes onednn: Recurrent Neural Network Training - f32 - CPU build-llvm: Ninja onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU liquid-dsp: 2 - 256 - 57 liquid-dsp: 128 - 256 - 57 toktx: UASTC 3 onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU toybrot: OpenMP mnn: inception-v3 onednn: IP Shapes 1D - bf16bf16bf16 - CPU mysqlslap: 128 onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU toktx: Zstd Compression 19 onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU srslte: PHY_DL_Test botan: AES-256 onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU botan: KASUMI basis: UASTC Level 2 botan: CAST-256 botan: ChaCha20Poly1305 liquid-dsp: 64 - 256 - 57 botan: ChaCha20Poly1305 - Decrypt securemark: SecureMark-TLS botan: Blowfish draco: Church Facade botan: Twofish onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU liquid-dsp: 160 - 256 - 57 onednn: Recurrent Neural Network Inference - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU helsing: 14 digit onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU blender: Fishy Cat - CPU-Only draco: Lion blender: Classroom - CPU-Only intel-mlc: Max Bandwidth - 1:1 Reads-Writes intel-mlc: Peak Injection Bandwidth - 2:1 Reads-Writes intel-mlc: Max Bandwidth - 2:1 Reads-Writes srslte: OFDM_Test astcenc: Medium intel-mlc: Peak Injection Bandwidth - All Reads onednn: Recurrent Neural Network Inference - u8s8f32 - CPU basis: ETC1S mysqlslap: 8 blender: BMW27 - CPU-Only intel-mlc: Peak Injection Bandwidth - 3:1 Reads-Writes onednn: Recurrent Neural Network Training - u8s8f32 - CPU intel-mlc: Max Bandwidth - 3:1 Reads-Writes sysbench: RAM / Memory intel-mlc: Max Bandwidth - All Reads botan: CAST-256 - Decrypt mysqlslap: 64 botan: AES-256 - 
Decrypt mysqlslap: 32 basis: UASTC Level 0 astcenc: Thorough toktx: UASTC 4 + Zstd Compression 19 toktx: UASTC 3 + Zstd Compression 19 intel-mlc: Max Bandwidth - Stream-Triad Like intel-mlc: Peak Injection Bandwidth - Stream-Triad Like mysqlslap: 16 botan: Twofish - Decrypt basis: UASTC Level 3 blender: Pabellon Barcelona - CPU-Only hammerdb-mariadb: 128 - 500 astcenc: Exhaustive botan: KASUMI - Decrypt mnn: SqueezeNetV1.0 blender: Barbershop - CPU-Only botan: Blowfish - Decrypt hammerdb-mariadb: 128 - 500 sysbench: CPU mysqlslap: 512 mysqlslap: 256 cp2k: Fayalite-FIST hammerdb-mariadb: 128 - 250 hammerdb-mariadb: 128 - 250 hammerdb-mariadb: 64 - 250 hammerdb-mariadb: 64 - 250 hammerdb-mariadb: 32 - 500 hammerdb-mariadb: 32 - 500 hammerdb-mariadb: 32 - 250 hammerdb-mariadb: 32 - 250 hammerdb-mariadb: 16 - 500 hammerdb-mariadb: 16 - 500 hammerdb-mariadb: 16 - 250 hammerdb-mariadb: 16 - 250 hammerdb-mariadb: 8 - 500 hammerdb-mariadb: 8 - 500 hammerdb-mariadb: 8 - 250 hammerdb-mariadb: 8 - 250 mnn: MobileNetV2_224 mnn: resnet-v2-50 toktx: Zstd Compression 9 mysqlslap: 1 viennacl: CPU BLAS - dGEMM-TT viennacl: CPU BLAS - dGEMM-TN viennacl: CPU BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - dGEMV-N viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dCOPY viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - sAXPY viennacl: CPU BLAS - sCOPY onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU avifenc: 10 svt-vp9: VMAF Optimized - Bosphorus 1080p luxcorerender: Rainbow Colors and Prism - CPU gnuradio: Hilbert Transform gnuradio: FM Deemphasis Filter gnuradio: IIR Filter gnuradio: FIR Filter gnuradio: Signal Source (Cosine) gnuradio: Five Back to Back FIR Filters luaradio: Five Back to Back FIR Filters r1 r1a r2 r2a r2b r3 r4 r5 7.49467 15.09 29.20 7.37 33.07 401.29 499.23 35.1 327.87 290.67 114.550 7.84 36.91 7.42 2.74370996 11.3586022 
313.920451 13.247 32.113 31.539 546.8 8.852 62.160 24.382 57.975 410.0 101.101 19299.5 20.952 9.70 216.323 57792000 48051.5 76.9 7879 181644819 35.918 0.877815 885320000 0.918568 14.36 441953333 1.10991 7018 64298 194684 4642.1 161.634619 0.398282 80.3 1.80046 6850 1.21594 1735100000 217643333 442422.3 801.409 145.717 3.57247 110713333 3415933333 2.07944 7318 2.96135 0.593042 0.864164 0.215115 183.4 5669.700 0.239989 77.287 115.972 623.494 3267133333 619.458 225412 363.038 289.126 0.338327 3144800000 447.971 1.24809 3.53026 77.872 445.144 439496.74 459038.6 459455.38 120300000 356476.2 445.519 425933.7 792.831 426148.96 357285.28 116.074 5663.055 325766.94 324377.2 292.736 57190 74.320 363.255 173288 167809 55415 191397 63279 208419 68818 209254 69054 195258 64477 192913 63757 285984 94379 290082 95768 76.3 76.0 75.6 73.5 719 72.3 720 1058 843 620 1003 1834 804.392 0.210919 5.477 386.29 17.04 459.3 734.0 610.6 603.0 2183.5 1024.3 1094.8 7.50059 125.25 103.92 21.25 28.66 15.19 28.99 7.55 32.51 408.24 493.51 33.0 6.89 4.17 329.53 288.99 113.800 0.51 8.04 0.19 37.34 7.55 2.73859096 11.2727114 311.960785 13.328 31.624 31.479 548.2 8.812 61.930 24.360 57.710 409.6 100.446 19452.0 20.379 9.61 215.760 50166.1 77.3 7724 186263552 35.009 0.879137 890273333 0.912279 14.26 1.12224 6980 62311 188761 4642.8 156.969016 0.395588 80.3 1.79881 6964 1.22278 1736800000 442843.2 804.323 145.550 3.57662 3352733333 2.08532 7308 2.96857 0.595661 0.863214 0.213643 184.2 5670.809 0.240122 77.310 115.970 623.198 3263700000 619.538 225366 363.615 288.852 0.341663 3162066667 447.308 1.25267 3.54367 78.159 446.936 441408.09 456260.3 456629.89 120133333 358385.5 447.436 424096.6 791.927 424612.62 358364.56 116.069 5663.612 325184.58 323924.2 292.374 57242 74.288 363.326 173228 77.2 77.4 76.8 72.3 319 63.6 371 392 335 277 370 504 793.363 0.210728 5.505 393.46 13.34 459.1 727.4 609.5 604.8 2175.3 1015.2 1094.5 67.5 32.5 442144.2 442460.05 456408.6 456545.88 358269.7 424077.3 424818.83 358456.09 
325260.41 323826.9 1374.663 28.4023 43.26 36.20 7.45 10.39 5.97 12.03 3.22 14.30 182.17 234.51 3.30 2.01 164.32 158.16 191.746 0.32 5.84 0.14 27.80 5.73 3.02281992 11.5617158 307.622108 16.065 38.395 38.372 458.7 10.282 71.928 27.997 64.971 370.1 110.930 19311.1 21.575 9.27 226.440 3.213 56230333 49908.3 75.0 8050 181554218 36.424 0.869978 862890000 0.943624 14.28 428100000 1.11874 7149 4524.5 160.262559 0.403409 78.2 1.81774 6984 1.23796 1699333333 1614 213203333 440454.7 808.289 148.484 3.64232 110173333 3400066667 5.664 2.11712 7412 53.073 3.00464 192 0.602122 19.781 0.874080 0.216806 181.6 5606.967 0.243026 76.286 13.979 114.663 615.806 3227433333 612.438 225343 362.926 7001 288.562 0.341893 3131866667 446.389 1.25313 3.53121 78.33 447.287 46.38 6126 71.78 441732.77 459309.8 459226.53 120733333 7.1887 357742.9 447.701 34.237 1413 29.56 425925.6 789.836 425997.22 12510.56 357774.43 116.080 403 5662.763 885 11.251 9.2907 56.660 10.011 325409.99 324209.8 1264 292.396 17.163 88.57 16.3621 74.275 7.174 110.02 363.196 214210.83 166 160 4.078 48.732 3.470 3336 54.7 62.3 59.8 61.9 389.9 62.3 447.65 507.1 422.2 349 474 691 791.695 0.210324 6.656 182.26 13.42 357.4 645.8 498.2 470.0 1684.4 111.2 804.5 28.1815 43.42 36.06 7.38 10.39 5.97 11.94 3.20 14.06 181.52 234.39 67.6 3.36 2.05 164.51 157.83 192.245 0.33 5.92 0.15 28.22 5.65 3.56592774 14.5982965 386.390001 16.615 38.590 38.313 458.2 10.088 71.130 28.018 65.960 370.3 111.790 20652.9 21.369 9.24 226.199 57197667 49813.4 76.1 8048 189214499 35.581 0.901823 865410000 0.936941 13.89 432170000 1.14578 7203 4504.5 159.187038 0.406877 78.2 1.84339 7003 1.24508 1704500000 1580 215343333 449554.1 796.689 147.163 3.64033 111510000 3411000000 2.10841 7439 3.00929 189 0.602314 0.874968 0.216586 181.6 5593.366 0.243308 76.407 114.517 616.501 3232700000 612.149 225291 359.452 286.180 0.341955 3143300000 450.648 1.24176 3.56224 78.079 447.144 440939.22 457190.5 457141.24 120833333 358463.7 446.917 1420 424904.5 793.080 424925.84 
358268.00 115.723 404 5662.342 887 325218.50 324227.4 1262 292.827 74.309 363.314 3458 61.7 66.9 68.9 66.4 647 64.3 713.47 1024.2 913 532 862 1135 793.916 0.218349 6.597 185.53 16.47 408.0 621.0 487.4 502.0 1723.9 580.5 662.8 28.4613 42.37 36.35 7.43 10.54 6.00 12.10 3.23 14.73 179.13 233.96 67.8 3.36 2.10 162.21 156.26 193.839 0.33 5.87 0.14 28.01 5.68 3.57278153 14.6577489 389.698280 16.211 38.507 37.796 452.7 10.208 70.758 28.094 65.888 368.0 111.673 20574.6 21.313 9.25 224.290 3.362 55251667 49937.3 78.3 8037 186013261 35.503 0.875421 860046667 0.940714 13.94 432013333 1.11811 7141 4525.7 159.237752 0.402919 78.4 1.81913 7016 1.24116 1697500000 216773333 446396.0 792.296 146.909 3.64319 109430000 3398800000 5.562 2.10837 7429 52.227 3.00907 0.602038 20.082 0.876227 0.215085 183.7 5611.995 0.242450 76.403 14.159 114.646 619.638 3245666667 615.975 222747 359.573 7082 286.004 0.340243 3140266667 446.536 1.24222 3.54783 78.539 448.906 46.73 6170 72.29 440315.41 458941.9 458790.96 120666667 7.1472 358110.5 447.958 34.420 29.69 425822.1 792.049 425848.09 12553.44 357925.98 116.070 5650.139 11.226 9.3091 56.770 10.029 325314.62 324112.8 292.610 17.185 88.68 16.3729 74.292 7.170 109.96 363.279 214241.34 4.100 48.041 3.697 63.7 67.6 72.4 70.8 647 70.2 765 1158 936 535 855 1167 811.941 0.217941 6.746 184.07 14.79 373.8 622.0 487.7 515.6 1619.2 487.9 706.1 68.1 448800.1 440205.22 458830.6 458756.46 357722.7 425508.1 425467.51 357550.82 325312.30 324234.5 OpenBenchmarking.org
oneDNN 2.1.2
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
r1: 7.49467 (SE +/- 0.02080, N = 3; MIN: 6.98)
r1a: 7.50059 (SE +/- 0.01835, N = 3; MIN: 6.91)
r2b: 28.40230 (SE +/- 0.31773, N = 13; MIN: 14.66)
r3: 28.18150 (SE +/- 0.30585, N = 15; MIN: 14.34)
r4: 28.46130 (SE +/- 0.38629, N = 12; MIN: 14.76)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

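When comparing runs, it can help to turn the reported means and standard errors into ratios and a rough significance check. The sketch below uses the oneDNN Deconvolution Batch shapes_1d (f32) figures reported above. The function names and the two-standard-error threshold are assumptions made for illustration, not part of the Phoronix Test Suite output.

```python
from math import sqrt

# Values copied from the oneDNN Deconvolution Batch shapes_1d (f32) result.
# Each entry is (mean in ms, standard error, number of samples).
results = {
    "r1": (7.49467, 0.02080, 3),
    "r1a": (7.50059, 0.01835, 3),
    "r2b": (28.40230, 0.31773, 13),
    "r3": (28.18150, 0.30585, 15),
    "r4": (28.46130, 0.38629, 12),
}


def slowdown(run, baseline="r1"):
    """Ratio of a run's mean time to the baseline's mean (fewer ms is better)."""
    return results[run][0] / results[baseline][0]


def differs(a, b, k=2.0):
    """Rough check: is the gap between two means larger than k combined standard errors?"""
    mean_a, se_a, _ = results[a]
    mean_b, se_b, _ = results[b]
    return abs(mean_a - mean_b) > k * sqrt(se_a ** 2 + se_b ** 2)


if __name__ == "__main__":
    for run in results:
        print(f"{run}: {slowdown(run):.2f}x the r1 time, differs from r1: {differs(run, 'r1')}")
```

With these inputs, r1 and r1a are indistinguishable under this check, as are r2b, r3, and r4 among themselves, while the two groups differ by a factor of roughly 3.8. The check treats the runs as independent and is only a coarse screen, not a formal test.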
AOM AV1 3.0
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1a: 125.25 (SE +/- 0.82, N = 15)
r2b: 43.26 (SE +/- 0.49, N = 3)
r3: 43.42 (SE +/- 0.31, N = 15)
r4: 42.37 (SE +/- 0.28, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1a: 103.92 (SE +/- 1.01, N = 15)
r2b: 36.20 (SE +/- 0.19, N = 3)
r3: 36.06 (SE +/- 0.26, N = 3)
r4: 36.35 (SE +/- 0.27, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1a: 21.25 (SE +/- 0.17, N = 3)
r2b: 7.45 (SE +/- 0.01, N = 3)
r3: 7.38 (SE +/- 0.06, N = 3)
r4: 7.43 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1a: 28.66 (SE +/- 0.06, N = 3)
r2b: 10.39 (SE +/- 0.03, N = 3)
r3: 10.39 (SE +/- 0.01, N = 3)
r4: 10.54 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
r1: 15.09 (SE +/- 0.05, N = 3)
r1a: 15.19 (SE +/- 0.03, N = 3)
r2b: 5.97 (SE +/- 0.06, N = 3)
r3: 5.97 (SE +/- 0.07, N = 12)
r4: 6.00 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
r1: 29.20 (SE +/- 0.19, N = 3)
r1a: 28.99 (SE +/- 0.29, N = 5)
r2b: 12.03 (SE +/- 0.08, N = 15)
r3: 11.94 (SE +/- 0.12, N = 15)
r4: 12.10 (SE +/- 0.17, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
Frames Per Second, More Is Better
r1: 7.37 (SE +/- 0.09, N = 15)
r1a: 7.55 (SE +/- 0.06, N = 3)
r2b: 3.22 (SE +/- 0.03, N = 9)
r3: 3.20 (SE +/- 0.04, N = 3)
r4: 3.23 (SE +/- 0.03, N = 5)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
r1: 33.07 (SE +/- 0.28, N = 3)
r1a: 32.51 (SE +/- 0.28, N = 3)
r2b: 14.30 (SE +/- 0.15, N = 15)
r3: 14.06 (SE +/- 0.18, N = 4)
r4: 14.73 (SE +/- 0.08, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

SVT-VP9 0.3
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1: 401.29 (SE +/- 1.44, N = 3)
r1a: 408.24 (SE +/- 0.66, N = 3)
r2b: 182.17 (SE +/- 0.90, N = 3)
r3: 181.52 (SE +/- 2.25, N = 3)
r4: 179.13 (SE +/- 0.47, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1: 499.23 (SE +/- 3.80, N = 3)
r1a: 493.51 (SE +/- 4.78, N = 3)
r2b: 234.51 (SE +/- 2.64, N = 4)
r3: 234.39 (SE +/- 1.80, N = 10)
r4: 233.96 (SE +/- 1.14, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

Intel Memory Latency Checker
Test: Idle Latency
ns, Fewer Is Better
r1: 35.1 (SE +/- 0.10, N = 3)
r1a: 33.0 (SE +/- 0.39, N = 3)
r2: 67.5 (SE +/- 0.09, N = 3)
r2a: 32.5 (SE +/- 0.28, N = 8)
r3: 67.6 (SE +/- 0.12, N = 3)
r4: 67.8 (SE +/- 0.12, N = 3)
r5: 68.1 (SE +/- 0.09, N = 3)

AOM AV1 3.0
Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1a: 6.89 (SE +/- 0.02, N = 3)
r2b: 3.30 (SE +/- 0.03, N = 3)
r3: 3.36 (SE +/- 0.04, N = 5)
r4: 3.36 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0
Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K
Frames Per Second, More Is Better
r1a: 4.17 (SE +/- 0.03, N = 3)
r2b: 2.01 (SE +/- 0.03, N = 3)
r3: 2.05 (SE +/- 0.02, N = 9)
r4: 2.10 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

SVT-VP9 0.3
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1: 327.87 (SE +/- 1.20, N = 3)
r1a: 329.53 (SE +/- 1.10, N = 3)
r2b: 164.32 (SE +/- 1.13, N = 3)
r3: 164.51 (SE +/- 1.63, N = 3)
r4: 162.21 (SE +/- 1.59, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1: 290.67 (SE +/- 1.68, N = 3)
r1a: 288.99 (SE +/- 1.37, N = 3)
r2b: 158.16 (SE +/- 1.76, N = 5)
r3: 157.83 (SE +/- 1.64, N = 3)
r4: 156.26 (SE +/- 1.22, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

Timed Erlang/OTP Compilation 23.2
Time To Compile
Seconds, Fewer Is Better
r1: 114.55 (SE +/- 0.18, N = 3)
r1a: 113.80 (SE +/- 0.37, N = 3)
r2b: 191.75 (SE +/- 1.08, N = 3)
r3: 192.25 (SE +/- 0.31, N = 3)
r4: 193.84 (SE +/- 1.56, N = 3)

AOM AV1 3.0
Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1a: 0.51 (SE +/- 0.00, N = 3)
r2b: 0.32 (SE +/- 0.00, N = 3)
r3: 0.33 (SE +/- 0.00, N = 3)
r4: 0.33 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

LuxCoreRender 2.5
Scene: LuxCore Benchmark - Acceleration: CPU
M samples/sec, More Is Better
r1: 7.84 (SE +/- 0.05, N = 3; MIN: 3.44 / MAX: 9.2)
r1a: 8.04 (SE +/- 0.01, N = 3; MIN: 3.51 / MAX: 9.33)
r2b: 5.84 (SE +/- 0.02, N = 3; MIN: 1.16 / MAX: 7.97)
r3: 5.92 (SE +/- 0.01, N = 3; MIN: 1.15 / MAX: 7.98)
r4: 5.87 (SE +/- 0.03, N = 3; MIN: 1.15 / MAX: 7.95)

AOM AV1 3.0
Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K
Frames Per Second, More Is Better
r1a: 0.19 (SE +/- 0.00, N = 5)
r2b: 0.14 (SE +/- 0.00, N = 12)
r3: 0.15 (SE +/- 0.00, N = 3)
r4: 0.14 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

SVT-HEVC 1.5.0
Tuning: 1 - Input: Bosphorus 1080p
Frames Per Second, More Is Better
r1: 36.91 (SE +/- 0.29, N = 3)
r1a: 37.34 (SE +/- 0.24, N = 3)
r2b: 27.80 (SE +/- 0.09, N = 3)
r3: 28.22 (SE +/- 0.14, N = 3)
r4: 28.01 (SE +/- 0.31, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

LuxCoreRender 2.5
Scene: Danish Mood - Acceleration: CPU
M samples/sec, More Is Better
r1: 7.42 (SE +/- 0.08, N = 3; MIN: 3.2 / MAX: 8.74)
r1a: 7.55 (SE +/- 0.10, N = 3; MIN: 3.28 / MAX: 8.86)
r2b: 5.73 (SE +/- 0.04, N = 3; MIN: 1.3 / MAX: 7.65)
r3: 5.65 (SE +/- 0.07, N = 3; MIN: 1.24 / MAX: 7.63)
r4: 5.68 (SE +/- 0.04, N = 3; MIN: 1.26 / MAX: 7.6)

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds, Fewer Is Better
r1: 2.74370996 (SE +/- 0.00774937, N = 3)
r1a: 2.73859096 (SE +/- 0.01532048, N = 3)
r2b: 3.02281992 (SE +/- 0.02799890, N = 3)
r3: 3.56592774 (SE +/- 0.03072276, N = 15)
r4: 3.57278153 (SE +/- 0.02850005, N = 15)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 193 Cells Per Direction
Seconds, Fewer Is Better
r1: 11.36 (SE +/- 0.02, N = 3)
r1a: 11.27 (SE +/- 0.03, N = 3)
r2b: 11.56 (SE +/- 0.04, N = 3)
r3: 14.60 (SE +/- 0.03, N = 3)
r4: 14.66 (SE +/- 0.02, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d 2021-03-11
Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better
r1: 313.92 (SE +/- 0.46, N = 3)
r1a: 311.96 (SE +/- 0.12, N = 3)
r2b: 307.62 (SE +/- 2.73, N = 9)
r3: 386.39 (SE +/- 4.39, N = 9)
r4: 389.70 (SE +/- 3.91, N = 9)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

libavif avifenc 0.9.0
Encoder Speed: 6
Seconds, Fewer Is Better
r1: 13.25 (SE +/- 0.05, N = 3)
r1a: 13.33 (SE +/- 0.08, N = 3)
r2b: 16.07 (SE +/- 0.23, N = 3)
r3: 16.62 (SE +/- 0.13, N = 15)
r4: 16.21 (SE +/- 0.12, N = 15)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc 0.9.0
Encoder Speed: 6, Lossless
Seconds, Fewer Is Better
r1: 32.11 (SE +/- 0.04, N = 3)
r1a: 31.62 (SE +/- 0.09, N = 3)
r2b: 38.40 (SE +/- 0.24, N = 3)
r3: 38.59 (SE +/- 0.35, N = 3)
r4: 38.51 (SE +/- 0.36, N = 6)
(CXX) g++ options: -O3 -fPIC -lm

libavif avifenc 0.9.0
Encoder Speed: 2
Seconds, Fewer Is Better
r1: 31.54 (SE +/- 0.10, N = 3)
r1a: 31.48 (SE +/- 0.04, N = 3)
r2b: 38.37 (SE +/- 0.40, N = 3)
r3: 38.31 (SE +/- 0.20, N = 3)
r4: 37.80 (SE +/- 0.08, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

LuaRadio 0.9.1
Test: Complex Phase
MiB/s, More Is Better
r1: 546.8 (SE +/- 0.25, N = 3)
r1a: 548.2 (SE +/- 0.71, N = 3)
r2b: 458.7 (SE +/- 3.61, N = 9)
r3: 458.2 (SE +/- 4.31, N = 6)
r4: 452.7 (SE +/- 4.50, N = 6)

libavif avifenc 0.9.0
Encoder Speed: 10, Lossless
Seconds, Fewer Is Better
r1: 8.852 (SE +/- 0.036, N = 3)
r1a: 8.812 (SE +/- 0.016, N = 3)
r2b: 10.282 (SE +/- 0.154, N = 15)
r3: 10.088 (SE +/- 0.130, N = 15)
r4: 10.208 (SE +/- 0.157, N = 15)
(CXX) g++ options: -O3 -fPIC -lm

Timed Wasmer Compilation 1.0.2
Time To Compile
Seconds, Fewer Is Better
r1: 62.16 (SE +/- 0.22, N = 3)
r1a: 61.93 (SE +/- 0.62, N = 3)
r2b: 71.93 (SE +/- 0.42, N = 3)
r3: 71.13 (SE +/- 0.66, N = 7)
r4: 70.76 (SE +/- 0.51, N = 3)
(CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

Timed Linux Kernel Compilation 5.10.20
Time To Compile
Seconds, Fewer Is Better
r1: 24.38 (SE +/- 0.30, N = 4)
r1a: 24.36 (SE +/- 0.28, N = 4)
r2b: 28.00 (SE +/- 0.32, N = 14)
r3: 28.02 (SE +/- 0.41, N = 14)
r4: 28.09 (SE +/- 0.37, N = 14)

libavif avifenc 0.9.0
Encoder Speed: 0
Seconds, Fewer Is Better
r1: 57.98 (SE +/- 0.21, N = 3)
r1a: 57.71 (SE +/- 0.24, N = 3)
r2b: 64.97 (SE +/- 0.22, N = 3)
r3: 65.96 (SE +/- 0.20, N = 3)
r4: 65.89 (SE +/- 0.68, N = 3)
(CXX) g++ options: -O3 -fPIC -lm

LuaRadio 0.9.1
Test: FM Deemphasis Filter
MiB/s, More Is Better
r1: 410.0 (SE +/- 0.21, N = 3)
r1a: 409.6 (SE +/- 1.40, N = 3)
r2b: 370.1 (SE +/- 5.30, N = 9)
r3: 370.3 (SE +/- 4.83, N = 6)
r4: 368.0 (SE +/- 1.19, N = 6)

Timed Node.js Compilation 15.11
Time To Compile
Seconds, Fewer Is Better
r1: 101.10 (SE +/- 0.27, N = 3)
r1a: 100.45 (SE +/- 0.29, N = 3)
r2b: 110.93 (SE +/- 0.50, N = 3)
r3: 111.79 (SE +/- 0.68, N = 3)
r4: 111.67 (SE +/- 0.78, N = 3)

Xmrig 6.12.1
Variant: Monero - Hash Count: 1M
H/s, More Is Better
r1: 19299.5 (SE +/- 23.28, N = 3)
r1a: 19452.0 (SE +/- 20.55, N = 3)
r2b: 19311.1 (SE +/- 151.73, N = 3)
r3: 20652.9 (SE +/- 245.77, N = 3)
r4: 20574.6 (SE +/- 243.31, N = 15)
(CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Timed Mesa Compilation 21.0
Time To Compile
Seconds, Fewer Is Better
r1: 20.95 (SE +/- 0.02, N = 3)
r1a: 20.38 (SE +/- 0.12, N = 3)
r2b: 21.58 (SE +/- 0.04, N = 3)
r3: 21.37 (SE +/- 0.15, N = 3)
r4: 21.31 (SE +/- 0.11, N = 3)

LuxCoreRender 2.5
Scene: DLSC - Acceleration: CPU
M samples/sec, More Is Better
r1: 9.70 (SE +/- 0.09, N = 3; MIN: 8.98 / MAX: 12.22)
r1a: 9.61 (SE +/- 0.09, N = 15; MIN: 8 / MAX: 12.27)
r2b: 9.27 (SE +/- 0.08, N = 15; MIN: 8.31 / MAX: 11.98)
r3: 9.24 (SE +/- 0.10, N = 3; MIN: 8.74 / MAX: 11.37)
r4: 9.25 (SE +/- 0.09, N = 3; MIN: 8.59 / MAX: 11.4)

Timed LLVM Compilation 12.0
Build System: Unix Makefiles
Seconds, Fewer Is Better
r1: 216.32 (SE +/- 0.91, N = 3)
r1a: 215.76 (SE +/- 0.80, N = 3)
r2b: 226.44 (SE +/- 0.77, N = 3)
r3: 226.20 (SE +/- 1.24, N = 3)
r4: 224.29 (SE +/- 0.43, N = 3)

Mobile Neural Network 1.1.3
Model: mobilenet-v1-1.0
ms, Fewer Is Better
r2b: 3.213 (SE +/- 0.089, N = 3; MIN: 2.8 / MAX: 6.7)
r4: 3.362 (SE +/- 0.021, N = 12; MIN: 2.98 / MAX: 6.66)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Liquid-DSP 2021.01.31
Threads: 1 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
r1: 57792000 (SE +/- 173700.89, N = 3)
r2b: 56230333 (SE +/- 613156.95, N = 3)
r3: 57197667 (SE +/- 550708.74, N = 3)
r4: 55251667 (SE +/- 534784.17, N = 3)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Xmrig 6.12.1
Variant: Wownero - Hash Count: 1M
H/s, More Is Better
r1: 48051.5 (SE +/- 425.40, N = 7)
r1a: 50166.1 (SE +/- 588.34, N = 3)
r2b: 49908.3 (SE +/- 238.38, N = 3)
r3: 49813.4 (SE +/- 358.18, N = 3)
r4: 49937.3 (SE +/- 235.04, N = 3)
(CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

srsLTE 20.10.1
Test: PHY_DL_Test
UE Mb/s, More Is Better
r1: 76.9 (SE +/- 0.76, N = 3)
r1a: 77.3 (SE +/- 1.16, N = 3)
r2b: 75.0 (SE +/- 0.38, N = 3)
r3: 76.1 (SE +/- 1.14, N = 3)
r4: 78.3 (SE +/- 0.62, N = 3)
(CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f

toyBrot Fractal Generator 2020-11-18
Implementation: C++ Tasks
ms, Fewer Is Better
r1: 7879 (SE +/- 43.45, N = 3)
r1a: 7724 (SE +/- 80.44, N = 4)
r2b: 8050 (SE +/- 102.03, N = 3)
r3: 8048 (SE +/- 93.55, N = 4)
r4: 8037 (SE +/- 85.46, N = 4)
(CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc

Stockfish 13
Total Time
Nodes Per Second, More Is Better
r1: 181644819 (SE +/- 1585265.68, N = 15)
r1a: 186263552 (SE +/- 2404481.41, N = 3)
r2b: 181554218 (SE +/- 1982639.48, N = 3)
r3: 189214499 (SE +/- 1924842.52, N = 3)
r4: 186013261 (SE +/- 2183262.34, N = 4)
(CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver

VOSK Speech Recognition Toolkit 0.3.21 (Seconds, Fewer Is Better). r1: 35.92 (SE +/- 0.32, N = 3); r1a: 35.01 (SE +/- 0.29, N = 8); r2b: 36.42 (SE +/- 0.43, N = 3); r3: 35.58 (SE +/- 0.43, N = 3); r4: 35.50 (SE +/- 0.32, N = 3).
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.877815 (SE +/- 0.006225, N = 3, MIN: 0.82); r1a: 0.879137 (SE +/- 0.003986, N = 3, MIN: 0.83); r2b: 0.869978 (SE +/- 0.004902, N = 3, MIN: 0.82); r3: 0.901823 (SE +/- 0.006631, N = 3, MIN: 0.84); r4: 0.875421 (SE +/- 0.005244, N = 3, MIN: 0.82). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Liquid-DSP 2021.01.31 Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 885320000 (SE +/- 691953.76, N = 3); r1a: 890273333 (SE +/- 669162.00, N = 3); r2b: 862890000 (SE +/- 3620722.76, N = 3); r3: 865410000 (SE +/- 859903.10, N = 3); r4: 860046667 (SE +/- 10609570.10, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.918568 (SE +/- 0.002101, N = 3, MIN: 0.85); r1a: 0.912279 (SE +/- 0.002111, N = 3, MIN: 0.86); r2b: 0.943624 (SE +/- 0.011253, N = 3, MIN: 0.86); r3: 0.936941 (SE +/- 0.007264, N = 3, MIN: 0.85); r4: 0.940714 (SE +/- 0.008450, N = 3, MIN: 0.86). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LuxCoreRender 2.5 Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better). r1: 14.36 (SE +/- 0.13, N = 3, MIN: 11.58 / MAX: 19.44); r1a: 14.26 (SE +/- 0.21, N = 3, MIN: 11.6 / MAX: 19.3); r2b: 14.28 (SE +/- 0.18, N = 3, MIN: 11.93 / MAX: 17.73); r3: 13.89 (SE +/- 0.12, N = 15, MIN: 11.08 / MAX: 17.77); r4: 13.94 (SE +/- 0.13, N = 15, MIN: 11.06 / MAX: 17.84).
Liquid-DSP 2021.01.31 Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 441953333 (SE +/- 422150.58, N = 3); r2b: 428100000 (SE +/- 2458908.97, N = 3); r3: 432170000 (SE +/- 1240739.03, N = 3); r4: 432013333 (SE +/- 2739929.03, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 1.10991 (SE +/- 0.00274, N = 3, MIN: 1.02); r1a: 1.12224 (SE +/- 0.00124, N = 3, MIN: 1.02); r2b: 1.11874 (SE +/- 0.00330, N = 3, MIN: 1.02); r3: 1.14578 (SE +/- 0.00975, N = 3, MIN: 1.04); r4: 1.11811 (SE +/- 0.01182, N = 3, MIN: 1.02). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
toyBrot Fractal Generator 2020-11-18 Implementation: C++ Threads (ms, Fewer Is Better). r1: 7018 (SE +/- 49.12, N = 3); r1a: 6980 (SE +/- 29.96, N = 3); r2b: 7149 (SE +/- 89.67, N = 3); r3: 7203 (SE +/- 98.76, N = 3); r4: 7141 (SE +/- 76.94, N = 4). 1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
HammerDB - MariaDB 10.5.9 Virtual Users: 64 - Warehouses: 500 (New Orders Per Minute, More Is Better). r1: 64298 (SE +/- 620.04, N = 3); r1a: 62311 (SE +/- 730.55, N = 9). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 Virtual Users: 64 - Warehouses: 500 (Transactions Per Minute, More Is Better). r1: 194684 (SE +/- 2149.33, N = 3); r1a: 188761 (SE +/- 2084.32, N = 9). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
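When only two runs are available, as with the HammerDB r1 and r1a figures, it can help to put the gap between them next to the reported standard errors. The sketch below is illustrative only: the helper name `compare` is hypothetical, it assumes the two runs are independent, and the "combined SE" ratio is a rough heuristic rather than anything the Phoronix Test Suite itself reports.

```python
import math

def compare(mean_a, se_a, mean_b, se_b):
    """Return (percent change of b relative to a, difference in units of combined SE)."""
    diff = mean_b - mean_a
    pct = 100.0 * diff / mean_a
    # Assumption: the runs are independent, so their standard errors add in quadrature.
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return pct, diff / combined_se

# Transactions Per Minute for r1 and r1a, taken from the HammerDB result above.
pct, ratio = compare(194684, 2149.33, 188761, 2084.32)
print(f"r1a vs r1: {pct:+.2f}% ({ratio:+.2f} combined SE)")
```

A ratio well inside plus or minus 2 suggests the difference is comparable to run-to-run noise; a larger magnitude suggests it may be worth a closer look.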
GNU GMP GMPbench 6.2.1 Total Time (GMPbench Score, More Is Better). r1: 4642.1; r1a: 4642.8; r2b: 4524.5; r3: 4504.5; r4: 4525.7. 1. (CC) gcc options: -O3 -fomit-frame-pointer -lm
libjpeg-turbo tjbench 2.1.0 Test: Decompression Throughput (Megapixels/sec, More Is Better). r1: 161.63 (SE +/- 0.15, N = 3); r1a: 156.97 (SE +/- 0.39, N = 3); r2b: 160.26 (SE +/- 0.07, N = 3); r3: 159.19 (SE +/- 1.04, N = 3); r4: 159.24 (SE +/- 0.47, N = 3). 1. (CC) gcc options: -O3 -rdynamic
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.398282 (SE +/- 0.001135, N = 3, MIN: 0.37); r1a: 0.395588 (SE +/- 0.001124, N = 3, MIN: 0.36); r2b: 0.403409 (SE +/- 0.004259, N = 4, MIN: 0.36); r3: 0.406877 (SE +/- 0.003204, N = 10, MIN: 0.37); r4: 0.402919 (SE +/- 0.002415, N = 14, MIN: 0.36). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LuaRadio 0.9.1 Test: Hilbert Transform (MiB/s, More Is Better). r1: 80.3 (SE +/- 0.00, N = 3); r1a: 80.3 (SE +/- 0.00, N = 3); r2b: 78.2 (SE +/- 0.41, N = 9); r3: 78.2 (SE +/- 0.47, N = 6); r4: 78.4 (SE +/- 0.61, N = 6).
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 1.80046 (SE +/- 0.00580, N = 3, MIN: 1.68); r1a: 1.79881 (SE +/- 0.00121, N = 3, MIN: 1.69); r2b: 1.81774 (SE +/- 0.01382, N = 3, MIN: 1.69); r3: 1.84339 (SE +/- 0.02043, N = 3, MIN: 1.67); r4: 1.81913 (SE +/- 0.00968, N = 3, MIN: 1.68). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
toyBrot Fractal Generator 2020-11-18 Implementation: TBB (ms, Fewer Is Better). r1: 6850 (SE +/- 59.06, N = 15); r1a: 6964 (SE +/- 80.68, N = 3); r2b: 6984 (SE +/- 73.83, N = 15); r3: 7003 (SE +/- 69.20, N = 15); r4: 7016 (SE +/- 81.70, N = 15). 1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 1.21594 (SE +/- 0.01080, N = 15, MIN: 0.84); r1a: 1.22278 (SE +/- 0.01126, N = 15, MIN: 0.85); r2b: 1.23796 (SE +/- 0.01174, N = 15, MIN: 0.87); r3: 1.24508 (SE +/- 0.01066, N = 15, MIN: 0.89); r4: 1.24116 (SE +/- 0.00891, N = 15, MIN: 0.85). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Liquid-DSP 2021.01.31 Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 1735100000; r1a: 1736800000; r2b: 1699333333; r3: 1704500000; r4: 1697500000. The export lists only four SE entries for these five runs, so they are given in their original order without attribution to a run: SE +/- 3951371.07, N = 3; SE +/- 2515949.13, N = 3; SE +/- 10121648.97, N = 3; SE +/- 6582552.70, N = 3. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
MariaDB 10.5.2 Clients: 4 (Queries Per Second, More Is Better). r2b: 1614 (SE +/- 16.07, N = 3); r3: 1580 (SE +/- 7.20, N = 3). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Liquid-DSP 2021.01.31 Threads: 4 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 217643333 (SE +/- 1090112.12, N = 3); r2b: 213203333 (SE +/- 824809.74, N = 3); r3: 215343333 (SE +/- 1663583.82, N = 3); r4: 216773333 (SE +/- 1956802.95, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Intel Memory Latency Checker Test: Peak Injection Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better). r1: 442422.3 (SE +/- 1187.16, N = 3); r1a: 442843.2 (SE +/- 148.63, N = 3); r2a: 442144.2 (SE +/- 212.40, N = 3); r2b: 440454.7 (SE +/- 314.54, N = 3); r3: 449554.1 (SE +/- 138.13, N = 3); r4: 446396.0 (SE +/- 1601.80, N = 3); r5: 448800.1 (SE +/- 847.23, N = 3).
oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 801.41 (SE +/- 7.46, N = 3, MIN: 767.38); r1a: 804.32 (SE +/- 4.49, N = 3, MIN: 765.37); r2b: 808.29 (SE +/- 9.76, N = 3, MIN: 767.97); r3: 796.69 (SE +/- 1.09, N = 3, MIN: 771.28); r4: 792.30 (SE +/- 2.67, N = 3, MIN: 763.96). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Timed LLVM Compilation 12.0 Build System: Ninja (Seconds, Fewer Is Better). r1: 145.72 (SE +/- 0.52, N = 3); r1a: 145.55 (SE +/- 0.75, N = 3); r2b: 148.48 (SE +/- 1.12, N = 3); r3: 147.16 (SE +/- 0.32, N = 3); r4: 146.91 (SE +/- 0.56, N = 3).
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 3.57247 (SE +/- 0.00924, N = 3, MIN: 3.53); r1a: 3.57662 (SE +/- 0.00795, N = 3, MIN: 3.5); r2b: 3.64232 (SE +/- 0.05421, N = 14, MIN: 3.51); r3: 3.64033 (SE +/- 0.05675, N = 14, MIN: 3.47); r4: 3.64319 (SE +/- 0.05617, N = 14, MIN: 3.5). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Liquid-DSP 2021.01.31 Threads: 2 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 110713333 (SE +/- 729984.78, N = 3); r2b: 110173333 (SE +/- 907677.13, N = 3); r3: 111510000 (SE +/- 430348.70, N = 3); r4: 109430000 (SE +/- 132035.35, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 3415933333 (SE +/- 8088331.79, N = 3); r1a: 3352733333 (SE +/- 38975091.76, N = 3); r2b: 3400066667 (SE +/- 14312737.14, N = 3); r3: 3411000000 (SE +/- 6896617.53, N = 3); r4: 3398800000 (SE +/- 16537936.19, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
KTX-Software toktx 4.0 Settings: UASTC 3 (Seconds, Fewer Is Better). r2b: 5.664 (SE +/- 0.053, N = 15); r4: 5.562 (SE +/- 0.008, N = 3).
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 2.07944 (SE +/- 0.00138, N = 3, MIN: 2.03); r1a: 2.08532 (SE +/- 0.00168, N = 3, MIN: 2.03); r2b: 2.11712 (SE +/- 0.01980, N = 3, MIN: 2.03); r3: 2.10841 (SE +/- 0.01943, N = 3, MIN: 2.03); r4: 2.10837 (SE +/- 0.01801, N = 3, MIN: 2.03). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
toyBrot Fractal Generator 2020-11-18 Implementation: OpenMP (ms, Fewer Is Better). r1: 7318 (SE +/- 5.13, N = 3); r1a: 7308 (SE +/- 0.88, N = 3); r2b: 7412 (SE +/- 101.59, N = 3); r3: 7439 (SE +/- 85.45, N = 4); r4: 7429 (SE +/- 91.12, N = 4). 1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
Mobile Neural Network 1.1.3 Model: inception-v3 (ms, Fewer Is Better). r2b: 53.07 (SE +/- 1.54, N = 3, MIN: 49.59 / MAX: 69.62); r4: 52.23 (SE +/- 0.75, N = 12, MIN: 47.47 / MAX: 94.69). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 2.96135 (SE +/- 0.00128, N = 3, MIN: 2.84); r1a: 2.96857 (SE +/- 0.00276, N = 3, MIN: 2.84); r2b: 3.00464 (SE +/- 0.02287, N = 13, MIN: 2.84); r3: 3.00929 (SE +/- 0.02478, N = 14, MIN: 2.84); r4: 3.00907 (SE +/- 0.02449, N = 14, MIN: 2.84). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
MariaDB 10.5.2 Clients: 128 (Queries Per Second, More Is Better). r2b: 192 (SE +/- 0.65, N = 3); r3: 189 (SE +/- 0.35, N = 3). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 0.593042 (SE +/- 0.001703, N = 3, MIN: 0.56); r1a: 0.595661 (SE +/- 0.000780, N = 3, MIN: 0.56); r2b: 0.602122 (SE +/- 0.004180, N = 3, MIN: 0.56); r3: 0.602314 (SE +/- 0.004400, N = 3, MIN: 0.56); r4: 0.602038 (SE +/- 0.003648, N = 3, MIN: 0.56). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
KTX-Software toktx 4.0 Settings: Zstd Compression 19 (Seconds, Fewer Is Better). r2b: 19.78 (SE +/- 0.22, N = 3); r4: 20.08 (SE +/- 0.20, N = 3).
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.864164 (SE +/- 0.002419, N = 3, MIN: 0.84); r1a: 0.863214 (SE +/- 0.002055, N = 3, MIN: 0.84); r2b: 0.874080 (SE +/- 0.008361, N = 14, MIN: 0.83); r3: 0.874968 (SE +/- 0.007890, N = 14, MIN: 0.84); r4: 0.876227 (SE +/- 0.007461, N = 14, MIN: 0.84). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.215115 (SE +/- 0.000867, N = 3, MIN: 0.19); r1a: 0.213643 (SE +/- 0.000781, N = 3, MIN: 0.19); r2b: 0.216806 (SE +/- 0.001893, N = 8, MIN: 0.19); r3: 0.216586 (SE +/- 0.002019, N = 7, MIN: 0.19); r4: 0.215085 (SE +/- 0.001544, N = 12, MIN: 0.19). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
srsLTE 20.10.1 Test: PHY_DL_Test (eNb Mb/s, More Is Better). r1: 183.4 (SE +/- 1.15, N = 3); r1a: 184.2 (SE +/- 0.36, N = 3); r2b: 181.6 (SE +/- 1.23, N = 3); r3: 181.6 (SE +/- 2.42, N = 3); r4: 183.7 (SE +/- 0.58, N = 3). 1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
Botan 2.17.3 Test: AES-256 (MiB/s, More Is Better). r1: 5669.70 (SE +/- 0.92, N = 3); r1a: 5670.81 (SE +/- 0.28, N = 3); r2b: 5606.97 (SE +/- 55.60, N = 3); r3: 5593.37 (SE +/- 42.23, N = 3); r4: 5612.00 (SE +/- 51.03, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.239989 (SE +/- 0.000856, N = 3, MIN: 0.22); r1a: 0.240122 (SE +/- 0.000662, N = 3, MIN: 0.23); r2b: 0.243026 (SE +/- 0.003187, N = 3, MIN: 0.22); r3: 0.243308 (SE +/- 0.002507, N = 5, MIN: 0.22); r4: 0.242450 (SE +/- 0.002245, N = 7, MIN: 0.22). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Botan 2.17.3 Test: KASUMI (MiB/s, More Is Better). r1: 77.29 (SE +/- 0.02, N = 3); r1a: 77.31 (SE +/- 0.04, N = 3); r2b: 76.29 (SE +/- 1.01, N = 3); r3: 76.41 (SE +/- 0.77, N = 3); r4: 76.40 (SE +/- 0.87, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Basis Universal 1.13 Settings: UASTC Level 2 (Seconds, Fewer Is Better). r2b: 13.98 (SE +/- 0.18, N = 3); r4: 14.16 (SE +/- 0.15, N = 3). 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Botan 2.17.3 Test: CAST-256 (MiB/s, More Is Better). r1: 115.97 (SE +/- 0.01, N = 3); r1a: 115.97 (SE +/- 0.01, N = 3); r2b: 114.66 (SE +/- 1.15, N = 3); r3: 114.52 (SE +/- 1.33, N = 3); r4: 114.65 (SE +/- 1.17, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 Test: ChaCha20Poly1305 (MiB/s, More Is Better). r1: 623.49 (SE +/- 0.03, N = 3); r1a: 623.20 (SE +/- 0.17, N = 3); r2b: 615.81 (SE +/- 3.48, N = 3); r3: 616.50 (SE +/- 3.19, N = 3); r4: 619.64 (SE +/- 2.98, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Liquid-DSP 2021.01.31 Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 3267133333 (SE +/- 5206513.02, N = 3); r1a: 3263700000 (SE +/- 2150193.79, N = 3); r2b: 3227433333 (SE +/- 17049079.48, N = 3); r3: 3232700000 (SE +/- 14893734.70, N = 3); r4: 3245666667 (SE +/- 12876378.03, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Botan 2.17.3 Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better). r1: 619.46 (SE +/- 0.40, N = 3); r1a: 619.54 (SE +/- 0.57, N = 3); r2b: 612.44 (SE +/- 3.49, N = 3); r3: 612.15 (SE +/- 3.74, N = 3); r4: 615.98 (SE +/- 2.81, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
SecureMark 1.0.4 Benchmark: SecureMark-TLS (marks, More Is Better). r1: 225412 (SE +/- 234.37, N = 3); r1a: 225366 (SE +/- 236.12, N = 3); r2b: 225343 (SE +/- 84.15, N = 3); r3: 225291 (SE +/- 267.95, N = 3); r4: 222747 (SE +/- 2769.20, N = 3). 1. (CC) gcc options: -pedantic -O3
Botan 2.17.3 Test: Blowfish (MiB/s, More Is Better). r1: 363.04 (SE +/- 0.56, N = 3); r1a: 363.62 (SE +/- 0.05, N = 3); r2b: 362.93 (SE +/- 0.11, N = 3); r3: 359.45 (SE +/- 3.73, N = 3); r4: 359.57 (SE +/- 3.51, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Google Draco 1.4.1 Model: Church Facade (ms, Fewer Is Better). r2b: 7001 (SE +/- 20.01, N = 3); r4: 7082 (SE +/- 3.33, N = 3). 1. (CXX) g++ options: -O3
Botan 2.17.3 Test: Twofish (MiB/s, More Is Better). r1: 289.13 (SE +/- 0.14, N = 3); r1a: 288.85 (SE +/- 0.14, N = 3); r2b: 288.56 (SE +/- 0.11, N = 3); r3: 286.18 (SE +/- 2.66, N = 3); r4: 286.00 (SE +/- 2.83, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.338327 (SE +/- 0.000853, N = 3, MIN: 0.3); r1a: 0.341663 (SE +/- 0.002562, N = 3, MIN: 0.31); r2b: 0.341893 (SE +/- 0.003448, N = 5, MIN: 0.3); r3: 0.341955 (SE +/- 0.003372, N = 6, MIN: 0.31); r4: 0.340243 (SE +/- 0.004121, N = 3, MIN: 0.3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Liquid-DSP 2021.01.31 Threads: 160 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). r1: 3144800000 (SE +/- 17047384.94, N = 3); r1a: 3162066667 (SE +/- 2062630.47, N = 3); r2b: 3131866667 (SE +/- 14685858.66, N = 3); r3: 3143300000 (SE +/- 14901789.60, N = 3); r4: 3140266667 (SE +/- 16411005.79, N = 3). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
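The Liquid-DSP results are spread across nine thread counts in this export, so a small script can gather them into one scaling view. The sketch below uses the r1 figures from the Liquid-DSP entries above; the function name `scaling` and the definition of efficiency as speedup divided by thread count are illustrative choices, not part of the benchmark itself.

```python
# r1 throughput in samples/s keyed by thread count, copied from the Liquid-DSP results above.
r1_samples_per_s = {
    1: 57792000,
    2: 110713333,
    4: 217643333,
    8: 441953333,
    16: 885320000,
    32: 1735100000,
    64: 3267133333,
    128: 3415933333,
    160: 3144800000,
}

def scaling(results):
    """Return (threads, speedup vs 1 thread, speedup / threads) for each thread count."""
    base = results[1]
    rows = []
    for threads in sorted(results):
        speedup = results[threads] / base
        rows.append((threads, speedup, speedup / threads))
    return rows

for threads, speedup, efficiency in scaling(r1_samples_per_s):
    print(f"{threads:>3} threads: {speedup:6.2f}x speedup, {efficiency:6.1%} efficiency")
```

On these numbers, throughput peaks at 128 threads and falls back slightly at 160, which is the kind of pattern this view makes easy to spot.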
oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 447.97 (SE +/- 0.58, N = 3, MIN: 433.22); r1a: 447.31 (SE +/- 0.90, N = 3, MIN: 432.33); r2b: 446.39 (SE +/- 0.78, N = 3, MIN: 432.04); r3: 450.65 (SE +/- 2.40, N = 3, MIN: 432.96); r4: 446.54 (SE +/- 1.10, N = 3, MIN: 429.71). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). r1: 1.24809 (SE +/- 0.00180, N = 3, MIN: 1.2); r1a: 1.25267 (SE +/- 0.01592, N = 15, MIN: 1.19); r2b: 1.25313 (SE +/- 0.00964, N = 3, MIN: 1.2); r3: 1.24176 (SE +/- 0.01211, N = 3, MIN: 1.18); r4: 1.24222 (SE +/- 0.01282, N = 3, MIN: 1.19). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 3.53026 (SE +/- 0.00193, N = 3, MIN: 3.38); r1a: 3.54367 (SE +/- 0.00732, N = 3, MIN: 3.38); r2b: 3.53121 (SE +/- 0.00854, N = 3, MIN: 3.37); r3: 3.56224 (SE +/- 0.01280, N = 3, MIN: 3.39); r4: 3.54783 (SE +/- 0.00650, N = 3, MIN: 3.37). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Helsing 1.0-beta Digit Range: 14 digit (Seconds, Fewer Is Better). r1: 77.87; r1a: 78.16; r2b: 78.33; r3: 78.08; r4: 78.54. 1. (CC) gcc options: -O2 -pthread -lcrypto
oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 445.14 (SE +/- 0.58, N = 3, MIN: 431.52); r1a: 446.94 (SE +/- 1.79, N = 3, MIN: 430.47); r2b: 447.29 (SE +/- 0.65, N = 3, MIN: 433.06); r3: 447.14 (SE +/- 1.24, N = 3, MIN: 432.42); r4: 448.91 (SE +/- 3.51, N = 3, MIN: 431.33). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Blender 2.92 Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 46.38 (SE +/- 0.15, N = 3); r4: 46.73 (SE +/- 0.25, N = 3).
Google Draco 1.4.1 Model: Lion (ms, Fewer Is Better). r2b: 6126 (SE +/- 25.21, N = 3); r4: 6170 (SE +/- 21.15, N = 3). 1. (CXX) g++ options: -O3
Blender 2.92 Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 71.78 (SE +/- 0.08, N = 3); r4: 72.29 (SE +/- 0.13, N = 3).
Intel Memory Latency Checker Test: Max Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better). r1: 439496.74 (SE +/- 821.19, N = 3); r1a: 441408.09 (SE +/- 1093.30, N = 3); r2a: 442460.05 (SE +/- 1844.14, N = 3); r2b: 441732.77 (SE +/- 3117.58, N = 3); r3: 440939.22 (SE +/- 276.68, N = 3); r4: 440315.41 (SE +/- 2322.32, N = 3); r5: 440205.22 (SE +/- 1051.98, N = 3).
Intel Memory Latency Checker Test: Peak Injection Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better). r1: 459038.6 (SE +/- 274.15, N = 3); r1a: 456260.3 (SE +/- 130.28, N = 3); r2a: 456408.6 (SE +/- 115.55, N = 3); r2b: 459309.8 (SE +/- 64.32, N = 3); r3: 457190.5 (SE +/- 73.04, N = 3); r4: 458941.9 (SE +/- 36.24, N = 3); r5: 458830.6 (SE +/- 12.06, N = 3).
Intel Memory Latency Checker Test: Max Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better). r1: 459455.38 (SE +/- 33.49, N = 3); r1a: 456629.89 (SE +/- 129.26, N = 3); r2a: 456545.88 (SE +/- 54.98, N = 3); r2b: 459226.53 (SE +/- 51.02, N = 3); r3: 457141.24 (SE +/- 89.89, N = 3); r4: 458790.96 (SE +/- 8.60, N = 3); r5: 458756.46 (SE +/- 53.22, N = 3).
srsLTE 20.10.1 Test: OFDM_Test (Samples / Second, More Is Better). r1: 120300000 (SE +/- 611010.09, N = 3); r1a: 120133333 (SE +/- 240370.09, N = 3); r2b: 120733333 (SE +/- 366666.67, N = 3); r3: 120833333 (SE +/- 600925.21, N = 3); r4: 120666667 (SE +/- 233333.33, N = 3). 1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
ASTC Encoder 2.4 Preset: Medium (Seconds, Fewer Is Better). r2b: 7.1887 (SE +/- 0.0906, N = 15); r4: 7.1472 (SE +/- 0.0290, N = 3). 1. (CXX) g++ options: -O3 -flto -pthread
Intel Memory Latency Checker Test: Peak Injection Bandwidth - All Reads (MB/s, More Is Better). r1: 356476.2 (SE +/- 709.43, N = 3); r1a: 358385.5 (SE +/- 14.58, N = 3); r2a: 358269.7 (SE +/- 37.47, N = 3); r2b: 357742.9 (SE +/- 14.54, N = 3); r3: 358463.7 (SE +/- 24.95, N = 3); r4: 358110.5 (SE +/- 26.62, N = 3); r5: 357722.7 (SE +/- 23.85, N = 3).
oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 445.52 (SE +/- 0.85, N = 3, MIN: 431.18); r1a: 447.44 (SE +/- 2.18, N = 3, MIN: 429.4); r2b: 447.70 (SE +/- 1.13, N = 3, MIN: 433.04); r3: 446.92 (SE +/- 0.04, N = 3, MIN: 433.64); r4: 447.96 (SE +/- 2.63, N = 3, MIN: 429.99). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal 1.13 Settings: ETC1S (Seconds, Fewer Is Better). r2b: 34.24 (SE +/- 0.21, N = 3); r4: 34.42 (SE +/- 0.42, N = 3). 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
MariaDB 10.5.2 Clients: 8 (Queries Per Second, More Is Better). r2b: 1413 (SE +/- 10.97, N = 3); r3: 1420 (SE +/- 3.56, N = 3). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Blender 2.92 Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 29.56 (SE +/- 0.08, N = 3); r4: 29.69 (SE +/- 0.32, N = 3).
Intel Memory Latency Checker Test: Peak Injection Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better). r1: 425933.7 (SE +/- 163.24, N = 3); r1a: 424096.6 (SE +/- 94.95, N = 3); r2a: 424077.3 (SE +/- 236.99, N = 3); r2b: 425925.6 (SE +/- 25.04, N = 3); r3: 424904.5 (SE +/- 88.34, N = 3); r4: 425822.1 (SE +/- 23.30, N = 3); r5: 425508.1 (SE +/- 23.30, N = 3).
oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 792.83 (SE +/- 2.07, N = 3, MIN: 763.76); r1a: 791.93 (SE +/- 3.65, N = 3, MIN: 765.01); r2b: 789.84 (SE +/- 1.48, N = 3, MIN: 767.03); r3: 793.08 (SE +/- 2.18, N = 3, MIN: 768.2); r4: 792.05 (SE +/- 1.96, N = 3, MIN: 765.9). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Intel Memory Latency Checker Test: Max Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better). r1: 426148.96 (SE +/- 105.41, N = 3); r1a: 424612.62 (SE +/- 465.24, N = 3); r2a: 424818.83 (SE +/- 392.90, N = 3); r2b: 425997.22 (SE +/- 71.38, N = 3); r3: 424925.84 (SE +/- 109.66, N = 3); r4: 425848.09 (SE +/- 67.02, N = 3); r5: 425467.51 (SE +/- 133.64, N = 3).
Sysbench 1.0.20 Test: RAM / Memory (MiB/sec, More Is Better). r2b: 12510.56 (SE +/- 125.16, N = 15); r4: 12553.44 (SE +/- 118.72, N = 15). 1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Intel Memory Latency Checker Test: Max Bandwidth - All Reads (MB/s, More Is Better). r1: 357285.28 (SE +/- 67.01, N = 3); r1a: 358364.56 (SE +/- 142.76, N = 3); r2a: 358456.09 (SE +/- 107.35, N = 3); r2b: 357774.43 (SE +/- 83.63, N = 3); r3: 358268.00 (SE +/- 59.61, N = 3); r4: 357925.98 (SE +/- 83.70, N = 3); r5: 357550.82 (SE +/- 46.23, N = 3).
Botan 2.17.3 Test: CAST-256 - Decrypt (MiB/s, More Is Better). r1: 116.07 (SE +/- 0.01, N = 3); r1a: 116.07 (SE +/- 0.01, N = 3); r2b: 116.08 (SE +/- 0.01, N = 3); r3: 115.72 (SE +/- 0.35, N = 3); r4: 116.07 (SE +/- 0.01, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
MariaDB 10.5.2 Clients: 64 (Queries Per Second, More Is Better). r2b: 403 (SE +/- 0.62, N = 3); r3: 404 (SE +/- 0.16, N = 3). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Botan 2.17.3 Test: AES-256 - Decrypt (MiB/s, More Is Better). r1: 5663.06 (SE +/- 1.20, N = 3); r1a: 5663.61 (SE +/- 0.12, N = 3); r2b: 5662.76 (SE +/- 0.94, N = 3); r3: 5662.34 (SE +/- 1.10, N = 3); r4: 5650.14 (SE +/- 12.66, N = 3). 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
MariaDB 10.5.2 Clients: 32 (Queries Per Second, More Is Better). r2b: 885 (SE +/- 0.26, N = 3); r3: 887 (SE +/- 1.83, N = 3). 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Basis Universal 1.13 Settings: UASTC Level 0 (Seconds, Fewer Is Better). r2b: 11.25 (SE +/- 0.08, N = 15); r4: 11.23 (SE +/- 0.08, N = 3). 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.4 Preset: Thorough (Seconds, Fewer Is Better). r2b: 9.2907 (SE +/- 0.0796, N = 8); r4: 9.3091 (SE +/- 0.0879, N = 7). 1. (CXX) g++ options: -O3 -flto -pthread
KTX-Software toktx 4.0, Settings: UASTC 4 + Zstd Compression 19 (Seconds, Fewer Is Better). r2b: 56.66 (SE +/- 0.68, N = 4); r4: 56.77 (SE +/- 0.74, N = 3).
KTX-Software toktx 4.0, Settings: UASTC 3 + Zstd Compression 19 (Seconds, Fewer Is Better). r2b: 10.01 (SE +/- 0.06, N = 3); r4: 10.03 (SE +/- 0.11, N = 5).
Intel Memory Latency Checker, Test: Max Bandwidth - Stream-Triad Like (MB/s, More Is Better). r1: 325766.94 (SE +/- 25.05, N = 3); r1a: 325184.58 (SE +/- 11.61, N = 3); r2a: 325260.41 (SE +/- 53.08, N = 3); r2b: 325409.99 (SE +/- 50.20, N = 3); r3: 325218.50 (SE +/- 50.80, N = 3); r4: 325314.62 (SE +/- 7.71, N = 3); r5: 325312.30 (SE +/- 22.58, N = 3).
Intel Memory Latency Checker, Test: Peak Injection Bandwidth - Stream-Triad Like (MB/s, More Is Better). r1: 324377.2 (SE +/- 177.93, N = 3); r1a: 323924.2 (SE +/- 38.10, N = 3); r2a: 323826.9 (SE +/- 34.05, N = 3); r2b: 324209.8 (SE +/- 12.95, N = 3); r3: 324227.4 (SE +/- 32.03, N = 3); r4: 324112.8 (SE +/- 60.42, N = 3); r5: 324234.5 (SE +/- 55.81, N = 3).
MariaDB 10.5.2, Clients: 16 (Queries Per Second, More Is Better). r2b: 1264 (SE +/- 1.85, N = 3); r3: 1262 (SE +/- 3.49, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Botan 2.17.3, Test: Twofish - Decrypt (MiB/s, More Is Better). r1: 292.74 (SE +/- 0.14, N = 3); r1a: 292.37 (SE +/- 0.11, N = 3); r2b: 292.40 (SE +/- 0.12, N = 3); r3: 292.83 (SE +/- 0.06, N = 3); r4: 292.61 (SE +/- 0.04, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Basis Universal 1.13, Settings: UASTC Level 3 (Seconds, Fewer Is Better). r2b: 17.16 (SE +/- 0.02, N = 3); r4: 17.19 (SE +/- 0.01, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Blender 2.92, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 88.57 (SE +/- 0.08, N = 3); r4: 88.68 (SE +/- 0.28, N = 3).
HammerDB - MariaDB 10.5.9, Virtual Users: 128 - Warehouses: 500 (New Orders Per Minute, More Is Better). r1: 57190 (SE +/- 891.59, N = 9); r1a: 57242 (SE +/- 484.29, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
ASTC Encoder 2.4, Preset: Exhaustive (Seconds, Fewer Is Better). r2b: 16.36 (SE +/- 0.00, N = 3); r4: 16.37 (SE +/- 0.02, N = 3). (CXX) g++ options: -O3 -flto -pthread
Botan 2.17.3, Test: KASUMI - Decrypt (MiB/s, More Is Better). r1: 74.32 (SE +/- 0.01, N = 3); r1a: 74.29 (SE +/- 0.01, N = 3); r2b: 74.28 (SE +/- 0.03, N = 3); r3: 74.31 (SE +/- 0.01, N = 3); r4: 74.29 (SE +/- 0.02, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Mobile Neural Network 1.1.3, Model: SqueezeNetV1.0 (ms, Fewer Is Better). r2b: 7.174 (SE +/- 0.002, N = 3; MIN: 6.95 / MAX: 7.88); r4: 7.170 (SE +/- 0.078, N = 12; MIN: 6.38 / MAX: 9.97). (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Blender 2.92, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better). r2b: 110.02 (SE +/- 0.18, N = 3); r4: 109.96 (SE +/- 0.59, N = 3).
Botan 2.17.3, Test: Blowfish - Decrypt (MiB/s, More Is Better). r1: 363.26 (SE +/- 0.05, N = 3); r1a: 363.33 (SE +/- 0.06, N = 3); r2b: 363.20 (SE +/- 0.04, N = 3); r3: 363.31 (SE +/- 0.03, N = 3); r4: 363.28 (SE +/- 0.07, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 128 - Warehouses: 500 (Transactions Per Minute, More Is Better). r1: 173288 (SE +/- 2691.06, N = 9); r1a: 173228 (SE +/- 1389.03, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
Sysbench 1.0.20, Test: CPU (Events Per Second, More Is Better). r2b: 214210.83 (SE +/- 247.29, N = 3); r4: 214241.34 (SE +/- 269.51, N = 3). (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
MariaDB 10.5.2, Clients: 512 (Queries Per Second, More Is Better). r2b: 166 (SE +/- 0.87, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
MariaDB 10.5.2, Clients: 256 (Queries Per Second, More Is Better). r2b: 160 (SE +/- 0.22, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
CP2K Molecular Dynamics 8.1, Input: Fayalite-FIST (Seconds, Fewer Is Better). r2a: 1374.66.
HammerDB - MariaDB 10.5.9, Virtual Users: 128 - Warehouses: 250 (Transactions Per Minute, More Is Better). r1: 167809 (SE +/- 2616.54, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 128 - Warehouses: 250 (New Orders Per Minute, More Is Better). r1: 55415 (SE +/- 857.30, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 64 - Warehouses: 250 (Transactions Per Minute, More Is Better). r1: 191397 (SE +/- 2831.11, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 64 - Warehouses: 250 (New Orders Per Minute, More Is Better). r1: 63279 (SE +/- 937.55, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 32 - Warehouses: 500 (Transactions Per Minute, More Is Better). r1: 208419 (SE +/- 2885.40, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 32 - Warehouses: 500 (New Orders Per Minute, More Is Better). r1: 68818 (SE +/- 921.11, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 32 - Warehouses: 250 (Transactions Per Minute, More Is Better). r1: 209254 (SE +/- 3390.81, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 32 - Warehouses: 250 (New Orders Per Minute, More Is Better). r1: 69054 (SE +/- 1078.76, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 16 - Warehouses: 500 (Transactions Per Minute, More Is Better). r1: 195258 (SE +/- 3159.46, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 16 - Warehouses: 500 (New Orders Per Minute, More Is Better). r1: 64477 (SE +/- 1031.07, N = 9). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 16 - Warehouses: 250 (Transactions Per Minute, More Is Better). r1: 192913 (SE +/- 2649.02, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 16 - Warehouses: 250 (New Orders Per Minute, More Is Better). r1: 63757 (SE +/- 880.35, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 8 - Warehouses: 500 (Transactions Per Minute, More Is Better). r1: 285984 (SE +/- 2338.98, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 8 - Warehouses: 500 (New Orders Per Minute, More Is Better). r1: 94379 (SE +/- 693.36, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 8 - Warehouses: 250 (Transactions Per Minute, More Is Better). r1: 290082 (SE +/- 2006.72, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9, Virtual Users: 8 - Warehouses: 250 (New Orders Per Minute, More Is Better). r1: 95768 (SE +/- 675.05, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
Mobile Neural Network 1.1.3, Model: MobileNetV2_224 (ms, Fewer Is Better). r2b: 4.078 (SE +/- 0.333, N = 3; MIN: 2.9 / MAX: 13.17); r4: 4.100 (SE +/- 0.135, N = 12; MIN: 2.97 / MAX: 12.98). (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3, Model: resnet-v2-50 (ms, Fewer Is Better). r2b: 48.73 (SE +/- 2.59, N = 3; MIN: 43.19 / MAX: 69.59); r4: 48.04 (SE +/- 1.07, N = 12; MIN: 42.13 / MAX: 145.2). (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
KTX-Software toktx 4.0, Settings: Zstd Compression 9 (Seconds, Fewer Is Better). r2b: 3.470 (SE +/- 0.003, N = 3); r4: 3.697 (SE +/- 0.064, N = 15).
MariaDB 10.5.2, Clients: 1 (Queries Per Second, More Is Better). r2b: 3336 (SE +/- 73.97, N = 15); r3: 3458 (SE +/- 61.33, N = 12). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better). r1: 76.3 (SE +/- 1.45, N = 13); r1a: 77.2 (SE +/- 0.90, N = 3); r2b: 54.7 (SE +/- 1.75, N = 15); r3: 61.7 (SE +/- 2.33, N = 15); r4: 63.7 (SE +/- 2.94, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better). r1: 76.0 (SE +/- 1.67, N = 13); r1a: 77.4 (SE +/- 0.69, N = 3); r2b: 62.3 (SE +/- 2.02, N = 15); r3: 66.9 (SE +/- 1.88, N = 15); r4: 67.6 (SE +/- 2.43, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better). r1: 75.6 (SE +/- 1.88, N = 13); r1a: 76.8 (SE +/- 1.01, N = 3); r2b: 59.8 (SE +/- 1.14, N = 15); r3: 68.9 (SE +/- 1.99, N = 15); r4: 72.4 (SE +/- 1.98, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better). r1: 73.5 (SE +/- 1.42, N = 14); r1a: 72.3 (SE +/- 3.11, N = 3); r2b: 61.9 (SE +/- 2.06, N = 15); r3: 66.4 (SE +/- 2.18, N = 15); r4: 70.8 (SE +/- 1.95, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T (GB/s, More Is Better). r1: 719.0 (SE +/- 2.46, N = 13); r1a: 319.0 (SE +/- 5.04, N = 3); r2b: 389.9 (SE +/- 27.49, N = 15); r3: 647.0 (SE +/- 2.02, N = 15); r4: 647.0 (SE +/- 3.20, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-N (GB/s, More Is Better). r1: 72.3 (SE +/- 0.36, N = 14); r1a: 63.6 (SE +/- 2.90, N = 3); r2b: 62.3 (SE +/- 3.75, N = 15); r3: 64.3 (SE +/- 3.93, N = 15); r4: 70.2 (SE +/- 0.25, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dDOT (GB/s, More Is Better). r1: 720.00 (SE +/- 6.43, N = 14); r1a: 371.00 (SE +/- 34.44, N = 3); r2b: 447.65 (SE +/- 34.40, N = 14); r3: 713.47 (SE +/- 50.57, N = 15); r4: 765.00 (SE +/- 2.76, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dAXPY (GB/s, More Is Better). r1: 1058.0 (SE +/- 20.63, N = 14); r1a: 392.0 (SE +/- 23.02, N = 3); r2b: 507.1 (SE +/- 40.80, N = 15); r3: 1024.2 (SE +/- 82.34, N = 15); r4: 1158.0 (SE +/- 5.62, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dCOPY (GB/s, More Is Better). r1: 843.0 (SE +/- 25.47, N = 14); r1a: 335.0 (SE +/- 29.90, N = 3); r2b: 422.2 (SE +/- 35.11, N = 15); r3: 913.0 (SE +/- 26.97, N = 15); r4: 936.0 (SE +/- 9.73, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sDOT (GB/s, More Is Better). r1: 620 (SE +/- 2.34, N = 14); r1a: 277 (SE +/- 11.67, N = 3); r2b: 349 (SE +/- 5.60, N = 15); r3: 532 (SE +/- 2.55, N = 15); r4: 535 (SE +/- 2.45, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sAXPY (GB/s, More Is Better). r1: 1003 (SE +/- 6.62, N = 14); r1a: 370 (SE +/- 15.25, N = 3); r2b: 474 (SE +/- 10.36, N = 15); r3: 862 (SE +/- 8.11, N = 15); r4: 855 (SE +/- 11.35, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sCOPY (GB/s, More Is Better). r1: 1834 (SE +/- 16.63, N = 14); r1a: 504 (SE +/- 4.10, N = 3); r2b: 691 (SE +/- 22.07, N = 15); r3: 1135 (SE +/- 51.32, N = 15); r4: 1167 (SE +/- 54.62, N = 15). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). r1: 804.39 (SE +/- 7.01, N = 3; MIN: 763.49); r1a: 793.36 (SE +/- 1.56, N = 3; MIN: 765.14); r2b: 791.70 (SE +/- 0.61, N = 3; MIN: 769.61); r3: 793.92 (SE +/- 0.83, N = 3; MIN: 769); r4: 811.94 (SE +/- 16.86, N = 14; MIN: 761.61). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). r1: 0.210919 (SE +/- 0.002205, N = 15; MIN: 0.19); r1a: 0.210728 (SE +/- 0.001109, N = 3; MIN: 0.2); r2b: 0.210324 (SE +/- 0.004449, N = 15; MIN: 0.18); r3: 0.218349 (SE +/- 0.003384, N = 15; MIN: 0.19); r4: 0.217941 (SE +/- 0.004970, N = 15; MIN: 0.19). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
libavif avifenc 0.9.0, Encoder Speed: 10 (Seconds, Fewer Is Better). r1: 5.477 (SE +/- 0.038, N = 3); r1a: 5.505 (SE +/- 0.014, N = 3); r2b: 6.656 (SE +/- 0.116, N = 15); r3: 6.597 (SE +/- 0.145, N = 15); r4: 6.746 (SE +/- 0.130, N = 15). (CXX) g++ options: -O3 -fPIC -lm
SVT-VP9 0.3, Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). r1: 386.29 (SE +/- 15.40, N = 12); r1a: 393.46 (SE +/- 16.03, N = 12); r2b: 182.26 (SE +/- 4.05, N = 12); r3: 185.53 (SE +/- 1.57, N = 3); r4: 184.07 (SE +/- 0.65, N = 3). (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
LuxCoreRender 2.5, Scene: Rainbow Colors and Prism - Acceleration: CPU (M samples/sec, More Is Better). r1: 17.04 (SE +/- 1.05, N = 15; MIN: 11.27 / MAX: 22.05); r1a: 13.34 (SE +/- 0.47, N = 15; MIN: 10.32 / MAX: 17.45); r2b: 13.42 (SE +/- 0.87, N = 13; MIN: 8.28 / MAX: 21.15); r3: 16.47 (SE +/- 1.13, N = 12; MIN: 10.39 / MAX: 21.43); r4: 14.79 (SE +/- 0.79, N = 12; MIN: 9.85 / MAX: 20.95).
GNU Radio, Test: Hilbert Transform (MiB/s, More Is Better). r1: 459.3 (SE +/- 2.02, N = 3); r1a: 459.1 (SE +/- 1.66, N = 3); r2b: 357.4 (SE +/- 47.90, N = 3); r3: 408.0 (SE +/- 17.46, N = 9); r4: 373.8 (SE +/- 24.71, N = 9). Footnote: 3.8.1.0
GNU Radio, Test: FM Deemphasis Filter (MiB/s, More Is Better). r1: 734.0 (SE +/- 1.94, N = 3); r1a: 727.4 (SE +/- 1.04, N = 3); r2b: 645.8 (SE +/- 53.33, N = 3); r3: 621.0 (SE +/- 31.57, N = 9); r4: 622.0 (SE +/- 32.02, N = 9). Footnote: 3.8.1.0
GNU Radio, Test: IIR Filter (MiB/s, More Is Better). r1: 610.6 (SE +/- 0.38, N = 3); r1a: 609.5 (SE +/- 0.46, N = 3); r2b: 498.2 (SE +/- 45.07, N = 3); r3: 487.4 (SE +/- 26.49, N = 9); r4: 487.7 (SE +/- 25.67, N = 9). Footnote: 3.8.1.0
GNU Radio, Test: FIR Filter (MiB/s, More Is Better). r1: 603.0 (SE +/- 1.45, N = 3); r1a: 604.8 (SE +/- 0.20, N = 3); r2b: 470.0 (SE +/- 44.41, N = 3); r3: 502.0 (SE +/- 16.19, N = 9); r4: 515.6 (SE +/- 11.25, N = 9). Footnote: 3.8.1.0
GNU Radio, Test: Signal Source (Cosine) (MiB/s, More Is Better). r1: 2183.5 (SE +/- 0.93, N = 3); r1a: 2175.3 (SE +/- 2.24, N = 3); r2b: 1684.4 (SE +/- 168.17, N = 3); r3: 1723.9 (SE +/- 72.44, N = 9); r4: 1619.2 (SE +/- 82.03, N = 9). Footnote: 3.8.1.0
GNU Radio, Test: Five Back to Back FIR Filters (MiB/s, More Is Better). r1: 1024.3 (SE +/- 2.54, N = 3); r1a: 1015.2 (SE +/- 2.30, N = 3); r2b: 111.2 (SE +/- 1.12, N = 3); r3: 580.5 (SE +/- 39.63, N = 9); r4: 487.9 (SE +/- 48.36, N = 9). Footnote: 3.8.1.0
LuaRadio 0.9.1, Test: Five Back to Back FIR Filters (MiB/s, More Is Better). r1: 1094.8 (SE +/- 2.24, N = 3); r1a: 1094.5 (SE +/- 0.62, N = 3); r2b: 804.5 (SE +/- 22.87, N = 9); r3: 662.8 (SE +/- 74.31, N = 6); r4: 706.1 (SE +/- 73.21, N = 6).
Phoronix Test Suite v10.8.5