Core i7 6800K Broadwell
Intel Core i7-6800K testing with an MSI X99A WORKSTATION (MS-7A54) v1.0 (1.10 BIOS) and Zotac NVIDIA NV137 2GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2102070-HA-COREI768056&sro&grr .
Core i7 6800K Broadwell (results 1, 2, 3, 4)
  Processor: Intel Core i7-6800K @ 3.80GHz (6 Cores / 12 Threads)
  Motherboard: MSI X99A WORKSTATION (MS-7A54) v1.0 (1.10 BIOS)
  Chipset: Intel Xeon E7 v4/Xeon
  Memory: 16GB
  Disk: 120GB TOSHIBA TR150
  Graphics: Zotac NVIDIA NV137 2GB
  Audio: Realtek ALC1150
  Monitor: G237HL
  Network: Intel I218-LM + Intel I210
  OS: Ubuntu 20.10
  Kernel: 5.8.0-33-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 20.2.1
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
OpenBenchmarking.org
Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: intel_cpufreq ondemand
  CPU Microcode: 0xb000038
Python Details
  Python 3.8.6
Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Core i7 6800K Broadwell: result overview (an empty cell means no value was recorded for result 3)
Test | 1 | 2 | 3 | 4
lammps: 20k Atoms | 2.622 | 2.606 | | 2.612
qe: AUSURF112 | 2234.35 | 2237.08 | 2244.69 | 2231.47
webp2: Quality 100, Lossless Compression | 1366.813 | 1366.258 | | 1364.603
openfoam: Motorbike 60M | 1180.20 | 1180.39 | 1178.12 | 1182.70
webp2: Quality 95, Compression Effort 7 | 763.421 | 761.959 | | 763.753
cp2k: Fayalite-FIST Data | 1389.702 | 1382.848 | 1389.359 | 1386.315
webp2: Quality 75, Compression Effort 7 | 420.873 | 423.761 | | 420.404
gnupg: 2.7GB Sample File Encryption | 74.228 | 74.351 | | 74.408
openfoam: Motorbike 30M | 305.71 | 301.03 | 302.33 | 300.80
npb: EP.D | 679.72 | 680.36 | 671.75 | 672.86
build-godot: Time To Compile | 251.105 | 250.845 | | 251.051
gcrypt | 242.024 | 242.658 | | 242.259
dav1d: Chimera 1080p 10-bit | 68.72 | 68.49 | | 68.73
financebench: Bonds OpenMP | 89486.112305 | 90365.164062 | | 90116.080729
askap: tConvolve MPI - Gridding | 1424.62 | 1417.70 | | 1388.83
askap: tConvolve MPI - Degridding | 1107.995 | 1128.17 | | 1102.36
pennant: sedovbig | 146.2509 | 145.2323 | 144.5588 | 146.1675
pennant: leblancbig | 96.91914 | 97.78042 | 93.68373 | 97.29220
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 135.83 | 132.89 | 133.01 | 132.99
mnn: inception-v3 | 52.505 | 55.068 | | 53.764
mnn: mobilenet-v1-1.0 | 4.912 | 5.532 | | 4.914
mnn: MobileNetV2_224 | 4.234 | 4.166 | | 4.030
mnn: resnet-v2-50 | 47.848 | 46.596 | | 47.599
mnn: SqueezeNetV1.0 | 7.245 | 7.226 | | 7.266
financebench: Repo OpenMP | 63321.796484 | 62223.018229 | | 63499.069010
onnx: fcn-resnet101-11 - OpenMP CPU | 45 | 45 | | 45
onnx: bertsquad-10 - OpenMP CPU | 488 | 488 | | 487
onnx: yolov4 - OpenMP CPU | 277 | 277 | | 277
onnx: shufflenet-v2-10 - OpenMP CPU | 10369 | 10372 | | 10382
onnx: super-resolution-10 - OpenMP CPU | 3483 | 3489 | | 3488
npb: LU.C | 17528.76 | 17588.79 | 17412.50 | 17463.90
quantlib | 2054.6 | 2040.1 | 2037.4 | 2031.9
qmcpack: simple-H2O | 48.633 | 50.048 | 48.762 | 47.806
kripke | 37977043 | 38369580 | | 37965963
npb: EP.C | 617.96 | 627.12 | 619.24 | 636.84
rav1e: 5 | 0.946 | 0.948 | | 0.954
rav1e: 1 | 0.323 | 0.322 | | 0.321
askap: tConvolve MT - Degridding | 1629.03 | 1630.14 | | 1627.93
askap: tConvolve MT - Gridding | 1264.54 | 1265.38 | | 1263.04
rav1e: 6 | 1.237 | 1.249 | | 1.245
lzbench: XZ 0 - Decompression | 105 | 105 | 105 | 106
lzbench: XZ 0 - Compression | 37 | 37 | 37 | 37
dav1d: Summer Nature 4K | 109.20 | 109.16 | | 109.01
rav1e: 10 | 2.686 | 2.683 | | 2.677
dav1d: Chimera 1080p | 376.11 | 374.53 | | 377.25
etcpak: ETC2 | 150.909 | 150.981 | 150.950 | 151.617
lammps: Rhodopsin Protein | 2.866 | 2.923 | | 2.880
redis: GET | 2169857.05 | 1902703.00 | | 1940969.00
synthmark: VoiceMark_100 | 614.327 | 614.041 | | 615.486
lzbench: Zstd 8 - Decompression | 1610 | 1607 | 1607 | 1606
lzbench: Zstd 8 - Compression | 75 | 75 | 75 | 75
lzbench: Brotli 2 - Decompression | 668 | 669 | 669 | 669
lzbench: Brotli 2 - Compression | 160 | 160 | 160 | 160
lzbench: Brotli 0 - Decompression | 577 | 572 | 576 | 575
lzbench: Brotli 0 - Compression | 412 | 414 | 414 | 413
askap: Hogbom Clean OpenMP | 201.884 | 202.704 | | 201.208
lzbench: Crush 0 - Decompression | 483 | 483 | 483 | 483
lzbench: Crush 0 - Compression | 91 | 91 | 91 | 90
lzbench: Zstd 1 - Decompression | 1555 | 1558 | 1553 | 1538
lzbench: Zstd 1 - Compression | 441 | 444 | 445 | 445
webp2: Quality 100, Compression Effort 5 | 21.056 | 21.074 | | 21.077
tnn: CPU - MobileNet v2 | 300.937 | 301.042 | | 302.312
tnn: CPU - SqueezeNet v1.1 | 294.641 | 294.403 | | 294.823
etcpak: ETC1 + Dithering | 251.448 | 250.378 | 251.858 | 250.290
etcpak: ETC1 | 267.298 | 268.831 | 267.131 | 268.422
redis: LPUSH | 1391544.79 | 1375021.75 | | 1374636.71
redis: SADD | 1858095.32 | 1875200.92 | | 1831314.21
redis: SET | 1622230.00 | 1419988.83 | | 1418826.16
lzbench: Libdeflate 1 - Compression | 199 | 199 | 199 | 199
askap: tConvolve OpenMP - Degridding | 2012.03 | 1967.43 | | 2017.09
askap: tConvolve OpenMP - Gridding | 1358.47 | 1313.90 | | 1348.70
redis: LPOP | 2360946.67 | 1476177.92 | | 1439625.16
amg | 270577233 | 274610300 | 274249833 | 274539633
dav1d: Summer Nature 1080p | 334.36 | 336.60 | | 336.75
webp2: Default | 7.442 | 7.380 | | 7.352
lulesh | 1024.3880 | 1026.6576 | | 1026.6542
etcpak: DXT1 | 1118.279 | 1123.243 | 1123.194 | 1122.173
OpenBenchmarking.org
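One way to see which tests vary most between results is the relative spread, (max - min) / mean, taken across the recorded values for each test. The sketch below is illustrative only: the metric and the helper name `relative_spread` are assumptions of this example, not something the Phoronix Test Suite reports, and only a handful of rows from the overview are copied in.

```python
from statistics import mean

# Values copied from the overview; result 3 is left out where it was not recorded.
results = {
    "lammps: 20k Atoms": [2.622, 2.606, 2.612],
    "redis: SET": [1622230.00, 1419988.83, 1418826.16],
    "redis: LPOP": [2360946.67, 1476177.92, 1439625.16],
    "lzbench: Zstd 1 - Decompression": [1555, 1558, 1553, 1538],
}


def relative_spread(values):
    """Return (max - min) / mean, a unitless measure of result-to-result variation."""
    return (max(values) - min(values)) / mean(values)


spreads = {name: relative_spread(vals) for name, vals in results.items()}

# Print the tests from most to least variable.
for name, spread in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:35s} {spread:6.1%}")
```

Because the spread is unitless, it can be compared across tests regardless of whether the test reports seconds, MB/s or requests per second.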
LAMMPS Molecular Dynamics Simulator 29Oct2020
  Model: 20k Atoms
  ns/day, More Is Better
  1: 2.622 (SE +/- 0.012, N = 3)
  2: 2.606 (SE +/- 0.014, N = 3)
  4: 2.612 (SE +/- 0.007, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
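The `SE +/- x, N = n` annotations read as a standard error over n trials. Assuming the usual definition SE = s / sqrt(N) (the export does not state its formula, so this is an assumption), the sample standard deviation s and a coefficient of variation can be recovered from the reported numbers. The helper names below are hypothetical.

```python
import math


def implied_stddev(se, n):
    """Sample standard deviation implied by SE = s / sqrt(n) (assumed definition)."""
    return se * math.sqrt(n)


def coefficient_of_variation(mean_value, se, n):
    """Implied standard deviation as a fraction of the reported mean."""
    return implied_stddev(se, n) / mean_value


# (mean, SE, N) for results 1, 2 and 4 of the LAMMPS 20k Atoms test.
lammps_20k = {
    "1": (2.622, 0.012, 3),
    "2": (2.606, 0.014, 3),
    "4": (2.612, 0.007, 3),
}

cv = {run: coefficient_of_variation(*vals) for run, vals in lammps_20k.items()}

for run, value in cv.items():
    print(f"result {run}: CV = {value:.2%}")
```

Under that assumption, all three results stay below 1% trial-to-trial variation.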
Quantum ESPRESSO 6.7
  Input: AUSURF112
  Seconds, Fewer Is Better
  1: 2234.35 (SE +/- 15.22, N = 3)
  2: 2237.08 (SE +/- 18.79, N = 3)
  3: 2244.69 (SE +/- 23.10, N = 5)
  4: 2231.47 (SE +/- 25.11, N = 4)
  (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
WebP2 Image Encode 20210126
  Encode Settings: Quality 100, Lossless Compression
  Seconds, Fewer Is Better
  1: 1366.81 (SE +/- 0.15, N = 3)
  2: 1366.26 (SE +/- 0.70, N = 3)
  4: 1364.60 (SE +/- 0.37, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
OpenFOAM 8
  Input: Motorbike 60M
  Seconds, Fewer Is Better
  1: 1180.20 (SE +/- 2.04, N = 3)
  2: 1180.39 (SE +/- 0.58, N = 3)
  3: 1178.12 (SE +/- 2.24, N = 3)
  4: 1182.70 (SE +/- 1.99, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm
WebP2 Image Encode 20210126
  Encode Settings: Quality 95, Compression Effort 7
  Seconds, Fewer Is Better
  1: 763.42 (SE +/- 0.90, N = 3)
  2: 761.96 (SE +/- 1.11, N = 3)
  4: 763.75 (SE +/- 1.31, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
CP2K Molecular Dynamics 8.1
  Fayalite-FIST Data
  Seconds, Fewer Is Better
  1: 1389.70
  2: 1382.85
  3: 1389.36
  4: 1386.32
WebP2 Image Encode 20210126
  Encode Settings: Quality 75, Compression Effort 7
  Seconds, Fewer Is Better
  1: 420.87 (SE +/- 0.32, N = 3)
  2: 423.76 (SE +/- 1.25, N = 3)
  4: 420.40 (SE +/- 0.64, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
GnuPG 2.2.27
  2.7GB Sample File Encryption
  Seconds, Fewer Is Better
  1: 74.23 (SE +/- 0.67, N = 13)
  2: 74.35 (SE +/- 0.71, N = 13)
  4: 74.41 (SE +/- 0.64, N = 13)
  (CC) gcc options: -O2
OpenFOAM 8
  Input: Motorbike 30M
  Seconds, Fewer Is Better
  1: 305.71 (SE +/- 0.57, N = 3)
  2: 301.03 (SE +/- 0.52, N = 3)
  3: 302.33 (SE +/- 0.48, N = 3)
  4: 300.80 (SE +/- 0.32, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm
NAS Parallel Benchmarks 3.4
  Test / Class: EP.D
  Total Mop/s, More Is Better
  1: 679.72 (SE +/- 3.26, N = 3)
  2: 680.36 (SE +/- 4.51, N = 3)
  3: 671.75 (SE +/- 5.42, N = 9)
  4: 672.86 (SE +/- 0.75, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
Timed Godot Game Engine Compilation 3.2.3
  Time To Compile
  Seconds, Fewer Is Better
  1: 251.11 (SE +/- 0.72, N = 3)
  2: 250.85 (SE +/- 0.30, N = 3)
  4: 251.05 (SE +/- 0.08, N = 3)
Gcrypt Library 1.9
  Seconds, Fewer Is Better
  1: 242.02 (SE +/- 0.47, N = 3)
  2: 242.66 (SE +/- 0.08, N = 3)
  4: 242.26 (SE +/- 0.10, N = 3)
  (CC) gcc options: -O2 -fvisibility=hidden -lgpg-error
dav1d 0.8.1
  Video Input: Chimera 1080p 10-bit
  FPS, More Is Better
  1: 68.72 (SE +/- 0.04, N = 3; MIN: 44.63 / MAX: 173.21)
  2: 68.49 (SE +/- 0.07, N = 3; MIN: 44.63 / MAX: 171.9)
  4: 68.73 (SE +/- 0.09, N = 3; MIN: 44.64 / MAX: 172.53)
  (CC) gcc options: -pthread
FinanceBench 2016-07-25
  Benchmark: Bonds OpenMP
  ms, Fewer Is Better
  1: 89486.11 (SE +/- 782.31, N = 8)
  2: 90365.16 (SE +/- 681.69, N = 3)
  4: 90116.08 (SE +/- 836.51, N = 6)
  (CXX) g++ options: -O3 -march=native -fopenmp
ASKAP 1.0
  Test: tConvolve MPI - Gridding
  Mpix/sec, More Is Better
  1: 1424.62 (SE +/- 45.47, N = 15)
  2: 1417.70 (SE +/- 47.82, N = 12)
  4: 1388.83 (SE +/- 40.34, N = 15)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0
  Test: tConvolve MPI - Degridding
  Mpix/sec, More Is Better
  1: 1108.00 (SE +/- 25.87, N = 15)
  2: 1128.17 (SE +/- 31.02, N = 12)
  4: 1102.36 (SE +/- 17.94, N = 15)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Pennant 1.0.1
  Test: sedovbig
  Hydro Cycle Time - Seconds, Fewer Is Better
  1: 146.25 (SE +/- 1.70, N = 3)
  2: 145.23 (SE +/- 0.89, N = 3)
  3: 144.56 (SE +/- 1.66, N = 4)
  4: 146.17 (SE +/- 1.30, N = 3)
  (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi
Pennant 1.0.1
  Test: leblancbig
  Hydro Cycle Time - Seconds, Fewer Is Better
  1: 96.92 (SE +/- 0.04, N = 3)
  2: 97.78 (SE +/- 0.93, N = 3)
  3: 93.68 (SE +/- 1.16, N = 3)
  4: 97.29 (SE +/- 0.75, N = 10)
  (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi
CloverLeaf
  Lagrangian-Eulerian Hydrodynamics
  Seconds, Fewer Is Better
  1: 135.83 (SE +/- 0.20, N = 3)
  2: 132.89 (SE +/- 0.13, N = 3)
  3: 133.01 (SE +/- 0.14, N = 3)
  4: 132.99 (SE +/- 0.19, N = 3)
  (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Mobile Neural Network 1.1.1
  Model: inception-v3
  ms, Fewer Is Better
  1: 52.51 (SE +/- 0.36, N = 3; MIN: 49.16 / MAX: 82.48)
  2: 55.07 (SE +/- 0.99, N = 3; MIN: 46.94 / MAX: 90.47)
  4: 53.76 (SE +/- 0.52, N = 3; MIN: 49.79 / MAX: 90.86)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1
  Model: mobilenet-v1-1.0
  ms, Fewer Is Better
  1: 4.912 (SE +/- 0.027, N = 3; MIN: 4.54 / MAX: 22.9)
  2: 5.532 (SE +/- 0.446, N = 3; MIN: 4.23 / MAX: 16.41)
  4: 4.914 (SE +/- 0.059, N = 3; MIN: 4.26 / MAX: 23.95)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1
  Model: MobileNetV2_224
  ms, Fewer Is Better
  1: 4.234 (SE +/- 0.162, N = 3; MIN: 3.64 / MAX: 22.78)
  2: 4.166 (SE +/- 0.183, N = 3; MIN: 3.64 / MAX: 25)
  4: 4.030 (SE +/- 0.075, N = 3; MIN: 3.63 / MAX: 18.94)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1
  Model: resnet-v2-50
  ms, Fewer Is Better
  1: 47.85 (SE +/- 0.48, N = 3; MIN: 37.96 / MAX: 79.71)
  2: 46.60 (SE +/- 0.24, N = 3; MIN: 36.23 / MAX: 79.17)
  4: 47.60 (SE +/- 1.13, N = 3; MIN: 37.37 / MAX: 79.49)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1
  Model: SqueezeNetV1.0
  ms, Fewer Is Better
  1: 7.245 (SE +/- 0.074, N = 3; MIN: 6.76 / MAX: 25.21)
  2: 7.226 (SE +/- 0.072, N = 3; MIN: 6.79 / MAX: 26.4)
  4: 7.266 (SE +/- 0.090, N = 3; MIN: 6.78 / MAX: 26.54)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
FinanceBench 2016-07-25
  Benchmark: Repo OpenMP
  ms, Fewer Is Better
  1: 63321.80 (SE +/- 484.36, N = 10)
  2: 62223.02 (SE +/- 833.85, N = 3)
  4: 63499.07 (SE +/- 656.55, N = 3)
  (CXX) g++ options: -O3 -march=native -fopenmp
ONNX Runtime 1.6
  Model: fcn-resnet101-11 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  1: 45
  2: 45
  4: 45
  SE as exported (two entries for three results): SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: bertsquad-10 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  1: 488 (SE +/- 0.44, N = 3)
  2: 488 (SE +/- 0.29, N = 3)
  4: 487 (SE +/- 0.33, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: yolov4 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  1: 277 (SE +/- 0.17, N = 3)
  2: 277 (SE +/- 0.17, N = 3)
  4: 277 (SE +/- 0.17, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: shufflenet-v2-10 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  1: 10369 (SE +/- 13.87, N = 3)
  2: 10372 (SE +/- 8.91, N = 3)
  4: 10382 (SE +/- 17.32, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: super-resolution-10 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  1: 3483 (SE +/- 8.93, N = 3)
  2: 3489 (SE +/- 1.32, N = 3)
  4: 3488 (SE +/- 5.22, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
NAS Parallel Benchmarks 3.4
  Test / Class: LU.C
  Total Mop/s, More Is Better
  1: 17528.76 (SE +/- 88.88, N = 3)
  2: 17588.79 (SE +/- 223.23, N = 3)
  3: 17412.50 (SE +/- 99.96, N = 3)
  4: 17463.90 (SE +/- 71.12, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
QuantLib 1.21
  MFLOPS, More Is Better
  1: 2054.6 (SE +/- 4.27, N = 3)
  2: 2040.1 (SE +/- 14.42, N = 12)
  3: 2037.4 (SE +/- 13.38, N = 13)
  4: 2031.9 (SE +/- 18.39, N = 7)
  (CXX) g++ options: -O3 -march=native -rdynamic
QMCPACK 3.10
  Input: simple-H2O
  Total Execution Time - Seconds, Fewer Is Better
  1: 48.63 (SE +/- 0.12, N = 3)
  2: 50.05 (SE +/- 0.29, N = 3)
  3: 48.76 (SE +/- 0.63, N = 3)
  4: 47.81 (SE +/- 0.52, N = 15)
  (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
Kripke 1.2.4
  Throughput FoM, More Is Better
  1: 37977043 (SE +/- 109535.42, N = 3)
  2: 38369580 (SE +/- 50361.23, N = 3)
  4: 37965963 (SE +/- 223603.64, N = 3)
  (CXX) g++ options: -O3 -fopenmp
NAS Parallel Benchmarks 3.4
  Test / Class: EP.C
  Total Mop/s, More Is Better
  1: 617.96 (SE +/- 13.91, N = 15)
  2: 627.12 (SE +/- 10.35, N = 15)
  3: 619.24 (SE +/- 10.09, N = 15)
  4: 636.84 (SE +/- 8.17, N = 15)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
rav1e 0.4
  Speed: 5
  Frames Per Second, More Is Better
  1: 0.946 (SE +/- 0.005, N = 3)
  2: 0.948 (SE +/- 0.007, N = 3)
  4: 0.954 (SE +/- 0.003, N = 3)
rav1e 0.4
  Speed: 1
  Frames Per Second, More Is Better
  1: 0.323 (SE +/- 0.001, N = 3)
  2: 0.322 (SE +/- 0.001, N = 3)
  4: 0.321 (SE +/- 0.002, N = 3)
ASKAP 1.0
  Test: tConvolve MT - Degridding
  Million Grid Points Per Second, More Is Better
  1: 1629.03 (SE +/- 1.21, N = 3)
  2: 1630.14 (SE +/- 0.96, N = 3)
  4: 1627.93 (SE +/- 1.00, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0
  Test: tConvolve MT - Gridding
  Million Grid Points Per Second, More Is Better
  1: 1264.54 (SE +/- 0.17, N = 3)
  2: 1265.38 (SE +/- 0.29, N = 3)
  4: 1263.04 (SE +/- 0.88, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
rav1e 0.4
  Speed: 6
  Frames Per Second, More Is Better
  1: 1.237 (SE +/- 0.010, N = 3)
  2: 1.249 (SE +/- 0.001, N = 3)
  4: 1.245 (SE +/- 0.002, N = 3)
lzbench 1.8
  Test: XZ 0 - Process: Decompression
  MB/s, More Is Better
  1: 105
  2: 105
  3: 105
  4: 106
  SE as exported (one entry for four results): SE +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: XZ 0 - Process: Compression
  MB/s, More Is Better
  1: 37
  2: 37
  3: 37
  4: 37
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
dav1d 0.8.1
  Video Input: Summer Nature 4K
  FPS, More Is Better
  1: 109.20 (SE +/- 0.13, N = 3; MIN: 103.03 / MAX: 122.93)
  2: 109.16 (SE +/- 0.04, N = 3; MIN: 103.11 / MAX: 122.63)
  4: 109.01 (SE +/- 0.14, N = 3; MIN: 103.01 / MAX: 122.86)
  (CC) gcc options: -pthread
rav1e 0.4
  Speed: 10
  Frames Per Second, More Is Better
  1: 2.686 (SE +/- 0.028, N = 3)
  2: 2.683 (SE +/- 0.006, N = 3)
  4: 2.677 (SE +/- 0.024, N = 3)
dav1d 0.8.1
  Video Input: Chimera 1080p
  FPS, More Is Better
  1: 376.11 (SE +/- 0.96, N = 3; MIN: 279 / MAX: 582.19)
  2: 374.53 (SE +/- 0.96, N = 3; MIN: 277.99 / MAX: 569.54)
  4: 377.25 (SE +/- 0.26, N = 3; MIN: 279.46 / MAX: 578.3)
  (CC) gcc options: -pthread
Etcpak 0.7
  Configuration: ETC2
  Mpx/s, More Is Better
  1: 150.91 (SE +/- 0.24, N = 3)
  2: 150.98 (SE +/- 0.35, N = 3)
  3: 150.95 (SE +/- 0.17, N = 3)
  4: 151.62 (SE +/- 0.11, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
LAMMPS Molecular Dynamics Simulator 29Oct2020
  Model: Rhodopsin Protein
  ns/day, More Is Better
  1: 2.866 (SE +/- 0.060, N = 15)
  2: 2.923 (SE +/- 0.110, N = 15)
  4: 2.880 (SE +/- 0.087, N = 12)
  (CXX) g++ options: -O3 -pthread -lm
Redis 6.0.9
  Test: GET
  Requests Per Second, More Is Better
  1: 2169857.05 (SE +/- 33099.62, N = 12)
  2: 1902703.00 (SE +/- 25808.83, N = 3)
  4: 1940969.00 (SE +/- 17441.42, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Google SynthMark 20201109
  Test: VoiceMark_100
  Voices, More Is Better
  1: 614.33 (SE +/- 0.38, N = 3)
  2: 614.04 (SE +/- 0.24, N = 3)
  4: 615.49 (SE +/- 0.82, N = 3)
  (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
lzbench 1.8
  Test: Zstd 8 - Process: Decompression
  MB/s, More Is Better
  1: 1610
  2: 1607
  3: 1607
  4: 1606
  SE as exported (three entries for four results): SE +/- 2.85, N = 3; SE +/- 0.88, N = 3; SE +/- 2.85, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Zstd 8 - Process: Compression
  MB/s, More Is Better
  1: 75
  2: 75
  3: 75
  4: 75
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Brotli 2 - Process: Decompression
  MB/s, More Is Better
  1: 668
  2: 669
  3: 669
  4: 669
  SE as exported (three entries for four results): SE +/- 0.88, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Brotli 2 - Process: Compression
  MB/s, More Is Better
  1: 160
  2: 160
  3: 160
  4: 160
  SE as exported (two entries for four results): SE +/- 0.58, N = 3; SE +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Brotli 0 - Process: Decompression
  MB/s, More Is Better
  1: 577
  2: 572
  3: 576
  4: 575
  SE as exported (three entries for four results): SE +/- 4.58, N = 3; SE +/- 0.67, N = 3; SE +/- 1.86, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Brotli 0 - Process: Compression
  MB/s, More Is Better
  1: 412
  2: 414
  3: 414
  4: 413
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
ASKAP 1.0
  Test: Hogbom Clean OpenMP
  Iterations Per Second, More Is Better
  1: 201.88 (SE +/- 0.14, N = 3)
  2: 202.70 (SE +/- 0.36, N = 3)
  4: 201.21 (SE +/- 0.23, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
lzbench 1.8
  Test: Crush 0 - Process: Decompression
  MB/s, More Is Better
  1: 483
  2: 483
  3: 483
  4: 483
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Crush 0 - Process: Compression
  MB/s, More Is Better
  1: 91 (SE +/- 1.00, N = 3)
  2: 91 (SE +/- 1.20, N = 3)
  3: 91 (SE +/- 0.67, N = 3)
  4: 90 (SE +/- 0.88, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Zstd 1 - Process: Decompression
  MB/s, More Is Better
  1: 1555 (SE +/- 3.21, N = 3)
  2: 1558 (SE +/- 1.76, N = 3)
  3: 1553 (SE +/- 1.15, N = 3)
  4: 1538 (SE +/- 13.53, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8
  Test: Zstd 1 - Process: Compression
  MB/s, More Is Better
  1: 441
  2: 444
  3: 445
  4: 445
  SE as exported (two entries for four results): SE +/- 1.20, N = 3; SE +/- 1.00, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
WebP2 Image Encode 20210126
  Encode Settings: Quality 100, Compression Effort 5
  Seconds, Fewer Is Better
  1: 21.06 (SE +/- 0.03, N = 3)
  2: 21.07 (SE +/- 0.01, N = 3)
  4: 21.08 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
TNN 0.2.3
  Target: CPU - Model: MobileNet v2
  ms, Fewer Is Better
  1: 300.94 (SE +/- 0.21, N = 3; MIN: 298.58 / MAX: 308.32)
  2: 301.04 (SE +/- 0.46, N = 3; MIN: 298.88 / MAX: 315.68)
  4: 302.31 (SE +/- 0.88, N = 3; MIN: 299.23 / MAX: 325.81)
  (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN 0.2.3
  Target: CPU - Model: SqueezeNet v1.1
  ms, Fewer Is Better
  1: 294.64 (SE +/- 0.11, N = 3; MIN: 292.53 / MAX: 302.03)
  2: 294.40 (SE +/- 0.60, N = 3; MIN: 291.65 / MAX: 305.43)
  4: 294.82 (SE +/- 0.42, N = 3; MIN: 292.2 / MAX: 297.29)
  (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Etcpak 0.7
  Configuration: ETC1 + Dithering
  Mpx/s, More Is Better
  1: 251.45 (SE +/- 0.09, N = 3)
  2: 250.38 (SE +/- 0.23, N = 3)
  3: 251.86 (SE +/- 0.08, N = 3)
  4: 250.29 (SE +/- 0.13, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7
  Configuration: ETC1
  Mpx/s, More Is Better
  1: 267.30 (SE +/- 0.20, N = 3)
  2: 268.83 (SE +/- 0.14, N = 3)
  3: 267.13 (SE +/- 0.29, N = 3)
  4: 268.42 (SE +/- 0.32, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Redis 6.0.9
  Test: LPUSH
  Requests Per Second, More Is Better
  1: 1391544.79 (SE +/- 6897.86, N = 3)
  2: 1375021.75 (SE +/- 14301.64, N = 3)
  4: 1374636.71 (SE +/- 5529.55, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9
  Test: SADD
  Requests Per Second, More Is Better
  1: 1858095.32 (SE +/- 22404.68, N = 4)
  2: 1875200.92 (SE +/- 15518.13, N = 3)
  4: 1831314.21 (SE +/- 10271.92, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9
  Test: SET
  Requests Per Second, More Is Better
  1: 1622230.00 (SE +/- 20237.27, N = 3)
  2: 1419988.83 (SE +/- 5724.28, N = 3)
  4: 1418826.16 (SE +/- 5141.37, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
lzbench 1.8
  Test: Libdeflate 1 - Process: Compression
  MB/s, More Is Better
  1: 199
  2: 199
  3: 199
  4: 199
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
ASKAP 1.0
  Test: tConvolve OpenMP - Degridding
  Million Grid Points Per Second, More Is Better
  1: 2012.03 (SE +/- 5.06, N = 3)
  2: 1967.43 (SE +/- 4.84, N = 3)
  4: 2017.09 (SE +/- 0.00, N = 4)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0
  Test: tConvolve OpenMP - Gridding
  Million Grid Points Per Second, More Is Better
  1: 1358.47 (SE +/- 4.00, N = 3)
  2: 1313.90 (SE +/- 9.44, N = 3)
  4: 1348.70 (SE +/- 15.91, N = 4)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Redis 6.0.9
  Test: LPOP
  Requests Per Second, More Is Better
  1: 2360946.67 (SE +/- 5765.15, N = 3)
  2: 1476177.92 (SE +/- 2253.77, N = 3)
  4: 1439625.16 (SE +/- 7518.77, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
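A rough way to judge whether two results differ by more than their trial noise is to express the gap between their means in units of the combined standard error, sqrt(SE_a^2 + SE_b^2). The sketch below applies this to the Redis LPOP numbers. It is a heuristic chosen for this example, not a formal hypothesis test and not something the Phoronix Test Suite computes; with N = 3 trials per result the SE estimates themselves are coarse. The function name is hypothetical.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Gap between two means divided by the combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Redis LPOP, requests per second: (mean, SE) copied from the result above.
lpop_1_vs_2 = difference_in_se_units(2360946.67, 5765.15, 1476177.92, 2253.77)
lpop_2_vs_4 = difference_in_se_units(1476177.92, 2253.77, 1439625.16, 7518.77)

print(f"result 1 vs 2: {lpop_1_vs_2:.1f} combined SEs apart")
print(f"result 2 vs 4: {lpop_2_vs_4:.1f} combined SEs apart")
```

The gap between results 1 and 2 is far larger than the trial noise reported for either, which suggests a change between the runs rather than ordinary variation. The export itself does not say what differed.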
Algebraic Multi-Grid Benchmark 1.2
  Figure Of Merit, More Is Better
  1: 270577233 (SE +/- 1777861.64, N = 3)
  2: 274610300 (SE +/- 435487.00, N = 3)
  3: 274249833 (SE +/- 258877.10, N = 3)
  4: 274539633 (SE +/- 197532.95, N = 3)
  (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
dav1d 0.8.1
  Video Input: Summer Nature 1080p
  FPS, More Is Better
  1: 334.36 (SE +/- 0.56, N = 3; MIN: 282.32 / MAX: 365.25)
  2: 336.60 (SE +/- 1.35, N = 3; MIN: 292.11 / MAX: 366.71)
  4: 336.75 (SE +/- 0.77, N = 3; MIN: 293.79 / MAX: 367.96)
  (CC) gcc options: -pthread
WebP2 Image Encode 20210126
  Encode Settings: Default
  Seconds, Fewer Is Better
  1: 7.442 (SE +/- 0.010, N = 3)
  2: 7.380 (SE +/- 0.044, N = 3)
  4: 7.352 (SE +/- 0.052, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
LULESH 2.0.3
  z/s, More Is Better
  1: 1024.39 (SE +/- 0.40, N = 3)
  2: 1026.66 (SE +/- 1.88, N = 3)
  4: 1026.65 (SE +/- 0.49, N = 3)
  (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
Etcpak 0.7
  Configuration: DXT1
  Mpx/s, More Is Better
  1: 1118.28 (SE +/- 1.85, N = 3)
  2: 1123.24 (SE +/- 2.95, N = 3)
  3: 1123.19 (SE +/- 0.94, N = 3)
  4: 1122.17 (SE +/- 2.77, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
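Because the tests mix "More Is Better" and "Fewer Is Better" units, summarizing one result against another needs each ratio oriented the same way first. The sketch below does that for four tests copied from the results above and combines them with a geometric mean. The choice of tests, of result 1 as the baseline, and the helper names are all assumptions of this example; a summary over all 77 tests would use the same steps.

```python
import math

# (name, higher_is_better, result 1, result 4); values copied from the results above.
tests = [
    ("LAMMPS 20k Atoms (ns/day)", True, 2.622, 2.612),
    ("Quantum ESPRESSO AUSURF112 (s)", False, 2234.35, 2231.47),
    ("CloverLeaf (s)", False, 135.83, 132.99),
    ("Redis SET (req/s)", True, 1622230.00, 1418826.16),
]


def oriented_ratio(higher_is_better, baseline, candidate):
    """Ratio above 1.0 means the candidate result is better than the baseline."""
    return candidate / baseline if higher_is_better else baseline / candidate


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid overflow on long lists."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = [oriented_ratio(h, a, b) for _, h, a, b in tests]
overall = geometric_mean(ratios)

for (name, _, _, _), ratio in zip(tests, ratios):
    print(f"{name:32s} {ratio:.4f}")
print(f"{'geometric mean':32s} {overall:.4f}")
```

The geometric mean is used rather than the arithmetic mean so that a 2x gain and a 2x loss cancel out to 1.0.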
Phoronix Test Suite v10.8.5