Core i7 6800K Broadwell
Intel Core i7-6800K testing with an MSI X99A WORKSTATION (MS-7A54) v1.0 (1.10 BIOS) and Zotac NVIDIA NV137 2GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2102070-HA-COREI768056&grs
Core i7 6800K Broadwell: System Details
Runs: 1, 2, 3, 4
  Processor: Intel Core i7-6800K @ 3.80GHz (6 Cores / 12 Threads)
  Motherboard: MSI X99A WORKSTATION (MS-7A54) v1.0 (1.10 BIOS)
  Chipset: Intel Xeon E7 v4/Xeon
  Memory: 16GB
  Disk: 120GB TOSHIBA TR150
  Graphics: Zotac NVIDIA NV137 2GB
  Audio: Realtek ALC1150
  Monitor: G237HL
  Network: Intel I218-LM + Intel I210
  OS: Ubuntu 20.10
  Kernel: 5.8.0-33-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 20.2.1
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  - Transparent Huge Pages: madvise
Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  - Scaling Governor: intel_cpufreq ondemand
  - CPU Microcode: 0xb000038
Python Details
  - Python 3.8.6
Security Details
  - itlb_multihit: KVM: Mitigation of VMX disabled
  - l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  - mds: Mitigation of Clear buffers; SMT vulnerable
  - meltdown: Mitigation of PTI
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Core i7 6800K Broadwell: Result Overview
Test | 1 | 2 | 3 | 4
redis: LPOP | 2360946.67 | 1476177.92 | | 1439625.16
redis: SET | 1622230.00 | 1419988.83 | | 1418826.16
redis: GET | 2169857.05 | 1902703.00 | | 1940969.00
mnn: inception-v3 | 52.505 | 55.068 | | 53.764
qmcpack: simple-H2O | 48.633 | 50.048 | 48.762 | 47.806
pennant: leblancbig | 96.91914 | 97.78042 | 93.68373 | 97.29220
askap: tConvolve OpenMP - Gridding | 1358.47 | 1313.90 | | 1348.70
mnn: resnet-v2-50 | 47.848 | 46.596 | | 47.599
askap: tConvolve OpenMP - Degridding | 2012.03 | 1967.43 | | 2017.09
redis: SADD | 1858095.32 | 1875200.92 | | 1831314.21
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 135.83 | 132.89 | 133.01 | 132.99
financebench: Repo OpenMP | 63321.796484 | 62223.018229 | | 63499.069010
openfoam: Motorbike 30M | 305.71 | 301.03 | 302.33 | 300.80
amg: | 270577233 | 274610300 | 274249833 | 274539633
lzbench: Zstd 1 - Decompression | 1555 | 1558 | 1553 | 1538
npb: EP.D | 679.72 | 680.36 | 671.75 | 672.86
redis: LPUSH | 1391544.79 | 1375021.75 | | 1374636.71
webp2: Default | 7.442 | 7.380 | | 7.352
pennant: sedovbig | 146.2509 | 145.2323 | 144.5588 | 146.1675
quantlib: | 2054.6 | 2040.1 | 2037.4 | 2031.9
lzbench: Crush 0 - Compression | 91 | 91 | 91 | 90
kripke: | 37977043 | 38369580 | | 37965963
npb: LU.C | 17528.76 | 17588.79 | 17412.50 | 17463.90
financebench: Bonds OpenMP | 89486.112305 | 90365.164062 | | 90116.080729
rav1e: 6 | 1.237 | 1.249 | | 1.245
lzbench: XZ 0 - Decompression | 105 | 105 | 105 | 106
lzbench: Zstd 1 - Compression | 441 | 444 | 445 | 445
lzbench: Brotli 0 - Decompression | 577 | 572 | 576 | 575
rav1e: 5 | 0.946 | 0.948 | | 0.954
webp2: Quality 75, Compression Effort 7 | 420.873 | 423.761 | | 420.404
askap: Hogbom Clean OpenMP | 201.884 | 202.704 | | 201.208
dav1d: Chimera 1080p | 376.11 | 374.53 | | 377.25
dav1d: Summer Nature 1080p | 334.36 | 336.60 | | 336.75
etcpak: ETC1 | 267.298 | 268.831 | 267.131 | 268.422
etcpak: ETC1 + Dithering | 251.448 | 250.378 | 251.858 | 250.290
rav1e: 1 | 0.323 | 0.322 | | 0.321
lammps: 20k Atoms | 2.622 | 2.606 | | 2.612
qe: AUSURF112 | 2234.35 | 2237.08 | 2244.69 | 2231.47
mnn: SqueezeNetV1.0 | 7.245 | 7.226 | | 7.266
cp2k: Fayalite-FIST Data | 1389.702 | 1382.848 | 1389.359 | 1386.315
lzbench: Brotli 0 - Compression | 412 | 414 | 414 | 413
etcpak: ETC2 | 150.909 | 150.981 | 150.950 | 151.617
tnn: CPU - MobileNet v2 | 300.937 | 301.042 | | 302.312
etcpak: DXT1 | 1118.279 | 1123.243 | 1123.194 | 1122.173
openfoam: Motorbike 60M | 1180.20 | 1180.39 | 1178.12 | 1182.70
dav1d: Chimera 1080p 10-bit | 68.72 | 68.49 | | 68.73
rav1e: 10 | 2.686 | 2.683 | | 2.677
gcrypt: | 242.024 | 242.658 | | 242.259
lzbench: Zstd 8 - Decompression | 1610 | 1607 | 1607 | 1606
gnupg: 2.7GB Sample File Encryption | 74.228 | 74.351 | | 74.408
webp2: Quality 95, Compression Effort 7 | 763.421 | 761.959 | | 763.753
synthmark: VoiceMark_100 | 614.327 | 614.041 | | 615.486
lulesh: | 1024.3880 | 1026.6576 | | 1026.6542
onnx: bertsquad-10 - OpenMP CPU | 488 | 488 | | 487
askap: tConvolve MT - Gridding | 1264.54 | 1265.38 | | 1263.04
dav1d: Summer Nature 4K | 109.20 | 109.16 | | 109.01
onnx: super-resolution-10 - OpenMP CPU | 3483 | 3489 | | 3488
webp2: Quality 100, Lossless Compression | 1366.813 | 1366.258 | | 1364.603
lzbench: Brotli 2 - Decompression | 668 | 669 | 669 | 669
tnn: CPU - SqueezeNet v1.1 | 294.641 | 294.403 | | 294.823
askap: tConvolve MT - Degridding | 1629.03 | 1630.14 | | 1627.93
onnx: shufflenet-v2-10 - OpenMP CPU | 10369 | 10372 | | 10382
build-godot: Time To Compile | 251.105 | 250.845 | | 251.051
webp2: Quality 100, Compression Effort 5 | 21.056 | 21.074 | | 21.077
onnx: fcn-resnet101-11 - OpenMP CPU | 45 | 45 | | 45
onnx: yolov4 - OpenMP CPU | 277 | 277 | | 277
lzbench: Libdeflate 1 - Compression | 199 | 199 | 199 | 199
lzbench: Brotli 2 - Compression | 160 | 160 | 160 | 160
lzbench: Crush 0 - Decompression | 483 | 483 | 483 | 483
lzbench: Zstd 8 - Compression | 75 | 75 | 75 | 75
lzbench: XZ 0 - Compression | 37 | 37 | 37 | 37
mnn: mobilenet-v1-1.0 | 4.912 | 5.532 | | 4.914
mnn: MobileNetV2_224 | 4.234 | 4.166 | | 4.030
askap: tConvolve MPI - Gridding | 1424.62 | 1417.70 | | 1388.83
askap: tConvolve MPI - Degridding | 1107.995 | 1128.17 | | 1102.36
lammps: Rhodopsin Protein | 2.866 | 2.923 | | 2.880
npb: EP.C | 617.96 | 627.12 | 619.24 | 636.84
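The overview lists up to four runs of the same system, so one quick way to read it is to check how far the runs disagree on each test. The sketch below is only an illustration and not part of the Phoronix Test Suite: the helper names `spread_percent` and `flag_outliers` and the 5% threshold are assumptions, and the numbers are copied from a few overview rows (runs 1, 2 and 4).

```python
# Values copied from the result overview (runs 1, 2 and 4).
results = {
    "redis: LPOP": [2360946.67, 1476177.92, 1439625.16],
    "redis: SET": [1622230.00, 1419988.83, 1418826.16],
    "redis: GET": [2169857.05, 1902703.00, 1940969.00],
    "gcrypt": [242.024, 242.658, 242.259],
}


def spread_percent(values):
    """Return the gap between the largest and smallest value, as a percentage of the smallest."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


def flag_outliers(table, threshold=5.0):
    """Return {test: spread} for tests whose run-to-run spread exceeds `threshold` percent.

    The 5% default is an arbitrary illustrative cut-off, not a value from the export.
    """
    flagged = {}
    for name, values in table.items():
        spread = spread_percent(values)
        if spread > threshold:
            flagged[name] = round(spread, 1)
    return flagged


if __name__ == "__main__":
    for name, spread in flag_outliers(results).items():
        print(f"{name}: {spread}% spread between runs")
```

With these four rows, only the three Redis tests exceed the threshold, while the Gcrypt runs agree to well under one percent. A flagged spread is only a prompt to look at the per-test SE and N values below before drawing conclusions.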
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better)
  Run 1: 2360946.67 (SE +/- 5765.15, N = 3)
  Run 2: 1476177.92 (SE +/- 2253.77, N = 3)
  Run 4: 1439625.16 (SE +/- 7518.77, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better)
  Run 1: 1622230.00 (SE +/- 20237.27, N = 3)
  Run 2: 1419988.83 (SE +/- 5724.28, N = 3)
  Run 4: 1418826.16 (SE +/- 5141.37, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better)
  Run 1: 2169857.05 (SE +/- 33099.62, N = 12)
  Run 2: 1902703.00 (SE +/- 25808.83, N = 3)
  Run 4: 1940969.00 (SE +/- 17441.42, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  Run 1: 52.51 (SE +/- 0.36, N = 3; MIN: 49.16 / MAX: 82.48)
  Run 2: 55.07 (SE +/- 0.99, N = 3; MIN: 46.94 / MAX: 90.47)
  Run 4: 53.76 (SE +/- 0.52, N = 3; MIN: 49.79 / MAX: 90.86)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better)
  Run 1: 48.63 (SE +/- 0.12, N = 3)
  Run 2: 50.05 (SE +/- 0.29, N = 3)
  Run 3: 48.76 (SE +/- 0.63, N = 3)
  Run 4: 47.81 (SE +/- 0.52, N = 15)
  (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
Pennant 1.0.1 - Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better)
  Run 1: 96.92 (SE +/- 0.04, N = 3)
  Run 2: 97.78 (SE +/- 0.93, N = 3)
  Run 3: 93.68 (SE +/- 1.16, N = 3)
  Run 4: 97.29 (SE +/- 0.75, N = 10)
  (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi
ASKAP 1.0 - Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
  Run 1: 1358.47 (SE +/- 4.00, N = 3)
  Run 2: 1313.90 (SE +/- 9.44, N = 3)
  Run 4: 1348.70 (SE +/- 15.91, N = 4)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  Run 1: 47.85 (SE +/- 0.48, N = 3; MIN: 37.96 / MAX: 79.71)
  Run 2: 46.60 (SE +/- 0.24, N = 3; MIN: 36.23 / MAX: 79.17)
  Run 4: 47.60 (SE +/- 1.13, N = 3; MIN: 37.37 / MAX: 79.49)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
ASKAP 1.0 - Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
  Run 1: 2012.03 (SE +/- 5.06, N = 3)
  Run 2: 1967.43 (SE +/- 4.84, N = 3)
  Run 4: 2017.09 (SE +/- 0.00, N = 4)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better)
  Run 1: 1858095.32 (SE +/- 22404.68, N = 4)
  Run 2: 1875200.92 (SE +/- 15518.13, N = 3)
  Run 4: 1831314.21 (SE +/- 10271.92, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better)
  Run 1: 135.83 (SE +/- 0.20, N = 3)
  Run 2: 132.89 (SE +/- 0.13, N = 3)
  Run 3: 133.01 (SE +/- 0.14, N = 3)
  Run 4: 132.99 (SE +/- 0.19, N = 3)
  (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
FinanceBench 2016-07-25 - Benchmark: Repo OpenMP (ms, Fewer Is Better)
  Run 1: 63321.80 (SE +/- 484.36, N = 10)
  Run 2: 62223.02 (SE +/- 833.85, N = 3)
  Run 4: 63499.07 (SE +/- 656.55, N = 3)
  (CXX) g++ options: -O3 -march=native -fopenmp
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better)
  Run 1: 305.71 (SE +/- 0.57, N = 3)
  Run 2: 301.03 (SE +/- 0.52, N = 3)
  Run 3: 302.33 (SE +/- 0.48, N = 3)
  Run 4: 300.80 (SE +/- 0.32, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  Run 1: 270577233 (SE +/- 1777861.64, N = 3)
  Run 2: 274610300 (SE +/- 435487.00, N = 3)
  Run 3: 274249833 (SE +/- 258877.10, N = 3)
  Run 4: 274539633 (SE +/- 197532.95, N = 3)
  (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
lzbench 1.8 - Test: Zstd 1 - Process: Decompression (MB/s, More Is Better)
  Run 1: 1555 (SE +/- 3.21, N = 3)
  Run 2: 1558 (SE +/- 1.76, N = 3)
  Run 3: 1553 (SE +/- 1.15, N = 3)
  Run 4: 1538 (SE +/- 13.53, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
  Run 1: 679.72 (SE +/- 3.26, N = 3)
  Run 2: 680.36 (SE +/- 4.51, N = 3)
  Run 3: 671.75 (SE +/- 5.42, N = 9)
  Run 4: 672.86 (SE +/- 0.75, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better)
  Run 1: 1391544.79 (SE +/- 6897.86, N = 3)
  Run 2: 1375021.75 (SE +/- 14301.64, N = 3)
  Run 4: 1374636.71 (SE +/- 5529.55, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
WebP2 Image Encode 20210126 - Encode Settings: Default (Seconds, Fewer Is Better)
  Run 1: 7.442 (SE +/- 0.010, N = 3)
  Run 2: 7.380 (SE +/- 0.044, N = 3)
  Run 4: 7.352 (SE +/- 0.052, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
Pennant 1.0.1 - Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better)
  Run 1: 146.25 (SE +/- 1.70, N = 3)
  Run 2: 145.23 (SE +/- 0.89, N = 3)
  Run 3: 144.56 (SE +/- 1.66, N = 4)
  Run 4: 146.17 (SE +/- 1.30, N = 3)
  (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi
QuantLib 1.21 (MFLOPS, More Is Better)
  Run 1: 2054.6 (SE +/- 4.27, N = 3)
  Run 2: 2040.1 (SE +/- 14.42, N = 12)
  Run 3: 2037.4 (SE +/- 13.38, N = 13)
  Run 4: 2031.9 (SE +/- 18.39, N = 7)
  (CXX) g++ options: -O3 -march=native -rdynamic
lzbench 1.8 - Test: Crush 0 - Process: Compression (MB/s, More Is Better)
  Run 1: 91 (SE +/- 1.00, N = 3)
  Run 2: 91 (SE +/- 1.20, N = 3)
  Run 3: 91 (SE +/- 0.67, N = 3)
  Run 4: 90 (SE +/- 0.88, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Kripke 1.2.4 (Throughput FoM, More Is Better)
  Run 1: 37977043 (SE +/- 109535.42, N = 3)
  Run 2: 38369580 (SE +/- 50361.23, N = 3)
  Run 4: 37965963 (SE +/- 223603.64, N = 3)
  (CXX) g++ options: -O3 -fopenmp
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
  Run 1: 17528.76 (SE +/- 88.88, N = 3)
  Run 2: 17588.79 (SE +/- 223.23, N = 3)
  Run 3: 17412.50 (SE +/- 99.96, N = 3)
  Run 4: 17463.90 (SE +/- 71.12, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
FinanceBench 2016-07-25 - Benchmark: Bonds OpenMP (ms, Fewer Is Better)
  Run 1: 89486.11 (SE +/- 782.31, N = 8)
  Run 2: 90365.16 (SE +/- 681.69, N = 3)
  Run 4: 90116.08 (SE +/- 836.51, N = 6)
  (CXX) g++ options: -O3 -march=native -fopenmp
rav1e 0.4 - Speed: 6 (Frames Per Second, More Is Better)
  Run 1: 1.237 (SE +/- 0.010, N = 3)
  Run 2: 1.249 (SE +/- 0.001, N = 3)
  Run 4: 1.245 (SE +/- 0.002, N = 3)
lzbench 1.8 - Test: XZ 0 - Process: Decompression (MB/s, More Is Better)
  Run 1: 105
  Run 2: 105
  Run 3: 105
  Run 4: 106
  SE reported for one run only (run assignment not recoverable from this export): SE +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 1 - Process: Compression (MB/s, More Is Better)
  Run 1: 441
  Run 2: 444
  Run 3: 445
  Run 4: 445
  SE reported for two runs only (run assignment not recoverable from this export): SE +/- 1.20, N = 3; SE +/- 1.00, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 0 - Process: Decompression (MB/s, More Is Better)
  Run 1: 577
  Run 2: 572
  Run 3: 576
  Run 4: 575
  SE reported for three runs only (run assignment not recoverable from this export): SE +/- 4.58, N = 3; SE +/- 0.67, N = 3; SE +/- 1.86, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
rav1e 0.4 - Speed: 5 (Frames Per Second, More Is Better)
  Run 1: 0.946 (SE +/- 0.005, N = 3)
  Run 2: 0.948 (SE +/- 0.007, N = 3)
  Run 4: 0.954 (SE +/- 0.003, N = 3)
WebP2 Image Encode 20210126 - Encode Settings: Quality 75, Compression Effort 7 (Seconds, Fewer Is Better)
  Run 1: 420.87 (SE +/- 0.32, N = 3)
  Run 2: 423.76 (SE +/- 1.25, N = 3)
  Run 4: 420.40 (SE +/- 0.64, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
ASKAP 1.0 - Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better)
  Run 1: 201.88 (SE +/- 0.14, N = 3)
  Run 2: 202.70 (SE +/- 0.36, N = 3)
  Run 4: 201.21 (SE +/- 0.23, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
  Run 1: 376.11 (SE +/- 0.96, N = 3; MIN: 279 / MAX: 582.19)
  Run 2: 374.53 (SE +/- 0.96, N = 3; MIN: 277.99 / MAX: 569.54)
  Run 4: 377.25 (SE +/- 0.26, N = 3; MIN: 279.46 / MAX: 578.3)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Summer Nature 1080p (FPS, More Is Better)
  Run 1: 334.36 (SE +/- 0.56, N = 3; MIN: 282.32 / MAX: 365.25)
  Run 2: 336.60 (SE +/- 1.35, N = 3; MIN: 292.11 / MAX: 366.71)
  Run 4: 336.75 (SE +/- 0.77, N = 3; MIN: 293.79 / MAX: 367.96)
  (CC) gcc options: -pthread
Etcpak 0.7 - Configuration: ETC1 (Mpx/s, More Is Better)
  Run 1: 267.30 (SE +/- 0.20, N = 3)
  Run 2: 268.83 (SE +/- 0.14, N = 3)
  Run 3: 267.13 (SE +/- 0.29, N = 3)
  Run 4: 268.42 (SE +/- 0.32, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC1 + Dithering (Mpx/s, More Is Better)
  Run 1: 251.45 (SE +/- 0.09, N = 3)
  Run 2: 250.38 (SE +/- 0.23, N = 3)
  Run 3: 251.86 (SE +/- 0.08, N = 3)
  Run 4: 250.29 (SE +/- 0.13, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
rav1e 0.4 - Speed: 1 (Frames Per Second, More Is Better)
  Run 1: 0.323 (SE +/- 0.001, N = 3)
  Run 2: 0.322 (SE +/- 0.001, N = 3)
  Run 4: 0.321 (SE +/- 0.002, N = 3)
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: 20k Atoms (ns/day, More Is Better)
  Run 1: 2.622 (SE +/- 0.012, N = 3)
  Run 2: 2.606 (SE +/- 0.014, N = 3)
  Run 4: 2.612 (SE +/- 0.007, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better)
  Run 1: 2234.35 (SE +/- 15.22, N = 3)
  Run 2: 2237.08 (SE +/- 18.79, N = 3)
  Run 3: 2244.69 (SE +/- 23.10, N = 5)
  Run 4: 2231.47 (SE +/- 25.11, N = 4)
  (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  Run 1: 7.245 (SE +/- 0.074, N = 3; MIN: 6.76 / MAX: 25.21)
  Run 2: 7.226 (SE +/- 0.072, N = 3; MIN: 6.79 / MAX: 26.4)
  Run 4: 7.266 (SE +/- 0.090, N = 3; MIN: 6.78 / MAX: 26.54)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better)
  Run 1: 1389.70
  Run 2: 1382.85
  Run 3: 1389.36
  Run 4: 1386.32
lzbench 1.8 - Test: Brotli 0 - Process: Compression (MB/s, More Is Better)
  Run 1: 412
  Run 2: 414
  Run 3: 414
  Run 4: 413
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better)
  Run 1: 150.91 (SE +/- 0.24, N = 3)
  Run 2: 150.98 (SE +/- 0.35, N = 3)
  Run 3: 150.95 (SE +/- 0.17, N = 3)
  Run 4: 151.62 (SE +/- 0.11, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  Run 1: 300.94 (SE +/- 0.21, N = 3; MIN: 298.58 / MAX: 308.32)
  Run 2: 301.04 (SE +/- 0.46, N = 3; MIN: 298.88 / MAX: 315.68)
  Run 4: 302.31 (SE +/- 0.88, N = 3; MIN: 299.23 / MAX: 325.81)
  (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Etcpak 0.7 - Configuration: DXT1 (Mpx/s, More Is Better)
  Run 1: 1118.28 (SE +/- 1.85, N = 3)
  Run 2: 1123.24 (SE +/- 2.95, N = 3)
  Run 3: 1123.19 (SE +/- 0.94, N = 3)
  Run 4: 1122.17 (SE +/- 2.77, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better)
  Run 1: 1180.20 (SE +/- 2.04, N = 3)
  Run 2: 1180.39 (SE +/- 0.58, N = 3)
  Run 3: 1178.12 (SE +/- 2.24, N = 3)
  Run 4: 1182.70 (SE +/- 1.99, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  Run 1: 68.72 (SE +/- 0.04, N = 3; MIN: 44.63 / MAX: 173.21)
  Run 2: 68.49 (SE +/- 0.07, N = 3; MIN: 44.63 / MAX: 171.9)
  Run 4: 68.73 (SE +/- 0.09, N = 3; MIN: 44.64 / MAX: 172.53)
  (CC) gcc options: -pthread
rav1e 0.4 - Speed: 10 (Frames Per Second, More Is Better)
  Run 1: 2.686 (SE +/- 0.028, N = 3)
  Run 2: 2.683 (SE +/- 0.006, N = 3)
  Run 4: 2.677 (SE +/- 0.024, N = 3)
Gcrypt Library 1.9 (Seconds, Fewer Is Better)
  Run 1: 242.02 (SE +/- 0.47, N = 3)
  Run 2: 242.66 (SE +/- 0.08, N = 3)
  Run 4: 242.26 (SE +/- 0.10, N = 3)
  (CC) gcc options: -O2 -fvisibility=hidden -lgpg-error
lzbench 1.8 - Test: Zstd 8 - Process: Decompression (MB/s, More Is Better)
  Run 1: 1610
  Run 2: 1607
  Run 3: 1607
  Run 4: 1606
  SE reported for three runs only (run assignment not recoverable from this export): SE +/- 2.85, N = 3; SE +/- 0.88, N = 3; SE +/- 2.85, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
GnuPG 2.2.27 - 2.7GB Sample File Encryption (Seconds, Fewer Is Better)
  Run 1: 74.23 (SE +/- 0.67, N = 13)
  Run 2: 74.35 (SE +/- 0.71, N = 13)
  Run 4: 74.41 (SE +/- 0.64, N = 13)
  (CC) gcc options: -O2
WebP2 Image Encode 20210126 - Encode Settings: Quality 95, Compression Effort 7 (Seconds, Fewer Is Better)
  Run 1: 763.42 (SE +/- 0.90, N = 3)
  Run 2: 761.96 (SE +/- 1.11, N = 3)
  Run 4: 763.75 (SE +/- 1.31, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better)
  Run 1: 614.33 (SE +/- 0.38, N = 3)
  Run 2: 614.04 (SE +/- 0.24, N = 3)
  Run 4: 615.49 (SE +/- 0.82, N = 3)
  (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
LULESH 2.0.3 (z/s, More Is Better)
  Run 1: 1024.39 (SE +/- 0.40, N = 3)
  Run 2: 1026.66 (SE +/- 1.88, N = 3)
  Run 4: 1026.65 (SE +/- 0.49, N = 3)
  (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 1: 488 (SE +/- 0.44, N = 3)
  Run 2: 488 (SE +/- 0.29, N = 3)
  Run 4: 487 (SE +/- 0.33, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ASKAP 1.0 - Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
  Run 1: 1264.54 (SE +/- 0.17, N = 3)
  Run 2: 1265.38 (SE +/- 0.29, N = 3)
  Run 4: 1263.04 (SE +/- 0.88, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  Run 1: 109.20 (SE +/- 0.13, N = 3; MIN: 103.03 / MAX: 122.93)
  Run 2: 109.16 (SE +/- 0.04, N = 3; MIN: 103.11 / MAX: 122.63)
  Run 4: 109.01 (SE +/- 0.14, N = 3; MIN: 103.01 / MAX: 122.86)
  (CC) gcc options: -pthread
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 1: 3483 (SE +/- 8.93, N = 3)
  Run 2: 3489 (SE +/- 1.32, N = 3)
  Run 4: 3488 (SE +/- 5.22, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Lossless Compression (Seconds, Fewer Is Better)
  Run 1: 1366.81 (SE +/- 0.15, N = 3)
  Run 2: 1366.26 (SE +/- 0.70, N = 3)
  Run 4: 1364.60 (SE +/- 0.37, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
lzbench 1.8 - Test: Brotli 2 - Process: Decompression (MB/s, More Is Better)
  Run 1: 668
  Run 2: 669
  Run 3: 669
  Run 4: 669
  SE reported for three runs only (run assignment not recoverable from this export): SE +/- 0.88, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
  Run 1: 294.64 (SE +/- 0.11, N = 3; MIN: 292.53 / MAX: 302.03)
  Run 2: 294.40 (SE +/- 0.60, N = 3; MIN: 291.65 / MAX: 305.43)
  Run 4: 294.82 (SE +/- 0.42, N = 3; MIN: 292.2 / MAX: 297.29)
  (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
ASKAP 1.0 - Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
  Run 1: 1629.03 (SE +/- 1.21, N = 3)
  Run 2: 1630.14 (SE +/- 0.96, N = 3)
  Run 4: 1627.93 (SE +/- 1.00, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 1: 10369 (SE +/- 13.87, N = 3)
  Run 2: 10372 (SE +/- 8.91, N = 3)
  Run 4: 10382 (SE +/- 17.32, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 251.11 (SE +/- 0.72, N = 3)
  Run 2: 250.85 (SE +/- 0.30, N = 3)
  Run 4: 251.05 (SE +/- 0.08, N = 3)
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Compression Effort 5 (Seconds, Fewer Is Better)
  Run 1: 21.06 (SE +/- 0.03, N = 3)
  Run 2: 21.07 (SE +/- 0.01, N = 3)
  Run 4: 21.08 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 1: 45
  Run 2: 45
  Run 4: 45
  SE reported for two runs only (run assignment not recoverable from this export): SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 1: 277 (SE +/- 0.17, N = 3)
  Run 2: 277 (SE +/- 0.17, N = 3)
  Run 4: 277 (SE +/- 0.17, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
lzbench 1.8 - Test: Libdeflate 1 - Process: Compression (MB/s, More Is Better)
  Run 1: 199
  Run 2: 199
  Run 3: 199
  Run 4: 199
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 2 - Process: Compression (MB/s, More Is Better)
  Run 1: 160
  Run 2: 160
  Run 3: 160
  Run 4: 160
  SE reported for two runs only (run assignment not recoverable from this export): SE +/- 0.58, N = 3; SE +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Crush 0 - Process: Decompression (MB/s, More Is Better)
  Run 1: 483
  Run 2: 483
  Run 3: 483
  Run 4: 483
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 8 - Process: Compression (MB/s, More Is Better)
  Run 1: 75
  Run 2: 75
  Run 3: 75
  Run 4: 75
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: XZ 0 - Process: Compression (MB/s, More Is Better)
  Run 1: 37
  Run 2: 37
  Run 3: 37
  Run 4: 37
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  Run 1: 4.912 (SE +/- 0.027, N = 3; MIN: 4.54 / MAX: 22.9)
  Run 2: 5.532 (SE +/- 0.446, N = 3; MIN: 4.23 / MAX: 16.41)
  Run 4: 4.914 (SE +/- 0.059, N = 3; MIN: 4.26 / MAX: 23.95)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  Run 1: 4.234 (SE +/- 0.162, N = 3; MIN: 3.64 / MAX: 22.78)
  Run 2: 4.166 (SE +/- 0.183, N = 3; MIN: 3.64 / MAX: 25)
  Run 4: 4.030 (SE +/- 0.075, N = 3; MIN: 3.63 / MAX: 18.94)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
ASKAP 1.0 - Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
  Run 1: 1424.62 (SE +/- 45.47, N = 15)
  Run 2: 1417.70 (SE +/- 47.82, N = 12)
  Run 4: 1388.83 (SE +/- 40.34, N = 15)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
  Run 1: 1108.00 (SE +/- 25.87, N = 15)
  Run 2: 1128.17 (SE +/- 31.02, N = 12)
  Run 4: 1102.36 (SE +/- 17.94, N = 15)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
  Run 1: 2.866 (SE +/- 0.060, N = 15)
  Run 2: 2.923 (SE +/- 0.110, N = 15)
  Run 4: 2.880 (SE +/- 0.087, N = 12)
  (CXX) g++ options: -O3 -pthread -lm
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better)
  Run 1: 617.96 (SE +/- 13.91, N = 15)
  Run 2: 627.12 (SE +/- 10.35, N = 15)
  Run 3: 619.24 (SE +/- 10.09, N = 15)
  Run 4: 636.84 (SE +/- 8.17, N = 15)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
Phoronix Test Suite v10.8.5