AMD Ryzen 9 3900X 12-Core testing of GCC 9 and the GCC 10 development branch, with and without Znver2 tuning, following recent cost table updates. Benchmarks by Michael Larabel for a future article.
GCC 9.1.0
  Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
  Compiler Notes: --disable-multilib --enable-checking=release
  Processor Notes: Scaling Governor: acpi-cpufreq ondemand
  Python Notes: Python 2.7.15+ + Python 3.6.8
  Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
GCC 9.1.0 znver2
  Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
  OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160
  Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
  Compiler Notes: --disable-multilib --enable-checking=release
  Processor Notes: Scaling Governor: acpi-cpufreq ondemand
  Python Notes: Python 2.7.15+ + Python 3.6.8
  Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
GCC 10.0.0
  Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
  Compiler Notes: --disable-multilib --enable-checking=release
  Processor Notes: Scaling Governor: acpi-cpufreq ondemand
  Python Notes: Python 2.7.15+ + Python 3.6.8
  Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
GCC 10.0.0 znver2
  OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 10.0.0 20190727, File-System: ext4, Screen Resolution: 3840x2160
  Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
  Compiler Notes: --disable-multilib --enable-checking=release
  Processor Notes: Scaling Governor: acpi-cpufreq ondemand
  Python Notes: Python 2.7.15+ + Python 3.6.8
  Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
AOM AV1 This is a simple test of the AOMedia AV1 encoder run on the CPU with a sample video file. Learn more via the OpenBenchmarking.org test page.
AOM AV1 2019-02-11, AV1 Video Encoding (Frames Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 0.31 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 0.27 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 0.32 (SE +/- 0.00, N = 3)
  GCC 10.0.0: 0.27 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SVT-AV1 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 0.5, 1080p 8-bit YUV To AV1 Video Encode (Frames Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 46.39 (SE +/- 0.19, N = 3)
  GCC 9.1.0: 46.45 (SE +/- 0.15, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 46.49 (SE +/- 0.13, N = 3)
  GCC 10.0.0: 46.22 (SE +/- 0.27, N = 3)
  1. (CXX) g++ options: -O3 -pie -lpthread -lm
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-HEVC 2019-02-03, 1080p 8-bit YUV To HEVC Video Encode (Frames Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 247.33 (SE +/- 0.72, N = 3)
  GCC 9.1.0: 246.01 (SE +/- 3.72, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 247.99 (SE +/- 1.78, N = 3)
  GCC 10.0.0: 248.85 (SE +/- 1.73, N = 3)
  1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -march=native -pie -rdynamic -lpthread -lrt
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-VP9 2019-02-17, 1080p 8-bit YUV To VP9 Video Encode (Frames Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 96.54 (SE +/- 0.08, N = 3)
  GCC 9.1.0: 89.99 (SE +/- 0.28, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 92.35 (SE +/- 0.19, N = 3)
  GCC 10.0.0: 89.84 (SE +/- 0.15, N = 3)
  1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -mavx -pie -rdynamic -lpthread -lrt -lm
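One way to read the SE figures is to compare the gap between two configurations against their combined standard error. The sketch below does this for the SVT-VP9 numbers above. The helper name `gap_in_combined_se` is hypothetical, and it assumes that the SE values are listed in the same order as the configurations and that the two runs are independent. With N = 3 this is a rough sanity check, not a formal significance test.

```python
import math


def gap_in_combined_se(mean_a, se_a, mean_b, se_b):
    """Return (mean_a - mean_b) divided by the combined standard error.

    Assumes the two results are independent, so their standard errors
    combine as sqrt(se_a**2 + se_b**2).
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        raise ValueError("combined standard error is zero")
    return (mean_a - mean_b) / combined


# SVT-VP9 frames per second as (mean, SE), taken from the results above.
znver2_gcc9 = (96.54, 0.08)  # GCC 9.1.0 with -march=znver2
plain_gcc9 = (89.99, 0.28)   # GCC 9.1.0 with -O3 only

ratio = gap_in_combined_se(*znver2_gcc9, *plain_gcc9)
print(f"gap is {ratio:.1f} combined standard errors")
```

A ratio well above 2 or 3 suggests the difference is unlikely to be run-to-run noise; a ratio near or below 1 (as for SVT-AV1 or x264) suggests the configurations are effectively tied.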
x264 This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.
x264 2018-09-25, H.264 Video Encoding (Frames Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 138.41 (SE +/- 2.03, N = 3)
  GCC 9.1.0: 139.59 (SE +/- 1.55, N = 7)
  GCC 10.0.0 znver2 (-march=znver2): 139.82 (SE +/- 2.09, N = 4)
  GCC 10.0.0: 138.74 (SE +/- 2.27, N = 3)
  1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
x265 This is a simple test of the x265 encoder run on the CPU with a sample 1080p video file. Learn more via the OpenBenchmarking.org test page.
x265 3.0, H.265 1080p Video Encoding (Frames Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 52.53 (SE +/- 0.06, N = 3)
  GCC 9.1.0: 52.94 (SE +/- 0.28, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 52.40 (SE +/- 0.19, N = 3)
  GCC 10.0.0: 53.00 (SE +/- 0.20, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 2.95225 (SE +/- 0.00095, N = 3)
  GCC 9.1.0: 2.73255 (SE +/- 0.00047, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 2.94730 (SE +/- 0.00082, N = 3)
  GCC 10.0.0: 2.72974 (SE +/- 0.00151, N = 3)
HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 1.71668 (SE +/- 0.00081, N = 3)
  GCC 9.1.0: 1.70820 (SE +/- 0.00015, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 1.73055 (SE +/- 0.00091, N = 3)
  GCC 10.0.0: 1.72205 (SE +/- 0.00098, N = 3)
HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 4.98832 (SE +/- 0.07571, N = 3)
  GCC 9.1.0: 4.89161 (SE +/- 0.02698, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 5.04603 (SE +/- 0.04322, N = 3)
  GCC 10.0.0: 4.94947 (SE +/- 0.05697, N = 3)
HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOP/s, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 8.60514 (SE +/- 0.06300, N = 3)
  GCC 9.1.0: 8.59803 (SE +/- 0.02013, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 8.63794 (SE +/- 0.02559, N = 3)
  GCC 10.0.0: 8.81748 (SE +/- 0.18198, N = 3)
  For each of the four results above: 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops 2. OpenBLAS + Open MPI 2.1.1
HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 71.78 (SE +/- 0.37, N = 3)
  GCC 9.1.0: 70.97 (SE +/- 0.23, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 71.05 (SE +/- 0.22, N = 3)
  GCC 10.0.0: 71.07 (SE +/- 0.08, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops 2. OpenBLAS + Open MPI 2.1.1
HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 32.60 (SE +/- 0.42, N = 3)
  GCC 9.1.0: 32.83 (SE +/- 0.19, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 32.84 (SE +/- 0.22, N = 3)
  GCC 10.0.0: 32.86 (SE +/- 0.11, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops 2. OpenBLAS + Open MPI 2.1.1
GNU MPC GNU MPC is a C library for the arithmetic of complex numbers. Learn more via the OpenBenchmarking.org test page.
GNU MPC 1.1.0, Multi-Precision Benchmark (Global Score, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 9577 (SE +/- 50.44, N = 3)
  GCC 9.1.0: 9597 (SE +/- 31.80, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 9357 (SE +/- 102.03, N = 3)
  GCC 10.0.0: 9580 (SE +/- 26.46, N = 3)
  1. (CC) gcc options: -lm -O3 -MT -MD -MP -MF
HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 0.09798 (SE +/- 0.00042, N = 3)
  GCC 9.1.0: 0.09757 (SE +/- 0.00036, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 0.09771 (SE +/- 0.00044, N = 3)
  GCC 10.0.0: 0.09778 (SE +/- 0.00041, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops 2. OpenBLAS + Open MPI 2.1.1
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: Swirl (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 259
  GCC 9.1.0: 251
  GCC 10.0.0 znver2 (-march=znver2): 264
  GCC 10.0.0: 254
  SE as reported (only three values listed, not matched to configurations): SE +/- 1.20, N = 3; SE +/- 1.86, N = 3; SE +/- 0.88, N = 3
GraphicsMagick 1.3.30, Operation: Rotate (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 263
  GCC 9.1.0: 262
  GCC 10.0.0 znver2 (-march=znver2): 277
  GCC 10.0.0: 262
  SE as reported (only three values listed, not matched to configurations): SE +/- 1.20, N = 3; SE +/- 1.86, N = 3; SE +/- 4.33, N = 3
GraphicsMagick 1.3.30, Operation: Sharpen (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 195
  GCC 9.1.0: 181
  GCC 10.0.0 znver2 (-march=znver2): 196
  GCC 10.0.0: 181
  SE as reported (only one value listed, not matched to a configuration): SE +/- 0.33, N = 3
GraphicsMagick 1.3.30, Operation: Enhanced (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 221
  GCC 9.1.0: 209
  GCC 10.0.0 znver2 (-march=znver2): 223
  GCC 10.0.0: 208
  SE as reported (only one value listed, not matched to a configuration): SE +/- 1.20, N = 3
GraphicsMagick 1.3.30, Operation: Resizing (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 280 (SE +/- 1.15, N = 3)
  GCC 9.1.0: 275 (SE +/- 2.19, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 286 (SE +/- 1.53, N = 3)
  GCC 10.0.0: 274 (SE +/- 2.65, N = 3)
GraphicsMagick 1.3.30, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 171
  GCC 9.1.0: 170
  GCC 10.0.0 znver2 (-march=znver2): 173
  GCC 10.0.0: 170
  SE as reported (only three values listed, not matched to configurations): SE +/- 0.58, N = 3; SE +/- 0.67, N = 3; SE +/- 0.33, N = 3
GraphicsMagick 1.3.30, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 293 (SE +/- 2.19, N = 3)
  GCC 9.1.0: 287 (SE +/- 2.60, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 302 (SE +/- 0.33, N = 3)
  GCC 10.0.0: 288 (SE +/- 2.19, N = 3)
  For each GraphicsMagick result above: 1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
Coremark This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 555154.60 (SE +/- 2761.64, N = 3)
  GCC 9.1.0: 567987.34 (SE +/- 1430.19, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 567096.65 (SE +/- 1036.74, N = 3)
  GCC 10.0.0: 568329.00 (SE +/- 1210.22, N = 3)
  1. (CC) gcc options: -O2 -O3 -lrt -lrt
Cpuminer-Opt 3.8.8.1, Algorithm: deep (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 10230.34
  GCC 9.1.0: 11190.00
  GCC 10.0.0 znver2 (-march=znver2): 11123.00
  GCC 10.0.0: 11137.00
  SE as reported (only three values listed, not matched to configurations): SE +/- 926.03, N = 12; SE +/- 8.82, N = 3; SE +/- 3.33, N = 3
Cpuminer-Opt 3.8.8.1, Algorithm: lbry (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 34420 (SE +/- 5.77, N = 3)
  GCC 9.1.0: 34583 (SE +/- 550.28, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 34630 (SE +/- 20.82, N = 3)
  GCC 10.0.0: 35288 (SE +/- 460.86, N = 5)
Cpuminer-Opt 3.8.8.1, Algorithm: skein (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 39797 (SE +/- 21.86, N = 3)
  GCC 9.1.0: 39397 (SE +/- 602.50, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 39843 (SE +/- 133.46, N = 3)
  GCC 10.0.0: 39720 (SE +/- 5.77, N = 3)
Cpuminer-Opt 3.8.8.1, Algorithm: myr-gr (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 14023 (SE +/- 26.03, N = 3)
  GCC 9.1.0: 14127 (SE +/- 49.78, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 14137 (SE +/- 6.67, N = 3)
  GCC 10.0.0: 14130 (SE +/- 40.00, N = 3)
Cpuminer-Opt 3.8.8.1, Algorithm: sha256t (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 87238 (SE +/- 1027.26, N = 6)
  GCC 9.1.0: 87951 (SE +/- 990.16, N = 7)
  GCC 10.0.0 znver2 (-march=znver2): 86440 (SE +/- 180.83, N = 3)
  GCC 10.0.0: 86417 (SE +/- 116.81, N = 3)
  For each Cpuminer-Opt result above: 1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 23832.61 (SE +/- 119.42, N = 3)
  GCC 9.1.0: 24227.25 (SE +/- 159.70, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 23993.04 (SE +/- 195.64, N = 3)
  GCC 10.0.0: 23885.44 (SE +/- 62.37, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops 2. OpenBLAS + Open MPI 2.1.1
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: XZ 0 - Process: Compression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 40
  GCC 9.1.0: 39
  GCC 10.0.0 znver2: 37
  GCC 10.0.0: 40
  SE as reported (only one value listed, not matched to a configuration): SE +/- 0.33, N = 3
lzbench 2017-08-08, Test: XZ 0 - Process: Decompression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 116
  GCC 9.1.0: 116
  GCC 10.0.0 znver2: 108
  GCC 10.0.0: 113
  SE as reported (only one value listed, not matched to a configuration): SE +/- 0.33, N = 3
lzbench 2017-08-08, Test: Zstd 1 - Process: Compression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 468
  GCC 9.1.0: 468
  GCC 10.0.0 znver2: 453
  GCC 10.0.0: 467
  SE as reported (only three values listed, not matched to configurations): SE +/- 4.91, N = 8; SE +/- 3.18, N = 3; SE +/- 0.33, N = 3
lzbench 2017-08-08, Test: Zstd 1 - Process: Decompression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 1268 (SE +/- 12.79, N = 8)
  GCC 9.1.0: 1269 (SE +/- 9.50, N = 3)
  GCC 10.0.0 znver2: 1250 (SE +/- 0.33, N = 3)
  GCC 10.0.0: 1287 (SE +/- 0.58, N = 3)
lzbench 2017-08-08, Test: Brotli 0 - Process: Compression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 515 (SE +/- 4.10, N = 3)
  GCC 9.1.0: 494 (SE +/- 0.67, N = 3)
  GCC 10.0.0 znver2: 505 (SE +/- 0.88, N = 3)
  GCC 10.0.0: 522 (SE +/- 0.88, N = 3)
lzbench 2017-08-08, Test: Libdeflate 1 - Process: Compression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 257
  GCC 9.1.0: 239
  GCC 10.0.0 znver2: 248
  GCC 10.0.0: 250
  SE as reported (only two values listed, not matched to configurations): SE +/- 1.86, N = 3; SE +/- 0.67, N = 3
lzbench 2017-08-08, Test: Libdeflate 1 - Process: Decompression (MB/s, More Is Better)
  GCC 9.1.0 znver2: 1183
  GCC 9.1.0: 1119
  GCC 10.0.0 znver2: 1159
  GCC 10.0.0: 1147
  SE as reported (only two values listed, not matched to configurations): SE +/- 10.00, N = 3; SE +/- 0.33, N = 3
  For each lzbench result above: 1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Sockperf This is a network socket API performance benchmark. Learn more via the OpenBenchmarking.org test page.
Sockperf 3.4, Test: Throughput (Messages Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 517095 (SE +/- 4175.03, N = 5)
  GCC 9.1.0: 514551 (SE +/- 4767.11, N = 5)
  GCC 10.0.0 znver2 (-march=znver2): 529657 (SE +/- 3715.76, N = 18)
  GCC 10.0.0: 514748 (SE +/- 5409.10, N = 5)
  1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6, Build: Stock - Size: 1D FFT Size 32 (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 11828 (SE +/- 15.90, N = 3)
  GCC 9.1.0: 12958 (SE +/- 1.76, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 14113 (SE +/- 5.51, N = 3)
  GCC 10.0.0: 12748 (SE +/- 110.06, N = 3)
FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 32 (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 14314
  GCC 9.1.0: 11909
  GCC 10.0.0 znver2 (-march=znver2): 14119
  GCC 10.0.0: 12902
  SE as reported (only three values listed, not matched to configurations): SE +/- 141.66, N = 3; SE +/- 155.95, N = 3; SE +/- 2.19, N = 3
FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 512 (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 10814.00 (SE +/- 148.34, N = 4)
  GCC 9.1.0: 9028.10 (SE +/- 30.08, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 10531.00 (SE +/- 10.67, N = 3)
  GCC 10.0.0: 9583.73 (SE +/- 19.17, N = 3)
FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 7920.17 (SE +/- 67.62, N = 3)
  GCC 9.1.0: 7063.03 (SE +/- 73.85, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 7823.27 (SE +/- 95.02, N = 3)
  GCC 10.0.0: 7071.30 (SE +/- 40.30, N = 3)
FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 32 (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 44951 (SE +/- 105.51, N = 3)
  GCC 9.1.0: 45253 (SE +/- 663.38, N = 4)
  GCC 10.0.0 znver2 (-march=znver2): 46305 (SE +/- 28.47, N = 3)
  GCC 10.0.0: 45361 (SE +/- 54.85, N = 3)
  For each FFTW result above: 1. (CC) gcc options: -pthread -O3 -lm
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 3686.60 (SE +/- 5.91, N = 3)
  GCC 9.1.0: 2768.16 (SE +/- 25.64, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 3553.67 (SE +/- 13.97, N = 3)
  GCC 10.0.0: 3127.49 (SE +/- 5.96, N = 3)
SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 800.23 (SE +/- 0.74, N = 3)
  GCC 9.1.0: 761.38 (SE +/- 7.16, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 759.97 (SE +/- 0.29, N = 3)
  GCC 10.0.0: 777.17 (SE +/- 0.24, N = 3)
SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 273.49 (SE +/- 0.21, N = 3)
  GCC 9.1.0: 295.18 (SE +/- 2.85, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 261.10 (SE +/- 0.24, N = 3)
  GCC 10.0.0: 301.17 (SE +/- 0.54, N = 3)
SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 3580.73 (SE +/- 13.73, N = 3)
  GCC 9.1.0: 3767.63 (SE +/- 37.65, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 3675.94 (SE +/- 58.43, N = 3)
  GCC 10.0.0: 3856.63 (SE +/- 15.78, N = 3)
SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 11370.27 (SE +/- 15.98, N = 3)
  GCC 9.1.0: 6891.37 (SE +/- 60.72, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 10777.88 (SE +/- 12.24, N = 3)
  GCC 10.0.0: 8526.66 (SE +/- 19.91, N = 3)
SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 2408.26 (SE +/- 0.53, N = 3)
  GCC 9.1.0: 2125.26 (SE +/- 20.16, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 2293.46 (SE +/- 0.89, N = 3)
  GCC 10.0.0: 2175.85 (SE +/- 0.16, N = 3)
  For each SciMark result above: 1. (CC) gcc options: -O3 -lm
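The SciMark numbers show some of the largest gaps between the tuned and untuned builds, so it can help to express them as percentages. The sketch below computes the change from the plain -O3 build to the -march=znver2 build for two of the SciMark tests above. The helper name `percent_change` is hypothetical, and the pairing of values with configurations assumes the results are listed in legend order.

```python
def percent_change(new, baseline):
    """Percentage change of `new` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# SciMark 2.0 Mflops from the results above, as
# (with -march=znver2, with -O3 only).
scimark = {
    "GCC 9.1.0 Composite": (3686.60, 2768.16),
    "GCC 10.0.0 Composite": (3553.67, 3127.49),
    "GCC 9.1.0 Dense LU": (11370.27, 6891.37),
    "GCC 10.0.0 Dense LU": (10777.88, 8526.66),
}

for name, (tuned, generic) in scimark.items():
    change = percent_change(tuned, generic)
    print(f"{name}: {change:+.1f}% with znver2 tuning")
```

For "Fewer Is Better" metrics such as the MKL-DNN timings, a negative percentage would indicate the improvement, so the sign needs to be read against the metric's direction.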
Himeno Benchmark The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 1378.46 (SE +/- 6.19, N = 3)
  GCC 9.1.0: 1322.90 (SE +/- 11.21, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 1385.88 (SE +/- 0.48, N = 3)
  GCC 10.0.0: 1385.23 (SE +/- 2.93, N = 3)
  1. (CC) gcc options: -O3 -mavx2
TSCP This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.
TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 1337188 (SE +/- 10688.23, N = 5)
  GCC 9.1.0: 1305781 (SE +/- 620.00, N = 5)
  GCC 10.0.0 znver2 (-march=znver2): 1408752 (SE +/- 6261.48, N = 5)
  GCC 10.0.0: 1366017 (SE +/- 676.60, N = 5)
  1. (CC) gcc options: -O3 -march=native
Stockfish This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.
Stockfish 9, Total Time (Nodes Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 39561655 (SE +/- 164232.11, N = 3)
  GCC 9.1.0: 39278964 (SE +/- 210046.69, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 39540328 (SE +/- 131167.27, N = 3)
  GCC 10.0.0: 39631993 (SE +/- 237875.03, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -O3 -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto
GROMACS This is a test of the GROMACS molecular dynamics package on the CPU with the water_GMX50 data. Learn more via the OpenBenchmarking.org test page.
GROMACS 2018.3, Water Benchmark (Ns Per Day, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 0.99 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 0.98 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 0.98 (SE +/- 0.00, N = 3)
  GCC 10.0.0: 0.97 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -march=core-avx2 -O3 -std=c++11 -funroll-all-loops -fopenmp -lrt -lpthread -lm
Memcached mcperf 1.5.10, Method: Set (Operations Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 59232.07 (SE +/- 3850.96, N = 15)
  GCC 9.1.0: 52914.10 (SE +/- 293.82, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 52910.87 (SE +/- 393.33, N = 3)
  GCC 10.0.0: 57193.25 (SE +/- 2058.38, N = 15)
  1. (CC) gcc options: -O3 -lm -rdynamic
Redis Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.
Redis 4.0.8, Test: GET (Requests Per Second, More Is Better)
  GCC 9.1.0 znver2: 3066070.28 (SE +/- 61029.58, N = 15)
  GCC 9.1.0: 3297713.33 (SE +/- 40781.06, N = 3)
  GCC 10.0.0 znver2: 3031706.22 (SE +/- 51486.64, N = 15)
  GCC 10.0.0: 3042507.47 (SE +/- 47460.73, N = 15)
Redis 4.0.8, Test: SET (Requests Per Second, More Is Better)
  GCC 9.1.0 znver2: 2169531.00 (SE +/- 19021.82, N = 3)
  GCC 9.1.0: 2122162.94 (SE +/- 30290.32, N = 4)
  GCC 10.0.0 znver2: 2084989.88 (SE +/- 14796.01, N = 3)
  GCC 10.0.0: 2051361.33 (SE +/- 28123.08, N = 3)
  For both Redis results above: 1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread
NGINX Benchmark This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.
NGINX Benchmark 1.9.9, Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 39602.49 (SE +/- 112.05, N = 3)
  GCC 9.1.0: 39734.85 (SE +/- 102.83, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 39346.91 (SE +/- 23.74, N = 3)
  GCC 10.0.0: 39525.70 (SE +/- 158.42, N = 3)
  1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
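Because the NGINX test profile issues a fixed 2,000,000 requests, each requests-per-second figure implies an approximate wall-clock duration for one run. The sketch below does that arithmetic for the NGINX results above. The helper name `elapsed_seconds` is hypothetical, and the calculation assumes the reported rate holds for the whole run and ignores start-up and teardown time.

```python
def elapsed_seconds(total_requests, requests_per_second):
    """Approximate run time implied by a sustained request rate."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return total_requests / requests_per_second


TOTAL_REQUESTS = 2_000_000  # from the NGINX test profile description

# Requests per second from the NGINX results above.
nginx_rps = {
    "GCC 9.1.0 znver2": 39602.49,
    "GCC 9.1.0": 39734.85,
    "GCC 10.0.0 znver2": 39346.91,
    "GCC 10.0.0": 39525.70,
}

for config, rps in nginx_rps.items():
    seconds = elapsed_seconds(TOTAL_REQUESTS, rps)
    print(f"{config}: about {seconds:.1f} s per run")
```

Seen this way, the four configurations differ by well under a second per run, which is consistent with the near-identical request rates.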
Apache Benchmark This is a test of ab, which is the Apache benchmark program. This test profile measures how many requests per second a given system can sustain when carrying out 1,000,000 requests with 100 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.
Apache Benchmark 2.4.29, Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 38022.79 (SE +/- 139.15, N = 3)
  GCC 9.1.0: 38392.29 (SE +/- 57.64, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 38009.25 (SE +/- 79.10, N = 3)
  GCC 10.0.0: 38490.98 (SE +/- 65.39, N = 3)
  1. (CC) gcc options: -shared -fPIC -pthread -O3
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.
OpenSSL 1.1.1, RSA 4096-bit Performance (Signs Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 3481.50 (SE +/- 1.42, N = 3)
  GCC 9.1.0: 3516.27 (SE +/- 7.07, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 3487.10 (SE +/- 0.70, N = 3)
  GCC 10.0.0: 3492.53 (SE +/- 1.89, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
PostgreSQL pgbench This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.
PostgreSQL pgbench 10.3, Scaling: Buffer Test - Test: Normal Load - Mode: Read Only (TPS, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 297539.89 (SE +/- 235.79, N = 3)
  GCC 9.1.0: 300353.09 (SE +/- 513.53, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 298969.75 (SE +/- 237.85, N = 3)
  GCC 10.0.0: 300244.81 (SE +/- 102.78, N = 3)
PostgreSQL pgbench 10.3, Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (TPS, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 29149.20 (SE +/- 40.41, N = 3)
  GCC 9.1.0: 29178.23 (SE +/- 55.36, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 29148.60 (SE +/- 124.84, N = 3)
  GCC 10.0.0: 29372.39 (SE +/- 31.16, N = 3)
  For both pgbench results above: 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Apache Siege This is a test of Apache web server performance, facilitated by the Siege web server benchmark program. Learn more via the OpenBenchmarking.org test page.
Apache Siege 2.4.29, Concurrent Users: 200 (Transactions Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 99824.49 (SE +/- 3575.15, N = 15)
  GCC 9.1.0: 60835.79 (SE +/- 798.56, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 83275.06 (SE +/- 1288.23, N = 15)
  GCC 10.0.0: 82293.14 (SE +/- 3302.37, N = 12)
Apache Siege 2.4.29, Concurrent Users: 250 (Transactions Per Second, More Is Better)
  GCC 9.1.0 znver2 (-march=znver2): 96842.13 (SE +/- 4063.46, N = 12)
  GCC 9.1.0: 98050.91 (SE +/- 3755.13, N = 15)
  GCC 10.0.0 znver2 (-march=znver2): 102423.07 (SE +/- 1636.75, N = 12)
  GCC 10.0.0: 62725.24 (SE +/- 122.71, N = 3)
  For both Apache Siege results above: 1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
MKL-DNN
This is a test of Intel MKL-DNN, the Intel Math Kernel Library for Deep Neural Networks, using its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.

MKL-DNN 2019-04-16 - Harness: IP Batch 1D - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 155.34 (SE +/- 2.27, N = 15; MIN: 127.99)
  GCC 9.1.0: 152.51 (SE +/- 3.35, N = 15; MIN: 111.42)
  GCC 10.0.0 znver2: 157.72 (SE +/- 3.21, N = 14; MIN: 127)
  GCC 10.0.0: 154.09 (SE +/- 3.18, N = 12; MIN: 129)

MKL-DNN 2019-04-16 - Harness: IP Batch All - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 1599.68 (SE +/- 17.21, N = 3; MIN: 1393.73)
  GCC 9.1.0: 1523.52 (SE +/- 7.48, N = 3; MIN: 1357.02)
  GCC 10.0.0 znver2: 1556.91 (SE +/- 25.50, N = 3; MIN: 1368.2)
  GCC 10.0.0: 1582.78 (SE +/- 5.99, N = 3; MIN: 1385.56)

MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_3d - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 118.47 (SE +/- 0.45, N = 3; MIN: 103.47)
  GCC 9.1.0: 117.60 (SE +/- 0.16, N = 3; MIN: 103.13)
  GCC 10.0.0 znver2: 118.02 (SE +/- 1.48, N = 4; MIN: 102.11)
  GCC 10.0.0: 116.62 (SE +/- 0.79, N = 3; MIN: 102.39)

MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_all - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 19696.57 (SE +/- 22.35, N = 3; MIN: 19033.5)
  GCC 9.1.0: 19613.70 (SE +/- 42.61, N = 3; MIN: 18961.5)
  GCC 10.0.0 znver2: 19694.33 (SE +/- 41.11, N = 3; MIN: 18995.6)
  GCC 10.0.0: 19803.57 (SE +/- 87.03, N = 3; MIN: 19014.9)

MKL-DNN 2019-04-16 - Harness: Deconvolution Batch deconv_1d - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 217.02 (SE +/- 1.79, N = 13; MIN: 203.65)
  GCC 9.1.0: 221.23 (SE +/- 2.00, N = 15; MIN: 202.07)
  GCC 10.0.0 znver2: 218.66 (SE +/- 1.85, N = 15; MIN: 203.42)
  GCC 10.0.0: 212.83 (SE +/- 0.29, N = 3; MIN: 201.7)

MKL-DNN 2019-04-16 - Harness: Deconvolution Batch deconv_3d - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 57.97 (SE +/- 0.49, N = 15; MIN: 51.57)
  GCC 9.1.0: 59.00 (SE +/- 0.66, N = 7; MIN: 50.8)
  GCC 10.0.0 znver2: 56.87 (SE +/- 0.58, N = 8; MIN: 50.96)
  GCC 10.0.0: 58.16 (SE +/- 0.69, N = 15; MIN: 50.91)

MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_alexnet - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 2520.01 (SE +/- 9.61, N = 3; MIN: 2467.07)
  GCC 9.1.0: 2527.50 (SE +/- 9.57, N = 3; MIN: 2462.11)
  GCC 10.0.0 znver2: 2543.93 (SE +/- 17.20, N = 3; MIN: 2467.76)
  GCC 10.0.0: 2507.16 (SE +/- 6.13, N = 3; MIN: 2461.57)

MKL-DNN 2019-04-16 - Harness: Deconvolution Batch deconv_all - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 52238.80 (SE +/- 668.40, N = 3; MIN: 49224.9)
  GCC 9.1.0: 51813.05 (SE +/- 589.20, N = 6; MIN: 48543.1)
  GCC 10.0.0 znver2: 50679.53 (SE +/- 390.75, N = 3; MIN: 48056.6)
  GCC 10.0.0: 50039.13 (SE +/- 224.88, N = 3; MIN: 46883.1)

MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32 (ms, Fewer Is Better)
  GCC 9.1.0 znver2: 1153.46 (SE +/- 6.23, N = 3; MIN: 1057.54)
  GCC 9.1.0: 1147.62 (SE +/- 6.51, N = 3; MIN: 1052.13)
  GCC 10.0.0 znver2: 1145.95 (SE +/- 5.61, N = 3; MIN: 1052.71)
  GCC 10.0.0: 1145.01 (SE +/- 6.39, N = 3; MIN: 1050.58)
  1. (CXX) g++ options, all MKL-DNN results: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs add -march=znver2)
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 39.42 (SE +/- 0.04, N = 3)
  GCC 9.1.0: 43.09 (SE +/- 0.02, N = 3)
  GCC 10.0.0 znver2: 39.36 (SE +/- 0.05, N = 3)
  GCC 10.0.0: 42.63 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3 (znver2 runs add -march=znver2)
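To put a pair of results like these in perspective, the relative change between two configurations can be computed directly from the reported means. The following is a minimal sketch using the C-Ray figures for GCC 9.1.0, assuming the values in the chart are listed in the same order as the configuration labels. The helper names `relative_change` and `gap_exceeds_noise` are hypothetical, and comparing the gap to the sum of the two standard errors is only a crude rule of thumb, not a formal significance test.

```python
def relative_change(baseline, candidate):
    """Percent change of candidate relative to baseline (negative = lower)."""
    return (candidate - baseline) / baseline * 100.0


def gap_exceeds_noise(a, se_a, b, se_b):
    """Crude check: is the gap between two means larger than their summed SEs?"""
    return abs(a - b) > (se_a + se_b)


# C-Ray 1.1, GCC 9.1.0 (seconds, fewer is better), as reported above.
plain, plain_se = 43.09, 0.02      # CFLAGS=-O3
znver2, znver2_se = 39.42, 0.04    # CFLAGS=-O3 -march=znver2

print(f"run time change with -march=znver2: {relative_change(plain, znver2):+.1f}%")
print("gap larger than run-to-run noise:",
      gap_exceeds_noise(plain, plain_se, znver2, znver2_se))
```

The same helpers applied to a result such as XZ Compression (25.25 vs. 25.23 seconds with SEs of 0.13 and 0.10) would report a gap that is within the noise.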
Smallpt
Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.

Smallpt 1.0 - Global Illumination Renderer; 128 Samples (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 7.67 (SE +/- 0.01, N = 3)
  GCC 9.1.0: 7.78 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2: 7.53 (SE +/- 0.01, N = 3)
  GCC 10.0.0: 7.84 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 (znver2 runs add -march=znver2)
AOBench
AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

AOBench - Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 33.20 (SE +/- 0.10, N = 3)
  GCC 9.1.0: 34.60 (SE +/- 0.05, N = 3)
  GCC 10.0.0 znver2: 33.05 (SE +/- 0.02, N = 3)
  GCC 10.0.0: 35.98 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -lm -O3 (znver2 runs add -march=znver2)
Bullet Physics Engine 2.81 - Test: 3000 Fall (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 3.22 (SE +/- 0.01, N = 3)
  GCC 9.1.0: 3.20 (SE +/- 0.02, N = 3)
  GCC 10.0.0 znver2: 3.27 (SE +/- 0.03, N = 3)
  GCC 10.0.0: 3.44 (SE +/- 0.00, N = 3)

Bullet Physics Engine 2.81 - Test: 1000 Stack (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 3.77 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 3.88 (SE +/- 0.02, N = 3)
  GCC 10.0.0 znver2: 3.85 (SE +/- 0.04, N = 3)
  GCC 10.0.0: 4.11 (SE +/- 0.00, N = 3)

Bullet Physics Engine 2.81 - Test: 1000 Convex (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 3.57 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 3.51 (SE +/- 0.02, N = 3)
  GCC 10.0.0 znver2: 3.60 (SE +/- 0.04, N = 3)
  GCC 10.0.0: 3.73 (SE +/- 0.00, N = 3)

Bullet Physics Engine 2.81 - Test: 136 Ragdolls (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 2.04 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 2.05 (SE +/- 0.01, N = 3)
  GCC 10.0.0 znver2: 2.05 (SE +/- 0.02, N = 3)
  GCC 10.0.0: 2.20 (SE +/- 0.00, N = 3)

Bullet Physics Engine 2.81 - Test: Prim Trimesh (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 0.77 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 0.75 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2: 0.77 (SE +/- 0.01, N = 3)
  GCC 10.0.0: 0.81 (SE +/- 0.00, N = 3)

Bullet Physics Engine 2.81 - Test: Convex Trimesh (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 0.90 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 0.89 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2: 0.91 (SE +/- 0.01, N = 3)
  GCC 10.0.0: 0.95 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options, all Bullet results: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs add -march=znver2)
XZ Compression
This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

XZ Compression 5.2.4 - Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 25.25 (SE +/- 0.13, N = 3)
  GCC 9.1.0: 25.23 (SE +/- 0.10, N = 3)
  GCC 10.0.0 znver2: 25.39 (SE +/- 0.11, N = 3)
  GCC 10.0.0: 25.26 (SE +/- 0.12, N = 3)
  1. (CC) gcc options: -pthread -fvisibility=hidden -O3 (znver2 runs add -march=znver2)
LAME MP3 Encoding
LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding 3.100 - WAV To MP3 (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 6.94 (SE +/- 0.01, N = 3)
  GCC 9.1.0: 7.25 (SE +/- 0.04, N = 3)
  GCC 10.0.0 znver2: 7.45 (SE +/- 0.01, N = 3)
  GCC 10.0.0: 7.28 (SE +/- 0.02, N = 3)
  1. (CC) gcc options: -O3 -lncurses -lm (znver2 runs add -march=znver2)
Ogg Encoding
This test times how long it takes to encode a sample WAV file to Ogg format using vorbis-tools, libvorbis, and libogg. Learn more via the OpenBenchmarking.org test page.

Ogg Encoding 1.3.3 - WAV To Ogg (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 5.05 (SE +/- 0.01, N = 3)
  GCC 9.1.0: 5.13 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2: 5.36 (SE +/- 0.00, N = 4)
  GCC 10.0.0: 5.05 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -O2 -ffast-math -fsigned-char -O3 -logg (znver2 runs add -march=znver2)
FFmpeg
This test uses FFmpeg for testing the system's audio/video encoding performance. Learn more via the OpenBenchmarking.org test page.

FFmpeg 4.0.2 - H.264 HD To NTSC DV (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 6.83 (SE +/- 0.03, N = 3)
  GCC 9.1.0: 6.86 (SE +/- 0.01, N = 3)
  GCC 10.0.0 znver2: 6.78 (SE +/- 0.02, N = 3)
  GCC 10.0.0: 6.88 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lXv -lX11 -lXext -lm -lxcb -lxcb-shape -lxcb-xfixes -lasound -lSDL2 -lsndio -pthread -lbz2 -llzma -O3 -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT (znver2 runs add -march=znver2)
m-queens
A solver for the N-queens problem with multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.

m-queens 1.2 - Time To Solve (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 47.27 (SE +/- 0.08, N = 3)
  GCC 9.1.0: 47.12 (SE +/- 0.12, N = 3)
  GCC 10.0.0 znver2: 47.21 (SE +/- 0.10, N = 3)
  GCC 10.0.0: 47.14 (SE +/- 0.11, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -O2 -march=native (znver2 runs add -march=znver2)
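For readers unfamiliar with the workload, the N-queens problem asks how many ways N queens can be placed on an N x N board so that no two attack each other. The sketch below counts solutions with a common bitmask backtracking approach. It is a single-threaded illustration only, not the m-queens implementation (which is C++ with OpenMP), and the function name `count_n_queens` is hypothetical.

```python
def count_n_queens(n):
    """Count N-queens solutions using bitmask backtracking, one row at a time."""
    full = (1 << n) - 1  # bitmask with one bit set per column

    def place(cols, diag_left, diag_right):
        # cols: columns already occupied; diag_*: squares in the current
        # row that are attacked along a diagonal by earlier queens.
        if cols == full:
            return 1  # every column holds a queen, so every row does too
        total = 0
        free = full & ~(cols | diag_left | diag_right)
        while free:
            bit = free & -free  # lowest free square in this row
            free ^= bit
            total += place(
                cols | bit,
                ((diag_left | bit) << 1) & full,  # diagonals shift per row
                (diag_right | bit) >> 1,
            )
        return total

    return place(0, 0, 0)


print(count_n_queens(8))
```

The search tree grows very quickly with N, which is why solvers of this kind split the first rows across threads and why the benchmark is sensitive to how well the compiler optimizes tight integer loops.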
CppPerformanceBenchmarks 9 - Test: Ctype (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 31.43 (SE +/- 0.14, N = 3)
  GCC 9.1.0: 31.52 (SE +/- 0.28, N = 3)
  GCC 10.0.0 znver2: 31.30 (SE +/- 0.03, N = 3)
  GCC 10.0.0: 31.51 (SE +/- 0.38, N = 5)

CppPerformanceBenchmarks 9 - Test: Math Library (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 302.82 (SE +/- 3.91, N = 3)
  GCC 9.1.0: 309.36 (SE +/- 0.26, N = 3)
  GCC 10.0.0 znver2: 306.02 (SE +/- 2.37, N = 3)
  GCC 10.0.0: 307.23 (SE +/- 4.29, N = 3)

CppPerformanceBenchmarks 9 - Test: Random Numbers (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 750.66 (SE +/- 0.27, N = 3)
  GCC 9.1.0: 751.15 (SE +/- 2.69, N = 3)
  GCC 10.0.0 znver2: 787.77 (SE +/- 10.35, N = 5)
  GCC 10.0.0: 799.88 (SE +/- 4.15, N = 3)

CppPerformanceBenchmarks 9 - Test: Stepanov Vector (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 74.08 (SE +/- 0.12, N = 3)
  GCC 9.1.0: 76.45 (SE +/- 0.35, N = 3)
  GCC 10.0.0 znver2: 77.22 (SE +/- 0.04, N = 3)
  GCC 10.0.0: 74.26 (SE +/- 0.88, N = 3)

CppPerformanceBenchmarks 9 - Test: Function Objects (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 14.15 (SE +/- 0.03, N = 3)
  GCC 9.1.0: 14.40 (SE +/- 0.08, N = 3)
  GCC 10.0.0 znver2: 14.90 (SE +/- 0.20, N = 4)
  GCC 10.0.0: 15.10 (SE +/- 0.17, N = 3)

CppPerformanceBenchmarks 9 - Test: Stepanov Abstraction (Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 28.93 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 27.60 (SE +/- 0.03, N = 3)
  GCC 10.0.0 znver2: 28.30 (SE +/- 0.45, N = 3)
  GCC 10.0.0: 28.19 (SE +/- 0.08, N = 3)
  1. (CXX) g++ options, all CppPerformanceBenchmarks results: -O3 -std=c++11 (znver2 runs add -march=znver2)
Sockperf
This is a network socket API performance benchmark. Learn more via the OpenBenchmarking.org test page.

Sockperf 3.4 - Test: Latency Ping Pong (usec, Fewer Is Better)
  GCC 9.1.0 znver2: 3.15 (SE +/- 0.04, N = 6)
  GCC 9.1.0: 3.12 (SE +/- 0.04, N = 5)
  GCC 10.0.0 znver2: 3.04 (SE +/- 0.02, N = 25)
  GCC 10.0.0: 3.03 (SE +/- 0.02, N = 25)
  1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread (znver2 runs add -march=znver2)
HPC Challenge
HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.

HPC Challenge 1.5.0 - Test / Class: Random Ring Latency (usecs, Fewer Is Better)
  GCC 9.1.0 znver2: 0.32698 (SE +/- 0.00042, N = 3)
  GCC 9.1.0: 0.32596 (SE +/- 0.00047, N = 3)
  GCC 10.0.0 znver2: 0.32521 (SE +/- 0.00071, N = 3)
  GCC 10.0.0: 0.33186 (SE +/- 0.00125, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs add -march=znver2)
  2. OpenBLAS + Open MPI 2.1.1
GCC 9.1.0
Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 27 July 2019 08:54 by user phoronix.
GCC 9.1.0 znver2
Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160
Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 27 July 2019 19:17 by user phoronix.
GCC 10.0.0
Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 28 July 2019 18:18 by user phoronix.
GCC 10.0.0 znver2
Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 10.0.0 20190727, File-System: ext4, Screen Resolution: 3840x2160
Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 28 July 2019 06:10 by user phoronix.