Ryzen 9 3900X Znver2 Compiler Tuning
AMD Ryzen 9 3900X 12-Core testing of GCC 9 and GCC 10 development with Znver2 tuning following recent cost table updates, etc. Benchmarks by Michael Larabel for a future article.
HTML result view exported from: https://openbenchmarking.org/result/1907290-HV-RYZEN939034&sro&grs .
Ryzen 9 3900X Znver2 Compiler Tuning: System Details
Tested configurations: GCC 9.1.0, GCC 9.1.0 znver2, GCC 10.0.0, GCC 10.0.0 znver2
Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS)
Chipset: AMD Device 1480
Memory: 16384MB
Disk: 2000GB Force MP600
Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz)
Audio: AMD Device aae0
Monitor: ASUS VP28U
Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04
Kernel: 5.3.0-999-generic (x86_64) 20190725
Desktop: GNOME Shell 3.28.4
Display Server: X Server 1.20.4
Display Driver: modesetting 1.20.4
OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0)
Compiler: GCC 9.1.0 (GCC 9.1.0 configurations); GCC 10.0.0 20190727 (GCC 10.0.0 configurations)
File-System: ext4
Screen Resolution: 3840x2160

Environment Details
- GCC 9.1.0: CXXFLAGS=-O3 CFLAGS=-O3
- GCC 9.1.0 znver2: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
- GCC 10.0.0: CXXFLAGS=-O3 CFLAGS=-O3
- GCC 10.0.0 znver2: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Details
- --disable-multilib --enable-checking=release
Processor Details
- Scaling Governor: acpi-cpufreq ondemand
Python Details
- Python 2.7.15+ + Python 3.6.8
Security Details
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Ryzen 9 3900X Znver2 Compiler Tuning: Result Overview
Test | GCC 9.1.0 | GCC 9.1.0 znver2 | GCC 10.0.0 | GCC 10.0.0 znver2
scimark2: Dense LU Matrix Factorization | 6891.37 | 11370.27 | 8526.66 | 10777.88
scimark2: Composite | 2768.16 | 3686.60 | 3127.49 | 3553.67
fftw: Stock - 2D FFT Size 32 | 11909 | 14314 | 12902 | 14119
fftw: Stock - 2D FFT Size 512 | 9028.10 | 10814 | 9583.73 | 10531
fftw: Stock - 1D FFT Size 32 | 12958 | 11828 | 12748 | 14113
aom-av1: AV1 Video Encoding | 0.27 | 0.31 | 0.27 | 0.32
scimark2: Fast Fourier Transform | 295.18 | 273.49 | 301.17 | 261.10
scimark2: Jacobi Successive Over-Relaxation | 2125.26 | 2408.26 | 2175.85 | 2293.46
fftw: Stock - 2D FFT Size 4096 | 7063.03 | 7920.17 | 7071.30 | 7823.27
c-ray: Total Time - 4K, 16 Rays Per Pixel | 43.09 | 39.42 | 42.63 | 39.36
bullet: 1000 Stack | 3.88 | 3.77 | 4.11 | 3.85
aobench: 2048 x 2048 - Total Time | 34.60 | 33.20 | 35.98 | 33.05
graphics-magick: Sharpen | 181 | 195 | 181 | 196
hpcc: G-Ptrans | 2.73255 | 2.95225 | 2.72974 | 2.94730
lzbench: XZ 0 - Compression | 39 | 40 | 40 | 37
bullet: Prim Trimesh | 0.75 | 0.77 | 0.81 | 0.77
tscp: AI Chess Performance | 1305781 | 1337188 | 1366017 | 1408752
bullet: 136 Ragdolls | 2.05 | 2.04 | 2.20 | 2.05
scimark2: Sparse Matrix Multiply | 3767.63 | 3580.73 | 3856.63 | 3675.94
lzbench: Libdeflate 1 - Compression | 239 | 257 | 250 | 248
bullet: 3000 Fall | 3.20 | 3.22 | 3.44 | 3.27
svt-vp9: 1080p 8-bit YUV To VP9 Video Encode | 89.99 | 96.54 | 89.84 | 92.35
lzbench: XZ 0 - Decompression | 116 | 116 | 113 | 108
encode-mp3: WAV To MP3 | 7.25 | 6.94 | 7.28 | 7.45
graphics-magick: Enhanced | 209 | 221 | 208 | 223
build-llvm: Time To Compile | 280.27 | 284.10 | 292.53 | 300.31
cpp-perf-bench: Atol | 59.31 | 60.38 | 59.97 | 63.34
bullet: Convex Trimesh | 0.89 | 0.90 | 0.95 | 0.91
cpp-perf-bench: Function Objects | 14.40 | 14.15 | 15.10 | 14.90
bullet: Raytests | 1.98 | 2.04 | 2.11 | 2.06
cpp-perf-bench: Rand Numbers | 751.15 | 750.66 | 799.88 | 787.77
bullet: 1000 Convex | 3.51 | 3.57 | 3.73 | 3.60
encode-ogg: WAV To Ogg | 5.13 | 5.05 | 5.05 | 5.36
redis: SET | 2122162.94 | 2169531.00 | 2051361.33 | 2084989.88
graphics-magick: Rotate | 262 | 263 | 262 | 277
lzbench: Libdeflate 1 - Decompression | 1119 | 1183 | 1147 | 1159
lzbench: Brotli 0 - Compression | 494 | 515 | 507 | 499
encode-flac: WAV To FLAC | 7.70 | 7.99 | 7.72 | 8.11
scimark2: Monte Carlo | 761.38 | 800.23 | 777.17 | 759.97
mcperf: Get | 92376.40 | 93850.59 | 95710.60 | 97228.27
graphics-magick: HWB Color Space | 287 | 293 | 288 | 302
graphics-magick: Swirl | 251 | 259 | 254 | 264
mkl-dnn: IP Batch All - f32 | 1523.52 | 1599.68 | 1582.78 | 1556.91
cpp-perf-bench: Stepanov Abstraction | 27.60 | 28.93 | 28.19 | 28.30
himeno: Poisson Pressure Solver | 1322.90 | 1378.46 | 1385.23 | 1385.88
mkl-dnn: Deconvolution Batch deconv_all - f32 | 51813.05 | 52238.80 | 50039.13 | 50679.53
graphics-magick: Resizing | 275 | 280 | 274 | 286
cpp-perf-bench: Stepanov Vector | 76.45 | 74.08 | 74.26 | 77.22
smallpt: Global Illumination Renderer; 128 Samples | 7.78 | 7.67 | 7.84 | 7.53
sockperf: Latency Ping Pong | 3.12 | 3.15 | 3.03 | 3.04
mkl-dnn: Deconvolution Batch deconv_1d - f32 | 221.23 | 217.02 | 212.83 | 218.66
mkl-dnn: Deconvolution Batch deconv_3d - f32 | 59.00 | 57.97 | 58.16 | 56.87
tjbench: Decompression Throughput | 218.09 | 225.64 | 220.33 | 225.44
lzbench: Zstd 1 - Compression | 468 | 468 | 467 | 453
build-php: Time To Compile | 52.71 | 53.91 | 54.43 | 53.76
hpcc: Rand Ring Bandwidth | 4.89161 | 4.98832 | 4.94947 | 5.04603
fftw: Float + SSE - 2D FFT Size 32 | 45253 | 44951 | 45361 | 46305
lzbench: Zstd 1 - Decompression | 1269 | 1268 | 1287 | 1250
sockperf: Throughput | 514551 | 517095 | 514748 | 529657
mpcbench: Multi-Precision Benchmark | 9597 | 9577 | 9580 | 9357
hpcc: G-Ffte | 8.59803 | 8.60514 | 8.81748 | 8.63794
hpcc: G-Ffte | 8.59803 | 8.60514 | 8.81748 | 8.63794
cpuminer-opt: lbry | 34583 | 34420 | 35288 | 34630
coremark: CoreMark Size 666 - Iterations Per Second | 567987.34 | 555154.60 | 568329.00 | 567096.65
cpp-perf-bench: Math Library | 309.36 | 302.82 | 307.23 | 306.02
gromacs: Water Benchmark | 0.98 | 0.99 | 0.97 | 0.98
hpcc: Rand Ring Latency | 0.32596 | 0.32698 | 0.33186 | 0.32521
cpuminer-opt: sha256t | 87951 | 87238 | 86417 | 86440
graphics-magick: Noise-Gaussian | 170 | 171 | 170 | 173
hpcc: Max Ping Pong Bandwidth | 24227.253 | 23832.608 | 23885.438 | 23993.043
mkl-dnn: Convolution Batch conv_3d - f32 | 117.60 | 118.47 | 116.62 | 118.02
ffmpeg: H.264 HD To NTSC DV | 6.86 | 6.83 | 6.88 | 6.78
mkl-dnn: Convolution Batch conv_alexnet - f32 | 2527.50 | 2520.01 | 2507.16 | 2543.93
hpcc: EP-STREAM Triad | 1.70820 | 1.71668 | 1.72205 | 1.73055
apache: Static Web Page Serving | 38392.29 | 38022.79 | 38490.98 | 38009.25
svt-hevc: 1080p 8-bit YUV To HEVC Video Encode | 246.01 | 247.33 | 248.85 | 247.99
x265: H.265 1080p Video Encoding | 52.94 | 52.53 | 53.00 | 52.40
hpcc: G-HPL | 70.96900 | 71.77663 | 71.07010 | 71.04930
cpuminer-opt: skein | 39397 | 39797 | 39720 | 39843
x264: H.264 Video Encoding | 139.59 | 138.41 | 138.74 | 139.82
openssl: RSA 4096-bit Performance | 3516.27 | 3481.50 | 3492.53 | 3487.10
nginx: Static Web Page Serving | 39734.85 | 39602.49 | 39525.70 | 39346.91
mkl-dnn: Convolution Batch conv_all - f32 | 19613.70 | 19696.57 | 19803.57 | 19694.33
pgbench: Buffer Test - Normal Load - Read Only | 300353.09 | 297539.89 | 300244.81 | 298969.75
hpcg: | 1.09 | 1.08 | 1.09 | 1.08
stockfish: Total Time | 39278964 | 39561655 | 39631993 | 39540328
john-the-ripper: Blowfish | 20335 | 20253 | 20426 | 20426
cpuminer-opt: myr-gr | 14127 | 14023 | 14130 | 14137
hpcc: EP-DGEMM | 32.83083 | 32.60263 | 32.86343 | 32.84363
pgbench: Buffer Test - Normal Load - Read Write | 29178.23 | 29149.20 | 29372.39 | 29148.60
mkl-dnn: Convolution Batch conv_googlenet_v3 - f32 | 1147.62 | 1153.46 | 1145.01 | 1145.95
cpp-perf-bench: Ctype | 31.52 | 31.43 | 31.51 | 31.30
compress-xz: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 | 25.23 | 25.25 | 25.26 | 25.39
svt-av1: 1080p 8-bit YUV To AV1 Video Encode | 46.45 | 46.39 | 46.22 | 46.49
cpuminer-opt: m7m | 593.89 | 590.66 | 591.32 | 590.80
hpcc: G-Rand Access | 0.09757 | 0.09798 | 0.09778 | 0.09771
m-queens: Time To Solve | 47.12 | 47.27 | 47.14 | 47.21
apache-siege: 250 | 98050.91 | 96842.13 | 62725.24 | 102423.07
apache-siege: 200 | 60835.79 | 99824.49 | 82293.14 | 83275.06
mcperf: Set | 52914.10 | 59232.07 | 57193.25 | 52910.87
redis: GET | 3297713.33 | 3066070.28 | 3042507.47 | 3031706.22
cpuminer-opt: deep | 11190 | 10230.34 | 11137 | 11123
mkl-dnn: IP Batch 1D - f32 | 152.51 | 155.34 | 154.09 | 157.72
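The overview figures mix "More Is Better" and "Fewer Is Better" metrics, so comparing a znver2 build with its plain -O3 counterpart needs the direction taken into account. The following is a minimal sketch of one way to do that; the helper name `relative_gain` and the `SAMPLES` list are illustrative assumptions and not part of the Phoronix Test Suite. The three sample rows are copied from the GCC 9.1.0 and GCC 9.1.0 znver2 results in this file.

```python
def relative_gain(baseline: float, tuned: float, more_is_better: bool) -> float:
    """Return the tuned build's gain over the baseline as a fraction.

    0.10 means the tuned build is 10% better, a negative value means it regressed.
    For "Fewer Is Better" metrics (for example seconds) the ratio is inverted.
    """
    if baseline <= 0 or tuned <= 0:
        raise ValueError("results must be positive")
    ratio = tuned / baseline if more_is_better else baseline / tuned
    return ratio - 1.0


# (test, more_is_better, GCC 9.1.0, GCC 9.1.0 znver2), values taken from this result file
SAMPLES = [
    ("scimark2: Dense LU Matrix Factorization", True, 6891.37, 11370.27),
    ("c-ray: Total Time - 4K, 16 Rays Per Pixel", False, 43.09, 39.42),
    ("scimark2: Fast Fourier Transform", True, 295.18, 273.49),
]

if __name__ == "__main__":
    for name, more_is_better, base, tuned in SAMPLES:
        gain = relative_gain(base, tuned, more_is_better)
        print(f"{name}: {gain:+.1%}")
```

Expressing every test as a signed fraction makes gains and regressions directly comparable across units such as Mflops and seconds.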
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
GCC 10.0.0: 8526.66 (SE +/- 19.91, N = 3)
GCC 10.0.0 znver2: 10777.88 (SE +/- 12.24, N = 3) [-march=znver2]
GCC 9.1.0: 6891.37 (SE +/- 60.72, N = 3)
GCC 9.1.0 znver2: 11370.27 (SE +/- 15.98, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lm

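Each result carries an "SE +/- x, N = n" annotation. Assuming SE denotes the standard error of the mean over N runs, computed from the sample standard deviation (an assumption, since this export does not define it), it could be reproduced roughly as sketched here. The function name `standard_error` and the run values are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation divided by sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate the standard error")
    return statistics.stdev(samples) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical per-run Mflops figures for a single configuration (N = 3).
    runs = [8500.0, 8520.0, 8560.0]
    mean = statistics.mean(runs)
    print(f"{mean:.2f} (SE +/- {standard_error(runs):.2f}, N = {len(runs)})")
```

A small SE relative to the mean indicates that the runs agreed closely; with N = 3 the estimate itself is coarse.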
SciMark 2.0 - Computational Test: Composite
Mflops, More Is Better
GCC 10.0.0: 3127.49 (SE +/- 5.96, N = 3)
GCC 10.0.0 znver2: 3553.67 (SE +/- 13.97, N = 3) [-march=znver2]
GCC 9.1.0: 2768.16 (SE +/- 25.64, N = 3)
GCC 9.1.0 znver2: 3686.60 (SE +/- 5.91, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lm

FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 32
Mflops, More Is Better
GCC 10.0.0: 12902
GCC 10.0.0 znver2: 14119 [-march=znver2]
GCC 9.1.0: 11909
GCC 9.1.0 znver2: 14314 [-march=znver2]
SE values reported (run not identified in this export): SE +/- 2.19, N = 3; SE +/- 155.95, N = 3; SE +/- 141.66, N = 3
1. (CC) gcc options: -pthread -O3 -lm

FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 512
Mflops, More Is Better
GCC 10.0.0: 9583.73 (SE +/- 19.17, N = 3)
GCC 10.0.0 znver2: 10531.00 (SE +/- 10.67, N = 3) [-march=znver2]
GCC 9.1.0: 9028.10 (SE +/- 30.08, N = 3)
GCC 9.1.0 znver2: 10814.00 (SE +/- 148.34, N = 4) [-march=znver2]
1. (CC) gcc options: -pthread -O3 -lm

FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 32
Mflops, More Is Better
GCC 10.0.0: 12748 (SE +/- 110.06, N = 3)
GCC 10.0.0 znver2: 14113 (SE +/- 5.51, N = 3) [-march=znver2]
GCC 9.1.0: 12958 (SE +/- 1.76, N = 3)
GCC 9.1.0 znver2: 11828 (SE +/- 15.90, N = 3) [-march=znver2]
1. (CC) gcc options: -pthread -O3 -lm

AOM AV1 2019-02-11 - AV1 Video Encoding
Frames Per Second, More Is Better
GCC 10.0.0: 0.27 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 0.32 (SE +/- 0.00, N = 3) [-march=znver2]
GCC 9.1.0: 0.27 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 0.31 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

SciMark 2.0 - Computational Test: Fast Fourier Transform
Mflops, More Is Better
GCC 10.0.0: 301.17 (SE +/- 0.54, N = 3)
GCC 10.0.0 znver2: 261.10 (SE +/- 0.24, N = 3) [-march=znver2]
GCC 9.1.0: 295.18 (SE +/- 2.85, N = 3)
GCC 9.1.0 znver2: 273.49 (SE +/- 0.21, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lm

SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
GCC 10.0.0: 2175.85 (SE +/- 0.16, N = 3)
GCC 10.0.0 znver2: 2293.46 (SE +/- 0.89, N = 3) [-march=znver2]
GCC 9.1.0: 2125.26 (SE +/- 20.16, N = 3)
GCC 9.1.0 znver2: 2408.26 (SE +/- 0.53, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lm

FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 4096
Mflops, More Is Better
GCC 10.0.0: 7071.30 (SE +/- 40.30, N = 3)
GCC 10.0.0 znver2: 7823.27 (SE +/- 95.02, N = 3) [-march=znver2]
GCC 9.1.0: 7063.03 (SE +/- 73.85, N = 3)
GCC 9.1.0 znver2: 7920.17 (SE +/- 67.62, N = 3) [-march=znver2]
1. (CC) gcc options: -pthread -O3 -lm

C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
GCC 10.0.0: 42.63 (SE +/- 0.06, N = 3)
GCC 10.0.0 znver2: 39.36 (SE +/- 0.05, N = 3) [-march=znver2]
GCC 9.1.0: 43.09 (SE +/- 0.02, N = 3)
GCC 9.1.0 znver2: 39.42 (SE +/- 0.04, N = 3) [-march=znver2]
1. (CC) gcc options: -lm -lpthread -O3

Bullet Physics Engine 2.81 - Test: 1000 Stack
Seconds, Fewer Is Better
GCC 10.0.0: 4.11 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 3.85 (SE +/- 0.04, N = 3) [-march=znver2]
GCC 9.1.0: 3.88 (SE +/- 0.02, N = 3)
GCC 9.1.0 znver2: 3.77 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

AOBench - Size: 2048 x 2048 - Total Time
Seconds, Fewer Is Better
GCC 10.0.0: 35.98 (SE +/- 0.06, N = 3)
GCC 10.0.0 znver2: 33.05 (SE +/- 0.02, N = 3) [-march=znver2]
GCC 9.1.0: 34.60 (SE +/- 0.05, N = 3)
GCC 9.1.0 znver2: 33.20 (SE +/- 0.10, N = 3) [-march=znver2]
1. (CC) gcc options: -lm -O3

GraphicsMagick 1.3.30 - Operation: Sharpen
Iterations Per Minute, More Is Better
GCC 10.0.0: 181
GCC 10.0.0 znver2: 196 [-march=znver2]
GCC 9.1.0: 181
GCC 9.1.0 znver2: 195 [-march=znver2]
SE values reported (run not identified in this export): SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

HPC Challenge 1.5.0 - Test / Class: G-Ptrans
GB/s, More Is Better
GCC 10.0.0: 2.72974 (SE +/- 0.00151, N = 3)
GCC 10.0.0 znver2: 2.94730 (SE +/- 0.00082, N = 3) [-march=znver2]
GCC 9.1.0: 2.73255 (SE +/- 0.00047, N = 3)
GCC 9.1.0 znver2: 2.95225 (SE +/- 0.00095, N = 3) [-march=znver2]
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1

lzbench 2017-08-08 - Test: XZ 0 - Process: Compression
MB/s, More Is Better
GCC 10.0.0: 40
GCC 10.0.0 znver2: 37
GCC 9.1.0: 39
GCC 9.1.0 znver2: 40
SE values reported (run not identified in this export): SE +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

Bullet Physics Engine 2.81 - Test: Prim Trimesh
Seconds, Fewer Is Better
GCC 10.0.0: 0.81 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 0.77 (SE +/- 0.01, N = 3) [-march=znver2]
GCC 9.1.0: 0.75 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 0.77 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

TSCP 1.81 - AI Chess Performance
Nodes Per Second, More Is Better
GCC 10.0.0: 1366017 (SE +/- 676.60, N = 5)
GCC 10.0.0 znver2: 1408752 (SE +/- 6261.48, N = 5) [-march=znver2]
GCC 9.1.0: 1305781 (SE +/- 620.00, N = 5)
GCC 9.1.0 znver2: 1337188 (SE +/- 10688.23, N = 5) [-march=znver2]
1. (CC) gcc options: -O3 -march=native

Bullet Physics Engine 2.81 - Test: 136 Ragdolls
Seconds, Fewer Is Better
GCC 10.0.0: 2.20 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 2.05 (SE +/- 0.02, N = 3) [-march=znver2]
GCC 9.1.0: 2.05 (SE +/- 0.01, N = 3)
GCC 9.1.0 znver2: 2.04 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

SciMark 2.0 - Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
GCC 10.0.0: 3856.63 (SE +/- 15.78, N = 3)
GCC 10.0.0 znver2: 3675.94 (SE +/- 58.43, N = 3) [-march=znver2]
GCC 9.1.0: 3767.63 (SE +/- 37.65, N = 3)
GCC 9.1.0 znver2: 3580.73 (SE +/- 13.73, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lm

lzbench 2017-08-08 - Test: Libdeflate 1 - Process: Compression
MB/s, More Is Better
GCC 10.0.0: 250
GCC 10.0.0 znver2: 248
GCC 9.1.0: 239
GCC 9.1.0 znver2: 257
SE values reported (run not identified in this export): SE +/- 0.67, N = 3; SE +/- 1.86, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

Bullet Physics Engine 2.81 - Test: 3000 Fall
Seconds, Fewer Is Better
GCC 10.0.0: 3.44 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 3.27 (SE +/- 0.03, N = 3) [-march=znver2]
GCC 9.1.0: 3.20 (SE +/- 0.02, N = 3)
GCC 9.1.0 znver2: 3.22 (SE +/- 0.01, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

SVT-VP9 2019-02-17 - 1080p 8-bit YUV To VP9 Video Encode
Frames Per Second, More Is Better
GCC 10.0.0: 89.84 (SE +/- 0.15, N = 3)
GCC 10.0.0 znver2: 92.35 (SE +/- 0.19, N = 3) [-march=znver2]
GCC 9.1.0: 89.99 (SE +/- 0.28, N = 3)
GCC 9.1.0 znver2: 96.54 (SE +/- 0.08, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -mavx -pie -rdynamic -lpthread -lrt -lm

lzbench 2017-08-08 - Test: XZ 0 - Process: Decompression
MB/s, More Is Better
GCC 10.0.0: 113
GCC 10.0.0 znver2: 108
GCC 9.1.0: 116
GCC 9.1.0 znver2: 116
SE values reported (run not identified in this export): SE +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

LAME MP3 Encoding 3.100 - WAV To MP3
Seconds, Fewer Is Better
GCC 10.0.0: 7.28 (SE +/- 0.02, N = 3)
GCC 10.0.0 znver2: 7.45 (SE +/- 0.01, N = 3) [-march=znver2]
GCC 9.1.0: 7.25 (SE +/- 0.04, N = 3)
GCC 9.1.0 znver2: 6.94 (SE +/- 0.01, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lncurses -lm

GraphicsMagick 1.3.30 - Operation: Enhanced
Iterations Per Minute, More Is Better
GCC 10.0.0: 208
GCC 10.0.0 znver2: 223 [-march=znver2]
GCC 9.1.0: 209
GCC 9.1.0 znver2: 221 [-march=znver2]
SE values reported (run not identified in this export): SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

Timed LLVM Compilation 6.0.1 - Time To Compile
Seconds, Fewer Is Better
GCC 10.0.0: 292.53
GCC 10.0.0 znver2: 300.31
GCC 9.1.0: 280.27
GCC 9.1.0 znver2: 284.10

CppPerformanceBenchmarks 9 - Test: Atol
Seconds, Fewer Is Better
GCC 10.0.0: 59.97 (SE +/- 0.17, N = 3)
GCC 10.0.0 znver2: 63.34 (SE +/- 0.06, N = 3) [-march=znver2]
GCC 9.1.0: 59.31 (SE +/- 0.30, N = 3)
GCC 9.1.0 znver2: 60.38 (SE +/- 0.53, N = 11) [-march=znver2]
1. (CXX) g++ options: -O3 -std=c++11

Bullet Physics Engine 2.81 - Test: Convex Trimesh
Seconds, Fewer Is Better
GCC 10.0.0: 0.95 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 0.91 (SE +/- 0.01, N = 3) [-march=znver2]
GCC 9.1.0: 0.89 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 0.90 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

CppPerformanceBenchmarks 9 - Test: Function Objects
Seconds, Fewer Is Better
GCC 10.0.0: 15.10 (SE +/- 0.17, N = 3)
GCC 10.0.0 znver2: 14.90 (SE +/- 0.20, N = 4) [-march=znver2]
GCC 9.1.0: 14.40 (SE +/- 0.08, N = 3)
GCC 9.1.0 znver2: 14.15 (SE +/- 0.03, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -std=c++11

Bullet Physics Engine 2.81 - Test: Raytests
Seconds, Fewer Is Better
GCC 10.0.0: 2.11 (SE +/- 0.00, N = 6)
GCC 10.0.0 znver2: 2.06 (SE +/- 0.02, N = 3) [-march=znver2]
GCC 9.1.0: 1.98 (SE +/- 0.01, N = 3)
GCC 9.1.0 znver2: 2.04 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

CppPerformanceBenchmarks 9 - Test: Random Numbers
Seconds, Fewer Is Better
GCC 10.0.0: 799.88 (SE +/- 4.15, N = 3)
GCC 10.0.0 znver2: 787.77 (SE +/- 10.35, N = 5) [-march=znver2]
GCC 9.1.0: 751.15 (SE +/- 2.69, N = 3)
GCC 9.1.0 znver2: 750.66 (SE +/- 0.27, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -std=c++11

Bullet Physics Engine 2.81 - Test: 1000 Convex
Seconds, Fewer Is Better
GCC 10.0.0: 3.73 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 3.60 (SE +/- 0.04, N = 3) [-march=znver2]
GCC 9.1.0: 3.51 (SE +/- 0.02, N = 3)
GCC 9.1.0 znver2: 3.57 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Ogg Encoding 1.3.3 - WAV To Ogg
Seconds, Fewer Is Better
GCC 10.0.0: 5.05 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 5.36 (SE +/- 0.00, N = 4) [-march=znver2]
GCC 9.1.0: 5.13 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 5.05 (SE +/- 0.01, N = 3) [-march=znver2]
1. (CC) gcc options: -O2 -ffast-math -fsigned-char -O3 -logg

Redis 4.0.8 - Test: SET
Requests Per Second, More Is Better
GCC 10.0.0: 2051361.33 (SE +/- 28123.08, N = 3)
GCC 10.0.0 znver2: 2084989.88 (SE +/- 14796.01, N = 3)
GCC 9.1.0: 2122162.94 (SE +/- 30290.32, N = 4)
GCC 9.1.0 znver2: 2169531.00 (SE +/- 19021.82, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread

GraphicsMagick 1.3.30 - Operation: Rotate
Iterations Per Minute, More Is Better
GCC 10.0.0: 262
GCC 10.0.0 znver2: 277 [-march=znver2]
GCC 9.1.0: 262
GCC 9.1.0 znver2: 263 [-march=znver2]
SE values reported (run not identified in this export): SE +/- 4.33, N = 3; SE +/- 1.86, N = 3; SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

lzbench 2017-08-08 - Test: Libdeflate 1 - Process: Decompression
MB/s, More Is Better
GCC 10.0.0: 1147
GCC 10.0.0 znver2: 1159
GCC 9.1.0: 1119
GCC 9.1.0 znver2: 1183
SE values reported (run not identified in this export): SE +/- 0.33, N = 3; SE +/- 10.00, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

lzbench 2017-08-08 - Test: Brotli 0 - Process: Compression
MB/s, More Is Better
GCC 10.0.0: 507 (SE +/- 0.88, N = 3)
GCC 10.0.0 znver2: 499 (SE +/- 4.47, N = 11)
GCC 9.1.0: 494 (SE +/- 0.67, N = 3)
GCC 9.1.0 znver2: 515 (SE +/- 4.10, N = 3)
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

FLAC Audio Encoding 1.3.2 - WAV To FLAC
Seconds, Fewer Is Better
GCC 10.0.0: 7.72 (SE +/- 0.03, N = 5)
GCC 10.0.0 znver2: 8.11 (SE +/- 0.04, N = 5) [-march=znver2]
GCC 9.1.0: 7.70 (SE +/- 0.02, N = 5)
GCC 9.1.0 znver2: 7.99 (SE +/- 0.01, N = 5) [-march=znver2]
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

SciMark 2.0 - Computational Test: Monte Carlo
Mflops, More Is Better
GCC 10.0.0: 777.17 (SE +/- 0.24, N = 3)
GCC 10.0.0 znver2: 759.97 (SE +/- 0.29, N = 3) [-march=znver2]
GCC 9.1.0: 761.38 (SE +/- 7.16, N = 3)
GCC 9.1.0 znver2: 800.23 (SE +/- 0.74, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -lm

Memcached mcperf 1.5.10 - Method: Get
Operations Per Second, More Is Better
GCC 10.0.0: 95710.60 (SE +/- 1025.20, N = 15)
GCC 10.0.0 znver2: 97228.27 (SE +/- 1267.59, N = 3) [-march=znver2]
GCC 9.1.0: 92376.40 (SE +/- 1551.16, N = 3)
GCC 9.1.0 znver2: 93850.59 (SE +/- 937.65, N = 15) [-march=znver2]
1. (CC) gcc options: -O3 -lm -rdynamic

GraphicsMagick 1.3.30 - Operation: HWB Color Space
Iterations Per Minute, More Is Better
GCC 10.0.0: 288 (SE +/- 2.19, N = 3)
GCC 10.0.0 znver2: 302 (SE +/- 0.33, N = 3) [-march=znver2]
GCC 9.1.0: 287 (SE +/- 2.60, N = 3)
GCC 9.1.0 znver2: 293 (SE +/- 2.19, N = 3) [-march=znver2]
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

GraphicsMagick 1.3.30 - Operation: Swirl
Iterations Per Minute, More Is Better
GCC 10.0.0: 254
GCC 10.0.0 znver2: 264 [-march=znver2]
GCC 9.1.0: 251
GCC 9.1.0 znver2: 259 [-march=znver2]
SE values reported (run not identified in this export): SE +/- 0.88, N = 3; SE +/- 1.86, N = 3; SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

MKL-DNN 2019-04-16 - Harness: IP Batch All - Data Type: f32
ms, Fewer Is Better
GCC 10.0.0: 1582.78 (SE +/- 5.99, N = 3) [MIN: 1385.56]
GCC 10.0.0 znver2: 1556.91 (SE +/- 25.50, N = 3) [-march=znver2 - MIN: 1368.2]
GCC 9.1.0: 1523.52 (SE +/- 7.48, N = 3) [MIN: 1357.02]
GCC 9.1.0 znver2: 1599.68 (SE +/- 17.21, N = 3) [-march=znver2 - MIN: 1393.73]
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl

CppPerformanceBenchmarks 9 - Test: Stepanov Abstraction
Seconds, Fewer Is Better
GCC 10.0.0: 28.19 (SE +/- 0.08, N = 3)
GCC 10.0.0 znver2: 28.30 (SE +/- 0.45, N = 3) [-march=znver2]
GCC 9.1.0: 27.60 (SE +/- 0.03, N = 3)
GCC 9.1.0 znver2: 28.93 (SE +/- 0.00, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -std=c++11

Himeno Benchmark 3.0 - Poisson Pressure Solver
MFLOPS, More Is Better
GCC 10.0.0: 1385.23 (SE +/- 2.93, N = 3)
GCC 10.0.0 znver2: 1385.88 (SE +/- 0.48, N = 3) [-march=znver2]
GCC 9.1.0: 1322.90 (SE +/- 11.21, N = 3)
GCC 9.1.0 znver2: 1378.46 (SE +/- 6.19, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -mavx2

MKL-DNN 2019-04-16 - Harness: Deconvolution Batch deconv_all - Data Type: f32
ms, Fewer Is Better
GCC 10.0.0: 50039.13 (SE +/- 224.88, N = 3) [MIN: 46883.1]
GCC 10.0.0 znver2: 50679.53 (SE +/- 390.75, N = 3) [-march=znver2 - MIN: 48056.6]
GCC 9.1.0: 51813.05 (SE +/- 589.20, N = 6) [MIN: 48543.1]
GCC 9.1.0 znver2: 52238.80 (SE +/- 668.40, N = 3) [-march=znver2 - MIN: 49224.9]
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl

GraphicsMagick 1.3.30 - Operation: Resizing
Iterations Per Minute, More Is Better
GCC 10.0.0: 274 (SE +/- 2.65, N = 3)
GCC 10.0.0 znver2: 286 (SE +/- 1.53, N = 3) [-march=znver2]
GCC 9.1.0: 275 (SE +/- 2.19, N = 3)
GCC 9.1.0 znver2: 280 (SE +/- 1.15, N = 3) [-march=znver2]
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

CppPerformanceBenchmarks 9 - Test: Stepanov Vector
Seconds, Fewer Is Better
GCC 10.0.0: 74.26 (SE +/- 0.88, N = 3)
GCC 10.0.0 znver2: 77.22 (SE +/- 0.04, N = 3) [-march=znver2]
GCC 9.1.0: 76.45 (SE +/- 0.35, N = 3)
GCC 9.1.0 znver2: 74.08 (SE +/- 0.12, N = 3) [-march=znver2]
1. (CXX) g++ options: -O3 -std=c++11

Smallpt 1.0 - Global Illumination Renderer; 128 Samples
Seconds, Fewer Is Better
GCC 10.0.0: 7.84 (SE +/- 0.01, N = 3)
GCC 10.0.0 znver2: 7.53 (SE +/- 0.01, N = 3) [-march=znver2]
GCC 9.1.0: 7.78 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 7.67 (SE +/- 0.01, N = 3) [-march=znver2]
1. (CXX) g++ options: -fopenmp -O3

Sockperf 3.4 - Test: Latency Ping Pong
usec, Fewer Is Better
GCC 10.0.0: 3.03 (SE +/- 0.02, N = 25)
GCC 10.0.0 znver2: 3.04 (SE +/- 0.02, N = 25) [-march=znver2]
GCC 9.1.0: 3.12 (SE +/- 0.04, N = 5)
GCC 9.1.0 znver2: 3.15 (SE +/- 0.04, N = 6) [-march=znver2]
1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread

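Several results, such as the Sockperf latency figures, differ by less than the reported standard errors. A rough way to gauge whether a gap between two configurations exceeds run-to-run noise is to express the difference in units of the combined standard error. This is only a heuristic sketch: it assumes the two sets of runs are independent and that the SE values in this export are listed in the same order as the results. The function name `difference_in_se_units` is hypothetical, and with N as small as 3 this is not a formal significance test.

```python
import math


def difference_in_se_units(a: float, se_a: float, b: float, se_b: float) -> float:
    """Absolute difference of two means divided by their combined standard error."""
    combined = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    if combined == 0:
        # Both SEs were reported as zero: any difference is beyond measured noise.
        return math.inf if a != b else 0.0
    return abs(a - b) / combined


if __name__ == "__main__":
    # Sockperf Latency Ping Pong, GCC 10.0.0 vs GCC 10.0.0 znver2 (usec)
    print(difference_in_se_units(3.03, 0.02, 3.04, 0.02))
    # SciMark Dense LU, GCC 10.0.0 vs GCC 10.0.0 znver2 (Mflops)
    print(difference_in_se_units(8526.66, 19.91, 10777.88, 12.24))
```

Values well below about 2 suggest the gap may be noise, while values far above it point to a real difference between the builds.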
MKL-DNN 2019-04-16 - Harness: Deconvolution Batch deconv_1d - Data Type: f32
ms, Fewer Is Better
GCC 10.0.0: 212.83 (SE +/- 0.29, N = 3) [MIN: 201.7]
GCC 10.0.0 znver2: 218.66 (SE +/- 1.85, N = 15) [-march=znver2 - MIN: 203.42]
GCC 9.1.0: 221.23 (SE +/- 2.00, N = 15) [MIN: 202.07]
GCC 9.1.0 znver2: 217.02 (SE +/- 1.79, N = 13) [-march=znver2 - MIN: 203.65]
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl

MKL-DNN 2019-04-16 - Harness: Deconvolution Batch deconv_3d - Data Type: f32
ms, Fewer Is Better
GCC 10.0.0: 58.16 (SE +/- 0.69, N = 15) [MIN: 50.91]
GCC 10.0.0 znver2: 56.87 (SE +/- 0.58, N = 8) [-march=znver2 - MIN: 50.96]
GCC 9.1.0: 59.00 (SE +/- 0.66, N = 7) [MIN: 50.8]
GCC 9.1.0 znver2: 57.97 (SE +/- 0.49, N = 15) [-march=znver2 - MIN: 51.57]
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl

libjpeg-turbo tjbench 2.0.2 - Test: Decompression Throughput
Megapixels/sec, More Is Better
GCC 10.0.0: 220.33 (SE +/- 2.32, N = 3)
GCC 10.0.0 znver2: 225.44 (SE +/- 0.30, N = 3) [-march=znver2]
GCC 9.1.0: 218.09 (SE +/- 0.44, N = 3)
GCC 9.1.0 znver2: 225.64 (SE +/- 0.31, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -rdynamic

lzbench 2017-08-08 - Test: Zstd 1 - Process: Compression
MB/s, More Is Better
GCC 10.0.0: 467
GCC 10.0.0 znver2: 453
GCC 9.1.0: 468
GCC 9.1.0 znver2: 468
SE values reported (run not identified in this export): SE +/- 0.33, N = 3; SE +/- 3.18, N = 3; SE +/- 4.91, N = 8
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

Timed PHP Compilation 7.1.9 - Time To Compile
Seconds, Fewer Is Better
GCC 10.0.0: 54.43 (SE +/- 0.09, N = 3)
GCC 10.0.0 znver2: 53.76 (SE +/- 0.51, N = 3) [-march=znver2]
GCC 9.1.0: 52.71 (SE +/- 0.15, N = 3)
GCC 9.1.0 znver2: 53.91 (SE +/- 0.36, N = 3) [-march=znver2]
1. (CC) gcc options: -O3 -pedantic -ldl -lz -lm

HPC Challenge 1.5.0 - Test / Class: Random Ring Bandwidth
GB/s, More Is Better
GCC 10.0.0: 4.94947 (SE +/- 0.05697, N = 3)
GCC 10.0.0 znver2: 5.04603 (SE +/- 0.04322, N = 3) [-march=znver2]
GCC 9.1.0: 4.89161 (SE +/- 0.02698, N = 3)
GCC 9.1.0 znver2: 4.98832 (SE +/- 0.07571, N = 3) [-march=znver2]
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1

FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 32
Mflops, More Is Better
GCC 10.0.0: 45361 (SE +/- 54.85, N = 3)
GCC 10.0.0 znver2: 46305 (SE +/- 28.47, N = 3) [-march=znver2]
GCC 9.1.0: 45253 (SE +/- 663.38, N = 4)
GCC 9.1.0 znver2: 44951 (SE +/- 105.51, N = 3) [-march=znver2]
1. (CC) gcc options: -pthread -O3 -lm

lzbench 2017-08-08 - Test: Zstd 1 - Process: Decompression
MB/s, More Is Better
GCC 10.0.0: 1287 (SE +/- 0.58, N = 3)
GCC 10.0.0 znver2: 1250 (SE +/- 0.33, N = 3)
GCC 9.1.0: 1269 (SE +/- 9.50, N = 3)
GCC 9.1.0 znver2: 1268 (SE +/- 12.79, N = 8)
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

Sockperf 3.4 - Test: Throughput
Messages Per Second, More Is Better
GCC 10.0.0: 514748 (SE +/- 5409.10, N = 5)
GCC 10.0.0 znver2: 529657 (SE +/- 3715.76, N = 18) [-march=znver2]
GCC 9.1.0: 514551 (SE +/- 4767.11, N = 5)
GCC 9.1.0 znver2: 517095 (SE +/- 4175.03, N = 5) [-march=znver2]
1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread

GNU MPC 1.1.0 - Multi-Precision Benchmark
Global Score, More Is Better
GCC 10.0.0: 9580 (SE +/- 26.46, N = 3)
GCC 10.0.0 znver2: 9357 (SE +/- 102.03, N = 3) [-march=znver2]
GCC 9.1.0: 9597 (SE +/- 31.80, N = 3)
GCC 9.1.0 znver2: 9577 (SE +/- 50.44, N = 3) [-march=znver2]
1. (CC) gcc options: -lm -O3 -MT -MD -MP -MF

HPC Challenge 1.5.0 - Test / Class: G-Ffte
GFLOP/s, More Is Better
GCC 10.0.0: 8.81748 (SE +/- 0.18198, N = 3)
GCC 10.0.0 znver2: 8.63794 (SE +/- 0.02559, N = 3) [-march=znver2]
GCC 9.1.0: 8.59803 (SE +/- 0.02013, N = 3)
GCC 9.1.0 znver2: 8.60514 (SE +/- 0.06300, N = 3) [-march=znver2]
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1

Cpuminer-Opt 3.8.8.1 - Algorithm: lbry (kH/s - Hash Speed, More Is Better)
  GCC 10.0.0: 35288 (SE +/- 460.86, N = 5)
  GCC 10.0.0 znver2 (-march=znver2): 34630 (SE +/- 20.82, N = 3)
  GCC 9.1.0: 34583 (SE +/- 550.28, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 34420 (SE +/- 5.77, N = 3)
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  GCC 10.0.0: 568329.00 (SE +/- 1210.22, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 567096.65 (SE +/- 1036.74, N = 3)
  GCC 9.1.0: 567987.34 (SE +/- 1430.19, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 555154.60 (SE +/- 2761.64, N = 3)
  1. (CC) gcc options: -O2 -O3 -lrt -lrt
CppPerformanceBenchmarks 9 - Test: Math Library (Seconds, Fewer Is Better)
  GCC 10.0.0: 307.23 (SE +/- 4.29, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 306.02 (SE +/- 2.37, N = 3)
  GCC 9.1.0: 309.36 (SE +/- 0.26, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 302.82 (SE +/- 3.91, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11
GROMACS 2018.3 - Water Benchmark (Ns Per Day, More Is Better)
  GCC 10.0.0: 0.97 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 0.98 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 0.98 (SE +/- 0.00, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 0.99 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -march=core-avx2 -O3 -std=c++11 -funroll-all-loops -fopenmp -lrt -lpthread -lm
HPC Challenge 1.5.0 - Test / Class: Random Ring Latency (usecs, Fewer Is Better)
  GCC 10.0.0: 0.33186 (SE +/- 0.00125, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 0.32521 (SE +/- 0.00071, N = 3)
  GCC 9.1.0: 0.32596 (SE +/- 0.00047, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 0.32698 (SE +/- 0.00042, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
Cpuminer-Opt 3.8.8.1 - Algorithm: sha256t (kH/s - Hash Speed, More Is Better)
  GCC 10.0.0: 86417 (SE +/- 116.81, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 86440 (SE +/- 180.83, N = 3)
  GCC 9.1.0: 87951 (SE +/- 990.16, N = 7)
  GCC 9.1.0 znver2 (-march=znver2): 87238 (SE +/- 1027.26, N = 6)
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
GraphicsMagick 1.3.30 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
  GCC 10.0.0: 170
  GCC 10.0.0 znver2 (-march=znver2): 173
  GCC 9.1.0: 170
  GCC 9.1.0 znver2 (-march=znver2): 171
  SE +/- 0.33, N = 3; SE +/- 0.67, N = 3; SE +/- 0.58, N = 3 (only three SE values are reported for the four results)
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
HPC Challenge 1.5.0 - Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
  GCC 10.0.0: 23885.44 (SE +/- 62.37, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 23993.04 (SE +/- 195.64, N = 3)
  GCC 9.1.0: 24227.25 (SE +/- 159.70, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 23832.61 (SE +/- 119.42, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_3d - Data Type: f32 (ms, Fewer Is Better)
  GCC 10.0.0: 116.62 (SE +/- 0.79, N = 3; MIN: 102.39)
  GCC 10.0.0 znver2 (-march=znver2): 118.02 (SE +/- 1.48, N = 4; MIN: 102.11)
  GCC 9.1.0: 117.60 (SE +/- 0.16, N = 3; MIN: 103.13)
  GCC 9.1.0 znver2 (-march=znver2): 118.47 (SE +/- 0.45, N = 3; MIN: 103.47)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
FFmpeg 4.0.2 - H.264 HD To NTSC DV (Seconds, Fewer Is Better)
  GCC 10.0.0: 6.88 (SE +/- 0.01, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 6.78 (SE +/- 0.02, N = 3)
  GCC 9.1.0: 6.86 (SE +/- 0.01, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 6.83 (SE +/- 0.03, N = 3)
  1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lXv -lX11 -lXext -lm -lxcb -lxcb-shape -lxcb-xfixes -lasound -lSDL2 -lsndio -pthread -lbz2 -llzma -O3 -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT
MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_alexnet - Data Type: f32 (ms, Fewer Is Better)
  GCC 10.0.0: 2507.16 (SE +/- 6.13, N = 3; MIN: 2461.57)
  GCC 10.0.0 znver2 (-march=znver2): 2543.93 (SE +/- 17.20, N = 3; MIN: 2467.76)
  GCC 9.1.0: 2527.50 (SE +/- 9.57, N = 3; MIN: 2462.11)
  GCC 9.1.0 znver2 (-march=znver2): 2520.01 (SE +/- 9.61, N = 3; MIN: 2467.07)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
HPC Challenge 1.5.0 - Test / Class: EP-STREAM Triad (GB/s, More Is Better)
  GCC 10.0.0: 1.72205 (SE +/- 0.00098, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 1.73055 (SE +/- 0.00091, N = 3)
  GCC 9.1.0: 1.70820 (SE +/- 0.00015, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 1.71668 (SE +/- 0.00081, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
Apache Benchmark 2.4.29 - Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 10.0.0: 38490.98 (SE +/- 65.39, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 38009.25 (SE +/- 79.10, N = 3)
  GCC 9.1.0: 38392.29 (SE +/- 57.64, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 38022.79 (SE +/- 139.15, N = 3)
  1. (CC) gcc options: -shared -fPIC -pthread -O3
SVT-HEVC 2019-02-03 - 1080p 8-bit YUV To HEVC Video Encode (Frames Per Second, More Is Better)
  GCC 10.0.0: 248.85 (SE +/- 1.73, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 247.99 (SE +/- 1.78, N = 3)
  GCC 9.1.0: 246.01 (SE +/- 3.72, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 247.33 (SE +/- 0.72, N = 3)
  1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -march=native -pie -rdynamic -lpthread -lrt
x265 3.0 - H.265 1080p Video Encoding (Frames Per Second, More Is Better)
  GCC 10.0.0: 53.00 (SE +/- 0.20, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 52.40 (SE +/- 0.19, N = 3)
  GCC 9.1.0: 52.94 (SE +/- 0.28, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 52.53 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge 1.5.0 - Test / Class: G-HPL (GFLOPS, More Is Better)
  GCC 10.0.0: 71.07 (SE +/- 0.08, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 71.05 (SE +/- 0.22, N = 3)
  GCC 9.1.0: 70.97 (SE +/- 0.23, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 71.78 (SE +/- 0.37, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
Cpuminer-Opt 3.8.8.1 - Algorithm: skein (kH/s - Hash Speed, More Is Better)
  GCC 10.0.0: 39720 (SE +/- 5.77, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 39843 (SE +/- 133.46, N = 3)
  GCC 9.1.0: 39397 (SE +/- 602.50, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 39797 (SE +/- 21.86, N = 3)
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
x264 2018-09-25 - H.264 Video Encoding (Frames Per Second, More Is Better)
  GCC 10.0.0: 138.74 (SE +/- 2.27, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 139.82 (SE +/- 2.09, N = 4)
  GCC 9.1.0: 139.59 (SE +/- 1.55, N = 7)
  GCC 9.1.0 znver2 (-march=znver2): 138.41 (SE +/- 2.03, N = 3)
  1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
OpenSSL 1.1.1 - RSA 4096-bit Performance (Signs Per Second, More Is Better)
  GCC 10.0.0: 3492.53 (SE +/- 1.89, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 3487.10 (SE +/- 0.70, N = 3)
  GCC 9.1.0: 3516.27 (SE +/- 7.07, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 3481.50 (SE +/- 1.42, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
NGINX Benchmark 1.9.9 - Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 10.0.0: 39525.70 (SE +/- 158.42, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 39346.91 (SE +/- 23.74, N = 3)
  GCC 9.1.0: 39734.85 (SE +/- 102.83, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 39602.49 (SE +/- 112.05, N = 3)
  1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_all - Data Type: f32 (ms, Fewer Is Better)
  GCC 10.0.0: 19803.57 (SE +/- 87.03, N = 3; MIN: 19014.9)
  GCC 10.0.0 znver2 (-march=znver2): 19694.33 (SE +/- 41.11, N = 3; MIN: 18995.6)
  GCC 9.1.0: 19613.70 (SE +/- 42.61, N = 3; MIN: 18961.5)
  GCC 9.1.0 znver2 (-march=znver2): 19696.57 (SE +/- 22.35, N = 3; MIN: 19033.5)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
PostgreSQL pgbench 10.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Only (TPS, More Is Better)
  GCC 10.0.0: 300244.81 (SE +/- 102.78, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 298969.75 (SE +/- 237.85, N = 3)
  GCC 9.1.0: 300353.09 (SE +/- 513.53, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 297539.89 (SE +/- 235.79, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
High Performance Conjugate Gradient 3.0 (GFLOP/s, More Is Better)
  GCC 10.0.0: 1.09 (SE +/- 0.01, N = 4)
  GCC 10.0.0 znver2: 1.08 (SE +/- 0.00, N = 3)
  GCC 9.1.0: 1.09 (SE +/- 0.01, N = 3)
  GCC 9.1.0 znver2: 1.08 (SE +/- 0.01, N = 3)
Stockfish 9 - Total Time (Nodes Per Second, More Is Better)
  GCC 10.0.0: 39631993 (SE +/- 237875.03, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 39540328 (SE +/- 131167.27, N = 3)
  GCC 9.1.0: 39278964 (SE +/- 210046.69, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 39561655 (SE +/- 164232.11, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -O3 -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto
John The Ripper 1.9.0-jumbo-1 - Test: Blowfish (Real C/S, More Is Better)
  GCC 10.0.0: 20426 (SE +/- 63.74, N = 3)
  GCC 10.0.0 znver2: 20426 (SE +/- 64.22, N = 3)
  GCC 9.1.0: 20335 (SE +/- 64.93, N = 3)
  GCC 9.1.0 znver2: 20253 (SE +/- 63.01, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2
Cpuminer-Opt 3.8.8.1 - Algorithm: myr-gr (kH/s - Hash Speed, More Is Better)
  GCC 10.0.0: 14130 (SE +/- 40.00, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 14137 (SE +/- 6.67, N = 3)
  GCC 9.1.0: 14127 (SE +/- 49.78, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 14023 (SE +/- 26.03, N = 3)
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
HPC Challenge 1.5.0 - Test / Class: EP-DGEMM (GFLOPS, More Is Better)
  GCC 10.0.0: 32.86 (SE +/- 0.11, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 32.84 (SE +/- 0.22, N = 3)
  GCC 9.1.0: 32.83 (SE +/- 0.19, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 32.60 (SE +/- 0.42, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
PostgreSQL pgbench 10.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (TPS, More Is Better)
  GCC 10.0.0: 29372.39 (SE +/- 31.16, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 29148.60 (SE +/- 124.84, N = 3)
  GCC 9.1.0: 29178.23 (SE +/- 55.36, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 29149.20 (SE +/- 40.41, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
MKL-DNN 2019-04-16 - Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32 (ms, Fewer Is Better)
  GCC 10.0.0: 1145.01 (SE +/- 6.39, N = 3; MIN: 1050.58)
  GCC 10.0.0 znver2 (-march=znver2): 1145.95 (SE +/- 5.61, N = 3; MIN: 1052.71)
  GCC 9.1.0: 1147.62 (SE +/- 6.51, N = 3; MIN: 1052.13)
  GCC 9.1.0 znver2 (-march=znver2): 1153.46 (SE +/- 6.23, N = 3; MIN: 1057.54)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
CppPerformanceBenchmarks 9 - Test: Ctype (Seconds, Fewer Is Better)
  GCC 10.0.0: 31.51 (SE +/- 0.38, N = 5)
  GCC 10.0.0 znver2 (-march=znver2): 31.30 (SE +/- 0.03, N = 3)
  GCC 9.1.0: 31.52 (SE +/- 0.28, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 31.43 (SE +/- 0.14, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11
XZ Compression 5.2.4 - Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (Seconds, Fewer Is Better)
  GCC 10.0.0: 25.26 (SE +/- 0.12, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 25.39 (SE +/- 0.11, N = 3)
  GCC 9.1.0: 25.23 (SE +/- 0.10, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 25.25 (SE +/- 0.13, N = 3)
  1. (CC) gcc options: -pthread -fvisibility=hidden -O3
SVT-AV1 0.5 - 1080p 8-bit YUV To AV1 Video Encode (Frames Per Second, More Is Better)
  GCC 10.0.0: 46.22 (SE +/- 0.27, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 46.49 (SE +/- 0.13, N = 3)
  GCC 9.1.0: 46.45 (SE +/- 0.15, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 46.39 (SE +/- 0.19, N = 3)
  1. (CXX) g++ options: -O3 -pie -lpthread -lm
Cpuminer-Opt 3.8.8.1 - Algorithm: m7m (kH/s - Hash Speed, More Is Better)
  GCC 10.0.0: 591.32 (SE +/- 0.27, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 590.80 (SE +/- 0.35, N = 3)
  GCC 9.1.0: 593.89 (SE +/- 0.29, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 590.66 (SE +/- 0.15, N = 3)
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
HPC Challenge 1.5.0 - Test / Class: G-Random Access (GUP/s, More Is Better)
  GCC 10.0.0: 0.09778 (SE +/- 0.00041, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 0.09771 (SE +/- 0.00044, N = 3)
  GCC 9.1.0: 0.09757 (SE +/- 0.00036, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 0.09798 (SE +/- 0.00042, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
m-queens 1.2 - Time To Solve (Seconds, Fewer Is Better)
  GCC 10.0.0: 47.14 (SE +/- 0.11, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 47.21 (SE +/- 0.10, N = 3)
  GCC 9.1.0: 47.12 (SE +/- 0.12, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 47.27 (SE +/- 0.08, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -O2 -march=native
Apache Siege 2.4.29 - Concurrent Users: 250 (Transactions Per Second, More Is Better)
  GCC 10.0.0: 62725.24 (SE +/- 122.71, N = 3)
  GCC 10.0.0 znver2 (-march=znver2): 102423.07 (SE +/- 1636.75, N = 12)
  GCC 9.1.0: 98050.91 (SE +/- 3755.13, N = 15)
  GCC 9.1.0 znver2 (-march=znver2): 96842.13 (SE +/- 4063.46, N = 12)
  1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
Apache Siege 2.4.29 - Concurrent Users: 200 (Transactions Per Second, More Is Better)
  GCC 10.0.0: 82293.14 (SE +/- 3302.37, N = 12)
  GCC 10.0.0 znver2 (-march=znver2): 83275.06 (SE +/- 1288.23, N = 15)
  GCC 9.1.0: 60835.79 (SE +/- 798.56, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 99824.49 (SE +/- 3575.15, N = 15)
  1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
Memcached mcperf 1.5.10 - Method: Set (Operations Per Second, More Is Better)
  GCC 10.0.0: 57193.25 (SE +/- 2058.38, N = 15)
  GCC 10.0.0 znver2 (-march=znver2): 52910.87 (SE +/- 393.33, N = 3)
  GCC 9.1.0: 52914.10 (SE +/- 293.82, N = 3)
  GCC 9.1.0 znver2 (-march=znver2): 59232.07 (SE +/- 3850.96, N = 15)
  1. (CC) gcc options: -O3 -lm -rdynamic
Redis 4.0.8 - Test: GET (Requests Per Second, More Is Better)
  GCC 10.0.0: 3042507.47 (SE +/- 47460.73, N = 15)
  GCC 10.0.0 znver2: 3031706.22 (SE +/- 51486.64, N = 15)
  GCC 9.1.0: 3297713.33 (SE +/- 40781.06, N = 3)
  GCC 9.1.0 znver2: 3066070.28 (SE +/- 61029.58, N = 15)
  1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread
Cpuminer-Opt 3.8.8.1 - Algorithm: deep (kH/s - Hash Speed, More Is Better)
  GCC 10.0.0: 11137.00
  GCC 10.0.0 znver2 (-march=znver2): 11123.00
  GCC 9.1.0: 11190.00
  GCC 9.1.0 znver2 (-march=znver2): 10230.34
  SE +/- 3.33, N = 3; SE +/- 8.82, N = 3; SE +/- 926.03, N = 12 (only three SE values are reported for the four results)
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
MKL-DNN 2019-04-16 - Harness: IP Batch 1D - Data Type: f32 (ms, Fewer Is Better)
  GCC 10.0.0: 154.09 (SE +/- 3.18, N = 12; MIN: 129)
  GCC 10.0.0 znver2 (-march=znver2): 157.72 (SE +/- 3.21, N = 14; MIN: 127)
  GCC 9.1.0: 152.51 (SE +/- 3.35, N = 15; MIN: 111.42)
  GCC 9.1.0 znver2 (-march=znver2): 155.34 (SE +/- 2.27, N = 15; MIN: 127.99)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
Phoronix Test Suite v10.8.5