AMD Ryzen 9 3950X Ubuntu Linux
AMD Ryzen 9 3950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS) and NVIDIA GeForce RTX 2080 Ti 11GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2009247-FI-AMDRYZEN924&grs&sor .
AMD Ryzen 9 3950X Ubuntu Linux: System Details (Run 1, Run 2, Run 3)
Processor: AMD Ryzen 9 3950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 2000GB Corsair Force MP600 + 2000GB
Graphics: NVIDIA GeForce RTX 2080 Ti 11GB (1350/7000MHz)
Audio: NVIDIA TU102 HD Audio
Monitor: DELL P2415Q
Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 20.04
Kernel: 5.4.0-47-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: NVIDIA 450.66
OpenGL: 4.6.0
OpenCL: OpenCL 1.2 CUDA 11.0.228 + OpenCL 2.0 AMD-APP (3182.0)
Vulkan: 1.2.133
Compiler: GCC 9.3.0 + CUDA 11.0
File-System: ext4
Screen Resolution: 3840x2160
Graphics entries listed separately for later runs: NVIDIA GeForce RTX 2080 Ti 11GB (420/405MHz); NVIDIA GeForce RTX 2080 Ti 11GB (1350/7000MHz)
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701013
OpenCL Details: GPU Compute Cores: 4352
Python Details: Python 3.8.2
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
AMD Ryzen 9 3950X Ubuntu Linux: Result Overview
Test | Run 1 | Run 2 | Run 3
opencv: Object Detection | 37391 | 36473 | 34895
perf-bench: Syscall Basic | 21570020 | 22900058 | 22735445
osbench: Create Files | 11.187511 | 10.591332 | 10.791248
viennacl: OpenCL LU Factorization | 79.4030 | 75.7065 | 76.8259
perf-bench: Epoll Wait | 33012 | 34159 | 34049
osbench: Create Processes | 28.947194 | 28.053125 | 28.186639
webp: Quality 100, Highest Compression | 6.718 | 6.811 | 6.912
perf-bench: Memset 1MB | 73.239641 | 73.369746 | 71.418354
mixbench: NVIDIA CUDA - Integer | 14651.01 | 14263.82 | 14510.11
webp: Quality 100, Lossless | 15.505 | 15.876 | 15.466
perf-bench: Futex Lock-Pi | 451 | 454 | 443
webp: Default | 1.456 | 1.424 | 1.426
influxdb: 64 - 10000 - 2,5000,1 - 10000 | 1503297.6 | 1530852.5 | 1534746.9
influxdb: 4 - 10000 - 2,5000,1 - 10000 | 1349119.4 | 1372772.2 | 1376801.6
influxdb: 1024 - 10000 - 2,5000,1 - 10000 | 1534587.9 | 1562422.1 | 1563580.1
perf-bench: Memcpy 1MB | 15.340630 | 15.362763 | 15.094659
osbench: Memory Allocations | 65.836350 | 64.730247 | 65.164725
webp: Quality 100, Lossless, Highest Compression | 32.337 | 32.879 | 32.847
osbench: Launch Programs | 37.233829 | 37.083626 | 37.697156
mandelgpu: GPU | 450731366.0 | 455807158.0 | 448875915.0
perf-bench: Sched Pipe | 392388 | 397063 | 391536
espeak: Text-To-Speech Synthesis | 26.779 | 26.605 | 26.491
mpv: Big Buck Bunny Sunflower 1080p - Software Only | 1236.99 | 1249.00 | 1242.68
libraw: Post-Processing Benchmark | 35.30 | 34.98 | 35.05
hashcat: SHA1 | 17962833333 | 17822700000 | 17812100000
arrayfire: Conjugate Gradient OpenCL | 1.676 | 1.671 | 1.662
octanebench: Total Score | 308.84294 | 310.026199 | 307.443811
clpeak: Double-Precision Double | 522.54 | 521.13 | 518.21
hashcat: SHA-512 | 2469700000 | 2451866667 | 2450466667
hashcat: MD5 | 56554000000 | 56137133333 | 56145600000
hashcat: TrueCrypt RIPEMD160 + XTS | 650967 | 646367 | 646600
mixbench: NVIDIA CUDA - Half Precision | 32630.52 | 32596.27 | 32409.08
clpeak: Integer Compute INT | 13318.75 | 13258.65 | 13234.84
webp: Quality 100 | 2.225 | 2.212 | 2.213
rodinia: OpenCL Particle Filter | 4.485 | 4.459 | 4.466
plaidml: No - Inference - IMDB LSTM - OpenCL | 749.41 | 751.39 | 747.07
mpv: Big Buck Bunny Sunflower 4K - Software Only | 372.29 | 374.39 | 372.48
clpeak: Single-Precision Float | 13379.77 | 13387.08 | 13313.50
fahbench | 287.4210 | 288.8771 | 288.9235
plaidml: No - Inference - DenseNet 201 - OpenCL | 213.43 | 212.51 | 212.36
blender: Pabellon Barcelona - NVIDIA OptiX | 103.93 | 104.28 | 104.43
plaidml: No - Inference - Mobilenet - OpenCL | 2409.86 | 2421.41 | 2414.96
cl-mem: Write | 447.7 | 449.8 | 447.7
plaidml: Yes - Inference - Mobilenet - OpenCL | 2750.18 | 2763.05 | 2756.92
blender: BMW27 - NVIDIA OptiX | 20.16 | 20.25 | 20.24
hashcat: 7-Zip | 880967 | 877133 | 879667
blender: Classroom - CUDA | 151.94 | 152.47 | 152.60
perf-bench: Futex Hash | 5042694 | 5021169 | 5028714
blender: Barbershop - NVIDIA OptiX | 896.83 | 893.03 | 893.64
redshift | 247 | 248 | 248
blender: Classroom - NVIDIA OptiX | 73.37 | 73.33 | 73.58
lczero: OpenCL | 11693 | 11673 | 11657
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.17955 | 0.17951 | 0.18004
cl-mem: Copy | 325.3 | 325.2 | 324.5
blender: Fishy Cat - CUDA | 73.15 | 73.27 | 73.17
blender: Fishy Cat - NVIDIA OptiX | 33.10 | 33.05 | 33.05
blender: Pabellon Barcelona - CUDA | 292.14 | 292.45 | 292.50
financebench: Black-Scholes OpenCL | 6.030 | 6.037 | 6.033
plaidml: No - Training - Mobilenet - OpenCL | 187.74 | 187.63 | 187.84
blender: BMW27 - CUDA | 40.73 | 40.77 | 40.73
cl-mem: Read | 545.4 | 545.4 | 544.9
clpeak: Global Memory Bandwidth | 507.56 | 507.87 | 507.91
blender: Barbershop - CUDA | 538.72 | 538.60 | 538.81
opencv: Features 2D | 149676 | 147838 | 148206
osbench: Create Threads | 13.888200 | 13.912996 | 14.068445
mixbench: NVIDIA CUDA - Single Precision | 14126.10 | 13791.73 | 15130.47
mixbench: NVIDIA CUDA - Double Precision | 440.80 | 419.44 | 428.03
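One way to read the overview is to ask how far the three runs drift apart on each test. The sketch below is not part of the Phoronix Test Suite export; it copies four rows from the overview and computes the run-to-run spread as (max - min) / mean. The helper name `spread_percent` and the choice of rows are illustrative assumptions.

```python
# Values copied from the result overview, in (Run 1, Run 2, Run 3) order.
# Only four rows are included here; the selection is illustrative.
results = {
    "opencv: Object Detection": (37391, 36473, 34895),
    "perf-bench: Syscall Basic": (21570020, 22900058, 22735445),
    "mixbench: NVIDIA CUDA - Single Precision": (14126.10, 13791.73, 15130.47),
    "blender: BMW27 - CUDA": (40.73, 40.77, 40.73),
}


def spread_percent(values):
    """Return the range of the runs as a percentage of their mean."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


# Hypothetical helper output: one spread figure per test, largest first.
spreads = {name: spread_percent(vals) for name, vals in results.items()}
for name, pct in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pct:.2f}% spread across runs")
```

A spread of a fraction of a percent (as for the Blender BMW27 CUDA row) suggests the runs are effectively identical, while several percent is worth checking against the standard errors reported per test.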
OpenCV 4.4 - Test: Object Detection (ms, Fewer Is Better)
Run 3: 34895 (SE +/- 588.74, N = 3); Run 2: 36473 (SE +/- 484.41, N = 3); Run 1: 37391 (SE +/- 382.04, N = 3)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
perf-bench - Benchmark: Syscall Basic (ops/sec, More Is Better)
Run 2: 22900058 (SE +/- 98896.75, N = 3); Run 3: 22735445 (SE +/- 164860.89, N = 3); Run 1: 21570020 (SE +/- 81316.08, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
OSBench - Test: Create Files (us Per Event, Fewer Is Better)
Run 2: 10.59 (SE +/- 0.01, N = 3); Run 3: 10.79 (SE +/- 0.10, N = 3); Run 1: 11.19 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -lm
ViennaCL 1.4.2 - OpenCL LU Factorization (GFLOPS, More Is Better)
Run 1: 79.40 (SE +/- 1.01, N = 3); Run 3: 76.83 (SE +/- 1.21, N = 3); Run 2: 75.71 (SE +/- 0.98, N = 3)
1. (CXX) g++ options: -rdynamic -lOpenCL
perf-bench - Benchmark: Epoll Wait (ops/sec, More Is Better)
Run 2: 34159 (SE +/- 308.84, N = 3); Run 3: 34049 (SE +/- 438.57, N = 4); Run 1: 33012 (SE +/- 204.50, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
OSBench - Test: Create Processes (us Per Event, Fewer Is Better)
Run 2: 28.05 (SE +/- 0.25, N = 3); Run 3: 28.19 (SE +/- 0.25, N = 3); Run 1: 28.95 (SE +/- 0.34, N = 3)
1. (CC) gcc options: -lm
WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
Run 1: 6.718 (SE +/- 0.032, N = 3); Run 2: 6.811 (SE +/- 0.071, N = 3); Run 3: 6.912 (SE +/- 0.091, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
perf-bench - Benchmark: Memset 1MB (GB/sec, More Is Better)
Run 2: 73.37 (SE +/- 1.08, N = 4); Run 1: 73.24 (SE +/- 0.94, N = 4); Run 3: 71.42 (SE +/- 1.13, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
Mixbench 2020-06-23 - Backend: NVIDIA CUDA - Benchmark: Integer (GIOPS, More Is Better)
Run 1: 14651.01 (SE +/- 20.03, N = 3); Run 3: 14510.11 (SE +/- 28.60, N = 3); Run 2: 14263.82 (SE +/- 207.34, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
Run 3: 15.47 (SE +/- 0.23, N = 3); Run 1: 15.51 (SE +/- 0.19, N = 5); Run 2: 15.88 (SE +/- 0.11, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
perf-bench - Benchmark: Futex Lock-Pi (ops/sec, More Is Better)
Run 2: 454 (SE +/- 3.89, N = 15); Run 1: 451 (SE +/- 2.08, N = 3); Run 3: 443 (SE +/- 5.86, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
WebP Image Encode 1.1 - Encode Settings: Default (Encode Time - Seconds, Fewer Is Better)
Run 2: 1.424 (SE +/- 0.024, N = 3); Run 3: 1.426 (SE +/- 0.019, N = 3); Run 1: 1.456 (SE +/- 0.024, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
InfluxDB 1.8.2 - Concurrent Streams: 64 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000 (val/sec, More Is Better)
Run 3: 1534746.9 (SE +/- 2016.10, N = 3); Run 2: 1530852.5 (SE +/- 1988.25, N = 3); Run 1: 1503297.6 (SE +/- 691.55, N = 3)
InfluxDB 1.8.2 - Concurrent Streams: 4 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000 (val/sec, More Is Better)
Run 3: 1376801.6 (SE +/- 2246.17, N = 3); Run 2: 1372772.2 (SE +/- 2002.03, N = 3); Run 1: 1349119.4 (SE +/- 2562.57, N = 3)
InfluxDB 1.8.2 - Concurrent Streams: 1024 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000 (val/sec, More Is Better)
Run 3: 1563580.1 (SE +/- 929.80, N = 3); Run 2: 1562422.1 (SE +/- 2819.29, N = 3); Run 1: 1534587.9 (SE +/- 2777.14, N = 3)
perf-bench - Benchmark: Memcpy 1MB (GB/sec, More Is Better)
Run 2: 15.36 (SE +/- 0.10, N = 3); Run 1: 15.34 (SE +/- 0.20, N = 3); Run 3: 15.09 (SE +/- 0.18, N = 5)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
OSBench - Test: Memory Allocations (Ns Per Event, Fewer Is Better)
Run 2: 64.73 (SE +/- 0.11, N = 3); Run 3: 65.16 (SE +/- 0.43, N = 3); Run 1: 65.84 (SE +/- 0.84, N = 3)
1. (CC) gcc options: -lm
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better)
Run 1: 32.34 (SE +/- 0.39, N = 3); Run 3: 32.85 (SE +/- 0.16, N = 3); Run 2: 32.88 (SE +/- 0.30, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
OSBench - Test: Launch Programs (us Per Event, Fewer Is Better)
Run 2: 37.08 (SE +/- 0.25, N = 3); Run 1: 37.23 (SE +/- 0.28, N = 3); Run 3: 37.70 (SE +/- 0.19, N = 3)
1. (CC) gcc options: -lm
MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
Run 2: 455807158.0 (SE +/- 5831583.77, N = 3); Run 1: 450731366.0 (SE +/- 6814468.33, N = 3); Run 3: 448875915.0 (SE +/- 1749343.37, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
perf-bench - Benchmark: Sched Pipe (ops/sec, More Is Better)
Run 2: 397063 (SE +/- 4833.02, N = 3); Run 1: 392388 (SE +/- 3949.47, N = 3); Run 3: 391536 (SE +/- 3250.27, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis (Seconds, Fewer Is Better)
Run 3: 26.49 (SE +/- 0.26, N = 4); Run 2: 26.61 (SE +/- 0.09, N = 4); Run 1: 26.78 (SE +/- 0.08, N = 4)
1. (CC) gcc options: -O2 -std=c99
MPV - Video Input: Big Buck Bunny Sunflower 1080p - Decode: Software Only (FPS, More Is Better)
Run 2: 1249.00 (SE +/- 3.00, N = 3; MIN: 823.81 / MAX: 1678.43); Run 3: 1242.68 (SE +/- 0.55, N = 3; MIN: 824.09 / MAX: 1672.6); Run 1: 1236.99 (SE +/- 2.06, N = 3; MIN: 818.67 / MAX: 1669.1)
1. mpv 0.32.0
LibRaw 0.20 - Post-Processing Benchmark (Mpix/sec, More Is Better)
Run 1: 35.30 (SE +/- 0.02, N = 3); Run 3: 35.05 (SE +/- 0.09, N = 3); Run 2: 34.98 (SE +/- 0.19, N = 3)
1. (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm
Hashcat 6.1.1 - Benchmark: SHA1 (H/s, More Is Better)
Run 1: 17962833333 (SE +/- 28555054.04, N = 3); Run 2: 17822700000 (SE +/- 28850361.06, N = 3); Run 3: 17812100000 (SE +/- 18200000.00, N = 3)
ArrayFire 3.7 - Test: Conjugate Gradient OpenCL (ms, Fewer Is Better)
Run 3: 1.662 (SE +/- 0.006, N = 3); Run 2: 1.671 (SE +/- 0.003, N = 3); Run 1: 1.676 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -rdynamic
OctaneBench 4.00c - Total Score (Score, More Is Better)
Run 2: 310.03; Run 1: 308.84; Run 3: 307.44
clpeak - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
Run 1: 522.54 (SE +/- 1.65, N = 3); Run 2: 521.13 (SE +/- 0.32, N = 3); Run 3: 518.21 (SE +/- 1.44, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
Hashcat 6.1.1 - Benchmark: SHA-512 (H/s, More Is Better)
Run 1: 2469700000 (SE +/- 3523256.07, N = 3); Run 2: 2451866667 (SE +/- 3868390.42, N = 3); Run 3: 2450466667 (SE +/- 2796624.95, N = 3)
Hashcat 6.1.1 - Benchmark: MD5 (H/s, More Is Better)
Run 1: 56554000000 (SE +/- 78954438.34, N = 3); Run 3: 56145600000 (SE +/- 62943175.43, N = 3); Run 2: 56137133333 (SE +/- 48887501.24, N = 3)
Hashcat 6.1.1 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
Run 1: 650967 (SE +/- 648.93, N = 3); Run 3: 646600 (SE +/- 346.41, N = 3); Run 2: 646367 (SE +/- 433.33, N = 3)
Mixbench 2020-06-23 - Backend: NVIDIA CUDA - Benchmark: Half Precision (GFLOPS, More Is Better)
Run 1: 32630.52 (SE +/- 6.67, N = 3); Run 2: 32596.27 (SE +/- 18.28, N = 3); Run 3: 32409.08 (SE +/- 29.77, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
clpeak - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
Run 1: 13318.75 (SE +/- 159.82, N = 15); Run 2: 13258.65 (SE +/- 137.42, N = 15); Run 3: 13234.84 (SE +/- 159.10, N = 6)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
WebP Image Encode 1.1 - Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better)
Run 2: 2.212 (SE +/- 0.026, N = 3); Run 3: 2.213 (SE +/- 0.037, N = 3); Run 1: 2.225 (SE +/- 0.026, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
Run 2: 4.459 (SE +/- 0.016, N = 3); Run 3: 4.466 (SE +/- 0.017, N = 3); Run 1: 4.485 (SE +/- 0.027, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
PlaidML - FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better)
Run 2: 751.39 (SE +/- 3.71, N = 3); Run 1: 749.41 (SE +/- 2.47, N = 3); Run 3: 747.07 (SE +/- 1.84, N = 3)
MPV - Video Input: Big Buck Bunny Sunflower 4K - Decode: Software Only (FPS, More Is Better)
Run 2: 374.39 (SE +/- 0.92, N = 3; MIN: 288.04 / MAX: 445.07); Run 3: 372.48 (SE +/- 1.32, N = 3; MIN: 286.84 / MAX: 432.21); Run 1: 372.29 (SE +/- 0.73, N = 3; MIN: 286.67 / MAX: 441.69)
1. mpv 0.32.0
clpeak - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
Run 2: 13387.08 (SE +/- 184.15, N = 15); Run 1: 13379.77 (SE +/- 169.22, N = 15); Run 3: 13313.50 (SE +/- 178.75, N = 15)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
FAHBench 2.3.2 (Ns Per Day, More Is Better)
Run 3: 288.92 (SE +/- 0.37, N = 3); Run 2: 288.88 (SE +/- 0.56, N = 3); Run 1: 287.42 (SE +/- 0.87, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better)
Run 1: 213.43 (SE +/- 0.02, N = 3); Run 2: 212.51 (SE +/- 0.13, N = 3); Run 3: 212.36 (SE +/- 0.21, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
Run 1: 103.93 (SE +/- 0.12, N = 3); Run 2: 104.28 (SE +/- 0.03, N = 3); Run 3: 104.43 (SE +/- 0.01, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
Run 2: 2421.41 (SE +/- 2.19, N = 3); Run 3: 2414.96 (SE +/- 3.60, N = 3); Run 1: 2409.86 (SE +/- 3.69, N = 3)
cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
Run 2: 449.8 (SE +/- 0.26, N = 3); Run 3: 447.7 (SE +/- 1.25, N = 3); Run 1: 447.7 (SE +/- 0.66, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
PlaidML - FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
Run 2: 2763.05 (SE +/- 2.19, N = 3); Run 3: 2756.92 (SE +/- 0.87, N = 3); Run 1: 2750.18 (SE +/- 2.56, N = 3)
Blender 2.90 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
Run 1: 20.16 (SE +/- 0.20, N = 3); Run 3: 20.24 (SE +/- 0.26, N = 3); Run 2: 20.25 (SE +/- 0.30, N = 3)
Hashcat 6.1.1 - Benchmark: 7-Zip (H/s, More Is Better)
Run 1: 880967 (SE +/- 1637.41, N = 3); Run 3: 879667 (SE +/- 1848.72, N = 3); Run 2: 877133 (SE +/- 1278.45, N = 3)
Blender 2.90 - Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better)
Run 1: 151.94 (SE +/- 0.01, N = 3); Run 2: 152.47 (SE +/- 0.24, N = 3); Run 3: 152.60 (SE +/- 0.39, N = 3)
perf-bench - Benchmark: Futex Hash (ops/sec, More Is Better)
Run 1: 5042694 (SE +/- 7897.00, N = 3); Run 3: 5028714 (SE +/- 6643.91, N = 3); Run 2: 5021169 (SE +/- 7280.95, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
Blender 2.90 - Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
Run 2: 893.03 (SE +/- 0.32, N = 3); Run 3: 893.64 (SE +/- 1.68, N = 3); Run 1: 896.83 (SE +/- 0.44, N = 3)
RedShift Demo 3.0 (Seconds, Fewer Is Better)
Run 1: 247 (SE +/- 0.33, N = 3); Run 2: 248 (SE +/- 0.33, N = 3); Run 3: 248 (SE +/- 0.88, N = 3)
Blender 2.90 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
Run 2: 73.33 (SE +/- 0.28, N = 3); Run 1: 73.37 (SE +/- 0.25, N = 3); Run 3: 73.58 (SE +/- 0.18, N = 3)
LeelaChessZero 0.26 - Backend: OpenCL (Nodes Per Second, More Is Better)
Run 1: 11693 (SE +/- 82.15, N = 3); Run 2: 11673 (SE +/- 69.91, N = 3); Run 3: 11657 (SE +/- 41.53, N = 3)
1. (CXX) g++ options: -flto -pthread
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
Run 2: 0.17951 (SE +/- 0.00020, N = 3); Run 1: 0.17955 (SE +/- 0.00010, N = 3); Run 3: 0.18004 (SE +/- 0.00039, N = 3)
cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
Run 1: 325.3 (SE +/- 0.19, N = 3); Run 2: 325.2 (SE +/- 0.15, N = 3); Run 3: 324.5 (SE +/- 0.50, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
Blender 2.90 - Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better)
Run 1: 73.15 (SE +/- 0.04, N = 3); Run 3: 73.17 (SE +/- 0.05, N = 3); Run 2: 73.27 (SE +/- 0.03, N = 3)
Blender 2.90 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
Run 2: 33.05 (SE +/- 0.01, N = 3); Run 3: 33.05 (SE +/- 0.03, N = 3); Run 1: 33.10 (SE +/- 0.03, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better)
Run 1: 292.14 (SE +/- 0.05, N = 3); Run 2: 292.45 (SE +/- 0.19, N = 3); Run 3: 292.50 (SE +/- 0.04, N = 3)
FinanceBench 2016-06-06 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
Run 1: 6.030 (SE +/- 0.002, N = 3); Run 3: 6.033 (SE +/- 0.003, N = 3); Run 2: 6.037 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -O3 -lOpenCL
PlaidML - FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
Run 3: 187.84 (SE +/- 0.05, N = 3); Run 1: 187.74 (SE +/- 0.11, N = 3); Run 2: 187.63 (SE +/- 0.08, N = 3)
Blender 2.90 - Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better)
Run 1: 40.73 (SE +/- 0.04, N = 3); Run 3: 40.73 (SE +/- 0.00, N = 3); Run 2: 40.77 (SE +/- 0.04, N = 3)
cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
Run 2: 545.4 (SE +/- 0.27, N = 3); Run 1: 545.4 (SE +/- 0.32, N = 3); Run 3: 544.9 (SE +/- 1.08, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
Run 3: 507.91 (SE +/- 0.44, N = 3); Run 2: 507.87 (SE +/- 0.76, N = 3); Run 1: 507.56 (SE +/- 0.68, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
Blender 2.90 - Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better)
Run 2: 538.60 (SE +/- 0.40, N = 3); Run 1: 538.72 (SE +/- 0.16, N = 3); Run 3: 538.81 (SE +/- 0.06, N = 3)
OpenCV 4.4 - Test: Features 2D (ms, Fewer Is Better)
Run 2: 147838 (SE +/- 2777.13, N = 9); Run 3: 148206 (SE +/- 3470.33, N = 12); Run 1: 149676 (SE +/- 1997.08, N = 12)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OSBench - Test: Create Threads (us Per Event, Fewer Is Better)
Run 1: 13.89 (SE +/- 0.26, N = 15); Run 2: 13.91 (SE +/- 0.29, N = 15); Run 3: 14.07 (SE +/- 0.30, N = 15)
1. (CC) gcc options: -lm
Mixbench 2020-06-23 - Backend: NVIDIA CUDA - Benchmark: Single Precision (GFLOPS, More Is Better)
Run 3: 15130.47 (SE +/- 510.15, N = 15); Run 1: 14126.10 (SE +/- 646.92, N = 15); Run 2: 13791.73 (SE +/- 646.88, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
Mixbench 2020-06-23 - Backend: NVIDIA CUDA - Benchmark: Double Precision (GFLOPS, More Is Better)
Run 1: 440.80 (SE +/- 0.03, N = 3); Run 3: 428.03 (SE +/- 4.74, N = 15); Run 2: 419.44 (SE +/- 7.42, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
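The Mixbench results show the widest gaps between runs, so it can help to compare each gap with the reported standard errors. The sketch below is an illustration rather than anything the Phoronix Test Suite computes: it divides the difference between two run means by the root-sum-square of their standard errors. It assumes the SE figures are listed in the same order as the run labels, and the function name `gap_in_standard_errors` is hypothetical.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Difference of two run means, in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Mixbench Single Precision: best run vs. worst run
# (assumed pairing: 15130.47 with SE 510.15, 13791.73 with SE 646.88).
single_precision_gap = gap_in_standard_errors(15130.47, 510.15, 13791.73, 646.88)

# Mixbench Double Precision: best run vs. worst run
# (assumed pairing: 440.80 with SE 0.03, 419.44 with SE 7.42).
double_precision_gap = gap_in_standard_errors(440.80, 0.03, 419.44, 7.42)

print(f"Single precision gap: {single_precision_gap:.2f} combined SEs")
print(f"Double precision gap: {double_precision_gap:.2f} combined SEs")
```

A gap of only one or two combined standard errors is weak evidence of a real difference between runs, particularly where the suite raised the sample count to N = 15 because of run-to-run variance.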
Phoronix Test Suite v10.8.5