nvidia

AMD Ryzen 5 3600 6-Core testing with a MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.M0 BIOS) and MSI NVIDIA GeForce RTX 2080 8GB on Linuxmint 21.3 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2406227-FLOB-NVIDIA592&grt&rdt.

nvidiaProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerDisplay DriverOpenGLOpenCLVulkanCompilerFile-SystemScreen Resolution02102023_232422062024_1716AMD Ryzen 5 3600 6-Core @ 3.60GHz (6 Cores / 12 Threads)MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.L0 BIOS)AMD Starship/Matisse32GB240GB Corsair Force MP510 + 1000GB Samsung SSD 970 EVO 1TB + 0GB SD/MMC + 0GB SM/xD-Picture + 0GB Compact Flash + 0GB MS/MS-ProMSI NVIDIA GeForce RTX 2080 8GBNVIDIA TU104 HD Audio2 x PL2470H + JBL Bar 5.1Realtek RTL8111/8168/8411Linuxmint 21.26.2.0-33-generic (x86_64)Cinnamon 5.8.4X Server 1.21.1.4NVIDIA 535.54.034.6.0OpenCL 3.0 CUDA 12.2.791.3.242GCC 11.4.0 + CUDA 12.2ext43840x1080MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.M0 BIOS)240GB Corsair Force MP510 + 1000GB Samsung SSD 970 EVO 1TB + 0GB SD/MMC + 0GB Compact Flash + 0GB SM/xD-Picture + 0GB MS/MS-ProLinuxmint 21.36.5.0-41-generic (x86_64)Cinnamon 6.0.4NVIDIA 550.90.07OpenCL 3.0 CUDA 12.4.1311.3.277GCC 12.3.0 + CUDA 12.4OpenBenchmarking.orgKernel Details- Transparent Huge Pages: neverCompiler Details- 02102023_2324: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - 22062024_1716: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-ALHxjy/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-ALHxjy/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- 02102023_2324: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8701030- 22062024_1716: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8701030Graphics Details- BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.04.23.80.2aOpenCL Details- GPU Compute Cores: 2944Python Details- Python 3.10.12Security Details- 02102023_2324: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected - 22062024_1716: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

nvidiaarrayfire: Conjugate Gradient OpenCLblender: BMW27 - NVIDIA OptiXblender: Junkshop - NVIDIA OptiXblender: Classroom - NVIDIA OptiXblender: Fishy Cat - NVIDIA OptiXblender: Barbershop - NVIDIA OptiXblender: Pabellon Barcelona - NVIDIA OptiXcl-mem: Copycl-mem: Readcl-mem: Writeclpeak: Integer Compute INTclpeak: Single-Precision Floatclpeak: Double-Precision Doubleclpeak: Global Memory Bandwidthfahbench: financebench: Black-Scholes OpenCLgromacs: NVIDIA CUDA GPU - water_GMX50_barehashcat: MD5hashcat: SHA1hashcat: 7-Ziphashcat: SHA-512hashcat: TrueCrypt RIPEMD160 + XTSindigobench: OpenCL GPU - Bedroomindigobench: OpenCL GPU - Supercarluxcorerender: DLSC - GPUluxcorerender: Danish Mood - GPUluxcorerender: Orange Juice - GPUluxcorerender: LuxCore Benchmark - GPUluxcorerender: Rainbow Colors and Prism - GPUmandelgpu: GPUmixbench: OpenCL - Integermixbench: NVIDIA CUDA - Integermixbench: OpenCL - Double Precisionmixbench: OpenCL - Single Precisionmixbench: NVIDIA CUDA - Half Precisionmixbench: NVIDIA CUDA - Double Precisionmixbench: NVIDIA CUDA - Single Precisionnamd-cuda: ATPase Simulation - 327,506 Atomsncnn: Vulkan GPU - mobilenetncnn: Vulkan GPU-v2-v2 - mobilenet-v2ncnn: Vulkan GPU-v3-v3 - mobilenet-v3ncnn: Vulkan GPU - shufflenet-v2ncnn: Vulkan GPU - mnasnetncnn: Vulkan GPU - efficientnet-b0ncnn: Vulkan GPU - blazefacencnn: Vulkan GPU - googlenetncnn: Vulkan GPU - vgg16ncnn: Vulkan GPU - resnet18ncnn: Vulkan GPU - alexnetncnn: Vulkan GPU - resnet50ncnn: Vulkan GPU - yolov4-tinyncnn: Vulkan GPU - squeezenet_ssdncnn: Vulkan GPU - regnety_400mncnn: Vulkan GPU - vision_transformerncnn: Vulkan GPU - FastestDetneatbench: GPUoctanebench: Total Scorerealsr-ncnn: 4x - Norealsr-ncnn: 4x - Yesrodinia: OpenCL Particle Filtershoc: OpenCL - S3Dshoc: OpenCL - Triadshoc: OpenCL - FFT SPshoc: OpenCL - MD5 Hashshoc: OpenCL - Reductionshoc: OpenCL - GEMM SGEMM_Nshoc: OpenCL - Max SP Flopsshoc: OpenCL - Bus Speed Downloadshoc: OpenCL - Bus Speed Readbackshoc: OpenCL - Texture Read Bandwidthviennacl: CPU BLAS - sCOPYviennacl: CPU BLAS - sAXPYviennacl: CPU BLAS - sDOTviennacl: CPU BLAS - dCOPYviennacl: CPU BLAS - dAXPYviennacl: CPU BLAS - dDOTviennacl: CPU BLAS - dGEMV-Nviennacl: CPU BLAS - dGEMM-NNviennacl: CPU BLAS - dGEMM-NTviennacl: CPU BLAS - dGEMM-TNviennacl: CPU BLAS - dGEMM-TTviennacl: CPU BLAS - dGEMV-Tviennacl: OpenCL BLAS - sCOPYviennacl: OpenCL BLAS - sAXPYviennacl: OpenCL BLAS - sDOTviennacl: OpenCL BLAS - dCOPYviennacl: OpenCL BLAS - dAXPYviennacl: OpenCL BLAS - dDOTviennacl: OpenCL BLAS - dGEMV-Nviennacl: OpenCL BLAS - dGEMV-Tviennacl: OpenCL BLAS - dGEMM-NNviennacl: OpenCL BLAS - dGEMM-TNviennacl: OpenCL BLAS - dGEMM-TTvkfft: FFT + iFFT R2C / C2Rvkfft: FFT + iFFT C2C 1D batched in half precisionvkfft: FFT + iFFT C2C Bluestein in single precisionvkfft: FFT + iFFT C2C 1D batched in double precisionvkfft: FFT + iFFT C2C 1D batched in single precisionvkfft: FFT + iFFT C2C multidimensional in single precisionvkfft: FFT + iFFT C2C Bluestein benchmark in double precisionvkfft: FFT + iFFT C2C 1D batched in single precision, no reshufflingvkresample: 2x - Doublevkresample: 2x - Singlewaifu2x-ncnn: 2x - 3 - Yes02102023_232422062024_17163810643333311869866667594313174040000044826710848.2210243.87291.9410886.3421004.45291.8510802.7511.24867.432198.02112.78865.0892.08512.9326.9434.8623.64132.6541.85293.5396.1317.78990.849771.86348.51369.10257.005711.26511.231378531666671206880000059796717368333334481678.36125.4965.292.905.463.7711.90330171776.510900.5310221.89290.3810925.7020916.66292.4310811.180.1653614.554.623.684.364.006.691.5812.4947.188.938.3919.5323.4311.769.49106.744.652080266.11635111.51967.2616.218198.18812.76651080.1124.0791329.9173336.9211069.713.131913.5641133.7023.635.137.823.635.438.03829.628.730.33039.228232325936738740127933732832432729972887218427155945991924685261363398500.04023.2125.143OpenBenchmarking.org

ArrayFire

Test: Conjugate Gradient OpenCL

OpenBenchmarking.orgms, Fewer Is BetterArrayFire 3.9Test: Conjugate Gradient OpenCL22062024_17160.46910.93821.40731.87642.3455SE +/- 0.013, N = 32.0851. (CXX) g++ options: -O3

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.1Blend File: BMW27 - Compute: NVIDIA OptiX22062024_17163691215SE +/- 0.02, N = 312.93

Blender

Blend File: Junkshop - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.1Blend File: Junkshop - Compute: NVIDIA OptiX22062024_1716612182430SE +/- 0.30, N = 526.94

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.1Blend File: Classroom - Compute: NVIDIA OptiX22062024_1716816243240SE +/- 0.03, N = 334.86

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.1Blend File: Fishy Cat - Compute: NVIDIA OptiX22062024_1716612182430SE +/- 0.01, N = 323.64

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.1Blend File: Barbershop - Compute: NVIDIA OptiX22062024_1716306090120150SE +/- 0.14, N = 3132.65

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.1Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX22062024_17161020304050SE +/- 0.03, N = 341.85

cl-mem

Benchmark: Copy

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Copy22062024_171660120180240300SE +/- 0.30, N = 3293.51. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Read22062024_171690180270360450SE +/- 0.10, N = 3396.11. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Write22062024_171670140210280350SE +/- 0.64, N = 3317.71. (CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

OpenBenchmarking.orgGIOPS, More Is Betterclpeak 1.1.2OpenCL Test: Integer Compute INT22062024_17162K4K6K8K10KSE +/- 91.68, N = 68990.841. (CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

OpenBenchmarking.orgGFLOPS, More Is Betterclpeak 1.1.2OpenCL Test: Single-Precision Float22062024_17162K4K6K8K10KSE +/- 102.62, N = 39771.861. (CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

OpenBenchmarking.orgGFLOPS, More Is Betterclpeak 1.1.2OpenCL Test: Double-Precision Double22062024_171680160240320400SE +/- 0.60, N = 3348.511. (CXX) g++ options: -O3

clpeak

OpenCL Test: Global Memory Bandwidth

OpenBenchmarking.orgGBPS, More Is Betterclpeak 1.1.2OpenCL Test: Global Memory Bandwidth22062024_171680160240320400SE +/- 0.31, N = 3369.101. (CXX) g++ options: -O3

FAHBench

OpenBenchmarking.orgNs Per Day, More Is BetterFAHBench 2.3.222062024_171660120180240300SE +/- 0.06, N = 3257.01

FinanceBench

Benchmark: Black-Scholes OpenCL

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-07-25Benchmark: Black-Scholes OpenCL22062024_17163691215SE +/- 0.00, N = 311.271. (CXX) g++ options: -O3 -march=native -fopenmp

GROMACS

Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare

OpenBenchmarking.orgNs Per Day, More Is BetterGROMACS 2024Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare22062024_17163691215SE +/- 0.02, N = 311.231. (CXX) g++ options: -O3 -lm

Hashcat

Benchmark: MD5

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: MD502102023_232422062024_17168000M16000M24000M32000M40000MSE +/- 166104327.99, N = 3SE +/- 32232712.85, N = 33810643333337853166667

Hashcat

Benchmark: SHA1

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: SHA102102023_232422062024_17163000M6000M9000M12000M15000MSE +/- 131976782.47, N = 3SE +/- 29009021.59, N = 31186986666712068800000

Hashcat

Benchmark: 7-Zip

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: 7-Zip02102023_232422062024_1716130K260K390K520K650KSE +/- 5846.45, N = 15SE +/- 1797.53, N = 3594313597967

Hashcat

Benchmark: SHA-512

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: SHA-51202102023_232422062024_1716400M800M1200M1600M2000MSE +/- 2253885.53, N = 3SE +/- 1560270.63, N = 317404000001736833333

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.2.4Benchmark: TrueCrypt RIPEMD160 + XTS02102023_232422062024_1716100K200K300K400K500KSE +/- 384.42, N = 3SE +/- 284.80, N = 3448267448167

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

OpenBenchmarking.orgM samples/s, More Is BetterIndigoBench 4.4Acceleration: OpenCL GPU - Scene: Bedroom22062024_1716246810SE +/- 0.002, N = 38.361

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

OpenBenchmarking.orgM samples/s, More Is BetterIndigoBench 4.4Acceleration: OpenCL GPU - Scene: Supercar22062024_1716612182430SE +/- 0.01, N = 325.50

LuxCoreRender

Scene: DLSC - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: DLSC - Acceleration: GPU22062024_17161.19032.38063.57094.76125.9515SE +/- 0.01, N = 35.29MIN: 4.81 / MAX: 5.41

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: Danish Mood - Acceleration: GPU22062024_17160.65251.3051.95752.613.2625SE +/- 0.05, N = 152.90MIN: 0.58 / MAX: 4.05

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: Orange Juice - Acceleration: GPU22062024_17161.22852.4573.68554.9146.1425SE +/- 0.01, N = 35.46MIN: 4.47 / MAX: 6.23

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: LuxCore Benchmark - Acceleration: GPU22062024_17160.84831.69662.54493.39324.2415SE +/- 0.02, N = 33.77MIN: 0.88 / MAX: 4.99

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.6Scene: Rainbow Colors and Prism - Acceleration: GPU22062024_17163691215SE +/- 0.00, N = 311.90MIN: 10.76 / MAX: 12.69

MandelGPU

OpenCL Device: GPU

OpenBenchmarking.orgSamples/sec, More Is BetterMandelGPU 1.3pts1OpenCL Device: GPU22062024_171670M140M210M280M350MSE +/- 392572.42, N = 3330171776.51. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

Mixbench

Backend: OpenCL - Benchmark: Integer

OpenBenchmarking.orgGIOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Integer02102023_232422062024_17162K4K6K8K10KSE +/- 2.74, N = 3SE +/- 27.69, N = 310848.2210900.531. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Integer

OpenBenchmarking.orgGIOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Integer02102023_232422062024_17162K4K6K8K10KSE +/- 4.09, N = 3SE +/- 7.91, N = 310243.8710221.891. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Double Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Double Precision02102023_232422062024_171660120180240300SE +/- 0.00, N = 3SE +/- 1.56, N = 3291.94290.381. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Single Precision02102023_232422062024_17162K4K6K8K10KSE +/- 1.78, N = 3SE +/- 1.15, N = 310886.3410925.701. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Half Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Half Precision02102023_232422062024_17164K8K12K16K20KSE +/- 42.66, N = 3SE +/- 22.07, N = 321004.4520916.661. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Double Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Double Precision02102023_232422062024_171660120180240300SE +/- 0.00, N = 3SE +/- 1.62, N = 3291.85292.431. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Single Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: NVIDIA CUDA - Benchmark: Single Precision02102023_232422062024_17162K4K6K8K10KSE +/- 1.52, N = 3SE +/- 23.13, N = 310802.7510811.181. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

NAMD CUDA

ATPase Simulation - 327,506 Atoms

OpenBenchmarking.orgdays/ns, Fewer Is BetterNAMD CUDA 2.14ATPase Simulation - 327,506 Atoms22062024_17160.03720.07440.11160.14880.186SE +/- 0.00054, N = 30.16536

NCNN

Target: Vulkan GPU - Model: mobilenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: mobilenet22062024_171648121620SE +/- 0.04, N = 314.55MIN: 14.28 / MAX: 17.541. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU-v2-v2 - Model: mobilenet-v222062024_17161.03952.0793.11854.1585.1975SE +/- 0.06, N = 34.62MIN: 4.37 / MAX: 7.111. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU-v3-v3 - Model: mobilenet-v322062024_17160.8281.6562.4843.3124.14SE +/- 0.02, N = 33.68MIN: 3.51 / MAX: 5.751. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: shufflenet-v222062024_17160.9811.9622.9433.9244.905SE +/- 0.02, N = 34.36MIN: 4.2 / MAX: 6.231. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: mnasnet22062024_17160.91.82.73.64.5SE +/- 0.04, N = 34.00MIN: 3.8 / MAX: 5.961. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: efficientnet-b022062024_1716246810SE +/- 0.06, N = 36.69MIN: 6.42 / MAX: 12.541. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: blazeface22062024_17160.35550.7111.06651.4221.7775SE +/- 0.01, N = 31.58MIN: 1.51 / MAX: 3.61. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: googlenet22062024_17163691215SE +/- 0.10, N = 312.49MIN: 12.06 / MAX: 46.571. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: vgg1622062024_17161122334455SE +/- 0.08, N = 347.18MIN: 46.4 / MAX: 79.341. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: resnet1822062024_1716246810SE +/- 0.04, N = 38.93MIN: 8.64 / MAX: 40.721. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: alexnet22062024_1716246810SE +/- 0.02, N = 38.39MIN: 8.08 / MAX: 40.851. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: resnet5022062024_1716510152025SE +/- 0.18, N = 319.53MIN: 19.02 / MAX: 99.521. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: yolov4-tiny22062024_1716612182430SE +/- 0.07, N = 323.43MIN: 22.71 / MAX: 113.591. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: squeezenet_ssd22062024_17163691215SE +/- 0.10, N = 311.76MIN: 11.35 / MAX: 95.131. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: regnety_400m22062024_17163691215SE +/- 0.02, N = 39.49MIN: 9.09 / MAX: 69.481. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: vision_transformer22062024_171620406080100SE +/- 0.07, N = 3106.74MIN: 105.63 / MAX: 175.711. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20230517Target: Vulkan GPU - Model: FastestDet22062024_17161.04632.09263.13894.18525.2315SE +/- 0.08, N = 34.65MIN: 4.49 / MAX: 8.041. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NeatBench

Acceleration: GPU

OpenBenchmarking.orgFPS, More Is BetterNeatBench 5Acceleration: GPU22062024_1716400800120016002000SE +/- 0.00, N = 32080

OctaneBench

Total Score

OpenBenchmarking.orgScore, More Is BetterOctaneBench 2020.1Total Score22062024_171660120180240300266.12

RealSR-NCNN

Scale: 4x - TAA: No

OpenBenchmarking.orgSeconds, Fewer Is BetterRealSR-NCNN 20200818Scale: 4x - TAA: No02102023_232422062024_17163691215SE +/- 0.02, N = 3SE +/- 0.03, N = 311.2511.52

RealSR-NCNN

Scale: 4x - TAA: Yes

OpenBenchmarking.orgSeconds, Fewer Is BetterRealSR-NCNN 20200818Scale: 4x - TAA: Yes02102023_232422062024_17161530456075SE +/- 0.41, N = 3SE +/- 0.10, N = 367.4367.26

Rodinia

Test: OpenCL Particle Filter

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL Particle Filter22062024_1716246810SE +/- 0.078, N = 36.2181. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: S3D02102023_232422062024_17164080120160200SE +/- 0.36, N = 3SE +/- 0.07, N = 3198.02198.191. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: Triad02102023_232422062024_17163691215SE +/- 0.00, N = 3SE +/- 0.00, N = 312.7912.771. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: FFT SP22062024_17162004006008001000SE +/- 0.45, N = 31080.111. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

OpenBenchmarking.orgGHash/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: MD5 Hash22062024_1716612182430SE +/- 0.03, N = 324.081. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: Reduction22062024_171670140210280350SE +/- 0.11, N = 3329.921. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: GEMM SGEMM_N22062024_17167001400210028003500SE +/- 1.27, N = 33336.921. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: Max SP Flops22062024_17162K4K6K8K10KSE +/- 29.33, N = 311069.71. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: Bus Speed Download22062024_17163691215SE +/- 0.00, N = 313.131. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: Bus Speed Readback22062024_17163691215SE +/- 0.00, N = 313.561. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2020-04-17Target: OpenCL - Benchmark: Texture Read Bandwidth22062024_17162004006008001000SE +/- 2.71, N = 31133.701. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ViennaCL

Test: CPU BLAS - sCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sCOPY22062024_1716612182430SE +/- 0.03, N = 323.61. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sAXPY22062024_1716816243240SE +/- 0.30, N = 335.11. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sDOT22062024_1716918273645SE +/- 0.03, N = 337.81. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dCOPY22062024_1716612182430SE +/- 0.05, N = 223.61. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dAXPY22062024_1716816243240SE +/- 0.09, N = 335.41. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dDOT22062024_1716918273645SE +/- 0.06, N = 338.01. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMV-N22062024_1716918273645381. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-NN22062024_1716714212835SE +/- 0.03, N = 329.61. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-NT22062024_1716714212835SE +/- 0.12, N = 328.71. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-TN22062024_1716714212835SE +/- 0.03, N = 330.31. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-TT22062024_1716714212835SE +/- 0.00, N = 3301. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMV-T22062024_1716918273645SE +/- 0.25, N = 239.21. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sCOPY22062024_171660120180240300SE +/- 0.67, N = 32821. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sAXPY22062024_171670140210280350SE +/- 0.00, N = 33231. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sDOT22062024_171660120180240300SE +/- 0.33, N = 32591. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dCOPY22062024_171680160240320400SE +/- 0.33, N = 33671. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dAXPY22062024_171680160240320400SE +/- 0.33, N = 33871. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dDOT22062024_171690180270360450SE +/- 0.58, N = 34011. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMV-N22062024_171660120180240300SE +/- 0.58, N = 32791. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMV-T22062024_171670140210280350SE +/- 0.67, N = 33371. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-NN22062024_171670140210280350SE +/- 2.96, N = 33281. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-TN22062024_171670140210280350SE +/- 2.60, N = 33241. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-TT22062024_171670140210280350SE +/- 2.00, N = 23271. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

Test: FFT + iFFT R2C / C2R

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT R2C / C2R22062024_17166K12K18K24K30KSE +/- 28.42, N = 3299721. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in half precision22062024_171620K40K60K80K100KSE +/- 872.40, N = 3887211. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C Bluestein in single precision22062024_17162K4K6K8K10KSE +/- 57.10, N = 384271. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in double precision22062024_17163K6K9K12K15KSE +/- 144.33, N = 3155941. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in single precision22062024_171613K26K39K52K65KSE +/- 31.52, N = 3599191. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C multidimensional in single precision22062024_17165K10K15K20K25KSE +/- 181.93, N = 3246851. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C Bluestein benchmark in double precision22062024_17166001200180024003000SE +/- 5.86, N = 326131. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling22062024_171614K28K42K56K70KSE +/- 14.18, N = 3633981. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Double

OpenBenchmarking.orgms, Fewer Is BetterVkResample 1.0Upscale: 2x - Precision: Double22062024_1716110220330440550SE +/- 0.01, N = 3500.041. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Single

OpenBenchmarking.orgms, Fewer Is BetterVkResample 1.0Upscale: 2x - Precision: Single22062024_1716612182430SE +/- 0.00, N = 323.211. (CXX) g++ options: -O3

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

OpenBenchmarking.orgSeconds, Fewer Is BetterWaifu2x-NCNN Vulkan 20200818Scale: 2x - Denoise: 3 - TAA: Yes02102023_232422062024_17161.15722.31443.47164.62885.786SE +/- 0.036, N = 3SE +/- 0.010, N = 35.0895.143


Phoronix Test Suite v10.8.5