vulkan benchmarks
AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and AMD Radeon RX 6700 XT on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308012-PTS-VULKANBE49&grs .
vulkan benchmarks: system configuration (runs a, b, c)
Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB
Graphics: AMD Radeon RX 6700 XT (2855/1000MHz)
Audio: AMD Navi 21/23
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.04
Kernel: 6.4.6-060406-generic (x86_64)
Desktop: GNOME Shell 44.2
Display Server: X Server 1.21.1.7 + Wayland
OpenGL: 4.6 Mesa 23.3~git2307260600.87109c~oibaf~l (git-87109c3 2023-07-26 lunar-oibaf-ppa) (LLVM 15.0.7 DRM 3.52)
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 3840x2160
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203
Graphics Details: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
vulkan benchmarks: result overview
Test | a | b | c
ncnn: Vulkan GPU-v3-v3-v3 - FastestDet | - | 4.06 | 3.69
ncnn: Vulkan GPU-v3-v3-v3 - regnety_400m | - | 8.27 | 7.98
ncnn: Vulkan GPU-v3-v3-v3 - resnet18 | - | 5.42 | 5.23
ncnn: CPU-v3-v3-v3 - resnet50 | - | 10.01 | 10.33
vkpeak: fp32-scalar | 13190.09 | 12807.06 | 12860.56
ncnn: Vulkan GPU-v3-v3-v3 - alexnet | - | 4.42 | 4.3
ncnn: Vulkan GPU - regnety_400m | 8.16 | 8.21 | 8
ncnn: CPU - alexnet | 4.41 | 4.32 | 4.31
ncnn: CPU - vision_transformer | 32.49 | 31.95 | 31.77
vkfft: FFT + iFFT R2C / C2R | 42105 | 42163 | 43021
ncnn: CPU-v3-v3-v3 - vgg16 | - | 23.5 | 23.99
ncnn: CPU - resnet50 | 10.20 | 10.01 | 10.11
ncnn: CPU-v3-v3-v3 - efficientnet-b0 | - | 3.82 | 3.89
ncnn: Vulkan GPU-v3-v3-v3 - googlenet | - | 7.97 | 7.83
ncnn: CPU - resnet18 | 5.29 | 5.2 | 5.24
ncnn: Vulkan GPU-v3-v3-v3 - resnet50 | - | 9.87 | 10.03
ncnn: CPU - googlenet | 7.94 | 7.82 | 7.93
ncnn: Vulkan GPU - resnet18 | 5.28 | 5.23 | 5.21
ncnn: CPU - efficientnet-b0 | 3.90 | 3.85 | 3.88
ncnn: Vulkan GPU - googlenet | 7.90 | 7.85 | 7.8
ncnn: CPU - vgg16 | 23.75 | 23.49 | 23.45
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 3.17 | 3.14 | 3.13
ncnn: CPU - yolov4-tiny | 12.90 | 12.74 | 12.81
vkpeak: int16-vec4 | 23123.77 | 23396.59 | 23385.44
ncnn: CPU-v3-v3-v3 - regnety_400m | - | 8.05 | 8.14
ncnn: CPU - regnety_400m | 8.18 | 8.18 | 8.27
ncnn: Vulkan GPU - efficientnet-b0 | 3.86 | 3.82 | 3.83
ncnn: Vulkan GPU - mnasnet | 2.98 | 2.95 | 2.96
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 4717 | 4695 | 4670
ncnn: CPU - mobilenet | 8.05 | 7.97 | 8.02
ncnn: Vulkan GPU-v3-v3-v3 - squeezenet_ssd | - | 7.14 | 7.07
ncnn: CPU-v3-v3-v3 - resnet18 | - | 5.21 | 5.26
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 3.17 | - | 3.2
ncnn: Vulkan GPU - shufflenet-v2 | 3.35 | 3.33 | 3.32
ncnn: Vulkan GPU-v3-v3-v3 - efficientnet-b0 | - | 3.85 | 3.82
vkfft: FFT + iFFT C2C multidimensional in single precision | 33001 | 32751 | 32812
vkpeak: int32-vec4 | 2658.73 | 2640.08 | 2638.69
ncnn: Vulkan GPU - FastestDet | 4.1 | 4.07 | 4.09
ncnn: Vulkan GPU-v3-v3-v3 - blazeface | - | 1.37 | 1.36
ncnn: CPU-v3-v3-v3 - blazeface | - | 1.37 | 1.38
ncnn: Vulkan GPU - blazeface | 1.38 | 1.37 | 1.37
ncnn: CPU - blazeface | 1.38 | 1.38 | 1.39
vkpeak: fp32-vec4 | 12730.08 | 12808.59 | 12822.01
ncnn: Vulkan GPU-v3-v3-v3 - yolov4-tiny | - | 12.77 | 12.86
ncnn: CPU-v3-v3-v3 - yolov4-tiny | - | 12.98 | 12.89
vkpeak: fp16-vec4 | 23232.42 | 23390.44 | 23387.26
vkpeak: fp64-vec4 | 841.80 | 836.55 | 836.16
ncnn: CPU - mnasnet | 2.97 | 2.97 | 2.99
ncnn: CPU-v2-v2 - mobilenet-v2 | 3.16 | 3.16 | 3.18
ncnn: Vulkan GPU-v3-v3-v3 - mobilenet | - | 8 | 7.95
vkfft: FFT + iFFT C2C Bluestein in single precision | 11340 | 11273 | 11311
ncnn: CPU - squeezenet_ssd | 7.07 | 7.06 | 7.1
ncnn: Vulkan GPU-v3-v3-v3 - vgg16 | - | 23.42 | 23.54
ncnn: CPU-v3-v3-v3 - googlenet | - | 7.84 | 7.88
ncnn: Vulkan GPU - yolov4-tiny | 12.84 | 12.87 | 12.81
ncnn: Vulkan GPU - alexnet | 4.31 | 4.33 | 4.33
ncnn: Vulkan GPU - squeezenet_ssd | 7.09 | 7.07 | 7.06
ncnn: CPU-v3-v3-v3 - vision_transformer | - | 31.65 | 31.78
ncnn: CPU-v3-v3-v3-v2-v2 - mobilenet-v2 | - | 3.15 | 3.14
ncnn: Vulkan GPU-v3-v3-v3-v3-v3 - mobilenet-v3 | - | 3.16 | 3.17
ncnn: CPU-v3-v3 - mobilenet-v3 | 3.18 | - | 3.17
ncnn: CPU-v3-v3-v3 - shufflenet-v2 | - | 3.33 | 3.34
ncnn: CPU - shufflenet-v2 | 3.34 | 3.34 | 3.35
vkpeak: int16-scalar | 13102.75 | 13070.81 | 13063.86
vkpeak: fp64-scalar | 841.40 | 839.2 | 839.01
ncnn: Vulkan GPU - vision_transformer | 31.88 | 31.85 | 31.79
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 50504 | 50643 | 50596
ncnn: Vulkan GPU - mobilenet | 8.05 | 8.04 | 8.03
ncnn: CPU-v3-v3-v3 - FastestDet | - | 4.07 | 4.08
vkfft: FFT + iFFT C2C 1D batched in half precision | 91597 | 91812 | 91744
ncnn: CPU-v3-v3-v3 - alexnet | - | 4.29 | 4.28
ncnn: Vulkan GPU - vgg16 | 23.51 | 23.56 | 23.54
vkfft: FFT + iFFT C2C 1D batched in single precision | 47887 | 47948 | 47971
ncnn: Vulkan GPU-v3-v3-v3 - vision_transformer | - | 31.71 | 31.66
vkpeak: int32-scalar | 2272.62 | 2269.25 | 2269.06
vkfft: FFT + iFFT C2C 1D batched in double precision | 20816 | 20822 | 20847
ncnn: CPU-v3-v3-v3 - squeezenet_ssd | - | 7.03 | 7.04
vkpeak: fp16-scalar | 13154.15 | 13145.19 | 13136.79
ncnn: CPU-v3-v3-v3 - mobilenet | - | 8.01 | 8
ncnn: Vulkan GPU - resnet50 | 10.01 | 10 | 10
vkresample: 2x - Single | 11.686 | 11.69 | 11.688
ncnn: Vulkan GPU-v3-v3-v3 - mnasnet | - | 2.96 | 2.96
ncnn: Vulkan GPU-v3-v3-v3 - shufflenet-v2 | - | 3.33 | 3.33
ncnn: Vulkan GPU-v3-v3-v3-v2-v2 - mobilenet-v2 | - | 3.15 | 3.15
ncnn: CPU-v3-v3-v3 - mnasnet | - | 2.97 | 2.97
ncnn: CPU-v3-v3-v3-v3-v3 - mobilenet-v3 | - | 3.16 | 3.16
ncnn: CPU - FastestDet | 3.62 | 4.05 | 4.11
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better): b 4.06 (MIN: 4.03 / MAX: 4.3), c 3.69 (MIN: 3.66 / MAX: 3.92); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better): b 8.27 (MIN: 8.22 / MAX: 9.01), c 7.98 (MIN: 7.93 / MAX: 8.65); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better): b 5.42 (MIN: 5.36 / MAX: 6.27), c 5.23 (MIN: 5.11 / MAX: 6.03); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better): b 10.01 (MIN: 9.89 / MAX: 10.86), c 10.33 (MIN: 10.16 / MAX: 13.97); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better): a 13190.09, b 12807.06, c 12860.56; SE +/- 4.18, N = 3
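The `SE +/- 4.18, N = 3` annotation attached to results such as this one reports a standard error over N trial runs. A minimal sketch of that statistic, assuming the conventional definition (sample standard deviation divided by the square root of N); the per-trial readings below are hypothetical, since this export lists only the aggregated value, and the helper name `standard_error` is illustrative rather than part of the Phoronix Test Suite:

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two trials")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-trial GFLOPS readings (NOT taken from this result file).
trials = [13182.0, 13191.0, 13197.5]

print(
    f"mean = {statistics.mean(trials):.2f}, "
    f"SE +/- {standard_error(trials):.2f}, N = {len(trials)}"
)
```

A small SE relative to the mean suggests the trials agreed closely; a large one means differences between runs a, b and c of similar size may be noise.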
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better): b 4.42 (MIN: 4.32 / MAX: 5.1), c 4.30 (MIN: 4.26 / MAX: 5.16); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better): a 8.16 (MIN: 7.9 / MAX: 8.99), b 8.21 (MIN: 8.14 / MAX: 8.84), c 8.00 (MIN: 7.94 / MAX: 8.88); SE +/- 0.11, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: alexnet (ms, Fewer Is Better): a 4.41 (MIN: 4.24 / MAX: 5.16), b 4.32 (MIN: 4.26 / MAX: 5.15), c 4.31 (MIN: 4.26 / MAX: 4.98); SE +/- 0.11, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: vision_transformer (ms, Fewer Is Better): a 32.49 (MIN: 31.67 / MAX: 40.11), b 31.95 (MIN: 31.79 / MAX: 32.33), c 31.77 (MIN: 31.61 / MAX: 35.68); SE +/- 0.29, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better): a 42105, b 42163, c 43021; SE +/- 200.55, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better): b 23.50 (MIN: 23.3 / MAX: 24.41), c 23.99 (MIN: 23.72 / MAX: 24.98); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: resnet50 (ms, Fewer Is Better): a 10.20 (MIN: 9.84 / MAX: 12.48), b 10.01 (MIN: 9.85 / MAX: 11.06), c 10.11 (MIN: 9.95 / MAX: 16.18); SE +/- 0.23, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better): b 3.82 (MIN: 3.79 / MAX: 4.34), c 3.89 (MIN: 3.83 / MAX: 9.72); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better): b 7.97 (MIN: 7.89 / MAX: 8.7), c 7.83 (MIN: 7.74 / MAX: 8.61); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: resnet18 (ms, Fewer Is Better): a 5.29 (MIN: 5.09 / MAX: 6.29), b 5.20 (MIN: 5.1 / MAX: 5.9), c 5.24 (MIN: 5.15 / MAX: 6.09); SE +/- 0.07, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better): b 9.87 (MIN: 9.79 / MAX: 10.73), c 10.03 (MIN: 9.93 / MAX: 10.96); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: googlenet (ms, Fewer Is Better): a 7.94 (MIN: 7.71 / MAX: 8.73), b 7.82 (MIN: 7.73 / MAX: 8.65), c 7.93 (MIN: 7.82 / MAX: 8.91); SE +/- 0.11, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): a 5.28 (MIN: 5.17 / MAX: 6.16), b 5.23 (MIN: 5.13 / MAX: 6.18), c 5.21 (MIN: 5.11 / MAX: 6.04); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): a 3.90 (MIN: 3.82 / MAX: 4.51), b 3.85 (MIN: 3.81 / MAX: 4.42), c 3.88 (MIN: 3.84 / MAX: 4.41); SE +/- 0.05, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): a 7.90 (MIN: 7.74 / MAX: 9.54), b 7.85 (MIN: 7.76 / MAX: 8.76), c 7.80 (MIN: 7.72 / MAX: 8.74); SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: vgg16 (ms, Fewer Is Better): a 23.75 (MIN: 23.31 / MAX: 25.12), b 23.49 (MIN: 23.36 / MAX: 24.62), c 23.45 (MIN: 23.26 / MAX: 24.51); SE +/- 0.30, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a 3.17 (MIN: 3.09 / MAX: 3.78), b 3.14 (MIN: 3.1 / MAX: 3.73), c 3.13 (MIN: 3.08 / MAX: 3.85); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): a 12.90 (MIN: 12.69 / MAX: 15.88), b 12.74 (MIN: 12.66 / MAX: 13.28), c 12.81 (MIN: 12.74 / MAX: 13.2); SE +/- 0.11, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better): a 23123.77, b 23396.59, c 23385.44; SE +/- 21.55, N = 3
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better): b 8.05 (MIN: 8 / MAX: 8.58), c 8.14 (MIN: 8.08 / MAX: 8.69); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better): a 8.18 (MIN: 8.07 / MAX: 9.68), b 8.18 (MIN: 8.12 / MAX: 8.86), c 8.27 (MIN: 8.22 / MAX: 9.18); SE +/- 0.04, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): a 3.86 (MIN: 3.8 / MAX: 4.6), b 3.82 (MIN: 3.78 / MAX: 4.39), c 3.83 (MIN: 3.79 / MAX: 4.61); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): a 2.98 (MIN: 2.92 / MAX: 4.03), b 2.95 (MIN: 2.92 / MAX: 3.42), c 2.96 (MIN: 2.93 / MAX: 3.41); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better): a 4717, b 4695, c 4670; SE +/- 0.33, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: mobilenet (ms, Fewer Is Better): a 8.05 (MIN: 7.97 / MAX: 9.07), b 7.97 (MIN: 7.94 / MAX: 8.26), c 8.02 (MIN: 7.98 / MAX: 8.33); SE +/- 0.03, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better): b 7.14 (MIN: 7.06 / MAX: 7.95), c 7.07 (MIN: 7.01 / MAX: 7.75); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better): b 5.21 (MIN: 5.12 / MAX: 6.22), c 5.26 (MIN: 5.18 / MAX: 6.27); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a 3.17 (MIN: 3.11 / MAX: 3.73), c 3.20 (MIN: 3.16 / MAX: 3.68); SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): a 3.35 (MIN: 3.29 / MAX: 3.85), b 3.33 (MIN: 3.3 / MAX: 3.59), c 3.32 (MIN: 3.29 / MAX: 4.19); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better): b 3.85 (MIN: 3.82 / MAX: 4.48), c 3.82 (MIN: 3.78 / MAX: 4.53); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better): a 33001, b 32751, c 32812; SE +/- 57.83, N = 3; (CXX) g++ options: -O3
vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better): a 2658.73, b 2640.08, c 2638.69; SE +/- 0.26, N = 3
NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better): a 4.10 (MIN: 4.06 / MAX: 4.81), b 4.07 (MIN: 4.04 / MAX: 4.53), c 4.09 (MIN: 4.05 / MAX: 5.5); SE +/- 0.00, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better): b 1.37 (MIN: 1.35 / MAX: 1.39), c 1.36 (MIN: 1.34 / MAX: 1.44); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better): b 1.37 (MIN: 1.35 / MAX: 1.52), c 1.38 (MIN: 1.36 / MAX: 1.58); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): a 1.38 (MIN: 1.34 / MAX: 1.85), b 1.37 (MIN: 1.35 / MAX: 1.75), c 1.37 (MIN: 1.35 / MAX: 1.82); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: blazeface (ms, Fewer Is Better): a 1.38 (MIN: 1.35 / MAX: 2.06), b 1.38 (MIN: 1.35 / MAX: 1.67), c 1.39 (MIN: 1.36 / MAX: 1.53); SE +/- 0.00, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better): a 12730.08, b 12808.59, c 12822.01; SE +/- 1.81, N = 3
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better): b 12.77 (MIN: 12.69 / MAX: 13.71), c 12.86 (MIN: 12.76 / MAX: 13.98); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better): b 12.98 (MIN: 12.73 / MAX: 35.55), c 12.89 (MIN: 12.84 / MAX: 13.19); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better): a 23232.42, b 23390.44, c 23387.26; SE +/- 5.96, N = 3
vkpeak 20230730 - fp64-vec4 (GFLOPS, More Is Better): a 841.80, b 836.55, c 836.16; SE +/- 0.32, N = 3
NCNN 20230517 - Target: CPU - Model: mnasnet (ms, Fewer Is Better): a 2.97 (MIN: 2.92 / MAX: 3.48), b 2.97 (MIN: 2.93 / MAX: 3.45), c 2.99 (MIN: 2.96 / MAX: 3.44); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a 3.16 (MIN: 3.1 / MAX: 3.8), b 3.16 (MIN: 3.11 / MAX: 3.61), c 3.18 (MIN: 3.13 / MAX: 3.84); SE +/- 0.00, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better): b 8.00 (MIN: 7.95 / MAX: 8.99), c 7.95 (MIN: 7.89 / MAX: 8.79); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better): a 11340, b 11273, c 11311; SE +/- 62.67, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): a 7.07 (MIN: 7.01 / MAX: 8.07), b 7.06 (MIN: 7.01 / MAX: 7.55), c 7.10 (MIN: 7.05 / MAX: 7.65); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better): b 23.42 (MIN: 23.27 / MAX: 24.32), c 23.54 (MIN: 23.32 / MAX: 24.54); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better): b 7.84 (MIN: 7.74 / MAX: 8.7), c 7.88 (MIN: 7.79 / MAX: 8.78); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better): a 12.84 (MIN: 12.69 / MAX: 15.33), b 12.87 (MIN: 12.76 / MAX: 13.73), c 12.81 (MIN: 12.73 / MAX: 13.08); SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): a 4.31 (MIN: 4.24 / MAX: 5.2), b 4.33 (MIN: 4.28 / MAX: 5.16), c 4.33 (MIN: 4.26 / MAX: 10.59); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): a 7.09 (MIN: 6.98 / MAX: 7.95), b 7.07 (MIN: 7 / MAX: 8.07), c 7.06 (MIN: 7 / MAX: 8.03); SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better): b 31.65 (MIN: 31.53 / MAX: 32.23), c 31.78 (MIN: 31.64 / MAX: 34.51); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): b 3.15 (MIN: 3.1 / MAX: 3.68), c 3.14 (MIN: 3.1 / MAX: 3.67); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): b 3.16 (MIN: 3.11 / MAX: 3.75), c 3.17 (MIN: 3.11 / MAX: 8.89); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a 3.18 (MIN: 3.14 / MAX: 3.82), c 3.17 (MIN: 3.15 / MAX: 3.74); SE +/- 0.00, N = 2; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better): b 3.33 (MIN: 3.3 / MAX: 3.79), c 3.34 (MIN: 3.32 / MAX: 3.79); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): a 3.34 (MIN: 3.3 / MAX: 3.85), b 3.34 (MIN: 3.31 / MAX: 3.77), c 3.35 (MIN: 3.31 / MAX: 3.8); SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - int16-scalar (GIOPS, More Is Better): a 13102.75, b 13070.81, c 13063.86; SE +/- 1.30, N = 3
vkpeak 20230730 - fp64-scalar (GFLOPS, More Is Better): a 841.40, b 839.20, c 839.01; SE +/- 0.22, N = 3
NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better): a 31.88 (MIN: 31.55 / MAX: 37.47), b 31.85 (MIN: 31.69 / MAX: 33.06), c 31.79 (MIN: 31.63 / MAX: 35.57); SE +/- 0.09, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better): a 50504, b 50643, c 50596; SE +/- 8.89, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): a 8.05 (MIN: 7.95 / MAX: 8.89), b 8.04 (MIN: 7.95 / MAX: 14.33), c 8.03 (MIN: 7.98 / MAX: 8.84); SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better): b 4.07 (MIN: 4.03 / MAX: 5.83), c 4.08 (MIN: 4.05 / MAX: 4.36); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better): a 91597, b 91812, c 91744; SE +/- 83.55, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better): b 4.29 (MIN: 4.24 / MAX: 5.64), c 4.28 (MIN: 4.24 / MAX: 5.12); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): a 23.51 (MIN: 23.29 / MAX: 24.68), b 23.56 (MIN: 23.34 / MAX: 24.72), c 23.54 (MIN: 23.33 / MAX: 24.61); SE +/- 0.04, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better): a 47887, b 47948, c 47971; SE +/- 9.54, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better): b 31.71 (MIN: 31.56 / MAX: 33.03), c 31.66 (MIN: 31.52 / MAX: 32.14); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - int32-scalar (GIOPS, More Is Better): a 2272.62, b 2269.25, c 2269.06; SE +/- 0.34, N = 3
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better): a 20816, b 20822, c 20847; SE +/- 11.67, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better): b 7.03 (MIN: 6.97 / MAX: 7.88), c 7.04 (MIN: 6.96 / MAX: 7.83); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better): a 13154.15, b 13145.19, c 13136.79; SE +/- 4.01, N = 3
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better): b 8.01 (MIN: 7.95 / MAX: 8.95), c 8.00 (MIN: 7.96 / MAX: 8.63); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): a 10.01 (MIN: 9.88 / MAX: 11.4), b 10.00 (MIN: 9.92 / MAX: 12.35), c 10.00 (MIN: 9.91 / MAX: 11.15); SE +/- 0.02, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better): a 11.69, b 11.69, c 11.69; SE +/- 0.00, N = 3; (CXX) g++ options: -O3
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better): b 2.96 (MIN: 2.93 / MAX: 3.4), c 2.96 (MIN: 2.93 / MAX: 3.41); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better): b 3.33 (MIN: 3.3 / MAX: 3.77), c 3.33 (MIN: 3.31 / MAX: 3.81); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): b 3.15 (MIN: 3.11 / MAX: 3.88), c 3.15 (MIN: 3.11 / MAX: 3.85); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better): b 2.97 (MIN: 2.94 / MAX: 3.43), c 2.97 (MIN: 2.94 / MAX: 3.43); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): b 3.16 (MIN: 3.12 / MAX: 3.69), c 3.16 (MIN: 3.12 / MAX: 3.7); (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: FastestDet (ms, Fewer Is Better): a 3.62 (MIN: 2.7 / MAX: 4.54), b 4.05 (MIN: 4.02 / MAX: 4.35), c 4.11 (MIN: 4.08 / MAX: 4.4); SE +/- 0.45, N = 3; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
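Because some tests here are "More Is Better" (GFLOPS, GIOPS, benchmark score) and others "Fewer Is Better" (ms), comparing runs a, b and c needs a direction-aware ratio. The sketch below shows one common convention, not something the export itself computes; the helper name `percent_advantage` is hypothetical, and the input values are copied from the vkpeak fp32-scalar and NCNN CPU FastestDet results in this file:

```python
def percent_advantage(x, y, more_is_better):
    """Percent by which result x beats result y (negative if x is worse).

    For throughput-style metrics the ratio is x / y; for time-style
    metrics (fewer is better) it is inverted to y / x.
    """
    if more_is_better:
        return (x / y - 1.0) * 100.0
    return (y / x - 1.0) * 100.0


# Values copied from the results in this file.
fp32_scalar = {"a": 13190.09, "b": 12807.06, "c": 12860.56}  # GFLOPS, more is better
fastestdet_cpu = {"a": 3.62, "b": 4.05, "c": 4.11}           # ms, fewer is better

adv_fp32 = percent_advantage(fp32_scalar["a"], fp32_scalar["b"], True)
adv_det = percent_advantage(fastestdet_cpu["a"], fastestdet_cpu["c"], False)

print(f"vkpeak fp32-scalar, a vs b: {adv_fp32:+.2f}%")
print(f"NCNN CPU FastestDet, a vs c: {adv_det:+.2f}%")
```

A gap computed this way should be read against the reported spread: the CPU FastestDet result carries SE +/- 0.45 and run a ranges from 2.7 to 4.54 ms, so that difference may largely reflect run-to-run variation.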
Phoronix Test Suite v10.8.5