vulkan benchmarks
AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and AMD Radeon RX 6700 XT on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308012-PTS-VULKANBE49&rdt&grr .
System configuration (runs a, b, c):
Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB
Graphics: AMD Radeon RX 6700 XT (2855/1000MHz)
Audio: AMD Navi 21/23
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.04
Kernel: 6.4.6-060406-generic (x86_64)
Desktop: GNOME Shell 44.2
Display Server: X Server 1.21.1.7 + Wayland
OpenGL: 4.6 Mesa 23.3~git2307260600.87109c~oibaf~l (git-87109c3 2023-07-26 lunar-oibaf-ppa) (LLVM 15.0.7 DRM 3.52)
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203
Graphics Details - BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101
Security Details:
itlb_multihit: Not affected
l1tf: Not affected
mds: Not affected
meltdown: Not affected
mmio_stale_data: Not affected
retbleed: Not affected
spec_store_bypass: Mitigation of SSB disabled via prctl
spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected
srbds: Not affected
tsx_async_abort: Not affected
Test | a | b | c
vkpeak: int16-vec4 | 23123.77 | 23396.59 | 23385.44
vkpeak: int16-scalar | 13102.75 | 13070.81 | 13063.86
vkpeak: int32-vec4 | 2658.73 | 2640.08 | 2638.69
vkpeak: int32-scalar | 2272.62 | 2269.25 | 2269.06
vkpeak: fp64-vec4 | 841.80 | 836.55 | 836.16
vkpeak: fp64-scalar | 841.40 | 839.2 | 839.01
vkpeak: fp16-vec4 | 23232.42 | 23390.44 | 23387.26
vkpeak: fp16-scalar | 13154.15 | 13145.19 | 13136.79
vkpeak: fp32-vec4 | 12730.08 | 12808.59 | 12822.01
vkpeak: fp32-scalar | 13190.09 | 12807.06 | 12860.56
vkfft: FFT + iFFT C2C 1D batched in double precision | 20816 | 20822 | 20847
ncnn: CPU-v3-v3 - mobilenet-v3 | 3.18 | | 3.17
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 3.17 | | 3.2
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 4717 | 4695 | 4670
vkfft: FFT + iFFT C2C 1D batched in single precision | 47887 | 47948 | 47971
ncnn: CPU - FastestDet | 3.62 | 4.05 | 4.11
ncnn: CPU - vision_transformer | 32.49 | 31.95 | 31.77
ncnn: CPU - regnety_400m | 8.18 | 8.18 | 8.27
ncnn: CPU - squeezenet_ssd | 7.07 | 7.06 | 7.1
ncnn: CPU - yolov4-tiny | 12.90 | 12.74 | 12.81
ncnn: CPU - resnet50 | 10.20 | 10.01 | 10.11
ncnn: CPU - alexnet | 4.41 | 4.32 | 4.31
ncnn: CPU - resnet18 | 5.29 | 5.2 | 5.24
ncnn: CPU - vgg16 | 23.75 | 23.49 | 23.45
ncnn: CPU - googlenet | 7.94 | 7.82 | 7.93
ncnn: CPU - blazeface | 1.38 | 1.38 | 1.39
ncnn: CPU - efficientnet-b0 | 3.90 | 3.85 | 3.88
ncnn: CPU - mnasnet | 2.97 | 2.97 | 2.99
ncnn: CPU - shufflenet-v2 | 3.34 | 3.34 | 3.35
ncnn: CPU-v2-v2 - mobilenet-v2 | 3.16 | 3.16 | 3.18
ncnn: CPU - mobilenet | 8.05 | 7.97 | 8.02
ncnn: Vulkan GPU - FastestDet | 4.1 | 4.07 | 4.09
ncnn: Vulkan GPU - vision_transformer | 31.88 | 31.85 | 31.79
ncnn: Vulkan GPU - regnety_400m | 8.16 | 8.21 | 8
ncnn: Vulkan GPU - squeezenet_ssd | 7.09 | 7.07 | 7.06
ncnn: Vulkan GPU - yolov4-tiny | 12.84 | 12.87 | 12.81
ncnn: Vulkan GPU - resnet50 | 10.01 | 10 | 10
ncnn: Vulkan GPU - alexnet | 4.31 | 4.33 | 4.33
ncnn: Vulkan GPU - resnet18 | 5.28 | 5.23 | 5.21
ncnn: Vulkan GPU - vgg16 | 23.51 | 23.56 | 23.54
ncnn: Vulkan GPU - googlenet | 7.90 | 7.85 | 7.8
ncnn: Vulkan GPU - blazeface | 1.38 | 1.37 | 1.37
ncnn: Vulkan GPU - efficientnet-b0 | 3.86 | 3.82 | 3.83
ncnn: Vulkan GPU - mnasnet | 2.98 | 2.95 | 2.96
ncnn: Vulkan GPU - shufflenet-v2 | 3.35 | 3.33 | 3.32
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 3.17 | 3.14 | 3.13
ncnn: Vulkan GPU - mobilenet | 8.05 | 8.04 | 8.03
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 50504 | 50643 | 50596
ncnn: CPU-v3-v3-v3 - FastestDet | | 4.07 | 4.08
ncnn: CPU-v3-v3-v3 - vision_transformer | | 31.65 | 31.78
ncnn: CPU-v3-v3-v3 - regnety_400m | | 8.05 | 8.14
ncnn: CPU-v3-v3-v3 - squeezenet_ssd | | 7.03 | 7.04
ncnn: CPU-v3-v3-v3 - yolov4-tiny | | 12.98 | 12.89
ncnn: CPU-v3-v3-v3 - resnet50 | | 10.01 | 10.33
ncnn: CPU-v3-v3-v3 - alexnet | | 4.29 | 4.28
ncnn: CPU-v3-v3-v3 - resnet18 | | 5.21 | 5.26
ncnn: CPU-v3-v3-v3 - vgg16 | | 23.5 | 23.99
ncnn: CPU-v3-v3-v3 - googlenet | | 7.84 | 7.88
ncnn: CPU-v3-v3-v3 - blazeface | | 1.37 | 1.38
ncnn: CPU-v3-v3-v3 - efficientnet-b0 | | 3.82 | 3.89
ncnn: CPU-v3-v3-v3 - mnasnet | | 2.97 | 2.97
ncnn: CPU-v3-v3-v3 - shufflenet-v2 | | 3.33 | 3.34
ncnn: CPU-v3-v3-v3-v3-v3 - mobilenet-v3 | | 3.16 | 3.16
ncnn: CPU-v3-v3-v3-v2-v2 - mobilenet-v2 | | 3.15 | 3.14
ncnn: CPU-v3-v3-v3 - mobilenet | | 8.01 | 8
ncnn: Vulkan GPU-v3-v3-v3 - FastestDet | | 4.06 | 3.69
ncnn: Vulkan GPU-v3-v3-v3 - vision_transformer | | 31.71 | 31.66
ncnn: Vulkan GPU-v3-v3-v3 - regnety_400m | | 8.27 | 7.98
ncnn: Vulkan GPU-v3-v3-v3 - squeezenet_ssd | | 7.14 | 7.07
ncnn: Vulkan GPU-v3-v3-v3 - yolov4-tiny | | 12.77 | 12.86
ncnn: Vulkan GPU-v3-v3-v3 - resnet50 | | 9.87 | 10.03
ncnn: Vulkan GPU-v3-v3-v3 - alexnet | | 4.42 | 4.3
ncnn: Vulkan GPU-v3-v3-v3 - resnet18 | | 5.42 | 5.23
ncnn: Vulkan GPU-v3-v3-v3 - vgg16 | | 23.42 | 23.54
ncnn: Vulkan GPU-v3-v3-v3 - googlenet | | 7.97 | 7.83
ncnn: Vulkan GPU-v3-v3-v3 - blazeface | | 1.37 | 1.36
ncnn: Vulkan GPU-v3-v3-v3 - efficientnet-b0 | | 3.85 | 3.82
ncnn: Vulkan GPU-v3-v3-v3 - mnasnet | | 2.96 | 2.96
ncnn: Vulkan GPU-v3-v3-v3 - shufflenet-v2 | | 3.33 | 3.33
ncnn: Vulkan GPU-v3-v3-v3-v3-v3 - mobilenet-v3 | | 3.16 | 3.17
ncnn: Vulkan GPU-v3-v3-v3-v2-v2 - mobilenet-v2 | | 3.15 | 3.15
ncnn: Vulkan GPU-v3-v3-v3 - mobilenet | | 8 | 7.95
vkfft: FFT + iFFT C2C Bluestein in single precision | 11340 | 11273 | 11311
vkfft: FFT + iFFT C2C 1D batched in half precision | 91597 | 91812 | 91744
vkfft: FFT + iFFT C2C multidimensional in single precision | 33001 | 32751 | 32812
vkfft: FFT + iFFT R2C / C2R | 42105 | 42163 | 43021
vkresample: 2x - Single | 11.686 | 11.69 | 11.688
vkpeak 20230730, int16-vec4 (GIOPS, More Is Better): a: 23123.77, b: 23396.59, c: 23385.44. SE +/- 21.55, N = 3.
vkpeak 20230730, int16-scalar (GIOPS, More Is Better): a: 13102.75, b: 13070.81, c: 13063.86. SE +/- 1.30, N = 3.
vkpeak 20230730, int32-vec4 (GIOPS, More Is Better): a: 2658.73, b: 2640.08, c: 2638.69. SE +/- 0.26, N = 3.
vkpeak 20230730, int32-scalar (GIOPS, More Is Better): a: 2272.62, b: 2269.25, c: 2269.06. SE +/- 0.34, N = 3.
vkpeak 20230730, fp64-vec4 (GFLOPS, More Is Better): a: 841.80, b: 836.55, c: 836.16. SE +/- 0.32, N = 3.
vkpeak 20230730, fp64-scalar (GFLOPS, More Is Better): a: 841.40, b: 839.20, c: 839.01. SE +/- 0.22, N = 3.
vkpeak 20230730, fp16-vec4 (GFLOPS, More Is Better): a: 23232.42, b: 23390.44, c: 23387.26. SE +/- 5.96, N = 3.
vkpeak 20230730, fp16-scalar (GFLOPS, More Is Better): a: 13154.15, b: 13145.19, c: 13136.79. SE +/- 4.01, N = 3.
vkpeak 20230730, fp32-vec4 (GFLOPS, More Is Better): a: 12730.08, b: 12808.59, c: 12822.01. SE +/- 1.81, N = 3.
vkpeak 20230730, fp32-scalar (GFLOPS, More Is Better): a: 13190.09, b: 12807.06, c: 12860.56. SE +/- 4.18, N = 3.
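The run-to-run differences in the vkpeak figures are easier to judge as percentages. The following is a minimal sketch, not part of the exported result, that expresses runs b and c relative to run a for the fp32-scalar test. The helper name `percent_vs_baseline` and the choice of run a as the baseline are assumptions made for illustration.

```python
# fp32-scalar results in GFLOPS (more is better), copied from the vkpeak result above.
fp32_scalar = {"a": 13190.09, "b": 12807.06, "c": 12860.56}


def percent_vs_baseline(results, baseline):
    """Return each run's difference from the baseline run, in percent.

    Hypothetical helper: positive values mean the run scored higher than the baseline.
    """
    base = results[baseline]
    return {run: (value - base) / base * 100.0 for run, value in results.items()}


deltas = percent_vs_baseline(fp32_scalar, "a")
for run, delta in deltas.items():
    print(f"run {run}: {delta:+.2f}% vs run a")
```

The same helper can be applied to any "More Is Better" test; for "Fewer Is Better" tests such as the NCNN latencies, a negative percentage would indicate the faster run.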
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better): a: 20816, b: 20822, c: 20847. SE +/- 11.67, N = 3. (CXX) g++ options: -O3
NCNN 20230517, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a: 3.18 (MIN: 3.14 / MAX: 3.82), c: 3.17 (MIN: 3.15 / MAX: 3.74). SE +/- 0.00, N = 2. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a: 3.17 (MIN: 3.11 / MAX: 3.73), c: 3.20 (MIN: 3.16 / MAX: 3.68). SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better): a: 4717, b: 4695, c: 4670. SE +/- 0.33, N = 3. (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better): a: 47887, b: 47948, c: 47971. SE +/- 9.54, N = 3. (CXX) g++ options: -O3
NCNN 20230517, Target: CPU - Model: FastestDet (ms, Fewer Is Better): a: 3.62 (MIN: 2.7 / MAX: 4.54), b: 4.05 (MIN: 4.02 / MAX: 4.35), c: 4.11 (MIN: 4.08 / MAX: 4.4). SE +/- 0.45, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: vision_transformer (ms, Fewer Is Better): a: 32.49 (MIN: 31.67 / MAX: 40.11), b: 31.95 (MIN: 31.79 / MAX: 32.33), c: 31.77 (MIN: 31.61 / MAX: 35.68). SE +/- 0.29, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: regnety_400m (ms, Fewer Is Better): a: 8.18 (MIN: 8.07 / MAX: 9.68), b: 8.18 (MIN: 8.12 / MAX: 8.86), c: 8.27 (MIN: 8.22 / MAX: 9.18). SE +/- 0.04, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): a: 7.07 (MIN: 7.01 / MAX: 8.07), b: 7.06 (MIN: 7.01 / MAX: 7.55), c: 7.10 (MIN: 7.05 / MAX: 7.65). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): a: 12.90 (MIN: 12.69 / MAX: 15.88), b: 12.74 (MIN: 12.66 / MAX: 13.28), c: 12.81 (MIN: 12.74 / MAX: 13.2). SE +/- 0.11, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: resnet50 (ms, Fewer Is Better): a: 10.20 (MIN: 9.84 / MAX: 12.48), b: 10.01 (MIN: 9.85 / MAX: 11.06), c: 10.11 (MIN: 9.95 / MAX: 16.18). SE +/- 0.23, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: alexnet (ms, Fewer Is Better): a: 4.41 (MIN: 4.24 / MAX: 5.16), b: 4.32 (MIN: 4.26 / MAX: 5.15), c: 4.31 (MIN: 4.26 / MAX: 4.98). SE +/- 0.11, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: resnet18 (ms, Fewer Is Better): a: 5.29 (MIN: 5.09 / MAX: 6.29), b: 5.20 (MIN: 5.1 / MAX: 5.9), c: 5.24 (MIN: 5.15 / MAX: 6.09). SE +/- 0.07, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: vgg16 (ms, Fewer Is Better): a: 23.75 (MIN: 23.31 / MAX: 25.12), b: 23.49 (MIN: 23.36 / MAX: 24.62), c: 23.45 (MIN: 23.26 / MAX: 24.51). SE +/- 0.30, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: googlenet (ms, Fewer Is Better): a: 7.94 (MIN: 7.71 / MAX: 8.73), b: 7.82 (MIN: 7.73 / MAX: 8.65), c: 7.93 (MIN: 7.82 / MAX: 8.91). SE +/- 0.11, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: blazeface (ms, Fewer Is Better): a: 1.38 (MIN: 1.35 / MAX: 2.06), b: 1.38 (MIN: 1.35 / MAX: 1.67), c: 1.39 (MIN: 1.36 / MAX: 1.53). SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): a: 3.90 (MIN: 3.82 / MAX: 4.51), b: 3.85 (MIN: 3.81 / MAX: 4.42), c: 3.88 (MIN: 3.84 / MAX: 4.41). SE +/- 0.05, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: mnasnet (ms, Fewer Is Better): a: 2.97 (MIN: 2.92 / MAX: 3.48), b: 2.97 (MIN: 2.93 / MAX: 3.45), c: 2.99 (MIN: 2.96 / MAX: 3.44). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): a: 3.34 (MIN: 3.3 / MAX: 3.85), b: 3.34 (MIN: 3.31 / MAX: 3.77), c: 3.35 (MIN: 3.31 / MAX: 3.8). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a: 3.16 (MIN: 3.1 / MAX: 3.8), b: 3.16 (MIN: 3.11 / MAX: 3.61), c: 3.18 (MIN: 3.13 / MAX: 3.84). SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: mobilenet (ms, Fewer Is Better): a: 8.05 (MIN: 7.97 / MAX: 9.07), b: 7.97 (MIN: 7.94 / MAX: 8.26), c: 8.02 (MIN: 7.98 / MAX: 8.33). SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better): a: 4.10 (MIN: 4.06 / MAX: 4.81), b: 4.07 (MIN: 4.04 / MAX: 4.53), c: 4.09 (MIN: 4.05 / MAX: 5.5). SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better): a: 31.88 (MIN: 31.55 / MAX: 37.47), b: 31.85 (MIN: 31.69 / MAX: 33.06), c: 31.79 (MIN: 31.63 / MAX: 35.57). SE +/- 0.09, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better): a: 8.16 (MIN: 7.9 / MAX: 8.99), b: 8.21 (MIN: 8.14 / MAX: 8.84), c: 8.00 (MIN: 7.94 / MAX: 8.88). SE +/- 0.11, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): a: 7.09 (MIN: 6.98 / MAX: 7.95), b: 7.07 (MIN: 7 / MAX: 8.07), c: 7.06 (MIN: 7 / MAX: 8.03). SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better): a: 12.84 (MIN: 12.69 / MAX: 15.33), b: 12.87 (MIN: 12.76 / MAX: 13.73), c: 12.81 (MIN: 12.73 / MAX: 13.08). SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): a: 10.01 (MIN: 9.88 / MAX: 11.4), b: 10.00 (MIN: 9.92 / MAX: 12.35), c: 10.00 (MIN: 9.91 / MAX: 11.15). SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): a: 4.31 (MIN: 4.24 / MAX: 5.2), b: 4.33 (MIN: 4.28 / MAX: 5.16), c: 4.33 (MIN: 4.26 / MAX: 10.59). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): a: 5.28 (MIN: 5.17 / MAX: 6.16), b: 5.23 (MIN: 5.13 / MAX: 6.18), c: 5.21 (MIN: 5.11 / MAX: 6.04). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): a: 23.51 (MIN: 23.29 / MAX: 24.68), b: 23.56 (MIN: 23.34 / MAX: 24.72), c: 23.54 (MIN: 23.33 / MAX: 24.61). SE +/- 0.04, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): a: 7.90 (MIN: 7.74 / MAX: 9.54), b: 7.85 (MIN: 7.76 / MAX: 8.76), c: 7.80 (MIN: 7.72 / MAX: 8.74). SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): a: 1.38 (MIN: 1.34 / MAX: 1.85), b: 1.37 (MIN: 1.35 / MAX: 1.75), c: 1.37 (MIN: 1.35 / MAX: 1.82). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): a: 3.86 (MIN: 3.8 / MAX: 4.6), b: 3.82 (MIN: 3.78 / MAX: 4.39), c: 3.83 (MIN: 3.79 / MAX: 4.61). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): a: 2.98 (MIN: 2.92 / MAX: 4.03), b: 2.95 (MIN: 2.92 / MAX: 3.42), c: 2.96 (MIN: 2.93 / MAX: 3.41). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): a: 3.35 (MIN: 3.29 / MAX: 3.85), b: 3.33 (MIN: 3.3 / MAX: 3.59), c: 3.32 (MIN: 3.29 / MAX: 4.19). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a: 3.17 (MIN: 3.09 / MAX: 3.78), b: 3.14 (MIN: 3.1 / MAX: 3.73), c: 3.13 (MIN: 3.08 / MAX: 3.85). SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): a: 8.05 (MIN: 7.95 / MAX: 8.89), b: 8.04 (MIN: 7.95 / MAX: 14.33), c: 8.03 (MIN: 7.98 / MAX: 8.84). SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
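To compare the two NCNN targets directly, the per-model latencies can be divided pairwise. The sketch below is illustrative only and uses a handful of run a values copied from the CPU and Vulkan GPU results above; the helper name `latency_ratio` and the selection of models are assumptions, not something the export provides.

```python
# Run "a" average latencies in ms (fewer is better), copied from the NCNN results above.
cpu_ms = {
    "FastestDet": 3.62,
    "vision_transformer": 32.49,
    "resnet50": 10.20,
    "vgg16": 23.75,
    "mobilenet": 8.05,
}
vulkan_ms = {
    "FastestDet": 4.10,
    "vision_transformer": 31.88,
    "resnet50": 10.01,
    "vgg16": 23.51,
    "mobilenet": 8.05,
}


def latency_ratio(gpu, cpu):
    """Return the Vulkan/CPU latency ratio per model (hypothetical helper).

    A ratio below 1.0 means the Vulkan GPU target reported the lower latency.
    """
    return {model: gpu[model] / cpu[model] for model in cpu if model in gpu}


ratios = latency_ratio(vulkan_ms, cpu_ms)
for model, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{model:20s} Vulkan/CPU = {ratio:.3f}")
```

Extending the two dictionaries with the remaining models from the results above gives the full comparison for run a; the same approach works for runs b and c.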
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better): a: 50504, b: 50643, c: 50596. SE +/- 8.89, N = 3. (CXX) g++ options: -O3
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better): b: 4.07 (MIN: 4.03 / MAX: 5.83), c: 4.08 (MIN: 4.05 / MAX: 4.36). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better): b: 31.65 (MIN: 31.53 / MAX: 32.23), c: 31.78 (MIN: 31.64 / MAX: 34.51). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better): b: 8.05 (MIN: 8 / MAX: 8.58), c: 8.14 (MIN: 8.08 / MAX: 8.69). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better): b: 7.03 (MIN: 6.97 / MAX: 7.88), c: 7.04 (MIN: 6.96 / MAX: 7.83). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better): b: 12.98 (MIN: 12.73 / MAX: 35.55), c: 12.89 (MIN: 12.84 / MAX: 13.19). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better): b: 10.01 (MIN: 9.89 / MAX: 10.86), c: 10.33 (MIN: 10.16 / MAX: 13.97). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better): b: 4.29 (MIN: 4.24 / MAX: 5.64), c: 4.28 (MIN: 4.24 / MAX: 5.12). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better): b: 5.21 (MIN: 5.12 / MAX: 6.22), c: 5.26 (MIN: 5.18 / MAX: 6.27). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better): b: 23.50 (MIN: 23.3 / MAX: 24.41), c: 23.99 (MIN: 23.72 / MAX: 24.98). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better): b: 7.84 (MIN: 7.74 / MAX: 8.7), c: 7.88 (MIN: 7.79 / MAX: 8.78). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better): b: 1.37 (MIN: 1.35 / MAX: 1.52), c: 1.38 (MIN: 1.36 / MAX: 1.58). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better): b: 3.82 (MIN: 3.79 / MAX: 4.34), c: 3.89 (MIN: 3.83 / MAX: 9.72). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better): b: 2.97 (MIN: 2.94 / MAX: 3.43), c: 2.97 (MIN: 2.94 / MAX: 3.43). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better): b: 3.33 (MIN: 3.3 / MAX: 3.79), c: 3.34 (MIN: 3.32 / MAX: 3.79). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): b: 3.16 (MIN: 3.12 / MAX: 3.69), c: 3.16 (MIN: 3.12 / MAX: 3.7). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): b: 3.15 (MIN: 3.1 / MAX: 3.68), c: 3.14 (MIN: 3.1 / MAX: 3.67). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better): b: 8.01 (MIN: 7.95 / MAX: 8.95), c: 8.00 (MIN: 7.96 / MAX: 8.63). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better): b: 4.06 (MIN: 4.03 / MAX: 4.3), c: 3.69 (MIN: 3.66 / MAX: 3.92). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better): b: 31.71 (MIN: 31.56 / MAX: 33.03), c: 31.66 (MIN: 31.52 / MAX: 32.14). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better): b: 8.27 (MIN: 8.22 / MAX: 9.01), c: 7.98 (MIN: 7.93 / MAX: 8.65). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better): b: 7.14 (MIN: 7.06 / MAX: 7.95), c: 7.07 (MIN: 7.01 / MAX: 7.75). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better): b: 12.77 (MIN: 12.69 / MAX: 13.71), c: 12.86 (MIN: 12.76 / MAX: 13.98). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better): b: 9.87 (MIN: 9.79 / MAX: 10.73), c: 10.03 (MIN: 9.93 / MAX: 10.96). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better): b: 4.42 (MIN: 4.32 / MAX: 5.1), c: 4.30 (MIN: 4.26 / MAX: 5.16). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better): b: 5.42 (MIN: 5.36 / MAX: 6.27), c: 5.23 (MIN: 5.11 / MAX: 6.03). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better): b: 23.42 (MIN: 23.27 / MAX: 24.32), c: 23.54 (MIN: 23.32 / MAX: 24.54). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better): b: 7.97 (MIN: 7.89 / MAX: 8.7), c: 7.83 (MIN: 7.74 / MAX: 8.61). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better): b: 1.37 (MIN: 1.35 / MAX: 1.39), c: 1.36 (MIN: 1.34 / MAX: 1.44). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better): b: 3.85 (MIN: 3.82 / MAX: 4.48), c: 3.82 (MIN: 3.78 / MAX: 4.53). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better): b: 2.96 (MIN: 2.93 / MAX: 3.4), c: 2.96 (MIN: 2.93 / MAX: 3.41). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better): b: 3.33 (MIN: 3.3 / MAX: 3.77), c: 3.33 (MIN: 3.31 / MAX: 3.81). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): b: 3.16 (MIN: 3.11 / MAX: 3.75), c: 3.17 (MIN: 3.11 / MAX: 8.89). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): b: 3.15 (MIN: 3.11 / MAX: 3.88), c: 3.15 (MIN: 3.11 / MAX: 3.85). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: Vulkan GPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better): b: 8.00 (MIN: 7.95 / MAX: 8.99), c: 7.95 (MIN: 7.89 / MAX: 8.79). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better): a: 11340, b: 11273, c: 11311. SE +/- 62.67, N = 3. (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better): a: 91597, b: 91812, c: 91744. SE +/- 83.55, N = 3. (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better): a: 33001, b: 32751, c: 32812. SE +/- 57.83, N = 3. (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better): a: 42105, b: 42163, c: 43021. SE +/- 200.55, N = 3. (CXX) g++ options: -O3
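One way to condense the eight VkFFT scores into a single figure per run is a geometric mean, which weights each test's relative change equally regardless of its absolute scale. This is only a sketch of that idea using the scores listed in this result; the export itself does not report such a summary, and the variable names are illustrative.

```python
from statistics import geometric_mean

# VkFFT benchmark scores (more is better) for the eight tests in this result, in order:
# double 1D batched, double Bluestein, single 1D batched, single no reshuffling,
# single Bluestein, half 1D batched, single multidimensional, R2C / C2R.
vkfft_scores = {
    "a": [20816, 4717, 47887, 50504, 11340, 91597, 33001, 42105],
    "b": [20822, 4695, 47948, 50643, 11273, 91812, 32751, 42163],
    "c": [20847, 4670, 47971, 50596, 11311, 91744, 32812, 43021],
}

# One summary score per run.
geo = {run: geometric_mean(scores) for run, scores in vkfft_scores.items()}

baseline = geo["a"]  # assumption: run a is used as the reference
for run, value in geo.items():
    change = (value / baseline - 1.0) * 100.0
    print(f"run {run}: geometric mean {value:.1f} ({change:+.2f}% vs run a)")
```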
VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better): a: 11.69, b: 11.69, c: 11.69. SE +/- 0.00, N = 3. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5