vulkan-benchmarks
AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and NVIDIA GeForce RTX 4090 24GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308069-PTS-VULKANBE16&rdt&grs
vulkan-benchmarks

Runs: a, b, c, d, e, f, g, h, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090

System configuration (first column of the table):
  Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
  Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
  Chipset: AMD Device 14d8
  Memory: 32GB
  Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB
  Graphics: AMD Radeon RX 6700 XT (2855/1000MHz)
  Audio: AMD Navi 21/23
  Monitor: ASUS MG28U
  Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
  OS: Ubuntu 23.04
  Kernel: 6.4.6-060406-generic (x86_64)
  Desktop: GNOME Shell 44.2
  Display Server: X Server 1.21.1.7 + Wayland
  OpenGL: 4.6 Mesa 23.3~git2307260600.87109c~oibaf~l (git-87109c3 2023-07-26 lunar-oibaf-ppa) (LLVM 15.0.7 DRM 3.52)
  Compiler: GCC 12.2.0
  File-System: ext4
  Screen Resolution: 3840x2160

Values that change in later columns, in export order:
  - Graphics: MSI NVIDIA GeForce RTX 4060 8GB; Audio: NVIDIA Device 22be; Display Server: X Server 1.21.1.7; Display Driver: NVIDIA 535.86.05; OpenGL: 4.6.0
  - Graphics: eVGA NVIDIA GeForce RTX 3060 12GB; Audio: NVIDIA GA106 HD Audio
  - Graphics: NVIDIA GeForce RTX 3060 Ti 8GB; Audio: NVIDIA GA104 HD Audio; Screen Resolution: 2560x1440
  - Graphics: NVIDIA GeForce RTX 4080 16GB; Audio: NVIDIA Device 22bb; Screen Resolution: 3840x2160
  - Graphics: NVIDIA GeForce RTX 3090 24GB; Audio: NVIDIA GA102 HD Audio
  - Graphics: NVIDIA GeForce RTX 3070 8GB; Audio: NVIDIA GA104 HD Audio; Screen Resolution: 2560x1440
  - Graphics: NVIDIA GeForce RTX 3070 Ti 8GB
  - Graphics: NVIDIA GeForce RTX 4090 24GB; Audio: NVIDIA AD102 HD Audio; Screen Resolution: 3840x2160

Kernel Details
  Transparent Huge Pages: madvise

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  a, b, c: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203
  d, e, f, g, h, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203

Graphics Details
  a, b, c: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101
  d, e: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 95.07.31.00.e3
  f, g, h: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 94.06.14.40.46
  i: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.25.00.2c
  4080, 4080 rep, 4080 xxx, 4080 zzz: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.0e.00.04
  3090, 3090 rep: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 94.02.27.00.02
  3070: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.25.00.2b
  RTX 3070 Ti: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.5b.00.02
  4090, 4090 rep, nv 4090: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.01

Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
vulkan-benchmarks vkpeak: int32-scalar vkpeak: int32-vec4 vkfft: FFT + iFFT C2C 1D batched in double precision vkfft: FFT + iFFT C2C Bluestein benchmark in double precision vkpeak: fp64-vec4 vkpeak: fp64-scalar vkpeak: int16-vec4 vkresample: 2x - Single vkfft: FFT + iFFT C2C 1D batched in single precision vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling vkfft: FFT + iFFT C2C 1D batched in half precision ncnn: CPU - FastestDet vkfft: FFT + iFFT R2C / C2R vkfft: FFT + iFFT C2C multidimensional in single precision ncnn: CPU-v3-v3-v3 - FastestDet vkpeak: fp32-scalar vkpeak: fp32-vec4 ncnn: Vulkan GPU-v3-v3-v3-v3-v3 - mobilenet-v3 vkpeak: fp16-scalar ncnn: CPU-v2-v2 - mobilenet-v2 vkpeak: fp16-vec4 vkpeak: int16-scalar vkresample: 2x - Double ncnn: Vulkan GPU-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - blazeface ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: CPU - mobilenet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mnasnet vkfft: FFT + iFFT C2C Bluestein in single precision ncnn: CPU - resnet18 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - alexnet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3-v3 - googlenet ncnn: Vulkan GPU - mobilenet ncnn: CPU-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: Vulkan GPU-v3-v3-v3 - resnet18 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: Vulkan GPU - alexnet ncnn: CPU - mnasnet ncnn: Vulkan GPU - googlenet ncnn: CPU-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - 
efficientnet-b0 ncnn: CPU-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: CPU-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3 - shufflenet-v2 ncnn: Vulkan GPU - resnet18 ncnn: CPU-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3-v3-v3-v3 - googlenet ncnn: CPU - vgg16 ncnn: CPU-v3-v3-v3 - resnet50 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: CPU - efficientnet-b0 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - googlenet ncnn: CPU-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - googlenet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - blazeface ncnn: CPU - vision_transformer ncnn: Vulkan GPU-v3-v3-v3 - mnasnet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - googlenet ncnn: Vulkan GPU-v3-v3-v3 - vgg16 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3 - efficientnet-b0 ncnn: Vulkan GPU - efficientnet-b0 ncnn: CPU-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: Vulkan GPU - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: CPU - blazeface ncnn: Vulkan GPU-v3-v3-v3 - resnet50 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: Vulkan GPU-v3-v3-v3 - FastestDet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU - googlenet ncnn: CPU-v3-v3-v3 - mnasnet ncnn: Vulkan GPU-v3-v3-v3 - vision_transformer ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: Vulkan GPU-v3-v3-v3 - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3 - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: CPU-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: 
CPU-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: CPU - alexnet ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3 - regnety_400m ncnn: CPU - regnety_400m ncnn: CPU-v3-v3-v3 - mobilenet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU - vision_transformer ncnn: CPU-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3 - googlenet ncnn: CPU - yolov4-tiny ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - googlenet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: CPU - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3 - resnet18 ncnn: Vulkan GPU - FastestDet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: CPU-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU-v3-v3-v3 
- mobilenet ncnn: CPU-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: CPU-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3 - efficientnet-b0 ncnn: CPU - shufflenet-v2 ncnn: Vulkan GPU - regnety_400m ncnn: CPU-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - blazeface ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: CPU - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: CPU-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: Vulkan GPU-v3-v3-v3 - shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - blazeface a b c d e f g h i 4080 4080 rep 4080 xxx 4080 zzz 3090 3090 rep 3070 RTX 3070 Ti 4090 4090 rep nv 4090 2272.62 2658.73 20816 4717 841.80 841.40 23123.77 11.686 47887 50504 91597 3.62 42105 33001 13190.09 12730.08 13154.15 3.16 23232.42 13102.75 1.38 8.05 11340 5.29 8.05 4.31 2.97 7.90 3.17 5.28 23.75 3.90 3.18 32.49 3.86 10.01 1.38 7.94 7.09 4.41 12.84 8.18 31.88 12.90 10.20 4.1 3.35 3.17 23.51 3.34 8.16 7.07 2.98 2269.25 2640.08 20822 4695 836.55 839.2 23396.59 11.69 47948 50643 91812 4.05 42163 32751 4.07 12807.06 12808.59 3.16 13145.19 3.16 23390.44 13070.81 1.37 3.15 1.37 7.97 11273 5.2 7.97 8.04 1.37 31.65 4.29 5.42 7.03 4.33 2.97 7.85 3.14 8.05 3.33 5.23 23.49 
10.01 3.85 31.95 2.96 23.42 3.82 3.82 10 1.38 9.87 4.06 7.82 2.97 31.71 12.77 3.15 7.14 4.42 12.98 7.07 4.32 12.87 8.27 8.18 8.01 31.85 7.84 12.74 10.01 5.21 4.07 3.33 23.5 23.56 8 3.85 3.34 8.21 7.06 2.95 3.16 3.33 2269.06 2638.69 20847 4670 836.16 839.01 23385.44 11.688 47971 50596 91744 4.11 43021 32812 4.08 12860.56 12822.01 3.17 13136.79 3.18 23387.26 13063.86 1.36 3.14 1.37 8.02 11311 5.24 7.83 8.03 1.38 31.78 4.28 5.23 7.04 4.33 2.99 7.8 3.13 8.14 3.34 5.21 23.45 10.33 3.88 3.17 31.77 2.96 23.54 3.89 3.83 10 1.39 10.03 3.69 7.93 2.97 31.66 12.86 3.15 7.07 4.3 12.89 7.06 4.31 12.81 7.98 8.27 8 31.79 7.88 12.81 10.11 5.26 4.09 3.32 23.99 3.2 23.54 7.95 3.82 3.35 8 7.1 2.96 3.16 3.33 8520.02 8465.82 12143 2346 267.74 267.43 7352.85 32.855 42645 43365 85181 4.08 35399 36328 8531.96 11251.17 8412.33 3.17 16864.47 5676.02 500.014 1.38 8.10 10719 5.23 8.02 4.31 2.98 7.85 3.16 5.23 23.51 3.85 3.17 32.43 3.87 10.10 1.38 7.85 7.08 4.30 12.85 8.23 32.12 12.95 10.00 4.11 3.35 3.17 23.56 3.35 8.17 7.09 2.97 8505.20 8465.71 12168 2343 267.25 267.41 7336.25 32.850 42651 43365 85191 35304 37090 8515.58 11231.72 8397.80 16865.29 5675.99 500.016 1.38 10560 8.04 4.31 7.85 3.14 5.22 3.84 10.10 7.05 12.87 31.93 4.08 3.33 3.18 23.60 8.10 2.96 6827.92 6800.17 10561 1814 214.23 214.17 5959.75 26.738 56476 57110 104146 4.22 26593 26238 3.85 6837.94 9006.57 3.15 6812.52 3.15 13440.97 4480.59 500.01 1.37 3.16 1.38 8.45 7571 5.69 7.94 8.27 1.43 33.47 4.83 5.3 6.97 4.64 2.97 8.15 3.13 8.34 3.4 5.48 24.19 11.05 3.87 33.56 2.96 24.12 4.04 3.86 10.26 1.37 10.25 4.2 7.92 3.12 33.36 14.34 3.16 7.09 4.35 13.07 7.08 4.36 13.17 8.5 8.08 8.56 32.92 8.07 13.32 11.05 6.13 4.24 3.4 24.45 3.14 24.55 8.65 3.85 3.55 8.34 7.23 2.97 3.15 3.33 6824.29 6794.92 10548 1818 213.95 213.96 5956.38 26.769 56455 57094 104171 2.57 26638 26541 3.97 6812.99 9002.59 3.15 6811.35 3.16 13438.47 4479.22 500.011 1.38 3.15 1.41 22.74 7574 5.55 3.17 8.35 8.17 1.37 33.39 4.71 5.5 7.26 4.87 3 8.96 3.18 8.07 3.92 3.38 6.22 
23.78 11.25 3.34 3.91 3.14 32.73 2.98 7.96 24.71 24.04 4.14 3.86 10.33 10.34 1.38 10.43 5.28 3.97 4.35 7.98 3.05 32.68 13.14 13.35 3.17 7.31 4.86 13.08 8.5 7.1 8.38 4.32 13.64 7.99 8.3 8.98 32.42 33.32 3.84 7.14 9.15 17.23 10.72 5.48 4.07 3.35 24.92 3.16 24.2 8.2 4.63 3.59 8.36 7.13 2.98 3.16 2.97 3.35 1.38 6800.6 6772.98 10572 210.96 213.37 5978.38 56431 104298 26524 6810.73 9036.17 6838.32 13490.24 4495.98 7622 14780 2417 20.93 69738 71163 132270 2.66 33727 34686 5.69 3.26 3.52 500.006 1.25 3.3 1.4 8.37 10061 5.6 3.29 10.19 10.4 1.28 36.55 5.01 5.85 8.16 6.53 2.74 8.75 3.29 8.21 3.83 3.52 5.82 30.96 14.05 3.43 4.05 3.26 37.8 2.99 10.47 29.07 27.43 4.68 5.88 13.1 12.96 1.4 11.15 5.88 4.43 5.1 10.3 3.39 38.33 13.77 15.43 3.28 8.33 4.99 15.11 10.02 7.46 8.46 5.3 15.16 7.99 9.94 10.08 36.42 38.01 4.19 7.21 10.17 14.65 12.09 5.86 5.14 3.49 29.12 4.87 3.26 27.83 9.05 4.21 5.03 9.88 8.96 3.2 3.29 3.07 3.36 1.41 34974 5579 13.136 104556 106210 211076 4.42 66473 65869 4.2 3.27 3.28 288.201 1.42 3.31 1.41 8.73 4.98 17121 5.69 7.71 3.28 8.49 8.44 1.44 34.2 4.66 5.65 4.04 7.64 4.19 4.62 3.05 8.42 3.26 8.33 4.28 3.46 5.67 25.37 11.11 3.46 3.99 3.24 1.44 35.07 3.09 8.4 25.04 25.48 4.02 4.01 11.4 11.48 10.81 1.4 10.95 5.61 4.2 4.61 8.42 3.08 34.13 13.86 13.79 3.29 7.66 4.69 13.81 8.84 7.58 8.61 4.75 13.79 8.45 8.24 8.43 13.93 35.56 34.91 35.6 4.06 7.66 8.67 5.92 8.45 13.85 8.79 11.16 5.7 4.2 3.43 8.43 25.67 25.1 3.1 25 8.43 4.05 3.41 8.39 3.26 7.73 3.29 3.07 3.28 3.48 3.06 3.44 1.43 35038 5583 13.136 104491 106205 211058 4.34 68279 70068 4.18 3.27 3.28 288.166 1.45 3.27 1.41 8.41 4.64 17287 5.68 7.62 3.27 8.58 8.57 1.42 34.27 4.65 5.69 4.01 7.59 4.17 4.65 3.06 8.4 4.2 3.29 8.57 4.14 4.68 3.43 5.61 5.64 8.52 26.11 10.79 3.39 4.09 3.26 4.07 1.42 35.28 3.07 8.49 25.04 25.05 4.02 4.05 3.47 10.8 11.07 10.84 1.42 10.84 5.61 4.09 4.72 8.52 3.09 34.29 13.55 13.68 3.3 7.63 4.69 13.55 8.4 3.3 8.72 7.64 8.24 4.67 13.67 8.44 8.67 8.48 13.73 35.07 13.71 34.1 10.86 33.93 3.98 7.55 8.35 5.67 
8.52 14.03 8.42 11.76 25.01 5.63 4.21 3.44 8.38 25.56 24.91 3.28 3.24 3.06 25.04 8.46 34.22 3.09 8.45 4.04 3.43 8.56 7.67 3.31 1.43 7.86 3.27 3.08 3.44 3.03 3.44 1.41 35071 5587 13.137 104528 106099 210713 4.17 69068 67887 4.19 3.33 3.28 288.039 1.42 3.3 1.42 8.37 4.71 17343 5.56 7.27 3.27 8.99 8.37 1.42 34.37 4.65 5.89 3.97 7.64 3.75 4.68 3.06 8.42 3.8 3.26 8.75 4.17 4.69 3.47 5.62 5.78 8.32 25.03 10.91 3.43 4.04 3.27 4.01 1.31 34.27 3.13 8.43 26.08 25.4 4.06 4.04 3.4 10.91 11.22 10.94 1.42 11.5 5.67 4.31 4.67 8.38 3.07 35.4 13.62 13.95 3.4 7.7 5.21 13.69 8.46 3.2 8.38 7.62 8.52 4.68 13.65 8.58 8.45 8.44 13.52 34.19 13.63 34.23 11.26 33.9 4.02 7.67 8.25 5.65 8.5 13.6 8.26 10.82 25.44 5.66 4.2 3.5 8.31 25.33 25 3.08 3.26 2.98 25.01 8.88 34.14 3 8.34 4.22 3.45 8.56 7.27 3.05 1.32 7.62 3.14 3.07 3.31 3.34 3.05 3.51 1.42 35058 5584 13.126 104543 105926 210991 4.16 67689 70040 4.04 3.2 3.25 288.028 1.39 3.28 1.42 8.38 4.7 17185 5.63 8.06 3.28 8.37 8.47 1.41 34.47 4.67 5.59 4.05 7.35 4.61 4.68 3.04 8.41 3.82 3.29 8.47 4.2 4.66 3.43 5.59 5.77 8.29 25.4 11.21 3.43 3.99 3.24 3.95 1.41 34.1 3.01 8.42 25.26 25.16 4.03 4.04 3.36 10.91 12.5 11.07 1.4 11.09 5.6 4.12 4.68 8.4 3.06 34.05 13.61 13.62 3.23 7.51 4.65 13.83 8.46 3.16 8.34 7.63 8.49 4.69 13.8 8.1 8.37 8.4 15.26 34.32 13.42 34.1 11.1 35.36 4.01 7.62 8.37 5.74 8.55 13.63 8.55 11.1 25.26 5.71 4.79 3.46 9.19 26.09 25.45 3.28 3.06 3.24 3.08 25.82 8.38 34.47 2.96 8.25 3.95 3.42 8.58 7.25 3.26 1.31 7.55 3.28 3.06 3.27 3.44 3.08 3.37 1.42 20909.02 20820.09 30945 4282 653.15 653.13 16886.66 10.399 141357 143969 255207 4.21 55347 51005 4.04 21269.72 27797.8 20845.09 3.17 41149.1 13710.88 371.699 1.36 3.14 1.39 8.6 4.32 2.97 14406 5.21 7.05 3.15 7.82 8.07 1.36 1.39 31.89 4.31 5.2 5.23 3.88 7.04 3.83 4.35 2.99 7.87 4.1 3.18 4.3 3.86 7.95 4.03 4.33 3.32 5.27 5.2 7.9 23.43 10.05 3.33 3.87 3.18 3.88 7.83 1.38 33.01 2.97 7.86 23.5 23.55 3.83 3.88 3.36 10.38 10.07 10.1 10.03 1.38 9.97 5.19 4.04 4.31 7.86 2.95 31.86 13.1 8.01 12.88 
3.16 7.04 4.3 12.87 8.11 3.16 8.22 7.16 7.99 4.3 12.88 8.01 8.25 8.06 12.97 8.2 31.94 12.82 33.22 10.03 31.94 3.83 7.05 7.04 23.5 8.33 5.21 7.83 14.26 7.84 32.16 10.3 23.58 5.19 4.11 3.39 8.03 23.51 23.5 3.19 3.16 2.94 23.55 8 32.1 2.98 8.07 12.86 3.85 3.34 8.38 7.12 3.13 1.39 3.16 3.36 7.52 3.12 3.15 2.99 3.15 3.32 4.08 2.96 3.34 1.36 20767.64 20517.68 31122 4289 653.63 16881.47 10.428 141437 143956 265171 4.08 54432 54814 4.11 20925.3 27807.58 3.15 20953.3 3.17 41188.02 13608.57 371.422 1.38 3.17 1.37 2.96 8.01 4.3 2.98 14449 5.29 7.08 4.31 3.19 7.85 8.03 1.37 1.38 31.93 4.3 5.2 5.24 3.85 7.09 4.07 4.31 2.97 7.82 4.07 3.17 4.3 3.85 8.09 4.08 4.31 3.33 5.2 5.3 7.91 23.43 10.07 3.37 3.86 3.19 7.86 3.85 7.82 1.37 31.8 2.97 7.89 23.43 23.52 3.85 3.86 3.33 10.04 9.98 10.06 10.04 1.38 10.01 5.22 4.1 4.31 7.9 2.96 32.11 12.86 8.05 12.84 3.17 5.27 3.19 7.08 4.3 12.82 8.04 3.16 12.92 8.03 23.72 7.06 8.34 4.31 12.83 8.25 8.24 8.03 12.9 8.19 31.91 12.81 32.09 10.06 31.94 31.97 3.87 7.07 7.09 23.47 8.07 5.2 7.85 12.77 7.07 7.86 32.13 9.95 10.27 23.4 5.2 4.08 3.36 8.05 23.38 8.03 23.54 8.06 3.16 3.17 2.97 23.48 8.01 31.85 2.97 8.06 12.89 3.85 3.35 8.02 7.09 1.38 3.18 1.38 3.15 3.36 7.12 4.07 3.32 3.17 3.17 3.84 2.97 3.15 3.34 4.1 2.99 3.36 1.39 22.064 8.65 9.18 8.06 9.67 24.745 3.18 9.19 3.98 8.55 17.81 11.89 8.15 14.03 18.83 11.43 8.35 20.72 21.11 3.03 3.57 81.77 11 13.34 13.38 9.81 17.75 6.71 10.88 6.88 19.49 7.12 7.81 10.69 9.53 19.66 6.93 10.59 8.13 12.68 12.64 19.2 56.64 24.07 8 9.23 7.52 18.8 9.19 18.66 2.99 75.34 6.02 18.6 55.42 55.48 9.01 8.99 7.81 23.59 23.44 23.48 23.54 2.98 23.11 12.14 8.63 10.08 18.25 6.87 73.51 29.8 18.54 29.49 7.24 12.13 7.34 16.15 9.86 29.34 18.39 7.22 29.38 18.25 53.48 15.82 17.88 9.62 28.59 17.23 18 17.82 28.73 18.24 70.76 28.41 71.08 22.19 70.53 70.29 8.41 15.46 15.4 51.28 17.61 11.3 17 27.66 15.32 16.97 69.48 21.5 22.15 50.32 11.14 8.41 7.07 17.09 49.7 17.06 49.75 17.02 6.6 6.43 6.56 6.06 48.29 16.34 65.41 6.07 16.52 26.33 7.81 6.82 16.22 
14.27 2.69 5.99 2.53 5.97 5.92 6.3 13.2 7.23 5.89 5.46 5.49 6.63 5.09 5.38 5.59 4.48 4.59 4.89 1.77 27.183 3.94 4.33 3.65 3.76 24.805 1.71 3.41 1.60 3.12 9.43 5.34 3.25 6.08 8.65 6.25 3.66 9.86 9.35 1.79 1.51 37.86 5.55 6.69 6.23 4.37 8.39 4.32 5.25 3.26 9.87 4.18 3.56 5.41 4.55 9.02 4.41 6.17 4.09 6.28 6.22 9.68 28.36 12.11 3.75 4.72 3.61 9.97 4.74 9.58 1.34 37.91 3.37 9.84 28.40 28.63 4.78 4.53 4.02 12.73 12.42 12.60 12.52 1.60 12.35 6.57 4.26 5.53 9.69 3.11 38.29 15.21 9.98 15.44 3.66 5.94 3.24 8.29 5.67 15.42 9.62 3.91 14.64 8.42 27.86 8.47 9.19 5.49 15.54 8.89 8.83 9.62 15.00 9.10 38.03 14.57 37.88 12.81 38.50 38.27 4.60 8.31 8.28 28.53 9.05 6.40 9.65 15.20 7.45 9.90 38.32 12.73 13.15 27.98 6.18 4.26 3.98 9.52 28.53 10.02 28.40 9.14 3.64 3.70 3.62 3.10 29.06 9.62 38.04 3.24 10.03 15.56 4.73 3.77 9.07 7.57 1.40 3.44 2.48 3.52 3.83 3.92 8.13 4.14 3.48 3.66 3.69 4.17 3.40 3.76 3.89 4.25 3.34 3.95 1.49 55214 8039 9.284 153896 152656 290342 5.48 84351 81406 2.93 3.12 3.3 172.883 1.33 3.46 1.39 3.23 10.08 4.67 3.19 20373 6 9.81 5.14 3.32 8.87 10.55 1.17 1.45 38.79 4.94 7.78 5.97 4.15 9.32 4.45 4.64 3.12 10.62 2.85 3.3 4.94 4.14 8.13 4.03 4.99 5.18 5.69 7.74 8.38 27.75 11.39 5.09 4.23 3.53 8.55 4.47 8.91 1.42 38.25 3 10.87 28.21 31.57 4.09 4.34 3.48 14.08 12.4 14.13 12.98 1.27 14.58 6.58 3.94 5.14 10.27 3.19 39.01 15.55 8.46 15.44 4.99 6.96 3.36 7.93 6.11 15.3 10.18 3.36 16.05 10.09 27.32 7.83 8.1 5.14 13.97 9.6 8.13 8.96 15.95 10.05 38.76 15.85 38.38 11.72 39.35 38.62 4.36 7.4 7.57 27.31 9.87 7.52 9.97 13.68 7.43 9.05 38.82 14.1 13 30.16 5.81 4.39 3.45 9.16 27.44 10.56 28.55 10.11 3.28 3.33 3.3 4.93 28.82 8.81 38.76 5.19 9.04 15.69 4.18 3.55 8.64 9.51 1.35 3.62 1.3 3.36 4.75 3.47 7.86 4.62 3.56 3.48 5.25 4.63 3.18 3.25 3.52 2.82 3.17 3.34 1.16 55383 8119 8.962 153939 155936 287651 5.27 81329 80999 4.16 4.9 4.74 173.043 1.41 3.45 1.4 3.12 8.43 5.25 3.28 20404 8.07 9.46 5.34 3.34 8.9 10.23 1.34 1.38 39.12 5.27 5.84 5.81 4.41 9.3 4.11 6.79 3.22 10.65 3.12 3.31 5.16 
4.44 17.15 3.96 5.45 3.59 6.01 8.14 8.97 28.19 13.82 5.27 4.03 3.41 10.47 4.35 10.18 1.41 37.81 3.1 10.38 27.59 29.12 4.34 4.09 3.42 12.17 11.51 13.08 11.24 1.45 13.57 6.05 4.16 5.33 9.53 3.23 38.73 15.45 8.83 16.39 3.31 5.87 3.35 7.81 6.58 15.34 8.37 3.44 15.41 10.23 29.85 8.22 8.64 5.14 15.72 10.34 9.66 8.74 15.4 8.45 37.59 13.88 38.17 12.47 38.65 39.03 4.04 9.34 9.16 29.35 10.69 5.9 9.29 15.38 9.44 10.39 38.69 12.73 10.96 30.74 7.75 4.59 3.49 9.02 29.17 8.22 27.25 8.7 3.3 3.3 3.33 4.99 27.04 9.54 38.79 5.11 10.61 16.6 6.28 3.51 8.48 9.38 1.42 3.34 1.42 3.44 3.38 5.18 7.31 4.59 5.23 3.36 3.6 4.1 3.15 3.53 3.48 3.91 3.13 3.4 1.46 54950 8132 8.967 152170 155148 292768 4.51 84887 82875 4.06 2.61 3.6 172.887 1.07 3.39 1.4 3.1 8.15 4.69 3.1 20601 7.82 7.02 6.32 5.1 10.01 8.45 1.16 2.91 38.99 4.67 5.97 5.84 4.1 9.37 2.64 5.2 4.77 8.7 3.93 3.43 6.54 5.88 9.55 2.81 6.62 3.46 7.44 6.07 10.75 29.54 13.68 3.51 4.37 3.47 10.14 5.94 8.85 1.26 38.9 2.54 8.61 27.77 27.89 4.04 4.1 3.32 13.63 13.13 12.45 13.46 1.33 13.13 8.16 5.92 5.14 8.93 4.61 38.58 17.3 10.15 16.61 4.45 5.58 4.96 7.72 6.11 15.62 9.41 3.29 16.3 8.25 27.25 9.11 10.09 5.18 15.26 7.73 10.17 8.91 15.55 8.37 39.04 17.67 39.18 13.29 37.13 38.46 4.12 7.72 9.21 29.4 10.03 7.38 9.02 15.4 8.26 8.35 38.58 11.41 13.25 27.61 7.61 3.93 3.51 8.93 28.14 10.64 27.04 8.34 3.36 4.97 3.26 3.07 29.29 10.54 38.95 3.12 12.12 15.67 5.26 3.45 9.81 7.48 1.4 3.17 1.42 4.81 3.29 3.37 9.11 5.86 3.43 3.42 3.27 5.82 4.7 3.35 3.5 4.01 3.16 3.17 1.18 OpenBenchmarking.org
vkpeak 20230730, int32-scalar (GIOPS, More Is Better)
  g: 6824.21
  3090 rep: 20613.41
  a: 2272.62
  b: 2269.25
  c: 2269.06
  d: 8520.02
  e: 8505.20
  f: 6827.92
  h: 6800.60
  3090: 20909.02
  Standard errors as exported (not matched to runs): SE +/- 0.34, N = 3; SE +/- 15.02, N = 3; SE +/- 0.03, N = 3
vkpeak 20230730, int32-vec4 (GIOPS, More Is Better)
  g: 6795.39
  3090 rep: 20517.45
  a: 2658.73
  b: 2640.08
  c: 2638.69
  d: 8465.82
  e: 8465.71
  f: 6800.17
  h: 6772.98
  3090: 20820.09
  Standard errors as exported (not matched to runs): SE +/- 0.26, N = 3; SE +/- 0.19, N = 3; SE +/- 0.05, N = 3
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
  a: 20816
  b: 20822
  c: 20847
  d: 12143
  e: 12168
  f: 10561
  g: 10548
  h: 10572
  i: 14780
  4080: 34974
  4080 rep: 35038
  4080 xxx: 35071
  4080 zzz: 35058
  3090: 30945
  3090 rep: 31122
  4090: 55214
  4090 rep: 55383
  nv 4090: 54950
  Standard errors as exported (not matched to runs): SE +/- 11.67, N = 3; SE +/- 10.58, N = 3; SE +/- 12.42, N = 3; SE +/- 14.62, N = 3
  (CXX) g++ options: -O3
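As a reading aid, the VkFFT double-precision scores above can be compared by averaging repeat runs and dividing by a baseline. The sketch below is only an illustration: the choice of which run labels count as repeats of one configuration (the `groups` mapping) and the helper names are assumptions, not part of the export.

```python
# Scores copied from the VkFFT "FFT + iFFT C2C 1D batched in double
# precision" result above (Benchmark Score, more is better).
scores = {
    "a": 20816, "b": 20822, "c": 20847,
    "4090": 55214, "4090 rep": 55383, "nv 4090": 54950,
}

# Hypothetical grouping: which run labels are treated as repeats of the
# same configuration. The export does not state this explicitly.
groups = {
    "a/b/c": ["a", "b", "c"],
    "4090 runs": ["4090", "4090 rep", "nv 4090"],
}


def group_means(scores, groups):
    """Average the scores of the runs listed for each group."""
    return {
        name: sum(scores[run] for run in runs) / len(runs)
        for name, runs in groups.items()
    }


def relative(means, baseline):
    """Express each group mean as a multiple of the baseline group."""
    return {name: value / means[baseline] for name, value in means.items()}


means = group_means(scores, groups)
ratios = relative(means, "a/b/c")

for name, ratio in ratios.items():
    print(f"{name}: mean {means[name]:.1f}, {ratio:.2f}x baseline")
```

For "Fewer Is Better" results such as the VkResample and NCNN timings, the ratio would need to be inverted before it reads as a speedup.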
VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
  a: 4717
  b: 4695
  c: 4670
  d: 2346
  e: 2343
  f: 1814
  g: 1818
  i: 2417
  4080: 5579
  4080 rep: 5583
  4080 xxx: 5587
  4080 zzz: 5584
  3090: 4282
  3090 rep: 4289
  4090: 8039
  4090 rep: 8119
  nv 4090: 8132
  Standard errors as exported (not matched to runs): SE +/- 0.33, N = 3; SE +/- 4.37, N = 3; SE +/- 11.20, N = 3
  (CXX) g++ options: -O3
vkpeak 20230730, fp64-vec4 (GFLOPS, More Is Better)
  a: 841.80
  b: 836.55
  c: 836.16
  d: 267.74
  e: 267.25
  f: 214.23
  g: 213.95
  h: 210.96
  3090: 653.15
  Standard errors as exported (not matched to runs): SE +/- 0.32, N = 3; SE +/- 0.48, N = 3; SE +/- 0.00, N = 3
vkpeak 20230730, fp64-scalar (GFLOPS, More Is Better)
  3090 rep: 648.71
  a: 841.40
  b: 839.20
  c: 839.01
  d: 267.43
  e: 267.41
  f: 214.17
  g: 213.96
  h: 213.37
  3090: 653.13
  Standard errors as exported (not matched to runs): SE +/- 0.22, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
vkpeak 20230730, int16-vec4 (GIOPS, More Is Better)
  g: 5956.24
  3090 rep: 16878.20
  a: 23123.77
  b: 23396.59
  c: 23385.44
  d: 7352.85
  e: 7336.25
  f: 5959.75
  h: 5978.38
  3090: 16886.66
  Standard errors as exported (not matched to runs): SE +/- 21.55, N = 3; SE +/- 17.33, N = 3; SE +/- 0.31, N = 3
VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  a: 11.686
  b: 11.690
  c: 11.688
  d: 32.855
  e: 32.850
  f: 26.738
  g: 26.769
  i: 20.930
  4080: 13.136
  4080 rep: 13.136
  4080 xxx: 13.137
  4080 zzz: 13.126
  3090: 10.399
  3090 rep: 10.428
  3070: 22.064
  RTX 3070 Ti: 27.183
  4090: 9.284
  4090 rep: 8.962
  nv 4090: 8.967
  Standard errors as exported (not matched to runs): SE +/- 0.001, N = 3; SE +/- 0.004, N = 3; SE +/- 0.000, N = 3; SE +/- 0.029, N = 3
  (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
  a: 47887
  b: 47948
  c: 47971
  d: 42645
  e: 42651
  f: 56476
  g: 56455
  h: 56431
  i: 69738
  4080: 104556
  4080 rep: 104491
  4080 xxx: 104528
  4080 zzz: 104543
  3090: 141357
  3090 rep: 141437
  4090: 153896
  4090 rep: 153939
  nv 4090: 152170
  Standard errors as exported (not matched to runs): SE +/- 9.54, N = 3; SE +/- 2.73, N = 3; SE +/- 1.67, N = 3; SE +/- 25.50, N = 3
  (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
  a: 50504
  b: 50643
  c: 50596
  d: 43365
  e: 43365
  f: 57110
  g: 57094
  i: 71163
  4080: 106210
  4080 rep: 106205
  4080 xxx: 106099
  4080 zzz: 105926
  3090: 143969
  3090 rep: 143956
  4090: 152656
  4090 rep: 155936
  nv 4090: 155148
  Standard errors as exported (not matched to runs): SE +/- 8.89, N = 3; SE +/- 2.33, N = 3; SE +/- 2.08, N = 3
  (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
  a: 91597
  b: 91812
  c: 91744
  d: 85181
  e: 85191
  f: 104146
  g: 104171
  h: 104298
  i: 132270
  4080: 211076
  4080 rep: 211058
  4080 xxx: 210713
  4080 zzz: 210991
  3090: 255207
  3090 rep: 265171
  4090: 290342
  4090 rep: 287651
  nv 4090: 292768
  Standard errors as exported (not matched to runs): SE +/- 83.55, N = 3; SE +/- 18.50, N = 3; SE +/- 26.03, N = 3; SE +/- 133.47, N = 3
  (CXX) g++ options: -O3
NCNN 20230517, Target: CPU - Model: FastestDet (ms, Fewer Is Better)
  a: 3.62 (MIN: 2.7 / MAX: 4.54)
  b: 4.05 (MIN: 4.02 / MAX: 4.35)
  c: 4.11 (MIN: 4.08 / MAX: 4.4)
  d: 4.08 (MIN: 4.02 / MAX: 4.28)
  f: 4.22 (MIN: 4.18 / MAX: 4.97)
  g: 2.57 (MIN: 2.53 / MAX: 3.21)
  i: 2.66 (MIN: 2.54 / MAX: 3.41)
  4080: 4.42 (MIN: 4.25 / MAX: 6.71)
  4080 rep: 4.34 (MIN: 4.19 / MAX: 5.77)
  4080 xxx: 4.17 (MIN: 4.05 / MAX: 4.74)
  4080 zzz: 4.16 (MIN: 4 / MAX: 4.69)
  3090: 4.21 (MIN: 4.19 / MAX: 4.41)
  3090 rep: 4.08 (MIN: 4.05 / MAX: 4.84)
  3070: 8.65 (MIN: 3.94 / MAX: 185.21)
  RTX 3070 Ti: 3.94 (MIN: 2.43 / MAX: 267.02)
  4090: 5.48 (MIN: 2.67 / MAX: 259.34)
  4090 rep: 5.27 (MIN: 4.05 / MAX: 247.02)
  nv 4090: 4.51 (MIN: 4.34 / MAX: 5.96)
  Standard errors as exported (not matched to runs): SE +/- 0.45, N = 3; SE +/- 0.01, N = 3; SE +/- 0.23, N = 15
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31, Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
  a: 42105
  b: 42163
  c: 43021
  d: 35399
  e: 35304
  f: 26593
  g: 26638
  h: 26524
  i: 33727
  4080: 66473
  4080 rep: 68279
  4080 xxx: 69068
  4080 zzz: 67689
  3090: 55347
  3090 rep: 54432
  4090: 84351
  4090 rep: 81329
  nv 4090: 84887
  Standard errors as exported (not matched to runs): SE +/- 200.55, N = 3; SE +/- 118.74, N = 3; SE +/- 3.71, N = 3; SE +/- 796.66, N = 3
  (CXX) g++ options: -O3
VkFFT 1.2.31, Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
  a: 33001
  b: 32751
  c: 32812
  d: 36328
  e: 37090
  f: 26238
  g: 26541
  i: 34686
  4080: 65869
  4080 rep: 70068
  4080 xxx: 67887
  4080 zzz: 70040
  3090: 51005
  3090 rep: 54814
  4090: 81406
  4090 rep: 80999
  nv 4090: 82875
  Standard errors as exported (not matched to runs): SE +/- 57.83, N = 3; SE +/- 116.12, N = 3; SE +/- 437.33, N = 3; SE +/- 555.86, N = 3
  (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.07, 4.08, 3.85, 3.97, 5.69, 4.20, 4.18, 4.19, 4.04, 4.04, 4.11, 9.18, 4.33, 2.93, 4.16, 4.06
  MIN / MAX (same order): 4.03 / 5.83; 4.05 / 4.36; 3.8 / 4.65; 3.92 / 4.75; 3.69 / 261.71; 4.06 / 4.86; 4.03 / 5.07; 4.04 / 5.47; 3.89 / 5.01; 4.01 / 4.15; 4.07 / 4.21; 3.64 / 122.65; 2.59 / 433.58; 2.84 / 3.38; 4 / 5.58; 3.91 / 5.78
  SE (reported for a subset of runs): +/- 0.27, N = 14
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
  Runs: g, 3090 rep, a, b, c, d, e, f, h, 3090
  Results (same order): 6832.74, 20708.84, 13190.09, 12807.06, 12860.56, 8531.96, 8515.58, 6837.94, 6810.73, 21269.72
  SE (reported for a subset of runs): +/- 4.18, N = 3; +/- 16.18, N = 3; +/- 0.30, N = 3
vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
  Runs: g, 3090 rep, a, b, c, d, e, f, h, 3090
  Results (same order): 9003.12, 27393.20, 12730.08, 12808.59, 12822.01, 11251.17, 11231.72, 9006.57, 9036.17, 27797.80
  SE (reported for a subset of runs): +/- 1.81, N = 3; +/- 19.37, N = 3; +/- 2.57, N = 3
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 3.16, 3.17, 3.15, 3.15, 3.26, 3.27, 3.27, 3.33, 3.20, 3.15, 8.06, 3.65, 3.12, 4.90, 2.61
  MIN / MAX (same order): 3.11 / 3.75; 3.11 / 8.89; 3.1 / 3.8; 3.1 / 3.87; 3.12 / 4.19; 3.12 / 5.24; 3.14 / 3.99; 3.19 / 4.2; 3.06 / 3.84; 3.11 / 3.83; 2.96 / 219.87; 2.87 / 347.75; 2.99 / 5.09; 3.17 / 120.84; 2.5 / 3.12
  SE (reported for a subset of runs): +/- 0.18, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
  Runs: g, 3090 rep, a, b, c, d, e, f, h, 3090
  Results (same order): 6810.55, 20640.67, 13154.15, 13145.19, 13136.79, 8412.33, 8397.80, 6812.52, 6838.32, 20845.09
  SE (reported for a subset of runs): +/- 4.01, N = 3; +/- 13.46, N = 3; +/- 5.09, N = 3
NCNN 20230517 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Runs: 4090 rep, a, b, c, d, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, nv 4090
  Results (same order): 3.30, 3.16, 3.16, 3.18, 3.17, 3.15, 3.16, 3.52, 3.28, 3.28, 3.28, 3.25, 3.17, 3.17, 9.67, 3.76, 3.30, 3.60
  MIN / MAX (same order): 3.13 / 3.97; 3.1 / 3.8; 3.11 / 3.61; 3.13 / 3.84; 3.1 / 8.86; 3.1 / 3.65; 3.11 / 3.83; 3.29 / 19.18; 3.11 / 3.88; 3.11 / 4; 3.1 / 4.05; 3.09 / 4.51; 3.12 / 4.05; 3.11 / 4.94; 3.19 / 225.84; 2.6 / 364.73; 3.11 / 4.81; 3.43 / 4.62
  SE (reported for a subset of runs): +/- 0.00, N = 3; +/- 0.02, N = 3; +/- 0.17, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
  Runs: g, 3090 rep, a, b, c, d, e, f, h, 3090
  Results (same order): 13438.40, 40876.12, 23232.42, 23390.44, 23387.26, 16864.47, 16865.29, 13440.97, 13490.24, 41149.10
  SE (reported for a subset of runs): +/- 5.96, N = 3; +/- 0.37, N = 3; +/- 0.36, N = 3
vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
  Runs: g, 3090 rep, a, b, c, d, e, f, h, 3090
  Results (same order): 4478.41, 13606.79, 13102.75, 13070.81, 13063.86, 5676.02, 5675.99, 4480.59, 4495.98, 13710.88
  SE (reported for a subset of runs): +/- 1.30, N = 3; +/- 0.09, N = 3; +/- 0.02, N = 3
VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better)
  Runs: d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 500.01, 500.02, 500.01, 500.01, 500.01, 288.20, 288.17, 288.04, 288.03, 371.70, 371.42, 24.75, 24.81, 172.88, 173.04, 172.89
  SE (reported for a subset of runs): +/- 0.00, N = 3; +/- 0.00, N = 3; +/- 0.01, N = 3
  Build: (CXX) g++ options: -O3
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better)
  Runs: g, b, c, f, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 1.37, 1.37, 1.36, 1.37, 1.25, 1.42, 1.45, 1.42, 1.39, 1.36, 1.38, 3.18, 1.71, 1.33, 1.41, 1.07
  MIN / MAX (same order): 1.34 / 1.7; 1.35 / 1.39; 1.34 / 1.44; 1.35 / 1.62; 1.19 / 2.61; 1.35 / 2.15; 1.36 / 8.73; 1.36 / 1.92; 1.34 / 1.89; 1.34 / 1.61; 1.36 / 1.9; 1.31 / 185.03; 1.09 / 448.17; 1.27 / 1.98; 1.35 / 1.89; 1.02 / 1.52
  SE (reported for a subset of runs): +/- 0.18, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 3.15, 3.14, 3.16, 3.15, 3.30, 3.31, 3.27, 3.30, 3.28, 3.14, 3.17, 9.19, 3.41, 3.46, 3.45, 3.39
  MIN / MAX (same order): 3.1 / 3.68; 3.1 / 3.67; 3.09 / 3.89; 3.1 / 3.63; 3.14 / 4.82; 3.12 / 4.76; 3.1 / 4.34; 3.12 / 4.03; 3.1 / 4; 3.08 / 3.7; 3.11 / 4.5; 3.04 / 232.12; 2.99 / 184.91; 3.29 / 4.38; 3.23 / 4.55; 3.21 / 4.24
  SE (reported for a subset of runs): +/- 0.10, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  Runs: a, b, c, d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 1.38, 1.37, 1.37, 1.38, 1.38, 1.38, 1.41, 1.40, 1.41, 1.41, 1.42, 1.42, 1.39, 1.37, 3.98, 1.60, 1.39, 1.40, 1.40
  MIN / MAX (same order): 1.34 / 1.85; 1.35 / 1.75; 1.35 / 1.82; 1.34 / 2.25; 1.34 / 1.88; 1.35 / 2.08; 1.38 / 2.09; 1.33 / 2; 1.35 / 2.01; 1.35 / 1.9; 1.36 / 1.93; 1.36 / 2.01; 1.37 / 1.82; 1.36 / 1.46; 1.31 / 228.4; 1.11 / 436.01; 1.33 / 1.94; 1.33 / 1.93; 1.34 / 1.87
  SE (reported for a subset of runs): +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.16, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
  Runs: 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 2.96, 8.55, 3.12, 3.23, 3.12, 3.10
  MIN / MAX (same order): 2.92 / 3.27; 2.99 / 185.5; 2.97 / 4.65; 3.08 / 4.73; 3 / 4.1; 2.97 / 3.73
  SE (reported for a subset of runs): +/- 0.02, N = 3
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  Runs: 4090 rep, a, b, c, d, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, nv 4090
  Results (same order): 10.75, 8.05, 7.97, 8.02, 8.10, 8.45, 22.74, 8.37, 8.73, 8.41, 8.37, 8.38, 8.60, 8.01, 17.81, 9.43, 10.08, 8.15
  MIN / MAX (same order): 8.24 / 287.14; 7.97 / 9.07; 7.94 / 8.26; 7.98 / 8.33; 7.94 / 14.4; 8.37 / 9.44; 8.24 / 1264.67; 8.15 / 9.75; 8.15 / 10.96; 8.14 / 11.03; 7.96 / 9.72; 7.94 / 10.16; 8.5 / 13.72; 7.96 / 9.85; 8.05 / 159.41; 7.95 / 398.1; 8.1 / 118.32; 7.73 / 9.34
  SE (reported for a subset of runs): +/- 0.03, N = 3; +/- 0.06, N = 3; +/- 0.22, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
  Runs: 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.98, 4.64, 4.71, 4.70, 4.32, 4.30, 11.89, 5.34, 4.67, 5.25, 4.69
  MIN / MAX (same order): 4.59 / 7.15; 4.24 / 6; 4.26 / 7.21; 4.28 / 5.92; 4.25 / 5.33; 4.24 / 5.11; 4.34 / 229.18; 4.25 / 221.78; 4.28 / 6; 4.86 / 6.33; 4.28 / 6.33
  SE (reported for a subset of runs): +/- 0.16, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
  Runs: 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 2.97, 2.98, 8.15, 3.25, 3.19, 3.28, 3.10
  MIN / MAX (same order): 2.93 / 3.45; 2.94 / 3.36; 2.67 / 317.68; 2.68 / 277.21; 3.04 / 3.98; 3.15 / 4.32; 2.97 / 3.92
  SE (reported for a subset of runs): +/- 0.11, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
  Runs: a, b, c, d, e, f, g, h, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 4090, 4090 rep, nv 4090
  Results (same order): 11340, 11273, 11311, 10719, 10560, 7571, 7574, 7622, 10061, 17121, 17287, 17343, 17185, 14406, 14449, 20373, 20404, 20601
  SE (reported for a subset of runs): +/- 62.67, N = 3; +/- 72.34, N = 3; +/- 75.16, N = 15; +/- 83.38, N = 3
  Build: (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Runs: 4090 rep, a, b, c, d, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, nv 4090
  Results (same order): 6.77, 5.29, 5.20, 5.24, 5.23, 5.69, 5.55, 5.60, 5.69, 5.68, 5.56, 5.63, 5.21, 5.29, 14.03, 6.08, 6.00, 7.82
  MIN / MAX (same order): 6.16 / 8.42; 5.09 / 6.29; 5.1 / 5.9; 5.15 / 6.09; 5.1 / 6.28; 5.22 / 92.59; 5.19 / 25.4; 5.13 / 6.83; 5.16 / 7.68; 5.17 / 7.45; 5.09 / 6.84; 5.08 / 7.55; 5.09 / 6.04; 5.18 / 6.19; 5 / 303.38; 4.97 / 245.95; 5.47 / 7.29; 5.54 / 303.05
  SE (reported for a subset of runs): +/- 0.07, N = 3; +/- 0.02, N = 3; +/- 0.17, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  Runs: 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 7.71, 7.62, 7.27, 8.06, 7.05, 7.08, 18.83, 8.65, 9.81, 9.46, 7.02
  MIN / MAX (same order): 7.15 / 9.1; 7.01 / 14.37; 6.74 / 8.84; 7.42 / 9.25; 6.97 / 7.95; 7 / 7.94; 6.71 / 206.11; 6.64 / 544.17; 7.16 / 389.1; 7.03 / 160.39; 6.38 / 9.36
  SE (reported for a subset of runs): +/- 0.24, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
  Runs: 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.31, 11.43, 6.25, 5.14, 5.34, 6.32
  MIN / MAX (same order): 4.26 / 4.83; 4.24 / 178.83; 4.27 / 334.55; 4.75 / 7.34; 4.87 / 6.57; 4.26 / 195.95
  SE (reported for a subset of runs): +/- 0.57, N = 3
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Runs: 4090, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090 rep, nv 4090
  Results (same order): 3.41, 3.17, 3.29, 3.28, 3.27, 3.27, 3.28, 3.15, 3.19, 8.35, 3.66, 3.34, 5.10
  MIN / MAX (same order): 3.24 / 5.42; 3.1 / 5.03; 3.1 / 3.96; 3.11 / 4.16; 3.08 / 5.18; 3.11 / 4.73; 3.09 / 4.98; 3.1 / 3.68; 3.13 / 4; 3.08 / 103.38; 3.01 / 311.25; 3.14 / 4.45; 3.14 / 138.88
  SE (reported for a subset of runs): +/- 0.15, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
  Runs: g, b, c, f, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 7.94, 7.97, 7.83, 7.94, 10.19, 8.49, 8.58, 8.99, 8.37, 7.82, 7.85, 20.72, 9.86, 8.87, 8.90, 10.01
  MIN / MAX (same order): 7.79 / 9.59; 7.89 / 8.7; 7.74 / 8.61; 7.8 / 8.78; 7.73 / 212.36; 7.82 / 11.98; 7.79 / 10.48; 8.25 / 10.27; 7.76 / 10.31; 7.69 / 8.6; 7.75 / 8.64; 7.49 / 355.33; 7.54 / 396.21; 8.18 / 11.09; 8.22 / 11.07; 7.29 / 259.11
  SE (reported for a subset of runs): +/- 0.21, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  Runs: a, b, c, d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 8.05, 8.04, 8.03, 8.02, 8.04, 8.27, 8.17, 10.40, 8.44, 8.57, 8.37, 8.47, 8.07, 8.03, 21.11, 9.35, 10.55, 10.23, 8.45
  MIN / MAX (same order): 7.95 / 8.89; 7.95 / 14.33; 7.98 / 8.84; 7.95 / 9.81; 7.95 / 9.09; 8.17 / 9.04; 8.08 / 9.37; 7.97 / 455.46; 7.98 / 10.55; 7.98 / 10; 7.97 / 16.09; 8.04 / 10.17; 7.99 / 8.8; 7.96 / 8.77; 7.98 / 322.43; 7.49 / 474.12; 8.22 / 303.1; 8.13 / 386.42; 8.03 / 12.61
  SE (reported for a subset of runs): +/- 0.02, N = 3; +/- 0.00, N = 3; +/- 0.02, N = 3; +/- 0.25, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 1.37, 1.38, 1.43, 1.37, 1.28, 1.44, 1.42, 1.42, 1.41, 1.36, 1.37, 3.03, 1.79, 1.17, 1.34, 1.16
  MIN / MAX (same order): 1.35 / 1.52; 1.36 / 1.58; 1.4 / 1.77; 1.34 / 2.07; 1.23 / 1.73; 1.37 / 3.45; 1.36 / 2.2; 1.36 / 1.92; 1.34 / 1.91; 1.34 / 1.46; 1.35 / 1.46; 1.28 / 96.94; 1.13 / 312.12; 1.11 / 1.9; 1.27 / 1.95; 1.11 / 1.67
  SE (reported for a subset of runs): +/- 0.19, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better)
  Runs: 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 1.39, 1.38, 3.57, 1.51, 1.45, 1.38, 2.91
  MIN / MAX (same order): 1.36 / 3.12; 1.35 / 1.88; 1.08 / 141.04; 1.11 / 380.46; 1.38 / 2.98; 1.33 / 1.98; 1.29 / 113.97
  SE (reported for a subset of runs): +/- 0.14, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 31.65, 31.78, 33.47, 33.39, 36.55, 34.20, 34.27, 34.37, 34.47, 31.89, 31.93, 81.77, 37.86, 38.79, 39.12, 38.99
  MIN / MAX (same order): 31.53 / 32.23; 31.64 / 34.51; 32.89 / 74.09; 32.73 / 88.83; 33 / 209.38; 32.92 / 36.19; 33.07 / 37.01; 33.01 / 38.7; 33.32 / 37.42; 31.66 / 39.97; 31.76 / 33.09; 44.4 / 460.28; 32.9 / 463.9; 33.95 / 457.41; 33.92 / 465.83; 34.17 / 473.06
  SE (reported for a subset of runs): +/- 0.18, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.29, 4.28, 4.83, 4.71, 5.01, 4.66, 4.65, 4.65, 4.67, 4.31, 4.30, 11.00, 5.55, 4.94, 5.27, 4.67
  MIN / MAX (same order): 4.24 / 5.64; 4.24 / 5.12; 4.76 / 5.74; 4.65 / 5.57; 4.6 / 6.68; 4.29 / 6.1; 4.26 / 6.13; 4.28 / 6.42; 4.28 / 6.29; 4.25 / 5.13; 4.24 / 4.99; 4.33 / 199.92; 4.2 / 281.58; 4.51 / 6.64; 4.78 / 7.7; 4.28 / 5.7
  SE (reported for a subset of runs): +/- 0.23, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
  Runs: 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 5.20, 5.20, 13.34, 6.69, 7.78, 5.84, 5.97
  MIN / MAX (same order): 5.1 / 6.05; 5.08 / 6.05; 5.43 / 279.86; 5.06 / 462.37; 5.4 / 168.29; 5.35 / 8.28; 5.4 / 8.25
  SE (reported for a subset of runs): +/- 0.24, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
  Runs: g, b, c, f, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 5.30, 5.42, 5.23, 5.30, 5.85, 5.65, 5.69, 5.89, 5.59, 5.23, 5.24, 13.38, 6.23, 5.97, 5.81, 5.84
  MIN / MAX (same order): 5.19 / 6.24; 5.36 / 6.27; 5.11 / 6.03; 5.17 / 5.93; 5.3 / 8.27; 5.14 / 6.93; 5.11 / 6.94; 5.36 / 7.53; 5.09 / 7.7; 5.1 / 6.07; 5.14 / 5.99; 5.43 / 208.42; 4.99 / 309.18; 5.46 / 7.02; 5.3 / 6.82; 5.35 / 7.72
  SE (reported for a subset of runs): +/- 0.16, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
  Runs: 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.04, 4.01, 3.97, 4.05, 3.88, 3.85, 9.81, 4.37, 4.15, 4.41, 4.10
  MIN / MAX (same order): 3.84 / 4.83; 3.81 / 6.04; 3.79 / 5.93; 3.83 / 5.42; 3.83 / 4.72; 3.78 / 4.83; 3.87 / 165.38; 3.85 / 366.28; 3.93 / 5.94; 4.21 / 5.82; 3.87 / 6.14
  SE (reported for a subset of runs): +/- 0.13, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 7.03, 7.04, 6.97, 7.26, 8.16, 7.64, 7.59, 7.64, 7.35, 7.04, 7.09, 17.75, 8.39, 9.32, 9.30, 9.37
  MIN / MAX (same order): 6.97 / 7.88; 6.96 / 7.83; 6.83 / 13.87; 7.14 / 8.59; 7.51 / 9.94; 7.05 / 9.9; 7.02 / 8.87; 7.03 / 9.19; 6.79 / 9.82; 6.96 / 7.74; 7.02 / 7.99; 6.47 / 272.11; 6.53 / 436.05; 7.1 / 172.56; 6.92 / 310.91; 7.07 / 281.92
  SE (reported for a subset of runs): +/- 0.24, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
  Runs: 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.19, 4.17, 3.75, 4.61, 3.83, 4.07, 6.71, 4.32, 4.45, 4.11, 2.64
  MIN / MAX (same order): 4.06 / 7.41; 4.02 / 4.75; 3.63 / 5.24; 4.45 / 5.92; 3.79 / 4.09; 4.03 / 4.18; 2.73 / 109.52; 2.51 / 398.91; 4.29 / 5.05; 3.98 / 4.73; 2.52 / 4.14
  SE (reported for a subset of runs): +/- 0.27, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  Runs: a, b, c, d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.31, 4.33, 4.33, 4.31, 4.31, 4.64, 4.87, 6.53, 4.62, 4.65, 4.68, 4.68, 4.35, 4.31, 10.88, 5.25, 4.64, 6.79, 5.20
  MIN / MAX (same order): 4.24 / 5.2; 4.28 / 5.16; 4.26 / 10.59; 4.25 / 5.28; 4.23 / 11.03; 4.57 / 5.49; 4.8 / 5.62; 4.57 / 242.16; 4.26 / 6.15; 4.26 / 6.53; 4.26 / 6.61; 4.26 / 6.23; 4.28 / 7.49; 4.26 / 5.26; 4.38 / 52.99; 4.23 / 375.94; 4.26 / 5.98; 4.23 / 262.43; 4.82 / 7.07
  SE (reported for a subset of runs): +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.00, N = 3; +/- 0.18, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Runs: 4090 rep, a, b, c, d, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, nv 4090
  Results (same order): 3.18, 2.97, 2.97, 2.99, 2.98, 2.97, 3.00, 2.74, 3.05, 3.06, 3.06, 3.04, 2.99, 2.97, 6.88, 3.26, 3.12, 4.77
  MIN / MAX (same order): 3.05 / 3.8; 2.92 / 3.48; 2.93 / 3.45; 2.96 / 3.44; 2.94 / 3.83; 2.93 / 3.95; 2.96 / 3.68; 2.62 / 4.22; 2.92 / 3.82; 2.94 / 4.51; 2.94 / 4.45; 2.91 / 4.47; 2.95 / 3.88; 2.93 / 3.28; 3.05 / 110.25; 2.46 / 277.54; 2.98 / 3.79; 3.07 / 97.57
  SE (reported for a subset of runs): +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.14, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  Runs: a, b, c, d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 7.90, 7.85, 7.80, 7.85, 7.85, 8.15, 8.96, 8.75, 8.42, 8.40, 8.42, 8.41, 7.87, 7.82, 19.49, 9.87, 10.62, 10.65, 8.70
  MIN / MAX (same order): 7.74 / 9.54; 7.76 / 8.76; 7.72 / 8.74; 7.71 / 8.85; 7.71 / 8.76; 8.02 / 9.02; 8.82 / 9.87; 8.08 / 16.01; 7.79 / 10.01; 7.77 / 9.78; 7.73 / 10.06; 7.72 / 9.9; 7.76 / 10.36; 7.69 / 8.61; 7.4 / 200.01; 7.33 / 399.24; 7.83 / 323.31; 8.29 / 236.11; 7.96 / 10.01
  SE (reported for a subset of runs): +/- 0.02, N = 3; +/- 0.04, N = 3; +/- 0.01, N = 3; +/- 0.22, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
  Runs: 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.20, 3.80, 3.82, 4.10, 4.07, 7.12, 4.18, 2.85, 3.12, 3.93
  MIN / MAX (same order): 4.04 / 5.63; 3.65 / 6.08; 3.65 / 9.77; 4.07 / 4.34; 4.03 / 4.2; 3.72 / 188.7; 2.53 / 295.11; 2.74 / 4.36; 2.97 / 4.42; 3.76 / 11.77
  SE (reported for a subset of runs): +/- 0.87, N = 3
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Runs: a, b, c, d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 3.17, 3.14, 3.13, 3.16, 3.14, 3.13, 3.18, 3.29, 3.26, 3.29, 3.26, 3.29, 3.18, 3.17, 7.81, 3.56, 3.30, 3.31, 3.43
  MIN / MAX (same order): 3.09 / 3.78; 3.1 / 3.73; 3.08 / 3.85; 3.09 / 3.92; 3.08 / 4.06; 3.07 / 3.82; 3.13 / 3.9; 3.12 / 3.93; 3.1 / 4.12; 3.12 / 4.14; 3.1 / 3.87; 3.11 / 3.98; 3.14 / 3.63; 3.12 / 3.64; 3.07 / 154.75; 3.09 / 345.01; 3.12 / 4.82; 3.14 / 4.92; 3.25 / 4.81
  SE (reported for a subset of runs): +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.01, N = 3; +/- 0.14, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
  Runs: 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.30, 4.30, 10.69, 5.41, 4.94, 5.16, 6.54
  MIN / MAX (same order): 4.25 / 4.63; 4.24 / 4.85; 4.32 / 148.92; 4.23 / 364.66; 4.52 / 6.23; 4.73 / 6.38; 4.56 / 110.58
  SE (reported for a subset of runs): +/- 0.21, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
  Runs: 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 3.86, 3.85, 9.53, 4.55, 4.14, 4.44, 5.88
  MIN / MAX (same order): 3.82 / 4.82; 3.81 / 4.6; 3.77 / 182.53; 3.84 / 379.07; 3.93 / 5.94; 4.24 / 5.18; 3.96 / 194.08
  SE (reported for a subset of runs): +/- 0.16, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 8.05, 8.14, 8.34, 8.07, 8.21, 8.33, 8.57, 8.75, 8.47, 7.95, 8.09, 19.66, 9.02, 8.13, 17.15, 9.55
  MIN / MAX (same order): 8 / 8.58; 8.08 / 8.69; 8.26 / 9.3; 7.97 / 8.81; 7.9 / 9.99; 8.02 / 9.64; 8.21 / 10.39; 8.35 / 10.08; 8.13 / 10.27; 7.88 / 8.67; 7.99 / 14.25; 7.5 / 235.36; 7.69 / 501.76; 7.75 / 10.05; 8.02 / 773.45; 7.5 / 193.79
  SE (reported for a subset of runs): +/- 0.21, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
  Runs: 4090, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090 rep, nv 4090
  Results (same order): 4.13, 3.92, 3.83, 4.28, 4.14, 4.17, 4.20, 4.03, 4.08, 6.93, 4.41, 3.96, 2.81
  MIN / MAX (same order): 3.99 / 4.67; 3.88 / 4.72; 3.7 / 4.57; 4.13 / 4.85; 4 / 5.6; 4.03 / 5.63; 4.01 / 11.47; 3.99 / 4.22; 4.04 / 4.29; 2.57 / 163.84; 2.06 / 295.24; 3.79 / 11.36; 2.68 / 4.38
  SE (reported for a subset of runs): +/- 0.20, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
  Runs: 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 4.68, 4.69, 4.66, 4.33, 4.31, 10.59, 6.17, 4.99, 5.45, 6.62
  MIN / MAX (same order): 4.27 / 6.08; 4.26 / 6.07; 4.24 / 5.97; 4.26 / 5.19; 4.26 / 5.07; 4.3 / 177.68; 4.5 / 261.75; 4.56 / 6.91; 4.93 / 7.98; 4.28 / 339.62
  SE (reported for a subset of runs): +/- 0.43, N = 3
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 3.33, 3.34, 3.40, 3.38, 3.52, 3.46, 3.43, 3.47, 3.43, 3.32, 3.33, 8.13, 4.09, 5.18, 3.59, 3.46
  MIN / MAX (same order): 3.3 / 3.79; 3.32 / 3.79; 3.35 / 4.17; 3.34 / 4.15; 3.39 / 4.05; 3.34 / 3.93; 3.3 / 4.03; 3.33 / 5.01; 3.31 / 3.94; 3.28 / 3.66; 3.3 / 3.67; 3.09 / 147.21; 3.12 / 435.28; 3.34 / 283.54; 3.46 / 4.09; 3.32 / 5.2
  SE (reported for a subset of runs): +/- 0.21, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  Runs: a, b, c, d, e, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 5.28, 5.23, 5.21, 5.23, 5.22, 5.48, 6.22, 5.82, 5.67, 5.61, 5.62, 5.59, 5.27, 5.20, 12.68, 6.28, 5.69, 6.01, 7.44
  MIN / MAX (same order): 5.17 / 6.16; 5.13 / 6.18; 5.11 / 6.04; 5.08 / 6.28; 5.09 / 11.15; 5.33 / 6.16; 6.11 / 7; 5.28 / 7.02; 5.18 / 7.22; 5.11 / 7.44; 5.1 / 7.65; 5.06 / 6.95; 5.15 / 6.19; 5.09 / 5.98; 5.39 / 262.62; 4.94 / 298.06; 5.16 / 8.22; 5.44 / 8.18; 5.29 / 320.54
  SE (reported for a subset of runs): +/- 0.01, N = 3; +/- 0.03, N = 3; +/- 0.01, N = 3; +/- 0.20, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
  Runs: 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 5.64, 5.78, 5.77, 5.20, 5.30, 12.64, 6.22, 7.74, 8.14, 6.07
  MIN / MAX (same order): 5.11 / 7.51; 5.21 / 6.97; 5.22 / 7.06; 5.1 / 6.16; 5.21 / 6.24; 5.3 / 53.81; 5.3 / 8.22; 5.25 / 312.09; 5.39 / 122.47; 5.49 / 15.12
  SE (reported for a subset of runs): +/- 0.30, N = 3
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
  Runs: 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 8.52, 8.32, 8.29, 7.90, 7.91, 19.20, 9.68, 8.38, 8.97, 10.75
  MIN / MAX (same order): 7.85 / 10.56; 7.71 / 10.39; 7.63 / 9.87; 7.8 / 8.73; 7.81 / 8.62; 7.84 / 193.36; 8.16 / 382.41; 7.78 / 10.43; 8.22 / 10.51; 7.92 / 447.83
  SE (reported for a subset of runs): +/- 0.80, N = 3
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Runs: 4090 rep, a, b, c, d, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, nv 4090
  Results (same order): 29.24, 23.75, 23.49, 23.45, 23.51, 24.19, 23.78, 30.96, 25.37, 26.11, 25.03, 25.40, 23.43, 23.43, 56.64, 28.36, 27.75, 29.54
  MIN / MAX (same order): 26.51 / 270.71; 23.31 / 25.12; 23.36 / 24.62; 23.26 / 24.51; 23.19 / 24.68; 23.99 / 30.98; 23.52 / 24.89; 25.92 / 328.63; 24.26 / 36.52; 24.54 / 30.29; 23.85 / 28.9; 24.09 / 32.86; 23.2 / 24.1; 23.23 / 24.39; 25.75 / 367.74; 24.13 / 449.57; 24.58 / 282.59; 24.77 / 364.86
  SE (reported for a subset of runs): +/- 0.30, N = 3; +/- 0.05, N = 3; +/- 0.23, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
  Runs: b, c, f, g, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090
  Results (same order): 10.01, 10.33, 11.05, 11.25, 14.05, 11.11, 10.79, 10.91, 11.21, 10.05, 10.07, 24.07, 12.11, 11.39, 13.82, 13.68
  MIN / MAX (same order): 9.89 / 10.86; 10.16 / 13.97; 10.46 / 112.6; 10.55 / 118.12; 11.69 / 252.21; 10.19 / 13.03; 9.91 / 12.75; 9.91 / 13.1; 10.3 / 13.25; 9.85 / 12.64; 9.94 / 11.06; 10.02 / 218.35; 10.16 / 382.56; 10.48 / 13.29; 10.34 / 245.6; 10.25 / 566.67
  SE (reported for a subset of runs): +/- 0.22, N = 15
  Build: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.16, N = 15
4090: 5.17 (MIN: 3.22 / MAX: 208.13)
g: 3.34 (MIN: 3.31 / MAX: 4.05)
i: 3.43 (MIN: 3.3 / MAX: 4.89)
4080: 3.46 (MIN: 3.3 / MAX: 5.74)
4080 rep: 3.39 (MIN: 3.26 / MAX: 3.91)
4080 xxx: 3.43 (MIN: 3.31 / MAX: 3.95)
4080 zzz: 3.43 (MIN: 3.29 / MAX: 3.87)
3090: 3.33 (MIN: 3.29 / MAX: 3.67)
3090 rep: 3.37 (MIN: 3.33 / MAX: 3.8)
3070: 8.00 (MIN: 3.16 / MAX: 190.15)
RTX 3070 Ti: 3.75 (MIN: 3.2 / MAX: 361.52)
4090 rep: 5.27 (MIN: 3.27 / MAX: 191.55)
nv 4090: 3.51 (MIN: 3.38 / MAX: 4.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: efficientnet-b0 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.05, N = 3; SE +/- 0.00, N = 3; SE +/- 0.19, N = 15
4090 rep: 4.30 (MIN: 4.08 / MAX: 5.07)
a: 3.90 (MIN: 3.82 / MAX: 4.51)
b: 3.85 (MIN: 3.81 / MAX: 4.42)
c: 3.88 (MIN: 3.84 / MAX: 4.41)
d: 3.85 (MIN: 3.81 / MAX: 4.46)
f: 3.87 (MIN: 3.81 / MAX: 4.97)
g: 3.91 (MIN: 3.85 / MAX: 4.64)
i: 4.05 (MIN: 3.78 / MAX: 5.45)
4080: 3.99 (MIN: 3.79 / MAX: 5.83)
4080 rep: 4.09 (MIN: 3.86 / MAX: 5.59)
4080 xxx: 4.04 (MIN: 3.83 / MAX: 5.71)
4080 zzz: 3.99 (MIN: 3.8 / MAX: 5.69)
3090: 3.87 (MIN: 3.83 / MAX: 4.69)
3090 rep: 3.86 (MIN: 3.81 / MAX: 4.75)
3070: 9.23 (MIN: 3.43 / MAX: 156.19)
RTX 3070 Ti: 4.72 (MIN: 3.37 / MAX: 486.93)
4090: 4.23 (MIN: 3.98 / MAX: 12.23)
nv 4090: 4.37 (MIN: 4.15 / MAX: 5.96)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.00, N = 2; SE +/- 0.00, N = 3; SE +/- 0.21, N = 14
4090 rep: 3.31 (MIN: 3.16 / MAX: 4.73)
a: 3.18 (MIN: 3.14 / MAX: 3.82)
c: 3.17 (MIN: 3.15 / MAX: 3.74)
d: 3.17 (MIN: 3.12 / MAX: 3.96)
g: 3.14 (MIN: 3.1 / MAX: 3.81)
i: 3.26 (MIN: 3.14 / MAX: 3.9)
4080: 3.24 (MIN: 3.09 / MAX: 4.73)
4080 rep: 3.26 (MIN: 3.09 / MAX: 3.96)
4080 xxx: 3.27 (MIN: 3.13 / MAX: 3.85)
4080 zzz: 3.24 (MIN: 3.11 / MAX: 4.47)
3090: 3.18 (MIN: 3.14 / MAX: 4.14)
3090 rep: 3.19 (MIN: 3.15 / MAX: 3.72)
3070: 7.52 (MIN: 2.94 / MAX: 215)
RTX 3070 Ti: 3.61 (MIN: 2.51 / MAX: 502.85)
4090: 3.53 (MIN: 3.39 / MAX: 4.31)
nv 4090: 3.47 (MIN: 3.32 / MAX: 4.91)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: googlenet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.55, N = 3
3090 rep: 7.86 (MIN: 7.75 / MAX: 8.57)
3070: 18.80 (MIN: 7.78 / MAX: 141.46)
RTX 3070 Ti: 9.97 (MIN: 8.16 / MAX: 381.49)
4090: 8.55 (MIN: 7.85 / MAX: 11.39)
4090 rep: 10.47 (MIN: 7.86 / MAX: 191.94)
nv 4090: 10.14 (MIN: 7.85 / MAX: 257.61)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.46, N = 3
4080 rep: 4.07 (MIN: 3.85 / MAX: 4.79)
4080 xxx: 4.01 (MIN: 3.83 / MAX: 5.28)
4080 zzz: 3.95 (MIN: 3.79 / MAX: 4.59)
3090: 3.88 (MIN: 3.83 / MAX: 4.61)
3090 rep: 3.85 (MIN: 3.81 / MAX: 4.75)
3070: 9.19 (MIN: 3.85 / MAX: 131.42)
RTX 3070 Ti: 4.74 (MIN: 3.68 / MAX: 295.7)
4090: 4.47 (MIN: 4.23 / MAX: 5.82)
4090 rep: 4.35 (MIN: 4.08 / MAX: 5.62)
nv 4090: 5.94 (MIN: 3.97 / MAX: 208.59)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: googlenet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.22, N = 15
3090: 7.83 (MIN: 7.73 / MAX: 8.6)
3090 rep: 7.82 (MIN: 7.72 / MAX: 8.6)
3070: 18.66 (MIN: 7.42 / MAX: 326.73)
RTX 3070 Ti: 9.58 (MIN: 7.62 / MAX: 396.9)
4090: 8.91 (MIN: 8.3 / MAX: 10.96)
4090 rep: 10.18 (MIN: 7.81 / MAX: 204.67)
nv 4090: 8.85 (MIN: 8.16 / MAX: 10.25)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: blazeface (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.03, N = 15
4080: 1.44 (MIN: 1.37 / MAX: 2.07)
4080 rep: 1.42 (MIN: 1.35 / MAX: 2.89)
4080 xxx: 1.31 (MIN: 1.25 / MAX: 3.14)
4080 zzz: 1.41 (MIN: 1.34 / MAX: 1.88)
3090: 1.38 (MIN: 1.36 / MAX: 1.53)
3090 rep: 1.37 (MIN: 1.35 / MAX: 1.48)
3070: 2.99 (MIN: 1.22 / MAX: 149.55)
RTX 3070 Ti: 1.34 (MIN: 1.06 / MAX: 2.66)
4090: 1.42 (MIN: 1.36 / MAX: 1.92)
4090 rep: 1.41 (MIN: 1.35 / MAX: 1.91)
nv 4090: 1.26 (MIN: 1.2 / MAX: 1.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: vision_transformer (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.29, N = 3; SE +/- 0.39, N = 3; SE +/- 0.12, N = 15
4090 rep: 37.05 (MIN: 33.89 / MAX: 407.84)
a: 32.49 (MIN: 31.67 / MAX: 40.11)
b: 31.95 (MIN: 31.79 / MAX: 32.33)
c: 31.77 (MIN: 31.61 / MAX: 35.68)
d: 32.43 (MIN: 31.56 / MAX: 37.69)
f: 33.56 (MIN: 32.98 / MAX: 51.93)
g: 32.73 (MIN: 31.44 / MAX: 81.32)
i: 37.80 (MIN: 33.74 / MAX: 321.51)
4080: 35.07 (MIN: 33.14 / MAX: 43.26)
4080 rep: 35.28 (MIN: 33.9 / MAX: 38.67)
4080 xxx: 34.27 (MIN: 32.82 / MAX: 39.79)
4080 zzz: 34.10 (MIN: 32.65 / MAX: 37.64)
3090: 33.01 (MIN: 32.88 / MAX: 33.42)
3090 rep: 31.80 (MIN: 31.66 / MAX: 32.23)
3070: 75.34 (MIN: 38.72 / MAX: 418.01)
RTX 3070 Ti: 37.91 (MIN: 32.08 / MAX: 541.11)
4090: 38.25 (MIN: 33.04 / MAX: 447.7)
nv 4090: 38.90 (MIN: 34.2 / MAX: 300.84)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mnasnet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.16, N = 14
g: 2.95 (MIN: 2.91 / MAX: 3.64)
b: 2.96 (MIN: 2.93 / MAX: 3.4)
c: 2.96 (MIN: 2.93 / MAX: 3.41)
f: 2.96 (MIN: 2.92 / MAX: 3.81)
i: 2.99 (MIN: 2.86 / MAX: 4.38)
4080: 3.09 (MIN: 2.94 / MAX: 3.79)
4080 rep: 3.07 (MIN: 2.94 / MAX: 3.72)
4080 xxx: 3.13 (MIN: 3 / MAX: 5.1)
4080 zzz: 3.01 (MIN: 2.91 / MAX: 3.6)
3090: 2.97 (MIN: 2.92 / MAX: 3.28)
3090 rep: 2.97 (MIN: 2.94 / MAX: 3.39)
3070: 6.02 (MIN: 2.79 / MAX: 50.49)
RTX 3070 Ti: 3.37 (MIN: 2.86 / MAX: 278.87)
4090: 3.00 (MIN: 2.89 / MAX: 3.46)
4090 rep: 3.10 (MIN: 2.97 / MAX: 3.72)
nv 4090: 2.54 (MIN: 2.44 / MAX: 3.58)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: googlenet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.24, N = 15
4090: 11.30 (MIN: 7.95 / MAX: 477.54)
g: 7.96 (MIN: 7.81 / MAX: 9.05)
i: 10.47 (MIN: 8.21 / MAX: 350.07)
4080: 8.40 (MIN: 7.71 / MAX: 10.64)
4080 rep: 8.49 (MIN: 7.74 / MAX: 10.76)
4080 xxx: 8.43 (MIN: 7.77 / MAX: 10.4)
4080 zzz: 8.42 (MIN: 7.78 / MAX: 10.7)
3090: 7.86 (MIN: 7.74 / MAX: 8.62)
3090 rep: 7.89 (MIN: 7.79 / MAX: 8.84)
3070: 18.60 (MIN: 8.02 / MAX: 292.16)
RTX 3070 Ti: 9.84 (MIN: 7.3 / MAX: 438.04)
4090 rep: 10.38 (MIN: 7.96 / MAX: 255.68)
nv 4090: 8.61 (MIN: 7.95 / MAX: 10.07)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vgg16 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.27, N = 15
g: 23.82 (MIN: 23.62 / MAX: 24.63)
b: 23.42 (MIN: 23.27 / MAX: 24.32)
c: 23.54 (MIN: 23.32 / MAX: 24.54)
f: 24.12 (MIN: 23.57 / MAX: 46.44)
i: 29.07 (MIN: 24.45 / MAX: 263.33)
4080: 25.04 (MIN: 23.87 / MAX: 28.04)
4080 rep: 25.04 (MIN: 23.81 / MAX: 27.15)
4080 xxx: 26.08 (MIN: 24.52 / MAX: 27.73)
4080 zzz: 25.26 (MIN: 24.14 / MAX: 27.73)
3090: 23.50 (MIN: 23.23 / MAX: 24.26)
3090 rep: 23.43 (MIN: 23.26 / MAX: 24.3)
3070: 55.42 (MIN: 25.32 / MAX: 281.46)
RTX 3070 Ti: 28.40 (MIN: 23.98 / MAX: 456)
4090: 28.21 (MIN: 24.57 / MAX: 270.76)
4090 rep: 27.59 (MIN: 24.34 / MAX: 396.09)
nv 4090: 27.77 (MIN: 24.82 / MAX: 264.66)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: vgg16 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.30, N = 15
4090: 29.05 (MIN: 24.19 / MAX: 451.92)
g: 24.04 (MIN: 23.48 / MAX: 73.3)
i: 27.43 (MIN: 24.65 / MAX: 251.37)
4080: 25.48 (MIN: 23.88 / MAX: 51.68)
4080 rep: 25.05 (MIN: 23.78 / MAX: 26.95)
4080 xxx: 25.40 (MIN: 24.05 / MAX: 27.09)
4080 zzz: 25.16 (MIN: 23.97 / MAX: 27.81)
3090: 23.55 (MIN: 23.31 / MAX: 24.48)
3090 rep: 23.52 (MIN: 23.33 / MAX: 25.08)
3070: 55.48 (MIN: 25.94 / MAX: 298.67)
RTX 3070 Ti: 28.63 (MIN: 24.13 / MAX: 500.18)
4090 rep: 29.12 (MIN: 24.62 / MAX: 266.39)
nv 4090: 27.89 (MIN: 24.5 / MAX: 463.23)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: efficientnet-b0 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.22, N = 15
b: 3.82 (MIN: 3.79 / MAX: 4.34)
c: 3.89 (MIN: 3.83 / MAX: 9.72)
f: 4.04 (MIN: 3.99 / MAX: 4.82)
g: 4.14 (MIN: 4.09 / MAX: 5.13)
i: 4.68 (MIN: 4.48 / MAX: 6.02)
4080: 4.02 (MIN: 3.82 / MAX: 5.66)
4080 rep: 4.02 (MIN: 3.82 / MAX: 5.39)
4080 xxx: 4.06 (MIN: 3.83 / MAX: 5.55)
4080 zzz: 4.03 (MIN: 3.82 / MAX: 5.43)
3090: 3.83 (MIN: 3.78 / MAX: 4.41)
3090 rep: 3.85 (MIN: 3.81 / MAX: 4.53)
3070: 9.01 (MIN: 3.98 / MAX: 188.57)
RTX 3070 Ti: 4.78 (MIN: 3.82 / MAX: 411.19)
4090: 4.09 (MIN: 3.86 / MAX: 4.83)
4090 rep: 4.34 (MIN: 4.16 / MAX: 5.28)
nv 4090: 4.04 (MIN: 3.78 / MAX: 4.9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.18, N = 15
a: 3.86 (MIN: 3.8 / MAX: 4.6)
b: 3.82 (MIN: 3.78 / MAX: 4.39)
c: 3.83 (MIN: 3.79 / MAX: 4.61)
d: 3.87 (MIN: 3.77 / MAX: 9.91)
e: 3.84 (MIN: 3.79 / MAX: 4.76)
f: 3.86 (MIN: 3.78 / MAX: 10.45)
g: 3.86 (MIN: 3.82 / MAX: 4.22)
i: 5.88 (MIN: 4.04 / MAX: 364.21)
4080: 4.01 (MIN: 3.78 / MAX: 5.34)
4080 rep: 4.05 (MIN: 3.83 / MAX: 6.11)
4080 xxx: 4.04 (MIN: 3.82 / MAX: 5.33)
4080 zzz: 4.04 (MIN: 3.8 / MAX: 5.31)
3090: 3.88 (MIN: 3.84 / MAX: 4.39)
3090 rep: 3.86 (MIN: 3.82 / MAX: 4.34)
3070: 8.99 (MIN: 3.71 / MAX: 129.99)
RTX 3070 Ti: 4.53 (MIN: 3.75 / MAX: 396.62)
4090: 4.34 (MIN: 4.14 / MAX: 5.84)
4090 rep: 4.09 (MIN: 3.87 / MAX: 5.46)
nv 4090: 4.10 (MIN: 3.86 / MAX: 5.46)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.60, N = 3
4080 rep: 3.47 (MIN: 3.33 / MAX: 5.39)
4080 xxx: 3.40 (MIN: 3.28 / MAX: 3.87)
4080 zzz: 3.36 (MIN: 3.23 / MAX: 3.99)
3090: 3.36 (MIN: 3.32 / MAX: 3.66)
3090 rep: 3.33 (MIN: 3.3 / MAX: 3.78)
3070: 7.81 (MIN: 3.3 / MAX: 131.26)
RTX 3070 Ti: 4.02 (MIN: 3.27 / MAX: 328.59)
4090: 3.48 (MIN: 3.35 / MAX: 4.05)
4090 rep: 3.42 (MIN: 3.29 / MAX: 3.94)
nv 4090: 3.32 (MIN: 3.19 / MAX: 4.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: resnet50 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.23, N = 15
4090: 11.53 (MIN: 10.59 / MAX: 13.73)
g: 10.33 (MIN: 10.2 / MAX: 11.18)
i: 13.10 (MIN: 10.59 / MAX: 267.95)
4080: 11.40 (MIN: 10.5 / MAX: 13.51)
4080 rep: 10.80 (MIN: 9.89 / MAX: 12.54)
4080 xxx: 10.91 (MIN: 9.91 / MAX: 13.07)
4080 zzz: 10.91 (MIN: 9.94 / MAX: 14.83)
3090: 10.38 (MIN: 9.88 / MAX: 18.75)
3090 rep: 10.04 (MIN: 9.94 / MAX: 10.89)
3070: 23.59 (MIN: 9.96 / MAX: 177.63)
RTX 3070 Ti: 12.73 (MIN: 9.84 / MAX: 518.97)
4090 rep: 12.17 (MIN: 11.25 / MAX: 13.79)
nv 4090: 13.63 (MIN: 10.52 / MAX: 488.94)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet50 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.25, N = 15
4080: 11.48 (MIN: 10.56 / MAX: 12.93)
4080 rep: 11.07 (MIN: 10.16 / MAX: 13.16)
4080 xxx: 11.22 (MIN: 10.33 / MAX: 12.81)
4080 zzz: 12.50 (MIN: 11.47 / MAX: 14.56)
3090: 10.07 (MIN: 9.95 / MAX: 10.88)
3090 rep: 9.98 (MIN: 9.85 / MAX: 11.35)
3070: 23.44 (MIN: 10.17 / MAX: 219.36)
RTX 3070 Ti: 12.42 (MIN: 10.23 / MAX: 444.76)
4090: 12.40 (MIN: 11.44 / MAX: 14.43)
4090 rep: 11.51 (MIN: 10.56 / MAX: 13.22)
nv 4090: 13.13 (MIN: 10.18 / MAX: 247.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.02, N = 3; SE +/- 0.09, N = 3; SE +/- 0.12, N = 3; SE +/- 0.26, N = 15
a: 10.01 (MIN: 9.88 / MAX: 11.4)
b: 10.00 (MIN: 9.92 / MAX: 12.35)
c: 10.00 (MIN: 9.91 / MAX: 11.15)
d: 10.10 (MIN: 9.86 / MAX: 11.08)
e: 10.10 (MIN: 9.84 / MAX: 11.72)
f: 10.26 (MIN: 10.09 / MAX: 11.22)
g: 10.34 (MIN: 10.14 / MAX: 11.37)
i: 12.96 (MIN: 10.23 / MAX: 424.46)
4080: 10.81 (MIN: 9.95 / MAX: 12.78)
4080 rep: 10.84 (MIN: 9.93 / MAX: 12.81)
4080 xxx: 10.94 (MIN: 9.95 / MAX: 12.7)
4080 zzz: 11.07 (MIN: 10.1 / MAX: 13.23)
3090: 10.10 (MIN: 9.97 / MAX: 11.42)
3090 rep: 10.06 (MIN: 9.95 / MAX: 11.04)
3070: 23.48 (MIN: 10.06 / MAX: 112.91)
RTX 3070 Ti: 12.60 (MIN: 9.82 / MAX: 418.4)
4090: 14.13 (MIN: 10.63 / MAX: 167.28)
4090 rep: 13.08 (MIN: 10.11 / MAX: 444.45)
nv 4090: 12.45 (MIN: 11.55 / MAX: 14.48)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet50 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.24, N = 15
3090: 10.03 (MIN: 9.88 / MAX: 10.86)
3090 rep: 10.04 (MIN: 9.94 / MAX: 10.91)
3070: 23.54 (MIN: 10.3 / MAX: 149.49)
RTX 3070 Ti: 12.52 (MIN: 9.95 / MAX: 459.05)
4090: 12.98 (MIN: 10.26 / MAX: 145.62)
4090 rep: 11.24 (MIN: 10.22 / MAX: 29.96)
nv 4090: 13.46 (MIN: 10.6 / MAX: 340.67)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: blazeface (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.00, N = 3; SE +/- 0.01, N = 3; SE +/- 0.14, N = 15
a: 1.38 (MIN: 1.35 / MAX: 2.06)
b: 1.38 (MIN: 1.35 / MAX: 1.67)
c: 1.39 (MIN: 1.36 / MAX: 1.53)
d: 1.38 (MIN: 1.35 / MAX: 2.05)
f: 1.37 (MIN: 1.34 / MAX: 2.11)
g: 1.38 (MIN: 1.36 / MAX: 1.62)
i: 1.40 (MIN: 1.34 / MAX: 2)
4080: 1.40 (MIN: 1.34 / MAX: 2.15)
4080 rep: 1.42 (MIN: 1.35 / MAX: 1.88)
4080 xxx: 1.42 (MIN: 1.36 / MAX: 2.02)
4080 zzz: 1.40 (MIN: 1.34 / MAX: 2.1)
3090: 1.38 (MIN: 1.35 / MAX: 2.23)
3090 rep: 1.38 (MIN: 1.36 / MAX: 1.71)
3070: 2.98 (MIN: 1.29 / MAX: 144.96)
RTX 3070 Ti: 1.60 (MIN: 0.95 / MAX: 433.24)
4090: 1.27 (MIN: 1.21 / MAX: 1.95)
4090 rep: 1.45 (MIN: 1.38 / MAX: 2.96)
nv 4090: 1.33 (MIN: 1.27 / MAX: 1.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet50 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.27, N = 15
g: 10.18 (MIN: 10.01 / MAX: 11.25)
b: 9.87 (MIN: 9.79 / MAX: 10.73)
c: 10.03 (MIN: 9.93 / MAX: 10.96)
f: 10.25 (MIN: 10.05 / MAX: 11.08)
i: 11.15 (MIN: 10.31 / MAX: 12.97)
4080: 10.95 (MIN: 9.91 / MAX: 17.11)
4080 rep: 10.84 (MIN: 9.93 / MAX: 12.83)
4080 xxx: 11.50 (MIN: 10.5 / MAX: 13.47)
4080 zzz: 11.09 (MIN: 10.18 / MAX: 13.12)
3090: 9.97 (MIN: 9.86 / MAX: 10.84)
3090 rep: 10.01 (MIN: 9.91 / MAX: 10.74)
3070: 23.11 (MIN: 10.22 / MAX: 140.41)
RTX 3070 Ti: 12.35 (MIN: 9.83 / MAX: 424.28)
4090: 14.58 (MIN: 10.67 / MAX: 324.82)
4090 rep: 13.57 (MIN: 10.45 / MAX: 199.55)
nv 4090: 13.13 (MIN: 10.56 / MAX: 323.44)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: resnet18 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.23, N = 15
4090: 5.78 (MIN: 5.26 / MAX: 7.24)
g: 5.28 (MIN: 5.16 / MAX: 6.09)
i: 5.88 (MIN: 5.36 / MAX: 8.2)
4080: 5.61 (MIN: 5.09 / MAX: 7.91)
4080 rep: 5.61 (MIN: 5.07 / MAX: 7.08)
4080 xxx: 5.67 (MIN: 5.1 / MAX: 8.06)
4080 zzz: 5.60 (MIN: 5.09 / MAX: 7.51)
3090: 5.19 (MIN: 5.09 / MAX: 6)
3090 rep: 5.22 (MIN: 5.13 / MAX: 6.1)
3070: 12.14 (MIN: 5.28 / MAX: 151.53)
RTX 3070 Ti: 6.57 (MIN: 4.91 / MAX: 391.33)
4090 rep: 6.05 (MIN: 5.53 / MAX: 7.66)
nv 4090: 8.16 (MIN: 5.39 / MAX: 397.44)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: FastestDet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.15, N = 15
g: 4.06 (MIN: 4.01 / MAX: 4.78)
b: 4.06 (MIN: 4.03 / MAX: 4.3)
c: 3.69 (MIN: 3.66 / MAX: 3.92)
f: 4.20 (MIN: 4.15 / MAX: 4.92)
i: 4.43 (MIN: 4.28 / MAX: 5.01)
4080: 4.20 (MIN: 4.04 / MAX: 5.82)
4080 rep: 4.09 (MIN: 3.92 / MAX: 5.5)
4080 xxx: 4.31 (MIN: 4.14 / MAX: 6.11)
4080 zzz: 4.12 (MIN: 3.97 / MAX: 6.99)
3090: 4.04 (MIN: 4 / MAX: 4.15)
3090 rep: 4.10 (MIN: 4.06 / MAX: 4.21)
3070: 8.63 (MIN: 4.27 / MAX: 144.3)
RTX 3070 Ti: 4.26 (MIN: 2.71 / MAX: 347.03)
4090: 3.94 (MIN: 3.8 / MAX: 5.41)
4090 rep: 4.16 (MIN: 4.03 / MAX: 4.73)
nv 4090: 5.92 (MIN: 4.25 / MAX: 103.26)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: alexnet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.22, N = 15
4090: 4.72 (MIN: 4.31 / MAX: 6.71)
g: 4.35 (MIN: 4.28 / MAX: 5.1)
i: 5.10 (MIN: 4.75 / MAX: 6.12)
4080: 4.61 (MIN: 4.24 / MAX: 7.25)
4080 rep: 4.72 (MIN: 4.25 / MAX: 7.3)
4080 xxx: 4.67 (MIN: 4.27 / MAX: 6.36)
4080 zzz: 4.68 (MIN: 4.26 / MAX: 6.8)
3090: 4.31 (MIN: 4.25 / MAX: 4.94)
3090 rep: 4.31 (MIN: 4.26 / MAX: 5.07)
3070: 10.08 (MIN: 4.36 / MAX: 225.66)
RTX 3070 Ti: 5.53 (MIN: 4.22 / MAX: 362.62)
4090 rep: 5.33 (MIN: 4.83 / MAX: 6.6)
nv 4090: 5.14 (MIN: 4.65 / MAX: 6.81)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: googlenet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.11, N = 3; SE +/- 0.01, N = 3; SE +/- 0.22, N = 15
4090 rep: 10.86 (MIN: 8.12 / MAX: 189.87)
a: 7.94 (MIN: 7.71 / MAX: 8.73)
b: 7.82 (MIN: 7.73 / MAX: 8.65)
c: 7.93 (MIN: 7.82 / MAX: 8.91)
d: 7.85 (MIN: 7.71 / MAX: 8.83)
f: 7.92 (MIN: 7.8 / MAX: 8.96)
g: 7.98 (MIN: 7.86 / MAX: 8.78)
i: 10.30 (MIN: 8.19 / MAX: 349.57)
4080: 8.42 (MIN: 7.75 / MAX: 9.96)
4080 rep: 8.52 (MIN: 7.84 / MAX: 10.21)
4080 xxx: 8.38 (MIN: 7.72 / MAX: 10.05)
4080 zzz: 8.40 (MIN: 7.72 / MAX: 10.5)
3090: 7.86 (MIN: 7.76 / MAX: 8.74)
3090 rep: 7.90 (MIN: 7.79 / MAX: 8.74)
3070: 18.25 (MIN: 7.5 / MAX: 267.89)
RTX 3070 Ti: 9.69 (MIN: 7.29 / MAX: 407.61)
4090: 10.27 (MIN: 7.95 / MAX: 115.68)
nv 4090: 8.93 (MIN: 8.27 / MAX: 10.68)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mnasnet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.04, N = 15
b: 2.97 (MIN: 2.94 / MAX: 3.43)
c: 2.97 (MIN: 2.94 / MAX: 3.43)
f: 3.12 (MIN: 3.08 / MAX: 3.86)
g: 3.05 (MIN: 3.01 / MAX: 3.88)
i: 3.39 (MIN: 3.26 / MAX: 4.86)
4080: 3.08 (MIN: 2.94 / MAX: 4.52)
4080 rep: 3.09 (MIN: 2.95 / MAX: 4.52)
4080 xxx: 3.07 (MIN: 2.94 / MAX: 3.6)
4080 zzz: 3.06 (MIN: 2.93 / MAX: 3.64)
3090: 2.95 (MIN: 2.92 / MAX: 3.29)
3090 rep: 2.96 (MIN: 2.94 / MAX: 3.38)
3070: 6.87 (MIN: 2.93 / MAX: 216.41)
RTX 3070 Ti: 3.11 (MIN: 2.8 / MAX: 4.98)
4090: 3.19 (MIN: 3.06 / MAX: 3.75)
4090 rep: 3.23 (MIN: 3.1 / MAX: 3.75)
nv 4090: 4.61 (MIN: 2.78 / MAX: 222.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vision_transformer (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.13, N = 15
g: 32.38 (MIN: 32.04 / MAX: 51.55)
b: 31.71 (MIN: 31.56 / MAX: 33.03)
c: 31.66 (MIN: 31.52 / MAX: 32.14)
f: 33.36 (MIN: 32.83 / MAX: 76.21)
i: 38.33 (MIN: 34.14 / MAX: 246.43)
4080: 34.13 (MIN: 32.98 / MAX: 36.11)
4080 rep: 34.29 (MIN: 33.11 / MAX: 40.12)
4080 xxx: 35.40 (MIN: 33.93 / MAX: 39.3)
4080 zzz: 34.05 (MIN: 32.83 / MAX: 38.57)
3090: 31.86 (MIN: 31.58 / MAX: 35.84)
3090 rep: 32.11 (MIN: 31.94 / MAX: 33.01)
3070: 73.51 (MIN: 39.27 / MAX: 288.2)
RTX 3070 Ti: 38.29 (MIN: 32.31 / MAX: 557.38)
4090: 39.01 (MIN: 33.91 / MAX: 411.66)
4090 rep: 38.73 (MIN: 33.81 / MAX: 362.17)
nv 4090: 38.58 (MIN: 33.77 / MAX: 476.18)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.28, N = 15
g: 13.14 (MIN: 13 / MAX: 14.02)
i: 13.77 (MIN: 12.96 / MAX: 14.66)
4080: 13.86 (MIN: 13.04 / MAX: 15.04)
4080 rep: 13.55 (MIN: 12.72 / MAX: 15.51)
4080 xxx: 13.62 (MIN: 12.71 / MAX: 15.65)
4080 zzz: 13.61 (MIN: 12.67 / MAX: 19.72)
3090: 13.10 (MIN: 13.01 / MAX: 14.17)
3090 rep: 12.86 (MIN: 12.76 / MAX: 13.73)
3070: 29.80 (MIN: 12.85 / MAX: 216.34)
RTX 3070 Ti: 15.21 (MIN: 12.34 / MAX: 380.51)
4090: 15.55 (MIN: 13.11 / MAX: 307.2)
4090 rep: 15.45 (MIN: 12.65 / MAX: 445.76)
nv 4090: 17.30 (MIN: 14.66 / MAX: 441.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.24, N = 15
3090: 8.01 (MIN: 7.96 / MAX: 8.47)
3090 rep: 8.05 (MIN: 7.98 / MAX: 8.94)
3070: 18.54 (MIN: 8.01 / MAX: 164.45)
RTX 3070 Ti: 9.98 (MIN: 7.79 / MAX: 434.9)
4090: 8.46 (MIN: 8.12 / MAX: 10.14)
4090 rep: 8.83 (MIN: 8.29 / MAX: 10.15)
nv 4090: 10.15 (MIN: 8.08 / MAX: 193.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: yolov4-tiny (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.23, N = 15
g: 12.89 (MIN: 12.65 / MAX: 27.99)
b: 12.77 (MIN: 12.69 / MAX: 13.71)
c: 12.86 (MIN: 12.76 / MAX: 13.98)
f: 14.34 (MIN: 14.23 / MAX: 15.12)
i: 15.43 (MIN: 13.1 / MAX: 210.2)
4080: 13.79 (MIN: 12.79 / MAX: 15.92)
4080 rep: 13.68 (MIN: 12.77 / MAX: 15.57)
4080 xxx: 13.95 (MIN: 13.03 / MAX: 15.9)
4080 zzz: 13.62 (MIN: 12.75 / MAX: 15.79)
3090: 12.88 (MIN: 12.75 / MAX: 13.79)
3090 rep: 12.84 (MIN: 12.76 / MAX: 13.7)
3070: 29.49 (MIN: 13.03 / MAX: 182.99)
RTX 3070 Ti: 15.44 (MIN: 12.61 / MAX: 387.62)
4090: 15.44 (MIN: 12.92 / MAX: 211.43)
4090 rep: 16.39 (MIN: 12.97 / MAX: 369.64)
nv 4090: 16.61 (MIN: 12.32 / MAX: 375.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.18, N = 15
g: 3.14 (MIN: 3.09 / MAX: 3.61)
b: 3.15 (MIN: 3.11 / MAX: 3.88)
c: 3.15 (MIN: 3.11 / MAX: 3.85)
f: 3.16 (MIN: 3.1 / MAX: 3.71)
i: 3.28 (MIN: 3.09 / MAX: 5.28)
4080: 3.29 (MIN: 3.12 / MAX: 4.64)
4080 rep: 3.30 (MIN: 3.12 / MAX: 4.7)
4080 xxx: 3.40 (MIN: 3.23 / MAX: 4.8)
4080 zzz: 3.23 (MIN: 3.06 / MAX: 4.66)
3090: 3.16 (MIN: 3.11 / MAX: 3.51)
3090 rep: 3.17 (MIN: 3.12 / MAX: 4.03)
3070: 7.24 (MIN: 3.04 / MAX: 261.68)
RTX 3070 Ti: 3.66 (MIN: 3.01 / MAX: 437.59)
4090: 4.99 (MIN: 3.1 / MAX: 201.8)
4090 rep: 3.31 (MIN: 3.12 / MAX: 4.6)
nv 4090: 4.45 (MIN: 2.65 / MAX: 216.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet18 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.05, N = 3
3090 rep: 5.27 (MIN: 5.15 / MAX: 6.11)
3070: 12.13 (MIN: 5.32 / MAX: 123.4)
RTX 3070 Ti: 5.94 (MIN: 5.32 / MAX: 8.32)
4090: 6.96 (MIN: 5.3 / MAX: 242.18)
4090 rep: 5.87 (MIN: 5.41 / MAX: 7.58)
nv 4090: 5.58 (MIN: 5.09 / MAX: 6.98)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.04, N = 3
3090 rep: 3.19 (MIN: 3.13 / MAX: 3.61)
3070: 7.34 (MIN: 3.09 / MAX: 155.33)
RTX 3070 Ti: 3.24 (MIN: 3.05 / MAX: 5.14)
4090: 3.36 (MIN: 3.22 / MAX: 4.62)
4090 rep: 3.35 (MIN: 3.22 / MAX: 3.99)
nv 4090: 4.96 (MIN: 3.14 / MAX: 189.43)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: squeezenet_ssd (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.19, N = 15
g: 7.19 (MIN: 6.99 / MAX: 23.11)
b: 7.14 (MIN: 7.06 / MAX: 7.95)
c: 7.07 (MIN: 7.01 / MAX: 7.75)
f: 7.09 (MIN: 6.98 / MAX: 8.01)
i: 8.33 (MIN: 6.32 / MAX: 222.03)
4080: 7.66 (MIN: 7.02 / MAX: 9.08)
4080 rep: 7.63 (MIN: 7.02 / MAX: 9.71)
4080 xxx: 7.70 (MIN: 7.11 / MAX: 9.19)
4080 zzz: 7.51 (MIN: 6.94 / MAX: 9.51)
3090: 7.04 (MIN: 6.97 / MAX: 7.76)
3090 rep: 7.08 (MIN: 7.01 / MAX: 7.93)
3070: 16.15 (MIN: 7.25 / MAX: 210.69)
RTX 3070 Ti: 8.29 (MIN: 6.37 / MAX: 448.22)
4090: 7.93 (MIN: 7.31 / MAX: 9.45)
4090 rep: 7.81 (MIN: 7.24 / MAX: 9.04)
nv 4090: 7.72 (MIN: 7.12 / MAX: 23.25)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: alexnet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.23, N = 15
g: 4.35 (MIN: 4.26 / MAX: 5.85)
b: 4.42 (MIN: 4.32 / MAX: 5.1)
c: 4.30 (MIN: 4.26 / MAX: 5.16)
f: 4.35 (MIN: 4.27 / MAX: 5.16)
i: 4.99 (MIN: 4.59 / MAX: 6.56)
4080: 4.69 (MIN: 4.26 / MAX: 6.15)
4080 rep: 4.69 (MIN: 4.26 / MAX: 7.17)
4080 xxx: 5.21 (MIN: 4.79 / MAX: 6.66)
4080 zzz: 4.65 (MIN: 4.26 / MAX: 5.97)
3090: 4.30 (MIN: 4.25 / MAX: 5.08)
3090 rep: 4.30 (MIN: 4.25 / MAX: 4.7)
3070: 9.86 (MIN: 4.25 / MAX: 157.02)
RTX 3070 Ti: 5.67 (MIN: 4.21 / MAX: 365.75)
4090: 6.11 (MIN: 4.73 / MAX: 81.72)
4090 rep: 6.58 (MIN: 4.61 / MAX: 91.07)
nv 4090: 6.11 (MIN: 4.83 / MAX: 124.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: yolov4-tiny (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.14, N = 15
b: 12.98 (MIN: 12.73 / MAX: 35.55)
c: 12.89 (MIN: 12.84 / MAX: 13.19)
f: 13.07 (MIN: 12.95 / MAX: 14.55)
g: 13.08 (MIN: 12.96 / MAX: 13.83)
i: 15.11 (MIN: 12.93 / MAX: 151.45)
4080: 13.81 (MIN: 12.84 / MAX: 15.1)
4080 rep: 13.55 (MIN: 12.75 / MAX: 14.74)
4080 xxx: 13.69 (MIN: 12.73 / MAX: 15.68)
4080 zzz: 13.83 (MIN: 12.89 / MAX: 15.4)
3090: 12.87 (MIN: 12.75 / MAX: 13.58)
3090 rep: 12.82 (MIN: 12.72 / MAX: 13.48)
3070: 29.34 (MIN: 12.17 / MAX: 245.34)
RTX 3070 Ti: 15.42 (MIN: 12.21 / MAX: 414.81)
4090: 15.30 (MIN: 12.87 / MAX: 144.73)
4090 rep: 15.34 (MIN: 12.94 / MAX: 157.95)
nv 4090: 15.62 (MIN: 12.99 / MAX: 184)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: mobilenet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.27, N = 15
4090: 8.96 (MIN: 8.37 / MAX: 11.12)
g: 8.50 (MIN: 8.42 / MAX: 9.29)
i: 10.02 (MIN: 8.07 / MAX: 266.25)
4080: 8.84 (MIN: 8.31 / MAX: 10.98)
4080 rep: 8.40 (MIN: 7.93 / MAX: 15.25)
4080 xxx: 8.46 (MIN: 7.95 / MAX: 10.34)
4080 zzz: 8.46 (MIN: 7.97 / MAX: 10.56)
3090: 8.11 (MIN: 8.02 / MAX: 14.2)
3090 rep: 8.04 (MIN: 7.96 / MAX: 9.01)
3070: 18.39 (MIN: 7.92 / MAX: 173.39)
RTX 3070 Ti: 9.62 (MIN: 7.71 / MAX: 449.11)
4090 rep: 8.37 (MIN: 7.98 / MAX: 10.71)
nv 4090: 9.41 (MIN: 8.98 / MAX: 11.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.53, N = 3
4080 rep: 3.30 (MIN: 3.11 / MAX: 4.01)
4080 xxx: 3.20 (MIN: 3.05 / MAX: 4.67)
4080 zzz: 3.16 (MIN: 3.01 / MAX: 5.17)
3090: 3.16 (MIN: 3.11 / MAX: 3.95)
3090 rep: 3.16 (MIN: 3.09 / MAX: 4.06)
3070: 7.22 (MIN: 3.17 / MAX: 69.66)
RTX 3070 Ti: 3.91 (MIN: 3.04 / MAX: 394.66)
4090: 3.36 (MIN: 3.21 / MAX: 4.78)
4090 rep: 3.44 (MIN: 3.27 / MAX: 4.93)
nv 4090: 3.29 (MIN: 3.13 / MAX: 4.29)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.94, N = 3
3090 rep: 12.92 (MIN: 12.79 / MAX: 18.5)
3070: 29.38 (MIN: 12.95 / MAX: 201.31)
RTX 3070 Ti: 14.64 (MIN: 12.77 / MAX: 383.28)
4090: 16.05 (MIN: 12.93 / MAX: 474.03)
4090 rep: 15.41 (MIN: 12.75 / MAX: 226.87)
nv 4090: 16.30 (MIN: 14.11 / MAX: 184.46)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.29, N = 3
4080 rep: 8.72 (MIN: 8.32 / MAX: 10.48)
4080 xxx: 8.38 (MIN: 8.04 / MAX: 9.63)
4080 zzz: 8.34 (MIN: 8.03 / MAX: 10.23)
3090: 8.22 (MIN: 8.14 / MAX: 8.67)
3090 rep: 8.03 (MIN: 7.97 / MAX: 8.65)
3070: 18.25 (MIN: 7.8 / MAX: 238.29)
RTX 3070 Ti: 8.42 (MIN: 7.66 / MAX: 10.74)
4090: 10.09 (MIN: 8.01 / MAX: 418.58)
4090 rep: 10.23 (MIN: 8.22 / MAX: 197.1)
nv 4090: 8.25 (MIN: 7.87 / MAX: 10.07)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vgg16 (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.28, N = 3
3090 rep: 23.72 (MIN: 23.56 / MAX: 24.59)
3070: 53.48 (MIN: 25.52 / MAX: 296.52)
RTX 3070 Ti: 27.86 (MIN: 24.17 / MAX: 416.36)
4090: 27.32 (MIN: 24.36 / MAX: 262.38)
4090 rep: 29.85 (MIN: 24.25 / MAX: 400.86)
nv 4090: 27.25 (MIN: 24.12 / MAX: 252.53)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.26, N = 15
a: 7.09 (MIN: 6.98 / MAX: 7.95)
b: 7.07 (MIN: 7 / MAX: 8.07)
c: 7.06 (MIN: 7 / MAX: 8.03)
d: 7.08 (MIN: 6.97 / MAX: 7.99)
e: 7.05 (MIN: 6.95 / MAX: 8)
f: 7.08 (MIN: 6.98 / MAX: 8.07)
g: 7.10 (MIN: 6.99 / MAX: 8.59)
i: 7.46 (MIN: 6.9 / MAX: 8.9)
4080: 7.58 (MIN: 6.98 / MAX: 9.05)
4080 rep: 7.64 (MIN: 7.05 / MAX: 9.12)
4080 xxx: 7.62 (MIN: 7.01 / MAX: 9.28)
4080 zzz: 7.63 (MIN: 7 / MAX: 9.17)
3090: 7.16 (MIN: 7.05 / MAX: 13.55)
3090 rep: 7.06 (MIN: 7 / MAX: 7.82)
3070: 15.82 (MIN: 6.99 / MAX: 82.57)
RTX 3070 Ti: 8.47 (MIN: 6.29 / MAX: 533.92)
4090: 7.83 (MIN: 7.21 / MAX: 9.32)
4090 rep: 8.22 (MIN: 7.56 / MAX: 9.8)
nv 4090: 9.11 (MIN: 6.35 / MAX: 130.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.21, N = 15
4090: 10.10 (MIN: 7.93 / MAX: 156.75)
g: 8.38 (MIN: 8.05 / MAX: 27.34)
i: 8.46 (MIN: 8.08 / MAX: 10.33)
4080: 8.61 (MIN: 8.21 / MAX: 10.07)
4080 rep: 8.24 (MIN: 7.91 / MAX: 9.53)
4080 xxx: 8.52 (MIN: 8.13 / MAX: 9.73)
4080 zzz: 8.49 (MIN: 8.08 / MAX: 9.72)
3090: 7.99 (MIN: 7.92 / MAX: 8.78)
3090 rep: 8.34 (MIN: 8.26 / MAX: 9.09)
3070: 17.88 (MIN: 7.38 / MAX: 190.77)
RTX 3070 Ti: 9.19 (MIN: 7.44 / MAX: 524.66)
4090 rep: 8.64 (MIN: 8.3 / MAX: 10.51)
nv 4090: 10.09 (MIN: 7.84 / MAX: 366.66)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: alexnet (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.11, N = 3; SE +/- 0.01, N = 3; SE +/- 0.21, N = 14
4090 rep: 5.23 (MIN: 4.78 / MAX: 7.33)
a: 4.41 (MIN: 4.24 / MAX: 5.16)
b: 4.32 (MIN: 4.26 / MAX: 5.15)
c: 4.31 (MIN: 4.26 / MAX: 4.98)
d: 4.30 (MIN: 4.23 / MAX: 5.32)
f: 4.36 (MIN: 4.29 / MAX: 5.7)
g: 4.32 (MIN: 4.25 / MAX: 5.17)
i: 5.30 (MIN: 4.92 / MAX: 7.18)
4080: 4.75 (MIN: 4.31 / MAX: 13.88)
4080 rep: 4.67 (MIN: 4.27 / MAX: 5.88)
4080 xxx: 4.68 (MIN: 4.28 / MAX: 6.37)
4080 zzz: 4.69 (MIN: 4.29 / MAX: 5.78)
3090: 4.30 (MIN: 4.25 / MAX: 4.83)
3090 rep: 4.31 (MIN: 4.26 / MAX: 5.18)
3070: 9.62 (MIN: 4.31 / MAX: 147.6)
RTX 3070 Ti: 5.49 (MIN: 4.26 / MAX: 363.39)
4090: 5.14 (MIN: 4.73 / MAX: 6.32)
nv 4090: 5.18 (MIN: 4.75 / MAX: 7.12)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (OpenBenchmarking.org; ms, Fewer Is Better)
SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.04, N = 3; SE +/- 0.18, N = 15
a: 12.84 (MIN: 12.69 / MAX: 15.33)
b: 12.87 (MIN: 12.76 / MAX: 13.73)
c: 12.81 (MIN: 12.73 / MAX: 13.08)
d: 12.85 (MIN: 12.72 / MAX: 13.93)
e: 12.87 (MIN: 12.68 / MAX: 13.84)
f: 13.17 (MIN: 13.03 / MAX: 14.1)
g: 13.64 (MIN: 13.04 / MAX: 76.32)
i: 15.16 (MIN: 12.86 / MAX: 248.64)
4080: 13.79 (MIN: 12.75 / MAX: 19.63)
4080 rep: 13.67 (MIN: 12.71 / MAX: 14.88)
4080 xxx: 13.65 (MIN: 12.71 / MAX: 14.99)
4080 zzz: 13.80 (MIN: 12.76 / MAX: 15.76)
3090: 12.88 (MIN: 12.76 / MAX: 13.67)
3090 rep: 12.83 (MIN: 12.74 / MAX: 13.59)
3070: 28.59 (MIN: 12.87 / MAX: 325.37)
RTX 3070 Ti: 15.54 (MIN: 12.15 / MAX: 492.01)
4090: 13.97 (MIN: 13.11 / MAX: 16.15)
4090 rep: 15.72 (MIN: 13.2 / MAX: 301.81)
nv 4090: 15.26 (MIN: 12.87 / MAX: 132.82)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
  b: 8.27 (MIN: 8.22 / MAX: 9.01)
  c: 7.98 (MIN: 7.93 / MAX: 8.65)
  f: 8.50 (MIN: 8.04 / MAX: 30.12)
  g: 7.99 (MIN: 7.91 / MAX: 8.8)
  i: 7.99 (MIN: 7.62 / MAX: 9.27)
  4080: 8.45 (MIN: 8.05 / MAX: 10.3)
  4080 rep: 8.44 (MIN: 8.04 / MAX: 10.17)
  4080 xxx: 8.58 (MIN: 8.23 / MAX: 10.39)
  4080 zzz: 8.10 (MIN: 7.77 / MAX: 15.42)
  3090: 8.01 (MIN: 7.93 / MAX: 8.35)
  3090 rep: 8.25 (MIN: 8.12 / MAX: 14)
  3070: 17.23 (MIN: 7.8 / MAX: 193.14)
  RTX 3070 Ti: 8.89 (MIN: 7.74 / MAX: 476.28)
  4090: 9.60 (MIN: 7.66 / MAX: 210.23)
  4090 rep: 10.34 (MIN: 8.21 / MAX: 214.16)
  nv 4090: 7.73 (MIN: 7.43 / MAX: 9.41)
  SE +/- 0.25, N = 14
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  4090 rep: 8.78 (MIN: 8.45 / MAX: 10.05)
  a: 8.18 (MIN: 8.07 / MAX: 9.68)
  b: 8.18 (MIN: 8.12 / MAX: 8.86)
  c: 8.27 (MIN: 8.22 / MAX: 9.18)
  d: 8.23 (MIN: 8.03 / MAX: 8.9)
  f: 8.08 (MIN: 7.98 / MAX: 10.87)
  g: 8.30 (MIN: 8.22 / MAX: 9.1)
  i: 9.94 (MIN: 7.43 / MAX: 166.02)
  4080: 8.24 (MIN: 7.89 / MAX: 9.52)
  4080 rep: 8.67 (MIN: 8.22 / MAX: 15.29)
  4080 xxx: 8.45 (MIN: 8.12 / MAX: 9.68)
  4080 zzz: 8.37 (MIN: 8.05 / MAX: 10.19)
  3090: 8.25 (MIN: 8.17 / MAX: 8.9)
  3090 rep: 8.24 (MIN: 8.17 / MAX: 8.84)
  3070: 18.00 (MIN: 7.91 / MAX: 176.28)
  RTX 3070 Ti: 8.83 (MIN: 7.65 / MAX: 351.08)
  4090: 8.13 (MIN: 7.78 / MAX: 9.98)
  nv 4090: 10.17 (MIN: 8.12 / MAX: 209.53)
  SE +/- 0.04, N = 3; SE +/- 0.06, N = 3; SE +/- 0.19, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
  b: 8.01 (MIN: 7.95 / MAX: 8.95)
  c: 8.00 (MIN: 7.96 / MAX: 8.63)
  f: 8.56 (MIN: 8.04 / MAX: 75.44)
  g: 8.98 (MIN: 8.1 / MAX: 124.43)
  i: 10.08 (MIN: 8.08 / MAX: 286.28)
  4080: 8.43 (MIN: 7.99 / MAX: 10.44)
  4080 rep: 8.48 (MIN: 7.96 / MAX: 10.32)
  4080 xxx: 8.44 (MIN: 7.97 / MAX: 10.71)
  4080 zzz: 8.40 (MIN: 8.12 / MAX: 10.11)
  3090: 8.06 (MIN: 7.94 / MAX: 13.92)
  3090 rep: 8.03 (MIN: 7.98 / MAX: 8.77)
  3070: 17.82 (MIN: 7.57 / MAX: 211.62)
  RTX 3070 Ti: 9.62 (MIN: 7.76 / MAX: 454.91)
  4090: 8.96 (MIN: 8.39 / MAX: 10.77)
  4090 rep: 8.74 (MIN: 8.25 / MAX: 10.5)
  nv 4090: 8.91 (MIN: 8.33 / MAX: 10.07)
  SE +/- 0.23, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
  4080: 13.93 (MIN: 13.08 / MAX: 15.68)
  4080 rep: 13.73 (MIN: 12.78 / MAX: 20.99)
  4080 xxx: 13.52 (MIN: 12.72 / MAX: 21.19)
  4080 zzz: 15.26 (MIN: 14.19 / MAX: 17.06)
  3090: 12.97 (MIN: 12.83 / MAX: 13.8)
  3090 rep: 12.90 (MIN: 12.77 / MAX: 13.92)
  3070: 28.73 (MIN: 12.83 / MAX: 264.49)
  RTX 3070 Ti: 15.00 (MIN: 12.75 / MAX: 401.37)
  4090: 15.95 (MIN: 13.38 / MAX: 245.18)
  4090 rep: 15.40 (MIN: 13 / MAX: 245.79)
  nv 4090: 15.55 (MIN: 12.87 / MAX: 342.3)
  SE +/- 0.19, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
  3090: 8.20 (MIN: 8.14 / MAX: 8.74)
  3090 rep: 8.19 (MIN: 8.12 / MAX: 8.98)
  3070: 18.24 (MIN: 7.5 / MAX: 201.09)
  RTX 3070 Ti: 9.10 (MIN: 7.61 / MAX: 454.62)
  4090: 10.05 (MIN: 8.13 / MAX: 173.18)
  4090 rep: 8.45 (MIN: 8.05 / MAX: 12.64)
  nv 4090: 8.37 (MIN: 8.08 / MAX: 10.1)
  SE +/- 0.20, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
  a: 31.88 (MIN: 31.55 / MAX: 37.47)
  b: 31.85 (MIN: 31.69 / MAX: 33.06)
  c: 31.79 (MIN: 31.63 / MAX: 35.57)
  d: 32.12 (MIN: 31.66 / MAX: 46.9)
  e: 31.93 (MIN: 31.62 / MAX: 35.85)
  f: 32.92 (MIN: 32.67 / MAX: 36.93)
  g: 32.42 (MIN: 31.89 / MAX: 65.47)
  i: 36.42 (MIN: 33.49 / MAX: 224.86)
  4080: 35.56 (MIN: 33.19 / MAX: 40.43)
  4080 rep: 35.07 (MIN: 33.66 / MAX: 39.36)
  4080 xxx: 34.19 (MIN: 32.72 / MAX: 36.79)
  4080 zzz: 34.32 (MIN: 32.58 / MAX: 41.88)
  3090: 31.94 (MIN: 31.73 / MAX: 34.21)
  3090 rep: 31.91 (MIN: 31.74 / MAX: 34.28)
  3070: 70.76 (MIN: 38.81 / MAX: 250.01)
  RTX 3070 Ti: 38.03 (MIN: 32.66 / MAX: 467.28)
  4090: 38.76 (MIN: 33.12 / MAX: 539.58)
  4090 rep: 37.59 (MIN: 34.45 / MAX: 457.98)
  nv 4090: 39.04 (MIN: 33.83 / MAX: 463.88)
  SE +/- 0.09, N = 3; SE +/- 0.21, N = 3; SE +/- 0.07, N = 3; SE +/- 0.16, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
  4080 rep: 13.71 (MIN: 12.78 / MAX: 15.62)
  4080 xxx: 13.63 (MIN: 12.77 / MAX: 16.93)
  4080 zzz: 13.42 (MIN: 12.65 / MAX: 16.19)
  3090: 12.82 (MIN: 12.72 / MAX: 13.66)
  3090 rep: 12.81 (MIN: 12.7 / MAX: 13.69)
  3070: 28.41 (MIN: 12.49 / MAX: 151.04)
  RTX 3070 Ti: 14.57 (MIN: 12.33 / MAX: 312.42)
  4090: 15.85 (MIN: 13.26 / MAX: 253.23)
  4090 rep: 13.88 (MIN: 13.09 / MAX: 14.77)
  nv 4090: 17.67 (MIN: 14.92 / MAX: 343.93)
  SE +/- 0.81, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  4090: 39.06 (MIN: 34.16 / MAX: 481.28)
  g: 33.32 (MIN: 31.83 / MAX: 104.12)
  i: 38.01 (MIN: 32.96 / MAX: 388.09)
  4080: 34.91 (MIN: 33.72 / MAX: 36.82)
  4080 rep: 34.10 (MIN: 32.43 / MAX: 38.75)
  4080 xxx: 34.23 (MIN: 33.08 / MAX: 37.43)
  4080 zzz: 34.10 (MIN: 32.32 / MAX: 38.54)
  3090: 33.22 (MIN: 33.04 / MAX: 36.99)
  3090 rep: 32.09 (MIN: 31.84 / MAX: 32.77)
  3070: 71.08 (MIN: 38.84 / MAX: 374.68)
  RTX 3070 Ti: 37.88 (MIN: 32.46 / MAX: 518.57)
  4090 rep: 38.17 (MIN: 32.97 / MAX: 462.63)
  nv 4090: 39.18 (MIN: 33.74 / MAX: 520.24)
  SE +/- 0.20, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
  4080 rep: 10.86 (MIN: 9.98 / MAX: 12.46)
  4080 xxx: 11.26 (MIN: 10.32 / MAX: 13.29)
  4080 zzz: 11.10 (MIN: 10.19 / MAX: 18.3)
  3090: 10.03 (MIN: 9.93 / MAX: 10.87)
  3090 rep: 10.06 (MIN: 9.86 / MAX: 11.9)
  3070: 22.19 (MIN: 10.16 / MAX: 181.74)
  RTX 3070 Ti: 12.81 (MIN: 10.06 / MAX: 349.03)
  4090: 11.72 (MIN: 10.8 / MAX: 12.8)
  4090 rep: 12.47 (MIN: 11.5 / MAX: 14.68)
  nv 4090: 13.29 (MIN: 10.54 / MAX: 456.82)
  SE +/- 0.04, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  3090 rep: 31.94 (MIN: 31.73 / MAX: 32.75)
  3070: 70.53 (MIN: 39.2 / MAX: 276.33)
  RTX 3070 Ti: 38.50 (MIN: 33.7 / MAX: 418.06)
  4090: 39.35 (MIN: 34.22 / MAX: 466.65)
  4090 rep: 38.65 (MIN: 33.07 / MAX: 476.08)
  nv 4090: 37.13 (MIN: 33.97 / MAX: 443.1)
  SE +/- 0.10, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  4080: 35.60 (MIN: 34.13 / MAX: 38.49)
  4080 rep: 33.93 (MIN: 32.77 / MAX: 36.2)
  4080 xxx: 33.90 (MIN: 32.72 / MAX: 37.77)
  4080 zzz: 35.36 (MIN: 33.87 / MAX: 42.41)
  3090: 31.94 (MIN: 31.72 / MAX: 34.34)
  3090 rep: 31.97 (MIN: 31.71 / MAX: 33.78)
  3070: 70.29 (MIN: 39.39 / MAX: 250.19)
  RTX 3070 Ti: 38.27 (MIN: 32.29 / MAX: 507.7)
  4090: 38.62 (MIN: 33.33 / MAX: 465)
  4090 rep: 39.03 (MIN: 33.61 / MAX: 343.67)
  nv 4090: 38.46 (MIN: 32.39 / MAX: 435.46)
  SE +/- 0.11, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
  4090: 4.24 (MIN: 3.96 / MAX: 5.55)
  g: 3.84 (MIN: 3.78 / MAX: 4.57)
  i: 4.19 (MIN: 4.01 / MAX: 5.09)
  4080: 4.06 (MIN: 3.85 / MAX: 4.97)
  4080 rep: 3.98 (MIN: 3.77 / MAX: 5.44)
  4080 xxx: 4.02 (MIN: 3.8 / MAX: 5.14)
  4080 zzz: 4.01 (MIN: 3.79 / MAX: 5.39)
  3090: 3.83 (MIN: 3.78 / MAX: 4.4)
  3090 rep: 3.87 (MIN: 3.81 / MAX: 4.62)
  3070: 8.41 (MIN: 3.76 / MAX: 67.73)
  RTX 3070 Ti: 4.60 (MIN: 3.79 / MAX: 336.2)
  4090 rep: 4.04 (MIN: 3.85 / MAX: 4.9)
  nv 4090: 4.12 (MIN: 3.86 / MAX: 5.39)
  SE +/- 0.19, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  3090: 7.05 (MIN: 6.98 / MAX: 7.81)
  3090 rep: 7.07 (MIN: 6.99 / MAX: 7.81)
  3070: 15.46 (MIN: 7.08 / MAX: 147.31)
  RTX 3070 Ti: 8.31 (MIN: 6.35 / MAX: 364.95)
  4090: 7.40 (MIN: 6.81 / MAX: 8.46)
  4090 rep: 9.34 (MIN: 6.88 / MAX: 268.7)
  nv 4090: 7.72 (MIN: 7.13 / MAX: 8.97)
  SE +/- 0.22, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  4090: 9.28 (MIN: 6.92 / MAX: 355.6)
  g: 7.14 (MIN: 7.03 / MAX: 7.99)
  i: 7.21 (MIN: 6.73 / MAX: 8.82)
  4080: 7.66 (MIN: 7.09 / MAX: 8.97)
  4080 rep: 7.55 (MIN: 6.99 / MAX: 9.08)
  4080 xxx: 7.67 (MIN: 7.04 / MAX: 9.1)
  4080 zzz: 7.62 (MIN: 7 / MAX: 9.93)
  3090: 7.04 (MIN: 6.96 / MAX: 7.7)
  3090 rep: 7.09 (MIN: 7.01 / MAX: 7.97)
  3070: 15.40 (MIN: 6.64 / MAX: 132.68)
  RTX 3070 Ti: 8.28 (MIN: 6.38 / MAX: 381.81)
  4090 rep: 9.16 (MIN: 6.73 / MAX: 423.75)
  nv 4090: 9.21 (MIN: 6.83 / MAX: 203.62)
  SE +/- 0.25, N = 14
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
  3090: 23.50 (MIN: 23.26 / MAX: 24.34)
  3090 rep: 23.47 (MIN: 23.25 / MAX: 24.24)
  3070: 51.28 (MIN: 24.83 / MAX: 242.12)
  RTX 3070 Ti: 28.53 (MIN: 24.21 / MAX: 515.3)
  4090: 27.31 (MIN: 24.27 / MAX: 230.86)
  4090 rep: 29.35 (MIN: 24.55 / MAX: 485.35)
  nv 4090: 29.40 (MIN: 26.17 / MAX: 411.51)
  SE +/- 0.26, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
  4080: 8.67 (MIN: 8.3 / MAX: 14.66)
  4080 rep: 8.35 (MIN: 8.05 / MAX: 9.76)
  4080 xxx: 8.25 (MIN: 7.93 / MAX: 9.88)
  4080 zzz: 8.37 (MIN: 8.04 / MAX: 10.13)
  3090: 8.33 (MIN: 8.25 / MAX: 9.32)
  3090 rep: 8.07 (MIN: 7.99 / MAX: 8.88)
  3070: 17.61 (MIN: 7.85 / MAX: 165.34)
  RTX 3070 Ti: 9.05 (MIN: 7.52 / MAX: 417.33)
  4090: 9.87 (MIN: 7.81 / MAX: 243.06)
  4090 rep: 10.69 (MIN: 8.17 / MAX: 339.6)
  nv 4090: 10.03 (MIN: 7.81 / MAX: 171.2)
  SE +/- 0.24, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
  4080: 5.92 (MIN: 5.37 / MAX: 8.24)
  4080 rep: 5.67 (MIN: 5.19 / MAX: 7.38)
  4080 xxx: 5.65 (MIN: 5.18 / MAX: 6.76)
  4080 zzz: 5.74 (MIN: 5.18 / MAX: 8.08)
  3090: 5.21 (MIN: 5.09 / MAX: 6.13)
  3090 rep: 5.20 (MIN: 5.1 / MAX: 6.09)
  3070: 11.30 (MIN: 5.3 / MAX: 181.7)
  RTX 3070 Ti: 6.40 (MIN: 5.1 / MAX: 457.07)
  4090: 7.52 (MIN: 5.45 / MAX: 290.49)
  4090 rep: 5.90 (MIN: 5.43 / MAX: 7.49)
  nv 4090: 7.38 (MIN: 5.15 / MAX: 138.85)
  SE +/- 0.20, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
  b: 7.84 (MIN: 7.74 / MAX: 8.7)
  c: 7.88 (MIN: 7.79 / MAX: 8.78)
  f: 8.07 (MIN: 7.92 / MAX: 8.86)
  g: 9.15 (MIN: 7.84 / MAX: 198.46)
  i: 10.17 (MIN: 7.94 / MAX: 150.01)
  4080: 8.45 (MIN: 7.79 / MAX: 10.32)
  4080 rep: 8.52 (MIN: 7.81 / MAX: 10.78)
  4080 xxx: 8.50 (MIN: 7.79 / MAX: 9.94)
  4080 zzz: 8.55 (MIN: 7.85 / MAX: 10.35)
  3090: 7.83 (MIN: 7.71 / MAX: 8.8)
  3090 rep: 7.85 (MIN: 7.75 / MAX: 8.69)
  3070: 17.00 (MIN: 7.35 / MAX: 277.79)
  RTX 3070 Ti: 9.65 (MIN: 7.59 / MAX: 472.81)
  4090: 9.97 (MIN: 7.67 / MAX: 258.52)
  4090 rep: 9.29 (MIN: 7.98 / MAX: 83.03)
  nv 4090: 9.02 (MIN: 8.41 / MAX: 11.08)
  SE +/- 0.24, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  4090 rep: 16.79 (MIN: 14.1 / MAX: 273.41)
  a: 12.90 (MIN: 12.69 / MAX: 15.88)
  b: 12.74 (MIN: 12.66 / MAX: 13.28)
  c: 12.81 (MIN: 12.74 / MAX: 13.2)
  d: 12.95 (MIN: 12.75 / MAX: 18.88)
  f: 13.32 (MIN: 12.95 / MAX: 35.49)
  g: 17.23 (MIN: 12.99 / MAX: 196.66)
  i: 14.65 (MIN: 12.44 / MAX: 202.68)
  4080: 13.85 (MIN: 12.84 / MAX: 16.75)
  4080 rep: 14.03 (MIN: 13.15 / MAX: 15.97)
  4080 xxx: 13.60 (MIN: 12.8 / MAX: 16.23)
  4080 zzz: 13.63 (MIN: 12.77 / MAX: 15.36)
  3090: 14.26 (MIN: 14.17 / MAX: 14.53)
  3090 rep: 12.77 (MIN: 12.7 / MAX: 13.02)
  3070: 27.66 (MIN: 12.74 / MAX: 294.9)
  RTX 3070 Ti: 15.20 (MIN: 12.69 / MAX: 431.37)
  4090: 13.68 (MIN: 12.83 / MAX: 14.63)
  nv 4090: 15.40 (MIN: 12.35 / MAX: 321.43)
  SE +/- 0.11, N = 3; SE +/- 0.05, N = 3; SE +/- 0.18, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  3090 rep: 7.07 (MIN: 6.98 / MAX: 9.71)
  3070: 15.32 (MIN: 6.66 / MAX: 139.17)
  RTX 3070 Ti: 7.45 (MIN: 6.59 / MAX: 9.11)
  4090: 7.43 (MIN: 6.84 / MAX: 8.82)
  4090 rep: 9.44 (MIN: 7.17 / MAX: 94.63)
  nv 4090: 8.26 (MIN: 7.64 / MAX: 11.08)
  SE +/- 0.14, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
  4080: 8.79 (MIN: 8.08 / MAX: 10.27)
  4080 rep: 8.42 (MIN: 7.77 / MAX: 10.52)
  4080 xxx: 8.26 (MIN: 7.62 / MAX: 10.47)
  4080 zzz: 8.55 (MIN: 7.86 / MAX: 10.08)
  3090: 7.84 (MIN: 7.74 / MAX: 8.72)
  3090 rep: 7.86 (MIN: 7.75 / MAX: 8.71)
  3070: 16.97 (MIN: 7.44 / MAX: 229.93)
  RTX 3070 Ti: 9.90 (MIN: 7.76 / MAX: 396.66)
  4090: 9.05 (MIN: 8.26 / MAX: 13.34)
  4090 rep: 10.39 (MIN: 7.87 / MAX: 391.66)
  nv 4090: 8.35 (MIN: 7.7 / MAX: 10.46)
  SE +/- 0.19, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  3090: 32.16 (MIN: 31.94 / MAX: 33.7)
  3090 rep: 32.13 (MIN: 31.95 / MAX: 32.87)
  3070: 69.48 (MIN: 39.08 / MAX: 374.31)
  RTX 3070 Ti: 38.32 (MIN: 32.26 / MAX: 477.15)
  4090: 38.82 (MIN: 33.83 / MAX: 435.6)
  4090 rep: 38.69 (MIN: 33.32 / MAX: 390.07)
  nv 4090: 38.58 (MIN: 33.06 / MAX: 464.16)
  SE +/- 0.12, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  4090 rep: 13.73 (MIN: 10.4 / MAX: 137.78)
  a: 10.20 (MIN: 9.84 / MAX: 12.48)
  b: 10.01 (MIN: 9.85 / MAX: 11.06)
  c: 10.11 (MIN: 9.95 / MAX: 16.18)
  d: 10.00 (MIN: 9.86 / MAX: 11.02)
  f: 11.05 (MIN: 10.14 / MAX: 162.88)
  g: 10.72 (MIN: 10.1 / MAX: 108.3)
  i: 12.09 (MIN: 11.16 / MAX: 13.48)
  4080: 11.16 (MIN: 10.29 / MAX: 15.03)
  4080 rep: 11.76 (MIN: 10.68 / MAX: 44.94)
  4080 xxx: 10.82 (MIN: 9.9 / MAX: 12.26)
  4080 zzz: 11.10 (MIN: 10.2 / MAX: 13.06)
  3090: 10.30 (MIN: 9.82 / MAX: 17.56)
  3090 rep: 9.95 (MIN: 9.85 / MAX: 10.72)
  3070: 21.50 (MIN: 10.24 / MAX: 116.85)
  RTX 3070 Ti: 12.73 (MIN: 10.18 / MAX: 541.92)
  4090: 14.10 (MIN: 10.27 / MAX: 287)
  nv 4090: 11.41 (MIN: 10.57 / MAX: 12.22)
  SE +/- 0.23, N = 3; SE +/- 0.01, N = 3; SE +/- 0.26, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
  3090 rep: 10.27 (MIN: 10.12 / MAX: 11.19)
  3070: 22.15 (MIN: 10.11 / MAX: 123.04)
  RTX 3070 Ti: 13.15 (MIN: 10.26 / MAX: 349.93)
  4090: 13.00 (MIN: 10.34 / MAX: 397.57)
  4090 rep: 10.96 (MIN: 10.09 / MAX: 12.99)
  nv 4090: 13.25 (MIN: 10.61 / MAX: 154.12)
  SE +/- 0.30, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
  4080 rep: 25.01 (MIN: 23.88 / MAX: 26.66)
  4080 xxx: 25.44 (MIN: 24.27 / MAX: 27.68)
  4080 zzz: 25.26 (MIN: 24.29 / MAX: 27.75)
  3090: 23.58 (MIN: 23.35 / MAX: 24.43)
  3090 rep: 23.40 (MIN: 23.2 / MAX: 24.07)
  3070: 50.32 (MIN: 25.92 / MAX: 281.06)
  RTX 3070 Ti: 27.98 (MIN: 24.35 / MAX: 423.63)
  4090: 30.16 (MIN: 24.66 / MAX: 332.49)
  4090 rep: 30.74 (MIN: 25.36 / MAX: 428.68)
  nv 4090: 27.61 (MIN: 24.67 / MAX: 401.29)
  SE +/- 0.54, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
  b: 5.21 (MIN: 5.12 / MAX: 6.22)
  c: 5.26 (MIN: 5.18 / MAX: 6.27)
  f: 6.13 (MIN: 5.41 / MAX: 151.51)
  g: 5.48 (MIN: 5.37 / MAX: 6.51)
  i: 5.86 (MIN: 5.35 / MAX: 7.79)
  4080: 5.70 (MIN: 5.15 / MAX: 7.9)
  4080 rep: 5.63 (MIN: 5.09 / MAX: 7.75)
  4080 xxx: 5.66 (MIN: 5.14 / MAX: 7.49)
  4080 zzz: 5.71 (MIN: 5.12 / MAX: 8.19)
  3090: 5.19 (MIN: 5.09 / MAX: 6.13)
  3090 rep: 5.20 (MIN: 5.1 / MAX: 5.97)
  3070: 11.14 (MIN: 4.79 / MAX: 65.12)
  RTX 3070 Ti: 6.18 (MIN: 5.17 / MAX: 262.79)
  4090: 5.81 (MIN: 5.27 / MAX: 7.16)
  4090 rep: 7.75 (MIN: 5.57 / MAX: 125.43)
  nv 4090: 7.61 (MIN: 5.23 / MAX: 90.18)
  SE +/- 0.19, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
  a: 4.10 (MIN: 4.06 / MAX: 4.81)
  b: 4.07 (MIN: 4.04 / MAX: 4.53)
  c: 4.09 (MIN: 4.05 / MAX: 5.5)
  d: 4.11 (MIN: 4.01 / MAX: 9.72)
  e: 4.08 (MIN: 4.03 / MAX: 5.29)
  f: 4.24 (MIN: 3.88 / MAX: 24.21)
  g: 4.07 (MIN: 4.02 / MAX: 4.82)
  i: 5.14 (MIN: 3.7 / MAX: 81.79)
  4080: 4.20 (MIN: 4.02 / MAX: 4.97)
  4080 rep: 4.21 (MIN: 4.04 / MAX: 4.97)
  4080 xxx: 4.20 (MIN: 4.03 / MAX: 6.49)
  4080 zzz: 4.79 (MIN: 4.64 / MAX: 6.21)
  3090: 4.11 (MIN: 4.07 / MAX: 4.29)
  3090 rep: 4.08 (MIN: 4.04 / MAX: 4.35)
  3070: 8.41 (MIN: 2.89 / MAX: 487.78)
  RTX 3070 Ti: 4.26 (MIN: 2.5 / MAX: 396.93)
  4090: 4.39 (MIN: 4.25 / MAX: 5.86)
  4090 rep: 4.59 (MIN: 2.62 / MAX: 232.18)
  nv 4090: 3.93 (MIN: 3.8 / MAX: 5.4)
  SE +/- 0.00, N = 3; SE +/- 0.02, N = 3; SE +/- 0.00, N = 3; SE +/- 0.29, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  a: 3.35 (MIN: 3.29 / MAX: 3.85)
  b: 3.33 (MIN: 3.3 / MAX: 3.59)
  c: 3.32 (MIN: 3.29 / MAX: 4.19)
  d: 3.35 (MIN: 3.3 / MAX: 3.82)
  e: 3.33 (MIN: 3.28 / MAX: 4.14)
  f: 3.40 (MIN: 3.35 / MAX: 5.89)
  g: 3.35 (MIN: 3.3 / MAX: 4.02)
  i: 3.49 (MIN: 3.35 / MAX: 4.24)
  4080: 3.43 (MIN: 3.3 / MAX: 4.22)
  4080 rep: 3.44 (MIN: 3.3 / MAX: 5.36)
  4080 xxx: 3.50 (MIN: 3.37 / MAX: 4.85)
  4080 zzz: 3.46 (MIN: 3.32 / MAX: 5.24)
  3090: 3.39 (MIN: 3.35 / MAX: 3.69)
  3090 rep: 3.36 (MIN: 3.32 / MAX: 4.06)
  3070: 7.07 (MIN: 3.25 / MAX: 243.32)
  RTX 3070 Ti: 3.98 (MIN: 3.14 / MAX: 529.82)
  4090: 3.45 (MIN: 3.32 / MAX: 3.99)
  4090 rep: 3.49 (MIN: 3.36 / MAX: 4.33)
  nv 4090: 3.51 (MIN: 3.37 / MAX: 4)
  SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.20, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
  4080: 8.43 (MIN: 8.03 / MAX: 9.64)
  4080 rep: 8.38 (MIN: 7.94 / MAX: 10.07)
  4080 xxx: 8.31 (MIN: 7.85 / MAX: 10.21)
  4080 zzz: 9.19 (MIN: 8.51 / MAX: 11.04)
  3090: 8.03 (MIN: 7.96 / MAX: 8.83)
  3090 rep: 8.05 (MIN: 7.96 / MAX: 9.04)
  3070: 17.09 (MIN: 7.89 / MAX: 121.53)
  RTX 3070 Ti: 9.52 (MIN: 7.97 / MAX: 420.29)
  4090: 9.16 (MIN: 8.5 / MAX: 10.51)
  4090 rep: 9.02 (MIN: 8.42 / MAX: 11.17)
  nv 4090: 8.93 (MIN: 8.33 / MAX: 11.07)
  SE +/- 0.25, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
  4080: 25.67 (MIN: 24.46 / MAX: 27.34)
  4080 rep: 25.56 (MIN: 24.24 / MAX: 27.92)
  4080 xxx: 25.33 (MIN: 24.26 / MAX: 34.98)
  4080 zzz: 26.09 (MIN: 24.58 / MAX: 30.18)
  3090: 23.51 (MIN: 23.27 / MAX: 24.38)
  3090 rep: 23.38 (MIN: 23.19 / MAX: 24.27)
  3070: 49.70 (MIN: 25.55 / MAX: 421.44)
  RTX 3070 Ti: 28.53 (MIN: 23.95 / MAX: 473.83)
  4090: 27.44 (MIN: 24.06 / MAX: 264.59)
  4090 rep: 29.17 (MIN: 24.61 / MAX: 264.85)
  nv 4090: 28.14 (MIN: 24.24 / MAX: 221.5)
  SE +/- 0.28, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
  3090 rep: 8.03 (MIN: 7.97 / MAX: 8.91)
  3070: 17.06 (MIN: 8 / MAX: 101.45)
  RTX 3070 Ti: 10.02 (MIN: 7.8 / MAX: 372.36)
  4090: 10.56 (MIN: 8.32 / MAX: 239.95)
  4090 rep: 8.22 (MIN: 7.75 / MAX: 9.41)
  nv 4090: 10.64 (MIN: 8.4 / MAX: 127.99)
  SE +/- 0.13, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
  b: 23.50 (MIN: 23.3 / MAX: 24.41)
  c: 23.99 (MIN: 23.72 / MAX: 24.98)
  f: 24.45 (MIN: 24.26 / MAX: 25.26)
  g: 24.92 (MIN: 24.58 / MAX: 31.89)
  i: 29.12 (MIN: 26.33 / MAX: 310.23)
  4080: 25.10 (MIN: 24.12 / MAX: 27.57)
  4080 rep: 24.91 (MIN: 23.8 / MAX: 26.87)
  4080 xxx: 25.00 (MIN: 23.91 / MAX: 27.99)
  4080 zzz: 25.45 (MIN: 24.22 / MAX: 27.73)
  3090: 23.50 (MIN: 23.17 / MAX: 24.44)
  3090 rep: 23.54 (MIN: 23.33 / MAX: 24.41)
  3070: 49.75 (MIN: 25.45 / MAX: 273.86)
  RTX 3070 Ti: 28.40 (MIN: 24.12 / MAX: 509.06)
  4090: 28.55 (MIN: 24.05 / MAX: 201.8)
  4090 rep: 27.25 (MIN: 24.14 / MAX: 379.93)
  nv 4090: 27.04 (MIN: 24.33 / MAX: 215.56)
  SE +/- 0.24, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
  3090 rep: 8.06 (MIN: 7.98 / MAX: 8.6)
  3070: 17.02 (MIN: 7.65 / MAX: 216.63)
  RTX 3070 Ti: 9.14 (MIN: 8.14 / MAX: 400.02)
  4090: 10.11 (MIN: 8.03 / MAX: 259.38)
  4090 rep: 8.70 (MIN: 8.29 / MAX: 12.6)
  nv 4090: 8.34 (MIN: 8.01 / MAX: 12.36)
  SE +/- 0.54, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  a: 3.17 (MIN: 3.11 / MAX: 3.73)
  c: 3.20 (MIN: 3.16 / MAX: 3.68)
  d: 3.17 (MIN: 3.1 / MAX: 3.83)
  e: 3.18 (MIN: 3.11 / MAX: 3.78)
  f: 3.14 (MIN: 3.09 / MAX: 3.54)
  i: 4.87 (MIN: 3.14 / MAX: 278.98)
  4080 zzz: 3.28 (MIN: 3.13 / MAX: 4.65)
  3090: 3.19 (MIN: 3.14 / MAX: 3.48)
  3090 rep: 3.16 (MIN: 3.11 / MAX: 3.62)
  3070: 6.60 (MIN: 2.98 / MAX: 166.19)
  RTX 3070 Ti: 3.64 (MIN: 2.87 / MAX: 429.02)
  4090: 3.28 (MIN: 3.15 / MAX: 3.9)
  4090 rep: 3.30 (MIN: 3.15 / MAX: 3.92)
  nv 4090: 3.36 (MIN: 3.21 / MAX: 4.3)
  SE +/- 0.02, N = 3; SE +/- 0.00, N = 2; SE +/- 0.02, N = 3; SE +/- 0.20, N = 14
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  4080 rep: 3.28 (MIN: 3.13 / MAX: 4.78)
  4080 xxx: 3.08 (MIN: 2.97 / MAX: 3.67)
  4080 zzz: 3.06 (MIN: 2.94 / MAX: 3.94)
  3090 rep: 3.17 (MIN: 3.12 / MAX: 3.75)
  3070: 6.43 (MIN: 2.85 / MAX: 164.91)
  RTX 3070 Ti: 3.70 (MIN: 2.98 / MAX: 261.6)
  4090: 3.33 (MIN: 3.2 / MAX: 4.4)
  4090 rep: 3.30 (MIN: 3.15 / MAX: 3.91)
  nv 4090: 4.97 (MIN: 3.15 / MAX: 291.01)
  SE +/- 0.53, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  4090: 3.34 (MIN: 3.21 / MAX: 4.31)
  g: 3.16 (MIN: 3.12 / MAX: 3.58)
  i: 3.26 (MIN: 3.11 / MAX: 4.7)
  4080 rep: 3.24 (MIN: 3.11 / MAX: 4.37)
  4080 xxx: 3.26 (MIN: 3.13 / MAX: 4.08)
  4080 zzz: 3.24 (MIN: 3.1 / MAX: 3.88)
  3090: 3.16 (MIN: 3.11 / MAX: 3.77)
  3070: 6.56 (MIN: 3.07 / MAX: 110.87)
  RTX 3070 Ti: 3.62 (MIN: 3 / MAX: 469.9)
  4090 rep: 3.33 (MIN: 3.19 / MAX: 4.79)
  nv 4090: 3.26 (MIN: 3.13 / MAX: 3.96)
  SE +/- 0.18, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
  4080: 3.10 (MIN: 2.95 / MAX: 4.05)
  4080 rep: 3.06 (MIN: 2.93 / MAX: 5.02)
  4080 xxx: 2.98 (MIN: 2.86 / MAX: 4.47)
  4080 zzz: 3.08 (MIN: 2.95 / MAX: 3.88)
  3090: 2.94 (MIN: 2.9 / MAX: 3.34)
  3090 rep: 2.97 (MIN: 2.94 / MAX: 3.45)
  3070: 6.06 (MIN: 2.96 / MAX: 42.7)
  RTX 3070 Ti: 3.10 (MIN: 2.61 / MAX: 4.75)
  4090: 4.93 (MIN: 2.97 / MAX: 124.96)
  4090 rep: 4.99 (MIN: 3.02 / MAX: 235.56)
  nv 4090: 3.07 (MIN: 2.93 / MAX: 4.52)
  SE +/- 0.04, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  a: 23.51 (MIN: 23.29 / MAX: 24.68)
  b: 23.56 (MIN: 23.34 / MAX: 24.72)
  c: 23.54 (MIN: 23.33 / MAX: 24.61)
  d: 23.56 (MIN: 23.24 / MAX: 24.78)
  e: 23.60 (MIN: 23.17 / MAX: 24.71)
  f: 24.55 (MIN: 23.62 / MAX: 97.69)
  g: 24.20 (MIN: 23.56 / MAX: 58.31)
  i: 27.83 (MIN: 24.98 / MAX: 262.23)
  4080: 25.00 (MIN: 23.93 / MAX: 26.69)
  4080 rep: 25.04 (MIN: 24.06 / MAX: 27.35)
  4080 xxx: 25.01 (MIN: 23.8 / MAX: 26.41)
  4080 zzz: 25.82 (MIN: 24.35 / MAX: 62.94)
  3090: 23.55 (MIN: 23.3 / MAX: 24.45)
  3090 rep: 23.48 (MIN: 23.24 / MAX: 29.21)
  3070: 48.29 (MIN: 24.97 / MAX: 183.12)
  RTX 3070 Ti: 29.06 (MIN: 24.11 / MAX: 541.55)
  4090: 28.82 (MIN: 24.35 / MAX: 214.1)
  4090 rep: 27.04 (MIN: 24.22 / MAX: 296.13)
  nv 4090: 29.29 (MIN: 24.63 / MAX: 296.95)
  SE +/- 0.04, N = 3; SE +/- 0.10, N = 3; SE +/- 0.14, N = 3; SE +/- 0.24, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
  g: 8.04 (MIN: 7.93 / MAX: 8.86)
  b: 8.00 (MIN: 7.95 / MAX: 8.99)
  c: 7.95 (MIN: 7.89 / MAX: 8.79)
  f: 8.65 (MIN: 8.55 / MAX: 9.53)
  i: 9.05 (MIN: 8.48 / MAX: 11.28)
  4080: 8.43 (MIN: 7.99 / MAX: 10.66)
  4080 rep: 8.46 (MIN: 7.99 / MAX: 10.62)
  4080 xxx: 8.88 (MIN: 8.31 / MAX: 10.01)
  4080 zzz: 8.38 (MIN: 7.95 / MAX: 10.41)
  3090: 8.00 (MIN: 7.94 / MAX: 8.78)
  3090 rep: 8.01 (MIN: 7.95 / MAX: 8.35)
  3070: 16.34 (MIN: 8.13 / MAX: 80.69)
  RTX 3070 Ti: 9.62 (MIN: 7.76 / MAX: 502.83)
  4090: 8.81 (MIN: 8.32 / MAX: 10.7)
  4090 rep: 9.54 (MIN: 8.94 / MAX: 10.54)
  nv 4090: 10.54 (MIN: 8.41 / MAX: 134.08)
  SE +/- 0.26, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  4080 rep: 34.22 (MIN: 33.01 / MAX: 37.09)
  4080 xxx: 34.14 (MIN: 32.5 / MAX: 37.13)
  4080 zzz: 34.47 (MIN: 33.05 / MAX: 39.69)
  3090: 32.10 (MIN: 31.9 / MAX: 33.03)
  3090 rep: 31.85 (MIN: 31.67 / MAX: 35.74)
  3070: 65.41 (MIN: 39.08 / MAX: 230.59)
  RTX 3070 Ti: 38.04 (MIN: 33.11 / MAX: 346.94)
  4090: 38.76 (MIN: 33.38 / MAX: 423.24)
  4090 rep: 38.79 (MIN: 34.02 / MAX: 460.15)
  nv 4090: 38.95 (MIN: 34.04 / MAX: 486.96)
  SE +/- 0.11, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
  4080 rep: 3.09 (MIN: 2.96 / MAX: 4.98)
  4080 xxx: 3.00 (MIN: 2.88 / MAX: 4.37)
  4080 zzz: 2.96 (MIN: 2.85 / MAX: 3.82)
  3090: 2.98 (MIN: 2.95 / MAX: 3.9)
  3090 rep: 2.97 (MIN: 2.93 / MAX: 3.3)
  3070: 6.07 (MIN: 2.94 / MAX: 129.1)
  RTX 3070 Ti: 3.24 (MIN: 2.9 / MAX: 5.34)
  4090: 5.19 (MIN: 3.04 / MAX: 436.91)
  4090 rep: 5.11 (MIN: 2.96 / MAX: 247.47)
  nv 4090: 3.12 (MIN: 2.98 / MAX: 3.71)
  SE +/- 0.13, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
  4080 rep: 8.45 (MIN: 8.01 / MAX: 10.86)
  4080 xxx: 8.34 (MIN: 7.89 / MAX: 9.42)
  4080 zzz: 8.25 (MIN: 7.78 / MAX: 9.61)
  3090: 8.07 (MIN: 8.01 / MAX: 8.62)
  3090 rep: 8.06 (MIN: 8 / MAX: 8.96)
  3070: 16.52 (MIN: 7.9 / MAX: 82.53)
  RTX 3070 Ti: 10.03 (MIN: 7.86 / MAX: 346.64)
  4090: 9.04 (MIN: 8.49 / MAX: 10.96)
  4090 rep: 10.61 (MIN: 8.34 / MAX: 225.97)
  nv 4090: 12.12 (MIN: 9.16 / MAX: 505.01)
  SE +/- 0.14, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
  3090: 12.86 (MIN: 12.74 / MAX: 13.68)
  3090 rep: 12.89 (MIN: 12.79 / MAX: 13.77)
  3070: 26.33 (MIN: 12.62 / MAX: 127.32)
  RTX 3070 Ti: 15.56 (MIN: 12.24 / MAX: 459.8)
  4090: 15.69 (MIN: 13.13 / MAX: 187.93)
  4090 rep: 16.60 (MIN: 12.98 / MAX: 103.04)
  nv 4090: 15.67 (MIN: 12.91 / MAX: 334.44)
  SE +/- 0.25, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
  g: 3.85 (MIN: 3.8 / MAX: 4.62)
  b: 3.85 (MIN: 3.82 / MAX: 4.48)
  c: 3.82 (MIN: 3.78 / MAX: 4.53)
  f: 3.85 (MIN: 3.8 / MAX: 4.6)
  i: 4.21 (MIN: 3.96 / MAX: 4.94)
  4080: 4.05 (MIN: 3.83 / MAX: 5)
  4080 rep: 4.04 (MIN: 3.81 / MAX: 5.08)
  4080 xxx: 4.22 (MIN: 4 / MAX: 5.58)
  4080 zzz: 3.95 (MIN: 3.76 / MAX: 4.84)
  3090: 3.85 (MIN: 3.8 / MAX: 4.43)
  3090 rep: 3.85 (MIN: 3.81 / MAX: 4.62)
  3070: 7.81 (MIN: 3.73 / MAX: 159.47)
  RTX 3070 Ti: 4.73 (MIN: 3.79 / MAX: 418.72)
  4090: 4.18 (MIN: 4 / MAX: 5.25)
  4090 rep: 6.28 (MIN: 3.91 / MAX: 337.73)
  nv 4090: 5.26 (MIN: 3.48 / MAX: 250.88)
  SE +/- 0.18, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  4090 rep: 3.47 (MIN: 3.33 / MAX: 4.93)
  a: 3.34 (MIN: 3.3 / MAX: 3.85)
  b: 3.34 (MIN: 3.31 / MAX: 3.77)
  c: 3.35 (MIN: 3.31 / MAX: 3.8)
  d: 3.35 (MIN: 3.3 / MAX: 3.82)
  f: 3.55 (MIN: 3.27 / MAX: 22.86)
  g: 3.59 (MIN: 3.3 / MAX: 25.28)
  i: 5.03 (MIN: 3.07 / MAX: 228.55)
  4080: 3.41 (MIN: 3.28 / MAX: 4.87)
  4080 rep: 3.43 (MIN: 3.3 / MAX: 4.15)
  4080 xxx: 3.45 (MIN: 3.32 / MAX: 3.85)
  4080 zzz: 3.42 (MIN: 3.28 / MAX: 4.19)
  3090: 3.34 (MIN: 3.3 / MAX: 4.19)
  3090 rep: 3.35 (MIN: 3.31 / MAX: 3.68)
  3070: 6.82 (MIN: 3.16 / MAX: 64.72)
  RTX 3070 Ti: 3.77 (MIN: 3.02 / MAX: 511.95)
  4090: 3.55 (MIN: 3.39 / MAX: 5.48)
  nv 4090: 3.45 (MIN: 3.32 / MAX: 4.91)
  SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.19, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  a: 8.16 (MIN: 7.9 / MAX: 8.99)
  b: 8.21 (MIN: 8.14 / MAX: 8.84)
  c: 8.00 (MIN: 7.94 / MAX: 8.88)
  d: 8.17 (MIN: 7.99 / MAX: 8.97)
  e: 8.10 (MIN: 7.98 / MAX: 8.84)
  f: 8.34 (MIN: 7.99 / MAX: 26.72)
  g: 8.36 (MIN: 8.27 / MAX: 9.08)
  i: 9.88 (MIN: 8.14 / MAX: 251.77)
  4080: 8.39 (MIN: 8 / MAX: 10.29)
  4080 rep: 8.56 (MIN: 8.17 / MAX: 10.28)
  4080 xxx: 8.56 (MIN: 8.15 / MAX: 9.8)
  4080 zzz: 8.58 (MIN: 8.13 / MAX: 9.78)
  3090: 8.38 (MIN: 8.31 / MAX: 8.86)
  3090 rep: 8.02 (MIN: 7.95 / MAX: 8.63)
  3070: 16.22 (MIN: 7.74 / MAX: 314.84)
  RTX 3070 Ti: 9.07 (MIN: 7.61 / MAX: 402.49)
  4090: 8.64 (MIN: 8.28 / MAX: 10.42)
  4090 rep: 8.48 (MIN: 8.09 / MAX: 9.64)
  nv 4090: 9.81 (MIN: 7.82 / MAX: 241.19)
  SE +/- 0.11, N = 3; SE +/- 0.06, N = 3; SE +/- 0.02, N = 3; SE +/- 0.21, N = 15
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  4080 rep: 7.67 (MIN: 7.06 / MAX: 9.96)
  4080 xxx: 7.27 (MIN: 6.73 / MAX: 8.77)
  4080 zzz: 7.25 (MIN: 6.72 / MAX: 8.05)
  3090: 7.12 (MIN: 7.04 / MAX: 7.97)
  3090 rep: 7.09 (MIN: 7.02 / MAX: 7.86)
  3070: 14.27 (MIN: 7.01 / MAX: 51.13)
  RTX 3070 Ti: 7.57 (MIN: 6.69 / MAX: 10)
  4090: 9.51 (MIN: 7.11 / MAX: 307.17)
  4090 rep: 9.38 (MIN: 6.77 / MAX: 224.11)
  nv 4090: 7.48 (MIN: 6.85 / MAX: 9.67)
  SE +/- 0.23, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090 rep      1.38    1.36    1.73
3070          2.69    1.35    48.81
RTX 3070 Ti   1.40    1.28    1.91
4090          1.35    1.28    1.84
4090 rep      1.42    1.36    2.03
nv 4090       1.40    1.34    1.86
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4080          3.26    3.13    4.7
4080 rep      3.31    3.16    3.93
4080 xxx      3.05    2.94    3.56
4080 zzz      3.26    3.12    4.74
3090          3.13    3.09    3.68
3090 rep      3.18    3.13    3.61
3070          5.99    3.05    26.81
RTX 3070 Ti   3.44    2.65    361.91
4090          3.62    3.47    4.24
4090 rep      3.34    3.19    3.99
nv 4090       3.17    3.04    4.3
SE +/- 0.13, N = 13
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4080 rep      1.43    1.36    2.02
4080 xxx      1.32    1.26    2.03
4080 zzz      1.31    1.25    1.76
3090          1.39    1.37    1.48
3090 rep      1.38    1.35    1.64
3070          2.53    1.08    118.73
RTX 3070 Ti   2.48    1.17    344.52
4090          1.30    1.24    1.92
4090 rep      1.42    1.34    2.37
nv 4090       1.42    1.34    1.99
SE +/- 0.48, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090          3.16    3.12    3.67
3070          5.97    2.84    111.8
RTX 3070 Ti   3.52    2.95    536.1
4090          3.36    3.21    4.83
4090 rep      3.44    3.3     4.34
nv 4090       4.81    3.13    149.75
SE +/- 0.17, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090 rep      3.15    3.1     3.75
3070          5.92    3.16    103.24
RTX 3070 Ti   3.83    3.11    343.21
4090          4.75    2.93    147.66
4090 rep      3.38    3.2     4
nv 4090       3.29    3.12    4.27
SE +/- 0.53, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090          3.36    3.32    3.82
3090 rep      3.36    3.33    3.83
3070          6.30    3.28    147.57
RTX 3070 Ti   3.92    3.12    496.78
4090          3.47    3.33    5.01
4090 rep      5.18    3.45    200.36
nv 4090       3.37    3.25    5.26
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: squeezenet_ssd
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4090 rep      7.98    7.32    16.07
a             7.07    7.01    8.07
b             7.06    7.01    7.55
c             7.10    7.05    7.65
d             7.09    6.99    9.39
f             7.23    7.15    8.02
g             7.13    7.04    8.43
i             8.96    6.92    244.02
4080          7.73    7.13    9.7
4080 rep      7.86    7.22    10.84
4080 xxx      7.62    7.01    8.84
4080 zzz      7.55    7       8.72
3090          7.52    7.45    7.74
3090 rep      7.12    7.05    7.63
3070          13.20   6.9     68.61
RTX 3070 Ti   8.13    6.37    399.11
4090          7.86    7.25    8.98
nv 4090       9.11    6.77    101.58
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090 rep      4.07    4.04    4.25
3070          7.23    3.75    121.71
RTX 3070 Ti   4.14    3.73    5.07
4090          4.62    4.48    5.16
4090 rep      4.59    4.44    5.2
nv 4090       5.86    3.9     190.17
SE +/- 0.15, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090 rep      3.32    3.29    3.62
3070          5.89    3.19    97.88
RTX 3070 Ti   3.48    3.33    5.22
4090          3.56    3.43    4.24
4090 rep      5.23    3.34    185.57
nv 4090       3.43    3.29    5.31
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4080          3.29    3.12    3.99
4080 rep      3.27    3.08    4.68
4080 xxx      3.14    3       3.85
4080 zzz      3.28    3.11    4.26
3090          3.12    3.07    3.62
3090 rep      3.17    3.12    3.89
3070          5.46    3.27    38.65
RTX 3070 Ti   3.66    2.73    398.42
4090          3.48    3.32    4.99
4090 rep      3.36    3.17    4.8
nv 4090       3.42    3.15    25.1
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090          3.15    3.11    3.78
3090 rep      3.17    3.12    3.78
3070          5.49    2.97    152.08
RTX 3070 Ti   3.69    3.07    544.13
4090          5.25    3.11    367.53
4090 rep      3.60    3.44    4.27
nv 4090       3.27    3.11    4.1
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090 rep      3.84    3.8     4.67
3070          6.63    3.75    22.34
RTX 3070 Ti   4.17    3.86    5.52
4090          4.63    4.38    6.01
4090 rep      4.10    3.88    5.04
nv 4090       5.82    3.98    197.79
SE +/- 0.08, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: mnasnet
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
a             2.98    2.92    4.03
b             2.95    2.92    3.42
c             2.96    2.93    3.41
d             2.97    2.92    3.34
e             2.96    2.91    5.9
f             2.97    2.93    3.66
g             2.98    2.94    3.65
i             3.20    3.07    3.86
4080          3.07    2.93    4.63
4080 rep      3.08    2.94    3.67
4080 xxx      3.07    2.95    4.19
4080 zzz      3.06    2.92    3.73
3090          2.99    2.96    3.14
3090 rep      2.97    2.94    3.28
3070          5.09    2.86    53.75
RTX 3070 Ti   3.40    2.72    432.18
4090          3.18    3.05    4.64
4090 rep      3.15    3       4.54
nv 4090       4.70    3       188.08
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
b             3.16    3.12    3.69
c             3.16    3.12    3.7
f             3.15    3.11    3.48
g             3.16    3.11    3.93
i             3.29    3.15    4.32
4080          3.28    3.14    3.89
4080 xxx      3.31    3.16    5.3
4080 zzz      3.27    3.14    4.63
3090          3.15    3.11    3.71
3090 rep      3.15    3.11    3.6
3070          5.38    2.74    121.29
RTX 3070 Ti   3.76    2.89    366.04
4090          3.25    3.11    4.74
4090 rep      3.53    3.2     40.81
nv 4090       3.35    3.21    5.23
SE +/- 0.22, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4080          3.48    3.34    4.88
4080 rep      3.44    3.31    4.32
4080 xxx      3.34    3.22    3.97
4080 zzz      3.44    3.31    4.85
3090          3.32    3.29    3.79
3090 rep      3.34    3.3     3.79
3070          5.59    3.32    42.33
RTX 3070 Ti   3.89    3.08    345.39
4090          3.52    3.38    4.23
4090 rep      3.48    3.34    4.1
nv 4090       3.50    3.37    4.2
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
3090          4.08    4.04    4.2
3090 rep      4.10    4.06    4.2
3070          4.48    2.2     27.6
RTX 3070 Ti   4.25    2.46    526.3
4090          2.82    2.69    3.5
4090 rep      3.91    3.77    5.87
nv 4090       4.01    3.87    5.47
SE +/- 0.29, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4090          3.10    2.97    3.71
g             2.97    2.93    3.88
i             3.07    2.93    3.84
4080          3.06    2.94    3.67
4080 rep      3.03    2.91    4.45
4080 xxx      3.05    2.91    3.67
4080 zzz      3.08    2.93    4.42
3090          2.96    2.93    3.31
3090 rep      2.99    2.96    3.32
3070          4.59    2.88    20.12
RTX 3070 Ti   3.34    2.68    393.6
4090 rep      3.13    3.01    3.62
nv 4090       3.16    3.02    4.6
SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
g             3.33    3.29    4.1
b             3.33    3.3     3.77
c             3.33    3.31    3.81
f             3.33    3.29    3.99
i             3.36    3.25    4.02
4080          3.44    3.31    4.88
4080 rep      3.44    3.32    4.16
4080 xxx      3.51    3.37    4.26
4080 zzz      3.37    3.25    3.95
3090          3.34    3.31    3.6
3090 rep      3.36    3.32    3.7
3070          4.89    3.04    18.32
RTX 3070 Ti   3.95    3.19    410.41
4090          3.34    3.23    4.78
4090 rep      3.40    3.26    4.84
nv 4090       3.17    3.04    3.78
SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
Run           Result  MIN     MAX
4090          1.43    1.36    2.04
g             1.38    1.35    2.09
i             1.41    1.35    2.02
4080          1.43    1.36    2.06
4080 rep      1.41    1.34    2.1
4080 xxx      1.42    1.35    2
4080 zzz      1.42    1.34    2.84
3090          1.36    1.34    1.46
3090 rep      1.39    1.37    1.52
3070          1.77    1.08    12.53
RTX 3070 Ti   1.49    1.05    379.08
4090 rep      1.46    1.39    2.91
nv 4090       1.18    1.11    1.85
SE +/- 0.12, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Phoronix Test Suite v10.8.5