RTX 4070 SUPER

sudo apt install vulkan-headers vulkan-tools libvulkan-dev

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412102-NE-INTELGPU716

Run Management

Result Identifier        Date Run       Test Duration
NVIDIA RTX 4070 SUPER    January 25     21 Hours, 7 Minutes
Intel ARC A770 8Gb       December 07    11 Hours, 47 Minutes
Intel ARC A750           December 07    1 Day, 7 Hours, 54 Minutes
intel-gpu                December 05    10 Minutes
nvidia-gpu               December 05    10 Minutes
Intel ARC A580           December 09    23 Hours, 35 Minutes

Average test duration: 14 Hours, 47 Minutes
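
The final 14 Hours, 47 Minutes entry matches the arithmetic mean of the six run durations listed above. A minimal Python check, with the durations transcribed by hand from the run list (the variable and function names are illustrative and not part of the Phoronix Test Suite):

```python
# Durations from the Run Management list, as (days, hours, minutes).
durations = {
    "NVIDIA RTX 4070 SUPER": (0, 21, 7),
    "Intel ARC A770 8Gb": (0, 11, 47),
    "Intel ARC A750": (1, 7, 54),
    "intel-gpu": (0, 0, 10),
    "nvidia-gpu": (0, 0, 10),
    "Intel ARC A580": (0, 23, 35),
}


def to_minutes(days, hours, minutes):
    """Convert a (days, hours, minutes) duration to whole minutes."""
    return (days * 24 + hours) * 60 + minutes


total_minutes = sum(to_minutes(*d) for d in durations.values())
mean_minutes = total_minutes // len(durations)  # whole minutes, rounded down
mean_hours, mean_rest = divmod(mean_minutes, 60)
print(f"{mean_hours} Hours, {mean_rest} Minutes")  # prints "14 Hours, 47 Minutes"
```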



System Details

Runs after the first list only the values the table gives for them.

NVIDIA RTX 4070 SUPER
- Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
- Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
- Chipset: Intel Device 7a27
- Memory: 32GB
- Disk: 4001GB Seagate ZP4000GP304001
- Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
- Audio: Realtek ALC1220
- Monitor: ARZOPA
- Network: Intel I226-V + Intel Device 7a70
- OS: EndeavourOS rolling
- Kernel: 6.7.1-arch1-1 (x86_64)
- Desktop: KDE Plasma 5.27.10
- Display Server: X Server 1.21.1.11
- Display Driver: NVIDIA 550.40.07
- OpenGL: 4.6.0
- OpenCL: OpenCL 3.0 CUDA 12.4.74
- Compiler: GCC 13.2.1 20230801
- File-System: ext4
- Screen Resolution: 1920x1080

intel-gpu
- Processor: Intel Core i5-10300H @ 4.50GHz (4 Cores / 8 Threads)
- Motherboard: CML Stonic_CMS (V1.00 BIOS)
- Chipset: Intel Comet Lake PCH
- Memory: 16GB
- Disk: 1000GB CT1000P3SSD8 + 256GB Western Digital PC SN530 SDBPNPZ-256G-1014
- Graphics: Intel UHD CML GT2 4GB (1350/6000MHz)
- Audio: Intel Comet Lake PCH cAVS
- Network: Realtek Killer E2600 GbE + Intel Comet Lake PCH CNVi WiFi
- OS: Ubuntu 24.04
- Kernel: 6.8.0-49-generic (x86_64)
- Desktop: GNOME Shell 46.0
- Display Server: X Server 1.20.13
- Display Driver: NVIDIA 535.183.01
- OpenGL: 4.6 Mesa 24.0.9-0ubuntu0.2
- Compiler: GCC 13.2.0

nvidia-gpu
- Graphics: NVIDIA GeForce GTX 1650 Ti 4GB
- OpenGL: 4.6.0

Intel ARC A770 8Gb
- Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
- Motherboard: MSI MEG Z890 UNIFY-X (MS-7E20) v1.0 (1.A10 BIOS)
- Chipset: Intel Device ae7f
- Memory: 2 x 16GB DDR5-6000MT/s Corsair CMH32GX5M2B6000Z30
- Disk: 1024GB Wodposit NVMe SSD
- Graphics: MSI Intel Arc A770 DG2 8GB
- Audio: Intel DG2 Audio
- Monitor: PiKVM V3
- Network: Realtek Device 5000 + Intel Wi-Fi 7
- OS: Ubuntu 24.10
- Kernel: 6.12.1-061201-generic (x86_64)
- Desktop: GNOME Shell 47.0
- Display Server: X Server + Wayland
- OpenGL: 4.6 Mesa 24.3.1 kisak-mesa PPA
- Compiler: GCC 14.2.0

Intel ARC A750
- Graphics: Intel Arc A750 DG2 8GB
- OpenCL: OpenCL 3.0

Intel ARC A580
- Graphics: Intel Arc A580 DG2 8GB
- Screen Resolution: 1280x720

Kernel Details
- NVIDIA RTX 4070 SUPER: Transparent Huge Pages: always
- intel-gpu: Transparent Huge Pages: madvise
- nvidia-gpu: Transparent Huge Pages: madvise
- Intel ARC A770 8Gb: Transparent Huge Pages: madvise
- Intel ARC A750: Transparent Huge Pages: madvise
- Intel ARC A580: Transparent Huge Pages: madvise

Compiler Details
- NVIDIA RTX 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- NVIDIA RTX 4070 SUPER: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d
- intel-gpu, nvidia-gpu: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0xfc - Thermald 2.5.6
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8

Graphics Details
- NVIDIA RTX 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.04.69.00.c1
- intel-gpu, nvidia-gpu: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.17.4c.00.1d

Security Details
- NVIDIA RTX 4070 SUPER: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- intel-gpu, nvidia-gpu: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: SW loop KVM: SW loop + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Environment Details
- nvidia-gpu: __GLX_VENDOR_LIBRARY_NAME=nvidia

Python Details
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: Python 3.12.7

Results Overview

Overview table of all test results for NVIDIA RTX 4070 SUPER, intel-gpu, nvidia-gpu, Intel ARC A770 8Gb, Intel ARC A750 and Intel ARC A580. Test profiles covered: ncnn, opencl-benchmark, hashcat, viennacl, clpeak, cl-mem, unigine-valley, vkfft, indigobench, realsr-ncnn, waifu2x-ncnn, tensorflow, vkpeak, luxmark, shoc, xonotic, specviewperf2020, vkresample, openarena, paraview, unigine-heaven, darktable, neatbench, blender, fahbench, mandelgpu and financebench.

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms.

NCNN 20230517, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
- Intel ARC A580: 106.32 (SE +/- 1.35, N = 3; MIN: 8.23 / MAX: 114.52)
- Intel ARC A750: 93.10 (SE +/- 2.20, N = 3; MIN: 8.42 / MAX: 113.98)
- Intel ARC A770 8Gb: 105.35 (SE +/- 0.15, N = 3; MIN: 8.44 / MAX: 114.62)
- NVIDIA RTX 4070 SUPER: 11.04 (SE +/- 1.21, N = 9; MIN: 5.28 / MAX: 1769.19)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
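
Each result is reported as a mean with a standard error (SE) over N runs. Assuming the SE is the sample standard deviation divided by the square root of N (the usual definition; the page does not state it), the run-to-run spread can be estimated as in the sketch below. The helper name `spread_from_se` is hypothetical.

```python
import math


def spread_from_se(mean, se, n):
    """Return (standard deviation, coefficient of variation in %).

    Assumes SE = SD / sqrt(N), which the result page does not state explicitly.
    """
    sd = se * math.sqrt(n)
    return sd, 100.0 * sd / mean


# googlenet values from the result above: (mean in ms, SE, N)
googlenet = {
    "Intel ARC A770 8Gb": (105.35, 0.15, 3),
    "NVIDIA RTX 4070 SUPER": (11.04, 1.21, 9),
}
for name, (mean, se, n) in googlenet.items():
    sd, cv = spread_from_se(mean, se, n)
    print(f"{name}: SD ~ {sd:.2f} ms, CV ~ {cv:.1f}%")
```

Under that assumption the RTX 4070 SUPER googlenet result has a far larger relative spread than the Arc A770 result, so small differences involving that row should be read with care.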

NCNN 20230517, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
- Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
- Intel ARC A750: 76.44 (SE +/- 0.84, N = 3; MIN: 9.56 / MAX: 84)
- Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
- NVIDIA RTX 4070 SUPER: 8.62 (SE +/- 0.47, N = 9; MIN: 6.42 / MAX: 1101.3)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
- Intel ARC A580: 119.00 (SE +/- 0.39, N = 3)
- Intel ARC A750: 116.35 (SE +/- 0.39, N = 3)
- Intel ARC A770 8Gb: 119.05 (SE +/- 0.24, N = 3)
- NVIDIA RTX 4070 SUPER: 844.61 (SE +/- 87.53, N = 9)
MIN: 46.34 / MAX: 1866.93 (only one pair is reported for this test)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
- Intel ARC A580: 47.58 (SE +/- 0.19, N = 3; MIN: 5.09 / MAX: 52.05)
- Intel ARC A750: 43.81 (SE +/- 0.87, N = 3; MIN: 5.02 / MAX: 51.96)
- Intel ARC A770 8Gb: 47.11 (SE +/- 0.89, N = 3; MIN: 4.95 / MAX: 51.24)
- NVIDIA RTX 4070 SUPER: 8.97 (SE +/- 3.49, N = 9; MIN: 3.94 / MAX: 922.04)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
- Intel ARC A580: 462.80 (SE +/- 2.21, N = 3; MIN: 23.85 / MAX: 530.99)
- Intel ARC A750: 243.66 (SE +/- 13.66, N = 3; MIN: 23.78 / MAX: 525.86)
- Intel ARC A770 8Gb: 453.80 (SE +/- 11.18, N = 3; MIN: 23.74 / MAX: 528.85)
- NVIDIA RTX 4070 SUPER: 11.11 (SE +/- 3.28, N = 9)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks.

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT32 Compute (TIOPs/s, More Is Better)
- Intel ARC A580: 4.096 (SE +/- 0.018, N = 3)
- Intel ARC A750: 5.107 (SE +/- 0.011, N = 3)
- NVIDIA RTX 4070 SUPER: 19.889 (SE +/- 0.002, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
Compiler: (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm.

Hashcat 6.2.4, Benchmark: SHA1 (H/s, More Is Better)
- Intel ARC A580: 4576025000 (SE +/- 49049760.02, N = 4)
- Intel ARC A750: 5401850000 (SE +/- 65365351.42, N = 4)
- NVIDIA RTX 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found
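
The Hashcat results are raw hashes per second, which are hard to compare at a glance. A small formatting sketch using SI prefixes follows; the function `format_rate` is hypothetical and not part of Hashcat or the Phoronix Test Suite.

```python
def format_rate(value, unit="H/s"):
    """Format a raw rate with an SI prefix (k, M, G, T) and two decimals."""
    for factor, prefix in ((1e12, "T"), (1e9, "G"), (1e6, "M"), (1e3, "k")):
        if value >= factor:
            return f"{value / factor:.2f} {prefix}{unit}"
    return f"{value:.2f} {unit}"


# SHA1 results from the result above, in H/s.
sha1 = {
    "Intel ARC A580": 4576025000,
    "Intel ARC A750": 5401850000,
    "NVIDIA RTX 4070 SUPER": 22132600000,
}
for name, rate in sha1.items():
    print(f"{name}: {format_rate(rate)}")
```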

Hashcat 6.2.4, Benchmark: 7-Zip (H/s, More Is Better)
- Intel ARC A580: 249667 (SE +/- 66.67, N = 3)
- Intel ARC A750: 246633 (SE +/- 240.37, N = 3)
- NVIDIA RTX 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks.

ViennaCL 1.7.1, Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
- Intel ARC A580: 94.8 (SE +/- 1.62, N = 3)
- Intel ARC A750: 88.1 (SE +/- 0.00, N = 3)
- NVIDIA RTX 4070 SUPER: 392.0 (SE +/- 0.00, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices.

clpeak 1.1.2, OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
- Intel ARC A580: 4133.83 (SE +/- 3.17, N = 3)
- Intel ARC A750: 4885.36 (SE +/- 2.34, N = 3)
- NVIDIA RTX 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
Compiler: (CXX) g++ options: -O3
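
To put results like these in relative terms, each value can be divided by a baseline. A minimal sketch follows; the function name `relative_performance` and the choice of the Arc A580 as baseline are illustrative assumptions. For "Fewer Is Better" metrics the ratio is inverted, so values above 1 always mean faster than the baseline.

```python
def relative_performance(result, baseline, higher_is_better=True):
    """Return how many times faster `result` is than `baseline`."""
    if result <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    return result / baseline if higher_is_better else baseline / result


# clpeak Integer Compute INT (GIOPS, More Is Better), from the result above.
clpeak_int = {
    "Intel ARC A580": 4133.83,
    "Intel ARC A750": 4885.36,
    "NVIDIA RTX 4070 SUPER": 18170.54,
}
baseline = clpeak_int["Intel ARC A580"]
for name, value in clpeak_int.items():
    print(f"{name}: {relative_performance(value, baseline):.2f}x")
```

For a time-based result, pass `higher_is_better=False` so that a shorter time yields a ratio above 1.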

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP32 Compute (TFLOPs/s, More Is Better)
- Intel ARC A580: 9.002 (SE +/- 0.013, N = 3)
- Intel ARC A750: 10.228 (SE +/- 0.010, N = 3)
- NVIDIA RTX 4070 SUPER: 38.594 (SE +/- 0.031, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
Compiler: (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Hashcat

Hashcat 6.2.4, Benchmark: SHA-512 (H/s, More Is Better)
- Intel ARC A580: 807266667 (SE +/- 1166666.67, N = 3)
- Intel ARC A750: 943466667 (SE +/- 5228554.08, N = 3)
- NVIDIA RTX 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

clpeak

clpeak 1.1.2, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
- Intel ARC A580: 9758.86 (SE +/- 1.40, N = 3)
- Intel ARC A750: 11380.91 (SE +/- 3.31, N = 3)
- NVIDIA RTX 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
Compiler: (CXX) g++ options: -O3

ViennaCL

ViennaCL 1.7.1, Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
- Intel ARC A580: 116.0 (SE +/- 1.53, N = 3)
- Intel ARC A750: 98.0 (SE +/- 0.09, N = 3)
- NVIDIA RTX 4070 SUPER: 334.0 (SE +/- 0.33, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

cl-mem

A basic OpenCL memory benchmark.

cl-mem 2017-01-13, Benchmark: Read (GB/s, More Is Better)
- Intel ARC A580: 144.0 (SE +/- 0.03, N = 3)
- Intel ARC A750: 153.7 (SE +/- 0.10, N = 3)
- NVIDIA RTX 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status.
Compiler: (CC) gcc options: -O2 -flto -lOpenCL

Unigine Valley

This test calculates the average frame rate within the Valley demo for the Unigine engine, released in February 2013. This engine is extremely demanding on the system's graphics card. Unigine Valley relies upon an OpenGL 3 core profile context.

Unigine Valley 1.0, Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
- Intel ARC A580: 216.33100 (SE +/- 0.14945, N = 3)
- Intel ARC A750: 226.17300 (SE +/- 0.79172, N = 3)
- nvidia-gpu: 74.81640 (SE +/- 0.36174, N = 3)
- intel-gpu: 9.98304 (SE +/- 0.00191, N = 3)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score.

VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
- Intel ARC A580: 5093 (SE +/- 66.00, N = 12)
- Intel ARC A750: 5573 (SE +/- 3.76, N = 3)
- NVIDIA RTX 4070 SUPER: 15166 (SE +/- 102.52, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
Compiler: (CXX) g++ options: -O3 (one result is additionally annotated with -lrt)

Hashcat

Hashcat 6.2.4, Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
- Intel ARC A580: 277900 (SE +/- 57.74, N = 3)
- Intel ARC A750: 328400 (SE +/- 200.00, N = 3)
- NVIDIA RTX 4070 SUPER: 802967 (SE +/- 633.33, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

ViennaCL

ViennaCL 1.7.1, Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
- Intel ARC A580: 186 (SE +/- 2.40, N = 3)
- Intel ARC A750: 133 (SE +/- 0.33, N = 3)
- NVIDIA RTX 4070 SUPER: 370 (SE +/- 0.00, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark.

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
- Intel ARC A580: 20.17 (SE +/- 0.01, N = 3)
- Intel ARC A750: 19.71 (SE +/- 0.02, N = 3)
- NVIDIA RTX 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)

NCNN

NCNN 20230517, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
- Intel ARC A580: 46.17 (SE +/- 0.28, N = 3; MIN: 29.78 / MAX: 48.34)
- Intel ARC A750: 45.54 (SE +/- 0.12, N = 3; MIN: 23.39 / MAX: 48.99)
- Intel ARC A770 8Gb: 46.04 (SE +/- 0.14, N = 3; MIN: 28.48 / MAX: 48.53)
- NVIDIA RTX 4070 SUPER: 117.81 (SE +/- 29.60, N = 9; MIN: 17.16 / MAX: 647.67)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Hashcat

Hashcat 6.2.4, Benchmark: MD5 (H/s, More Is Better)
- Intel ARC A580: 26539200000 (SE +/- 254278908.55, N = 3)
- Intel ARC A750: 31010500000 (SE +/- 260097949.50, N = 3)
- NVIDIA RTX 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
- Intel ARC A580: 187.20 (SE +/- 0.63, N = 3)
- Intel ARC A750: 203.73 (SE +/- 0.30, N = 3)
- NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
- Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
Compiler: (CXX) g++ options: -std=c++17 -pthread -lOpenCL

IndigoBench

IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
Intel ARC A580: 9.257 (SE +/- 0.004, N = 3)
Intel ARC A750: 9.302 (SE +/- 0.011, N = 3)
NVIDIA RTX 4070 SUPER: 19.801 (SE +/- 0.009, N = 3)

RealSR-NCNN

RealSR-NCNN is an NCNN implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to upscale a sample image by 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
Intel ARC A580: 72.28 (SE +/- 0.01, N = 3)
Intel ARC A750: 66.07 (SE +/- 0.01, N = 3)
Intel ARC A770 8Gb: 61.86 (SE +/- 0.02, N = 3)
NVIDIA RTX 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)
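
Because this result file mixes "More Is Better" and "Fewer Is Better" tests, a relative comparison has to invert the ratio for time-based results. The sketch below shows one way to do that for the RealSR-NCNN timings above; `speedup` is a hypothetical helper, and the values are paired with the systems in the order the chart lists them, which is an assumption.

```python
def speedup(baseline, value, higher_is_better):
    """Return how many times faster `value` is than `baseline`.

    For throughput metrics the ratio is value / baseline; for time
    metrics (fewer is better) it is baseline / value.
    """
    if baseline <= 0 or value <= 0:
        raise ValueError("results must be positive")
    return value / baseline if higher_is_better else baseline / value


# RealSR-NCNN, Scale: 4x - TAA: Yes, in seconds (fewer is better).
realsr_seconds = {
    "Intel ARC A580": 72.28,
    "Intel ARC A750": 66.07,
    "Intel ARC A770 8Gb": 61.86,
    "NVIDIA RTX 4070 SUPER": 34.89,
}

baseline = realsr_seconds["Intel ARC A580"]
for name, seconds in realsr_seconds.items():
    print(f"{name}: {speedup(baseline, seconds, higher_is_better=False):.2f}x")
```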

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better)
Intel ARC A580: 81.3 (SE +/- 0.92, N = 3)
Intel ARC A750: 81.3 (SE +/- 0.17, N = 3)
Intel ARC A770 8Gb: 80.6 (SE +/- 0.27, N = 3)
NVIDIA RTX 4070 SUPER: 165.0 (SE +/- 2.73, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

NCNN

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
Intel ARC A580: 93.48 (SE +/- 0.83, N = 3; MIN: 10.45 / MAX: 101.32)
Intel ARC A750: 87.56 (SE +/- 1.03, N = 3; MIN: 10.81 / MAX: 101.73)
Intel ARC A770 8Gb: 93.18 (SE +/- 1.06, N = 3; MIN: 10.54 / MAX: 100.7)
NVIDIA RTX 4070 SUPER: 46.26 (SE +/- 14.70, N = 9; MIN: 7.71 / MAX: 1829.99)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
Intel ARC A580: 101.57 (SE +/- 0.58, N = 3; MIN: 7.63 / MAX: 108.58)
Intel ARC A750: 89.54 (SE +/- 1.33, N = 3; MIN: 7.64 / MAX: 108.48)
Intel ARC A770 8Gb: 100.40 (SE +/- 0.61, N = 3; MIN: 7.63 / MAX: 107.86)
NVIDIA RTX 4070 SUPER: 6.86 (SE +/- 1.76, N = 9; no MIN / MAX reported)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

OpenArena

OpenArena 0.8.8 - Resolution: 1920 x 1080 - Total Frame Time (Milliseconds, Fewer Is Better)
Intel ARC A580: Min: 1 / Avg: 1.99 / Max: 14
Intel ARC A750: Min: 1 / Avg: 2.05 / Max: 14

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
Intel ARC A580: 5.594 (SE +/- 0.011, N = 3)
Intel ARC A750: 5.291 (SE +/- 0.006, N = 3)
Intel ARC A770 8Gb: 5.146 (SE +/- 0.019, N = 3)
NVIDIA RTX 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
Intel ARC A580: 29941 (SE +/- 257.48, N = 15)
Intel ARC A750: 32544 (SE +/- 57.59, N = 3)
NVIDIA RTX 4070 SUPER: 54794 (SE +/- 702.53, N = 15)
(CXX) g++ options: -O3 (the chart additionally lists -lrt)

Test: FFT + iFFT R2C / C2R

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 27.84
NVIDIA RTX 4070 SUPER: 15.52
SE +/- 0.06, N = 3 (reported for only one of the two results)

Device: GPU - Batch Size: 64 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.
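
The `ModuleNotFoundError: No module named 'absl'` failure above indicates that the Python environment used by the test could not import `absl`, a module normally provided by the `absl-py` package on PyPI. As a hedged sketch, the check below reports which top-level modules are missing before a run; `missing_modules` is a hypothetical helper, and whether installing `absl-py` is sufficient to make this test profile pass on that system is an assumption.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'absl' is the module named in the error message above.
missing = missing_modules(["absl"])
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

The check should be run with the same Python interpreter the benchmark uses, since a module installed for a different interpreter or virtual environment would not be found.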

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 26.46 (SE +/- 0.06, N = 3)
Intel ARC A750: 26.47 (SE +/- 0.04, N = 3)
NVIDIA RTX 4070 SUPER: 15.61 (SE +/- 0.01, N = 2)

Device: GPU - Batch Size: 32 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
Intel ARC A580: 30196 (SE +/- 408.80, N = 13)
Intel ARC A750: 33011 (SE +/- 17.37, N = 3)
NVIDIA RTX 4070 SUPER: 50299 (SE +/- 407.19, N = 15)
(CXX) g++ options: -O3 (the chart additionally lists -lrt)

Test: FFT + iFFT C2C multidimensional in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT8 Compute (TIOPs/s, More Is Better)
Intel ARC A580: 8.701 (SE +/- 0.057, N = 3)
Intel ARC A750: 9.470 (SE +/- 0.057, N = 3)
NVIDIA RTX 4070 SUPER: 14.307 (SE +/- 0.046, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT8 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.39 (SE +/- 0.01, N = 3)
Intel ARC A750: 2.39 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 1.48 (SE +/- 0.00, N = 2)

Device: GPU - Batch Size: 16 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.42 (SE +/- 0.01, N = 3)
Intel ARC A750: 2.42 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 1.50 (SE +/- 0.00, N = 3)

Device: GPU - Batch Size: 32 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better)
Intel ARC A580: 83.3 (SE +/- 0.20, N = 3)
Intel ARC A750: 83.7 (SE +/- 0.20, N = 3)
Intel ARC A770 8Gb: 83.9 (SE +/- 0.41, N = 3)
NVIDIA RTX 4070 SUPER: 132.0 (SE +/- 1.20, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 24.10 (SE +/- 0.20, N = 3)
Intel ARC A750: 24.05 (SE +/- 0.08, N = 3)
NVIDIA RTX 4070 SUPER: 15.67 (SE +/- 0.03, N = 3)

Device: GPU - Batch Size: 16 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.23 (SE +/- 0.01, N = 3)
NVIDIA RTX 4070 SUPER: 5.55 (SE +/- 0.01, N = 2)

Device: GPU - Batch Size: 64 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT16 Compute (TIOPs/s, More Is Better)
Intel ARC A580: 24.06 (SE +/- 0.17, N = 3)
Intel ARC A750: 25.39 (SE +/- 1.03, N = 3)
NVIDIA RTX 4070 SUPER: 17.17 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT16 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.07 (SE +/- 0.03, N = 3)
NVIDIA RTX 4070 SUPER: 5.51 (SE +/- 0.01, N = 2)

Device: GPU - Batch Size: 32 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

NCNN

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
Intel ARC A580: 23.41 (SE +/- 0.05, N = 3; MIN: 3.58 / MAX: 25.27)
Intel ARC A750: 22.00 (SE +/- 0.16, N = 3; MIN: 3.54 / MAX: 25.29)
Intel ARC A770 8Gb: 23.30 (SE +/- 0.18, N = 3; MIN: 3.6 / MAX: 25.53)
NVIDIA RTX 4070 SUPER: 16.17 (SE +/- 5.86, N = 9; MIN: 3.52 / MAX: 436.52)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
Intel ARC A580: 292.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 280.1 (SE +/- 0.15, N = 3)
NVIDIA RTX 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Write

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A580: 7.83 (SE +/- 0.02, N = 3)
Intel ARC A750: 7.83 (SE +/- 0.03, N = 3)
NVIDIA RTX 4070 SUPER: 5.46 (SE +/- 0.00, N = 2)

Device: GPU - Batch Size: 16 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.

vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better)
Intel ARC A580: 3636.35 (SE +/- 0.06, N = 3)
Intel ARC A750: 4242.30 (SE +/- 0.15, N = 3)
Intel ARC A770 8Gb: 4853.46 (SE +/- 0.03, N = 3)

vkpeak 20230730 - int32-scalar (GIOPS, More Is Better)
Intel ARC A580: 3504.13 (SE +/- 0.11, N = 3)
Intel ARC A750: 4082.11 (SE +/- 0.14, N = 3)
Intel ARC A770 8Gb: 4675.06 (SE +/- 0.04, N = 3)

vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
Intel ARC A580: 18359.49 (SE +/- 0.10, N = 3)
Intel ARC A750: 21413.29 (SE +/- 0.30, N = 3)
Intel ARC A770 8Gb: 24468.27 (SE +/- 0.22, N = 3)

vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
Intel ARC A580: 6905.61 (SE +/- 0.07, N = 3)
Intel ARC A750: 8053.08 (SE +/- 0.05, N = 3)
Intel ARC A770 8Gb: 9201.64 (SE +/- 0.19, N = 3)

vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
Intel ARC A580: 28957.52 (SE +/- 0.72, N = 3)
Intel ARC A750: 33769.38 (SE +/- 0.80, N = 3)
Intel ARC A770 8Gb: 38575.11 (SE +/- 0.43, N = 3)

vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
Intel ARC A580: 8389.62 (SE +/- 0.09, N = 3)
Intel ARC A750: 9779.71 (SE +/- 0.07, N = 3)
Intel ARC A770 8Gb: 11171.13 (SE +/- 0.27, N = 3)

vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better)
Intel ARC A580: 7230.54 (SE +/- 0.13, N = 3)
Intel ARC A750: 8426.59 (SE +/- 0.15, N = 3)
Intel ARC A770 8Gb: 9624.66 (SE +/- 0.11, N = 3)

vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
Intel ARC A580: 13055.92 (SE +/- 0.91, N = 3)
Intel ARC A750: 15200.62 (SE +/- 0.08, N = 3)
Intel ARC A770 8Gb: 17337.54 (SE +/- 3.56, N = 3)

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A750: 44.82
NVIDIA RTX 4070 SUPER: 33.97
SE +/- 0.18, N = 3 (reported for only one of the two results)

Device: GPU - Batch Size: 64 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

NCNN

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
Intel ARC A580: 49.56 (SE +/- 0.02, N = 3; MIN: 18.28 / MAX: 52.1)
Intel ARC A750: 48.77 (SE +/- 0.09, N = 3; MIN: 16.94 / MAX: 52.55)
Intel ARC A770 8Gb: 49.15 (SE +/- 0.10, N = 3; MIN: 20.41 / MAX: 52.43)
NVIDIA RTX 4070 SUPER: 63.82 (SE +/- 10.56, N = 9; MIN: 10.28 / MAX: 858.44)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 42.85 (SE +/- 0.03, N = 3)
Intel ARC A750: 42.79 (SE +/- 0.19, N = 3)
NVIDIA RTX 4070 SUPER: 33.40 (SE +/- 0.15, N = 2)

Device: GPU - Batch Size: 32 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better)
Intel ARC A580: 122 (SE +/- 0.88, N = 3)
Intel ARC A750: 122 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 122 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 156 (SE +/- 2.19, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

cl-mem

cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
Intel ARC A580: 259.7 (SE +/- 0.03, N = 3)
Intel ARC A750: 269.6 (SE +/- 0.13, N = 3)
NVIDIA RTX 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Copy

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, More Is Better)
Intel ARC A580: 77.8 (SE +/- 0.27, N = 3)
Intel ARC A750: 76.7 (SE +/- 0.15, N = 3)
Intel ARC A770 8Gb: 77.6 (SE +/- 0.53, N = 3)
NVIDIA RTX 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 39.55 (SE +/- 0.40, N = 3)
Intel ARC A750: 39.83 (SE +/- 0.42, N = 3)
NVIDIA RTX 4070 SUPER: 31.59 (SE +/- 0.17, N = 3)

Device: GPU - Batch Size: 16 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 1.69
Intel ARC A750: 1.70
NVIDIA RTX 4070 SUPER: 1.35
SE +/- 0.00, N = 3; SE +/- 0.01, N = 3 (reported for only two of the three results)

Device: GPU - Batch Size: 1 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
Intel ARC A580: 58986 (SE +/- 54.76, N = 3)
Intel ARC A750: 58800 (SE +/- 38.85, N = 3)
NVIDIA RTX 4070 SUPER: 73929 (SE +/- 7.94, N = 3)
(CXX) g++ options: -O3 (the chart additionally lists -lrt)

Test: FFT + iFFT C2C 1D batched in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, More Is Better)
Intel ARC A580: 56.5 (SE +/- 0.07, N = 3)
Intel ARC A750: 56.4 (SE +/- 0.06, N = 3)
Intel ARC A770 8Gb: 56.8 (SE +/- 0.12, N = 3)
NVIDIA RTX 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 15.71 (SE +/- 0.05, N = 3)
Intel ARC A750: 15.48 (SE +/- 0.13, N = 3)
NVIDIA RTX 4070 SUPER: 12.62 (SE +/- 0.17, N = 2)

Device: GPU - Batch Size: 1 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A580: 5.23
Intel ARC A750: 5.27
NVIDIA RTX 4070 SUPER: 4.35
SE +/- 0.04, N = 8; SE +/- 0.06, N = 3 (reported for only two of the three results)

Device: GPU - Batch Size: 1 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
Intel ARC A580: 137 (SE +/- 1.86, N = 3)
Intel ARC A750: 139 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 138 (SE +/- 0.33, N = 3)
NVIDIA RTX 4070 SUPER: 115 (SE +/- 1.00, N = 2)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1 - OpenCL Device: GPU - Scene: Microphone (Score, More Is Better)
Intel ARC A580: 38157 (SE +/- 58.92, N = 3)
Intel ARC A750: 45988 (SE +/- 191.00, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Hotel (Score, More Is Better)
Intel ARC A580: 11010 (SE +/- 1.76, N = 3)
Intel ARC A750: 13262 (SE +/- 0.33, N = 3)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)
Intel ARC A580: 11015 (SE +/- 11.33, N = 3)
Intel ARC A750: 13236 (SE +/- 25.67, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Microphone (Score, More Is Better)
Intel ARC A580: 38224 (SE +/- 9.50, N = 3)
Intel ARC A750: 45844 (SE +/- 14.17, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Luxball HDR (Score, More Is Better)
Intel ARC A580: 50788 (SE +/- 62.27, N = 3)
Intel ARC A750: 60539 (SE +/- 154.09, N = 3)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Luxball HDR (Score, More Is Better)
Intel ARC A580: 50588 (SE +/- 167.49, N = 3)
Intel ARC A750: 60251 (SE +/- 144.44, N = 3)

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
Intel ARC A580: 63530 (SE +/- 4.48, N = 3)
Intel ARC A750: 64004 (SE +/- 48.89, N = 3)
NVIDIA RTX 4070 SUPER: 75078 (SE +/- 37.77, N = 3)
(CXX) g++ options: -O3 (the chart additionally lists -lrt)

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

vkpeak

vkpeak 20240505 - fp32-scalar (GFLOPS, More Is Better)
Intel ARC A580: 14354.21 (SE +/- 6.23, N = 3)
Intel ARC A750: 16868.93 (SE +/- 26.85, N = 3)

vkpeak 20240505 - fp16-vec4 (GFLOPS, More Is Better)
Intel ARC A580: 19567.54 (SE +/- 0.47, N = 3)
Intel ARC A750: 22840.73 (SE +/- 0.55, N = 3)

vkpeak 20240505 - fp32-vec4 (GFLOPS, More Is Better)
Intel ARC A580: 11908.73 (SE +/- 0.40, N = 3)
Intel ARC A750: 13898.32 (SE +/- 0.79, N = 3)

vkpeak 20240505 - fp16-scalar (GFLOPS, More Is Better)
Intel ARC A580: 18324.92 (SE +/- 0.21, N = 3)
Intel ARC A750: 21379.22 (SE +/- 1.00, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
Intel ARC A580: 19.53 (SE +/- 0.03, N = 3)
Intel ARC A750: 22.72 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
Intel ARC A580: 762.70 (SE +/- 0.14, N = 3)
Intel ARC A750: 883.46 (SE +/- 0.34, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
Intel ARC A580: 1777.18 (SE +/- 3.79, N = 3)
Intel ARC A750: 2049.20 (SE +/- 15.19, N = 15)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP16 Compute (TFLOPs/s, More Is Better)
Intel ARC A580: 13.64 (SE +/- 0.03, N = 3)
Intel ARC A750: 15.62 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
Intel ARC A580: 408.90 (SE +/- 0.52, N = 3)
Intel ARC A750: 398.15 (SE +/- 0.56, N = 3)
NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: Memory Bandwidth Coalesced Write

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 15.78 (SE +/- 0.16, N = 3)
Intel ARC A750: 15.79 (SE +/- 0.16, N = 3)
NVIDIA RTX 4070 SUPER: 13.92 (SE +/- 0.22, N = 2)

Device: GPU - Batch Size: 1 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'
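
To summarize several TensorFlow results with one number, a geometric mean of per-test ratios is the usual choice, since it treats a 2x win and a 2x loss symmetrically. The sketch below applies it to the three batch-size-64 results listed in this file for the Intel ARC A750 and the NVIDIA RTX 4070 SUPER. `geometric_mean_ratio` is a hypothetical helper, the pairing of values to systems follows the order in which the charts list them (an assumption), and this is not necessarily how OpenBenchmarking.org computes its own overall geometric mean.

```python
import statistics


def geometric_mean_ratio(pairs):
    """Geometric mean of a / b over (a, b) result pairs; values must be positive."""
    ratios = []
    for a, b in pairs:
        if a <= 0 or b <= 0:
            raise ValueError("results must be positive")
        ratios.append(a / b)
    return statistics.geometric_mean(ratios)


# (Intel ARC A750, NVIDIA RTX 4070 SUPER) in images/sec, batch size 64.
batch64_pairs = [
    (27.84, 15.52),  # GoogLeNet
    (8.23, 5.55),    # ResNet-50
    (44.82, 33.97),  # AlexNet
]

print(f"A750 vs RTX 4070 SUPER: {geometric_mean_ratio(batch64_pairs):.2f}x")
```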

Xonotic

This is a benchmark of Xonotic, a fork of the DarkPlaces-based Nexuiz game. Development of this open-source first-person shooter began in March of 2010. Learn more via the OpenBenchmarking.org test page.

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Low (Frames Per Second, More Is Better)
Intel ARC A580: 1051.51 (SE +/- 11.78, N = 3; MIN: 704 / MAX: 1770)
Intel ARC A750: 932.79 (SE +/- 7.95, N = 3; MIN: 597 / MAX: 1516)

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
Intel ARC A580: 133 (SE +/- 0.67, N = 3)
Intel ARC A750: 134 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 134 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 119 (SE +/- 4.04, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
Intel ARC A580: 388.85 (SE +/- 0.13, N = 3)
Intel ARC A750: 396.72 (SE +/- 0.13, N = 3)
NVIDIA RTX 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: ENERGY-03 (Composite Score, More Is Better)
Intel ARC A580: 27.74 (SE +/- 0.00, N = 3)
Intel ARC A750: 31.17 (SE +/- 0.00, N = 3)

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
Intel ARC A580: 20.41 (SE +/- 0.02, N = 3)
Intel ARC A750: 18.89 (SE +/- 0.04, N = 3)
Intel ARC A770 8Gb: 18.20 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 18.49 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3

OpenArena

This is a test of OpenArena, a popular open-source first-person shooter. This game is based upon ioquake3, which in turn uses the GPL version of id Software's Quake 3 engine. Learn more via the OpenBenchmarking.org test page.

OpenArena 0.8.8 - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
Intel ARC A580: 524.2 (SE +/- 4.78, N = 15; MIN: 1)
Intel ARC A750: 469.5 (SE +/- 6.89, N = 15; MIN: 1)

ParaView

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
Intel ARC A580: 65.03 (SE +/- 0.98, N = 15)
Intel ARC A750: 72.27 (SE +/- 0.80, N = 5)

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, More Is Better)
Intel ARC A580: 78.6 (SE +/- 0.03, N = 3)
Intel ARC A750: 78.7 (SE +/- 0.09, N = 3)
Intel ARC A770 8Gb: 78.8 (SE +/- 0.03, N = 3)
NVIDIA RTX 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

Xonotic

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, More Is Better)
Intel ARC A580: 856.72 (SE +/- 6.14, N = 15; MIN: 458 / MAX: 1375)
Intel ARC A750: 777.93 (SE +/- 3.18, N = 3; MIN: 437 / MAX: 1200)

ParaView

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
Intel ARC A580: 252.66 (SE +/- 2.75, N = 4)
Intel ARC A750: 278.15 (SE +/- 0.99, N = 3)

Xonotic

This is a benchmark of Xonotic, which is a fork of the DarkPlaces-based Nexuiz game. Development began in March of 2010 on the Xonotic game for this open-source first person shooter title. Learn more via the OpenBenchmarking.org test page.

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, More Is Better)
Intel ARC A580: 809.46 (SE +/- 3.16, N = 3; MIN: 319 / MAX: 1303)
Intel ARC A750: 735.34 (SE +/- 8.54, N = 3; MIN: 308 / MAX: 1171)

Unigine Heaven

Unigine Heaven 4.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
Intel ARC A580: 205.25 (SE +/- 0.23, N = 3)
Intel ARC A750: 224.52 (SE +/- 0.14, N = 3)

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
Intel ARC A580: 133 (SE +/- 0.33, N = 3)
Intel ARC A750: 133 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 133 (SE +/- 0.58, N = 3)
NVIDIA RTX 4070 SUPER: 122 (SE +/- 2.08, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
Intel ARC A580: 127 (SE +/- 1.20, N = 3)
Intel ARC A750: 125 (SE +/- 1.53, N = 3)
Intel ARC A770 8Gb: 127 (SE +/- 0.33, N = 3)
NVIDIA RTX 4070 SUPER: 117 (SE +/- 2.08, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
Intel ARC A580: 69.48 (SE +/- 0.64, N = 7)
Intel ARC A750: 74.53 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
Intel ARC A750: 76.44 (SE +/- 0.84, N = 3; MIN: 9.56 / MAX: 84)
Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkFFT

VkFFT is a Fast Fourier Transform (FFT) Library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark runs FFT performance differences of many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
Intel ARC A580: 5134 (SE +/- 18.21, N = 3)
Intel ARC A750: 5425 (SE +/- 58.43, N = 3)
1. (CXX) g++ options: -O3

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MEDICAL-O3 (Composite Score, More Is Better)
Intel ARC A580: 39.13 (SE +/- 0.00, N = 3)
Intel ARC A750: 40.97 (SE +/- 0.00, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CATIA-06 (Composite Score, More Is Better)
Intel ARC A580: 45.84 (SE +/- 0.10, N = 3)
Intel ARC A750: 47.85 (SE +/- 0.05, N = 3)

Xonotic

This is a benchmark of Xonotic, which is a fork of the DarkPlaces-based Nexuiz game. Development began in March of 2010 on the Xonotic game for this open-source first person shooter title. Learn more via the OpenBenchmarking.org test page.

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultimate (Frames Per Second, More Is Better)
Intel ARC A580: 575.51 (SE +/- 5.72, N = 3; MIN: 106 / MAX: 1264)
Intel ARC A750: 551.35 (SE +/- 2.85, N = 3; MIN: 110 / MAX: 1221)

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MAYA-06 (Composite Score, More Is Better)
Intel ARC A580: 159.42 (SE +/- 0.08, N = 3)
Intel ARC A750: 165.76 (SE +/- 0.32, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
Intel ARC A580: 216.39 (SE +/- 0.20, N = 3)
Intel ARC A750: 224.72 (SE +/- 0.74, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, More Is Better)
Intel ARC A580: 105 (SE +/- 0.33, N = 3)
Intel ARC A750: 105 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 106 (SE +/- 0.33, N = 3)
NVIDIA RTX 4070 SUPER: 109 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SOLIDWORKS-07 (Composite Score, More Is Better)
Intel ARC A580: 70.72 (SE +/- 0.00, N = 3)
Intel ARC A750: 73.13 (SE +/- 0.01, N = 3)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) Library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark runs FFT performance differences of many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
Intel ARC A580: 30521 (SE +/- 20.17, N = 3)
Intel ARC A750: 31532 (SE +/- 251.38, N = 12)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
Intel ARC A580: 30388 (SE +/- 118.37, N = 3)
Intel ARC A750: 31218 (SE +/- 232.00, N = 15)
1. (CXX) g++ options: -O3

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
Intel ARC A580: 12.86 (SE +/- 0.13, N = 3)
Intel ARC A750: 12.53 (SE +/- 0.03, N = 3)

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1 - Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
Intel ARC A580: 0.176 (SE +/- 0.002, N = 3)
Intel ARC A750: 0.172 (SE +/- 0.000, N = 3)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 44.61 (SE +/- 0.12, N = 3)
Intel ARC A750: 45.57 (SE +/- 0.18, N = 3)

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SNX-04 (Composite Score, More Is Better)
Intel ARC A580: 168.75 (SE +/- 0.33, N = 3)
Intel ARC A750: 172.03 (SE +/- 0.08, N = 3)

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1 - Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
Intel ARC A580: 1.385 (SE +/- 0.005, N = 3)
Intel ARC A750: 1.362 (SE +/- 0.001, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
Intel ARC A580: 17.77 (SE +/- 0.02, N = 3)
Intel ARC A750: 18.04 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CREO-03 (Composite Score, More Is Better)
Intel ARC A580: 61.80 (SE +/- 0.04, N = 3)
Intel ARC A750: 62.65 (SE +/- 0.06, N = 3)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 1.73 (SE +/- 0.01, N = 3)
Intel ARC A750: 1.71 (SE +/- 0.01, N = 3)

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
Intel ARC A580: 101 (SE +/- 0.00, N = 3)
Intel ARC A750: 101 (SE +/- 0.00, N = 3)
Intel ARC A770 8Gb: 101 (SE +/- 0.33, N = 3)
NVIDIA RTX 4070 SUPER: 102 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
Intel ARC A580: 4.978 (SE +/- 0.015, N = 3)
Intel ARC A750: 4.932 (SE +/- 0.046, N = 3)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 15.81 (SE +/- 0.15, N = 6)
Intel ARC A750: 15.93 (SE +/- 0.11, N = 12)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 24.13 (SE +/- 0.20, N = 3)
Intel ARC A750: 23.95 (SE +/- 0.03, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 26.34 (SE +/- 0.06, N = 3)
Intel ARC A750: 26.52 (SE +/- 0.03, N = 3)

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1 - Test: Masskrug - Acceleration: OpenCL (Seconds, Fewer Is Better)
Intel ARC A580: 1.680 (SE +/- 0.006, N = 3)
Intel ARC A750: 1.690 (SE +/- 0.005, N = 3)

Darktable 4.8.1 - Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A580: 1.373 (SE +/- 0.004, N = 3)
Intel ARC A750: 1.381 (SE +/- 0.008, N = 3)

Darktable 4.8.1 - Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A580: 0.172 (SE +/- 0.001, N = 3)
Intel ARC A750: 0.173 (SE +/- 0.001, N = 3)

Darktable 4.8.1 - Test: Boat - Acceleration: OpenCL (Seconds, Fewer Is Better)
Intel ARC A580: 2.916 (SE +/- 0.004, N = 3)
Intel ARC A750: 2.900 (SE +/- 0.011, N = 3)

Darktable 4.8.1 - Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A580: 2.920 (SE +/- 0.005, N = 3)
Intel ARC A750: 2.905 (SE +/- 0.012, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
Intel ARC A580: 1192.99 (SE +/- 12.21, N = 3)
Intel ARC A750: 1187.36 (SE +/- 9.13, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 15.68 (SE +/- 0.07, N = 3)
Intel ARC A750: 15.75 (SE +/- 0.10, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.42 (SE +/- 0.01, N = 3)
Intel ARC A750: 2.43 (SE +/- 0.00, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A580: 5.23 (SE +/- 0.05, N = 4)
Intel ARC A750: 5.25 (SE +/- 0.05, N = 8)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) Library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark runs FFT performance differences of many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
Intel ARC A580: 58426 (SE +/- 583.21, N = 3)
Intel ARC A750: 58625 (SE +/- 70.72, N = 3)
1. (CXX) g++ options: -O3

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 39.74 (SE +/- 0.33, N = 8)
Intel ARC A750: 39.62 (SE +/- 0.26, N = 15)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 43.09 (SE +/- 0.27, N = 3)
Intel ARC A750: 43.01 (SE +/- 0.23, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
Intel ARC A580: 22.43 (SE +/- 0.00, N = 3)
Intel ARC A750: 22.44 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

VkFFT

VkFFT is a Fast Fourier Transform (FFT) Library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark runs FFT performance differences of many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
Intel ARC A580: 63157 (SE +/- 36.04, N = 3)
Intel ARC A750: 63113 (SE +/- 527.31, N = 3)
1. (CXX) g++ options: -O3

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1 - Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A580: 1.688 (SE +/- 0.005, N = 3)
Intel ARC A750: 1.689 (SE +/- 0.009, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
Intel ARC A580: 18.66 (SE +/- 0.00, N = 3)
Intel ARC A750: 18.66 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.20 (SE +/- 0.02, N = 3)
Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 27.95 (SE +/- 0.01, N = 3)
Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.08 (SE +/- 0.06, N = 3)
Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A580: 7.87 (SE +/- 0.01, N = 3)
Intel ARC A750: 7.87 (SE +/- 0.02, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.43 (SE +/- 0.01, N = 2)
Intel ARC A750: 2.43 (SE +/- 0.01, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.40 (SE +/- 0.00, N = 3)
Intel ARC A750: 2.40 (SE +/- 0.00, N = 3)

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A750: 2.43 (SE +/- 0.00, N = 3)

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: UnboundLocalError: cannot access local variable 'decorators' where it is not associated with a value

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.
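
The `ModuleNotFoundError` for `absl` above suggests that the Python environment used for that run was missing a module the benchmark scripts import. A small pre-flight check along these lines could catch that before a long run starts. This is only a sketch: the helper name `missing_modules` and the module list are assumptions for illustration, not part of the Phoronix Test Suite or of the TensorFlow benchmark scripts.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec() looks a module up without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]

# Hypothetical list: 'absl' comes from the error message above,
# 'tensorflow' is the framework under test.
required = ["absl", "tensorflow"]

missing = missing_modules(required)
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Run it with the same interpreter the benchmark uses, since a module installed for a different Python version or virtual environment will still be reported as missing.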

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software on the CPU and optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
NVIDIA RTX 4070 SUPER: 4070 (SE +/- 0.00, N = 3)

Intel ARC A770 8Gb: The test run did not produce a result. E: Failed to load CUDA driver ("/usr/lib64/libcuda.so.1")

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

Blender 4.0 - Compute: NVIDIA OptiX, NVIDIA RTX 4070 SUPER (Seconds, Fewer Is Better)
Blend File: Pabellon Barcelona: 14.29 (SE +/- 0.03, N = 3)
Blend File: Barbershop: 51.30 (SE +/- 0.10, N = 3)
Blend File: Fishy Cat: 9.45 (SE +/- 0.06, N = 13)
Blend File: Classroom: 12.60 (SE +/- 0.00, N = 3)
Blend File: BMW27: 5.57 (SE +/- 0.06, N = 13)

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - OpenCL BLAS tests, NVIDIA RTX 4070 SUPER (More Is Better)
Test: OpenCL BLAS - dGEMM-TT: 613 GFLOPs/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dGEMM-TN: 599 GFLOPs/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dGEMM-NT: 584 GFLOPs/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dGEMM-NN: 577 GFLOPs/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dGEMV-T: 389 GB/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dGEMV-N: 210 GB/s (SE +/- 0.33, N = 3)
Test: OpenCL BLAS - dDOT: 458 GB/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dAXPY: 437 GB/s (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dCOPY: 423 GB/s (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
NVIDIA RTX 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
1. (CXX) g++ options: -O3

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

FAHBench 2.3.2 (Ns Per Day, More Is Better)
NVIDIA RTX 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better)
NVIDIA RTX 4070 SUPER: 339.59 (SE +/- 0.30, N = 3)
1. (CXX) g++ options: -O3

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT

VkFFT is a Fast Fourier Transform (FFT) Library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark runs FFT performance differences of many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 4451 (SE +/- 12.55, N = 3)
1. (CXX) g++ options: -O3 -lrt

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 24317 (SE +/- 146.69, N = 3)
1. (CXX) g++ options: -O3 -lrt

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP64 Compute (TFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
Intel ARC A580: 2345224 (SE +/- 154699.96, N = 15)
Intel ARC A750: 2804852 (SE +/- 132865.40, N = 12)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ParaView

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better)
Intel ARC A580: 2633.04 (SE +/- 28.63, N = 4)
Intel ARC A750: 2898.67 (SE +/- 10.34, N = 3)

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (MiVoxels / Sec, More Is Better)
Intel ARC A580: 3692.08 (SE +/- 35.11, N = 15)
Intel ARC A750: 3901.28 (SE +/- 73.43, N = 15)

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
Intel ARC A580: 230.75 (SE +/- 2.19, N = 15)
Intel ARC A750: 243.83 (SE +/- 4.59, N = 15)

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better)
Intel ARC A580: 6519.45 (SE +/- 98.62, N = 15)
Intel ARC A750: 7245.45 (SE +/- 80.48, N = 5)

GLmark2

This is a test of GLmark2, a basic OpenGL and OpenGL ES 2.0 benchmark supporting various windowing/display back-ends. Learn more via the OpenBenchmarking.org test page.

Resolution: $VIDEO_WIDTH x $VIDEO_HEIGHT

Intel ARC A750: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

Betsy GPU Compressor

Betsy is an open-source GPU compressor of various GPU compression techniques. Betsy is written in GLSL for Vulkan/OpenGL (compute shader) support for GPU-based texture compression. Learn more via the OpenBenchmarking.org test page.

Codec: ETC2 RGB - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Codec: ETC1 - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

VkFFT

VkFFT is a Fast Fourier Transform (FFT) Library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark runs FFT performance differences of many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
Intel ARC A580: 72346 (SE +/- 1441.23, N = 12)
Intel ARC A750: 70571 (SE +/- 2534.09, N = 15)
1. (CXX) g++ options: -O3

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
Intel ARC A580: 93.54 (SE +/- 1.06, N = 3; MIN: 5.44 / MAX: 102.43)
Intel ARC A750: 61.47 (SE +/- 4.25, N = 3; MIN: 5.39 / MAX: 102)
Intel ARC A770 8Gb: 88.99 (SE +/- 2.65, N = 3; MIN: 5.39 / MAX: 101.65)
NVIDIA RTX 4070 SUPER: 2.86 (SE +/- 0.29, N = 9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
Intel ARC A580: 24.06 (SE +/- 9.84, N = 3; MIN: 2.55 / MAX: 57)
Intel ARC A750: 11.51 (SE +/- 2.95, N = 3; MIN: 2.57 / MAX: 56.91)
Intel ARC A770 8Gb: 27.01 (SE +/- 13.01, N = 3; MIN: 2.51 / MAX: 56.02)
NVIDIA RTX 4070 SUPER: 0.84 (SE +/- 0.04, N = 9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
Intel ARC A580: 106.56 (SE +/- 3.44, N = 3; MIN: 6.77 / MAX: 121.55)
Intel ARC A750: 54.56 (SE +/- 2.82, N = 3; MIN: 6.64 / MAX: 119.78)
Intel ARC A770 8Gb: 95.55 (SE +/- 1.81, N = 3; MIN: 6.7 / MAX: 121.28)
NVIDIA RTX 4070 SUPER: 5.07 (SE +/- 0.97, N = 9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
Intel ARC A580: 15.70 (SE +/- 7.50, N = 3; MIN: 3.96 / MAX: 70.57)
Intel ARC A750: 12.00 (SE +/- 0.84, N = 3; MIN: 4.04 / MAX: 70.29)
Intel ARC A770 8Gb: 17.29 (SE +/- 4.53, N = 3; MIN: 3.85 / MAX: 70.64)
NVIDIA RTX 4070 SUPER: 3.85 (SE +/- 1.31, N = 9; MIN: 1.89 / MAX: 1093.29)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
Intel ARC A580: 37.05 (SE +/- 23.69, N = 3; MIN: 4.64 / MAX: 94.33)
Intel ARC A750: 5.04 (SE +/- 0.19, N = 3; MIN: 4.69 / MAX: 90.57)
Intel ARC A770 8Gb: 5.93 (SE +/- 0.64, N = 3; MIN: 4.68 / MAX: 91.15)
NVIDIA RTX 4070 SUPER: 2.31 (SE +/- 0.34, N = 8)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
Intel ARC A580: 29.04 (SE +/- 13.05, N = 3; MIN: 4.39 / MAX: 85.51)
Intel ARC A750: 9.37 (SE +/- 3.80, N = 3; MIN: 4.43 / MAX: 84.15)
Intel ARC A770 8Gb: 15.46 (SE +/- 5.47, N = 3; MIN: 4.32 / MAX: 85.45)
NVIDIA RTX 4070 SUPER: 2.25 (SE +/- 0.16, N = 9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
Intel ARC A580: 51.04 (SE +/- 2.81, N = 3; MIN: 4.14 / MAX: 72.47)
Intel ARC A750: 38.33 (SE +/- 2.78, N = 3; MIN: 4.08 / MAX: 72.04)
Intel ARC A770 8Gb: 57.30 (SE +/- 5.70, N = 3; MIN: 4.06 / MAX: 71.75)
NVIDIA RTX 4070 SUPER: 3.03 (SE +/- 0.44, N = 9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
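
Several of the NCNN results above carry a standard error (SE) that is large compared with the mean, which makes the comparison between cards less reliable. The sketch below, using the FastestDet and shufflenet-v2 numbers from the charts, flags results whose SE exceeds a chosen fraction of the mean. The 10% threshold and all names in the code are assumptions for illustration; this is not how the page's own noisy-result filter is defined.

```python
# (mean in ms, standard error) pairs copied from the NCNN 20230517 charts above.
RESULTS = {
    "FastestDet": {
        "Intel ARC A580": (93.54, 1.06),
        "Intel ARC A750": (61.47, 4.25),
        "Intel ARC A770 8Gb": (88.99, 2.65),
        "NVIDIA RTX 4070 SUPER": (2.86, 0.29),
    },
    "shufflenet-v2": {
        "Intel ARC A580": (37.05, 23.69),
        "Intel ARC A750": (5.04, 0.19),
        "Intel ARC A770 8Gb": (5.93, 0.64),
        "NVIDIA RTX 4070 SUPER": (2.31, 0.34),
    },
}

# Hypothetical cut-off chosen for this example, not a Phoronix Test Suite value.
RSE_THRESHOLD = 0.10


def relative_standard_error(mean: float, standard_error: float) -> float:
    """Standard error expressed as a fraction of the mean."""
    return standard_error / mean


noisy = [
    (model, gpu)
    for model, per_gpu in RESULTS.items()
    for gpu, (mean, se) in per_gpu.items()
    if relative_standard_error(mean, se) > RSE_THRESHOLD
]

for model, gpu in noisy:
    mean, se = RESULTS[model][gpu]
    print(f"{model} on {gpu}: SE is {se / mean:.0%} of the mean")
```

With these inputs the Intel ARC A580 shufflenet-v2 result stands out most: its standard error is well over half of its mean, so its apparent gap to the other Intel cards should be read with caution.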

MandelGPU

MandelGPU is an OpenCL benchmark; this test runs the OpenCL float4 rendering kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.

MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
Intel ARC A580: 248630520.1 (SE +/- 600066.13, N = 3)
Intel ARC A750: 279683403.2 (SE +/- 5085622.05, N = 15)
NVIDIA RTX 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
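
The "maximum of 4096 iterations" in the MandelGPU description refers to the escape-time loop evaluated for every sample. The following is a minimal scalar sketch of that loop in Python, assuming the standard Mandelbrot recurrence z = z*z + c with an escape radius of 2. It is not MandelGPU's OpenCL kernel, which (per its float4 naming) processes samples in groups of four on the GPU; the function name is hypothetical.

```python
MAX_ITERATIONS = 4096  # iteration cap quoted in the test description


def escape_iterations(c: complex, max_iterations: int = MAX_ITERATIONS) -> int:
    """Count iterations of z = z*z + c until |z| exceeds 2, up to the cap."""
    z = 0j
    for iteration in range(max_iterations):
        if abs(z) > 2.0:
            return iteration
        z = z * z + c
    return max_iterations  # never escaped: treated as inside the set


# A point inside the set runs the full 4096 iterations (the costly case).
inside = escape_iterations(0j)
# A point far outside the set escapes almost immediately.
outside = escape_iterations(2 + 2j)

print(inside, outside)
```

The Samples/sec figure in the chart counts how many such per-sample evaluations the device completes each second, so points inside the set dominate the cost.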

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

Target: Vulkan GPU

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: ncnn: line 3: ./benchncnn: No such file or directory

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogLeNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x74746a490450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7d7151816450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x73552c3e3450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7670bcda4450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7b5ea59be450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7dd7c6de3450 google::LogMessageFatal::~LogMessageFatal()

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and on the CPU with OpenMP. The FinanceBench test cases focus on the Black-Scholes-Merton process with an analytic European option engine, the QMC (Sobol) Monte-Carlo method (equity option example), bonds (fixed-rate bond with flat forward curve), and repo (securities repurchase agreement). FinanceBench was originally written by the Cavazos Lab at the University of Delaware. Learn more via the OpenBenchmarking.org test page.

FinanceBench 2016-07-25 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
Intel ARC A580: 4.816 (SE +/- 0.279, N = 15)
Intel ARC A750: 4.842 (SE +/- 0.213, N = 12)
NVIDIA RTX 4070 SUPER: 5.912 (SE +/- 0.114, N = 15)
1. (CXX) g++ options: -O3 -march=native -fopenmp
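
As background for the Black-Scholes benchmark above, here is a minimal sketch of the closed-form Black-Scholes price of a European call option, the kind of analytic calculation such a kernel evaluates for many options in parallel. It is a plain CPU illustration in Python, not FinanceBench's OpenCL code; the function names and the sample inputs (spot 100, strike 100, 5% rate, 20% volatility, one year) are assumptions chosen for the example.

```python
import math


def norm_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, rate: float,
                       volatility: float, years: float) -> float:
    """Closed-form Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * volatility ** 2) * years) / (volatility * sqrt_t)
    d2 = d1 - volatility * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Hypothetical at-the-money option used only to exercise the formula.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05,
                           volatility=0.20, years=1.0)
print(f"call price: {price:.4f}")
```

Each option is priced independently of the others, which is why this workload maps naturally onto a GPU.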

ArrayFire

ArrayFire is a GPU and CPU numeric processing library; this test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Test: Conjugate Gradient OpenCL

NVIDIA RTX 4070 SUPER: The test run did not produce a result (reported three times). E: arrayfire: line 3: ./cg_opencl: No such file or directory

Intel ARC A750: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

Intel ARC A580: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
Intel ARC A580: 69895 (SE +/- 1784.34, N = 12)
Intel ARC A750: 100426 (SE +/- 82.39, N = 3)
NVIDIA RTX 4070 SUPER: 131705 (SE +/- 159.17, N = 3)
1. (CXX) g++ options: -O3 (the chart also shows an additional per-result flag, -lrt)

Test: FFT + iFFT C2C 1D batched in half precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

NVIDIA RTX 4070 SUPER: The test run did not produce a result (reported three times).

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
Intel ARC A580: 11.099 (SE +/- 0.021, N = 3)
Intel ARC A750: 10.356 (SE +/- 0.008, N = 3)
Intel ARC A770 8Gb: 9.808 (SE +/- 0.018, N = 3)
NVIDIA RTX 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times).

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT64 Compute (TIOPs/s, More Is Better)
Intel ARC A580: 1.013 (SE +/- 0.004, N = 3)
Intel ARC A750: 0.945 (SE +/- 0.038, N = 3)
NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

184 Results Shown
