Ubuntu 22.04.02 LTS 7900X 7900XTX opencl
AMD Ryzen 9 7900X 12-Core testing with an ASUS ROG STRIX B650E-F GAMING WIFI (1410 BIOS) and ASUS NVIDIA GeForce RTX 4080 16GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2305210-NE-2305163NE65&sor&grt .
System configuration (OpenBenchmarking.org)

Ubuntu 22.04.02 LTS 7900X 7900XTX opencl:
Processor: AMD Ryzen 9 7900X 12-Core @ 4.70GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG STRIX B650E-F GAMING WIFI (1410 BIOS)
Chipset: AMD Device 14d8
Memory: 64GB
Disk: 2000GB SHPP41-2000GM + 120GB TOSHIBA RC100 + 1000GB Western Digital WD_BLACK SN750 SE NVMe 1TB + 32GB Flash Drive
Graphics: AMD Radeon RX 7900 XTX 24GB (3220/1249MHz)
Audio: AMD Device ab30
Monitor: LG HDR 4K + LG Ultra HD
Network: Intel I225-V + MEDIATEK Device 0608
OS: Ubuntu 22.04
Kernel: 5.19.0-41-generic (x86_64)
Desktop: Budgie 10.6.1
Display Server: X Server 1.21.1.4
OpenGL: 4.6 Mesa 22.3.0-devel (LLVM 15.0.3 DRM 3.48)
OpenCL: OpenCL 2.1 AMD-APP (3513.0)
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 7680x2160

Ubuntu 22.04.02 LTS 7900X 4080 opencl (only the fields listed for this configuration):
Memory: 32GB
Disk: 2000GB SHPP41-2000GM + 120GB TOSHIBA RC100 + 1000GB Western Digital WD_BLACK SN750 SE NVMe 1TB
Graphics: ASUS NVIDIA GeForce RTX 4080 16GB
Audio: NVIDIA Device 22bb
Display Driver: NVIDIA 530.41.03
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.1.98

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203

Graphics Details
- Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: GLAMOR - BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-TIC106615-100
- Ubuntu 22.04.02 LTS 7900X 4080 opencl: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 95.03.2b.00.8c

Python Details - Python 3.10.6

Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

OpenCL Details - Ubuntu 22.04.02 LTS 7900X 4080 opencl: GPU Compute Cores: 9728
Result overview (OpenBenchmarking.org)
Columns: test, Ubuntu 22.04.02 LTS 7900X 7900XTX opencl (7900XTX), Ubuntu 22.04.02 LTS 7900X 4080 opencl (4080). A "-" marks a test with no result listed for that configuration.

Test                                           7900XTX   4080
cl-mem: Copy                                   -         374.9
cl-mem: Read                                   -         620.7
cl-mem: Write                                  -         520.5
clpeak: Kernel Latency                         -         3.89
clpeak: Integer Compute                        -         23782.67
clpeak: Integer 24-bit Compute                 -         23875.52
clpeak: Global Memory Bandwidth                -         577.23
clpeak: Double-Precision Compute               -         817.20
clpeak: Single-Precision Compute               -         46337.02
clpeak: Transfer Bandwidth enqueueReadBuffer   -         11.41
clpeak: Transfer Bandwidth enqueueWriteBuffer  -         12.63
darktable: Boat - OpenCL                       2.758     1.187
darktable: Masskrug - OpenCL                   3.027     1.864
darktable: Server Rack - OpenCL                0.121     0.111
darktable: Server Room - OpenCL                2.504     0.647
fluidx3d: FP32-FP32                            -         3772
fluidx3d: FP32-FP16C                           -         7705
fluidx3d: FP32-FP16S                           -         7786
lulesh-cl:                                     -         9189.9160
luxmark: GPU - Hotel                           -         22310
luxmark: CPU+GPU - Hotel                       -         22294
luxmark: GPU - Microphone                      -         75029
luxmark: GPU - Luxball HDR                     -         99385
luxmark: CPU+GPU - Microphone                  -         75300
luxmark: CPU+GPU - Luxball HDR                 -         99792
rodinia: OpenCL Myocyte                        -         19.916
rodinia: OpenCL Leukocyte                      -         2.393
rodinia: OpenCL Particle Filter                -         2.929
shoc: OpenCL - S3D                             -         422.946
shoc: OpenCL - Triad                           -         25.9744
shoc: OpenCL - FFT SP                          -         1812.90
shoc: OpenCL - MD5 Hash                        -         57.0867
shoc: OpenCL - Reduction                       -         946.679
shoc: OpenCL - GEMM SGEMM_N                    -         16861.6
shoc: OpenCL - Max SP Flops                    -         53452.7
shoc: OpenCL - Bus Speed Download              -         26.8876
shoc: OpenCL - Bus Speed Readback              -         26.3975
shoc: OpenCL - Texture Read Bandwidth          -         2970.92
smallpt-gpu: GPU - 7680 x 2160 - Caustic       -         1684649225
smallpt-gpu: GPU - 7680 x 2160 - Cornell       -         1684649363
smallpt-gpu: GPU - 7680 x 2160 - Caustic3      -         1684649501
viennacl: CPU BLAS - sCOPY                     179       103
viennacl: CPU BLAS - sAXPY                     289       164
viennacl: CPU BLAS - sDOT                      287       178
viennacl: CPU BLAS - dCOPY                     63.5      36.0
viennacl: CPU BLAS - dAXPY                     95.4      54.4
viennacl: CPU BLAS - dDOT                      99.2      59.9
viennacl: CPU BLAS - dGEMV-N                   115       73.3
viennacl: CPU BLAS - dGEMV-T                   132       78.8
viennacl: CPU BLAS - dGEMM-NN                  64.1      64.4
viennacl: CPU BLAS - dGEMM-NT                  61.6      61.9
viennacl: CPU BLAS - dGEMM-TN                  69.9      70.3
viennacl: CPU BLAS - dGEMM-TT                  66.2      66.2
viennacl: OpenCL BLAS - sCOPY                  -         367
viennacl: OpenCL BLAS - sAXPY                  -         483
viennacl: OpenCL BLAS - sDOT                   -         412
viennacl: OpenCL BLAS - dCOPY                  -         524
viennacl: OpenCL BLAS - dAXPY                  -         605
viennacl: OpenCL BLAS - dDOT                   -         596
viennacl: OpenCL BLAS - dGEMV-N                -         219
viennacl: OpenCL BLAS - dGEMV-T                -         427
viennacl: OpenCL BLAS - dGEMM-NN               -         743
viennacl: OpenCL BLAS - dGEMM-NT               -         766
viennacl: OpenCL BLAS - dGEMM-TN               -         801
viennacl: OpenCL BLAS - dGEMM-TT               -         817
xsbench-cl:                                    -         -
cl-mem 2017-01-13 (OpenBenchmarking.org)
GB/s, More Is Better. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
Benchmark: Copy: 374.9 (SE +/- 0.12, N = 3)
Benchmark: Read: 620.7 (SE +/- 0.22, N = 3)
Benchmark: Write: 520.5 (SE +/- 1.33, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
clpeak 1.1.2 (OpenBenchmarking.org)
Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
OpenCL Test: Kernel Latency (us, Fewer Is Better): 3.89 (SE +/- 0.05, N = 15)
OpenCL Test: Integer Compute (GIOPS, More Is Better): 23782.67 (SE +/- 61.13, N = 3)
OpenCL Test: Integer 24-bit Compute (GIOPS, More Is Better): 23875.52 (SE +/- 101.49, N = 3)
OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better): 577.23 (SE +/- 3.45, N = 3)
OpenCL Test: Double-Precision Compute (GFLOPS, More Is Better): 817.20 (SE +/- 1.38, N = 3)
OpenCL Test: Single-Precision Compute (GFLOPS, More Is Better): 46337.02 (SE +/- 82.14, N = 3)
OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better): 11.41 (SE +/- 0.03, N = 3)
OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better): 12.63 (SE +/- 0.16, N = 3)
1. (CXX) g++ options: -O3
Darktable 3.8.1 (OpenBenchmarking.org)
Seconds, Fewer Is Better
Test: Boat - Acceleration: OpenCL
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1.187 (SE +/- 0.006, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 2.758 (SE +/- 0.022, N = 3)
Test: Masskrug - Acceleration: OpenCL
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1.864 (SE +/- 0.008, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 3.027 (SE +/- 0.007, N = 3)
Test: Server Rack - Acceleration: OpenCL
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 0.111 (SE +/- 0.000, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 0.121 (SE +/- 0.002, N = 3)
Test: Server Room - Acceleration: OpenCL
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 0.647 (SE +/- 0.003, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 2.504 (SE +/- 0.006, N = 3)
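Since the Darktable results are times (fewer is better), the gap between the two runs can be read as a ratio of the slower time to the faster one. The following is a minimal sketch of that arithmetic using only the four pairs of values reported above; the short labels "7900XTX" and "4080" and the helper name `speedup` are illustrative choices, not part of the Phoronix Test Suite output.

```python
# Darktable 3.8.1 OpenCL timings in seconds (fewer is better),
# copied from the results above. The dictionary keys are shorthand
# labels for the two result identifiers, chosen for this sketch.
darktable_seconds = {
    "Boat": {"7900XTX": 2.758, "4080": 1.187},
    "Masskrug": {"7900XTX": 3.027, "4080": 1.864},
    "Server Rack": {"7900XTX": 0.121, "4080": 0.111},
    "Server Room": {"7900XTX": 2.504, "4080": 0.647},
}


def speedup(slower_s, faster_s):
    """Return how many times shorter faster_s is than slower_s."""
    return slower_s / faster_s


# Ratio of the 7900XTX run time to the 4080 run time for each test.
speedups = {
    test: speedup(times["7900XTX"], times["4080"])
    for test, times in darktable_seconds.items()
}

for test, ratio in speedups.items():
    print(f"{test}: 7900XTX time / 4080 time = {ratio:.2f}")
```

A ratio above 1 means the 4080 run finished sooner in that test. The ratio says nothing about why the runs differ, and the two configurations also differ in memory and driver stack.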
FluidX3D 2.3 (OpenBenchmarking.org)
MLUPs/s, More Is Better. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
Test: FP32-FP32: 3772 (SE +/- 4.37, N = 3)
Test: FP32-FP16C: 7705 (SE +/- 1.00, N = 3)
Test: FP32-FP16S: 7786 (SE +/- 0.67, N = 3)
Lulesh OpenCL 2017-07-06 (OpenBenchmarking.org)
z/s, More Is Better. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
Result: 9189.92 (SE +/- 33.49, N = 3)
1. (CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm
LuxMark 3.1 (OpenBenchmarking.org)
Score, More Is Better. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
OpenCL Device: GPU - Scene: Hotel: 22310 (SE +/- 37.00, N = 3)
OpenCL Device: CPU+GPU - Scene: Hotel: 22294 (SE +/- 11.26, N = 3)
OpenCL Device: GPU - Scene: Microphone: 75029 (SE +/- 252.33, N = 3)
OpenCL Device: GPU - Scene: Luxball HDR: 99385 (SE +/- 399.21, N = 3)
OpenCL Device: CPU+GPU - Scene: Microphone: 75300 (SE +/- 12.53, N = 3)
OpenCL Device: CPU+GPU - Scene: Luxball HDR: 99792 (SE +/- 6.36, N = 3)
Rodinia 3.1 (OpenBenchmarking.org)
Seconds, Fewer Is Better. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
Test: OpenCL Myocyte: 19.92 (SE +/- 0.09, N = 3)
Test: OpenCL Leukocyte: 2.393 (SE +/- 0.030, N = 12)
Test: OpenCL Particle Filter: 2.929 (SE +/- 0.028, N = 7)
1. (CXX) g++ options: -O2 -lOpenCL
SHOC Scalable HeterOgeneous Computing 2020-04-17 (OpenBenchmarking.org)
More Is Better for all benchmarks. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
Target: OpenCL - Benchmark: S3D (GFLOPS): 422.95 (SE +/- 0.21, N = 3)
Target: OpenCL - Benchmark: Triad (GB/s): 25.97 (SE +/- 0.00, N = 3)
Target: OpenCL - Benchmark: FFT SP (GFLOPS): 1812.90 (SE +/- 2.20, N = 3)
Target: OpenCL - Benchmark: MD5 Hash (GHash/s): 57.09 (SE +/- 0.32, N = 3)
Target: OpenCL - Benchmark: Reduction (GB/s): 946.68 (SE +/- 8.26, N = 8)
Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS): 16861.6 (SE +/- 152.29, N = 15)
Target: OpenCL - Benchmark: Max SP Flops (GFLOPS): 53452.7 (SE +/- 92.32, N = 3)
Target: OpenCL - Benchmark: Bus Speed Download (GB/s): 26.89 (SE +/- 0.00, N = 3)
Target: OpenCL - Benchmark: Bus Speed Readback (GB/s): 26.40 (SE +/- 0.00, N = 3)
Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s): 2970.92 (SE +/- 2.71, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SmallPT GPU 1.6pts1 (OpenBenchmarking.org)
Samples/sec, More Is Better. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
OpenCL Device: GPU - Resolution: 7680 x 2160 - Scene: Caustic: 1684649225 (SE +/- 25.69, N = 3)
OpenCL Device: GPU - Resolution: 7680 x 2160 - Scene: Cornell: 1684649363 (SE +/- 25.12, N = 3)
OpenCL Device: GPU - Resolution: 7680 x 2160 - Scene: Caustic3: 1684649501 (SE +/- 25.40, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
ViennaCL 1.7.1, CPU BLAS tests (OpenBenchmarking.org)
More Is Better for all tests
Test: CPU BLAS - sCOPY (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 179 (SE +/- 0.58, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 103 (SE +/- 1.00, N = 3)
Test: CPU BLAS - sAXPY (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 289 (SE +/- 0.33, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 164 (SE +/- 1.20, N = 3)
Test: CPU BLAS - sDOT (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 287 (SE +/- 0.00, N = 2)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 178 (SE +/- 0.67, N = 3)
Test: CPU BLAS - dCOPY (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 63.5 (SE +/- 0.09, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 36.0 (SE +/- 0.12, N = 3)
Test: CPU BLAS - dAXPY (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 95.4 (SE +/- 0.12, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 54.4 (SE +/- 0.03, N = 3)
Test: CPU BLAS - dDOT (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 99.2 (SE +/- 0.19, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 59.9 (SE +/- 0.03, N = 3)
Test: CPU BLAS - dGEMV-N (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 115.0 (SE +/- 0.67, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 73.3 (SE +/- 0.07, N = 3)
Test: CPU BLAS - dGEMV-T (GB/s)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 132.0 (SE +/- 0.33, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 78.8 (SE +/- 0.12, N = 3)
Test: CPU BLAS - dGEMM-NN (GFLOPs/s)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 64.4 (SE +/- 0.20, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 64.1 (SE +/- 0.20, N = 3)
Test: CPU BLAS - dGEMM-NT (GFLOPs/s)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 61.9 (SE +/- 0.19, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 61.6 (SE +/- 0.17, N = 3)
Test: CPU BLAS - dGEMM-TN (GFLOPs/s)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 70.3 (SE +/- 0.17, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 69.9 (SE +/- 0.12, N = 3)
Test: CPU BLAS - dGEMM-TT (GFLOPs/s)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 66.2 (SE +/- 0.17, N = 3)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 66.2 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
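The CPU BLAS tests ran on both configurations, so one rough way to read them is to compare each difference with the reported standard errors. The sketch below does that for four of the tests above. It assumes the two SE values can be treated as independent standard errors and combined in quadrature, and the threshold of 2 is an arbitrary screening value chosen for illustration; with N = 2 or 3 runs per result this is only a coarse check, not a significance test.

```python
import math

# (test, 7900XTX value, 7900XTX SE, 4080 value, 4080 SE), copied from the
# ViennaCL CPU BLAS results above. "7900XTX" and "4080" are shorthand for
# the two result identifiers.
cpu_blas = [
    ("sCOPY", 179.0, 0.58, 103.0, 1.00),
    ("dCOPY", 63.5, 0.09, 36.0, 0.12),
    ("dGEMM-NN", 64.1, 0.20, 64.4, 0.20),
    ("dGEMM-TT", 66.2, 0.12, 66.2, 0.17),
]


def difference_in_se_units(a, se_a, b, se_b):
    """Absolute difference divided by the combined standard error.

    Assumption: the two standard errors are independent, so they are
    combined as sqrt(se_a**2 + se_b**2).
    """
    return abs(a - b) / math.hypot(se_a, se_b)


z = {
    name: difference_in_se_units(a, se_a, b, se_b)
    for name, a, se_a, b, se_b in cpu_blas
}

THRESHOLD = 2.0  # illustrative cut-off, not taken from the source
for name, value in z.items():
    verdict = "larger than the noise" if value > THRESHOLD else "within the noise"
    print(f"{name}: {value:.1f} combined SEs apart ({verdict})")
```

Under these assumptions the copy tests differ by far more than their run-to-run variation, while the dGEMM results are close to each other. The system table lists different memory sizes for the two configurations (64GB and 32GB), but the export itself does not state a cause for the difference.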
ViennaCL 1.7.1, OpenCL BLAS tests (OpenBenchmarking.org)
More Is Better for all tests. Configuration: Ubuntu 22.04.02 LTS 7900X 4080 opencl
Test: OpenCL BLAS - sCOPY (GB/s): 367 (SE +/- 1.53, N = 3)
Test: OpenCL BLAS - sAXPY (GB/s): 483 (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - sDOT (GB/s): 412 (SE +/- 0.33, N = 3)
Test: OpenCL BLAS - dCOPY (GB/s): 524 (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dAXPY (GB/s): 605 (SE +/- 0.33, N = 3)
Test: OpenCL BLAS - dDOT (GB/s): 596 (SE +/- 0.33, N = 3)
Test: OpenCL BLAS - dGEMV-N (GB/s): 219 (SE +/- 0.00, N = 3)
Test: OpenCL BLAS - dGEMV-T (GB/s): 427 (SE +/- 0.33, N = 3)
Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s): 743 (SE +/- 1.53, N = 3)
Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s): 766 (SE +/- 1.33, N = 3)
Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s): 801 (SE +/- 1.20, N = 3)
Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s): 817 (SE +/- 1.53, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
Phoronix Test Suite v10.8.5