OpenCL Testing

OpenCL tests for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/1704211-TR-OPENCLTES87&grs.

System Configuration

Processor: Intel Core i7-7700K @ 4.50GHz (8 Cores)
Motherboard: MSI Z270-A PRO (MS-7A71) v1.0
Chipset: Intel Device 591f + Z270
Memory: 16384MB
Disk: Samsung SSD 950 PRO 256GB
Audio: Realtek ALC892
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 17.04
Kernel: 4.10.0-19-generic (x86_64)
Desktop: Unity 7.5.0
Display Server: X Server 1.19.3
Display Driver: NVIDIA 381.09
OpenGL: 4.5.0
Vulkan: 1.0.42
Compiler: GCC 6.3.0 20170406
File-System: ext4
Screen Resolution: 3840x2160

Graphics

GeForce GTX 970: eVGA NVIDIA GeForce GTX 970 4096MB (1164/3505MHz)
GeForce GTX 980: NVIDIA GeForce GTX 980 4096MB (135/324MHz)
GeForce GTX 980 Ti: NVIDIA GeForce GTX 980 Ti 6144MB (999/3505MHz)
GeForce GTX 1050: Zotac NVIDIA GeForce GTX 1050 2048MB (1316/3504MHz)
GeForce GTX 1080 Ti #1: NVIDIA GeForce GTX 1080 Ti 11264MB (1468/5508MHz)
GeForce GTX 1050 Ti: eVGA NVIDIA GeForce GTX 1050 Ti 4096MB (1341/3504MHz)
GeForce GTX 1060: NVIDIA GeForce GTX 1060 6GB 6144MB (1505/4006MHz)
GeForce GTX 1070: NVIDIA GeForce GTX 1070 8192MB (250/4006MHz)
GeForce GTX 1080: NVIDIA GeForce GTX 1080 8192MB (84/5005MHz)
GeForce GTX 1080 Ti #2: NVIDIA GeForce GTX 1080 Ti 11264MB (1472/5508MHz)
Radeon RX 480: AMD Radeon RX 470/480 8192MB
Radeon RX 580: MSI AMD Radeon RX 470/480 8192MB
Radeon R9 Fury: Sapphire AMD Radeon R9 FURY / NANO 4096MB

Entries first listed with the Radeon RX 480 configuration

Monitor: Acer B286HK
Kernel: 4.8.0-040800-generic (x86_64)
Display Driver: modesetting 1.19.3
OpenCL: OpenCL 2.0 AMD-APP (2348.3)

Compiler Details

GeForce GTX 970, GeForce GTX 980, GeForce GTX 980 Ti, GeForce GTX 1050, GeForce GTX 1080 Ti #1, GeForce GTX 1050 Ti, GeForce GTX 1060, GeForce GTX 1070, GeForce GTX 1080, Radeon RX 480, Radeon RX 580, Radeon R9 Fury: --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic -v

Processor Details

Scaling Governor: intel_pstate powersave

OpenCL Details

GeForce GTX 970: GPU Compute Cores: 1664
GeForce GTX 980: GPU Compute Cores: 2048
GeForce GTX 980 Ti: GPU Compute Cores: 2816
GeForce GTX 1050: GPU Compute Cores: 640
GeForce GTX 1080 Ti #1: GPU Compute Cores: 3584
GeForce GTX 1050 Ti: GPU Compute Cores: 768
GeForce GTX 1060: GPU Compute Cores: 1280
GeForce GTX 1070: GPU Compute Cores: 1920
GeForce GTX 1080: GPU Compute Cores: 2560

System Details

The same GPU compute core counts as listed under OpenCL Details.

Graphics Details

Radeon R9 Fury: GLAMOR

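Results Overview

Test                                   GeForce GTX 970         GeForce GTX 980         GeForce GTX 980 Ti      GeForce GTX 1050
darktable: Boat - OpenCL               -                       -                       3.58                    -
shoc: OpenCL - Texture Read Bandwidth  280.99                  333.16                  351.70                  271.76
darktable: Masskrug - OpenCL           -                       -                       5.46                    -
shoc: OpenCL - Bus Speed Download      12.81                   12.81                   12.81                   12.81
shoc: OpenCL - Bus Speed Readback      13.17                   13.17                   13.17                   13.17
juliagpu: GPU                          111315195.20            121825827.07            137239997.10            66598918.93
shoc: OpenCL - MD5 Hash                6.57                    7.59                    9.34                    3.25
shoc: OpenCL - FFT SP                  399.01                  459.23                  712.96                  246.28
cl-mem: Write                          133.30                  154.80                  242.50                  86.13
cl-mem: Copy                           125.70                  142.77                  216.90                  87.13
darktable: Server Room - OpenCL        -                       -                       0.97                    -
cl-mem: Read                           143.63                  164.57                  265.83                  94.83
shoc: OpenCL - Max SP Flops            4333.19                 4989.37                 6156.85                 2112.97
shoc: OpenCL - Triad                   11.82                   11.94                   12.20                   11.40

Test                                   GeForce GTX 1080 Ti #1  GeForce GTX 1050 Ti     GeForce GTX 1060        GeForce GTX 1070
darktable: Boat - OpenCL               2.78                    16.30                   4.27                    3.46
shoc: OpenCL - Texture Read Bandwidth  592.43                  301.62                  380.73                  450.04
darktable: Masskrug - OpenCL           5.28                    18.40                   5.53                    5.35
shoc: OpenCL - Bus Speed Download      12.81                   12.81                   12.81                   12.81
shoc: OpenCL - Bus Speed Readback      13.17                   13.17                   13.17                   13.17
juliagpu: GPU                          202729946.73            80390064.17             119921484.47            151370700.07
shoc: OpenCL - MD5 Hash                19.81                   4.13                    7.34                    10.69
shoc: OpenCL - FFT SP                  986.68                  207.62                  329.51                  518.16
cl-mem: Write                          341.33                  85.03                   145.20                  195.73
cl-mem: Copy                           316.63                  86.67                   139.10                  186.70
darktable: Server Room - OpenCL        0.77                    12.15                   1.07                    0.87
cl-mem: Read                           337.87                  94.10                   153.23                  204.90
shoc: OpenCL - Max SP Flops            13088.70                2677.91                 4776.31                 7080.15
shoc: OpenCL - Triad                   12.41                   11.38                   11.88                   12.11

Test                                   GeForce GTX 1080        Radeon RX 480           Radeon RX 580           Radeon R9 Fury
darktable: Boat - OpenCL               3.28                    26.36                   26.37                   26.36
shoc: OpenCL - Texture Read Bandwidth  520.35                  14.89                   14.89                   14.89
darktable: Masskrug - OpenCL           5.35                    18.68                   18.69                   18.69
shoc: OpenCL - Bus Speed Download      12.81                   39.81                   40.20                   41.20
shoc: OpenCL - Bus Speed Readback      13.17                   39.43                   39.12                   40.40
juliagpu: GPU                          174049828.87            -                       -                       -
shoc: OpenCL - MD5 Hash                14.23                   0.08                    0.08                    0.08
shoc: OpenCL - FFT SP                  634.14                  4.05                    4.06                    4.03
cl-mem: Write                          219.97                  1.40                    1.40                    1.40
cl-mem: Copy                           208.97                  1.50                    1.50                    1.50
darktable: Server Room - OpenCL        0.87                    9.16                    9.17                    9.17
cl-mem: Read                           229.03                  2.67                    2.70                    2.70
shoc: OpenCL - Max SP Flops            9342.41                 130.20                  130.21                  130.16
shoc: OpenCL - Triad                   12.21                   8.39                    8.40                    8.42

A dash marks a test with no result for that configuration. GeForce GTX 1080 Ti #2 has no results in this export.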

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 2.2.1. Seconds, Fewer Is Better.

GeForce GTX 980 Ti: 3.58 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 16.30 (SE +/- 0.01, N = 3)
GeForce GTX 1060: 4.27 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 3.46 (SE +/- 0.01, N = 3)
GeForce GTX 1080: 3.28 (SE +/- 0.01, N = 3)
GeForce GTX 1080 Ti #1: 2.78 (SE +/- 0.01, N = 3)
Radeon RX 480: 26.36 (SE +/- 0.00, N = 3)
Radeon RX 580: 26.37 (SE +/- 0.01, N = 3)
Radeon R9 Fury: 26.36 (SE +/- 0.01, N = 3)
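The "SE +/-" figure reported with each result is a standard error over N runs. As a rough illustration of how such a figure can be derived, the sketch below computes the standard error of the mean as the sample standard deviation divided by the square root of N. The run times are hypothetical (the export contains only the averaged values), and whether the Phoronix Test Suite uses the sample or the population standard deviation is an assumption here.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical run times in seconds; these are NOT taken from the result file.
runs = [3.57, 3.58, 3.59]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A small standard error relative to the mean, as in most results here, suggests the runs were closely repeatable.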

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

GeForce GTX 970: 280.99 (SE +/- 0.19, N = 3)
GeForce GTX 980: 333.16 (SE +/- 1.29, N = 3)
GeForce GTX 980 Ti: 351.70 (SE +/- 1.47, N = 3)
GeForce GTX 1050: 271.76 (SE +/- 2.55, N = 3)
GeForce GTX 1080 Ti #1: 592.43 (SE +/- 1.29, N = 3)
GeForce GTX 1050 Ti: 301.62 (SE +/- 3.36, N = 3)
GeForce GTX 1060: 380.73 (SE +/- 0.39, N = 3)
GeForce GTX 1070: 450.04 (SE +/- 2.14, N = 3)
GeForce GTX 1080: 520.35 (SE +/- 2.01, N = 3)
Radeon RX 480: 14.89 (SE +/- 0.00, N = 3)
Radeon RX 580: 14.89 (SE +/- 0.00, N = 3)
Radeon R9 Fury: 14.89 (SE +/- 0.00, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 2.2.1. Seconds, Fewer Is Better.

GeForce GTX 980 Ti: 5.46 (SE +/- 0.01, N = 3)
GeForce GTX 1050 Ti: 18.40 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 5.53 (SE +/- 0.01, N = 3)
GeForce GTX 1070: 5.35 (SE +/- 0.02, N = 3)
GeForce GTX 1080: 5.35 (SE +/- 0.00, N = 3)
GeForce GTX 1080 Ti #1: 5.28 (SE +/- 0.00, N = 3)
Radeon RX 480: 18.68 (SE +/- 0.01, N = 3)
Radeon RX 580: 18.69 (SE +/- 0.02, N = 3)
Radeon R9 Fury: 18.69 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

GeForce GTX 970: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 980: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 980 Ti: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 1050: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 1080 Ti #1: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 12.81 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 12.81 (SE +/- 0.00, N = 3)
Radeon RX 480: 39.81 (SE +/- 0.57, N = 6)
Radeon RX 580: 40.20 (SE +/- 0.53, N = 3)
Radeon R9 Fury: 41.20 (SE +/- 0.30, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

GeForce GTX 970: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 980: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 980 Ti: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 1050: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 1080 Ti #1: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 13.17 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 13.17 (SE +/- 0.00, N = 3)
Radeon RX 480: 39.43 (SE +/- 0.36, N = 3)
Radeon RX 580: 39.12 (SE +/- 0.28, N = 3)
Radeon R9 Fury: 40.40 (SE +/- 0.36, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1. Samples/sec, More Is Better.

GeForce GTX 970: 111315195.20 (SE +/- 75709.36, N = 3)
GeForce GTX 980: 121825827.07 (SE +/- 162823.13, N = 3)
GeForce GTX 980 Ti: 137239997.10 (SE +/- 153929.03, N = 3)
GeForce GTX 1050: 66598918.93 (SE +/- 80114.08, N = 3)
GeForce GTX 1080 Ti #1: 202729946.73 (SE +/- 235270.97, N = 3)
GeForce GTX 1050 Ti: 80390064.17 (SE +/- 89962.63, N = 3)
GeForce GTX 1060: 119921484.47 (SE +/- 296618.42, N = 3)
GeForce GTX 1070: 151370700.07 (SE +/- 74030.18, N = 3)
GeForce GTX 1080: 174049828.87 (SE +/- 171757.93, N = 3)

1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10. GHash/s, More Is Better.

GeForce GTX 970: 6.57 (SE +/- 0.00, N = 3)
GeForce GTX 980: 7.59 (SE +/- 0.00, N = 3)
GeForce GTX 980 Ti: 9.34 (SE +/- 0.00, N = 3)
GeForce GTX 1050: 3.25 (SE +/- 0.00, N = 3)
GeForce GTX 1080 Ti #1: 19.81 (SE +/- 0.06, N = 3)
GeForce GTX 1050 Ti: 4.13 (SE +/- 0.01, N = 3)
GeForce GTX 1060: 7.34 (SE +/- 0.01, N = 3)
GeForce GTX 1070: 10.69 (SE +/- 0.01, N = 3)
GeForce GTX 1080: 14.23 (SE +/- 0.01, N = 3)
Radeon RX 480: 0.08 (SE +/- 0.00, N = 3)
Radeon RX 580: 0.08 (SE +/- 0.00, N = 3)
Radeon R9 Fury: 0.08 (SE +/- 0.00, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, More Is Better.

GeForce GTX 970: 399.01 (SE +/- 2.00, N = 3)
GeForce GTX 980: 459.23 (SE +/- 1.07, N = 3)
GeForce GTX 980 Ti: 712.96 (SE +/- 15.11, N = 6)
GeForce GTX 1050: 246.28 (SE +/- 8.54, N = 6)
GeForce GTX 1080 Ti #1: 986.68 (SE +/- 2.48, N = 3)
GeForce GTX 1050 Ti: 207.62 (SE +/- 6.55, N = 6)
GeForce GTX 1060: 329.51 (SE +/- 3.36, N = 3)
GeForce GTX 1070: 518.16 (SE +/- 7.10, N = 3)
GeForce GTX 1080: 634.14 (SE +/- 1.90, N = 3)
Radeon RX 480: 4.05 (SE +/- 0.01, N = 3)
Radeon RX 580: 4.06 (SE +/- 0.02, N = 3)
Radeon R9 Fury: 4.03 (SE +/- 0.00, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

cl-mem

Benchmark: Write

cl-mem 2017-01-13. GB/s, More Is Better.

GeForce GTX 970: 133.30 (SE +/- 0.00, N = 3)
GeForce GTX 980: 154.80 (SE +/- 0.00, N = 3)
GeForce GTX 980 Ti: 242.50 (SE +/- 0.10, N = 3)
GeForce GTX 1050: 86.13 (SE +/- 0.03, N = 3)
GeForce GTX 1080 Ti #1: 341.33 (SE +/- 0.23, N = 3)
GeForce GTX 1050 Ti: 85.03 (SE +/- 0.07, N = 3)
GeForce GTX 1060: 145.20 (SE +/- 0.15, N = 3)
GeForce GTX 1070: 195.73 (SE +/- 0.19, N = 3)
GeForce GTX 1080: 219.97 (SE +/- 0.27, N = 3)
Radeon RX 480: 1.40 (SE +/- 0.00, N = 3)
Radeon RX 580: 1.40 (SE +/- 0.00, N = 3)
Radeon R9 Fury: 1.40 (SE +/- 0.00, N = 3)

1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Copy

cl-mem 2017-01-13. GB/s, More Is Better.

GeForce GTX 970: 125.70 (SE +/- 0.00, N = 3)
GeForce GTX 980: 142.77 (SE +/- 0.03, N = 3)
GeForce GTX 980 Ti: 216.90 (SE +/- 0.00, N = 3)
GeForce GTX 1050: 87.13 (SE +/- 0.26, N = 3)
GeForce GTX 1080 Ti #1: 316.63 (SE +/- 0.47, N = 3)
GeForce GTX 1050 Ti: 86.67 (SE +/- 0.03, N = 3)
GeForce GTX 1060: 139.10 (SE +/- 0.10, N = 3)
GeForce GTX 1070: 186.70 (SE +/- 0.15, N = 3)
GeForce GTX 1080: 208.97 (SE +/- 0.29, N = 3)
Radeon RX 480: 1.50 (SE +/- 0.00, N = 3)
Radeon RX 580: 1.50 (SE +/- 0.00, N = 3)
Radeon R9 Fury: 1.50 (SE +/- 0.00, N = 3)

1. (CC) gcc options: -O2 -flto -lOpenCL

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 2.2.1. Seconds, Fewer Is Better.

GeForce GTX 980 Ti: 0.97 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 12.15 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 1.07 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 0.87 (SE +/- 0.01, N = 3)
GeForce GTX 1080: 0.87 (SE +/- 0.00, N = 3)
GeForce GTX 1080 Ti #1: 0.77 (SE +/- 0.00, N = 3)
Radeon RX 480: 9.16 (SE +/- 0.00, N = 3)
Radeon RX 580: 9.17 (SE +/- 0.00, N = 3)
Radeon R9 Fury: 9.17 (SE +/- 0.01, N = 3)

cl-mem

Benchmark: Read

cl-mem 2017-01-13. GB/s, More Is Better.

GeForce GTX 970: 143.63 (SE +/- 0.03, N = 3)
GeForce GTX 980: 164.57 (SE +/- 0.03, N = 3)
GeForce GTX 980 Ti: 265.83 (SE +/- 0.03, N = 3)
GeForce GTX 1050: 94.83 (SE +/- 0.03, N = 3)
GeForce GTX 1080 Ti #1: 337.87 (SE +/- 0.55, N = 3)
GeForce GTX 1050 Ti: 94.10 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 153.23 (SE +/- 0.09, N = 3)
GeForce GTX 1070: 204.90 (SE +/- 0.15, N = 3)
GeForce GTX 1080: 229.03 (SE +/- 0.39, N = 3)
Radeon RX 480: 2.67 (SE +/- 0.03, N = 3)
Radeon RX 580: 2.70 (SE +/- 0.00, N = 3)
Radeon R9 Fury: 2.70 (SE +/- 0.00, N = 3)

1. (CC) gcc options: -O2 -flto -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, More Is Better.

GeForce GTX 970: 4333.19 (SE +/- 1.10, N = 3)
GeForce GTX 980: 4989.37 (SE +/- 19.83, N = 3)
GeForce GTX 980 Ti: 6156.85 (SE +/- 2.28, N = 3)
GeForce GTX 1050: 2112.97 (SE +/- 0.29, N = 3)
GeForce GTX 1080 Ti #1: 13088.70 (SE +/- 25.00, N = 3)
GeForce GTX 1050 Ti: 2677.91 (SE +/- 5.27, N = 3)
GeForce GTX 1060: 4776.31 (SE +/- 14.72, N = 3)
GeForce GTX 1070: 7080.15 (SE +/- 21.97, N = 3)
GeForce GTX 1080: 9342.41 (SE +/- 14.19, N = 3)
Radeon RX 480: 130.20 (SE +/- 0.02, N = 3)
Radeon RX 580: 130.21 (SE +/- 0.01, N = 3)
Radeon R9 Fury: 130.16 (SE +/- 0.02, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
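One way to read the Max SP Flops figures is to normalize them by the GPU compute core counts listed under OpenCL Details. The sketch below does that arithmetic for the NVIDIA cards, the only ones with a core count in this export. Both dictionaries are copied from this result file; the function name `gflops_per_core` is purely illustrative, and the ratio is a rough derived figure that ignores clock speed differences.

```python
# Max SP Flops results (GFLOPS), copied from this result file.
max_sp_gflops = {
    "GeForce GTX 970": 4333.19,
    "GeForce GTX 980": 4989.37,
    "GeForce GTX 980 Ti": 6156.85,
    "GeForce GTX 1050": 2112.97,
    "GeForce GTX 1080 Ti #1": 13088.70,
    "GeForce GTX 1050 Ti": 2677.91,
    "GeForce GTX 1060": 4776.31,
    "GeForce GTX 1070": 7080.15,
    "GeForce GTX 1080": 9342.41,
}

# GPU compute core counts, copied from the OpenCL Details of this result file.
compute_cores = {
    "GeForce GTX 970": 1664,
    "GeForce GTX 980": 2048,
    "GeForce GTX 980 Ti": 2816,
    "GeForce GTX 1050": 640,
    "GeForce GTX 1080 Ti #1": 3584,
    "GeForce GTX 1050 Ti": 768,
    "GeForce GTX 1060": 1280,
    "GeForce GTX 1070": 1920,
    "GeForce GTX 1080": 2560,
}


def gflops_per_core(name):
    """Measured single-precision GFLOPS divided by the reported core count."""
    return max_sp_gflops[name] / compute_cores[name]


for name in max_sp_gflops:
    print(f"{name}: {gflops_per_core(name):.2f} GFLOPS per compute core")
```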

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

GeForce GTX 970: 11.82 (SE +/- 0.00, N = 3)
GeForce GTX 980: 11.94 (SE +/- 0.00, N = 3)
GeForce GTX 980 Ti: 12.20 (SE +/- 0.00, N = 3)
GeForce GTX 1050: 11.40 (SE +/- 0.00, N = 3)
GeForce GTX 1080 Ti #1: 12.41 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 11.38 (SE +/- 0.01, N = 3)
GeForce GTX 1060: 11.88 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 12.11 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 12.21 (SE +/- 0.00, N = 3)
Radeon RX 480: 8.39 (SE +/- 0.02, N = 3)
Radeon RX 580: 8.40 (SE +/- 0.11, N = 6)
Radeon R9 Fury: 8.42 (SE +/- 0.08, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt


Phoronix Test Suite v10.8.5