12700k HPC+OpenCL AVX512 performance profiling

Intel Core i7-12700K testing with an MSI PRO Z690-A DDR4(MS-7D25) v1.0 (1.15 BIOS) and Gigabyte AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 6GB on Pop 21.04 via the Phoronix Test Suite.

12700k AVX512 march=sapphirerapids gcc 11.1 rx 5600xt

Processor: Intel Core i7-12700K @ 6.30GHz (8 Cores / 16 Threads), Motherboard: MSI PRO Z690-A DDR4(MS-7D25) v1.0 (1.15 BIOS), Chipset: Intel Device 7aa7, Memory: 32GB, Disk: 500GB Western Digital WDS500G2B0C-00PXH0 + 3 x 10001GB Seagate ST10000DM0004-1Z + 128GB HP SSD S700 Pro, Graphics: Gigabyte AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 6GB (1650/750MHz), Audio: Realtek ALC897, Monitor: LG HDR WQHD, Network: Intel I225-V

OS: Pop 21.04, Kernel: 5.15.5-76051505-generic (x86_64), Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, OpenGL: 4.6 Mesa 21.2.2 (LLVM 12.0.0), OpenCL: OpenCL 2.2 AMD-APP (3361.0), Vulkan: 1.2.185, Compiler: GCC 11.1.0, File-System: ext4, Screen Resolution: 3440x1440

Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=sapphirerapids -mno-amx-tile -mno-amx-int8 -mno-amx-bf16" CFLAGS="-O3 -march=sapphirerapids -mno-amx-tile -mno-amx-int8 -mno-amx-bf16"
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-RPS7jb/gcc-11-11.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-RPS7jb/gcc-11-11.1.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / errors=remount-ro,noatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x15 - Thermald 2.4.3
Graphics Notes: GLAMOR - BAR1 / Visible vRAM Size: 6128 MB
Python Notes: Python 2.7.18 + Python 3.9.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

12700k AVX512 march=native + AVX512 gcc 11.1 rx 5600xt

Changed Disk to 500GB Western Digital WDS500G2B0C-00PXH0 + 3 x 10001GB Seagate ST10000DM0004-1Z + 300GB Western Digital WD3000GLFS-0 + 128GB HP SSD S700 Pro.

Environment Change: CXXFLAGS="-O3 -march=native -mavx512f -mavx512dq -mavx512ifma -mavx512cd -mavx512bw -mavx512vl -mavx512bf16 -mavx512vbmi -mavx512vbmi2 -mavx512vnni -mavx512bitalg -mavx512vpopcntdq -mavx512vp2intersect" CFLAGS="-O3 -march=native -mavx512f -mavx512dq -mavx512ifma -mavx512cd -mavx512bw -mavx512vl -mavx512bf16 -mavx512vbmi -mavx512vbmi2 -mavx512vnni -mavx512bitalg -mavx512vpopcntdq -mavx512vp2intersect" FFLAGS="-O3 -march=native -mavx512f -mavx512dq -mavx512ifma -mavx512cd -mavx512bw -mavx512vl -mavx512bf16 -mavx512vbmi -mavx512vbmi2 -mavx512vnni -mavx512bitalg -mavx512vpopcntdq -mavx512vp2intersect"
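The two runs differ mainly in their compiler flags. As a minimal sketch (not part of the Phoronix Test Suite), the following Python parses an environment-notes string of the form shown above and lists the ISA-related `-m` options each variable carries. The helper names `parse_env_flags` and `isa_options` are hypothetical and introduced only for illustration; the sample string is a shortened version of the flags above.

```python
import shlex

def parse_env_flags(notes):
    """Split 'NAME="flag flag" NAME2="..."' into {NAME: [flag, ...]}."""
    result = {}
    for token in shlex.split(notes):
        # shlex removes the quotes, leaving tokens such as 'CFLAGS=-O3 -march=native'
        name, sep, value = token.partition("=")
        if sep:
            result[name] = value.split()
    return result

def isa_options(flags):
    """Keep only -march=... and -m<feature> / -mno-<feature> options."""
    return [f for f in flags if f.startswith("-m")]

# Shortened sample; the real runs list many more -mavx512* options.
notes = 'CXXFLAGS="-O3 -march=native -mavx512f -mavx512vl" CFLAGS="-O3 -march=native -mavx512f -mavx512vl"'
for name, flags in parse_env_flags(notes).items():
    print(name, isa_options(flags))
```

Running the same helper on both runs' environment strings makes it easy to diff which instruction-set options were requested in each configuration.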

RELION

RELION - REgularised LIkelihood OptimisatioN - is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2D class averages in cryo-electron microscopy (cryo-EM). It is developed in the research group of Sjors Scheres at the MRC Laboratory of Molecular Biology. Learn more via the OpenBenchmarking.org test page.

Caffe

This is a benchmark of the Caffe deep learning framework. The test profile currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

OpenFOAM

OpenFOAM is the leading free, open source software for computational fluid dynamics (CFD). Learn more via the OpenBenchmarking.org test page.

HPL Linpack

HPL is a well known portable Linpack implementation for distributed memory systems. This test profile is testing HPL upstream directly, outside the scope of the HPC Challenge test profile also available through the Phoronix Test Suite (hpcc). The test profile attempts to generate an optimized HPL.dat input file based on the CPU/memory under test. The automated HPL.dat input generation is still being tuned and thus for now this test profile remains "experimental". Learn more via the OpenBenchmarking.org test page.
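To illustrate what sizing an HPL.dat input from system memory can look like, here is a small sketch of a commonly used rule of thumb: pick the largest matrix order N whose 8-byte elements fit in a chosen fraction of RAM, rounded down to a multiple of the block size. This is an assumption for illustration only and not necessarily how the test profile generates its input; the function name `hpl_problem_size`, the 80% memory fraction, and the block size of 192 are all hypothetical choices.

```python
import math

def hpl_problem_size(mem_bytes, fraction=0.8, block=192):
    """Largest N (multiple of `block`) with 8*N*N <= fraction * mem_bytes."""
    words = int(mem_bytes * fraction / 8)   # number of doubles that fit
    n = math.isqrt(words)                   # N x N matrix of doubles
    return (n // block) * block             # align N to the block size NB

# Assumes the reported 32GB of memory means 32 GiB.
print(hpl_problem_size(32 * 2**30))
```

A larger fraction leaves less room for the operating system and other processes, so the value is usually tuned per machine.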

SHOC Scalable HeterOgeneous Computing

This test profile covers the CUDA and OpenCL versions of Vetter's Scalable HeterOgeneous Computing (SHOC) benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
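For readers unfamiliar with the transform being timed, the sketch below evaluates the one-dimensional forward DFT directly from its definition in O(n^2) time. It is only a reference illustration of the mathematics, not FFTW's algorithm or API; the function name `naive_dft` is hypothetical.

```python
import cmath

def naive_dft(x):
    """Forward DFT by definition: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

# A unit impulse transforms to a flat spectrum (every bin close to 1).
print(naive_dft([1, 0, 0, 0]))
```

Fast Fourier transform libraries compute the same result in O(n log n) time, which is why transform size and data type appear as separate benchmark configurations.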

Parboil

The Parboil Benchmarks from the IMPACT Research Group at University of Illinois are a set of throughput computing applications for looking at computing architecture and compilers. Parboil test-cases support OpenMP, OpenCL, and CUDA multi-processing environments. However, at this time the test profile is just making use of the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

CP2K Molecular Dynamics

CP2K is an open-source molecular dynamics software package focused on quantum chemistry and solid-state physics. This test profile currently uses the SSMP (OpenMP) version of cp2k. Learn more via the OpenBenchmarking.org test page.

Numpy Benchmark

This is a test to obtain the general Numpy performance. Learn more via the OpenBenchmarking.org test page.

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.

Intel MPI Benchmarks

The Intel MPI Benchmarks stress MPI implementations. At this point the test profile aggregates results for some common MPI functionality. Learn more via the OpenBenchmarking.org test page.

Timed HMMer Search

This test searches through the Pfam database of profile hidden Markov models. The search finds the domain structure of Drosophila Sevenless protein. Learn more via the OpenBenchmarking.org test page.

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite, with OpenCL / CUDA / OpenMP test cases of automotive workloads for evaluating programming models in the context of autonomous driving. Learn more via the OpenBenchmarking.org test page.

Himeno Benchmark

The Himeno benchmark is a linear solver for the pressure Poisson equation using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
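As a rough illustration of the point-Jacobi idea, the sketch below solves a one-dimensional Poisson problem, -u'' = f with fixed end values, by repeatedly replacing each interior value with an average built from its neighbours in the previous sweep. This is a deliberately simplified 1D analogue written for this page, not the Himeno kernel itself; the function name `jacobi_poisson_1d` and its parameters are hypothetical.

```python
def jacobi_poisson_1d(f, left, right, h, tol=1e-10, max_iter=100_000):
    """Solve -u'' = f on a uniform grid (spacing h) with fixed end values."""
    n = len(f)
    u = [0.0] * n
    u[0], u[-1] = left, right
    for _ in range(max_iter):
        new = u[:]  # Jacobi: every update reads only the previous sweep
        for i in range(1, n - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            break
    return u

# With f = 0 the solution is the straight line between the boundary values.
u = jacobi_poisson_1d([0.0] * 11, 0.0, 1.0, 0.1)
print([round(v, 3) for v in u])
```

Because each sweep depends only on the previous one, all interior points can be updated independently, which is what makes this style of solver a natural memory-bandwidth and floating-point benchmark.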

QMCPACK

QMCPACK is a modern, high-performance, open-source Quantum Monte Carlo (QMC) simulation code; this benchmark runs the H2O example with MPI. QMCPACK is a production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids, and is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

R Benchmark

This test is a quick-running survey of general R performance. Learn more via the OpenBenchmarking.org test page.

DeepSpeech

Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three minute audio recording. Learn more via the OpenBenchmarking.org test page.

RNNoise

RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.

ArrayFire

ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

GNU Octave Benchmark

This test profile measures how long it takes to complete several reference GNU Octave files via octave-benchmark. GNU Octave is used for numerical computations and is an open-source alternative to MATLAB. Learn more via the OpenBenchmarking.org test page.

Timed MAFFT Alignment

This test performs an alignment of 100 pyruvate decarboxylase sequences. Learn more via the OpenBenchmarking.org test page.

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

106 Results Shown

RELION
Caffe
OpenFOAM
HPL Linpack
SHOC Scalable HeterOgeneous Computing
LeelaChessZero
Caffe:
AlexNet - CPU - 1000
GoogleNet - CPU - 100
FFTW
Parboil
GROMACS
Caffe
OpenFOAM
oneDNN
CP2K Molecular Dynamics
Numpy Benchmark
Parboil
FFTW
TensorFlow Lite
Intel MPI Benchmarks:
IMB-MPI1 Exchange:
Average usec
Average Mbytes/sec
TensorFlow Lite
oneDNN
Timed HMMer Search
oneDNN:
Recurrent Neural Network Training - u8s8f32 - CPU
Recurrent Neural Network Training - bf16bf16bf16 - CPU
Recurrent Neural Network Training - f32 - CPU
Recurrent Neural Network Inference - u8s8f32 - CPU
Recurrent Neural Network Inference - bf16bf16bf16 - CPU
Timed MrBayes Analysis
Pennant
ASKAP:
tConvolve MT - Degridding
tConvolve MT - Gridding
NAMD
TensorFlow Lite:
Mobilenet Quant
SqueezeNet
NASNet Mobile
Mobilenet Float
Darmstadt Automotive Parallel Heterogeneous Suite
Himeno Benchmark
SHOC Scalable HeterOgeneous Computing
QMCPACK
Pennant
Caffe
ACES DGEMM
miniFE
R Benchmark
DeepSpeech
ASKAP:
tConvolve MPI - Gridding
tConvolve MPI - Degridding
oneDNN
Caffe
oneDNN:
Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU
Deconvolution Batch shapes_1d - u8s8f32 - CPU
Intel MPI Benchmarks
oneDNN:
IP Shapes 1D - bf16bf16bf16 - CPU
IP Shapes 3D - bf16bf16bf16 - CPU
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU
Darmstadt Automotive Parallel Heterogeneous Suite
Intel MPI Benchmarks:
IMB-MPI1 Sendrecv:
Average usec
Average Mbytes/sec
RNNoise
Intel MPI Benchmarks
Parboil
ArrayFire
oneDNN:
Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU
IP Shapes 1D - f32 - CPU
IP Shapes 1D - u8s8f32 - CPU
Algebraic Multi-Grid Benchmark
oneDNN
Darmstadt Automotive Parallel Heterogeneous Suite
ASKAP
oneDNN:
Matrix Multiply Batch Shapes Transformer - f32 - CPU
Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU
LULESH
oneDNN:
IP Shapes 3D - u8s8f32 - CPU
Deconvolution Batch shapes_3d - f32 - CPU
SHOC Scalable HeterOgeneous Computing
ASKAP:
tConvolve OpenMP - Degridding
tConvolve OpenMP - Gridding
GNU Octave Benchmark
Timed MAFFT Alignment
FFTW
cl-mem:
Copy
Read
Write
oneDNN:
Convolution Batch Shapes Auto - f32 - CPU
Convolution Batch Shapes Auto - u8s8f32 - CPU
Convolution Batch Shapes Auto - bf16bf16bf16 - CPU
Darktable:
Boat - OpenCL
Masskrug - OpenCL
SHOC Scalable HeterOgeneous Computing
Darktable
FFTW
SHOC Scalable HeterOgeneous Computing
FFTW:
Stock - 2D FFT Size 32
Stock - 1D FFT Size 32
Parboil
FFTW:
Float + SSE - 1D FFT Size 32
Float + SSE - 2D FFT Size 32
SHOC Scalable HeterOgeneous Computing:
OpenCL - Triad
OpenCL - MD5 Hash
OpenCL - Reduction
OpenCL - FFT SP
Darktable
SHOC Scalable HeterOgeneous Computing

12700k AVX512 march=sapphirerapids gcc 11.1 rx 5600xt

Testing initiated at 9 December 2021 07:03 by user felix.

12700k AVX512 march=native + AVX512 gcc 11.1 rx 5600xt

Testing initiated at 11 December 2021 11:22 by user felix.
