tgl tgl

Intel Core i7-1185G7 testing with a Dell XPS 13 9310 0DXP1F (3.7.0 BIOS) and Intel Xe TGL GT2 15GB on Ubuntu 23.10 via the Phoronix Test Suite.

a

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xb4 - Thermald 2.5.4
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Mitigation of Microcode + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
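The Security Notes line above packs one `name: status` pair per vulnerability, joined by ` + `. A minimal sketch of splitting it into a lookup table follows, assuming that separator convention holds. The function names `parse_security_notes` and `mitigated` are hypothetical helpers for illustration, not part of the Phoronix Test Suite, and the sample string is an excerpt of the line above.

```python
def parse_security_notes(line: str) -> dict[str, str]:
    """Split a 'Security Notes' line into {vulnerability: status}."""
    # Drop the leading label if it is present.
    prefix = "Security Notes:"
    if line.startswith(prefix):
        line = line[len(prefix):]
    entries = {}
    for item in line.split(" + "):
        # Split on the first ": " only, since some statuses contain colons.
        name, _, status = item.strip().partition(": ")
        entries[name] = status
    return entries


def mitigated(entries: dict[str, str]) -> list[str]:
    """Return the names whose status starts with 'Mitigation', sorted."""
    return sorted(n for n, s in entries.items() if s.startswith("Mitigation"))


# Excerpt of the line reported for run "a".
sample = (
    "Security Notes: gather_data_sampling: Mitigation of Microcode"
    " + itlb_multihit: Not affected"
    " + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional"
    " RSB filling PBRSB-eIBRS: SW sequence"
    " + srbds: Not affected"
)
notes = parse_security_notes(sample)
print(mitigated(notes))
```

Partitioning on the first `": "` keeps multi-part statuses such as the `spectre_v2` entry intact.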

b

c

Processor: Intel Core i7-1185G7 @ 4.80GHz (4 Cores / 8 Threads), Motherboard: Dell XPS 13 9310 0DXP1F (3.7.0 BIOS), Chipset: Intel Tiger Lake-LP, Memory: 8 x 2GB LPDDR4-4267MT/s, Disk: Micron 2300 NVMe 512GB, Graphics: Intel Xe TGL GT2 15GB (1350MHz), Audio: Realtek ALC289, Network: Intel Wi-Fi 6 AX201

OS: Ubuntu 23.10, Kernel: 6.7.0-060700rc5-generic (x86_64), Desktop: GNOME Shell 45.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0~git2312220600.68c53e~oibaf~m (git-68c53ec 2023-12-22 mantic-oibaf-ppa), OpenCL: OpenCL 3.0, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200

Libplacebo

Libplacebo is a multimedia rendering library based on the core rendering code of the MPV player. The libplacebo benchmark relies on the Vulkan API and tests various primitives. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder supporting modern SIMD CPU features. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.
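As a rough illustration of what an "FFT + iFFT C2C" (complex-to-complex) round trip computes, here is a minimal pure-Python sketch of a radix-2 transform. It is not VkFFT's API or its Vulkan implementation; the names `fft`, `ifft`, and `round_trip_error` are hypothetical, and the code assumes power-of-two input lengths.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey transform of a complex sequence.

    The length must be a power of two. The inverse pass is left unscaled
    here; ifft() applies the 1/n factor.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    """Inverse transform: conjugate twiddles plus 1/n scaling."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]


def round_trip_error(signal):
    """Largest absolute difference after a forward + inverse pass."""
    restored = ifft(fft(signal))
    return max(abs(a - b) for a, b in zip(signal, restored))


signal = [complex(i, -i) for i in range(8)]
print(round_trip_error(signal) < 1e-9)
```

A forward transform followed by the inverse should return the input up to floating-point rounding, which is why the round-trip error is compared against a small tolerance rather than zero.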

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

LZ4 Compression

This test measures the time needed to compress/decompress a sample file (silesia archive) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.

CacheBench

This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test memory and cache bandwidth performance. Learn more via the OpenBenchmarking.org test page.

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llamafile

Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

a: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

c: The test quit with a non-zero exit status.

Test: FFT + iFFT C2C 1D batched in double precision

a: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

c: The test quit with a non-zero exit status.

57 Results Shown

Libplacebo
dav1d
ONNX Runtime:
  bertsquad-12 - CPU - Parallel
  ArcFace ResNet-100 - CPU - Parallel
  fcn-resnet101-11 - CPU - Parallel
  yolov4 - CPU - Parallel
  Faster R-CNN R-50-FPN-int8 - CPU - Parallel
  fcn-resnet101-11 - CPU - Standard
dav1d
ONNX Runtime:
  ArcFace ResNet-100 - CPU - Standard
  super-resolution-10 - CPU - Standard
  bertsquad-12 - CPU - Standard
  GPT-2 - CPU - Parallel
  yolov4 - CPU - Standard
  CaffeNet 12-int8 - CPU - Standard
  Faster R-CNN R-50-FPN-int8 - CPU - Standard
  super-resolution-10 - CPU - Parallel
  GPT-2 - CPU - Standard
NAMD
VkFFT
ONNX Runtime
VkFFT
GROMACS
VkFFT
ONNX Runtime
dav1d
LZ4 Compression:
  3 - Compression Speed
  9 - Compression Speed
CacheBench
ONNX Runtime
Llama.cpp
Llamafile
ONNX Runtime
LZ4 Compression
ONNX Runtime
LZ4 Compression
Libplacebo
Llamafile
VkFFT
LZ4 Compression
dav1d
LZ4 Compression
CacheBench
Libplacebo:
  deband_heavy
  hdr_lut
NAMD
Libplacebo
VkFFT:
  FFT + iFFT C2C 1D batched in single precision, no reshuffling
  FFT + iFFT C2C 1D batched in single precision
Libplacebo
CacheBench
Llamafile
Llama.cpp:
  llama-2-70b-chat.Q5_0.gguf
  llama-2-13b.Q4_0.gguf
Intel Open Image Denoise:
  RTLightmap.hdr.4096x4096 - CPU-Only
  RT.ldr_alb_nrm.3840x2160 - CPU-Only
  RT.hdr_alb_nrm.3840x2160 - CPU-Only

a

Testing initiated at 20 February 2024 02:51 by user phoronix.

b

Testing initiated at 20 February 2024 10:32 by user phoronix.

c

Testing initiated at 20 February 2024 17:43 by user phoronix.
