kdlkf

AMD EPYC 8534P 64-Core testing with an AMD Cinnabar (RCB1009C BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

a

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212
Python Notes: Python 3.11.5
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

b

c

Processor: AMD EPYC 8534P 64-Core @ 2.30GHz (64 Cores / 128 Threads), Motherboard: AMD Cinnabar (RCB1009C BIOS), Chipset: AMD Device 14a4, Memory: 6 x 32GB DRAM-4800MT/s Samsung M321R4GA0BB0-CQKMG, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe

OS: Ubuntu 23.10, Kernel: 6.5.0-15-generic (x86_64), Desktop: GNOME Shell, Display Server: X Server 1.21.1.7, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200

JPEG-XL libjxl

The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.

JPEG-XL Decoding libjxl

The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to a PNG output file; the pts/jpegxl test is for encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase. Learn more via the OpenBenchmarking.org test page.

srsRAN Project

SVT-AV1

This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; this test encodes a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess engine that can scale up to 1024 CPU threads. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Parallel BZIP2 Compression

This test measures the time needed to compress a file (FreeBSD-13.0-RELEASE-amd64-memstick.img) using Parallel BZIP2 compression. Learn more via the OpenBenchmarking.org test page.
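
The block-parallel idea behind this kind of compression can be sketched in a few lines. This is only an illustration under stated assumptions, not the benchmarked pbzip2 code: the function name parallel_bzip2, the 900,000-byte block size, and the worker count are hypothetical choices. Each block is compressed independently as its own bzip2 stream and the streams are concatenated, which standard bzip2 decompressors can read back as one file.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bzip2(data: bytes, block_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress `data` block by block in parallel (illustrative sketch).

    Every block becomes an independent bzip2 stream; the result is the
    concatenation of those streams in their original order.
    """
    # Split the input into fixed-size blocks (the last one may be shorter).
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Threads are used here for simplicity; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        streams = list(pool.map(bz2.compress, blocks))
    return b"".join(streams)


if __name__ == "__main__":
    payload = b"Parallel BZIP2 example data\n" * 200_000
    packed = parallel_bzip2(payload)
    # bz2.decompress accepts multiple concatenated streams.
    assert bz2.decompress(packed) == payload
    print(len(payload), "->", len(packed), "bytes")
```

Compressing blocks independently costs a little compression ratio compared with a single stream, but it lets the work spread across many cores, which is what makes the test interesting on a 64-core processor.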

Primesieve

Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve primarily benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
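
To make the cache connection concrete, here is a minimal segmented sieve of Eratosthenes. It is a sketch under stated assumptions and not primesieve's actual implementation, which is far more optimized: the function name segmented_sieve and the 32,768-entry segment size are illustrative choices. The point is that the sieve works through the range in fixed-size segments, so the working buffer can stay small enough to fit in a CPU cache.

```python
import math


def segmented_sieve(limit: int, segment_size: int = 32_768) -> list[int]:
    """Return all primes <= limit using a segmented sieve of Eratosthenes."""
    if limit < 2:
        return []
    root = math.isqrt(limit)

    # Step 1: plain sieve to find the base primes up to sqrt(limit).
    is_prime = bytearray([1]) * (root + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(root) + 1):
        if is_prime[p]:
            is_prime[p * p:root + 1:p] = bytes(len(range(p * p, root + 1, p)))
    base_primes = [p for p in range(2, root + 1) if is_prime[p]]

    # Step 2: sieve the rest of the range one small segment at a time.
    primes = list(base_primes)
    low = root + 1
    while low <= limit:
        high = min(low + segment_size - 1, limit)
        segment = bytearray([1]) * (high - low + 1)
        for p in base_primes:
            # First multiple of p inside the segment, but never below p*p.
            start = max(p * p, ((low + p - 1) // p) * p)
            if start > high:
                continue
            segment[start - low::p] = bytes(len(range(start, high + 1, p)))
        primes.extend(low + i for i, flag in enumerate(segment) if flag)
        low = high + 1
    return primes


if __name__ == "__main__":
    print(segmented_sieve(30))
```

In this sketch one byte is stored per number, so the default segment occupies 32 KiB; real implementations pack the sieve much more tightly.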

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Google Draco

Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.

OpenVINO

This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

WavPack Audio Encoding

This test times how long it takes to encode a sample WAV file to WavPack format with very high quality settings. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

119 Results Shown

JPEG-XL libjxl:
  PNG - 80
  PNG - 90
  JPEG - 80
  JPEG - 90
  PNG - 100
  JPEG - 100
JPEG-XL Decoding libjxl:
  1
  All
srsRAN Project:
  PDSCH Processor Benchmark, Throughput Total
  PDSCH Processor Benchmark, Throughput Thread
SVT-AV1:
  Preset 4 - Bosphorus 4K
  Preset 8 - Bosphorus 4K
  Preset 12 - Bosphorus 4K
  Preset 13 - Bosphorus 4K
  Preset 4 - Bosphorus 1080p
  Preset 8 - Bosphorus 1080p
  Preset 12 - Bosphorus 1080p
  Preset 13 - Bosphorus 1080p
Stockfish
Timed Linux Kernel Compilation:
  defconfig
  allmodconfig
Parallel BZIP2 Compression
Primesieve:
  1e12
  1e13
oneDNN:
  IP Shapes 1D - CPU
  IP Shapes 3D - CPU
  Convolution Batch Shapes Auto - CPU
  Deconvolution Batch shapes_1d - CPU
  Deconvolution Batch shapes_3d - CPU
  Recurrent Neural Network Training - CPU
  Recurrent Neural Network Inference - CPU
Neural Magic DeepSparse:
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
    items/sec
    ms/batch
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream:
    items/sec
    ms/batch
  ResNet-50, Baseline - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  ResNet-50, Baseline - Synchronous Single-Stream:
    items/sec
    ms/batch
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  ResNet-50, Sparse INT8 - Synchronous Single-Stream:
    items/sec
    ms/batch
  Llama2 Chat 7b Quantized - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  Llama2 Chat 7b Quantized - Synchronous Single-Stream:
    items/sec
    ms/batch
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
    items/sec
    ms/batch
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream:
    items/sec
    ms/batch
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
    items/sec
    ms/batch
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
    items/sec
    ms/batch
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream:
    items/sec
    ms/batch
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
    items/sec
    ms/batch
Google Draco:
  Lion
  Church Facade
OpenVINO:
  Face Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP32 - CPU:
    FPS
    ms
  Vehicle Detection FP16 - CPU:
    FPS
    ms
  Face Detection FP16-INT8 - CPU:
    FPS
    ms
  Face Detection Retail FP16 - CPU:
    FPS
    ms
  Road Segmentation ADAS FP16 - CPU:
    FPS
    ms
  Vehicle Detection FP16-INT8 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16 - CPU:
    FPS
    ms
  Face Detection Retail FP16-INT8 - CPU:
    FPS
    ms
  Road Segmentation ADAS FP16-INT8 - CPU:
    FPS
    ms
  Machine Translation EN To DE FP16 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16-INT8 - CPU:
    FPS
    ms
  Person Vehicle Bike Detection FP16 - CPU:
    FPS
    ms
  Noise Suppression Poconet-Like FP16 - CPU:
    FPS
    ms
  Handwritten English Recognition FP16 - CPU:
    FPS
    ms
  Person Re-Identification Retail FP16 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16 - CPU:
    FPS
    ms
  Handwritten English Recognition FP16-INT8 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    FPS
    ms
WavPack Audio Encoding
Chaos Group V-RAY

a

Testing initiated at 16 March 2024 21:35 by user phoronix.

b

Testing initiated at 16 March 2024 23:00 by user phoronix.

c

Testing initiated at 17 March 2024 00:16 by user phoronix.