M6i.8xlarge Benchmarks [2407013-NE-M6I8XLARG97]

84 Results Shown

Whisper.cpp:
ggml-medium.en - 2016 State of the Union
ggml-small.en - 2016 State of the Union
ONNX Runtime:
GPT-2 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
fcn-resnet101-11 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
Faster R-CNN R-50-FPN-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
Whisper.cpp
OpenVINO:
Face Detection FP16 - CPU:
ms
FPS
Face Detection FP16-INT8 - CPU:
ms
FPS
ONNX Runtime:
Faster R-CNN R-50-FPN-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
GPT-2 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
fcn-resnet101-11 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Person Detection FP16 - CPU:
ms
FPS
Person Detection FP32 - CPU:
ms
FPS
ONNX Runtime:
bertsquad-12 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
bertsquad-12 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
T5 Encoder - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Machine Translation EN To DE FP16 - CPU:
ms
FPS
ONNX Runtime:
T5 Encoder - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Road Segmentation ADAS FP16-INT8 - CPU:
ms
FPS
Noise Suppression Poconet-Like FP16 - CPU:
ms
FPS
Person Vehicle Bike Detection FP16 - CPU:
ms
FPS
Person Re-Identification Retail FP16 - CPU:
ms
FPS
Road Segmentation ADAS FP16 - CPU:
ms
FPS
Handwritten English Recognition FP16-INT8 - CPU:
ms
FPS
Handwritten English Recognition FP16 - CPU:
ms
FPS
Vehicle Detection FP16-INT8 - CPU:
ms
FPS
Face Detection Retail FP16-INT8 - CPU:
ms
FPS
Vehicle Detection FP16 - CPU:
ms
FPS
Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
ms
FPS
Weld Porosity Detection FP16 - CPU:
ms
FPS
Face Detection Retail FP16 - CPU:
ms
FPS
Weld Porosity Detection FP16-INT8 - CPU:
ms
FPS
ONNX Runtime:
CaffeNet 12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Age Gender Recognition Retail 0013 FP16 - CPU:
ms
FPS
ONNX Runtime:
CaffeNet 12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
Llama.cpp

m6i.8xlarge

Processor: Intel Xeon Platinum 8375C (16 Cores / 32 Threads), Motherboard: Amazon EC2 m6i.8xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 128 GB DDR4-3200MT/s, Disk: 537GB Amazon Elastic Block Store, Graphics: EFI VGA, Network: Amazon Elastic

OS: Ubuntu 22.04, Kernel: 6.5.0-1017-aws (x86_64), Vulkan: 1.3.255, Compiler: GCC 11.4.0, File-System: ext4, Screen Resolution: 800x600, System Layer: amazon

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0xd0003d1
Security Notes: gather_data_sampling: Unknown: Dependent on hypervisor status + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT Host state unknown + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

Testing initiated at 1 July 2024 09:30 by user root.
