M7g.8xlarge Benchmarks [2407019-NE-M7G8XLARG55]

146 Results Shown

Whisper.cpp
Neural Magic DeepSparse:
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
ms/batch
items/sec
Whisper.cpp
Neural Magic DeepSparse:
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
ms/batch
items/sec
OpenCV
Neural Magic DeepSparse:
ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
Whisper.cpp
OpenCV
Mlpack Benchmark
OpenCV
oneDNN
OpenCV
Mlpack Benchmark
oneDNN
Neural Magic DeepSparse:
Llama2 Chat 7b Quantized - Asynchronous Multi-Stream:
ms/batch
items/sec
OpenVINO:
Face Detection FP16-INT8 - CPU:
ms
FPS
Neural Magic DeepSparse:
ResNet-50, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
OpenVINO:
Face Detection FP16 - CPU:
ms
FPS
Neural Magic DeepSparse:
Llama2 Chat 7b Quantized - Synchronous Single-Stream:
ms/batch
items/sec
ONNX Runtime:
Faster R-CNN R-50-FPN-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
Faster R-CNN R-50-FPN-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Person Detection FP16 - CPU:
ms
FPS
Road Segmentation ADAS FP16-INT8 - CPU:
ms
FPS
Person Detection FP32 - CPU:
ms
FPS
Machine Translation EN To DE FP16 - CPU:
ms
FPS
ONNX Runtime:
fcn-resnet101-11 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
fcn-resnet101-11 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
GPT-2 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
GPT-2 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Noise Suppression Poconet-Like FP16 - CPU:
ms
FPS
ONNX Runtime:
yolov4 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
bertsquad-12 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
bertsquad-12 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Handwritten English Recognition FP16-INT8 - CPU:
ms
FPS
Vehicle Detection FP16-INT8 - CPU:
ms
FPS
ONNX Runtime:
T5 Encoder - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Person Vehicle Bike Detection FP16 - CPU:
ms
FPS
Road Segmentation ADAS FP16 - CPU:
ms
FPS
Handwritten English Recognition FP16 - CPU:
ms
FPS
ONNX Runtime:
T5 Encoder - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Weld Porosity Detection FP16-INT8 - CPU:
ms
FPS
Weld Porosity Detection FP16 - CPU:
ms
FPS
Person Re-Identification Retail FP16 - CPU:
ms
FPS
Face Detection Retail FP16-INT8 - CPU:
ms
FPS
Vehicle Detection FP16 - CPU:
ms
FPS
Face Detection Retail FP16 - CPU:
ms
FPS
ONNX Runtime:
CaffeNet 12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
ms
FPS
Age Gender Recognition Retail 0013 FP16 - CPU:
ms
FPS
ONNX Runtime:
CaffeNet 12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenCV
Neural Magic DeepSparse:
BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
ms/batch
items/sec
oneDNN
Neural Magic DeepSparse:
NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
ms/batch
items/sec
ResNet-50, Baseline - Synchronous Single-Stream:
ms/batch
items/sec
ResNet-50, Baseline - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
ms/batch
items/sec
Mlpack Benchmark
OpenCV
Llama.cpp
OpenCV
Mlpack Benchmark
oneDNN:
Deconvolution Batch shapes_1d - CPU
IP Shapes 1D - CPU
Convolution Batch Shapes Auto - CPU
Deconvolution Batch shapes_3d - CPU

m7g.8xlarge

Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 m7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 128GB, Disk: 537GB Amazon Elastic Block Store, Network: Amazon Elastic

OS: Ubuntu 22.04, Kernel: 6.5.0-1017-aws (aarch64), Vulkan: 1.3.255, Compiler: GCC 11.4.0, File-System: ext4, System Layer: amazon

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
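The Security Notes line packs one `name: status` pair per CPU vulnerability, joined by ` + `. A minimal sketch of splitting it into a lookup table follows, assuming that separator convention holds for the whole line. The function name `parse_security_notes` and the shortened sample string are illustrative and not part of the benchmark tooling.

```python
def parse_security_notes(notes: str) -> dict[str, str]:
    """Split a 'name: status + name: status' string into a dict.

    Assumes entries are separated by ' + ' and that the first ': '
    in each entry separates the vulnerability name from its status.
    """
    result = {}
    for entry in notes.split(" + "):
        name, _, status = entry.partition(": ")
        result[name.strip()] = status.strip()
    return result


# Shortened sample copied from the Security Notes line above.
sample = (
    "meltdown: Not affected + "
    "spec_store_bypass: Mitigation of SSB disabled via prctl + "
    "spectre_v1: Mitigation of __user pointer sanitization + "
    "spectre_v2: Mitigation of CSV2 BHB"
)

parsed = parse_security_notes(sample)

# Entries whose status reports an active mitigation rather than "Not affected".
mitigated = [name for name, status in parsed.items() if status.startswith("Mitigation")]
print(mitigated)
```

Keeping the statuses as plain strings avoids guessing at a fixed vocabulary; the kernel's wording varies between versions and architectures.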

Testing initiated at 1 July 2024 09:32 by user root.
