Amazon EC2 m7g.8xlarge testing on Ubuntu 22.04 via the Phoronix Test Suite.
m7g.8xlarge:
Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 m7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 128GB, Disk: 537GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 6.5.0-1017-aws (aarch64), Vulkan: 1.3.255, Compiler: GCC 11.4.0, File-System: ext4, System Layer: amazon
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Whisper.cpp

Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
m7g.8xlarge: 439.61 (SE +/- 5.49, N = 9)

Whisper.cpp 1.6.2 - Model: ggml-base.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
m7g.8xlarge: 81.61 (SE +/- 0.69, N = 15)

(CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -mcpu=native
OpenCV
This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.
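As a rough illustration of what these per-test wall-clock numbers represent (this is a minimal sketch, not the suite's actual opencv_perf_* harness), the snippet below times individual OpenCV operations on a synthetic frame and reports milliseconds per run:

    import time

    import cv2
    import numpy as np

    # Synthetic 1080p BGR frame stands in for the suite's test images.
    frame = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    def time_ms(fn, iters=20):
        """Return the mean wall-clock time of fn() in milliseconds."""
        fn()  # warm-up run so one-time allocations are excluded
        start = time.perf_counter()
        for _ in range(iters):
            fn()
        return (time.perf_counter() - start) / iters * 1e3

    print("GaussianBlur: %.2f ms" % time_ms(lambda: cv2.GaussianBlur(frame, (15, 15), 0)))
    print("Canny:        %.2f ms" % time_ms(lambda: cv2.Canny(gray, 50, 150)))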
OpenCV 4.7 - Test: Stitching (ms, Fewer Is Better)
m7g.8xlarge: 284480 (SE +/- 284.51, N = 3)

(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Whisper.cpp

Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
m7g.8xlarge: 179.73 (SE +/- 2.02, N = 4)
OpenCV
OpenCV 4.7 - Test: Graph API
m7g.8xlarge: The test quit with a non-zero exit status. E: AbsExact error: G-API output and reference output matrixes are not bitexact equal.

OpenCV 4.7 - Test: DNN - Deep Neural Network (ms, Fewer Is Better)
m7g.8xlarge: 23459 (SE +/- 427.59, N = 15)

OpenCV 4.7 - Test: Image Processing (ms, Fewer Is Better)
m7g.8xlarge: 105292 (SE +/- 282.17, N = 3)

OpenCV 4.7 - Test: Core (ms, Fewer Is Better)
m7g.8xlarge: 96964 (SE +/- 663.91, N = 3)
OpenVINO
This is a test of Intel's OpenVINO, a toolkit around neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.
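Each model is reported twice below: average per-request latency in ms (fewer is better) and aggregate throughput in FPS (more is better). The two views are linked by the number of inference requests in flight; for Face Detection FP16-INT8, 2.58 FPS x 3.04 s of latency works out to roughly eight concurrent requests, assuming the benchmark's default multi-stream configuration. As a minimal sketch (not the suite's actual benchmark_app harness; "model.xml" is a hypothetical IR file standing in for any of the networks below), a Python loop measuring both views might look like:

    import time

    import numpy as np
    import openvino as ov

    core = ov.Core()
    compiled = core.compile_model("model.xml", "CPU")  # placeholder model path
    request = compiled.create_infer_request()

    # Random data shaped like the model's first input (assumes a static shape).
    shape = list(compiled.input(0).shape)
    data = np.random.rand(*shape).astype(np.float32)

    runs = 100
    latencies = []
    t0 = time.perf_counter()
    for _ in range(runs):
        t1 = time.perf_counter()
        request.infer({0: data})  # synchronous single-request inference
        latencies.append((time.perf_counter() - t1) * 1e3)
    elapsed = time.perf_counter() - t0

    # With one request at a time, FPS ~= 1000 / latency; benchmark_app
    # raises FPS further by keeping several requests in flight.
    print("mean latency: %.2f ms" % (sum(latencies) / len(latencies)))
    print("throughput:   %.2f FPS" % (runs / elapsed))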
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 3042.17 (SE +/- 1.31, N = 3; MIN: 2748.48 / MAX: 4882.42)

OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 2.58 (SE +/- 0.00, N = 3)

OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 2151.54 (SE +/- 2.66, N = 3; MIN: 1728.64 / MAX: 4274.63)

OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 3.68 (SE +/- 0.01, N = 3)

(CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
ONNX Runtime
ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
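The "Executor: Standard / Parallel" variants below presumably correspond to ONNX Runtime's sequential and parallel execution modes. A minimal sketch of a CPU inference session, with "model.onnx" as a placeholder for any of the Model Zoo networks tested here, might look like:

    import numpy as np
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # ORT_PARALLEL runs independent graph branches concurrently; the default,
    # ORT_SEQUENTIAL, matches the "Standard" executor in the results.
    opts.execution_mode = ort.ExecutionMode.ORT_PARALLEL

    sess = ort.InferenceSession("model.onnx", sess_options=opts,
                                providers=["CPUExecutionProvider"])

    inp = sess.get_inputs()[0]
    # Replace symbolic (dynamic) dimensions with 1 so random data can be built.
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    data = np.random.rand(*shape).astype(np.float32)  # assumes a float32 input

    outputs = sess.run(None, {inp.name: data})
    print([o.shape for o in outputs])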
ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 159.76 (SE +/- 1.31, N = 3)

ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 173.47 (SE +/- 0.24, N = 3)

(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenVINO
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 377.31 (SE +/- 0.20, N = 3; MIN: 235.41 / MAX: 510.12)

OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 21.17 (SE +/- 0.01, N = 3)

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 492.91 (SE +/- 0.48, N = 3; MIN: 489.58 / MAX: 526.43)

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 16.21 (SE +/- 0.01, N = 3)

OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 376.80 (SE +/- 0.30, N = 3; MIN: 207.53 / MAX: 510.75)

OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 21.20 (SE +/- 0.02, N = 3)

OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 170.48 (SE +/- 0.08, N = 3; MIN: 154.07 / MAX: 334.2)

OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 46.87 (SE +/- 0.02, N = 3)
ONNX Runtime
ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 879.88 (SE +/- 4.21, N = 3)

ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 704.10 (SE +/- 0.42, N = 3)

ONNX Runtime 1.17 - Model: GPT-2 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 4.61260 (SE +/- 0.01049, N = 3)

ONNX Runtime 1.17 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 6.91744 (SE +/- 0.00806, N = 3)
OpenVINO
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 113.16 (SE +/- 0.04, N = 3; MIN: 110.94 / MAX: 150.98)

OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 70.67 (SE +/- 0.02, N = 3)
ONNX Runtime
ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 217.82 (SE +/- 1.73, N = 3)

ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 54.23 (SE +/- 0.06, N = 3)

ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 111.42 (SE +/- 0.64, N = 3)

ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 112.96 (SE +/- 0.01, N = 3)
OpenVINO
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 201.65 (SE +/- 0.38, N = 3; MIN: 199.37 / MAX: 225.02)

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 39.62 (SE +/- 0.07, N = 3)

OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 154.86 (SE +/- 0.32, N = 3; MIN: 152.55 / MAX: 178.82)

OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 51.63 (SE +/- 0.11, N = 3)
ONNX Runtime
ONNX Runtime 1.17 - Model: T5 Encoder - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 2.83110 (SE +/- 0.01609, N = 3)
OpenVINO
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 25.39 (SE +/- 0.32, N = 3; MIN: 22.51 / MAX: 39.78)

OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 314.92 (SE +/- 3.87, N = 3)

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 70.31 (SE +/- 0.09, N = 3; MIN: 53.97 / MAX: 122.98)

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 113.70 (SE +/- 0.14, N = 3)

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 185.84 (SE +/- 0.75, N = 3; MIN: 183.08 / MAX: 211.71)

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 43.01 (SE +/- 0.18, N = 3)
ONNX Runtime
ONNX Runtime 1.17 - Model: T5 Encoder - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 4.35358 (SE +/- 0.01533, N = 3)

ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 88.16 (SE +/- 0.24, N = 3)

ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 55.09 (SE +/- 0.02, N = 3)
OpenVINO
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 24.07 (SE +/- 0.12, N = 3; MIN: 22.31 / MAX: 225.53)

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 332.14 (SE +/- 1.64, N = 3)

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 14.70 (SE +/- 0.02, N = 3; MIN: 11.59 / MAX: 168.15)

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 543.78 (SE +/- 0.86, N = 3)

OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 22.98 (SE +/- 0.07, N = 3; MIN: 16.5 / MAX: 41.02)

OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 347.81 (SE +/- 1.11, N = 3)

OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 48.17 (SE +/- 0.04, N = 3; MIN: 46.88 / MAX: 54.74)

OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 166.02 (SE +/- 0.13, N = 3)

OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 26.78 (SE +/- 0.04, N = 3; MIN: 23.25 / MAX: 51.98)

OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 298.31 (SE +/- 0.45, N = 3)

OpenVINO 2024.0 - Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 8.84 (SE +/- 0.00, N = 3; MIN: 7.91 / MAX: 16.48)

OpenVINO 2024.0 - Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 903.29 (SE +/- 0.28, N = 3)
ONNX Runtime
ONNX Runtime 1.17 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 1.07450 (SE +/- 0.00052, N = 3)
OpenVINO
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 2.05 (SE +/- 0.00, N = 3; MIN: 1.28 / MAX: 23.68)

OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 3886.78 (SE +/- 2.45, N = 3)

OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better)
m7g.8xlarge: 2.09 (SE +/- 0.00, N = 3; MIN: 1 / MAX: 26.97)

OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better)
m7g.8xlarge: 3808.13 (SE +/- 1.81, N = 3)
ONNX Runtime
ONNX Runtime 1.17 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 2.59272 (SE +/- 0.00506, N = 3)

ONNX Runtime 1.17 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 3.92809 (SE +/- 0.00884, N = 3)

ONNX Runtime 1.17 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 5.46672 (SE +/- 0.01039, N = 3)

ONNX Runtime 1.17 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 12.78 (SE +/- 0.01, N = 3)

ONNX Runtime 1.17 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better)
m7g.8xlarge: 12.64 (SE +/- 0.01, N = 3)
OpenCV
OpenCV 4.7 - Test: Features 2D (ms, Fewer Is Better)
m7g.8xlarge: 54743 (SE +/- 267.56, N = 3)

OpenCV 4.7 - Test: Object Detection (ms, Fewer Is Better)
m7g.8xlarge: 27261 (SE +/- 75.10, N = 3)
Llama.cpp

Llama.cpp b3067 - Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf (Tokens Per Second, More Is Better)
m7g.8xlarge: 22.47 (SE +/- 0.14, N = 3)

(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
OpenCV
OpenCV 4.7 - Test: Video (ms, Fewer Is Better)
m7g.8xlarge: 22760 (SE +/- 92.56, N = 3)
Testing initiated at 1 July 2024 09:32 by user root.