Nvidia KVM testing on Ubuntu 24.04 via the Phoronix Test Suite.

ASPEED - 2 x Intel Xeon Gold 6226R:

  Processor: 2 x Intel Xeon Gold 6226R @ 3.90GHz (32 Cores / 64 Threads), Motherboard: (5.14 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 512GB, Disk: 2 x 8002GB INTEL SSDPE2KX080T8, Graphics: ASPEED 16GB, Audio: NVIDIA GA104 HD Audio, Monitor: 27B2G5, Network: 2 x Intel X722 for 1GbE + 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb

  OS: Ubuntu 24.04, Kernel: 6.8.0-38-generic (x86_64), Display Server: X Server, Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 12.4.131, Compiler: GCC 13.2.0 + CUDA 12.4, File-System: ext4, Screen Resolution: 1920x1080

5x A5000 kw-dl580-3-4 NVIDIA:

  Processor: 4 x Intel Xeon E7-4880 v2 (60 Cores / 120 Threads), Motherboard: QEMU Standard PC (Q35 + ICH9 2009) (edk2-20240813-1.fc40 BIOS), Chipset: Intel 82G33/G31/P35/P31 + ICH9, Memory: 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 4 GB RAM, Disk: 21GB VIRTUAL-DISK, Graphics: Red Hat QXL paravirtual graphic card 22GB, Audio: QEMU Generic, Network: 2 x Red Hat Virtio 1.0 device

  OS: Ubuntu 24.04, Kernel: 6.8.0-45-generic (x86_64), Display Server: X Server, Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 12.4.131, Compiler: GCC 13.2.0 + CUDA 12.0, File-System: ext4, Screen Resolution: 1024x768, System Layer: KVM

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
FPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 751.93

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
FPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 1898.70

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
FPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 2201.61

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
FPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 179.21

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 12.12

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 324.18

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 12.33

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 13.15

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 1998.58

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 283.4
5x A5000 kw-dl580-3-4 NVIDIA ....... 328.2

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 380.1
5x A5000 kw-dl580-3-4 NVIDIA ....... 584.4

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 376.4
5x A5000 kw-dl580-3-4 NVIDIA ....... 547.4

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 228.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 299.3

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 382.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 392.9

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 261.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 199.8

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 117.9
5x A5000 kw-dl580-3-4 NVIDIA ....... 46.1

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 183.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 67.8

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 175.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 56.6

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 108.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 51.6

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 196.0
5x A5000 kw-dl580-3-4 NVIDIA ....... 191.3

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 266
5x A5000 kw-dl580-3-4 NVIDIA ....... 309

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 345
5x A5000 kw-dl580-3-4 NVIDIA ....... 403

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 312
5x A5000 kw-dl580-3-4 NVIDIA ....... 291

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 359
5x A5000 kw-dl580-3-4 NVIDIA ....... 473

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 385
5x A5000 kw-dl580-3-4 NVIDIA ....... 534

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 383
5x A5000 kw-dl580-3-4 NVIDIA ....... 477

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 170
5x A5000 kw-dl580-3-4 NVIDIA ....... 164

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 317
5x A5000 kw-dl580-3-4 NVIDIA ....... 324

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 377.04
5x A5000 kw-dl580-3-4 NVIDIA ....... 582.46

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 309.39
5x A5000 kw-dl580-3-4 NVIDIA ....... 391.91

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 18670.28
5x A5000 kw-dl580-3-4 NVIDIA ....... 27753.40

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 211.90

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 1094.66

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 3630.55

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 21619.9

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 18602.65
5x A5000 kw-dl580-3-4 NVIDIA ....... 26836.38

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 365.83
5x A5000 kw-dl580-3-4 NVIDIA ....... 483.57

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 59.7
5x A5000 kw-dl580-3-4 NVIDIA ....... 57.0

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 59.4
5x A5000 kw-dl580-3-4 NVIDIA ....... 56.1

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 62.2
5x A5000 kw-dl580-3-4 NVIDIA ....... 58.3

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 58.3
5x A5000 kw-dl580-3-4 NVIDIA ....... 57.2

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 344
5x A5000 kw-dl580-3-4 NVIDIA ....... 440

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 348
5x A5000 kw-dl580-3-4 NVIDIA ....... 443

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 340
5x A5000 kw-dl580-3-4 NVIDIA ....... 442

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
5x A5000 kw-dl580-3-4 NVIDIA . 443

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 22.57

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer
GIOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 11601.09
5x A5000 kw-dl580-3-4 NVIDIA ....... 14663.50

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 9617.49
5x A5000 kw-dl580-3-4 NVIDIA ....... 13718.21

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 156178112500
5x A5000 kw-dl580-3-4 NVIDIA ....... 137067950000

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 91940033333
5x A5000 kw-dl580-3-4 NVIDIA ....... 75698633333

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 4224700
5x A5000 kw-dl580-3-4 NVIDIA ....... 3519467

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 13308000000
5x A5000 kw-dl580-3-4 NVIDIA ....... 10923433333

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 3454200
5x A5000 kw-dl580-3-4 NVIDIA ....... 2853933

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 57.44
5x A5000 kw-dl580-3-4 NVIDIA ....... 48.60

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 35.82
5x A5000 kw-dl580-3-4 NVIDIA ....... 24.72

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 50.64
5x A5000 kw-dl580-3-4 NVIDIA ....... 34.46

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 26.70
5x A5000 kw-dl580-3-4 NVIDIA ....... 16.69

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 122.06
5x A5000 kw-dl580-3-4 NVIDIA ....... 76.39

LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 240.14
5x A5000 kw-dl580-3-4 NVIDIA ....... 185.28

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better

ArrayFire 3.9
Test: Conjugate Gradient OpenCL

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 10.460
5x A5000 kw-dl580-3-4 NVIDIA ....... 8.049

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 18.93
5x A5000 kw-dl580-3-4 NVIDIA ....... 77.05

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v2
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 8.37
5x A5000 kw-dl580-3-4 NVIDIA ....... 40.16

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v3
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 8.52
5x A5000 kw-dl580-3-4 NVIDIA ....... 40.31

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 9.72
5x A5000 kw-dl580-3-4 NVIDIA ....... 47.98

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 7.30
5x A5000 kw-dl580-3-4 NVIDIA ....... 39.58

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 11.02
5x A5000 kw-dl580-3-4 NVIDIA ....... 57.12

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 4.13
5x A5000 kw-dl580-3-4 NVIDIA ....... 21.06

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 18.15
5x A5000 kw-dl580-3-4 NVIDIA ....... 90.21

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 45.73
5x A5000 kw-dl580-3-4 NVIDIA ....... 140.86

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 10.92
5x A5000 kw-dl580-3-4 NVIDIA ....... 43.28

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 7.97
5x A5000 kw-dl580-3-4 NVIDIA ....... 28.16

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 21.90
5x A5000 kw-dl580-3-4 NVIDIA ....... 93.89

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 33.43
5x A5000 kw-dl580-3-4 NVIDIA ....... 106.30

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 20.32
5x A5000 kw-dl580-3-4 NVIDIA ....... 77.37

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 32.77
5x A5000 kw-dl580-3-4 NVIDIA ....... 227.14

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 58.46
5x A5000 kw-dl580-3-4 NVIDIA ....... 228.16

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 10.39
5x A5000 kw-dl580-3-4 NVIDIA ....... 50.17

NCNN 20230517
Target: Vulkan GPU - Model: mobilenetv2-yolov3
ms < Lower Is Better
5x A5000 kw-dl580-3-4 NVIDIA . 77.05

RedShift Demo 3.0
Seconds < Lower Is Better

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
ASPEED - 2 x Intel Xeon Gold 6226R . 7.105
5x A5000 kw-dl580-3-4 NVIDIA ....... 6.694