2 x AMD EPYC 7742 64-Core testing with a Supermicro H11DSi-NT v2.00 (2.1 BIOS) and ASPEED on Ubuntu 21.10 via the Phoronix Test Suite.
MGLRU Enabled
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Java Notes: OpenJDK Runtime Environment (build 11.0.12+7-Ubuntu-0ubuntu3)
Python Notes: Python 3.9.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
MGLRU Disabled
Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: Supermicro H11DSi-NT v2.00 (2.1 BIOS), Chipset: AMD Starship/Matisse, Memory: 128GB, Disk: 280GB INTEL SSDPE21D280GA, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel 10G X550T
OS: Ubuntu 21.10, Kernel: 5.16.0-rc8-mglru-pts (x86_64), Desktop: GNOME Shell 40.5, Display Server: X Server, Vulkan: 1.1.182, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Java Notes: OpenJDK Runtime Environment (build 11.0.13+8-Ubuntu-0ubuntu1.21.10)
Python Notes: Python 3.9.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
MGLRU Enabled vs. MGLRU Disabled Comparison (Phoronix Test Suite)
Relative difference per test, with the faster configuration in parentheses:
ONNX Runtime - super-resolution-10 - CPU: 13.3% (MGLRU Disabled)
ONNX Runtime - yolov4 - CPU: 8.5% (MGLRU Disabled)
Apache HTTP Server - 1000: 7.1% (MGLRU Enabled)
PlaidML - No - Inference - VGG16 - CPU: 6.9% (MGLRU Enabled)
Apache HTTP Server - 500: 5.8% (MGLRU Disabled)
Rodinia - OpenMP CFD Solver: 3.6% (MGLRU Enabled)
PostgreSQL pgbench - 100 - 250 - Read Only: 3.4% (MGLRU Disabled)
PostgreSQL pgbench - 100 - 250 - Read Only - Average Latency: 3.2% (MGLRU Disabled)
7-Zip Compression - Compression Rating: 3.1% (MGLRU Enabled)
PlaidML - No - Inference - VGG19 - CPU: 2.8% (MGLRU Enabled)
Rodinia - OpenMP Leukocyte: 2.1% (MGLRU Enabled)
PlaidML - No - Inference - ResNet 50 - CPU: 2% (MGLRU Enabled)
OSPray - San Miguel - Path Tracer: 2% (MGLRU Disabled)
MGLRU Kernel Tests
Test                                                MGLRU Enabled   MGLRU Disabled
compress-7zip: Compression Rating                   409975          397755
compress-7zip: Decompression Rating                 594185          594384
mt-dgemm: Sustained Floating-Point Rate             28.601698       28.826575
amg:                                                1249141667      1244605000
apache: 500                                         76184.73        80632.59
apache: 1000                                        94120.00        87890.86
embree: Pathtracer - Crown                          66.2049         66.2287
embree: Pathtracer ISPC - Crown                     59.8594         60.1011
java-gradle-perf: Reactor                           374.749         380.441
liquid-dsp: 128 - 256 - 57                          5100733333      5093333333
liquid-dsp: 256 - 256 - 57                          5511766667      5526200000
luxcorerender: DLSC - CPU                           10.36           10.28
luxcorerender: Danish Mood - CPU                    5.35            5.25
mocassin: Dust 2D tau100.0                          230             230
namd: ATPase Simulation - 327,506 Atoms             0.27158         0.27257
npb: EP.D                                           8589.03         8547.91
npb: MG.C                                           74668.52        74790.12
nginx: 500                                          89324.74        89985.51
nginx: 1000                                         91024.31        91601.65
nwchem: C240 Buckyball                              2161.2          2154.7
onnx: yolov4 - CPU                                  212             230
onnx: fcn-resnet101-11 - CPU                        180             177
onnx: shufflenet-v2-10 - CPU                        5553            5447
onnx: super-resolution-10 - CPU                     5923            6711
openvkl: vklBenchmark ISPC                          175             176
openvkl: vklBenchmark Scalar                        118             118
ospray: San Miguel - SciVis                         83.33           83.33
ospray: San Miguel - Path Tracer                    6.52            6.65
plaidml: No - Inference - VGG16 - CPU               28.09           26.28
plaidml: No - Inference - VGG19 - CPU               24.19           23.53
plaidml: No - Inference - ResNet 50 - CPU           4.49            4.40
pgbench: 100 - 250 - Read Only                      1935150         2000445
pgbench: 100 - 250 - Read Only - Average Latency    0.129           0.125
pgbench: 100 - 500 - Read Only                      1922285         1936663
pgbench: 100 - 500 - Read Only - Average Latency    0.260           0.258
qe: AUSURF112                                       330.62          330.96
rodinia: OpenMP LavaMD                              33.040          33.295
rodinia: OpenMP HotSpot3D                           104.601         105.137
rodinia: OpenMP Leukocyte                           46.110          47.077
rodinia: OpenMP CFD Solver                          8.969           9.288
rodinia: OpenMP Streamcluster                       9.729           9.803
stockfish: Total Time                               249652058       250125465
svt-av1: Preset 4 - Bosphorus 4K                    4.476           4.530
svt-av1: Preset 8 - Bosphorus 4K                    52.655          52.288
build-godot: Time To Compile                        57.651          58.112
build-linux-kernel: Time To Compile                 19.828          19.804
build-linux-kernel: defconfig                       20.453          20.618
build-linux-kernel: allmodconfig                    157.844         159.280
build-llvm: Ninja                                   110.353         110.285
build-llvm: Unix Makefiles                          196.919         196.729
build-mesa: Time To Compile                         21.144          21.130
incompact3d: X3D-benchmarking input.i3d             463.579814      463.230825
incompact3d: input.i3d 193 Cells Per Direction      13.3743223      13.3808743
xmrig: Monero - 1M                                  40029.9         40078.3
xmrig: Wownero - 1M                                 53546.0         53670.3
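The percentages in the comparison summary appear to match the ratio of the better result to the worse one in the results table. The sketch below reproduces three of them under that assumption; the function name `relative_gap_percent` is hypothetical and not part of the Phoronix Test Suite.

```python
def relative_gap_percent(a: float, b: float) -> float:
    """Return how far the larger of two positive results exceeds the smaller, in percent."""
    hi, lo = max(a, b), min(a, b)
    return (hi / lo - 1.0) * 100.0


# ONNX Runtime super-resolution-10 (inferences per minute): Enabled 5923, Disabled 6711
print(round(relative_gap_percent(5923, 6711), 1))          # 13.3
# Apache HTTP Server, 500 concurrent requests (requests per second)
print(round(relative_gap_percent(76184.73, 80632.59), 1))  # 5.8
# Rodinia OpenMP CFD Solver (seconds)
print(round(relative_gap_percent(8.969, 9.288), 1))        # 3.6
```

The ratio alone does not say which configuration is ahead; that depends on whether the metric is "More Is Better" or "Fewer Is Better".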
7-Zip Compression 21.06 - Test: Decompression Rating (MIPS, More Is Better)
  MGLRU Disabled: 594384 (SE +/- 7344.83, N = 3)
  MGLRU Enabled: 594185 (SE +/- 6316.43, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  MGLRU Disabled: 1244605000 (SE +/- 2472880.37, N = 3)
  MGLRU Enabled: 1249141667 (SE +/- 738959.93, N = 3)
  1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
Apache HTTP Server This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
Apache HTTP Server 2.4.48 - Concurrent Requests: 500 (Requests Per Second, More Is Better)
  MGLRU Disabled: 80632.59 (SE +/- 232.53, N = 3)
  MGLRU Enabled: 76184.73 (SE +/- 976.98, N = 3)
  1. (CC) gcc options: -shared -fPIC -O2
Apache HTTP Server 2.4.48 - Concurrent Requests: 1000 (Requests Per Second, More Is Better)
  MGLRU Disabled: 87890.86 (SE +/- 1001.52, N = 4)
  MGLRU Enabled: 94120.00 (SE +/- 908.68, N = 6)
  1. (CC) gcc options: -shared -fPIC -O2
Embree Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
Embree 3.13 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  MGLRU Disabled: 66.23 (SE +/- 0.24, N = 3; MIN: 61.28 / MAX: 74.94)
  MGLRU Enabled: 66.20 (SE +/- 0.15, N = 3; MIN: 61.78 / MAX: 73.44)
Embree 3.13 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
  MGLRU Disabled: 60.10 (SE +/- 0.18, N = 3; MIN: 56.51 / MAX: 68.5)
  MGLRU Enabled: 59.86 (SE +/- 0.42, N = 3; MIN: 55.99 / MAX: 67.12)
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  MGLRU Disabled: 5093333333 (SE +/- 7872808.34, N = 3)
  MGLRU Enabled: 5100733333 (SE +/- 11970148.05, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  MGLRU Disabled: 5526200000 (SE +/- 16977730.51, N = 3)
  MGLRU Enabled: 5511766667 (SE +/- 18720428.53, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
LuxCoreRender LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.
LuxCoreRender 2.6 - Scene: DLSC - Acceleration: CPU (M samples/sec, More Is Better)
  MGLRU Disabled: 10.28 (SE +/- 0.07, N = 3; MIN: 9.66 / MAX: 14.1)
  MGLRU Enabled: 10.36 (SE +/- 0.07, N = 3; MIN: 9.71 / MAX: 14.1)
LuxCoreRender 2.6 - Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better)
  MGLRU Disabled: 5.25 (SE +/- 0.08, N = 15; MIN: 1.73 / MAX: 7.07)
  MGLRU Enabled: 5.35 (SE +/- 0.09, N = 15; MIN: 1.85 / MAX: 7.13)
Monte Carlo Simulations of Ionised Nebulae Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.
Monte Carlo Simulations of Ionised Nebulae 2019-03-24 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better)
  MGLRU Disabled: 230 (SE +/- 0.00, N = 3)
  MGLRU Enabled: 230 (SE +/- 0.67, N = 3)
  1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
NAMD NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  MGLRU Disabled: 0.27257 (SE +/- 0.00229, N = 3)
  MGLRU Enabled: 0.27158 (SE +/- 0.00314, N = 3)
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
  MGLRU Disabled: 8547.91 (SE +/- 47.65, N = 3)
  MGLRU Enabled: 8589.03 (SE +/- 4.21, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.0
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
  MGLRU Disabled: 74790.12 (SE +/- 406.38, N = 3)
  MGLRU Enabled: 74668.52 (SE +/- 370.74, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.0
nginx This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
nginx 1.21.1 - Concurrent Requests: 500 (Requests Per Second, More Is Better)
  MGLRU Disabled: 89985.51 (SE +/- 202.70, N = 3)
  MGLRU Enabled: 89324.74 (SE +/- 134.98, N = 3)
  1. (CC) gcc options: -lcrypt -lz -O3 -march=native
nginx 1.21.1 - Concurrent Requests: 1000 (Requests Per Second, More Is Better)
  MGLRU Disabled: 91601.65 (SE +/- 270.06, N = 3)
  MGLRU Enabled: 91024.31 (SE +/- 265.34, N = 3)
  1. (CC) gcc options: -lcrypt -lz -O3 -march=native
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better)
  MGLRU Disabled: 2154.7
  MGLRU Enabled: 2161.2
  1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.10 - Model: yolov4 - Device: CPU (Inferences Per Minute, More Is Better)
  MGLRU Disabled: 230 (SE +/- 0.87, N = 3)
  MGLRU Enabled: 212 (SE +/- 1.95, N = 12)
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.10 - Model: fcn-resnet101-11 - Device: CPU (Inferences Per Minute, More Is Better)
  MGLRU Disabled: 177 (SE +/- 1.80, N = 3)
  MGLRU Enabled: 180 (SE +/- 4.29, N = 12)
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.10 - Model: shufflenet-v2-10 - Device: CPU (Inferences Per Minute, More Is Better)
  MGLRU Disabled: 5447 (SE +/- 71.82, N = 3)
  MGLRU Enabled: 5553 (SE +/- 54.18, N = 3)
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.10 - Model: super-resolution-10 - Device: CPU (Inferences Per Minute, More Is Better)
  MGLRU Disabled: 6711 (SE +/- 5.04, N = 3)
  MGLRU Enabled: 5923 (SE +/- 176.40, N = 12)
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
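Several ONNX Runtime results carry a large standard error (SE) on one side, so the raw gap can overstate the difference. One rough heuristic, which is not part of the Phoronix Test Suite output, is to divide the gap between the two means by the combined standard error. The sketch below assumes the SE values are listed in the same order as the results (MGLRU Disabled first); the function name `gap_in_standard_errors` is hypothetical.

```python
import math


def gap_in_standard_errors(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Absolute difference of two means divided by their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined_se == 0.0:
        # Both runs report zero spread: equal means give 0, anything else is unbounded.
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / combined_se


# (Disabled mean, Disabled SE, Enabled mean, Enabled SE), as reported for ONNX Runtime 1.10
results = {
    "super-resolution-10": (6711, 5.04, 5923, 176.40),
    "fcn-resnet101-11": (177, 1.80, 180, 4.29),
}
for name, (d_mean, d_se, e_mean, e_se) in results.items():
    ratio = gap_in_standard_errors(d_mean, d_se, e_mean, e_se)
    print(f"{name}: gap is {ratio:.1f} combined standard errors")
```

A gap of several combined standard errors (super-resolution-10) is harder to attribute to run-to-run noise than one below a single standard error (fcn-resnet101-11). With N between 3 and 12 runs this is only an indication, not a formal significance test.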
OpenVKL OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenVKL 1.0 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
  MGLRU Disabled: 176 (SE +/- 0.00, N = 3; MIN: 16 / MAX: 2455)
  MGLRU Enabled: 175 (SE +/- 0.67, N = 3; MIN: 14 / MAX: 2362)
OpenVKL 1.0 - Benchmark: vklBenchmark Scalar (Items / Sec, More Is Better)
  MGLRU Disabled: 118 (SE +/- 1.00, N = 3; MIN: 11 / MAX: 2528)
  MGLRU Enabled: 118 (SE +/- 0.33, N = 3; MIN: 11 / MAX: 2529)
OSPray Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OSPray 1.8.5 - Demo: San Miguel - Renderer: SciVis (FPS, More Is Better)
  MGLRU Disabled: 83.33 (SE +/- 0.00, N = 3; MIN: 47.62 / MAX: 100)
  MGLRU Enabled: 83.33 (SE +/- 0.00, N = 3; MIN: 35.71 / MAX: 100)
OSPray 1.8.5 - Demo: San Miguel - Renderer: Path Tracer (FPS, More Is Better)
  MGLRU Disabled: 6.65 (SE +/- 0.08, N = 3; MIN: 5.46 / MAX: 7.14)
  MGLRU Enabled: 6.52 (SE +/- 0.08, N = 3; MIN: 5.41 / MAX: 7.09)
PlaidML - FP16: No - Mode: Inference - Network: VGG19 - Device: CPU (FPS, More Is Better)
  MGLRU Disabled: 23.53 (SE +/- 0.23, N = 15)
  MGLRU Enabled: 24.19 (SE +/- 0.23, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU (FPS, More Is Better)
  MGLRU Disabled: 4.40 (SE +/- 0.02, N = 3)
  MGLRU Enabled: 4.49 (SE +/- 0.04, N = 3)
PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  MGLRU Disabled: 0.125 (SE +/- 0.001, N = 3)
  MGLRU Enabled: 0.129 (SE +/- 0.000, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 500 - Mode: Read Only (TPS, More Is Better)
  MGLRU Disabled: 1936663 (SE +/- 21901.77, N = 3)
  MGLRU Enabled: 1922285 (SE +/- 11345.70, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 500 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  MGLRU Disabled: 0.258 (SE +/- 0.003, N = 3)
  MGLRU Enabled: 0.260 (SE +/- 0.002, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
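The pgbench latency figures in this result file appear consistent with the client count divided by the transaction rate, so the latency and TPS rows are largely two views of the same measurement. The sketch below checks that relationship against the reported numbers; the helper name `avg_latency_ms` is hypothetical, and the formula is an observation about these values rather than a statement of how the test profile computes them.

```python
def avg_latency_ms(clients: int, tps: float) -> float:
    """Average per-transaction latency in milliseconds implied by a client count and a TPS rate."""
    return clients / tps * 1000.0


# (clients, reported TPS, reported average latency in ms) from the results above
reported = [
    (250, 2000445, 0.125),  # MGLRU Disabled
    (250, 1935150, 0.129),  # MGLRU Enabled
    (500, 1936663, 0.258),  # MGLRU Disabled
    (500, 1922285, 0.260),  # MGLRU Enabled
]
for clients, tps, latency in reported:
    implied = avg_latency_ms(clients, tps)
    print(f"{clients} clients at {tps} TPS: implied {implied:.3f} ms, reported {latency:.3f} ms")
```

For each of the four rows the implied latency rounds to the reported value at three decimal places.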
Quantum ESPRESSO Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials. Learn more via the OpenBenchmarking.org test page.
Quantum ESPRESSO 7.0 - Input: AUSURF112 (Seconds, Fewer Is Better)
  MGLRU Disabled: 330.96 (SE +/- 0.36, N = 3)
  MGLRU Enabled: 330.62 (SE +/- 0.20, N = 3)
  1. (F9X) gfortran options: -pthread -fopenmp -ldevXlib -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3_omp -lfftw3 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Rodinia Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
Rodinia 3.1 - Test: OpenMP LavaMD (Seconds, Fewer Is Better)
  MGLRU Disabled: 33.30 (SE +/- 0.17, N = 3)
  MGLRU Enabled: 33.04 (SE +/- 0.15, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP HotSpot3D (Seconds, Fewer Is Better)
  MGLRU Disabled: 105.14 (SE +/- 0.89, N = 3)
  MGLRU Enabled: 104.60 (SE +/- 0.54, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP Leukocyte (Seconds, Fewer Is Better)
  MGLRU Disabled: 47.08 (SE +/- 0.20, N = 3)
  MGLRU Enabled: 46.11 (SE +/- 0.32, N = 13)
  1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better)
  MGLRU Disabled: 9.288 (SE +/- 0.124, N = 12)
  MGLRU Enabled: 8.969 (SE +/- 0.083, N = 6)
  1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP Streamcluster (Seconds, Fewer Is Better)
  MGLRU Disabled: 9.803 (SE +/- 0.201, N = 14)
  MGLRU Enabled: 9.729 (SE +/- 0.134, N = 15)
  1. (CXX) g++ options: -O2 -lOpenCL
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.
Stockfish 13 - Total Time (Nodes Per Second, More Is Better)
  MGLRU Disabled: 250125465 (SE +/- 2532421.76, N = 6)
  MGLRU Enabled: 249652058 (SE +/- 2261974.30, N = 3)
  1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 0.8.7 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  MGLRU Disabled: 4.530 (SE +/- 0.005, N = 3)
  MGLRU Enabled: 4.476 (SE +/- 0.008, N = 3)
  1. (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
SVT-AV1 0.8.7 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  MGLRU Disabled: 52.29 (SE +/- 0.18, N = 3)
  MGLRU Enabled: 52.66 (SE +/- 0.18, N = 3)
  1. (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference, high-performance code for solving the incompressible Navier-Stokes equation together with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better)
  MGLRU Disabled: 463.23 (SE +/- 1.49, N = 3)
  MGLRU Enabled: 463.58 (SE +/- 0.82, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
  MGLRU Disabled: 13.38 (SE +/- 0.02, N = 3)
  MGLRU Enabled: 13.37 (SE +/- 0.03, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better)
  MGLRU Disabled: 40078.3 (SE +/- 67.91, N = 3)
  MGLRU Enabled: 40029.9 (SE +/- 178.74, N = 3)
  1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
  MGLRU Disabled: 53670.3 (SE +/- 176.64, N = 3)
  MGLRU Enabled: 53546.0 (SE +/- 192.97, N = 3)
  1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
MGLRU Enabled: Testing initiated at 9 January 2022 20:23 by user phoronix.
MGLRU Disabled: Testing initiated at 10 January 2022 13:46 by user phoronix.