Amazon AWS Graviton3E vs. Graviton 2/3 benchmarks

Benchmarks by Michael Larabel for a future article on Phoronix.com.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2405291-NE-2308110NE13
Run Management

Result Identifier            Date Run          Test Duration
m7g.16xlarge Graviton3       June 22 2023      6 Hours, 40 Minutes
c6g.16xlarge Graviton2       June 23 2023      8 Hours, 18 Minutes
c7g.16xlarge Graviton3       June 23 2023      6 Hours, 40 Minutes
c7gn.16xlarge Graviton3E     July 10 2023      6 Hours, 39 Minutes
c6a.16xlarge AMD Zen 3       August 11 2023    8 Hours, 48 Minutes
egeo-07                      May 28            17 Hours, 28 Minutes
Average test duration                          9 Hours, 6 Minutes



Amazon AWS Graviton3E vs. Graviton 2/3 benchmarks: system details (OpenBenchmarking.org)

Fields not repeated in the source table are omitted below.

m7g.16xlarge Graviton3
  Processor: ARMv8 Neoverse-V1 (64 Cores)
  Motherboard: Amazon EC2 m7g.16xlarge (1.0 BIOS)
  Chipset: Amazon Device 0200
  Memory: 256GB
  Disk: 215GB Amazon Elastic Block Store
  Network: Amazon Elastic
  OS: Ubuntu 22.04
  Kernel: 5.19.0-1025-aws (aarch64)
  Compiler: GCC 11.3.0
  File-System: ext4
  System Layer: amazon

c6g.16xlarge Graviton2
  Processor: ARMv8 Neoverse-N1 (64 Cores)
  Motherboard: Amazon EC2 c6g.16xlarge (1.0 BIOS)
  Memory: 128GB

c7g.16xlarge Graviton3
  Processor: ARMv8 Neoverse-V1 (64 Cores)
  Motherboard: Amazon EC2 c7g.16xlarge (1.0 BIOS)

c7gn.16xlarge Graviton3E
  Motherboard: Amazon EC2 c7gn.16xlarge (1.0 BIOS)

c6a.16xlarge AMD Zen 3
  Processor: AMD EPYC 7R13 (32 Cores / 64 Threads)
  Motherboard: Amazon EC2 c6a.16xlarge (1.0 BIOS)
  Chipset: Intel 440FX 82441FX PMC
  Disk: 322GB Amazon Elastic Block Store
  Kernel: 5.19.0-1025-aws (x86_64)
  Vulkan: 1.3.238
  Compiler: GCC 11.4.0

egeo-07
  Processor: 2 x Intel Xeon Silver 4208 @ 3.20GHz (16 Cores / 32 Threads)
  Motherboard: Dell Precision 7920 Rack 0DY2X0 (2.21.2 BIOS)
  Chipset: Intel Sky Lake-E DMI3 Registers
  Memory: 64GB
  Disk: 2000GB TOSHIBA DT01ACA2
  Graphics: Matrox G200eW3 15GB
  Audio: NVIDIA TU104 HD Audio
  Monitor: DELL 17FP
  Network: 4 x Intel I350
  OS: Debian 11
  Kernel: 5.10.0-28-amd64 (x86_64)
  Display Server: X Server
  Display Driver: NVIDIA
  OpenCL: OpenCL 3.0 CUDA 12.2.138
  Vulkan: 1.3.242
  Compiler: GCC 10.2.1 20210110 + Clang 11.0.1-2 + CUDA 11.2
  Screen Resolution: 1280x1024

Kernel Details
- m7g.16xlarge Graviton3: Transparent Huge Pages: madvise
- c6g.16xlarge Graviton2: Transparent Huge Pages: madvise
- c7g.16xlarge Graviton3: Transparent Huge Pages: madvise
- c7gn.16xlarge Graviton3E: Transparent Huge Pages: madvise
- c6a.16xlarge AMD Zen 3: Transparent Huge Pages: madvise
- egeo-07: Transparent Huge Pages: always

Compiler Details
- m7g.16xlarge Graviton3, c6g.16xlarge Graviton2, c7g.16xlarge Graviton3, c7gn.16xlarge Graviton3E (identical): --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- c6a.16xlarge AMD Zen 3: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- egeo-07: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Python Details
- m7g.16xlarge Graviton3: Python 3.10.6
- c6g.16xlarge Graviton2: Python 3.10.6
- c7g.16xlarge Graviton3: Python 3.10.6
- c7gn.16xlarge Graviton3E: Python 3.10.6
- c6a.16xlarge AMD Zen 3: Python 3.10.12
- egeo-07: Python 2.7.18 + Python 3.9.2

Security Details
- m7g.16xlarge Graviton3, c6g.16xlarge Graviton2, c7g.16xlarge Graviton3, c7gn.16xlarge Graviton3E (identical): itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
- c6a.16xlarge AMD Zen 3: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: conditional RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- egeo-07: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled

Processor Details
- c6a.16xlarge AMD Zen 3: CPU Microcode: 0xa0011cf
- egeo-07: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x5003605

Logarithmic Result Overview (chart, Phoronix Test Suite / OpenBenchmarking.org)
Systems: m7g.16xlarge Graviton3, c6g.16xlarge Graviton2, c7g.16xlarge Graviton3, c7gn.16xlarge Graviton3E, c6a.16xlarge AMD Zen 3, egeo-07
Tests shown: Stress-NG, Pennant, BRL-CAD, Xcompact3d Incompact3d, NWChem, Graph500, OpenSSL, LAMMPS Molecular Dynamics Simulator, LULESH, HeFFTe - Highly Efficient FFT for Exascale, Remhos, Laghos, GPAW, Stockfish, 7-Zip Compression, Coremark, GROMACS, Algebraic Multi-Grid Benchmark, nginx, Rodinia, Kripke, QMCPACK, srsRAN Project, Liquid-DSP, Timed Node.js Compilation, Timed Godot Game Engine Compilation, Monte Carlo Simulations of Ionised Nebulae, Timed Gem5 Compilation, NAS Parallel Benchmarks, ACES DGEMM

Amazon AWS Graviton3E vs. Graviton 2/3 benchmarks: result summary table (OpenBenchmarking.org)
Columns: m7g.16xlarge Graviton3, c6g.16xlarge Graviton2, c7g.16xlarge Graviton3, c7gn.16xlarge Graviton3E, c6a.16xlarge AMD Zen 3, egeo-07
Rows, in source order: nwchem: C240 Buckyball; graph500: 26 (four results); lammps: 20k Atoms; build-gem5: Time To Compile; brl-cad: VGR Performance Metric; stockfish: Total Time; lczero: BLAS; build-nodejs: Time To Compile; lczero: Eigen; qmcpack: FeCO6_b3lyp_gms (two results); build-godot: Time To Compile; mocassin: Dust 2D tau100.0; openssl: SHA256; openssl: AES-128-GCM; openssl: ChaCha20; openssl: ChaCha20-Poly1305; openssl: AES-256-GCM; openssl: SHA512; qmcpack: Li2_STO_ae; stress-ng: CPU Cache; laghos: Sedov Blast Wave, ube_922_hex.mesh; nekrs: TurboPipe Periodic; mt-dgemm: Sustained Floating-Point Rate; npb: EP.D; gpaw: Carbon Nanotube; nekrs: Kershaw; npb: SP.C; nginx: 1000; nginx: 500; stress-ng: Wide Vector Math; laghos: Triple Point Problem; rodinia: OpenMP LavaMD; heffte: c2c - FFTW - double - 512; gromacs: MPI CPU - water_GMX50_bare; npb: LU.C; openssl: RSA4096 (two results); coremark: CoreMark Size 666 - Iterations Per Second; rodinia: OpenMP Streamcluster; stress-ng: NUMA; qmcpack: simple-H2O; srsran: PUSCH Processor Benchmark, Throughput Total; stress-ng: Vector Floating Point; srsran: Downlink Processor Benchmark; srsran: PUSCH Processor Benchmark, Throughput Thread; heffte: r2c - FFTW - double - 512; heffte: c2c - FFTW - float - 512; kripke; compress-7zip: Decompression Rating; compress-7zip: Compression Rating; incompact3d: input.i3d 193 Cells Per Direction; liquid-dsp: 64 - 256 - 512; liquid-dsp: 32 - 256 - 512; stress-ng: Fused Multiply-Add; stress-ng: Vector Shuffle; liquid-dsp: 64 - 256 - 32; liquid-dsp: 64 - 256 - 57; stress-ng: Matrix 3D Math; stress-ng: Matrix Math; stress-ng: Memory Copying; liquid-dsp: 32 - 256 - 32; liquid-dsp: 32 - 256 - 57; stress-ng: Vector Math; remhos: Sample Remap Example; pennant: sedovbig; amg; heffte: r2c - FFTW - float - 512; mocassin: Gas HII40; lulesh; pennant: leblancbig; incompact3d: input.i3d 129 Cells Per Direction; heffte: c2c - FFTW - double - 256; npb: CG.C; rodinia: OpenMP CFD Solver; heffte: c2c - FFTW - float - 256; heffte: r2c - FFTW - double - 256; npb: MG.C; heffte: r2c - FFTW - float - 256; lammps: Rhodopsin Protein; heffte: c2c - FFTW - double - 128; heffte: r2c - FFTW - float - 128; heffte: c2c - FFTW - float - 128; heffte: r2c - FFTW - double - 128
(The per-system values in the source table are fused together without delimiters and are not reproduced here.)

NWChem

NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.

NWChem 7.0.2 - Input: C240 Buckyball
Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 3440.4
  c6g.16xlarge Graviton2: 2976.9
  c7g.16xlarge Graviton3: 1962.7
  c7gn.16xlarge Graviton3E: 1914.0
  egeo-07: 10984.0
  m7g.16xlarge Graviton3: 1940.2
(F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
Additional flags listed for individual systems: -m64; -ldl -lutil -m64
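
As a reading aid, the NWChem times above can be converted into relative performance. The sketch below assumes the six values map to the systems in the order the graph lists them. The baseline (c6g.16xlarge Graviton2) is an arbitrary choice made here, not one made by the source, and the helper name `relative_speed` is hypothetical.

```python
# NWChem 7.0.2, Input: C240 Buckyball, wall time in seconds (fewer is better).
# Values are taken from the result graph above.
nwchem_seconds = {
    "c6a.16xlarge AMD Zen 3": 3440.4,
    "c6g.16xlarge Graviton2": 2976.9,
    "c7g.16xlarge Graviton3": 1962.7,
    "c7gn.16xlarge Graviton3E": 1914.0,
    "egeo-07": 10984.0,
    "m7g.16xlarge Graviton3": 1940.2,
}


def relative_speed(times, baseline):
    """Return baseline_time / time per system; values above 1 beat the baseline."""
    base = times[baseline]
    return {name: base / t for name, t in times.items()}


# Baseline is an assumption for illustration only.
speedups = relative_speed(nwchem_seconds, "c6g.16xlarge Graviton2")
for name, s in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:26s} {s:.2f}x")
```

Because this is a "fewer is better" metric, the ratio is baseline time divided by system time; for "more is better" metrics the ratio would be inverted.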

Graph500

This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0 - Scale: 26
sssp max_TEPS, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 204550000
  c6g.16xlarge Graviton2: 284689000
  c7g.16xlarge Graviton3: 415758000
  c7gn.16xlarge Graviton3E: 411762000
  egeo-07: 85755800
  m7g.16xlarge Graviton3: 419754000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Additional flag listed for an individual system: -pthread

Graph500 3.0 - Scale: 26
sssp median_TEPS, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 157688000
  c6g.16xlarge Graviton2: 209350000
  c7g.16xlarge Graviton3: 293826000
  c7gn.16xlarge Graviton3E: 296164000
  egeo-07: 69070600
  m7g.16xlarge Graviton3: 299497000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Additional flag listed for an individual system: -pthread

Graph500 3.0 - Scale: 26
bfs max_TEPS, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 417777000
  c6g.16xlarge Graviton2: 874389000
  c7g.16xlarge Graviton3: 1206990000
  c7gn.16xlarge Graviton3E: 1207760000
  egeo-07: 210572000
  m7g.16xlarge Graviton3: 1227790000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0 - Scale: 26
bfs median_TEPS, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 410571000
  c6g.16xlarge Graviton2: 860432000
  c7g.16xlarge Graviton3: 1177710000
  c7gn.16xlarge Graviton3E: 1175640000
  egeo-07: 208288000
  m7g.16xlarge Graviton3: 1194320000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
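
The four Graph500 metrics can be summarized with a geometric mean of per-metric ratios. The sketch below compares c7gn.16xlarge Graviton3E against c6g.16xlarge Graviton2 using the values from the graphs above. Comparing this particular pair and the short keys `c6g` and `c7gn` are assumptions made for illustration.

```python
import math

# Graph500 3.0, Scale 26, TEPS (more is better), values from the graphs above.
graph500_teps = {
    "sssp max_TEPS": {"c6g": 284689000, "c7gn": 411762000},
    "sssp median_TEPS": {"c6g": 209350000, "c7gn": 296164000},
    "bfs max_TEPS": {"c6g": 874389000, "c7gn": 1207760000},
    "bfs median_TEPS": {"c6g": 860432000, "c7gn": 1175640000},
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Higher TEPS is better, so the ratio is new / old.
ratios = {metric: r["c7gn"] / r["c6g"] for metric, r in graph500_teps.items()}
gmean_ratio = geometric_mean(ratios.values())

for metric, ratio in ratios.items():
    print(f"{metric:18s} {ratio:.3f}x")
print(f"{'geometric mean':18s} {gmean_ratio:.3f}x")
```

A geometric mean is used rather than an arithmetic mean so that no single metric dominates the summary of ratios.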

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms
ns/day, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 20.342 (SE +/- 0.066, N = 3)
  c6g.16xlarge Graviton2: 25.171 (SE +/- 0.009, N = 3)
  c7g.16xlarge Graviton3: 36.862 (SE +/- 0.025, N = 3)
  c7gn.16xlarge Graviton3E: 36.838 (SE +/- 0.018, N = 3)
  egeo-07: 7.539 (SE +/- 0.006, N = 3)
  m7g.16xlarge Graviton3: 36.927 (SE +/- 0.034, N = 3)
(CXX) g++ options: -O3 -ldl
Additional flags listed for individual systems: -lm; -pthread -lm
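
The "SE +/-" figures are easier to compare across systems when expressed as a percentage of the reported result. The sketch below does this for the LAMMPS 20k Atoms run, assuming that "SE" denotes the standard error over the N runs and that the SE entries appear in the same order as the systems. The helper name `relative_se_percent` is hypothetical.

```python
# LAMMPS 23Jun2022, Model: 20k Atoms: (reported ns/day, reported SE), N = 3 each.
lammps_20k = {
    "c6a.16xlarge AMD Zen 3": (20.342, 0.066),
    "c6g.16xlarge Graviton2": (25.171, 0.009),
    "c7g.16xlarge Graviton3": (36.862, 0.025),
    "c7gn.16xlarge Graviton3E": (36.838, 0.018),
    "egeo-07": (7.539, 0.006),
    "m7g.16xlarge Graviton3": (36.927, 0.034),
}


def relative_se_percent(mean, se):
    """Express a standard error as a percentage of the reported mean."""
    return 100.0 * se / mean


rel_se = {name: relative_se_percent(m, s) for name, (m, s) in lammps_20k.items()}
for name, pct in rel_se.items():
    print(f"{name:26s} {pct:.2f}% of mean")
```

A small relative SE suggests the runs were repeatable; it says nothing about whether the comparison between systems is otherwise fair.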

Timed Gem5 Compilation

This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.

Timed Gem5 Compilation 21.2 - Time To Compile
Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 192.12 (SE +/- 0.26, N = 3)
  c6g.16xlarge Graviton2: 225.31 (SE +/- 0.35, N = 3)
  c7g.16xlarge Graviton3: 181.78 (SE +/- 0.26, N = 3)
  c7gn.16xlarge Graviton3E: 182.47 (SE +/- 0.38, N = 3)
  egeo-07: 462.60 (SE +/- 9.98, N = 9)
  m7g.16xlarge Graviton3: 180.25 (SE +/- 0.13, N = 3)

BRL-CAD

BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.

BRL-CAD 7.34 - VGR Performance Metric
VGR Performance Metric, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 485038
  c6g.16xlarge Graviton2: 533020
  c7g.16xlarge Graviton3: 789066
  c7gn.16xlarge Graviton3E: 744743
  egeo-07: 102684
  m7g.16xlarge Graviton3: 783777
(CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
Additional flag listed for an individual system: -m64

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.

Stockfish 15 - Total Time
Nodes Per Second, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 96905609 (SE +/- 1430593.84, N = 15)
  c6g.16xlarge Graviton2: 86609284 (SE +/- 2597495.37, N = 15)
  c7g.16xlarge Graviton3: 117316476 (SE +/- 2998209.87, N = 12)
  c7gn.16xlarge Graviton3E: 117027121 (SE +/- 1531345.46, N = 15)
  egeo-07: 26092344 (SE +/- 349749.32, N = 12)
  m7g.16xlarge Graviton3: 112119711 (SE +/- 2854071.93, N = 15)
(CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver
Additional flags listed for individual systems: -m64 -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -mbmi2; -m64 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: BLAS
Nodes Per Second, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 1316 (SE +/- 13.29, N = 5)
  c6g.16xlarge Graviton2: 947 (SE +/- 11.79, N = 3)
  c7g.16xlarge Graviton3: 1333 (SE +/- 3.53, N = 3)
  c7gn.16xlarge Graviton3E: 1392 (SE +/- 7.22, N = 3)
  m7g.16xlarge Graviton3: 1301 (SE +/- 4.67, N = 3)
(CXX) g++ options: -flto -pthread

Backend: BLAS

egeo-07: The test quit with a non-zero exit status. E: ./lczero: line 4: ./lc0: No such file or directory

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript runtime built on the Chrome V8 JavaScript engine and is itself written in C/C++. Learn more via the OpenBenchmarking.org test page.

Timed Node.js Compilation 19.8.1 - Time To Compile
Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 230.42 (SE +/- 0.40, N = 3)
  c6g.16xlarge Graviton2: 287.81 (SE +/- 0.16, N = 3)
  c7g.16xlarge Graviton3: 238.54 (SE +/- 0.20, N = 3)
  c7gn.16xlarge Graviton3E: 238.64 (SE +/- 0.32, N = 3)
  egeo-07: 641.87 (SE +/- 1.11, N = 3)
  m7g.16xlarge Graviton3: 237.78 (SE +/- 0.33, N = 3)

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: Eigen
Nodes Per Second, More Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 1152 (SE +/- 7.37, N = 3)
  c6g.16xlarge Graviton2: 891 (SE +/- 4.73, N = 3)
  c7g.16xlarge Graviton3: 1382 (SE +/- 15.65, N = 3)
  c7gn.16xlarge Graviton3E: 1444 (SE +/- 14.88, N = 3)
  m7g.16xlarge Graviton3: 1398 (SE +/- 8.74, N = 3)
(CXX) g++ options: -flto -pthread

Backend: Eigen

egeo-07: The test quit with a non-zero exit status. E: ./lczero: line 4: ./lc0: No such file or directory

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. QMCPACK is an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.

QMCPACK 3.16 - Input: FeCO6_b3lyp_gms
Total Execution Time - Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 184.10 (SE +/- 1.03, N = 3; -march=native)
  c6g.16xlarge Graviton2: 302.19 (SE +/- 0.37, N = 3; -mcpu=native)
  c7g.16xlarge Graviton3: 211.32 (SE +/- 0.19, N = 3; -mcpu=native)
  c7gn.16xlarge Graviton3E: 188.28 (SE +/- 0.29, N = 3; -mcpu=native)
  egeo-07: 608.86 (SE +/- 0.11, N = 3; -march=native -pthread)
  m7g.16xlarge Graviton3: 211.60 (SE +/- 0.22, N = 3; -mcpu=native)
(CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -O3 -lm -ldl

QMCPACK 3.16 - Input: FeCO6_b3lyp_gms (second result listed under the same input)
Total Execution Time - Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 187.32 (SE +/- 2.30, N = 3; -march=native)
  c6g.16xlarge Graviton2: 297.94 (SE +/- 1.75, N = 3; -mcpu=native)
  c7g.16xlarge Graviton3: 204.77 (SE +/- 0.82, N = 3; -mcpu=native)
  c7gn.16xlarge Graviton3E: 204.25 (SE +/- 0.21, N = 3; -mcpu=native)
  egeo-07: 552.72 (SE +/- 7.86, N = 3; -march=native -pthread)
  m7g.16xlarge Graviton3: 205.72 (SE +/- 0.45, N = 3; -mcpu=native)
(CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -O3 -lm -ldl

Timed Godot Game Engine Compilation

This test times how long it takes to compile the Godot Game Engine. Godot is a popular, open-source, cross-platform 2D/3D game engine; here it is built using the SCons build system, targeting the X11 platform. Learn more via the OpenBenchmarking.org test page.

Timed Godot Game Engine Compilation 4.0 - Time To Compile
Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 147.74 (SE +/- 0.12, N = 3)
  c6g.16xlarge Graviton2: 218.28 (SE +/- 0.30, N = 3)
  c7g.16xlarge Graviton3: 156.69 (SE +/- 0.63, N = 3)
  c7gn.16xlarge Graviton3E: 155.95 (SE +/- 0.45, N = 3)
  egeo-07: 391.74 (SE +/- 0.36, N = 3)
  m7g.16xlarge Graviton3: 154.38 (SE +/- 0.32, N = 3)

Monte Carlo Simulations of Ionised Nebulae

Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Dust 2D tau100.0
Seconds, Fewer Is Better (OpenBenchmarking.org)
  c6a.16xlarge AMD Zen 3: 194.44 (SE +/- 1.84, N = 7)
  c6g.16xlarge Graviton2: 145.37 (SE +/- 0.86, N = 3)
  c7g.16xlarge Graviton3: 82.82 (SE +/- 0.00, N = 3)
  c7gn.16xlarge Graviton3E: 82.97 (SE +/- 0.07, N = 3)
  egeo-07: 281.25 (SE +/- 0.37, N = 3)
  m7g.16xlarge Graviton3: 82.67 (SE +/- 0.01, N = 3)
(F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Additional flags listed for an individual system: -pthread -ldl -lutil -lrt

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSL 3.1Algorithm: SHA256c6a.16xlarge AMD Zen 3c6g.16xlarge Graviton2c7g.16xlarge Graviton3c7gn.16xlarge Graviton3Eegeo-07m7g.16xlarge Graviton312000M24000M36000M48000M60000MSE +/- 26770675.21, N = 3SE +/- 245440310.03, N = 3SE +/- 16491036.11, N = 3SE +/- 19542665.92, N = 3SE +/- 404619.57, N = 3SE +/- 18610524.10, N = 345857534777424727988475421656126354154218593346692520354212515580-m641. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSL 3.1Algorithm: AES-128-GCMc6a.16xlarge AMD Zen 3c6g.16xlarge Graviton2c7g.16xlarge Graviton3c7gn.16xlarge Graviton3Eegeo-07m7g.16xlarge Graviton390000M180000M270000M360000M450000MSE +/- 4227452.23, N = 3SE +/- 9833681.11, N = 3SE +/- 12264074.61, N = 3SE +/- 11273100.69, N = 3SE +/- 11737066.92, N = 3SE +/- 81289574.27, N = 315144926931715843616385733206434984341113046994361131320253332033171900-m641. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSL 3.1Algorithm: ChaCha20c6a.16xlarge AMD Zen 3c6g.16xlarge Graviton2c7g.16xlarge Graviton3c7gn.16xlarge Graviton3Eegeo-07m7g.16xlarge Graviton330000M60000M90000M120000M150000MSE +/- 36376378.52, N = 3SE +/- 35952887.59, N = 3SE +/- 1725060.95, N = 3SE +/- 771581.87, N = 3SE +/- 13595278.49, N = 3SE +/- 1293723.80, N = 31383893787536729254120310327551699711411811942356213494627103226784517-m64-m641. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1, Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better)
c6a.16xlarge AMD Zen 3: 92522999373 (SE +/- 232372675.93, N = 3)
c6g.16xlarge Graviton2: 46717636807 (SE +/- 1132293.08, N = 3)
c7g.16xlarge Graviton3: 74318842213 (SE +/- 1218886.42, N = 3)
c7gn.16xlarge Graviton3E: 79969465487 (SE +/- 1769561.47, N = 3)
egeo-07: 26507041907 (SE +/- 1523000.86, N = 3)
m7g.16xlarge Graviton3: 74287460990 (SE +/- 1340503.89, N = 3)
Additional option on two systems: -m64
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1, Algorithm: AES-256-GCM (byte/s, More Is Better)
c6a.16xlarge AMD Zen 3: 138457889450 (SE +/- 41584947.90, N = 3)
c6g.16xlarge Graviton2: 129199593157 (SE +/- 2312792.64, N = 3)
c7g.16xlarge Graviton3: 283373795737 (SE +/- 33807617.40, N = 3)
c7gn.16xlarge Graviton3E: 351152465420 (SE +/- 24279491.44, N = 3)
egeo-07: 44619128560 (SE +/- 2585526.42, N = 3)
m7g.16xlarge Graviton3: 283333113630 (SE +/- 6411836.47, N = 3)
Additional option on some systems: -m64
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1, Algorithm: SHA512 (byte/s, More Is Better)
c6a.16xlarge AMD Zen 3: 15291283297 (SE +/- 207279.55, N = 3)
c6g.16xlarge Graviton2: 14393925490 (SE +/- 9173912.49, N = 3)
c7g.16xlarge Graviton3: 32145914147 (SE +/- 4573992.60, N = 3)
c7gn.16xlarge Graviton3E: 32126059040 (SE +/- 16155877.53, N = 3)
egeo-07: 3878566000 (SE +/- 1513929.31, N = 3)
m7g.16xlarge Graviton3: 32125448870 (SE +/- 17714077.14, N = 3)
Additional option on some systems: -m64
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. QMCPACK is an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.

QMCPACK 3.16, Input: Li2_STO_ae (Total Execution Time - Seconds, Fewer Is Better)
c6a.16xlarge AMD Zen 3: 123.95 (SE +/- 0.13, N = 3)
c6g.16xlarge Graviton2: 165.12 (SE +/- 1.13, N = 3)
c7g.16xlarge Graviton3: 112.64 (SE +/- 0.12, N = 3)
c7gn.16xlarge Graviton3E: 113.20 (SE +/- 0.31, N = 3)
egeo-07: 408.45 (SE +/- 4.45, N = 3)
m7g.16xlarge Graviton3: 112.61 (SE +/- 0.08, N = 3)
Per-system options, in the system order above: -march=native; -mcpu=native; -mcpu=native; -mcpu=native; -march=native -pthread; -mcpu=native
1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -O3 -lm -ldl

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10, Test: CPU Cache (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 1447265.35 (SE +/- 30785.49, N = 12)
c6g.16xlarge Graviton2: 1921785.20 (SE +/- 21905.72, N = 15)
c7g.16xlarge Graviton3: 3844101.98 (SE +/- 59376.56, N = 15)
c7gn.16xlarge Graviton3E: 3860335.38 (SE +/- 40698.46, N = 15)
egeo-07: 1525522.46 (SE +/- 22640.51, N = 15)
m7g.16xlarge Graviton3: 3892396.34 (SE +/- 57217.78, N = 15)
Additional options on some systems: -laio -lbsd -lEGL -lGLESv2 -lmd
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Laghos

Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.

Laghos 3.1, Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Total Rate, More Is Better)
c6a.16xlarge AMD Zen 3: 275.92 (SE +/- 0.48, N = 3)
c6g.16xlarge Graviton2: 322.37 (SE +/- 0.89, N = 3)
c7g.16xlarge Graviton3: 408.01 (SE +/- 0.89, N = 3)
c7gn.16xlarge Graviton3E: 423.11 (SE +/- 0.79, N = 3)
egeo-07: 70.77 (SE +/- 0.27, N = 3)
m7g.16xlarge Graviton3: 410.55 (SE +/- 0.42, N = 3)
Additional option on some systems: -pthread
1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

nekRS

nekRS is an open-source Navier-Stokes solver based on the spectral element method. NekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. NekRS is part of Nek5000 of the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and otherwise may be very time consuming on smaller systems. Learn more via the OpenBenchmarking.org test page.

nekRS 23.0, Input: TurboPipe Periodic (flops/rank, More Is Better)
c6a.16xlarge AMD Zen 3: 4337536667 (SE +/- 12801180.07, N = 3)
c6g.16xlarge Graviton2: 2220190000 (SE +/- 144222.05, N = 3)
c7g.16xlarge Graviton3: 3978983333 (SE +/- 169148.19, N = 3)
c7gn.16xlarge Graviton3E: 4141440000 (SE +/- 1394740.12, N = 3)
m7g.16xlarge Graviton3: 3976300000 (SE +/- 1199180.28, N = 3)
1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi

egeo-07: The test quit with a non-zero exit status. E: [egeo-07.qteorica.unal.edu.co:290233] PMIX ERROR: UNREACHABLE in file ../../../src/server/pmix_server.c at line 2795

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
c6a.16xlarge AMD Zen 3: 9.388050 (SE +/- 0.038051, N = 3)
c6g.16xlarge Graviton2: 20.417952 (SE +/- 0.154503, N = 3)
c7g.16xlarge Graviton3: 24.140605 (SE +/- 0.285590, N = 4)
c7gn.16xlarge Graviton3E: 24.078529 (SE +/- 0.297525, N = 4)
egeo-07: 2.094206 (SE +/- 0.035680, N = 15)
m7g.16xlarge Graviton3: 24.362353 (SE +/- 0.171001, N = 13)
1. (CC) gcc options: -O3 -march=native -fopenmp
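
Each result is reported as a mean with a standard error (SE) and run count (N). One way to judge how noisy a result is, sketched here as an illustration rather than as anything the Phoronix Test Suite itself does, is to express the SE as a percentage of the mean. The figures are copied from the ACES DGEMM result above; the function name `relative_se_percent` and the 1% threshold are assumptions chosen for this example.

```python
# (mean GFLOP/s, standard error, run count) per system, copied from the
# ACES DGEMM result above.
dgemm = {
    "c6a.16xlarge AMD Zen 3": (9.388050, 0.038051, 3),
    "c6g.16xlarge Graviton2": (20.417952, 0.154503, 3),
    "c7g.16xlarge Graviton3": (24.140605, 0.285590, 4),
    "c7gn.16xlarge Graviton3E": (24.078529, 0.297525, 4),
    "egeo-07": (2.094206, 0.035680, 15),
    "m7g.16xlarge Graviton3": (24.362353, 0.171001, 13),
}


def relative_se_percent(mean, se):
    """Return the standard error as a percentage of the mean (hypothetical helper)."""
    return 100.0 * se / mean


# Illustrative threshold only: flag results whose SE exceeds 1% of the mean.
NOISY_THRESHOLD_PERCENT = 1.0

rel_se = {name: relative_se_percent(mean, se) for name, (mean, se, _n) in dgemm.items()}

for name, pct in rel_se.items():
    flag = "noisy" if pct > NOISY_THRESHOLD_PERCENT else "stable"
    print(f"{name}: SE is {pct:.2f}% of the mean ({flag})")
```

A relative figure makes runs with very different absolute throughput comparable, which matters here because egeo-07 scores roughly a tenth of the Graviton3 instances.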

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: EP.D (Total Mop/s, More Is Better)
c6a.16xlarge AMD Zen 3: 3061.42 (SE +/- 4.77, N = 3)
c6g.16xlarge Graviton2: 2216.26 (SE +/- 2.22, N = 3)
c7g.16xlarge Graviton3: 3664.54 (SE +/- 34.07, N = 15)
c7gn.16xlarge Graviton3E: 3657.67 (SE +/- 32.06, N = 15)
egeo-07: 1329.20 (SE +/- 0.38, N = 3)
m7g.16xlarge Graviton3: 3738.98 (SE +/- 1.69, N = 3)
Additional options on some systems: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. c6a.16xlarge AMD Zen 3: Open MPI 4.1.2
3. egeo-07: Open MPI 4.1.0

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better)
c6a.16xlarge AMD Zen 3: 89.82 (SE +/- 0.13, N = 3)
c6g.16xlarge Graviton2: 92.76 (SE +/- 0.02, N = 3)
c7g.16xlarge Graviton3: 62.08 (SE +/- 0.04, N = 3)
c7gn.16xlarge Graviton3E: 56.44 (SE +/- 0.04, N = 3)
egeo-07: 257.46 (SE +/- 0.19, N = 3)
m7g.16xlarge Graviton3: 61.83 (SE +/- 0.03, N = 3)
Additional option on some systems: -pthread
1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

nekRS

nekRS is an open-source Navier-Stokes solver based on the spectral element method. NekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. NekRS is part of Nek5000 of the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and otherwise may be very time consuming on smaller systems. Learn more via the OpenBenchmarking.org test page.

nekRS 23.0, Input: Kershaw (flops/rank, More Is Better)
c6a.16xlarge AMD Zen 3: 4308810000 (SE +/- 22342148.51, N = 3)
c6g.16xlarge Graviton2: 1760336667 (SE +/- 737119.02, N = 3)
c7g.16xlarge Graviton3: 3261853333 (SE +/- 2490845.46, N = 3)
c7gn.16xlarge Graviton3E: 3302823333 (SE +/- 5414395.42, N = 3)
m7g.16xlarge Graviton3: 3150680000 (SE +/- 1575066.14, N = 3)
1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi

egeo-07: The test quit with a non-zero exit status. E: [egeo-07.qteorica.unal.edu.co:290025] PMIX ERROR: UNREACHABLE in file ../../../src/server/pmix_server.c at line 2795

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: SP.C (Total Mop/s, More Is Better)
c6a.16xlarge AMD Zen 3: 34025.35 (SE +/- 20.85, N = 3)
c6g.16xlarge Graviton2: 9711.70 (SE +/- 1.54, N = 3)
c7g.16xlarge Graviton3: 17219.95 (SE +/- 7.21, N = 3)
c7gn.16xlarge Graviton3E: 17163.11 (SE +/- 31.31, N = 3)
egeo-07: 11726.55 (SE +/- 15.52, N = 3)
m7g.16xlarge Graviton3: 17244.85 (SE +/- 10.19, N = 3)
Additional options on some systems: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. c6a.16xlarge AMD Zen 3: Open MPI 4.1.2
3. egeo-07: Open MPI 4.1.0

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.

nginx 1.23.2, Connections: 1000 (Requests Per Second, More Is Better)
c6a.16xlarge AMD Zen 3: 163178.67 (SE +/- 136.82, N = 3)
c6g.16xlarge Graviton2: 158676.40 (SE +/- 185.79, N = 3)
c7g.16xlarge Graviton3: 255552.05 (SE +/- 55.97, N = 3)
c7gn.16xlarge Graviton3E: 256585.83 (SE +/- 402.16, N = 3)
egeo-07: 70348.77 (SE +/- 141.39, N = 3)
m7g.16xlarge Graviton3: 255616.04 (SE +/- 137.20, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2, Connections: 500 (Requests Per Second, More Is Better)
c6a.16xlarge AMD Zen 3: 165847.75 (SE +/- 60.38, N = 3)
c6g.16xlarge Graviton2: 148964.69 (SE +/- 90.87, N = 3)
c7g.16xlarge Graviton3: 255145.52 (SE +/- 243.69, N = 3)
c7gn.16xlarge Graviton3E: 253518.51 (SE +/- 317.05, N = 3)
egeo-07: 73654.19 (SE +/- 47.73, N = 3)
m7g.16xlarge Graviton3: 255768.44 (SE +/- 323.56, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10, Test: Wide Vector Math (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 1380146.63 (SE +/- 2507.18, N = 3)
c6g.16xlarge Graviton2: 997272.65 (SE +/- 505.84, N = 3)
c7g.16xlarge Graviton3: 1535336.57 (SE +/- 16521.46, N = 15)
c7gn.16xlarge Graviton3E: 1530043.52 (SE +/- 16444.95, N = 15)
egeo-07: 399851.85 (SE +/- 641.34, N = 3)
m7g.16xlarge Graviton3: 1542834.94 (SE +/- 16116.93, N = 15)
Additional options on some systems: -laio -lbsd -lEGL -lGLESv2 -lmd
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Laghos

Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.

Laghos 3.1, Test: Triple Point Problem (Major Kernels Total Rate, More Is Better)
c6a.16xlarge AMD Zen 3: 227.40 (SE +/- 1.06, N = 3)
c6g.16xlarge Graviton2: 180.80 (SE +/- 0.48, N = 3)
c7g.16xlarge Graviton3: 230.68 (SE +/- 0.16, N = 3)
c7gn.16xlarge Graviton3E: 236.22 (SE +/- 0.27, N = 3)
egeo-07: 61.58 (SE +/- 0.71, N = 4)
m7g.16xlarge Graviton3: 232.01 (SE +/- 0.28, N = 3)
Additional option on some systems: -pthread
1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1, Test: OpenMP LavaMD (Seconds, Fewer Is Better)
c6a.16xlarge AMD Zen 3: 64.18 (SE +/- 0.53, N = 3)
c6g.16xlarge Graviton2: 62.22 (SE +/- 0.04, N = 3)
c7g.16xlarge Graviton3: 43.96 (SE +/- 0.11, N = 3)
c7gn.16xlarge Graviton3E: 44.04 (SE +/- 0.15, N = 3)
egeo-07: 212.55 (SE +/- 0.02, N = 3)
m7g.16xlarge Graviton3: 43.79 (SE +/- 0.15, N = 3)
1. (CXX) g++ options, per system in the order above: -O2 -lOpenCL; -O2 -lOpenCL; -O2 -lOpenCL; -O2 -lOpenCL; -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl; -O2 -lOpenCL

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better)
c6a.16xlarge AMD Zen 3: 23.52 (SE +/- 0.05, N = 3)
c6g.16xlarge Graviton2: 24.27 (SE +/- 0.01, N = 3)
c7g.16xlarge Graviton3: 46.37 (SE +/- 0.01, N = 3)
c7gn.16xlarge Graviton3E: 46.53 (SE +/- 0.01, N = 3)
egeo-07: 10.67 (SE +/- 0.02, N = 3)
m7g.16xlarge Graviton3: 46.25 (SE +/- 0.01, N = 3)
Additional option on some systems: -pthread
1. (CXX) g++ options: -O3

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
c6a.16xlarge AMD Zen 3: 3.965 (SE +/- 0.013, N = 3)
c6g.16xlarge Graviton2: 2.767 (SE +/- 0.002, N = 3)
c7g.16xlarge Graviton3: 4.200 (SE +/- 0.004, N = 3)
c7gn.16xlarge Graviton3E: 4.820 (SE +/- 0.003, N = 3)
egeo-07: 1.116 (SE +/- 0.002, N = 3)
m7g.16xlarge Graviton3: 4.223 (SE +/- 0.003, N = 3)
Additional option on some systems: -lm
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: LU.C (Total Mop/s, More Is Better)
c6a.16xlarge AMD Zen 3: 95221.40 (SE +/- 90.22, N = 3)
c6g.16xlarge Graviton2: 18741.90 (SE +/- 26.12, N = 3)
c7g.16xlarge Graviton3: 28375.71 (SE +/- 36.09, N = 3)
c7gn.16xlarge Graviton3E: 28369.11 (SE +/- 43.73, N = 3)
egeo-07: 36556.34 (SE +/- 21.23, N = 3)
m7g.16xlarge Graviton3: 28341.68 (SE +/- 48.62, N = 3)
Additional options on some systems: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. c6a.16xlarge AMD Zen 3: Open MPI 4.1.2
3. egeo-07: Open MPI 4.1.0

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1, Algorithm: RSA4096 (verify/s, More Is Better)
c6a.16xlarge AMD Zen 3: 548396.5 (SE +/- 34.73, N = 3)
c6g.16xlarge Graviton2: 214040.9 (SE +/- 88.30, N = 3)
c7g.16xlarge Graviton3: 713945.9 (SE +/- 12.03, N = 3)
c7gn.16xlarge Graviton3E: 713754.8 (SE +/- 198.10, N = 3)
egeo-07: 200342.7 (SE +/- 170.27, N = 3)
m7g.16xlarge Graviton3: 713859.5 (SE +/- 21.82, N = 3)
Additional option on two systems: -m64
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1, Algorithm: RSA4096 (sign/s, More Is Better)
c6a.16xlarge AMD Zen 3: 8392.4 (SE +/- 3.06, N = 3)
c6g.16xlarge Graviton2: 2624.3 (SE +/- 1.71, N = 3)
c7g.16xlarge Graviton3: 10181.4 (SE +/- 1.54, N = 3)
c7gn.16xlarge Graviton3E: 10183.3 (SE +/- 0.84, N = 3)
egeo-07: 3041.8 (SE +/- 5.58, N = 3)
m7g.16xlarge Graviton3: 10181.9 (SE +/- 1.27, N = 3)
Additional option on two systems: -m64
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
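
To read a result like RSA4096 signing as a generational comparison, it can help to normalize every system against one baseline. The sketch below is an illustration only: it uses the sign/s figures reported above, picks the Graviton2 instance as the baseline by assumption, and the helper name `relative_to` is hypothetical. The `higher_is_better` flag inverts the ratio for "Fewer Is Better" results such as run times.

```python
# RSA4096 sign/s figures copied from the result above (More Is Better).
sign_per_sec = {
    "c6a.16xlarge AMD Zen 3": 8392.4,
    "c6g.16xlarge Graviton2": 2624.3,
    "c7g.16xlarge Graviton3": 10181.4,
    "c7gn.16xlarge Graviton3E": 10183.3,
    "egeo-07": 3041.8,
    "m7g.16xlarge Graviton3": 10181.9,
}


def relative_to(results, baseline, higher_is_better=True):
    """Normalize results so the baseline system scores 1.0 (hypothetical helper).

    For time-based results (fewer is better) the ratio is inverted so that
    a value above 1.0 always means "faster than the baseline".
    """
    base = results[baseline]
    if higher_is_better:
        return {name: value / base for name, value in results.items()}
    return {name: base / value for name, value in results.items()}


# Baseline choice is an assumption made for this example.
ratios = relative_to(sign_per_sec, "c6g.16xlarge Graviton2")

for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}x of Graviton2")
```

The same helper works for the timed tests in this result file by passing `higher_is_better=False`.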

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
c6a.16xlarge AMD Zen 3: 1466587.04 (SE +/- 6710.50, N = 3)
c6g.16xlarge Graviton2: 1260642.18 (SE +/- 153.60, N = 3)
c7g.16xlarge Graviton3: 1605948.67 (SE +/- 13274.76, N = 15)
c7gn.16xlarge Graviton3E: 1611801.56 (SE +/- 14869.41, N = 7)
egeo-07: 362563.29 (SE +/- 4039.35, N = 3)
m7g.16xlarge Graviton3: 1601880.34 (SE +/- 11449.37, N = 15)
1. (CC) gcc options: -O2 -lrt" -lrt

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1, Test: OpenMP Streamcluster (Seconds, Fewer Is Better)
c6a.16xlarge AMD Zen 3: 8.396 (SE +/- 0.101, N = 15)
c6g.16xlarge Graviton2: 13.735 (SE +/- 0.211, N = 15)
c7g.16xlarge Graviton3: 11.625 (SE +/- 0.099, N = 8)
c7gn.16xlarge Graviton3E: 10.690 (SE +/- 0.233, N = 12)
egeo-07: 23.247 (SE +/- 0.397, N = 15)
m7g.16xlarge Graviton3: 11.663 (SE +/- 0.138, N = 3)
1. (CXX) g++ options, per system in the order above: -O2 -lOpenCL; -O2 -lOpenCL; -O2 -lOpenCL; -O2 -lOpenCL; -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl; -O2 -lOpenCL

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10, Test: NUMA (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 552.68 (SE +/- 9.75, N = 15)
c6g.16xlarge Graviton2: 2112.66 (SE +/- 1.53, N = 3)
c7g.16xlarge Graviton3: 3523.58 (SE +/- 3.39, N = 3)
c7gn.16xlarge Graviton3E: 3525.17 (SE +/- 7.31, N = 3)
egeo-07: 0.89 (SE +/- 0.00, N = 3)
m7g.16xlarge Graviton3: 3759.10 (SE +/- 5.17, N = 3)
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. QMCPACK is an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.

QMCPACK 3.16, Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better)
c6a.16xlarge AMD Zen 3: 26.87 (SE +/- 0.08, N = 3)
c6g.16xlarge Graviton2: 45.23 (SE +/- 0.24, N = 3)
c7g.16xlarge Graviton3: 27.99 (SE +/- 0.02, N = 3)
c7gn.16xlarge Graviton3E: 28.00 (SE +/- 0.04, N = 3)
egeo-07: 77.52 (SE +/- 0.86, N = 5)
m7g.16xlarge Graviton3: 28.04 (SE +/- 0.03, N = 3)
Per-system options, in the system order above: -march=native; -mcpu=native; -mcpu=native; -mcpu=native; -march=native -pthread; -mcpu=native
1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -O3 -lm -ldl

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

srsRAN Project 23.5, Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better)
c6a.16xlarge AMD Zen 3: 6479.1 (SE +/- 21.76, N = 3)
c6g.16xlarge Graviton2: 3938.7 (SE +/- 2.53, N = 3)
c7g.16xlarge Graviton3: 5356.8 (SE +/- 1.80, N = 3)
c7gn.16xlarge Graviton3E: 5431.2 (SE +/- 3.32, N = 3)
egeo-07: 1129.5 (SE +/- 6.96, N = 3)
m7g.16xlarge Graviton3: 5413.8 (SE +/- 4.08, N = 3)
Additional options on two systems: -march=native -mfma; -march=native -mfma -lpthread
1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -lgtest

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10, Test: Vector Floating Point (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 96529.51 (SE +/- 864.23, N = 13)
c6g.16xlarge Graviton2: 42850.82 (SE +/- 31.31, N = 3)
c7g.16xlarge Graviton3: 76178.46 (SE +/- 71.97, N = 3)
c7gn.16xlarge Graviton3E: 76911.74 (SE +/- 1.74, N = 3)
egeo-07: 21347.32 (SE +/- 160.53, N = 3)
m7g.16xlarge Graviton3: 76102.55 (SE +/- 190.19, N = 3)
Additional options on some systems: -laio -lbsd -lEGL -lGLESv2 -lmd
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

srsRAN Project 23.5, Test: Downlink Processor Benchmark (Mbps, More Is Better)
c6a.16xlarge AMD Zen 3: 691.3 (SE +/- 1.26, N = 3)
c6g.16xlarge Graviton2: 197.2 (SE +/- 0.25, N = 3)
c7g.16xlarge Graviton3: 319.7 (SE +/- 0.95, N = 3)
c7gn.16xlarge Graviton3E: 323.2 (SE +/- 0.06, N = 3)
egeo-07: 342.3 (SE +/- 4.00, N = 4)
m7g.16xlarge Graviton3: 318.5 (SE +/- 0.91, N = 3)
Additional options on two systems: -march=native -mfma; -march=native -mfma -lpthread
1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -lgtest

srsRAN Project 23.5, Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better)
c6a.16xlarge AMD Zen 3: 215.9 (SE +/- 0.55, N = 3)
c6g.16xlarge Graviton2: 63.8 (SE +/- 0.03, N = 3)
c7g.16xlarge Graviton3: 95.7 (SE +/- 0.03, N = 3)
c7gn.16xlarge Graviton3E: 97.4 (SE +/- 0.06, N = 3)
egeo-07: 88.5 (SE +/- 0.68, N = 10)
m7g.16xlarge Graviton3: 95.8 (SE +/- 0.03, N = 3)
Additional options on two systems: -march=native -mfma; -march=native -mfma -lpthread
1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -lgtest

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better)
c6a.16xlarge AMD Zen 3: 42.44 (SE +/- 0.05, N = 3)
c6g.16xlarge Graviton2: 44.93 (SE +/- 0.03, N = 3)
c7g.16xlarge Graviton3: 84.75 (SE +/- 0.01, N = 3)
c7gn.16xlarge Graviton3E: 85.01 (SE +/- 0.02, N = 3)
egeo-07: 18.99 (SE +/- 0.01, N = 3)
m7g.16xlarge Graviton3: 84.47 (SE +/- 0.02, N = 3)
Additional option on some systems: -pthread
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better)
c6a.16xlarge AMD Zen 3: 44.32 (SE +/- 0.14, N = 3)
c6g.16xlarge Graviton2: 42.83 (SE +/- 0.01, N = 3)
c7g.16xlarge Graviton3: 88.18 (SE +/- 0.05, N = 3)
c7gn.16xlarge Graviton3E: 88.46 (SE +/- 0.02, N = 3)
egeo-07: 19.67 (SE +/- 0.02, N = 3)
m7g.16xlarge Graviton3: 88.05 (SE +/- 0.02, N = 3)
Additional option on some systems: -pthread
1. (CXX) g++ options: -O3
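
When one benchmark is run in several configurations, a geometric mean of the per-configuration ratios is a common way to summarize the uplift in a single number. The following is a minimal sketch under stated assumptions: it covers only the three HeFFTe configurations in this result file (c2c double, r2c double, c2c float, all at 512), compares c7g.16xlarge Graviton3 against c6g.16xlarge Graviton2, and the variable names are made up for the example. It is not how OpenBenchmarking.org computes its own geometric means.

```python
from statistics import geometric_mean

# GFLOP/s per HeFFTe configuration as (c6g.16xlarge Graviton2, c7g.16xlarge Graviton3),
# copied from the HeFFTe results in this file.
heffte_gflops = {
    "c2c / double / 512": (24.27, 46.37),
    "r2c / double / 512": (44.93, 84.75),
    "c2c / float / 512": (42.83, 88.18),
}

# Per-configuration speedup of Graviton3 over Graviton2 (more is better).
speedups = {config: g3 / g2 for config, (g2, g3) in heffte_gflops.items()}

# The geometric mean treats each ratio multiplicatively, so no single
# configuration dominates the summary.
overall = geometric_mean(speedups.values())

for config, ratio in speedups.items():
    print(f"{config}: {ratio:.2f}x")
print(f"Geometric mean: {overall:.2f}x")
```

The geometric mean is preferred over an arithmetic mean for ratios because the result does not depend on which system is chosen as the denominator.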

Kripke

Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.

Kripke 1.2.6 (Throughput FoM, More Is Better)
c6a.16xlarge AMD Zen 3: 237087650 (SE +/- 2932840.19, N = 4)
c6g.16xlarge Graviton2: 220120233 (SE +/- 102787.75, N = 3)
c7g.16xlarge Graviton3: 354442733 (SE +/- 525406.56, N = 3)
c7gn.16xlarge Graviton3E: 354234067 (SE +/- 445212.18, N = 3)
egeo-07: 109107233 (SE +/- 523405.33, N = 3)
m7g.16xlarge Graviton3: 339000400 (SE +/- 619419.33, N = 3)
Additional option on some systems: -pthread
1. (CXX) g++ options: -O3 -fopenmp -ldl

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 22.01, Test: Decompression Rating (MIPS, More Is Better)
c6a.16xlarge AMD Zen 3: 235787 (SE +/- 1190.65, N = 3)
c6g.16xlarge Graviton2: 234202 (SE +/- 15.43, N = 3)
c7g.16xlarge Graviton3: 285633 (SE +/- 146.43, N = 3)
c7gn.16xlarge Graviton3E: 285677 (SE +/- 54.90, N = 3)
egeo-07: 59055 (SE +/- 265.06, N = 3)
m7g.16xlarge Graviton3: 285540 (SE +/- 93.51, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression 22.01, Test: Compression Rating (MIPS, More Is Better)
c6a.16xlarge AMD Zen 3: 230970 (SE +/- 670.46, N = 3)
c6g.16xlarge Graviton2: 240702 (SE +/- 209.44, N = 3)
c7g.16xlarge Graviton3: 311056 (SE +/- 72.90, N = 3)
c7gn.16xlarge Graviton3E: 312009 (SE +/- 308.14, N = 3)
egeo-07: 75840 (SE +/- 414.62, N = 3)
m7g.16xlarge Graviton3: 316825 (SE +/- 154.72, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many scalar transport equations as you need. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
c6a.16xlarge AMD Zen 3: 30.31 (SE +/- 0.28, N = 3)
c6g.16xlarge Graviton2: 25.88 (SE +/- 0.03, N = 3)
c7g.16xlarge Graviton3: 13.83 (SE +/- 0.09, N = 3)
c7gn.16xlarge Graviton3E: 13.76 (SE +/- 0.05, N = 3)
egeo-07: 84.66 (SE +/- 0.03, N = 3)
m7g.16xlarge Graviton3: 13.95 (SE +/- 0.02, N = 3)
Additional options on some systems: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
c6a.16xlarge AMD Zen 3: 460076667 (SE +/- 392527.42, N = 3)
c6g.16xlarge Graviton2: 134926667 (SE +/- 3333.33, N = 3)
c7g.16xlarge Graviton3: 162766667 (SE +/- 3333.33, N = 3)
c7gn.16xlarge Graviton3E: 162756667 (SE +/- 8819.17, N = 3)
egeo-07: 129560000 (SE +/- 92915.73, N = 3)
m7g.16xlarge Graviton3: 162753333 (SE +/- 6666.67, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
c6a.16xlarge AMD Zen 3: 274803333 (SE +/- 193419.52, N = 3)
c6g.16xlarge Graviton2: 67486333 (SE +/- 333.33, N = 3)
c7g.16xlarge Graviton3: 81412000 (SE +/- 1000.00, N = 3)
c7gn.16xlarge Graviton3E: 81394000 (SE +/- 577.35, N = 3)
egeo-07: 128250000 (SE +/- 120554.28, N = 3)
m7g.16xlarge Graviton3: 81396667 (SE +/- 1855.92, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10, Test: Fused Multiply-Add (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 30920910.92 (SE +/- 32747.05, N = 3)
c6g.16xlarge Graviton2: 37732190.54 (SE +/- 3687.67, N = 3)
c7g.16xlarge Graviton3: 63818458.61 (SE +/- 4431.60, N = 3)
c7gn.16xlarge Graviton3E: 63723431.55 (SE +/- 10061.51, N = 3)
egeo-07: 10084119.21 (SE +/- 16948.60, N = 3)
m7g.16xlarge Graviton3: 63762252.76 (SE +/- 4870.19, N = 3)
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Stress-NG 0.15.10, Test: Vector Shuffle (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 22255.84 (SE +/- 0.50, N = 3)
c6g.16xlarge Graviton2: 35614.51 (SE +/- 74.80, N = 3)
c7g.16xlarge Graviton3: 54472.07 (SE +/- 139.03, N = 3)
c7gn.16xlarge Graviton3E: 54695.04 (SE +/- 294.96, N = 3)
egeo-07: 7162.56 (SE +/- 0.40, N = 3)
m7g.16xlarge Graviton3: 54143.40 (SE +/- 21.44, N = 3)
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better)
c6a.16xlarge AMD Zen 3: 2184866667 (SE +/- 218581.28, N = 3)
c6g.16xlarge Graviton2: 1531400000 (SE +/- 251661.15, N = 3)
c7g.16xlarge Graviton3: 2271966667 (SE +/- 284800.12, N = 3)
c7gn.16xlarge Graviton3E: 2266833333 (SE +/- 2915666.50, N = 3)
egeo-07: 641243333 (SE +/- 707515.21, N = 3)
m7g.16xlarge Graviton3: 2270500000 (SE +/- 435889.89, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
c6a.16xlarge AMD Zen 3: 1710800000 (SE +/- 1014889.16, N = 3)
c6g.16xlarge Graviton2: 978200000 (SE +/- 11547.01, N = 3)
c7g.16xlarge Graviton3: 1442366667 (SE +/- 284800.12, N = 3)
c7gn.16xlarge Graviton3E: 1442666667 (SE +/- 88191.71, N = 3)
egeo-07: 477896667 (SE +/- 851162.60, N = 3)
m7g.16xlarge Graviton3: 1442400000 (SE +/- 152752.52, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10, Test: Matrix 3D Math (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 4571.96 (SE +/- 1.96, N = 3)
c6g.16xlarge Graviton2: 5752.17 (SE +/- 1.40, N = 3)
c7g.16xlarge Graviton3: 10813.59 (SE +/- 9.35, N = 3)
c7gn.16xlarge Graviton3E: 10882.02 (SE +/- 19.16, N = 3)
egeo-07: 1841.58 (SE +/- 9.17, N = 3)
m7g.16xlarge Graviton3: 10403.93 (SE +/- 6.38, N = 3)
Additional options on some systems: -laio -lbsd -lEGL -lGLESv2 -lmd
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Stress-NG 0.15.10, Test: Matrix Math (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 147576.41 (SE +/- 167.77, N = 3)
c6g.16xlarge Graviton2: 284713.63 (SE +/- 8.13, N = 3)
c7g.16xlarge Graviton3: 368671.39 (SE +/- 38.76, N = 3)
c7gn.16xlarge Graviton3E: 369258.89 (SE +/- 28.60, N = 3)
egeo-07: 55962.00 (SE +/- 6.88, N = 3)
m7g.16xlarge Graviton3: 368750.67 (SE +/- 53.44, N = 3)
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Stress-NG 0.15.10, Test: Memory Copying (Bogo Ops/s, More Is Better)
c6a.16xlarge AMD Zen 3: 8080.43 (SE +/- 0.46, N = 3)
c6g.16xlarge Graviton2: 11324.79 (SE +/- 1.12, N = 3)
c7g.16xlarge Graviton3: 20478.67 (SE +/- 4.65, N = 3)
c7gn.16xlarge Graviton3E: 20475.96 (SE +/- 1.36, N = 3)
egeo-07: 3209.83 (SE +/- 1.93, N = 3)
m7g.16xlarge Graviton3: 20484.24 (SE +/- 3.80, N = 3)
Additional options on some systems: -laio -lbsd -lEGL -lGLESv2 -lmd
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better)
c6a.16xlarge AMD Zen 3: 1193966667 (SE +/- 578311.72, N = 3)
c6g.16xlarge Graviton2: 765466667 (SE +/- 456520.66, N = 3)
c7g.16xlarge Graviton3: 1136133333 (SE +/- 33333.33, N = 3)
c7gn.16xlarge Graviton3E: 1136000000 (SE +/- 57735.03, N = 3)
egeo-07: 633483333 (SE +/- 571382.34, N = 3)
m7g.16xlarge Graviton3: 1136066667 (SE +/- 233333.33, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
  c6a.16xlarge AMD Zen 3: 1444266667 (SE +/- 9533333.33, N = 3)
  c6g.16xlarge Graviton2: 489270000 (SE +/- 23094.01, N = 3)
  c7g.16xlarge Graviton3: 721386667 (SE +/- 168358.08, N = 3)
  c7gn.16xlarge Graviton3E: 721380000 (SE +/- 150111.07, N = 3)
  egeo-07: 429596667 (SE +/- 846666.67, N = 3)
  m7g.16xlarge Graviton3: 721493333 (SE +/- 3333.33, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 - Test: Vector Math
Bogo Ops/s, More Is Better
  c6a.16xlarge AMD Zen 3: 221776.15 (SE +/- 100.78, N = 3)
  c6g.16xlarge Graviton2: 147886.14 (SE +/- 37.96, N = 3)
  c7g.16xlarge Graviton3: 217446.12 (SE +/- 20.95, N = 3)
  c7gn.16xlarge Graviton3E: 217567.10 (SE +/- 27.00, N = 3)
  egeo-07: 38515.89 (SE +/- 9.74, N = 3)
  m7g.16xlarge Graviton3: 217235.59 (SE +/- 47.94, N = 3)
1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz

Remhos

Remhos (REMap High-Order Solver) is a miniapp that solves the pure advection equations that are used to perform monotonic and conservative discontinuous field interpolation (remap) as part of the Eulerian phase in Arbitrary Lagrangian Eulerian (ALE) simulations. Learn more via the OpenBenchmarking.org test page.

Remhos 1.0 - Test: Sample Remap Example
Seconds, Fewer Is Better
  c6a.16xlarge AMD Zen 3: 22.10 (SE +/- 0.11, N = 3)
  c6g.16xlarge Graviton2: 20.74 (SE +/- 0.08, N = 3)
  c7g.16xlarge Graviton3: 14.12 (SE +/- 0.04, N = 3)
  c7gn.16xlarge Graviton3E: 14.08 (SE +/- 0.02, N = 3)
  egeo-07: 68.96 (SE +/- 0.44, N = 3)
  m7g.16xlarge Graviton3: 14.04 (SE +/- 0.04, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

Pennant 1.0.1 - Test: sedovbig
Hydro Cycle Time - Seconds, Fewer Is Better
  c6a.16xlarge AMD Zen 3: 16.530500 (SE +/- 0.036687, N = 3)
  c6g.16xlarge Graviton2: 16.480500 (SE +/- 0.018218, N = 3)
  c7g.16xlarge Graviton3: 9.422270 (SE +/- 0.011497, N = 3)
  c7gn.16xlarge Graviton3E: 9.340953 (SE +/- 0.003721, N = 3)
  egeo-07: 89.004900 (SE +/- 0.050055, N = 3)
  m7g.16xlarge Graviton3: 9.206490 (SE +/- 0.011347, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
  c6a.16xlarge AMD Zen 3: 836999300 (SE +/- 1055539.30, N = 3)
  c6g.16xlarge Graviton2: 1035586333 (SE +/- 140169.34, N = 3)
  c7g.16xlarge Graviton3: 1765277667 (SE +/- 192645.90, N = 3)
  c7gn.16xlarge Graviton3E: 1765966333 (SE +/- 488508.39, N = 3)
  egeo-07: 444456133 (SE +/- 394420.25, N = 3)
  m7g.16xlarge Graviton3: 1646761667 (SE +/- 103191.30, N = 3)
Flags noted on graph: -pthread
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 82.76 (SE +/- 0.08, N = 3)
  c6g.16xlarge Graviton2: 81.94 (SE +/- 0.03, N = 3)
  c7g.16xlarge Graviton3: 163.28 (SE +/- 0.05, N = 3)
  c7gn.16xlarge Graviton3E: 163.56 (SE +/- 0.03, N = 3)
  egeo-07: 34.37 (SE +/- 0.05, N = 3)
  m7g.16xlarge Graviton3: 162.96 (SE +/- 0.13, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3

Monte Carlo Simulations of Ionised Nebulae

Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Gas HII40
Seconds, Fewer Is Better
  c6a.16xlarge AMD Zen 3: 12.67 (SE +/- 0.02, N = 3)
  c6g.16xlarge Graviton2: 20.76 (SE +/- 0.17, N = 3)
  c7g.16xlarge Graviton3: 13.66 (SE +/- 0.03, N = 3)
  c7gn.16xlarge Graviton3E: 13.53 (SE +/- 0.03, N = 3)
  egeo-07: 26.92 (SE +/- 0.04, N = 3)
  m7g.16xlarge Graviton3: 13.58 (SE +/- 0.05, N = 3)
Flags noted on graph: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

LULESH 2.0.3
z/s, More Is Better
  c6a.16xlarge AMD Zen 3: 16708.26 (SE +/- 90.11, N = 3)
  c6g.16xlarge Graviton2: 17557.49 (SE +/- 38.55, N = 3)
  c7g.16xlarge Graviton3: 28708.66 (SE +/- 11.81, N = 3)
  c7gn.16xlarge Graviton3E: 28736.23 (SE +/- 12.73, N = 3)
  egeo-07: 5676.78 (SE +/- 5.42, N = 3)
  m7g.16xlarge Graviton3: 28296.38 (SE +/- 27.09, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

Pennant 1.0.1 - Test: leblancbig
Hydro Cycle Time - Seconds, Fewer Is Better
  c6a.16xlarge AMD Zen 3: 9.917565 (SE +/- 0.013289, N = 3)
  c6g.16xlarge Graviton2: 12.176830 (SE +/- 0.018924, N = 3)
  c7g.16xlarge Graviton3: 6.961345 (SE +/- 0.005468, N = 3)
  c7gn.16xlarge Graviton3E: 6.839998 (SE +/- 0.000467, N = 3)
  egeo-07: 43.458800 (SE +/- 0.025073, N = 3)
  m7g.16xlarge Graviton3: 6.720537 (SE +/- 0.000869, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction
Seconds, Fewer Is Better
  c6a.16xlarge AMD Zen 3: 7.01975288 (SE +/- 0.08686597, N = 15)
  c6g.16xlarge Graviton2: 5.63720735 (SE +/- 0.02560507, N = 3)
  c7g.16xlarge Graviton3: 3.14447999 (SE +/- 0.03233273, N = 3)
  c7gn.16xlarge Graviton3E: 3.11489828 (SE +/- 0.01738352, N = 3)
  egeo-07: 21.23927370 (SE +/- 0.12655798, N = 3)
  m7g.16xlarge Graviton3: 3.09871038 (SE +/- 0.02702838, N = 3)
Flags noted on graph: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
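Each result on this page is reported with "SE +/- x, N = n", where N is the number of runs. As a rough guide to reading those figures, the sketch below computes a standard error from a list of run times and expresses the reported SE as a percentage of the reported mean. The formula (sample standard deviation divided by the square root of N) is the textbook definition and is assumed here, not confirmed, to match what the Phoronix Test Suite reports. The hypothetical_runs list is invented purely to exercise the function. The reported pairs are copied from the Xcompact3d graph above.

```python
import math
import statistics


def standard_error(samples):
    """Textbook standard error of the mean: sample stdev / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


# Hypothetical run times in seconds (made up for illustration only).
hypothetical_runs = [3.10, 3.12, 3.14]
print(f"SE of hypothetical runs: {standard_error(hypothetical_runs):.6f}")

# (mean seconds, reported SE, N) copied from the Xcompact3d graph.
reported = {
    "c6a.16xlarge AMD Zen 3": (7.01975288, 0.08686597, 15),
    "c7gn.16xlarge Graviton3E": (3.11489828, 0.01738352, 3),
}
for name, (mean, se, n) in reported.items():
    print(f"{name:28s} N={n:2d}  SE = {relative_se_percent(mean, se):.2f}% of mean")
```

A larger N on one system, such as N = 15 for the Zen 3 instance here, likely indicates that extra runs were added because the first runs varied more than usual. That reading is an inference from the numbers, not something the result file states.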

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 20.87190 (SE +/- 0.17467, N = 3)
  c6g.16xlarge Graviton2: 20.62790 (SE +/- 0.01033, N = 3)
  c7g.16xlarge Graviton3: 40.82830 (SE +/- 0.02659, N = 3)
  c7gn.16xlarge Graviton3E: 40.97080 (SE +/- 0.02971, N = 3)
  egeo-07: 9.32379 (SE +/- 0.01615, N = 3)
  m7g.16xlarge Graviton3: 40.89230 (SE +/- 0.01031, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and allows selecting among the different NPB tests/problems and problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
Total Mop/s, More Is Better
  c6a.16xlarge AMD Zen 3: 20210.00 (SE +/- 14.83, N = 3)
  c6g.16xlarge Graviton2: 13103.62 (SE +/- 31.56, N = 3)
  c7g.16xlarge Graviton3: 21911.02 (SE +/- 283.23, N = 3)
  c7gn.16xlarge Graviton3E: 22155.36 (SE +/- 125.21, N = 3)
  egeo-07: 9661.91 (SE +/- 12.05, N = 3)
  m7g.16xlarge Graviton3: 21988.99 (SE +/- 130.18, N = 3)
Flags noted on graph: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. c6a.16xlarge AMD Zen 3: Open MPI 4.1.2
3. egeo-07: Open MPI 4.1.0

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1 - Test: OpenMP CFD Solver
Seconds, Fewer Is Better
  c6a.16xlarge AMD Zen 3: 9.342 (SE +/- 0.002, N = 3)
  c6g.16xlarge Graviton2: 6.051 (SE +/- 0.016, N = 3)
  c7g.16xlarge Graviton3: 4.442 (SE +/- 0.021, N = 3)
  c7gn.16xlarge Graviton3E: 4.429 (SE +/- 0.027, N = 3)
  egeo-07: 18.630 (SE +/- 0.208, N = 4)
  m7g.16xlarge Graviton3: 4.375 (SE +/- 0.011, N = 3)
Flags noted on graph: -O2 -lOpenCL; -O2 -lOpenCL; -O2 -lOpenCL; -O2 -lOpenCL; -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl; -O2 -lOpenCL
1. (CXX) g++ options:

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 43.59 (SE +/- 0.42, N = 6)
  c6g.16xlarge Graviton2: 41.98 (SE +/- 0.05, N = 3)
  c7g.16xlarge Graviton3: 81.01 (SE +/- 0.07, N = 3)
  c7gn.16xlarge Graviton3E: 81.17 (SE +/- 0.09, N = 3)
  egeo-07: 17.00 (SE +/- 0.10, N = 3)
  m7g.16xlarge Graviton3: 81.44 (SE +/- 0.01, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 41.59 (SE +/- 0.17, N = 3)
  c6g.16xlarge Graviton2: 40.11 (SE +/- 0.01, N = 3)
  c7g.16xlarge Graviton3: 77.77 (SE +/- 0.31, N = 3)
  c7gn.16xlarge Graviton3E: 78.17 (SE +/- 0.03, N = 3)
  egeo-07: 17.12 (SE +/- 0.10, N = 3)
  m7g.16xlarge Graviton3: 78.50 (SE +/- 0.02, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3
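The HeFFTe test names distinguish "c2c" (complex-to-complex) from "r2c" (real-to-complex) transforms. As background on why the two are benchmarked separately, the following one-dimensional pure-Python sketch shows that the spectrum of a real signal is conjugate-symmetric, so a real-to-complex transform only needs to produce N/2 + 1 outputs. It uses a naive O(N^2) DFT and does not call HeFFTe or FFTW. The signal values are arbitrary.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform, O(N^2), for illustration only."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


# Arbitrary real-valued input signal of even length.
signal = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0, -2.0, 1.5]
n = len(signal)

# "c2c" view: all N complex outputs.
full = dft(signal)

# "r2c" view: only the first N/2 + 1 outputs are independent.
half = full[: n // 2 + 1]

# Rebuild the remaining outputs from conjugate symmetry: X[j] = conj(X[N - j]).
rebuilt = half + [half[n - j].conjugate() for j in range(n // 2 + 1, n)]

max_err = max(abs(a - b) for a, b in zip(full, rebuilt))
print(f"outputs kept: {len(half)} of {n}, max reconstruction error: {max_err:.2e}")
```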

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and allows selecting among the different NPB tests/problems and problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s, More Is Better
  c6a.16xlarge AMD Zen 3: 45946.81 (SE +/- 167.32, N = 3)
  c6g.16xlarge Graviton2: 25671.29 (SE +/- 7.02, N = 3)
  c7g.16xlarge Graviton3: 49742.30 (SE +/- 32.94, N = 3)
  c7gn.16xlarge Graviton3E: 49860.68 (SE +/- 14.65, N = 3)
  egeo-07: 19477.23 (SE +/- 35.78, N = 3)
  m7g.16xlarge Graviton3: 50126.29 (SE +/- 24.30, N = 3)
Flags noted on graph: -pthread -ldl -lutil -lrt
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. c6a.16xlarge AMD Zen 3: Open MPI 4.1.2
3. egeo-07: Open MPI 4.1.0

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 102.65 (SE +/- 1.28, N = 3)
  c6g.16xlarge Graviton2: 92.40 (SE +/- 0.19, N = 3)
  c7g.16xlarge Graviton3: 162.01 (SE +/- 0.11, N = 3)
  c7gn.16xlarge Graviton3E: 162.36 (SE +/- 0.04, N = 3)
  egeo-07: 32.43 (SE +/- 0.04, N = 3)
  m7g.16xlarge Graviton3: 164.87 (SE +/- 0.27, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: Rhodopsin Protein
ns/day, More Is Better
  c6a.16xlarge AMD Zen 3: 19.563 (SE +/- 0.257, N = 12)
  c6g.16xlarge Graviton2: 25.950 (SE +/- 0.083, N = 3)
  c7g.16xlarge Graviton3: 37.412 (SE +/- 0.033, N = 3)
  c7gn.16xlarge Graviton3E: 37.482 (SE +/- 0.026, N = 3)
  egeo-07: 7.176 (SE +/- 0.024, N = 3)
  m7g.16xlarge Graviton3: 37.558 (SE +/- 0.057, N = 3)
Flags noted on graph: -lm; -pthread -lm
1. (CXX) g++ options: -O3 -ldl

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 48.94320 (SE +/- 0.84547, N = 15)
  c6g.16xlarge Graviton2: 32.74680 (SE +/- 0.08221, N = 3)
  c7g.16xlarge Graviton3: 55.10550 (SE +/- 0.14885, N = 3)
  c7gn.16xlarge Graviton3E: 55.10380 (SE +/- 0.32202, N = 3)
  egeo-07: 9.62296 (SE +/- 0.03519, N = 3)
  m7g.16xlarge Graviton3: 57.15030 (SE +/- 0.28294, N = 3)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 158.86 (SE +/- 1.94, N = 3)
  c6g.16xlarge Graviton2: 209.50 (SE +/- 0.64, N = 3)
  c7g.16xlarge Graviton3: 301.42 (SE +/- 0.56, N = 3)
  c7gn.16xlarge Graviton3E: 300.40 (SE +/- 1.62, N = 3)
  egeo-07: 58.20 (SE +/- 0.85, N = 15)
  m7g.16xlarge Graviton3: 306.54 (SE +/- 0.83, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 98.70 (SE +/- 1.25, N = 14)
  c6g.16xlarge Graviton2: 135.36 (SE +/- 0.35, N = 3)
  c7g.16xlarge Graviton3: 184.03 (SE +/- 0.47, N = 3)
  c7gn.16xlarge Graviton3E: 184.11 (SE +/- 0.20, N = 3)
  egeo-07: 26.74 (SE +/- 0.04, N = 3)
  m7g.16xlarge Graviton3: 186.36 (SE +/- 0.27, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s, More Is Better
  c6a.16xlarge AMD Zen 3: 86.37 (SE +/- 1.46, N = 12)
  c6g.16xlarge Graviton2: 81.45 (SE +/- 0.61, N = 3)
  c7g.16xlarge Graviton3: 133.51 (SE +/- 0.47, N = 3)
  c7gn.16xlarge Graviton3E: 133.42 (SE +/- 0.04, N = 3)
  egeo-07: 23.02 (SE +/- 0.06, N = 3)
  m7g.16xlarge Graviton3: 138.01 (SE +/- 0.12, N = 3)
Flags noted on graph: -pthread
1. (CXX) g++ options: -O3
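The result viewer offers a geometric mean option for summarising several tests at once. As an illustration of that kind of summary, this sketch computes a per-system geometric mean over the four HeFFTe results with X Y Z: 128 listed above (GFLOP/s, more is better). Restricting the mean to these four tests is an arbitrary choice for the example. It is not the overall geometric mean of the result file, which would cover all 87 results.

```python
from statistics import geometric_mean

systems = [
    "c6a.16xlarge AMD Zen 3",
    "c6g.16xlarge Graviton2",
    "c7g.16xlarge Graviton3",
    "c7gn.16xlarge Graviton3E",
    "egeo-07",
    "m7g.16xlarge Graviton3",
]

# GFLOP/s per test, in the same system order as the list above.
heffte_128 = {
    "c2c double": [48.94320, 32.74680, 55.10550, 55.10380, 9.62296, 57.15030],
    "r2c float": [158.86, 209.50, 301.42, 300.40, 58.20, 306.54],
    "c2c float": [98.70, 135.36, 184.03, 184.11, 26.74, 186.36],
    "r2c double": [86.37, 81.45, 133.51, 133.42, 23.02, 138.01],
}

# Collect each system's scores across the four tests, then average them.
geo = {
    name: geometric_mean([scores[i] for scores in heffte_128.values()])
    for i, name in enumerate(systems)
}

for name, value in sorted(geo.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:28s} {value:8.2f} GFLOP/s (geometric mean)")
```

A geometric mean is preferred over an arithmetic mean here because the four tests run at very different absolute throughputs, and the geometric mean keeps the largest one from dominating.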

87 Results Shown

NWChem
Graph500:
  26:
    sssp max_TEPS
    sssp median_TEPS
    bfs max_TEPS
    bfs median_TEPS
LAMMPS Molecular Dynamics Simulator
Timed Gem5 Compilation
BRL-CAD
Stockfish
LeelaChessZero
Timed Node.js Compilation
LeelaChessZero
QMCPACK:
  FeCO6_b3lyp_gms:
    Total Execution Time - Seconds
    Total Execution Time - Seconds
Timed Godot Game Engine Compilation
Monte Carlo Simulations of Ionised Nebulae
OpenSSL:
  SHA256
  AES-128-GCM
  ChaCha20
  ChaCha20-Poly1305
  AES-256-GCM
  SHA512
QMCPACK
Stress-NG
Laghos
nekRS
ACES DGEMM
NAS Parallel Benchmarks
GPAW
nekRS
NAS Parallel Benchmarks
nginx:
  1000
  500
Stress-NG
Laghos
Rodinia
HeFFTe - Highly Efficient FFT for Exascale
GROMACS
NAS Parallel Benchmarks
OpenSSL:
  RSA4096:
    verify/s
    sign/s
Coremark
Rodinia
Stress-NG
QMCPACK
srsRAN Project
Stress-NG
srsRAN Project:
  Downlink Processor Benchmark
  PUSCH Processor Benchmark, Throughput Thread
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - FFTW - double - 512
  c2c - FFTW - float - 512
Kripke
7-Zip Compression:
  Decompression Rating
  Compression Rating
Xcompact3d Incompact3d
Liquid-DSP:
  64 - 256 - 512
  32 - 256 - 512
Stress-NG:
  Fused Multiply-Add
  Vector Shuffle
Liquid-DSP:
  64 - 256 - 32
  64 - 256 - 57
Stress-NG:
  Matrix 3D Math
  Matrix Math
  Memory Copying
Liquid-DSP:
  32 - 256 - 32
  32 - 256 - 57
Stress-NG
Remhos
Pennant
Algebraic Multi-Grid Benchmark
HeFFTe - Highly Efficient FFT for Exascale
Monte Carlo Simulations of Ionised Nebulae
LULESH
Pennant
Xcompact3d Incompact3d
HeFFTe - Highly Efficient FFT for Exascale
NAS Parallel Benchmarks
Rodinia
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - float - 256
  r2c - FFTW - double - 256
NAS Parallel Benchmarks
HeFFTe - Highly Efficient FFT for Exascale
LAMMPS Molecular Dynamics Simulator
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - double - 128
  r2c - FFTW - float - 128
  c2c - FFTW - float - 128
  r2c - FFTW - double - 128