9684x-march

2 x AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2403279-NE-9684XMARC57&grs&sor.

9684x-march: system configuration (result identifiers: PRE, a)

Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10
Kernel: 6.5.0-25-generic (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 640x480

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa10113e

Python Details
- Python 3.11.6

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Mitigation of Safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

9684x-march: result overview

Test                                      PRE           a
tensorflow: CPU - 32 - ResNet-50          65.88         60.25
pytorch: CPU - 32 - ResNet-152            8.72          9.34
pytorch: CPU - 1 - ResNet-152             9.97          10.58
tensorflow: CPU - 32 - GoogLeNet          185.16        176.36
tensorflow: CPU - 1 - GoogLeNet           12.58         13.20
tensorflow: CPU - 16 - ResNet-50          39.68         41.26
tensorflow: CPU - 1 - ResNet-50           4.05          3.9
pytorch: CPU - 64 - ResNet-152            9.21          8.91
pytorch: CPU - 32 - ResNet-50             20.19         20.84
tensorflow: CPU - 256 - AlexNet           1652.23       1604.52
tensorflow: CPU - 32 - AlexNet            424.06        436.25
pytorch: CPU - 16 - ResNet-50             20.93         21.53
pytorch: CPU - 512 - ResNet-50            20.43         21.01
rocksdb: Read While Writing               27130363      26406662
pytorch: CPU - 1 - Efficientnet_v2_l      6.29          6.45
pytorch: CPU - 64 - ResNet-50             21.59         21.08
tensorflow: CPU - 16 - AlexNet            242.29        247.55
tensorflow: CPU - 64 - AlexNet            765.55        749.46
pytorch: CPU - 256 - ResNet-50            21.20         20.77
tensorflow: CPU - 512 - GoogLeNet         493.31        484.02
pytorch: CPU - 256 - ResNet-152           8.92          9.09
tensorflow: CPU - 1 - AlexNet             21.16         20.78
pytorch: CPU - 256 - Efficientnet_v2_l    2.29          2.33
tensorflow: CPU - 512 - AlexNet           1980.51       2010.56
pytorch: CPU - 512 - ResNet-152           9.47          9.33
tensorflow: CPU - 16 - GoogLeNet          112.64        114.26
tensorflow: CPU - 64 - ResNet-50          87.72         88.93
blender: Fishy Cat - CPU-Only             9.96          9.85
rocksdb: Update Rand                      421266        425687
pytorch: CPU - 16 - ResNet-152            8.93          9.01
pytorch: CPU - 512 - Efficientnet_v2_l    2.31          2.33
pytorch: CPU - 32 - Efficientnet_v2_l     2.33          2.31
tensorflow: CPU - 256 - ResNet-50         119.83        118.88
rocksdb: Read Rand Write Rand             3619142       3643263
build-mesa: Time To Compile               14.66         14.756
pytorch: CPU - 1 - ResNet-50              23.06         23.20
tensorflow: CPU - 64 - GoogLeNet          275.34        273.68
brl-cad: VGR Performance Metric           5956612       5927564
blender: Pabellon Barcelona - CPU-Only    22.99         23.1
pytorch: CPU - 64 - Efficientnet_v2_l     2.32          2.31
blender: Barbershop - CPU-Only            67.38         67.66
blender: Junkshop - CPU-Only              11.4          11.44
rocksdb: Rand Read                        1105306233    1108892776
blender: Classroom - CPU-Only             18.03         18.08
tensorflow: CPU - 256 - GoogLeNet         400.03        399.46
rocksdb: Overwrite                        421049        421616
tensorflow: CPU - 512 - ResNet-50         140.59        140.49
blender: BMW27 - CPU-Only                 7.55          7.55
pytorch: CPU - 16 - Efficientnet_v2_l     2.33          2.33
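The two result identifiers in the overview above (PRE and a) can be compared as relative differences. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `percent_change` and its `higher_is_better` parameter are hypothetical, and the three sample rows are copied from this result file.

```python
def percent_change(pre: float, a: float, higher_is_better: bool = True) -> float:
    """Return how much run `a` differs from run `PRE`, in percent of PRE.

    A positive value means `a` is better than `PRE`. For metrics where
    fewer is better (e.g. seconds), the sign is flipped accordingly.
    """
    if pre == 0:
        raise ValueError("PRE result must be non-zero")
    delta = (a - pre) / pre * 100.0
    return delta if higher_is_better else -delta


# Sample rows taken from this result file: (test, PRE, a, higher_is_better)
samples = [
    ("tensorflow: CPU - 32 - ResNet-50", 65.88, 60.25, True),   # images/sec
    ("pytorch: CPU - 32 - ResNet-152", 8.72, 9.34, True),       # batches/sec
    ("blender: Fishy Cat - CPU-Only", 9.96, 9.85, False),       # seconds
]

for name, pre, a, hib in samples:
    print(f"{name}: {percent_change(pre, a, hib):+.2f}%")
```

Many of the differences in this file are small, so the reported standard errors (where available) should be kept in mind before reading a difference as significant.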

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.16.1, images/sec (more is better):
PRE: 65.88
a: 60.25

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.2.1, batches/sec (more is better):
a: 9.34 (MIN: 4.74 / MAX: 9.74)
PRE: 8.72 (MIN: 5.23 / MAX: 9.06)
SE +/- 0.08, N = 3

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.2.1, batches/sec (more is better):
a: 10.58 (MIN: 4.55 / MAX: 11.67)
PRE: 9.97 (MIN: 4.85 / MAX: 10.69)
SE +/- 0.10, N = 15

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 185.16
a: 176.36

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec (more is better):
a: 13.20
PRE: 12.58
SE +/- 0.14, N = 15

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.16.1, images/sec (more is better):
a: 41.26
PRE: 39.68

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1, images/sec (more is better):
PRE: 4.05
a: 3.90

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.2.1, batches/sec (more is better):
PRE: 9.21 (MIN: 4.8 / MAX: 9.43)
a: 8.91 (MIN: 4.5 / MAX: 9.7)
SE +/- 0.09, N = 12

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.2.1, batches/sec (more is better):
a: 20.84 (MIN: 11.24 / MAX: 22.33)
PRE: 20.19 (MIN: 11.95 / MAX: 21.04)
SE +/- 0.16, N = 15

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 1652.23
a: 1604.52

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.16.1, images/sec (more is better):
a: 436.25
PRE: 424.06
SE +/- 6.62, N = 15

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.2.1, batches/sec (more is better):
a: 21.53 (MIN: 12.64 / MAX: 22.28)
PRE: 20.93 (MIN: 12.91 / MAX: 21.51)
SE +/- 0.16, N = 3

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.2.1, batches/sec (more is better):
a: 21.01 (MIN: 11.92 / MAX: 22.65)
PRE: 20.43 (MIN: 13.46 / MAX: 21.1)
SE +/- 0.14, N = 15

RocksDB

Test: Read While Writing

RocksDB 9.0, Op/s (more is better):
PRE: 27130363
a: 26406662
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec (more is better):
a: 6.45 (MIN: 3.05 / MAX: 6.85)
PRE: 6.29 (MIN: 3.09 / MAX: 6.44)
SE +/- 0.09, N = 3

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.2.1, batches/sec (more is better):
PRE: 21.59 (MIN: 14.02 / MAX: 22.21)
a: 21.08 (MIN: 13.2 / MAX: 22.07)
SE +/- 0.23, N = 3

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.16.1, images/sec (more is better):
a: 247.55
PRE: 242.29
SE +/- 2.30, N = 15

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 765.55
a: 749.46
SE +/- 5.39, N = 15

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.2.1, batches/sec (more is better):
PRE: 21.20 (MIN: 12.68 / MAX: 21.88)
a: 20.77 (MIN: 12.97 / MAX: 21.67)
SE +/- 0.10, N = 3

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 493.31
a: 484.02

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.2.1, batches/sec (more is better):
a: 9.09 (MIN: 4.84 / MAX: 10.03)
PRE: 8.92 (MIN: 5.04 / MAX: 9.16)
SE +/- 0.10, N = 12

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 21.16
a: 20.78
SE +/- 0.16, N = 15

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec (more is better):
a: 2.33 (MIN: 1.59 / MAX: 2.78)
PRE: 2.29 (MIN: 1.79 / MAX: 2.72)
SE +/- 0.01, N = 3

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1, images/sec (more is better):
a: 2010.56
PRE: 1980.51

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.2.1, batches/sec (more is better):
PRE: 9.47 (MIN: 5.17 / MAX: 9.87)
a: 9.33 (MIN: 4.69 / MAX: 9.66)
SE +/- 0.10, N = 3

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec (more is better):
a: 114.26
PRE: 112.64

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1, images/sec (more is better):
a: 88.93
PRE: 87.72

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.1, seconds (fewer is better):
a: 9.85
PRE: 9.96

RocksDB

Test: Update Random

RocksDB 9.0, Op/s (more is better):
a: 425687
PRE: 421266
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.2.1, batches/sec (more is better):
a: 9.01 (MIN: 4.81 / MAX: 9.31)
PRE: 8.93 (MIN: 8.8 / MAX: 9.04)
SE +/- 0.09, N = 3

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec (more is better):
a: 2.33 (MIN: 1.58 / MAX: 2.83)
PRE: 2.31 (MIN: 1.7 / MAX: 2.84)
SE +/- 0.01, N = 3

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec (more is better):
PRE: 2.33 (MIN: 1.78 / MAX: 2.8)
a: 2.31 (MIN: 1.88 / MAX: 2.74)
SE +/- 0.01, N = 3

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1, images/sec (more is better):
PRE: 119.83
a: 118.88

RocksDB

Test: Read Random Write Random

RocksDB 9.0, Op/s (more is better):
a: 3643263
PRE: 3619142
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Timed Mesa Compilation

Time To Compile

Timed Mesa Compilation 24.0, seconds (fewer is better):
PRE: 14.66
a: 14.76
SE +/- 0.04, N = 3

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.2.1, batches/sec (more is better):
a: 23.20 (MIN: 12.21 / MAX: 25.13)
PRE: 23.06 (MIN: 12.95 / MAX: 24.52)
SE +/- 0.20, N = 15

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 275.34
a: 273.68

BRL-CAD

VGR Performance Metric

BRL-CAD 7.38.2, VGR Performance Metric (more is better):
PRE: 5956612
a: 5927564
(CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.1, seconds (fewer is better):
PRE: 22.99
a: 23.10

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec (more is better):
PRE: 2.32 (MIN: 1.9 / MAX: 2.75)
a: 2.31 (MIN: 1.53 / MAX: 2.83)
SE +/- 0.01, N = 3

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.1, seconds (fewer is better):
PRE: 67.38
a: 67.66

Blender

Blend File: Junkshop - Compute: CPU-Only

Blender 4.1, seconds (fewer is better):
PRE: 11.40
a: 11.44

RocksDB

Test: Random Read

RocksDB 9.0, Op/s (more is better):
a: 1108892776
PRE: 1105306233
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.1, seconds (fewer is better):
PRE: 18.03
a: 18.08

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec (more is better):
PRE: 400.03
a: 399.46

RocksDB

Test: Overwrite

RocksDB 9.0, Op/s (more is better):
a: 421616
PRE: 421049
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1, images/sec (more is better):
PRE: 140.59
a: 140.49

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.1, seconds (fewer is better):
a: 7.55
PRE: 7.55

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec (more is better):
a: 2.33 (MIN: 1.77 / MAX: 2.9)
PRE: 2.33 (MIN: 1.76 / MAX: 2.72)
SE +/- 0.01, N = 3


Phoronix Test Suite v10.8.5