dddd AMD Ryzen 7 7840HS testing with a Framework Laptop 16 (AMD Ryzen 7040) FRANMZCP07 (03.01 BIOS) and AMD Radeon 780M 512MB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411045-NE-DDDD1454570.
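To reproduce this comparison locally, the Phoronix Test Suite can typically fetch a public OpenBenchmarking.org result by its ID and run the same test selection against it for side-by-side comparison (assuming phoronix-test-suite is installed and the result ID above is still available):

    phoronix-test-suite benchmark 2411045-NE-DDDD1454570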
dddd - System Details (configurations tested: a, aa, b, d)

Processor: AMD Ryzen 7 7840HS @ 5.29GHz (8 Cores / 16 Threads)
Motherboard: Framework Laptop 16 (AMD Ryzen 7040) FRANMZCP07 (03.01 BIOS)
Chipset: AMD Device 14e8
Memory: 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-B
Disk: 512GB Western Digital PC SN810 SDCPNRY-512G
Graphics: AMD Radeon 780M 512MB
Audio: AMD Navi 31 HDMI/DP
Network: MEDIATEK MT7922 802.11ax PCI
OS: Ubuntu 24.04
Kernel: 6.8.0-39-generic (x86_64) / 6.8.0-40-generic (x86_64) (varies by configuration)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406200600.0ac0fb~oibaf~n (git-0ac0fbc 2024-06-20 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 2560x1600

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xa704103 - ACPI Profile: balanced
Graphics Details - BAR1 / Visible vRAM Size: 512 MB
Java Details - OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)
Python Details - Python 3.12.3
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Vulnerable: Safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
dddd - Results Overview
[Flattened summary table of all benchmark results for configurations a, aa, b, and d, covering Unvanquished, ParaView, CP2K, NAMD, Epoch, WarpX, BYTE Unix Benchmark, SVT-AV1, Stockfish, Primesieve, oneDNN, Opus Codec Encoding, ASTC Encoder, LiteRT, XNNPACK, Apache Cassandra, ONNX Runtime, and Whisperfile. The same per-test values are presented in the individual result tables below.]
Unvanquished 0.55 (Frames Per Second, More Is Better)
Resolution - Effects Quality    a       aa      b       d
1920 x 1080 - High              460.1   473.3   471.6   472.3
1920 x 1200 - High              437.5   439.3   438.0   436.4
2560 x 1440 - High              288.6   289.6   289.8   288.7
2560 x 1600 - High              266.5   266.7   266.4   266.9
1920 x 1080 - Ultra             418.2   417.3   416.3   422.2
1920 x 1200 - Ultra             385.3   384.1   384.2   380.8
2560 x 1440 - Ultra             253.4   254.7   253.9   252.4
2560 x 1600 - Ultra             233.9   234.6   234.0   235.0
1920 x 1080 - Medium            542.5   539.8   542.4   540.7
1920 x 1200 - Medium            511.8   512.5   517.5   504.8
2560 x 1440 - Medium            371.2   373.2   373.5   373.4
2560 x 1600 - Medium            336.0   336.2   336.6   337.0
ParaView 5.13 - Test: Many Spheres - Frames: 600 (configuration a only; More Is Better for both metrics)
Resolution     Frames / Sec   MiPolys / Sec
1920 x 1080    0.18           17.86
1920 x 1200    0.18           17.80
2560 x 1440    0.17           17.38
CP2K Molecular Dynamics 2024.3 (Seconds, Fewer Is Better)
Input           aa        b         d
H20-64          99.80     96.29     100.10
H20-256         1002.99   1010.62   1052.13
Fayalite-FIST   147.31    154.83    151.94
1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
NAMD 3.0 (ns/day, More Is Better)
Input                        aa        b         d
ATPase with 327,506 Atoms    1.32956   1.27220   1.28715
STMV with 1,066,628 Atoms    0.38378   0.37007   0.37644
Epoch 4.19.4 - Epoch3D Deck: Cone (Seconds, Fewer Is Better)
aa: 546.93, b: 554.79, d: 549.16
1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
WarpX 24.10 (Seconds, Fewer Is Better)
Input                 aa      b       d
Uniform Plasma        21.18   21.70   21.71
Plasma Acceleration   55.37   56.63   60.22
1. (CXX) g++ options: -O3 -lm
BYTE Unix Benchmark 5.1.3-git (More Is Better)
Computational Test   Unit    aa            b             d
Pipe                 LPS     16671392.5    18027399.0    16434910.2
Dhrystone 2          LPS     719195925.3   774189636.1   709502408.8
System Call          LPS     16049531.4    17367015.3    15836969.9
Whetstone Double     MWIPS   155612.4      137723.9      125618.1
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
SVT-AV1 2.3 (Frames Per Second, More Is Better)
Encoder Mode - Input          aa       b        d
Preset 3 - Bosphorus 4K       4.169    4.022    3.748
Preset 5 - Bosphorus 4K       15.06    14.52    13.96
Preset 8 - Bosphorus 4K       49.91    49.52    46.58
Preset 13 - Bosphorus 4K      121.14   119.37   114.05
Preset 3 - Bosphorus 1080p    14.29    13.80    12.77
Preset 5 - Bosphorus 1080p    49.16    47.19    44.33
Preset 8 - Bosphorus 1080p    163.33   160.58   149.78
Preset 13 - Bosphorus 1080p   497.58   489.20   460.22
Preset 3 - Beauty 4K 10-bit   0.705    0.674    0.619
Preset 5 - Beauty 4K 10-bit   3.119    2.918    2.659
Preset 8 - Beauty 4K 10-bit   5.343    4.909    4.484
Preset 13 - Beauty 4K 10-bit  7.579    6.830    6.224
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stockfish 17 Chess Benchmark (Nodes Per Second, More Is Better)
aa: 21157157, b: 20057178, d: 20267956
1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
Primesieve 12.5 (Seconds, Fewer Is Better)
Length   aa       b        d
1e12     15.38    15.34    16.75
1e13     188.66   191.96   206.12
1. (CXX) g++ options: -O3
oneDNN 3.6 - Engine: CPU (ms, Fewer Is Better)
Harness                              aa                      b                       d
IP Shapes 1D                         2.51714 (MIN: 2.4)      2.54144 (MIN: 2.43)     2.79418 (MIN: 2.66)
IP Shapes 3D                         4.57722 (MIN: 4.47)     4.66200 (MIN: 4.52)     4.66831 (MIN: 4.52)
Convolution Batch Shapes Auto        13.36 (MIN: 13.14)      13.33 (MIN: 13.14)      13.33 (MIN: 13.11)
Deconvolution Batch shapes_1d        6.10289 (MIN: 4.53)     6.13915 (MIN: 4.76)     6.58308 (MIN: 5.22)
Deconvolution Batch shapes_3d        4.86939 (MIN: 4.64)     4.92932 (MIN: 4.92)     5.72961 (MIN: 5.66)
Recurrent Neural Network Training    3155.62 (MIN: 3145.17)  3104.92 (MIN: 3084.26)  3379.99 (MIN: 3369.93)
Recurrent Neural Network Inference   1770.50 (MIN: 1763.22)  1727.88 (MIN: 1720.93)  1844.16 (MIN: 1834.87)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Opus Codec Encoding 1.5.2 - WAV To Opus Encode (Seconds, Fewer Is Better)
aa: 22.17, b: 26.28, d: 28.91
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
ASTC Encoder 5.0 (MT/s, More Is Better)
Preset          aa       b        d
Fast            170.77   156.30   142.70
Medium          67.67    62.17    56.73
Thorough        8.8155   8.1192   7.4152
Exhaustive      0.7361   0.6837   0.6236
Very Thorough   1.2075   1.1154   1.0170
1. (CXX) g++ options: -O3 -flto -pthread
LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
Model                             aa        b          d
DeepLab V3                        2125.78   2128.84    2378.54
SqueezeNet                        2288.03   2286.41    2504.03
Inception V4                      33358.0   33308.8    36136.2
NASNet Mobile                     7642.71   7816.00    8630.17
Mobilenet Float                   1527.48   1491.33    1622.81
Mobilenet Quant                   1249.46   1276.41    1458.15
Inception ResNet V2               28571.2   28254.4    30683.2
Quantized COCO SSD MobileNet v1   2016.71   22695.20   2338.08
XNNPACK b7b048 (us, Fewer Is Better)
Model                  aa     b       d
FP32MobileNetV1        1536   13914   1617
FP32MobileNetV2        1057   6981    1130
FP32MobileNetV3Large   1255   9205    1319
FP32MobileNetV3Small   465    1998    515
FP16MobileNetV1        1868   6239    2169
FP16MobileNetV2        1337   10687   1533
FP16MobileNetV3Large   1429   11295   1602
FP16MobileNetV3Small   581    606     654
QS8MobileNetV2         533    526     581
1. (CXX) g++ options: -O3 -lrt -lm
Apache Cassandra 5.0 - Test: Writes (Op/s, More Is Better)
aa: 115862, b: 108743, d: 100972
ONNX Runtime 1.19 - Device: CPU (Inferences Per Second: More Is Better; Inference Time Cost ms: Fewer Is Better)
Model - Executor                        Inferences Per Second (aa / b / d)   Inference Time Cost ms (aa / b / d)
GPT-2 - Parallel                        106.36 / 104.50 / 102.47             9.39903 / 9.56576 / 9.75407
GPT-2 - Standard                        122.65 / 124.72 / 121.50             8.14774 / 8.01188 / 8.22476
yolov4 - Parallel                       6.61722 / 6.63616 / 5.74356          151.12 / 150.69 / 174.11
yolov4 - Standard                       9.36551 / 5.90138 / 8.72880          106.77 / 169.45 / 114.56
ZFNet-512 - Parallel                    62.66 / 62.32 / 56.74                15.96 / 16.05 / 17.62
ZFNet-512 - Standard                    86.77 / 60.47 / 57.42                11.52 / 16.54 / 17.41
T5 Encoder - Parallel                   140.29 / 137.13 / 137.21             7.12656 / 7.29084 / 7.28642
T5 Encoder - Standard                   142.37 / 148.56 / 147.26             7.02169 / 6.72951 / 6.78853
bertsquad-12 - Parallel                 9.11603 / 8.48061 / 8.13821          109.70 / 117.91 / 122.88
bertsquad-12 - Standard                 15.78 / 15.70 / 12.63                63.37 / 63.71 / 79.18
CaffeNet 12-int8 - Parallel             456.27 / 456.34 / 434.39             2.19001 / 2.18961 / 2.30027
CaffeNet 12-int8 - Standard             498.19 / 502.55 / 479.39             2.00621 / 1.98870 / 2.08476
fcn-resnet101-11 - Parallel             0.885190 / 0.763448 / 0.809516       1129.70 / 1309.84 / 1235.30
fcn-resnet101-11 - Standard             1.59037 / 1.34498 / 0.79605          628.78 / 743.50 / 1256.20
ArcFace ResNet-100 - Parallel           20.96 / 20.26 / 18.14                47.72 / 49.36 / 55.13
ArcFace ResNet-100 - Standard           20.59 / 32.40 / 30.30                48.56 / 30.86 / 33.00
ResNet50 v1-12-int8 - Parallel          231.74 / 227.83 / 208.59             4.31466 / 4.38853 / 4.79323
ResNet50 v1-12-int8 - Standard          243.29 / 225.30 / 209.12             4.10903 / 4.43720 / 4.78049
super-resolution-10 - Parallel          71.34 / 68.13 / 61.25                14.02 / 14.68 / 16.32
super-resolution-10 - Standard          113.98 / 111.03 / 101.97             8.77190 / 9.00540 / 9.80512
ResNet101_DUC_HDC-12 - Parallel         0.466728 / 0.404181 / 0.356134       2142.57 / 2474.13 / 2807.93
ResNet101_DUC_HDC-12 - Standard         0.729925 / 0.381099 / 0.611475       1370.00 / 2623.99 / 1635.39
Faster R-CNN R-50-FPN-int8 - Parallel   40.20 / 35.46 / 33.28                24.87 / 28.20 / 30.04
Faster R-CNN R-50-FPN-int8 - Standard   52.40 / 47.84 / 32.91                19.08 / 20.90 / 30.38
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
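The two ONNX Runtime metrics reported for each run are near-reciprocals: the inference time cost in milliseconds is roughly 1000 divided by the inferences-per-second figure from the same run. As a quick sanity check, the GPT-2 - Parallel result for configuration aa gives 1000 / 106.36 ≈ 9.40 ms, in line with the reported 9.39903 ms.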
Whisperfile 20Aug24 (Seconds, Fewer Is Better)
Model Size   aa       b        d
Tiny         55.19    58.16    59.90
Small        229.33   231.06   243.97
Medium       655.14   657.64   690.37
Phoronix Test Suite v10.8.5