new mtl framework

Intel Core Ultra 7 155H testing with a Framework Laptop 13 (Intel Core Ultra 1) FRANMECP05 (03.01 BIOS) and Intel Arc MTL 8GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2408139-NE-NEWMTLFRA01&grw .
new mtl framework

System configuration (runs a, b, c)
Processor: Intel Core Ultra 7 155H @ 4.50GHz (16 Cores / 22 Threads)
Motherboard: Framework Laptop 13 (Intel Core Ultra 1) FRANMECP05 (03.01 BIOS)
Chipset: Intel Device 7e7f
Memory: 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-SFW
Disk: 512GB Western Digital WD PC SN740 SDDPNQD-512G
Graphics: Intel Arc MTL 8GB
Audio: Realtek ALC285
Network: MEDIATEK MT7922 802.11ax PCI
OS: Ubuntu 24.04
Kernel: 6.10.0-061000rc4daily20240621-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406250600.5cb15a~oibaf~n (git-5cb15a6 2024-06-25 noble-oibaf-ppa)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x1e
- Thermald 2.5.6

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S
- srbds: Not affected
- tsx_async_abort: Not affected
new mtl framework

Results overview
Test                                     a           b           c
etcpak: Multi-Threaded - ETC2            220.524     223.28      222.068
lczero: BLAS                             26          27          26
lczero: Eigen                            25          24          25
mnn: nasnet                              23.726      23.633      23.841
mnn: mobilenetV3                         3.111       3.157       3.772
povray: Trace Time                       57.761      58.398      57.951
mnn: squeezenetv1.1                      5.861       5.881       5.921
x265: Bosphorus 4K                       12.1        12.08       12.13
mnn: SqueezeNetV1.0                      9.814       9.669       9.9
y-cruncher: 500M                         31.851      31.872      31.502
mnn: MobileNetV2_224                     6.692       6.701       6.757
stockfish: Chess Benchmark               10688485    10530329    11305323
y-cruncher: 1B                           74.994      75.403      75.633
x265: Bosphorus 1080p                    51.06       49.51       50.61
mnn: resnet-v2-50                        48.546      48.746      48.796
mnn: mobilenet-v1-1.0                    6.216       6.253       6.152
mnn: inception-v3                        69.547      69.196      69.412
gromacs: water_GMX50_bare                0.579       0.581       0.581
xnnpack: FP32MobileNetV2                 4388        4077        4397
xnnpack: FP32MobileNetV3Large            6627        5697        5356
xnnpack: FP32MobileNetV3Small            2246        2362        2160
xnnpack: FP16MobileNetV2                 4251        4377        5409
xnnpack: FP16MobileNetV3Large            5247        4817        4784
xnnpack: FP16MobileNetV3Small            2385        2523        2277
xnnpack: QU8MobileNetV2                  3737        3554        4238
xnnpack: QU8MobileNetV3Large             3605        3747        4439
xnnpack: QU8MobileNetV3Small             2132        2264        1896
mt-dgemm: Sustained Floating-Point Rate  52.027491   52.067546   52.094087
build2: Time To Compile                  297.757     298.584     299.953
simdjson: Kostya                         4.2         4.21        4.26
simdjson: TopTweet                       6.81        6.81        6.79
simdjson: LargeRand                      1.46        1.47        1.46
simdjson: PartialTweets                  6.62        6.62        6.62
simdjson: DistinctUserID                 6.81        6.83        6.83
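Because a, b and c are three runs on the same system, the differences between them show run-to-run variation rather than a configuration change. The following is a minimal sketch of one way to quantify that, using a handful of values copied from the a/b/c columns of the overview above. The helper names `spread_percent` and `rank_by_spread` are hypothetical and not part of the Phoronix Test Suite, and the metric (range divided by mean) is an illustrative choice, not something the export itself reports.

```python
# Values copied from the a/b/c columns of the results overview.
# Only a small subset of tests is included for illustration.
results = {
    "etcpak: Multi-Threaded - ETC2": (220.524, 223.28, 222.068),
    "mnn: mobilenetV3": (3.111, 3.157, 3.772),
    "stockfish: Chess Benchmark": (10688485, 10530329, 11305323),
    "xnnpack: FP16MobileNetV2": (4251, 4377, 5409),
    "simdjson: PartialTweets": (6.62, 6.62, 6.62),
}


def spread_percent(values):
    """Return (max - min) / mean of the runs, as a percentage."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


def rank_by_spread(table):
    """Return (test name, spread %) pairs, noisiest test first."""
    pairs = [(name, spread_percent(vals)) for name, vals in table.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, pct in rank_by_spread(results):
        print(f"{name:32s} {pct:6.2f}%")
```

The spread is unit-free, so it can be compared across tests regardless of whether a test reports time, throughput, or nodes per second. With only three runs it is a rough indicator at best.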
Etcpak 2.0
Benchmark: Multi-Threaded - Configuration: ETC2
Mpx/s, More Is Better
a: 220.52
b: 223.28
c: 222.07
(CXX) g++ options: -flto -pthread

LeelaChessZero 0.31.1
Backend: BLAS
Nodes Per Second, More Is Better
a: 26
b: 27
c: 26
(CXX) g++ options: -flto -pthread

LeelaChessZero 0.31.1
Backend: Eigen
Nodes Per Second, More Is Better
a: 25
b: 24
c: 25
(CXX) g++ options: -flto -pthread

Mobile Neural Network 2.9.b11b7037d
Model: nasnet
ms, Fewer Is Better
a: 23.73 (MIN: 20 / MAX: 52.87)
b: 23.63 (MIN: 19.46 / MAX: 56.99)
c: 23.84 (MIN: 19.99 / MAX: 46.4)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network 2.9.b11b7037d
Model: mobilenetV3
ms, Fewer Is Better
a: 3.111 (MIN: 2.56 / MAX: 10.64)
b: 3.157 (MIN: 2.44 / MAX: 19.36)
c: 3.772 (MIN: 2.45 / MAX: 17.15)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

POV-Ray
Trace Time
Seconds, Fewer Is Better
a: 57.76
b: 58.40
c: 57.95
POV-Ray 3.7.0.10.unofficial

Mobile Neural Network 2.9.b11b7037d
Model: squeezenetv1.1
ms, Fewer Is Better
a: 5.861 (MIN: 4.69 / MAX: 21.36)
b: 5.881 (MIN: 4.62 / MAX: 25.58)
c: 5.921 (MIN: 4.44 / MAX: 25.13)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

x265
Video Input: Bosphorus 4K
Frames Per Second, More Is Better
a: 12.10
b: 12.08
c: 12.13
x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

Mobile Neural Network 2.9.b11b7037d
Model: SqueezeNetV1.0
ms, Fewer Is Better
a: 9.814 (MIN: 7.23 / MAX: 25.89)
b: 9.669 (MIN: 7.59 / MAX: 16.03)
c: 9.900 (MIN: 7.61 / MAX: 30.63)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Y-Cruncher 0.8.5
Pi Digits To Calculate: 500M
Seconds, Fewer Is Better
a: 31.85
b: 31.87
c: 31.50

Mobile Neural Network 2.9.b11b7037d
Model: MobileNetV2_224
ms, Fewer Is Better
a: 6.692 (MIN: 5.01 / MAX: 16.16)
b: 6.701 (MIN: 5.19 / MAX: 26.48)
c: 6.757 (MIN: 5.05 / MAX: 29.03)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Stockfish
Chess Benchmark
Nodes Per Second, More Is Better
a: 10688485
b: 10530329
c: 11305323
Stockfish 16 by the Stockfish developers (see AUTHORS file)

Y-Cruncher 0.8.5
Pi Digits To Calculate: 1B
Seconds, Fewer Is Better
a: 74.99
b: 75.40
c: 75.63

x265
Video Input: Bosphorus 1080p
Frames Per Second, More Is Better
a: 51.06
b: 49.51
c: 50.61
x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

Mobile Neural Network 2.9.b11b7037d
Model: resnet-v2-50
ms, Fewer Is Better
a: 48.55 (MIN: 36.42 / MAX: 73.25)
b: 48.75 (MIN: 37.7 / MAX: 70.01)
c: 48.80 (MIN: 38.05 / MAX: 133.88)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network 2.9.b11b7037d
Model: mobilenet-v1-1.0
ms, Fewer Is Better
a: 6.216 (MIN: 3.56 / MAX: 26.49)
b: 6.253 (MIN: 3.61 / MAX: 31.97)
c: 6.152 (MIN: 3.54 / MAX: 24.17)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network 2.9.b11b7037d
Model: inception-v3
ms, Fewer Is Better
a: 69.55 (MIN: 58.98 / MAX: 120.56)
b: 69.20 (MIN: 57.21 / MAX: 117.21)
c: 69.41 (MIN: 55.7 / MAX: 110.33)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

GROMACS
Input: water_GMX50_bare
Ns Per Day, More Is Better
a: 0.579
b: 0.581
c: 0.581
GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3
XNNPACK 2cd86b
Model: FP32MobileNetV2
us, Fewer Is Better
a: 4388
b: 4077
c: 4397
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: FP32MobileNetV3Large
us, Fewer Is Better
a: 6627
b: 5697
c: 5356
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: FP32MobileNetV3Small
us, Fewer Is Better
a: 2246
b: 2362
c: 2160
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: FP16MobileNetV2
us, Fewer Is Better
a: 4251
b: 4377
c: 5409
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: FP16MobileNetV3Large
us, Fewer Is Better
a: 5247
b: 4817
c: 4784
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: FP16MobileNetV3Small
us, Fewer Is Better
a: 2385
b: 2523
c: 2277
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: QU8MobileNetV2
us, Fewer Is Better
a: 3737
b: 3554
c: 4238
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: QU8MobileNetV3Large
us, Fewer Is Better
a: 3605
b: 3747
c: 4439
(CXX) g++ options: -O3 -lrt -lm

XNNPACK 2cd86b
Model: QU8MobileNetV3Small
us, Fewer Is Better
a: 2132
b: 2264
c: 1896
(CXX) g++ options: -O3 -lrt -lm
ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s, More Is Better
a: 52.03
b: 52.07
c: 52.09
(CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas

Build2 0.17
Time To Compile
Seconds, Fewer Is Better
a: 297.76
b: 298.58
c: 299.95

simdjson 3.10
Throughput Test: Kostya
GB/s, More Is Better
a: 4.20
b: 4.21
c: 4.26
(CXX) g++ options: -O3 -lrt

simdjson 3.10
Throughput Test: TopTweet
GB/s, More Is Better
a: 6.81
b: 6.81
c: 6.79
(CXX) g++ options: -O3 -lrt

simdjson 3.10
Throughput Test: LargeRandom
GB/s, More Is Better
a: 1.46
b: 1.47
c: 1.46
(CXX) g++ options: -O3 -lrt

simdjson 3.10
Throughput Test: PartialTweets
GB/s, More Is Better
a: 6.62
b: 6.62
c: 6.62
(CXX) g++ options: -O3 -lrt

simdjson 3.10
Throughput Test: DistinctUserID
GB/s, More Is Better
a: 6.81
b: 6.83
c: 6.83
(CXX) g++ options: -O3 -lrt
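The results mix "More Is Better" and "Fewer Is Better" metrics, so raw ratios between runs cannot be averaged directly. The sketch below shows one common way to handle this: orient each ratio so that values above 1.0 mean the candidate run did better, then combine them with a geometric mean. It compares run b against run a for four tests whose values are copied from the results above. The function names are hypothetical, and this is an illustrative calculation, not necessarily how OpenBenchmarking.org computes its own summaries.

```python
import math

# (test, higher_is_better, run a, run b), values copied from the results.
rows = [
    ("x265: Bosphorus 1080p", True, 51.06, 49.51),            # Frames Per Second
    ("povray: Trace Time", False, 57.761, 58.398),            # Seconds
    ("xnnpack: FP32MobileNetV3Large", False, 6627, 5697),     # us
    ("simdjson: Kostya", True, 4.2, 4.21),                    # GB/s
]


def relative_score(higher_is_better, baseline, candidate):
    """Ratio oriented so that > 1.0 means the candidate run did better."""
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


scores = [relative_score(hib, a, b) for _, hib, a, b in rows]
overall = geometric_mean(scores)

if __name__ == "__main__":
    for (name, _, _, _), score in zip(rows, scores):
        print(f"{name:32s} {score:.3f}")
    print(f"{'geometric mean (b vs a)':32s} {overall:.3f}")
```

A geometric mean is used rather than an arithmetic mean so that a test that is twice as fast and one that is half as fast cancel out to 1.0.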
Phoronix Test Suite v10.8.5