new mtl framework
Intel Core Ultra 7 155H testing with a Framework Laptop 13 (Intel Core Ultra 1) FRANMECP05 (03.01 BIOS) and Intel Arc MTL 8GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2408139-NE-NEWMTLFRA01&grs
new mtl framework: system details (identical for runs a, b, c)
Processor: Intel Core Ultra 7 155H @ 4.50GHz (16 Cores / 22 Threads)
Motherboard: Framework Laptop 13 (Intel Core Ultra 1) FRANMECP05 (03.01 BIOS)
Chipset: Intel Device 7e7f
Memory: 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-SFW
Disk: 512GB Western Digital WD PC SN740 SDDPNQD-512G
Graphics: Intel Arc MTL 8GB
Audio: Realtek ALC285
Network: MEDIATEK MT7922 802.11ax PCI
OS: Ubuntu 24.04
Kernel: 6.10.0-061000rc4daily20240621-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406250600.5cb15a~oibaf~n (git-5cb15a6 2024-06-25 noble-oibaf-ppa)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080
Kernel Details
- Transparent Huge Pages: madvise
Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x1e
- Thermald 2.5.6
Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S
- srbds: Not affected
- tsx_async_abort: Not affected
new mtl framework: result overview
Test                                      a           b           c
xnnpack: FP16MobileNetV2                  4251        4377        5409
xnnpack: FP32MobileNetV3Large             6627        5697        5356
xnnpack: QU8MobileNetV3Large              3605        3747        4439
mnn: mobilenetV3                          3.111       3.157       3.772
xnnpack: QU8MobileNetV3Small              2132        2264        1896
xnnpack: QU8MobileNetV2                   3737        3554        4238
xnnpack: FP16MobileNetV3Small             2385        2523        2277
xnnpack: FP16MobileNetV3Large             5247        4817        4784
xnnpack: FP32MobileNetV3Small             2246        2362        2160
xnnpack: FP32MobileNetV2                  4388        4077        4397
stockfish: Chess Benchmark                10688485    10530329    11305323
lczero: Eigen                             25          24          25
lczero: BLAS                              26          27          26
x265: Bosphorus 1080p                     51.06       49.51       50.61
mnn: SqueezeNetV1.0                       9.814       9.669       9.9
mnn: mobilenet-v1-1.0                     6.216       6.253       6.152
simdjson: Kostya                          4.2         4.21        4.26
etcpak: Multi-Threaded - ETC2             220.524     223.28      222.068
y-cruncher: 500M                          31.851      31.872      31.502
povray: Trace Time                        57.761      58.398      57.951
mnn: squeezenetv1.1                       5.861       5.881       5.921
mnn: MobileNetV2_224                      6.692       6.701       6.757
mnn: nasnet                               23.726      23.633      23.841
y-cruncher: 1B                            74.994      75.403      75.633
build2: Time To Compile                   297.757     298.584     299.953
simdjson: LargeRand                       1.46        1.47        1.46
mnn: resnet-v2-50                         48.546      48.746      48.796
mnn: inception-v3                         69.547      69.196      69.412
x265: Bosphorus 4K                        12.1        12.08       12.13
gromacs: water_GMX50_bare                 0.579       0.581       0.581
simdjson: TopTweet                        6.81        6.81        6.79
simdjson: DistinctUserID                  6.81        6.83        6.83
mt-dgemm: Sustained Floating-Point Rate   52.027491   52.067546   52.094087
simdjson: PartialTweets                   6.62        6.62        6.62
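Because a, b and c are three runs on the same system, a quick way to read the overview is to look at how far the runs spread for each test. The following is a minimal sketch, not part of the Phoronix Test Suite output: the helper name spread_percent and the choice of (max - min) / min as the spread measure are assumptions made for illustration, and only a few rows from the overview are copied in.

```python
# Values copied from the a, b, c results listed above (illustrative subset).
results = {
    "xnnpack: FP16MobileNetV2": (4251, 4377, 5409),
    "xnnpack: FP32MobileNetV3Large": (6627, 5697, 5356),
    "stockfish: Chess Benchmark": (10688485, 10530329, 11305323),
    "simdjson: PartialTweets": (6.62, 6.62, 6.62),
}


def spread_percent(values):
    """Hypothetical helper: (max - min) / min of the runs, as a percentage."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# Compute the spread per test and list the most variable tests first.
spreads = {name: spread_percent(vals) for name, vals in results.items()}
for name, pct in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pct:.1f}%")
```

On this subset, the XNNPACK rows spread by more than 20% across runs, while simdjson PartialTweets is identical in all three. That suggests treating small XNNPACK differences between runs with caution.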
XNNPACK 2cd86b - Model: FP16MobileNetV2 (us, Fewer Is Better): a = 4251; b = 4377; c = 5409. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: FP32MobileNetV3Large (us, Fewer Is Better): a = 6627; b = 5697; c = 5356. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: QU8MobileNetV3Large (us, Fewer Is Better): a = 3605; b = 3747; c = 4439. Note 1: (CXX) g++ options: -O3 -lrt -lm
Mobile Neural Network 2.9.b11b7037d - Model: mobilenetV3 (ms, Fewer Is Better): a = 3.111 (MIN: 2.56 / MAX: 10.64); b = 3.157 (MIN: 2.44 / MAX: 19.36); c = 3.772 (MIN: 2.45 / MAX: 17.15). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
XNNPACK 2cd86b - Model: QU8MobileNetV3Small (us, Fewer Is Better): a = 2132; b = 2264; c = 1896. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: QU8MobileNetV2 (us, Fewer Is Better): a = 3737; b = 3554; c = 4238. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: FP16MobileNetV3Small (us, Fewer Is Better): a = 2385; b = 2523; c = 2277. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: FP16MobileNetV3Large (us, Fewer Is Better): a = 5247; b = 4817; c = 4784. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: FP32MobileNetV3Small (us, Fewer Is Better): a = 2246; b = 2362; c = 2160. Note 1: (CXX) g++ options: -O3 -lrt -lm
XNNPACK 2cd86b - Model: FP32MobileNetV2 (us, Fewer Is Better): a = 4388; b = 4077; c = 4397. Note 1: (CXX) g++ options: -O3 -lrt -lm
Stockfish - Chess Benchmark (Nodes Per Second, More Is Better): a = 10688485; b = 10530329; c = 11305323. Note 1: Stockfish 16 by the Stockfish developers (see AUTHORS file)
LeelaChessZero 0.31.1 - Backend: Eigen (Nodes Per Second, More Is Better): a = 25; b = 24; c = 25. Note 1: (CXX) g++ options: -flto -pthread
LeelaChessZero 0.31.1 - Backend: BLAS (Nodes Per Second, More Is Better): a = 26; b = 27; c = 26. Note 1: (CXX) g++ options: -flto -pthread
x265 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 51.06; b = 49.51; c = 50.61. Note 1: x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
Mobile Neural Network 2.9.b11b7037d - Model: SqueezeNetV1.0 (ms, Fewer Is Better): a = 9.814 (MIN: 7.23 / MAX: 25.89); b = 9.669 (MIN: 7.59 / MAX: 16.03); c = 9.900 (MIN: 7.61 / MAX: 30.63). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
Mobile Neural Network 2.9.b11b7037d - Model: mobilenet-v1-1.0 (ms, Fewer Is Better): a = 6.216 (MIN: 3.56 / MAX: 26.49); b = 6.253 (MIN: 3.61 / MAX: 31.97); c = 6.152 (MIN: 3.54 / MAX: 24.17). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
simdjson 3.10 - Throughput Test: Kostya (GB/s, More Is Better): a = 4.20; b = 4.21; c = 4.26. Note 1: (CXX) g++ options: -O3 -lrt
Etcpak 2.0 - Benchmark: Multi-Threaded - Configuration: ETC2 (Mpx/s, More Is Better): a = 220.52; b = 223.28; c = 222.07. Note 1: (CXX) g++ options: -flto -pthread
Y-Cruncher 0.8.5 - Pi Digits To Calculate: 500M (Seconds, Fewer Is Better): a = 31.85; b = 31.87; c = 31.50
POV-Ray - Trace Time (Seconds, Fewer Is Better): a = 57.76; b = 58.40; c = 57.95. Note 1: POV-Ray 3.7.0.10.unofficial
Mobile Neural Network 2.9.b11b7037d - Model: squeezenetv1.1 (ms, Fewer Is Better): a = 5.861 (MIN: 4.69 / MAX: 21.36); b = 5.881 (MIN: 4.62 / MAX: 25.58); c = 5.921 (MIN: 4.44 / MAX: 25.13). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
Mobile Neural Network 2.9.b11b7037d - Model: MobileNetV2_224 (ms, Fewer Is Better): a = 6.692 (MIN: 5.01 / MAX: 16.16); b = 6.701 (MIN: 5.19 / MAX: 26.48); c = 6.757 (MIN: 5.05 / MAX: 29.03). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
Mobile Neural Network 2.9.b11b7037d - Model: nasnet (ms, Fewer Is Better): a = 23.73 (MIN: 20 / MAX: 52.87); b = 23.63 (MIN: 19.46 / MAX: 56.99); c = 23.84 (MIN: 19.99 / MAX: 46.4). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
Y-Cruncher 0.8.5 - Pi Digits To Calculate: 1B (Seconds, Fewer Is Better): a = 74.99; b = 75.40; c = 75.63
Build2 0.17 - Time To Compile (Seconds, Fewer Is Better): a = 297.76; b = 298.58; c = 299.95
simdjson 3.10 - Throughput Test: LargeRandom (GB/s, More Is Better): a = 1.46; b = 1.47; c = 1.46. Note 1: (CXX) g++ options: -O3 -lrt
Mobile Neural Network 2.9.b11b7037d - Model: resnet-v2-50 (ms, Fewer Is Better): a = 48.55 (MIN: 36.42 / MAX: 73.25); b = 48.75 (MIN: 37.7 / MAX: 70.01); c = 48.80 (MIN: 38.05 / MAX: 133.88). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
Mobile Neural Network 2.9.b11b7037d - Model: inception-v3 (ms, Fewer Is Better): a = 69.55 (MIN: 58.98 / MAX: 120.56); b = 69.20 (MIN: 57.21 / MAX: 117.21); c = 69.41 (MIN: 55.7 / MAX: 110.33). Note 1: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl
x265 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 12.10; b = 12.08; c = 12.13. Note 1: x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
GROMACS - Input: water_GMX50_bare (Ns Per Day, More Is Better): a = 0.579; b = 0.581; c = 0.581. Note 1: GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3
simdjson 3.10 - Throughput Test: TopTweet (GB/s, More Is Better): a = 6.81; b = 6.81; c = 6.79. Note 1: (CXX) g++ options: -O3 -lrt
simdjson 3.10 - Throughput Test: DistinctUserID (GB/s, More Is Better): a = 6.81; b = 6.83; c = 6.83. Note 1: (CXX) g++ options: -O3 -lrt
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better): a = 52.03; b = 52.07; c = 52.09. Note 1: (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
simdjson 3.10 - Throughput Test: PartialTweets (GB/s, More Is Better): a = 6.62; b = 6.62; c = 6.62. Note 1: (CXX) g++ options: -O3 -lrt
Phoronix Test Suite v10.8.5