kdkdk AMD Ryzen 9 7950X 16-Core testing with a ASUS ROG STRIX X670E-E GAMING WIFI (1905 BIOS) and AMD Radeon RX 7900 GRE 16GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2406020-PTS-KDKDK07936&rdt&grs .
kdkdk Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c d e AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads) ASUS ROG STRIX X670E-E GAMING WIFI (1905 BIOS) AMD Device 14d8 2 x 16GB DDR5-6000MT/s Crucial CP16G60C36U5W.M8D1 Western Digital WD_BLACK SN850X 2000GB + 4001GB Western Digital WD_BLACK SN850X 4000GB AMD Radeon RX 7900 GRE 16GB (2200/3000MHz) AMD Navi 31 HDMI/DP DELL U2723QE Intel I225-V + Intel Wi-Fi 6E Ubuntu 24.04 6.8.0-31-generic (x86_64) GNOME Shell 46.0 X Server + Wayland 4.6 Mesa 24.0.5-1ubuntu1 (LLVM 17.0.6 DRM 3.57) GCC 13.2.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601206 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
kdkdk llamafile: wizardcoder-python-34b-v1.0.Q6_K - CPU whisper-cpp: ggml-base.en - 2016 State of the Union whisper-cpp: ggml-small.en - 2016 State of the Union llamafile: Meta-Llama-3-8B-Instruct.F16 - CPU llamafile: mistral-7b-instruct-v0.2.Q5_K_M - CPU whisper-cpp: ggml-medium.en - 2016 State of the Union llama-cpp: Meta-Llama-3-8B-Instruct-Q8_0.gguf llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - CPU llamafile: llava-v1.6-mistral-7b.Q8_0 - CPU a b c d e 2.30 81.10251 225.22591 4.42 12.25 648.53817 8.08 30.07 2.32 80.81378 223.34056 12.17 646.96444 8.05 30.19 2.3 81.54243 225.65538 4.4 648.86975 8.03 30.12 2.29 80.67784 225.35352 4.38 12.2 649.91712 8.06 30.23 2.3 80.6863 224.3112 12.14 651.79719 8.07 30.14 OpenBenchmarking.org
Llamafile Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.6 Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU a b c d e 0.522 1.044 1.566 2.088 2.61 SE +/- 0.00, N = 3 2.30 2.32 2.30 2.29 2.30
Whisper.cpp Model: ggml-base.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-base.en - Input: 2016 State of the Union a b c d e 20 40 60 80 100 SE +/- 0.25, N = 3 81.10 80.81 81.54 80.68 80.69 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Whisper.cpp Model: ggml-small.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-small.en - Input: 2016 State of the Union a b c d e 50 100 150 200 250 SE +/- 0.84, N = 3 225.23 223.34 225.66 225.35 224.31 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Llamafile Test: Meta-Llama-3-8B-Instruct.F16 - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.6 Test: Meta-Llama-3-8B-Instruct.F16 - Acceleration: CPU a c d 0.9945 1.989 2.9835 3.978 4.9725 SE +/- 0.03, N = 2 4.42 4.40 4.38
Llamafile Test: mistral-7b-instruct-v0.2.Q5_K_M - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.6 Test: mistral-7b-instruct-v0.2.Q5_K_M - Acceleration: CPU a b d e 3 6 9 12 15 SE +/- 0.02, N = 3 12.25 12.17 12.20 12.14
Whisper.cpp Model: ggml-medium.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-medium.en - Input: 2016 State of the Union a b c d e 140 280 420 560 700 SE +/- 0.46, N = 3 648.54 646.96 648.87 649.92 651.80 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Llama.cpp Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b3067 Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf a b c d e 2 4 6 8 10 SE +/- 0.02, N = 3 8.08 8.05 8.03 8.06 8.07 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llamafile Test: TinyLlama-1.1B-Chat-v1.0.BF16 - Acceleration: CPU OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.6 Test: TinyLlama-1.1B-Chat-v1.0.BF16 - Acceleration: CPU a b c d e 7 14 21 28 35 SE +/- 0.05, N = 3 30.07 30.19 30.12 30.23 30.14
Phoronix Test Suite v10.8.5