AMD Ryzen AI Max+ 395 Strix Halo
Quantized models benchmarked with the Windows ROCm llama.cpp builds from Lemonade, using the recommended parameters. OpenCode testing was done in WSL.
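For local testing the models are served over an OpenAI-compatible HTTP API, and OpenCode can be pointed at the same endpoint as a local provider. Below is a minimal sketch for sanity-checking that endpoint from WSL before starting an OpenCode session; the address, port, and model field are assumptions (llama-server defaults), not part of the setup described above, so adjust them to your own Lemonade/llama.cpp configuration.

```ts
// Sanity check for a local OpenAI-compatible llama.cpp / Lemonade endpoint
// before pointing OpenCode at it. Host, port, and model name are assumptions:
// adjust them to match your own server configuration.
const BASE_URL = "http://127.0.0.1:8080/v1";

async function checkEndpoint(): Promise<void> {
  // 1. Ask the server which model(s) it has loaded.
  const models = await fetch(`${BASE_URL}/models`).then((r) => r.json());
  console.log("loaded models:", JSON.stringify(models));

  // 2. Run one short completion to confirm generation works end to end.
  const res = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "default", // a single-model llama-server typically ignores this field
      messages: [{ role: "user", content: "Reply with the single word: ok" }],
      max_tokens: 8,
    }),
  });
  const data = await res.json();
  console.log("reply:", data.choices?.[0]?.message?.content);
}

checkEndpoint().catch((err) => console.error("endpoint check failed:", err));
```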
Text Generation • 1B • 145 t/s @ Q8_0. Surprisingly capable in chat. Not usable in OpenCode.
ggml-org/gpt-oss-20b-GGUF
21B • 60 t/s @ MXFP4. OpenCode tools work. Prefer 120B.
mradermacher/Nanbeige4.1-3B-GGUF
4B • 51 t/s @ Q8_0. Thinks for minutes. Not usable in OpenCode.
unsloth/GLM-4.7-Flash-GGUF
Text Generation • 30B • 45 t/s @ Q8_0. OpenCode tool calling works great. Made a nice-looking 400-line OpenMeteo weather app with typeahead search (see the sketch after this list); required manual TypeScript error fixes to run. Note that the smaller REAP model wasn't faster.
bartowski/moonshotai_Kimi-Linear-48B-A3B-Instruct-GGUF
Text Generation • 49B • 45 t/s @ Q8_0. Excellent OpenCode tool calling, including the todo-list and ask-question tools. Made a 600-line OpenMeteo weather app with no errors. Note that it did everything the frontend-design skill said NOT to do, resulting in a comically bad-looking app. Most usable model on this list.
ggml-org/gpt-oss-120b-GGUF
117B • 42 t/s @ MXFP4. Good OpenCode tool calling and writes working TypeScript, but even the frontend-design skill can't get it to make attractive websites. Feels like GPT-4o, which is nice for nostalgia.
unsloth/Qwen3-Coder-Next-GGUF
Text Generation • 80B • 32 t/s @ Q8_0. Could not build a working OpenMeteo weather app: struggled with the edit tool while attempting to fix errors, and could not properly trace errors in the code.
Intel/MiniMax-M2-REAP-172B-A10B-gguf-q2ks-mixed-AutoRound
173B • 26 t/s @ Q2_K_S.
unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
Image-Text-to-Text • 108B • 15 t/s @ UD-IQ3_XXS.
unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF
24B • 9 t/s @ Q8_0. All dense models are slow on Strix Halo. Speculative decoding (ngram-mod) works very well when it kicks in.
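For context on the coding task referenced in several notes above: the weather apps were built against the public Open-Meteo APIs. Here is a minimal TypeScript sketch of the two calls such an app needs (a geocoding search backing the typeahead, then a forecast fetch). The query parameters reflect the current public Open-Meteo endpoints and are not taken from any of the generated apps.

```ts
// Minimal sketch of the Open-Meteo calls behind the benchmark task:
// a geocoding lookup (used for the typeahead search) and a forecast fetch.
// Parameter and field names follow the public Open-Meteo API documentation.

interface GeoResult {
  name: string;
  country?: string;
  latitude: number;
  longitude: number;
}

async function searchCity(query: string): Promise<GeoResult[]> {
  const url = `https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(query)}&count=5`;
  const data = await fetch(url).then((r) => r.json());
  return data.results ?? [];
}

async function currentWeather(lat: number, lon: number) {
  const url = `https://api.open-meteo.com/v1/forecast?latitude=${lat}&longitude=${lon}&current_weather=true`;
  const data = await fetch(url).then((r) => r.json());
  return data.current_weather; // { temperature, windspeed, weathercode, ... }
}

async function main() {
  const [city] = await searchCity("Berlin");
  if (!city) throw new Error("no geocoding result");
  const weather = await currentWeather(city.latitude, city.longitude);
  console.log(`${city.name}, ${city.country}: ${weather.temperature}°C`);
}

main().catch(console.error);
```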