Inference Optimization
community
AI & ML interests
None defined yet.
Recent Activity
inference-optimization/test_tencentbac_fastmtp
Updated • 38
inference-optimization/test_qwen3_next_mtp
Updated • 42
inference-optimization/Qwen3-Next-80B-A3B-Instruct_mtp_speculator
Text Generation • 2B • Updated • 58
inference-optimization/Qwen3-Next-80B-A3B-Instruct-MTP-ultrachat-epoch3
2B • Updated • 18
FP8-block, FP8-dynamic, NVFP4, w4a16, and w8a8 quantized versions of ibm-granite/granite-4.0-h-small and ibm-granite/granite-4.0-h-tiny
models 199
inference-optimization/Qwen3-30B-A3B-Instruct-2507-quant-test-1
25B • Updated • 24
inference-optimization/gpt-oss-120b-from-self-ckpt5-speculator.eagle3
0.9B • Updated • 69
inference-optimization/gpt-oss-120b-from-self-ckpt3-speculator.eagle3
0.9B • Updated • 58
inference-optimization/gpt-oss-120b-from-self-ckpt4-speculator.eagle3
0.9B • Updated • 52
inference-optimization/gpt-oss-120b-from-self-ckpt2-speculator.eagle3
0.9B • Updated • 62
inference-optimization/gpt-oss-120b-from-self-ckpt1-speculator.eagle3
0.9B • Updated • 58
inference-optimization/gpt-oss-120b-from-self-ckpt0-speculator.eagle3
0.9B • Updated • 59
inference-optimization/Qwen3-Next-80B-A3B-Instruct-GSM8K-MTP-finetuned
81B • Updated • 12
inference-optimization/Qwen3-Next-80B-A3B-Instruct_mtp_speculator_new
Updated • 20
inference-optimization/Qwen3-30B-from-Qwen3-235B_resps-speculators.eagle3-ckpt3
0.5B • Updated • 22
datasets 6
inference-optimization/speculators-qwen3-30b-a3b-instruct
Preview • Updated • 7
inference-optimization/speculators-qwen3-32b-instruct
Preview • Updated • 10
inference-optimization/gpt-oss-20b-nan-hidden-states-repro
Updated • 27
inference-optimization/SWE-bench_Multilingual
Viewer • Updated • 300 • 13
inference-optimization/SWE-bench_Verified
Viewer • Updated • 500 • 82
inference-optimization/SWE-bench_Lite
Viewer • Updated • 323 • 53