GPTOSS-120B-Uncensored-HauhauCS-Aggressive

An uncensored version of OpenAI's GPT-OSS 120B. This is the aggressive variant, tuned harder for fewer refusals.

No changes to datasets or capabilities. Fully functional, 100% of what the original authors intended - just without the refusals.

Format

MXFP4 GGUF. This is the model's native precision - GPT-OSS was trained in MXFP4, so no further quantization is needed or recommended. Re-quantizing would only lose quality.

Works with llama.cpp, LM Studio, Ollama, and anything else that loads GGUFs.
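
For Ollama, a minimal Modelfile sketch might look like the following (the filename and model name are placeholders, and Ollama's built-in template handling for gpt-oss may differ from llama.cpp's --jinja, so treat this as a starting point rather than a verified recipe):

FROM ./GPTOSS-120B-Uncensored-HauhauCS-Aggressive-MXFP4.gguf
PARAMETER temperature 1.0
PARAMETER top_k 40

Create and run it with ollama create gptoss-aggressive -f Modelfile, then ollama run gptoss-aggressive.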

Downloads

File: GPTOSS-120B-Uncensored-HauhauCS-Aggressive-MXFP4.gguf
Size: 61 GB

Specs

  • 117B total parameters, ~5.1B active per forward pass (MoE: 128 experts, top-4 routing)
  • 128K context
  • Based on openai/gpt-oss-120b

Recommended Settings

  • temperature: 1.0
  • top_k: 40
  • Everything else (top_p, min_p, repeat penalty, etc.) should be disabled; some clients enable these by default, so turn them off (the equivalent llama.cpp sampler flags are shown below)
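
With llama.cpp, these settings roughly translate to the following sampler flags, where the neutral values for top_p, min_p, and repeat penalty effectively disable them (flag names assume a reasonably recent build):

llama-server -m model.gguf --jinja --temp 1.0 --top-k 40 --top-p 1.0 --min-p 0.0 --repeat-penalty 1.0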

Required flag: --jinja, which enables the Harmony response format (the model won't work correctly without it).

For llama.cpp:

llama-server -m model.gguf --jinja -fa -b 2048 -ub 2048
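
Once the server is up, it exposes an OpenAI-compatible API (default port 8080); a quick smoke test, with the prompt as a placeholder:

curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "Hello"}], "temperature": 1.0}'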

LM Studio

Compatible with LM Studio's Reasoning Effort buttons. To use them, place the model in:

LM Models\lmstudio-community\gpt-oss-120b-GGUF\

Hardware

Fits in ~61 GB of VRAM, i.e. a single H100 or equivalent. For lower VRAM, use --n-cpu-moe N in llama.cpp to offload MoE expert layers to the CPU (see the sketch below).
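
As a rough starting point on a smaller GPU, you might try something like the command below and raise N until the model fits; N is how many layers' MoE expert weights stay on the CPU, and the right value depends on your hardware:

llama-server -m model.gguf --jinja -fa -b 2048 -ub 2048 --n-cpu-moe 24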
