Post-train any open model on your data, in your infrastructure.

Supported models

Qwen (8 models)

Name	Context	Type
Qwen3.6-35B-A3B	256k	MoE
Qwen3.6-27B	256k	Dense
Qwen3.5-4B	256k	Dense
Qwen3.5-9B	256k	Dense
Qwen3.5-9B-Base	256k	Dense
Qwen3.5-35B-A3B-Base	256k	MoE
Qwen3.5-397B-A17B	256k	MoE
Qwen3-8B	32k	Dense

GPT-OSS (2 models)

Name	Context	Type
GPT-OSS-120B	128k	MoE
GPT-OSS-20B	128k	MoE

Llama (1 model)

Name	Context	Type
Llama-3.3-70B-Instruct	128k	Dense

Mistral (6 models)

Name	Context	Type
Mistral-Large-3-675B-Instruct-2512	256k	MoE
Mistral-Medium-3.5-128B	256k	Dense
Mistral-Small-4-119B-2603	256k	MoE
Ministral-3-14B-Instruct-2512	256k	Dense
Ministral-3-8B-Instruct-2512	256k	Dense
Ministral-3-3B-Instruct-2512	256k	Dense

DeepSeek (3 models)

Name	Context	Type
DeepSeek-V4-Pro	1M	MoE
DeepSeek-V4-Flash	1M	MoE
DeepSeek-V3.1	128k	MoE

GLM (3 models)

Name	Context	Type
GLM-5.1	200k	MoE
GLM-5	200k	MoE
GLM-4.7	200k	MoE

Moonshot (2 models)

Name	Context	Type
Kimi-K2.6	256k	MoE
Kimi-K2.7-Code	256k	MoE

Nemotron (3 models)

Name	Context	Type
Nemotron-3-Nano-30B-A3B-BF16	256k	MoE
Nemotron-3-Super-120B-A12B-BF16	256k	MoE
Nemotron-3-Ultra-550B-A55B-BF16	256k	MoE
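
When choosing a model from the tables above, it can help to filter by context window and architecture type. The snippet below is a minimal sketch of that, not part of any official SDK. The `Model` class, `context_tokens`, and `select` are hypothetical helpers, and only a few rows from the catalog are copied in. It also assumes "k" means 1,000 tokens and "M" means 1,000,000 tokens; the exact limits for a given model may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    name: str
    context: str  # label as listed in the tables, e.g. "256k" or "1M"
    kind: str     # "MoE" or "Dense"


def context_tokens(label: str) -> int:
    """Convert a label such as '128k' or '1M' to an approximate token count.

    Assumption: k = 1,000 and M = 1,000,000. Real limits may differ slightly.
    """
    units = {"k": 1_000, "M": 1_000_000}
    return int(label[:-1]) * units[label[-1]]


# A few rows copied from the tables above (not the full catalog).
CATALOG = [
    Model("Qwen3-8B", "32k", "Dense"),
    Model("Llama-3.3-70B-Instruct", "128k", "Dense"),
    Model("GPT-OSS-20B", "128k", "MoE"),
    Model("Ministral-3-8B-Instruct-2512", "256k", "Dense"),
    Model("DeepSeek-V4-Flash", "1M", "MoE"),
]


def select(models: List[Model], min_context: str,
           kind: Optional[str] = None) -> List[str]:
    """Return names of models with at least min_context, optionally of one type."""
    floor = context_tokens(min_context)
    return [
        m.name
        for m in models
        if context_tokens(m.context) >= floor and (kind is None or m.kind == kind)
    ]


print(select(CATALOG, "128k", "Dense"))
# ['Llama-3.3-70B-Instruct', 'Ministral-3-8B-Instruct-2512']
```

Passing `kind=None` returns every model that meets the context floor regardless of architecture, for example `select(CATALOG, "1M")`.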