Post-train any open model on your data, in your infrastructure.
Supported models
Qwen (8 models)
| Name | Context | Type |
|---|---|---|
| Qwen3.6-35B-A3B | 256k | MoE |
| Qwen3.6-27B | 256k | Dense |
| Qwen3.5-4B | 256k | Dense |
| Qwen3.5-9B | 256k | Dense |
| Qwen3.5-9B-Base | 256k | Dense |
| Qwen3.5-35B-A3B-Base | 256k | MoE |
| Qwen3.5-397B-A17B | 256k | MoE |
| Qwen3-8B | 32k | Dense |
OpenAI (2 models)
| Name | Context | Type |
|---|---|---|
| GPT-OSS-120B | 128k | MoE |
| GPT-OSS-20B | 128k | MoE |
Meta (1 model)
| Name | Context | Type |
|---|---|---|
| Llama-3.3-70B-Instruct | 128k | Dense |
Mistral (6 models)
| Name | Context | Type |
|---|---|---|
| Mistral-Large-3-675B-Instruct-2512 | 256k | MoE |
| Mistral-Medium-3.5-128B | 256k | Dense |
| Mistral-Small-4-119B-2603 | 256k | MoE |
| Ministral-3-14B-Instruct-2512 | 256k | Dense |
| Ministral-3-8B-Instruct-2512 | 256k | Dense |
| Ministral-3-3B-Instruct-2512 | 256k | Dense |
DeepSeek (3 models)
| Name | Context | Type |
|---|---|---|
| DeepSeek-V4-Pro | 1M | MoE |
| DeepSeek-V4-Flash | 1M | MoE |
| DeepSeek-V3.1 | 128k | MoE |
Moonshot (2 models)
| Name | Context | Type |
|---|---|---|
| Kimi-K2.6 | 256k | MoE |
| Kimi-K2.7-Code | 256k | MoE |
Nvidia (3 models)
| Name | Context | Type |
|---|---|---|
| Nemotron-3-Nano-30B-A3B-BF16 | 256k | MoE |
| Nemotron-3-Super-120B-A12B-BF16 | 256k | MoE |
| Nemotron-3-Ultra-550B-A55B-BF16 | 256k | MoE |