Nebius AI Provider
Nebius AI Studio - OpenAI-compatible API for large language models
Available Models
Qwen3.5 397B A17B
alibaba
qwen35-397b-a17bStreaming
Vision
Tools
Reasoning
JSON Output
Nebius AI
Context: 262.1k
Input
$0.6
/M tokens
Cached
—
/M tokens
Output
$3.6
/M tokens
MiniMax M2.5
minimax
minimax-m2.5Streaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 204.8k
Input
$0.3
/M tokens
Cached
—
/M tokens
Output
$1.2
/M tokens
GLM-5
glm
glm-5Streaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 202.8k
Input
$1
/M tokens
Cached
—
/M tokens
Output
$3.2
/M tokens
Kimi K2.5
moonshot
kimi-k2.5Streaming
Vision
Tools
Reasoning
JSON Output
Nebius AI
Context: 262.1k
Input
$0.5
/M tokens
Cached
$0.02
/M tokens
Output
$2.5
/M tokens
DeepSeek V3.2
deepseek
deepseek-v3.2Streaming
Tools
JSON Output
Nebius AI
Context: 163.8k
Input
$0.3
/M tokens
Cached
—
/M tokens
Output
$0.45
/M tokens
Qwen3 Next 80B A3B Thinking
alibaba
qwen3-next-80b-a3b-thinkingStreaming
Tools
Reasoning
Nebius AI
Context: 131.1k
Input
$0.15
/M tokens
Cached
—
/M tokens
Output
$1.2
/M tokens
GPT OSS 120B
openai
gpt-oss-120bStreaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 131.1k
Input
$0.15
/M tokens
Cached
—
/M tokens
Output
$0.6
/M tokens
Qwen3 Coder 30B A3B Instruct
alibabaModel Deactivated
qwen3-coder-30b-a3b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 262k
Deactivated since Apr 25, 2026
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 30B A3B Instruct 2507
alibaba
qwen3-30b-a3b-instruct-2507Streaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 30B A3B Thinking 2507
alibabaModel Deactivated
qwen3-30b-a3b-thinking-2507Streaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 262k
Deactivated since Apr 25, 2026
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 235B A22B Thinking 2507
alibaba
qwen3-235b-a22b-thinking-2507Streaming
Tools
Reasoning
JSON Output
Nebius AI
Context: 262k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$0.6
/M tokens
Qwen3 235B A22B Instruct 2507
alibaba
qwen3-235b-a22b-instruct-2507Streaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.2
/M tokens
Cached
—
/M tokens
Output
$0.6
/M tokens
Kimi K2
moonshotModel Deactivated
kimi-k2Streaming
Tools
JSON Output
Nebius AI
Context: 131.1k
Deactivated since Apr 25, 2026
Input
$0.5
/M tokens
Cached
—
/M tokens
Output
$2.4
/M tokens
DeepSeek R1 (0528)
deepseekModel Deactivated
deepseek-r1-0528Streaming
Nebius AI
Context: 64k
Deactivated since Apr 25, 2026
Input
$0.8
/M tokens
Cached
—
/M tokens
Output
$2.4
/M tokens
Qwen3 14B
alibabaModel Deactivated
qwen3-14bStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.08
/M tokens
Cached
—
/M tokens
Output
$0.24
/M tokens
Qwen3 32B
alibaba
qwen3-32bStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Qwen3 30B A3B
alibabaModel Deactivated
qwen3-30b-a3bStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.1
/M tokens
Cached
—
/M tokens
Output
$0.3
/M tokens
Llama 3.1 Nemotron Ultra 253B
meta
llama-3.1-nemotron-ultra-253bStreaming
JSON Output
Nebius AI
Context: 128k
Input
$0.6
/M tokens
Cached
—
/M tokens
Output
$1.8
/M tokens
Gemma 3 27B
googleModel Deactivated
gemma-3-27bStreaming
Vision
Nebius AI
Context: 128k
Deactivated since Apr 30, 2026
Input
$0.27
/M tokens
Cached
—
/M tokens
Output
$0.27
/M tokens
Qwen QwQ 32B
alibabaModel Deactivated
qwen-qwq-32bStreaming
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.15
/M tokens
Cached
—
/M tokens
Output
$0.45
/M tokens
Qwen3 Coder 480B A35B Instruct
alibaba
qwen3-coder-480b-a35b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 262k
Input
$0.4
/M tokens
Cached
—
/M tokens
Output
$1.8
/M tokens
Qwen2.5 VL 72B Instruct
alibaba
qwen2-5-vl-72b-instructStreaming
Vision
JSON Output
Nebius AI
Context: 32.8k
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
DeepSeek V3
deepseekModel Deactivated
deepseek-v3Streaming
Nebius AI
Context: 64k
Deactivated since Nov 3, 2025
Input
$0.5
/M tokens
Cached
—
/M tokens
Output
$1.5
/M tokens
Llama 3.3 70B Instruct
meta
llama-3.3-70b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 128k
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Qwen2.5 Coder 7B
alibabaModel Deactivated
qwen25-coder-7bStreaming
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Apr 25, 2026
Input
$0.01
/M tokens
Cached
—
/M tokens
Output
$0.03
/M tokens
Qwen2.5 32B Instruct
alibabaModel Deactivated
qwen25-32b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Sep 10, 2025
Input
$0.06
/M tokens
Cached
—
/M tokens
Output
$0.2
/M tokens
Qwen2.5 72B Instruct
alibabaModel Deactivated
qwen25-72b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Nov 3, 2025
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Qwen2 VL 72B Instruct
alibabaModel Deactivated
qwen2-vl-72b-instructStreaming
Vision
JSON Output
Nebius AI
Context: 32.8k
Deactivated since Sep 10, 2025
Input
$0.13
/M tokens
Cached
—
/M tokens
Output
$0.4
/M tokens
Hermes 3 Llama 405B
nousresearchModel Deactivated
hermes-3-llama-405bStreaming
JSON Output
Nebius AI
Context: 131.1k
Deactivated since Nov 3, 2025
Input
$1
/M tokens
Cached
—
/M tokens
Output
$3
/M tokens
Llama 3.1 8B Instruct
metaModel Deactivated
llama-3.1-8b-instructStreaming
Nebius AI
Context: 128k
Deactivated since Apr 25, 2026
Input
$0.02
/M tokens
Cached
—
/M tokens
Output
$0.06
/M tokens
Llama 3.1 405B Instruct
metaModel Deactivated
llama-3.1-405b-instructStreaming
Tools
JSON Output
Nebius AI
Context: 128k
Deactivated since Nov 3, 2025
Input
$1
/M tokens
Cached
—
/M tokens
Output
$3
/M tokens