Open Source Models
Open-weight models — Llama, DeepSeek, Qwen, GLM, Kimi, GPT-OSS, Gemma, and more — served through one API
| Features | |||||
|---|---|---|---|---|---|
Nebius AI | $0.10 | $0.30 | — | ||
NovitaAI | $0.07 | $0.27 | — | ||
Nebius AI | $0.40 | $1.80 | — | ||
Vertex AI (OpenAI-compatible) | $0.22 | $1.80 | $0.02 | ||
NovitaAI | $0.30 | $1.30 | — | ||
Nebius AI | $0.13 | $0.40 | — | ||
Cerebras | $0.40 | $0.80 | — | ||
Nebius AI | $0.10 | $0.30 | — | ||
Nebius AI | $0.20 | $0.60 | — | ||
NovitaAI | $0.30 | $3.00 | — | ||
Nebius AI | $0.20 | $0.60 | — | ||
Vertex AI (OpenAI-compatible) | $0.22 | $0.88 | — | ||
Cerebras | $0.60 | $1.20 | — | ||
NovitaAI | $0.09 | $0.58 | — | ||
Alibaba Cloud | $0.57 | $2.29 | — | ||
Alibaba Cloud(cn-beijing) | $0.57 | $2.29 | — | ||
Moonshot AI | $0.60 | $2.50 | $0.15 | ||
Groq | $1.00 | $3.00 | $0.50 | ||
ByteDance | $0.60 | $2.50 | $0.12 | ||
Nebius AI | $0.50 | $2.40 | — | ||
NovitaAI | $0.57 | $2.30 | — | ||
DeepSeek | $0.56 | $1.68 | $0.07 | ||
ByteDance | $0.56 | $1.68 | $0.11 | ||
Groq | $0.75 | $0.99 | — | ||
DeepSeek | $0.55 | $2.19 | — | ||
Nebius AI | $0.80 | $2.40 | — | ||
Nebius AI | $0.50 | $1.50 | — | ||
Together AI | $0.18 | $0.59 | — | ||
Nebius AI | $1.00 | $3.00 | — | ||
NovitaAI | $0.14 | $0.40 | — | ||
Nebius AI | $0.13 | $0.40 | — | ||
Cerebras | $0.85 | $1.20 | — | ||
Groq | $0.20 | $0.20 | — | ||
Nebius AI | $0.60 | $1.80 | — | ||
Inference.net | $0.07 | $0.33 | — | ||
Cerebras | $0.10 | $0.10 | — | ||
NovitaAI | $0.02 | $0.05 | — | ||
Inference.net | $0.07 | $0.33 | — | ||
Nebius AI | $0.02 | $0.06 | — | ||
Together AI | $0.06 | $0.06 | — | ||
AWS Bedrock | $0.22 | $0.22 | — | ||
Groq | $0.10 | $0.50 | — | ||
Together AI | $0.05 | $0.20 | — | ||
NanoGPT | $0.04 | $0.15 | — | ||
Cerebras | $0.35 | $0.75 | — | ||
Together AI | $0.15 | $0.60 | — | ||
ByteDance | $0.10 | $0.50 | $0.02 | ||
Azure | $0.15 | $0.60 | — | ||
NanoGPT | $0.05 | $0.25 | — | ||
Nebius AI | $0.15 | $0.60 | — |
Open-weight models have closed most of the gap with proprietary frontiers: DeepSeek V4, Qwen3.7, GLM-5, Kimi K2, and MiniMax M3 sit near the top of real-world leaderboards, joined by OpenAI's GPT-OSS and Google's Gemma releases. Their weights are public — but running a 200B+ parameter model yourself means serious GPU infrastructure.
This page lists open-weight models served by hosted providers, so you get the openness — inspectable weights, no lock-in, the option to self-host later — with API convenience. LLM Gateway itself is open source (AGPLv3) and self-hostable, so the whole stack can run on your terms.
Frequently asked questions
What is the best open source LLM?
DeepSeek V4, Qwen3.7, GLM-5.2, Kimi K2.6, and MiniMax M3 are the current leaders, each within striking distance of proprietary frontier models. For smaller, hardware-friendly options, GPT-OSS 20B, Gemma 4, and Qwen3.5 9B are the standouts.
What does 'open source' mean for LLMs?
Usually 'open weight': the trained weights are downloadable, but licenses vary — some are Apache 2.0 or MIT, others (like the Llama license) carry usage restrictions, and training data is rarely published. Check the license of a specific model before building on it.
Should I self-host or use an API?
Self-hosting pays off with steady high volume, strict data-residency needs, or fine-tuned weights. For everything else, per-token APIs are cheaper than idle GPUs. A middle path: develop against hosted open models and keep self-hosting as an exit option, since the weights are public.
Are open models cheaper than proprietary ones?
Dramatically, per token. Competition among hosts drives prices down — DeepSeek V4 Flash and Qwen3 Coder 30B cost 10–50x less than frontier proprietary models. The list above shows every provider's price for each model.