Open Source Models
Open-weight models — Llama, DeepSeek, Qwen, GLM, Kimi, GPT-OSS, Gemma, and more — served through one API
| Provider | Input price | Output price | Cached input price | | Features |
|---|---|---|---|---|---|
Granite | $1.12 (was $1.40, 20% off) | $3.52 (was $4.40, 20% off) | $0.21 (was $0.26, 20% off) | ||
Z AI | $1.40 | $4.40 | $0.26 | ||
EmberCloud | $1.26 | $3.96 | $0.23 | ||
DeepInfra | $0.10 | $0.15 | — | ||
Moonshot AI | $1.90 | $8.00 | $0.38 | ||
NovitaAI | $0.07 | $0.34 | — | ||
DeepInfra | $0.07 | $0.34 | — | ||
Cerebras | $0.99 | $1.49 | — | ||
NovitaAI | $0.13 | $0.38 | — | ||
DeepInfra | $0.13 | $0.38 | — | ||
Together AI | $0.13 | $0.38 | — | ||
Moonshot AI | $0.95 | $4.00 | $0.19 | ||
Z AI | $0.00 | $0.00 | $0.00 | ||
DeepInfra | $0.50 | $2.50 | $0.15 | ||
MiniMax | $0.60 | $2.40 | $0.12 | ||
Xiaomi | $0.40 | $2.00 | $0.08 | ||
Xiaomi | $0.14 | $0.28 | $0.00 | ||
Xiaomi | $1.00 | $3.00 | $0.20 | ||
Xiaomi | $0.43 | $0.87 | $0.00 | ||
NovitaAI | $0.25 | $1.48 | — | ||
Alibaba Cloud(singapore) | $0.25 | $1.48 | — | ||
Alibaba Cloud | $0.25 | $1.48 | — | ||
Alibaba Cloud(singapore) | $0.20 | $0.40 | $0.04 | ||
DeepSeek | $0.14 | $0.28 | $0.00 | ||
DeepInfra | $0.14 | $0.28 | $0.03 | ||
NovitaAI | $0.14 | $0.28 | $0.03 | ||
Alibaba Cloud(cn-beijing) | $0.14 | $0.28 | $0.03 | ||
Alibaba Cloud | $0.20 | $0.40 | $0.04 | ||
DeepSeek | $0.43 | $0.87 | $0.00 | ||
Alibaba Cloud(singapore) | $2.40 | $4.80 | $0.20 | ||
Together AI | $1.74 | $3.48 | $0.20 | ||
Alibaba Cloud(cn-beijing) | $1.65 | $3.30 | $0.14 | ||
Alibaba Cloud | $2.40 | $4.80 | $0.20 | ||
DeepInfra | $1.74 | $3.48 | $0.14 | ||
Tundra | $0.40 | $2.20 | $0.08 | ||
Together AI | $1.20 | $4.50 | $0.20 | ||
CanopyWave | $0.50 | $2.80 | $0.10 | ||
NovitaAI | $0.95 | $4.00 | $0.16 | ||
Moonshot AI | $0.95 | $4.00 | $0.16 | ||
DeepInfra | $1.05 | $3.50 | $0.20 | ||
EmberCloud | $0.93 | $2.93 | $0.17 | ||
Together AI | $1.40 | $4.40 | $0.26 | ||
NovitaAI | $1.40 | $4.40 | $0.26 | ||
Z AI | $1.40 | $4.40 | $0.26 | ||
Xiaomi | $0.10 | $0.30 | $0.02 | ||
MiniMax | $0.60 | $2.40 | $0.03 | ||
MiniMax | $0.60 | $2.40 | $0.06 | ||
Together AI | $0.30 | $1.20 | $0.06 | ||
MiniMax | $0.30 | $1.20 | $0.06 | ||
NovitaAI | $0.30 | $1.20 | $0.06 |
Open-weight models have closed most of the gap with proprietary frontier models: DeepSeek V4, Qwen3.7, GLM-5, Kimi K2, and MiniMax M3 sit near the top of real-world leaderboards, joined by OpenAI's GPT-OSS and Google's Gemma releases. Their weights are public, but running a 200B+ parameter model yourself requires serious GPU infrastructure.
This page lists open-weight models served by hosted providers, so you get the openness — inspectable weights, no lock-in, the option to self-host later — with API convenience. LLM Gateway itself is open source (AGPLv3) and self-hostable, so the whole stack can run on your terms.
Frequently asked questions
What is the best open source LLM?
DeepSeek V4, Qwen3.7, GLM-5.2, Kimi K2.6, and MiniMax M3 are the current leaders, each within striking distance of proprietary frontier models. For smaller, hardware-friendly options, GPT-OSS 20B, Gemma 4, and Qwen3.5 9B are the standouts.
What does 'open source' mean for LLMs?
Usually 'open weight': the trained weights are downloadable, but licenses vary — some are Apache 2.0 or MIT, others (like the Llama license) carry usage restrictions, and training data is rarely published. Check the license of a specific model before building on it.
Should I self-host or use an API?
Self-hosting pays off when you have steady high volume, strict data-residency requirements, or fine-tuned weights to serve. For everything else, per-token APIs are cheaper than idle GPUs. A middle path: develop against hosted open models and keep self-hosting as an exit option, since the weights are public.
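The self-host versus API decision largely comes down to break-even arithmetic. The sketch below is illustrative only: the GPU cost and the blended API price are hypothetical figures, not numbers from this page, and the function names are made up for the example.

```python
def monthly_api_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Cost of serving a monthly token volume through a per-token API."""
    return tokens_per_month / 1_000_000 * price_per_million


def break_even_tokens(gpu_cost_per_month: float, price_per_million: float) -> float:
    """Monthly token volume at which a fixed GPU bill equals the API bill."""
    return gpu_cost_per_month / price_per_million * 1_000_000


# Hypothetical inputs, chosen only for illustration.
GPU_COST_PER_MONTH = 8_000.0   # assumed all-in cost of a self-hosted GPU node, USD
API_PRICE_PER_MILLION = 0.50   # assumed blended API price, USD per 1M tokens

threshold = break_even_tokens(GPU_COST_PER_MONTH, API_PRICE_PER_MILLION)
print(f"Break-even volume: {threshold:,.0f} tokens/month")

for volume in (1e9, 1e10, 1e11):
    api = monthly_api_cost(volume, API_PRICE_PER_MILLION)
    cheaper = "self-host" if api > GPU_COST_PER_MONTH else "API"
    print(f"{volume:>16,.0f} tokens: API ${api:,.2f} "
          f"vs GPU ${GPU_COST_PER_MONTH:,.2f} -> {cheaper}")
```

This deliberately ignores the throughput limit of a single node, engineering time, and utilization dips, all of which tend to push the real break-even point higher than the simple ratio suggests.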
Are open models cheaper than proprietary ones?
Dramatically, per token. Competition among hosts drives prices down — DeepSeek V4 Flash and Qwen3 Coder 30B cost 10–50x less than frontier proprietary models. The list above shows every provider's price for each model.
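Comparing hosts for a given workload is a small calculation. The sketch below uses three rows copied from the table and makes two assumptions that this page does not confirm: that those rows are offers for the same model, and that prices are in USD per 1M tokens. The `Offer` type and helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    provider: str
    input_price: float             # assumed USD per 1M input tokens
    output_price: float            # assumed USD per 1M output tokens
    cached_price: Optional[float]  # None where the table shows a dash


def workload_cost(offer: Offer, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload at an offer's prices, ignoring cached-input discounts."""
    return (input_tokens * offer.input_price
            + output_tokens * offer.output_price) / 1_000_000


def cheapest(offers: list[Offer], input_tokens: int, output_tokens: int) -> Offer:
    """Return the offer with the lowest cost for the given token counts."""
    return min(offers, key=lambda o: workload_cost(o, input_tokens, output_tokens))


# Rows copied from the table; assumed here to be offers for the same model.
offers = [
    Offer("Alibaba Cloud(singapore)", 2.40, 4.80, 0.20),
    Offer("Together AI", 1.74, 3.48, 0.20),
    Offer("Alibaba Cloud(cn-beijing)", 1.65, 3.30, 0.14),
]

best = cheapest(offers, input_tokens=1_000_000, output_tokens=200_000)
print(f"Cheapest for this workload: {best.provider}")
```

Price is only one axis: region, latency, and cached-input pricing can matter as much as the headline rate, so a real comparison would weigh those too.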