Support

AI-powered help

Welcome!

Please introduce yourself before we start.

    LLM Gateway
    • Docs
    • Pricing
    • Pricing
    • Docs
    • Models
      • AI Gateway
      • DevPass
      • Chat Playground
      • Observability
      • Enterprise
      • Blog
      • Changelog
      • Integrations
      • Reliability
      • Guardrails
      • Providers
      • Apps
      • Models
      • Model Timeline
      • Compare
      • Token Cost Calculator
      • Referral Program
      • MCP Server
      • Agents
      • AI SDK Provider
      • Agent Skills
      • Templates
      • Guides
    1.4k
    Log InGet Started

    Open Source Models

    Open-weight models — Llama, DeepSeek, Qwen, GLM, Kimi, GPT-OSS, Gemma, and more — served through one API

    Compare

    Use Case

    Capabilities

    Provider

    Input Price ($/M tokens)

    Output Price ($/M tokens)

    Context Size (tokens)

    88
    Models
    44
    Providers
    25
    Vision Models
    66
    Tool-enabled
    3
    Free Models
    Features
    Nebius AI
    qwen3-coder-30b-a3b-instruct
    $0.10$0.30—
    NovitaAI
    qwen3-coder-30b-a3b-instruct
    $0.07$0.27—
    Nebius AI
    qwen3-coder-480b-a35b-instruct
    $0.40$1.80—
    Vertex AI (OpenAI-compatible)
    qwen3-coder-480b-a35b-instruct
    $0.22$1.80$0.02
    NovitaAI
    qwen3-coder-480b-a35b-instruct
    $0.30$1.30—
    Nebius AI
    qwen2-5-vl-72b-instruct
    $0.13$0.40—
    Cerebras
    qwen3-32b
    $0.40$0.80—
    Nebius AI
    qwen3-32b
    $0.10$0.30—
    Nebius AI
    qwen3-235b-a22b-thinking-2507
    $0.20$0.60—
    NovitaAI
    qwen3-235b-a22b-thinking-2507
    $0.30$3.00—
    Nebius AI
    qwen3-235b-a22b-instruct-2507
    $0.20$0.60—
    Vertex AI (OpenAI-compatible)
    qwen3-235b-a22b-instruct-2507
    $0.22$0.88—
    Cerebras
    qwen3-235b-a22b-instruct-2507
    $0.60$1.20—
    NovitaAI
    qwen3-235b-a22b-instruct-2507
    $0.09$0.58—
    Alibaba Cloud
    kimi-k2
    $0.57$2.29—
    Alibaba Cloud(cn-beijing)
    kimi-k2
    $0.57$2.29—
    Moonshot AI
    kimi-k2
    $0.60$2.50$0.15
    Groq
    kimi-k2
    $1.00$3.00$0.50
    ByteDance
    kimi-k2
    $0.60$2.50$0.12
    Nebius AI
    kimi-k2
    $0.50$2.40—
    NovitaAI
    kimi-k2
    $0.57$2.30—
    DeepSeek
    deepseek-v3.1
    $0.56$1.68$0.07
    ByteDance
    deepseek-v3.1
    $0.56$1.68$0.11
    Groq
    deepseek-r1-distill-llama-70b
    $0.75$0.99—
    DeepSeek
    deepseek-r1-0528
    $0.55$2.19—
    Nebius AI
    deepseek-r1-0528
    $0.80$2.40—
    Nebius AI
    deepseek-v3
    $0.50$1.50—
    Together AI
    llama-4-scout
    $0.18$0.59—
    Nebius AI
    llama-3.1-405b-instruct
    $1.00$3.00—
    NovitaAI
    llama-3.3-70b-instruct
    $0.14$0.40—
    Nebius AI
    llama-3.3-70b-instruct
    $0.13$0.40—
    Cerebras
    llama-3.3-70b-instruct
    $0.85$1.20—
    Groq
    llama-guard-4-12b
    $0.20$0.20—
    Nebius AI
    llama-3.1-nemotron-ultra-253b
    $0.60$1.80—
    Inference.net
    llama-3.2-11b-instruct
    $0.07$0.33—
    Cerebras
    llama-3.1-8b-instruct
    $0.10$0.10—
    NovitaAI
    llama-3.1-8b-instruct
    $0.02$0.05—
    Inference.net
    llama-3.1-8b-instruct
    $0.07$0.33—
    Nebius AI
    llama-3.1-8b-instruct
    $0.02$0.06—
    Together AI
    llama-3.1-8b-instruct
    $0.06$0.06—
    AWS Bedrock
    llama-3.1-8b-instruct
    $0.22$0.22—
    Groq
    gpt-oss-20b
    $0.10$0.50—
    Together AI
    gpt-oss-20b
    $0.05$0.20—
    NanoGPT
    gpt-oss-20b
    $0.04$0.15—
    Cerebras
    gpt-oss-120b
    $0.35$0.75—
    Together AI
    gpt-oss-120b
    $0.15$0.60—
    ByteDance
    gpt-oss-120b
    $0.10$0.50$0.02
    Azure
    gpt-oss-120b
    $0.15$0.60—
    NanoGPT
    gpt-oss-120b
    $0.05$0.25—
    Nebius AI
    gpt-oss-120b
    $0.15$0.60—
    Page 4 of 5

    Open-weight models have closed most of the gap with proprietary frontiers: DeepSeek V4, Qwen3.7, GLM-5, Kimi K2, and MiniMax M3 sit near the top of real-world leaderboards, joined by OpenAI's GPT-OSS and Google's Gemma releases. Their weights are public — but running a 200B+ parameter model yourself means serious GPU infrastructure.

    This page lists open-weight models served by hosted providers, so you get the openness — inspectable weights, no lock-in, the option to self-host later — with API convenience. LLM Gateway itself is open source (AGPLv3) and self-hostable, so the whole stack can run on your terms.

    Frequently asked questions

    What is the best open source LLM?

    DeepSeek V4, Qwen3.7, GLM-5.2, Kimi K2.6, and MiniMax M3 are the current leaders, each within striking distance of proprietary frontier models. For smaller, hardware-friendly options, GPT-OSS 20B, Gemma 4, and Qwen3.5 9B are the standouts.

    What does 'open source' mean for LLMs?

    Usually 'open weight': the trained weights are downloadable, but licenses vary — some are Apache 2.0 or MIT, others (like the Llama license) carry usage restrictions, and training data is rarely published. Check the license of a specific model before building on it.

    Should I self-host or use an API?

    Self-hosting pays off with steady high volume, strict data-residency needs, or fine-tuned weights. For everything else, per-token APIs are cheaper than idle GPUs. A middle path: develop against hosted open models and keep self-hosting as an exit option, since the weights are public.

    Are open models cheaper than proprietary ones?

    Dramatically, per token. Competition among hosts drives prices down — DeepSeek V4 Flash and Qwen3 Coder 30B cost 10–50x less than frontier proprietary models. The list above shows every provider's price for each model.

    Newsletter

    Stay ahead of the curve

    Join developers who get weekly insights on LLM routing, new model launches, and cost optimization — straight to their inbox.

    • New models & providers as they drop
    • Tips to cut latency & costs
    • Early access to beta features

    No spam. Unsubscribe anytime.

    All systems operational
    AICPA SOC for Service Organizations badgeSOC 2 Type II
    compliant

    Product

    • Features
    • Models
    • Providers
    • Chat Playground
    • Changelog
    • DevPass
    • Compare Models
    • Enterprise

    Resources

    • Apps
    • Templates
    • Agents
    • MCP Server
    • Use Cases
    • Blog
    • Documentation
    • Integrations
    • Guides
    • Brand Assets
    • Token Cost Calculator
    • Referral Program
    • GitHub
    • Contact Us

    Community

    • Twitter
    • Discord

    Compliance

    • Trust Center
    • Security Portal
    • Terms
    • Privacy Policy
    • GDPR
    • SOC 2 Type II
    • Status

    Compare

    • OpenRouter
    • LiteLLM
    • Portkey
    • Migration Guides

    Models

    • Text Generation
    • Text to Image
    • Image to Image
    • Video Generation
    • Embeddings
    • Vision
    • Reasoning
    • Tool Calling
    • Web Search
    • Discounted
    • Best for Roleplay
    • Best for Coding
    • Best for Creative Writing
    • Best for Translation
    • Best for Math
    • Long Context
    • Cheapest
    • Open Source

    Providers

    • OpenAI
    • Anthropic
    • Google AI Studio
    • Glacier
    • Granite
    • Google Vertex AI
    • Vertex AI (OpenAI-compatible)
    • Vertex AI (Anthropic)
    • Quartz
    • Avalanche
    • Groq
    • Cerebras
    • xAI
    • DeepSeek
    • Alibaba Cloud
    • NovitaAI
    • AtlasCloud
    • AWS Bedrock
    • Azure
    • Azure AI Foundry
    • Z AI
    • Moonshot AI
    • Perplexity
    • Nebius AI
    • Mistral AI
    • CanopyWave
    • Inference.net
    • Together AI
    • Custom
    • NanoGPT
    • ByteDance
    • MiniMax
    • EmberCloud
    • Sakana AI
    • Tundra
    • Xiaomi
    • DeepInfra
    • Reve
    • ElevenLabs

    © 2026 LLM Gateway. All rights reserved.