Support

AI-powered help

Welcome!

Please introduce yourself before we start.

    LLM Gateway
    • Docs
    • Pricing
    • Pricing
    • Docs
    • Models
      • AI Gateway
      • DevPass
      • Chat Playground
      • Observability
      • Enterprise
      • Blog
      • Changelog
      • Integrations
      • Reliability
      • Guardrails
      • Providers
      • Apps
      • Models
      • Model Timeline
      • Compare
      • Token Cost Calculator
      • Referral Program
      • MCP Server
      • Agents
      • AI SDK Provider
      • Agent Skills
      • Templates
      • Guides
    1.4k
    Log InGet Started

    Open Source Models

    Open-weight models — Llama, DeepSeek, Qwen, GLM, Kimi, GPT-OSS, Gemma, and more — served through one API

    Compare

    Use Case

    Capabilities

    Provider

    Input Price ($/M tokens)

    Output Price ($/M tokens)

    Context Size (tokens)

    88
    Models
    44
    Providers
    25
    Vision Models
    66
    Tool-enabled
    3
    Free Models
    Features
    Granite
    glm-5.2
    $1.40$1.12
    -20% off
    $4.40$3.52
    -20% off
    $0.26$0.21
    -20% off
    Z AI
    glm-5.2
    $1.40$4.40$0.26
    EmberCloud
    glm-5.2
    $1.26$3.96$0.23
    DeepInfra
    qwen3.5-9b
    $0.10$0.15—
    Moonshot AI
    kimi-k2.7-code-highspeed
    $1.90$8.00$0.38
    NovitaAI
    gemma-4-26b-a4b-it
    $0.07$0.34—
    DeepInfra
    gemma-4-26b-a4b-it
    $0.07$0.34—
    Cerebras
    gemma-4-31b-it
    $0.99$1.49—
    NovitaAI
    gemma-4-31b-it
    $0.13$0.38—
    DeepInfra
    gemma-4-31b-it
    $0.13$0.38—
    Together AI
    gemma-4-31b-it
    $0.13$0.38—
    Moonshot AI
    kimi-k2.7-code
    $0.95$4.00$0.19
    Z AI
    glm-4.7-flash-free
    $0.00$0.00$0.00
    DeepInfra
    nemotron-3-ultra-550b
    $0.50$2.50$0.15
    MiniMax
    minimax-m3
    $0.60$2.40$0.12
    Xiaomi
    mimo-v2-omni
    $0.40$2.00$0.08
    Xiaomi
    mimo-v2.5
    $0.14$0.28$0.00
    Xiaomi
    mimo-v2-pro
    $1.00$3.00$0.20
    Xiaomi
    mimo-v2.5-pro
    $0.43$0.87$0.00
    NovitaAI
    qwen3.6-35b-a3b
    $0.25$1.48—
    Alibaba Cloud(singapore)
    qwen3.6-35b-a3b
    $0.25$1.48—
    Alibaba Cloud
    qwen3.6-35b-a3b
    $0.25$1.48—
    Alibaba Cloud(singapore)
    deepseek-v4-flash
    $0.20$0.40$0.04
    DeepSeek
    deepseek-v4-flash
    $0.14$0.28$0.00
    DeepInfra
    deepseek-v4-flash
    $0.14$0.28$0.03
    NovitaAI
    deepseek-v4-flash
    $0.14$0.28$0.03
    Alibaba Cloud(cn-beijing)
    deepseek-v4-flash
    $0.14$0.28$0.03
    Alibaba Cloud
    deepseek-v4-flash
    $0.20$0.40$0.04
    DeepSeek
    deepseek-v4-pro
    $0.43$0.87$0.00
    Alibaba Cloud(singapore)
    deepseek-v4-pro
    $2.40$4.80$0.20
    Together AI
    deepseek-v4-pro
    $1.74$3.48$0.20
    Alibaba Cloud(cn-beijing)
    deepseek-v4-pro
    $1.65$3.30$0.14
    Alibaba Cloud
    deepseek-v4-pro
    $2.40$4.80$0.20
    DeepInfra
    deepseek-v4-pro
    $1.74$3.48$0.14
    Tundra
    kimi-k2.6
    $0.40$2.20$0.08
    Together AI
    kimi-k2.6
    $1.20$4.50$0.20
    CanopyWave
    kimi-k2.6
    $0.50$2.80$0.10
    NovitaAI
    kimi-k2.6
    $0.95$4.00$0.16
    Moonshot AI
    kimi-k2.6
    $0.95$4.00$0.16
    DeepInfra
    glm-5.1
    $1.05$3.50$0.20
    EmberCloud
    glm-5.1
    $0.93$2.93$0.17
    Together AI
    glm-5.1
    $1.40$4.40$0.26
    NovitaAI
    glm-5.1
    $1.40$4.40$0.26
    Z AI
    glm-5.1
    $1.40$4.40$0.26
    Xiaomi
    mimo-v2-flash
    $0.10$0.30$0.02
    MiniMax
    minimax-m2.5-highspeed
    $0.60$2.40$0.03
    MiniMax
    minimax-m2.7-highspeed
    $0.60$2.40$0.06
    Together AI
    minimax-m2.7
    $0.30$1.20$0.06
    MiniMax
    minimax-m2.7
    $0.30$1.20$0.06
    NovitaAI
    minimax-m2.7
    $0.30$1.20$0.06
    Page 1 of 5

    Open-weight models have closed most of the gap with proprietary frontiers: DeepSeek V4, Qwen3.7, GLM-5, Kimi K2, and MiniMax M3 sit near the top of real-world leaderboards, joined by OpenAI's GPT-OSS and Google's Gemma releases. Their weights are public — but running a 200B+ parameter model yourself means serious GPU infrastructure.

    This page lists open-weight models served by hosted providers, so you get the openness — inspectable weights, no lock-in, the option to self-host later — with API convenience. LLM Gateway itself is open source (AGPLv3) and self-hostable, so the whole stack can run on your terms.

    Frequently asked questions

    What is the best open source LLM?

    DeepSeek V4, Qwen3.7, GLM-5.2, Kimi K2.6, and MiniMax M3 are the current leaders, each within striking distance of proprietary frontier models. For smaller, hardware-friendly options, GPT-OSS 20B, Gemma 4, and Qwen3.5 9B are the standouts.

    What does 'open source' mean for LLMs?

    Usually 'open weight': the trained weights are downloadable, but licenses vary — some are Apache 2.0 or MIT, others (like the Llama license) carry usage restrictions, and training data is rarely published. Check the license of a specific model before building on it.

    Should I self-host or use an API?

    Self-hosting pays off with steady high volume, strict data-residency needs, or fine-tuned weights. For everything else, per-token APIs are cheaper than idle GPUs. A middle path: develop against hosted open models and keep self-hosting as an exit option, since the weights are public.

    Are open models cheaper than proprietary ones?

    Dramatically, per token. Competition among hosts drives prices down — DeepSeek V4 Flash and Qwen3 Coder 30B cost 10–50x less than frontier proprietary models. The list above shows every provider's price for each model.

    Newsletter

    Stay ahead of the curve

    Join developers who get weekly insights on LLM routing, new model launches, and cost optimization — straight to their inbox.

    • New models & providers as they drop
    • Tips to cut latency & costs
    • Early access to beta features

    No spam. Unsubscribe anytime.

    All systems operational
    AICPA SOC for Service Organizations badgeSOC 2 Type II
    compliant

    Product

    • Features
    • Models
    • Providers
    • Chat Playground
    • Changelog
    • DevPass
    • Compare Models
    • Enterprise

    Resources

    • Apps
    • Templates
    • Agents
    • MCP Server
    • Use Cases
    • Blog
    • Documentation
    • Integrations
    • Guides
    • Brand Assets
    • Token Cost Calculator
    • Referral Program
    • GitHub
    • Contact Us

    Community

    • Twitter
    • Discord

    Compliance

    • Trust Center
    • Security Portal
    • Terms
    • Privacy Policy
    • GDPR
    • SOC 2 Type II
    • Status

    Compare

    • OpenRouter
    • LiteLLM
    • Portkey
    • Migration Guides

    Models

    • Text Generation
    • Text to Image
    • Image to Image
    • Video Generation
    • Embeddings
    • Vision
    • Reasoning
    • Tool Calling
    • Web Search
    • Discounted
    • Best for Roleplay
    • Best for Coding
    • Best for Creative Writing
    • Best for Translation
    • Best for Math
    • Long Context
    • Cheapest
    • Open Source

    Providers

    • OpenAI
    • Anthropic
    • Google AI Studio
    • Glacier
    • Granite
    • Google Vertex AI
    • Vertex AI (OpenAI-compatible)
    • Vertex AI (Anthropic)
    • Quartz
    • Avalanche
    • Groq
    • Cerebras
    • xAI
    • DeepSeek
    • Alibaba Cloud
    • NovitaAI
    • AtlasCloud
    • AWS Bedrock
    • Azure
    • Azure AI Foundry
    • Z AI
    • Moonshot AI
    • Perplexity
    • Nebius AI
    • Mistral AI
    • CanopyWave
    • Inference.net
    • Together AI
    • Custom
    • NanoGPT
    • ByteDance
    • MiniMax
    • EmberCloud
    • Sakana AI
    • Tundra
    • Xiaomi
    • DeepInfra
    • Reve
    • ElevenLabs

    © 2026 LLM Gateway. All rights reserved.