    Open Source Models

    Open-weight models — Llama, DeepSeek, Qwen, GLM, Kimi, GPT-OSS, Gemma, and more — served through one API

    88 models, 44 providers, 25 vision models, 66 tool-enabled, 3 free models
    Prices are in $ per million tokens; n/a marks a price the listing leaves blank.

    Provider                       Model                          Input   Output  Cached input
    ByteDance                      glm-4.7                        $0.60   $2.20   $0.11
    Together AI                    glm-4.7                        $0.45   $2.00   n/a
    Z AI                           glm-4.6v-flash                 $0.00   $0.00   $0.00
    NovitaAI                       glm-4.6v                       $0.30   $0.90   $0.06
    Z AI                           glm-4.6v                       $0.30   $0.90   $0.05
    Alibaba Cloud                  deepseek-v3.2                  $0.57   $1.71   $0.11
    Vertex AI (OpenAI-compatible)  deepseek-v3.2                  $0.56   $1.68   $0.06
    DeepInfra                      deepseek-v3.2                  $0.26   $0.38   $0.13
    Alibaba Cloud (singapore)      deepseek-v3.2                  $0.57   $1.71   $0.11
    ByteDance                      deepseek-v3.2                  $0.28   $0.42   $0.06
    DeepSeek                       deepseek-v3.2                  $0.28   $0.42   $0.03
    Nebius AI                      deepseek-v3.2                  $0.30   $0.45   n/a
    NovitaAI                       deepseek-v3.2                  $0.27   $0.40   $0.13
    Alibaba Cloud (cn-beijing)     deepseek-v3.2                  $0.29   $0.43   $0.06
    Moonshot AI                    kimi-k2-thinking-turbo         $1.15   $8.00   $0.15
    MiniMax                        minimax-m2                     $0.20   $1.00   $0.03
    Alibaba Cloud                  kimi-k2-thinking               $0.57   $2.29   n/a
    Moonshot AI                    kimi-k2-thinking               $0.60   $2.50   $0.15
    Alibaba Cloud (cn-beijing)     kimi-k2-thinking               $0.57   $2.29   n/a
    Vertex AI (OpenAI-compatible)  kimi-k2-thinking               $0.60   $2.50   $0.06
    ByteDance                      kimi-k2-thinking               $0.60   $2.50   $0.12
    AWS Bedrock                    llama-3.1-70b-instruct         $0.72   $0.72   n/a
    AWS Bedrock                    llama-4-maverick-17b-instruct  $0.24   $0.97   n/a
    NovitaAI                       llama-4-maverick-17b-instruct  $0.27   $0.85   n/a
    AWS Bedrock                    llama-4-scout-17b-instruct     $0.17   $0.66   n/a
    NovitaAI                       llama-4-scout-17b-instruct     $0.18   $0.59   n/a
    Cerebras                       glm-4.6                        $2.25   $2.75   n/a
    Z AI                           glm-4.6                        $0.60   $2.20   $0.11
    Alibaba Cloud                  glm-4.6                        $0.43   $2.01   n/a
    NovitaAI                       glm-4.6                        $0.55   $2.20   $0.11
    Alibaba Cloud (cn-beijing)     glm-4.6                        $0.43   $2.01   n/a
    Z AI                           glm-4-32b-0414-128k            $0.10   $0.10   $0.00
    Z AI                           glm-4.5-flash                  $0.00   $0.00   $0.00
    Z AI                           glm-4.5-airx                   $1.10   $4.50   $0.22
    Z AI                           glm-4.5-x                      $2.20   $8.90   $0.45
    Z AI                           glm-4.5-air                    $0.20   $1.10   $0.03
    EmberCloud                     glm-4.5-air                    $0.13   $0.85   $0.02
    NovitaAI                       glm-4.5v                       $0.60   $1.80   $0.11
    Z AI                           glm-4.5v                       $0.60   $1.80   $0.11
    EmberCloud                     glm-4.5                        $0.60   $2.20   $0.11
    Z AI                           glm-4.5                        $0.60   $2.20   $0.11
    Nebius AI                      hermes-3-llama-405b            $1.00   $3.00   n/a
    NovitaAI                       qwen3-next-80b-a3b-instruct    $0.15   $1.50   n/a
    Vertex AI (OpenAI-compatible)  qwen3-next-80b-a3b-instruct    $0.15   $1.20   n/a
    Alibaba Cloud                  qwen3-next-80b-a3b-instruct    $0.50   $2.00   n/a
    NovitaAI                       qwen3-next-80b-a3b-thinking    $0.15   $1.50   n/a
    Nebius AI                      qwen3-next-80b-a3b-thinking    $0.15   $1.20   n/a
    Vertex AI (OpenAI-compatible)  qwen3-next-80b-a3b-thinking    $0.15   $1.20   n/a
    Alibaba Cloud                  qwen3-next-80b-a3b-thinking    $0.50   $6.00   n/a
    Nebius AI                      qwen3-30b-a3b-instruct-2507    $0.10   $0.30   n/a
    Page 3 of 5
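
Because the same model appears under several providers at different prices, picking a host is a small arithmetic exercise. The sketch below is illustrative only: it assumes the first two prices in each row are the input and output price in $ per million tokens, copies the deepseek-v3.2 rows from the listing, and uses hypothetical names (`Offer`, `request_cost`, `cheapest`) that are not part of any LLM Gateway API. It ignores cached-token pricing, latency, and context limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    model: str
    input_price: float   # $ per million input tokens (assumed column meaning)
    output_price: float  # $ per million output tokens (assumed column meaning)


# deepseek-v3.2 rows copied from the listing, in listing order.
OFFERS = [
    Offer("Alibaba Cloud", "deepseek-v3.2", 0.57, 1.71),
    Offer("Vertex AI (OpenAI-compatible)", "deepseek-v3.2", 0.56, 1.68),
    Offer("DeepInfra", "deepseek-v3.2", 0.26, 0.38),
    Offer("Alibaba Cloud (singapore)", "deepseek-v3.2", 0.57, 1.71),
    Offer("ByteDance", "deepseek-v3.2", 0.28, 0.42),
    Offer("DeepSeek", "deepseek-v3.2", 0.28, 0.42),
    Offer("Nebius AI", "deepseek-v3.2", 0.30, 0.45),
    Offer("NovitaAI", "deepseek-v3.2", 0.27, 0.40),
    Offer("Alibaba Cloud (cn-beijing)", "deepseek-v3.2", 0.29, 0.43),
]


def request_cost(offer: Offer, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the offer's per-million-token prices."""
    total = input_tokens * offer.input_price + output_tokens * offer.output_price
    return total / 1_000_000


def cheapest(offers: list[Offer], model: str,
             input_tokens: int, output_tokens: int) -> Offer:
    """Return the offer for `model` with the lowest cost for this token mix."""
    candidates = [o for o in offers if o.model == model]
    if not candidates:
        raise ValueError(f"no offers for model {model!r}")
    return min(candidates, key=lambda o: request_cost(o, input_tokens, output_tokens))


if __name__ == "__main__":
    best = cheapest(OFFERS, "deepseek-v3.2", input_tokens=10_000, output_tokens=2_000)
    print(best.provider, f"${request_cost(best, 10_000, 2_000):.5f}")
```

The ranking depends on the token mix: a workload dominated by output tokens weights the output price more heavily, so it is worth re-running the comparison with your own ratio.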

    Open-weight models have closed most of the gap with proprietary frontier models: DeepSeek V4, Qwen3.7, GLM-5, Kimi K2, and MiniMax M3 sit near the top of real-world leaderboards, joined by OpenAI's GPT-OSS and Google's Gemma releases. Their weights are public, but running a 200B+ parameter model yourself requires serious GPU infrastructure.

    This page lists open-weight models served by hosted providers, so you get the openness — inspectable weights, no lock-in, the option to self-host later — with API convenience. LLM Gateway itself is open source (AGPLv3) and self-hostable, so the whole stack can run on your terms.

    Frequently asked questions

    What is the best open source LLM?

    DeepSeek V4, Qwen3.7, GLM-5.2, Kimi K2.6, and MiniMax M3 are the current leaders, each within striking distance of proprietary frontier models. For smaller, hardware-friendly options, GPT-OSS 20B, Gemma 4, and Qwen3.5 9B are the standouts.

    What does 'open source' mean for LLMs?

    Usually 'open weight': the trained weights are downloadable, but licenses vary — some are Apache 2.0 or MIT, others (like the Llama license) carry usage restrictions, and training data is rarely published. Check the license of a specific model before building on it.

    Should I self-host or use an API?

    Self-hosting pays off with steady high volume, strict data-residency needs, or fine-tuned weights. For everything else, per-token APIs are cheaper than idle GPUs. A middle path: develop against hosted open models and keep self-hosting as an exit option, since the weights are public.
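
The "steady high volume" condition can be made concrete with a rough break-even estimate. This is a minimal sketch under stated assumptions: the monthly GPU cost is a hypothetical figure you supply, the function names are invented for illustration, and the model ignores GPU throughput limits, engineering time, and cached-token discounts.

```python
def blended_price(input_price: float, output_price: float, output_share: float) -> float:
    """Average $ per million tokens when `output_share` of all tokens are output."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return (1.0 - output_share) * input_price + output_share * output_price


def breakeven_million_tokens(gpu_monthly_cost: float, input_price: float,
                             output_price: float, output_share: float) -> float:
    """Monthly volume (in millions of tokens) at which API spend equals GPU spend."""
    price = blended_price(input_price, output_price, output_share)
    if price <= 0.0:
        raise ValueError("a free model never reaches break-even")
    return gpu_monthly_cost / price


if __name__ == "__main__":
    # Prices: glm-4.6 at Z AI from the listing ($0.60 in, $2.20 out).
    # gpu_monthly_cost=2000.0 is a made-up placeholder, not a quoted price.
    volume = breakeven_million_tokens(2000.0, 0.60, 2.20, output_share=0.25)
    print(f"break-even at about {volume:,.0f} million tokens per month")
```

With these placeholder numbers the blended price is about $1.00 per million tokens, so break-even sits near two billion tokens a month. Below that volume the per-token API is the cheaper option, provided the self-hosted hardware could actually serve that load.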

    Are open models cheaper than proprietary ones?

    Dramatically, per token. Competition among hosts drives prices down — DeepSeek V4 Flash and Qwen3 Coder 30B cost 10–50x less than frontier proprietary models. The list above shows every provider's price for each model.

    © 2026 LLM Gateway. All rights reserved.