
    Long Context Models

    Models with context windows of 200K tokens or more — up to 2M — for whole-codebase and multi-document workloads

    Prices are listed in $ per million tokens.

    152 models · 44 providers · 101 vision models · 134 tool-enabled · 2 free models
    Provider                       Model                     Input    Output   Cached input  Notes
    AWS Bedrock (global)           claude-sonnet-5           $2.00    $10.00   $0.20
    AWS Bedrock                    claude-sonnet-5           $2.00    $10.00   $0.20
    Anthropic                      claude-sonnet-5           $2.00    $10.00   $0.20
    AWS Bedrock (us)               claude-sonnet-5           $2.20    $11.00   $0.22
    Anthropic                      claude-haiku-4-5-free     $0.00    $0.00    $0.00
    Sakana AI                      fugu-ultra                $5.00    $30.00   $0.50
    Granite                        glm-5.2                   $1.12    $3.52    $0.21         -20% off (list $1.40 / $4.40 / $0.26)
    Z AI                           glm-5.2                   $1.40    $4.40    $0.26
    EmberCloud                     glm-5.2                   $1.26    $3.96    $0.23
    DeepInfra                      qwen3.5-9b                $0.10    $0.15    -
    Moonshot AI                    kimi-k2.7-code-highspeed  $1.90    $8.00    $0.38
    NovitaAI                       gemma-4-26b-a4b-it        $0.07    $0.34    -
    DeepInfra                      gemma-4-26b-a4b-it        $0.07    $0.34    -
    Cerebras                       gemma-4-31b-it            $0.99    $1.49    -
    NovitaAI                       gemma-4-31b-it            $0.13    $0.38    -
    DeepInfra                      gemma-4-31b-it            $0.13    $0.38    -
    Together AI                    gemma-4-31b-it            $0.13    $0.38    -
    Moonshot AI                    kimi-k2.7-code            $0.95    $4.00    $0.19
    AWS Bedrock (us)               claude-fable-5            $11.00   $55.00   $1.10
    AWS Bedrock (global)           claude-fable-5            $10.00   $50.00   $1.00
    Anthropic                      claude-fable-5            $10.00   $50.00   $1.00
    AWS Bedrock                    claude-fable-5            $10.00   $50.00   $1.00
    Z AI                           glm-4.7-flash-free        $0.00    $0.00    $0.00
    DeepInfra                      nemotron-3-ultra-550b     $0.50    $2.50    $0.15
    Alibaba Cloud                  qwen3.7-plus              $0.40    $1.60    $0.08
    Alibaba Cloud (singapore)      qwen3.7-plus              $0.40    $1.60    $0.08
    MiniMax                        minimax-m3                $0.60    $2.40    $0.12
    AWS Bedrock (jp)               claude-opus-4-8           $5.50    $27.50   $0.55
    AWS Bedrock (us)               claude-opus-4-8           $5.50    $27.50   $0.55
    Anthropic                      claude-opus-4-8           $5.00    $25.00   $0.50
    AWS Bedrock                    claude-opus-4-8           $5.00    $25.00   $0.50
    AWS Bedrock (global)           claude-opus-4-8           $5.00    $25.00   $0.50
    AWS Bedrock (au)               claude-opus-4-8           $5.50    $27.50   $0.55
    AWS Bedrock (eu)               claude-opus-4-8           $5.50    $27.50   $0.55
    Alibaba Cloud                  qwen3.7-max               $2.50    $7.50    $0.50
    Granite                        qwen3.7-max               $1.25    $3.75    $0.25         -50% off (list $2.50 / $7.50 / $0.50)
    Alibaba Cloud (singapore)      qwen3.7-max               $2.50    $7.50    $0.50
    Alibaba Cloud (cn-beijing)     qwen3.7-max               $1.72    $5.17    $0.34
    NovitaAI                       qwen3.7-max               $1.25    $3.75    $0.13
    xAI                            grok-build-0-1            $1.00    $2.00    $0.20
    Google Vertex AI               gemini-3.5-flash          $1.50    $9.00    $0.15
    Google AI Studio               gemini-3.5-flash          $1.50    $9.00    $0.15
    Vertex AI (OpenAI-compatible)  grok-4-20-non-reasoning   $1.25    $2.50    $0.20
    Vertex AI (OpenAI-compatible)  grok-4-20-reasoning       $1.25    $2.50    $0.20
    Xiaomi                         mimo-v2-omni              $0.40    $2.00    $0.08
    Xiaomi                         mimo-v2.5                 $0.14    $0.28    $0.00
    Xiaomi                         mimo-v2-pro               $1.00    $3.00    $0.20
    Xiaomi                         mimo-v2.5-pro             $0.43    $0.87    $0.00
    Google Vertex AI               gemini-3.1-flash-lite     $0.25    $1.50    $0.02
    Google AI Studio               gemini-3.1-flash-lite     $0.25    $1.50    $0.02
    Page 1 of 8

    Every model on this page accepts at least 200,000 tokens of context — roughly 150,000 words — and the largest stretch much further: Grok 4.1 Fast at 2 million tokens, with Gemini, Claude Sonnet 5, GPT-5.4, DeepSeek V4, and GLM-5.2 at or above the million-token mark. That's enough to fit an entire codebase, a legal document set, or months of chat history into a single prompt.

    Advertised size isn't everything: retrieval quality can degrade well before the window is full, and long prompts get expensive fast because every input token is billed on each request. When you re-send a large context on every request, the cached input price shown in the list matters more than the headline price.

    Frequently asked questions

    Which LLM has the largest context window?

    Grok 4.1 Fast currently leads with a 2 million token window. Gemini models run just over 1 million, and Claude Sonnet 5, GPT-5.4, DeepSeek V4, GLM-5.2, and Qwen3.7 also offer million-token windows.

    How many words fit in a 200K context window?

    Roughly 150,000 English words — about 600 pages. A million-token window fits around 750,000 words: several full-length books, or a mid-sized codebase.
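
    The arithmetic behind these figures can be sketched in a few lines. The two ratios below are assumptions backed out of the numbers above (0.75 words per token and 250 words per page), not exact constants: real token counts depend on the tokenizer and on the text itself.

```python
# Rough capacity estimate for a context window.
# Both ratios are assumptions implied by the figures in the text:
#   200,000 tokens ~ 150,000 words  -> 0.75 words per token
#   150,000 words  ~ 600 pages      -> 250 words per page
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 250


def estimate_capacity(context_tokens: int) -> tuple[int, int]:
    """Return (approximate words, approximate pages) for a window size."""
    words = round(context_tokens * WORDS_PER_TOKEN)
    pages = round(words / WORDS_PER_PAGE)
    return words, pages


for window in (200_000, 1_000_000, 2_000_000):
    words, pages = estimate_capacity(window)
    print(f"{window:>9,} tokens ~ {words:>9,} words ~ {pages:>5,} pages")
```

    Code and non-English text usually tokenize less efficiently than English prose, so treat the result as an upper bound for those inputs.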

    Do models actually use the full window well?

    Not uniformly. Most models recall the start and end of a prompt better than the middle, and effective context is often smaller than the advertised maximum. For critical retrieval over huge inputs, test with your own data and consider chunking plus retrieval instead of one giant prompt.
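
    As a rough illustration of "chunking plus retrieval", the sketch below splits a document into overlapping word chunks and sends only the chunks that share the most terms with the question. The function names, chunk sizes, and keyword-overlap scoring are all assumptions chosen for brevity; production systems typically rank chunks with embeddings rather than raw word overlap.

```python
import re


def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def terms(text: str) -> set[str]:
    """Lowercased alphanumeric terms, used for a naive overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_chunks(chunks: list[str], question: str, k: int = 3) -> list[str]:
    """Return the k chunks sharing the most terms with the question."""
    wanted = terms(question)
    ranked = sorted(chunks, key=lambda c: len(wanted & terms(c)), reverse=True)
    return ranked[:k]


# Toy document: one relevant sentence buried in filler.
document = "alpha " * 300 + "the refund policy allows returns within thirty days " + "beta " * 300
question = "What is the refund policy?"
chunks = chunk_words(document)
context = "\n\n".join(top_chunks(chunks, question, k=2))
prompt = f"{context}\n\nQuestion: {question}"
print(len(chunks), "chunks;", len(prompt.split()), "words sent instead of", len(document.split()))
```

    The overlap between chunks is there so that a sentence straddling a chunk boundary still appears whole in at least one chunk.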

    How do I keep long-context costs down?

    Use cached input pricing: providers charge a fraction of the normal rate for re-sent, unchanged prefixes, which is exactly the shape of chatting over a large document or codebase. Structure prompts so the big static context comes first and only the question changes.
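
    To see how much the cached rate matters, here is a minimal cost sketch for a session that re-sends the same large prefix on every request. It uses the claude-sonnet-5 prices from the list ($2.00 input, $10.00 output, $0.20 cached input per million tokens). The billing model is a simplifying assumption: the first request pays the full input rate for the prefix, every later request pays the cached rate, and cache-write surcharges and cache expiry are ignored. Actual provider rules differ, so check them before relying on the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in $ per million tokens."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


def session_cost(p: Pricing, prefix_tokens: int, question_tokens: int,
                 answer_tokens: int, requests: int, use_cache: bool) -> float:
    """Total cost of `requests` calls that all start with the same static prefix."""
    total = 0.0
    for i in range(requests):
        # Assumption: only the first request pays the full rate for the prefix.
        cached = use_cache and i > 0
        prefix_rate = p.cached_input_per_m if cached else p.input_per_m
        total += prefix_tokens * prefix_rate / 1_000_000
        total += question_tokens * p.input_per_m / 1_000_000
        total += answer_tokens * p.output_per_m / 1_000_000
    return total


sonnet = Pricing(input_per_m=2.00, output_per_m=10.00, cached_input_per_m=0.20)
# Hypothetical workload: 500K-token codebase, 20 short questions.
without_cache = session_cost(sonnet, 500_000, 200, 500, requests=20, use_cache=False)
with_cache = session_cost(sonnet, 500_000, 200, 500, requests=20, use_cache=True)
print(f"without cache: ${without_cache:.2f}")
print(f"with cache:    ${with_cache:.2f}")
```

    The saving only applies while the prefix is byte-for-byte unchanged, which is why the static context belongs at the start of the prompt and the question at the end.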


    © 2026 LLM Gateway. All rights reserved.