    LLM Gateway

    Long Context Models

    Models with context windows of 200K tokens or more — up to 2M — for whole-codebase and multi-document workloads

    152 models, 44 providers, 101 vision models, 134 tool-enabled, 2 free models
    Prices are in $ per million tokens; "-" means no cached input price is listed.

    Provider                    Model                   Input     Output    Cached input
    AWS Bedrock (global)        grok-4-3                $1.25     $2.50     $0.20
    AWS Bedrock (us)            grok-4-3                $1.38     $2.75     $0.22
    xAI                         grok-4-3                $1.25     $2.50     $0.31
    AWS Bedrock (us-west-2)     grok-4-3                $1.38     $2.75     $0.22
    AWS Bedrock                 grok-4-3                $1.25     $2.50     $0.20
    Azure AI Foundry            grok-4-3                $1.25     $2.50     $0.20
    NovitaAI                    qwen3.6-35b-a3b         $0.25     $1.48     -
    Alibaba Cloud (singapore)   qwen3.6-35b-a3b         $0.25     $1.48     -
    Alibaba Cloud               qwen3.6-35b-a3b         $0.25     $1.48     -
    Alibaba Cloud (singapore)   qwen3.6-plus            $0.50     $3.00     $0.05
    Alibaba Cloud               qwen3.6-plus            $0.50     $3.00     $0.05
    Alibaba Cloud               qwen3.6-max-preview     $1.30     $7.80     $0.13
    Alibaba Cloud (singapore)   qwen3.6-max-preview     $1.30     $7.80     $0.13
    OpenAI                      gpt-5.5-pro             $30.00    $180.00   -
    Azure                       gpt-5.5                 $5.00     $30.00    $0.50
    OpenAI                      gpt-5.5                 $5.00     $30.00    $0.50
    Alibaba Cloud (singapore)   deepseek-v4-flash       $0.20     $0.40     $0.04
    DeepSeek                    deepseek-v4-flash       $0.14     $0.28     $0.00
    DeepInfra                   deepseek-v4-flash       $0.14     $0.28     $0.03
    NovitaAI                    deepseek-v4-flash       $0.14     $0.28     $0.03
    Alibaba Cloud (cn-beijing)  deepseek-v4-flash       $0.14     $0.28     $0.03
    Alibaba Cloud               deepseek-v4-flash       $0.20     $0.40     $0.04
    DeepSeek                    deepseek-v4-pro         $0.43     $0.87     $0.00
    Alibaba Cloud (singapore)   deepseek-v4-pro         $2.40     $4.80     $0.20
    Together AI                 deepseek-v4-pro         $1.74     $3.48     $0.20
    Alibaba Cloud (cn-beijing)  deepseek-v4-pro         $1.65     $3.30     $0.14
    Alibaba Cloud               deepseek-v4-pro         $2.40     $4.80     $0.20
    DeepInfra                   deepseek-v4-pro         $1.74     $3.48     $0.14
    Tundra                      kimi-k2.6               $0.40     $2.20     $0.08
    Together AI                 kimi-k2.6               $1.20     $4.50     $0.20
    CanopyWave                  kimi-k2.6               $0.50     $2.80     $0.10
    NovitaAI                    kimi-k2.6               $0.95     $4.00     $0.16
    Moonshot AI                 kimi-k2.6               $0.95     $4.00     $0.16
    AWS Bedrock (eu)            claude-opus-4-7         $5.50     $27.50    $0.55
    AWS Bedrock (global)        claude-opus-4-7         $5.00     $25.00    $0.50
    Anthropic                   claude-opus-4-7         $5.00     $25.00    $0.50
    AWS Bedrock (jp)            claude-opus-4-7         $5.50     $27.50    $0.55
    Vertex AI (Anthropic)       claude-opus-4-7         $5.00     $25.00    $0.50
    AWS Bedrock                 claude-opus-4-7         $5.00     $25.00    $0.50
    AWS Bedrock (au)            claude-opus-4-7         $5.50     $27.50    $0.55
    AWS Bedrock (us)            claude-opus-4-7         $5.50     $27.50    $0.55
    DeepInfra                   glm-5.1                 $1.05     $3.50     $0.20
    EmberCloud                  glm-5.1                 $0.93     $2.93     $0.17
    Together AI                 glm-5.1                 $1.40     $4.40     $0.26
    NovitaAI                    glm-5.1                 $1.40     $4.40     $0.26
    Z AI                        glm-5.1                 $1.40     $4.40     $0.26
    Xiaomi                      mimo-v2-flash           $0.10     $0.30     $0.02
    EmberCloud                  qwen3-coder-next        $0.11     $0.68     $0.06
    MiniMax                     minimax-m2.5-highspeed  $0.60     $2.40     $0.03
    MiniMax                     minimax-m2.7-highspeed  $0.60     $2.40     $0.06
    Page 2 of 8

    Every model on this page accepts at least 200,000 tokens of context — roughly 150,000 words — and the largest stretch much further: Grok 4.1 Fast at 2 million tokens, with Gemini, Claude Sonnet 5, GPT-5.4, DeepSeek V4, and GLM-5.2 at or above the million-token mark. That's enough to fit an entire codebase, a legal document set, or months of chat history into a single prompt.

    Advertised size isn't everything: retrieval quality can degrade well before the window is full, and long prompts get expensive fast. Cached input pricing — shown in the list — matters more than the headline price when you re-send large contexts on every request.
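
    To see why the cached rate can outweigh the headline rate, here is a rough sketch of the input-side arithmetic. The function name and the billing model are assumptions for illustration: it supposes the first request pays the full input price for the context and every later request pays the cached price, and it ignores cache-write fees, cache expiry, and output tokens, all of which vary by provider.

```python
def input_cost_usd(context_tokens, question_tokens, requests,
                   input_price, cached_price=None):
    """Estimate input-side cost (USD) of re-sending one static context.

    Prices are USD per million tokens. Assumption: the first request pays
    the full input price for the context and each later request pays the
    cached price for it. Cache-write fees and cache expiry are ignored.
    """
    if requests < 1:
        return 0.0
    full = input_price / 1_000_000
    # With no cached price, every re-send is billed at the full input rate.
    cached = full if cached_price is None else cached_price / 1_000_000
    first_context = context_tokens * full
    repeat_context = (requests - 1) * context_tokens * cached
    questions = requests * question_tokens * full
    return first_context + repeat_context + questions


# Illustrative rates: $5.00 input and $0.50 cached input per million tokens.
without_cache = input_cost_usd(200_000, 500, 100, input_price=5.00)
with_cache = input_cost_usd(200_000, 500, 100, input_price=5.00, cached_price=0.50)
print(f"without cache: ${without_cache:.2f}")
print(f"with cache:    ${with_cache:.2f}")
```

    Under these assumptions, 100 questions over a 200,000-token context come to about $100.25 of input without caching and about $11.15 with it.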

    Frequently asked questions

    Which LLM has the largest context window?

    Grok 4.1 Fast currently leads with a 2 million token window. Gemini models run just over 1 million, and Claude Sonnet 5, GPT-5.4, DeepSeek V4, GLM-5.2, and Qwen3.7 also offer million-token windows.

    How many words fit in a 200K context window?

    Roughly 150,000 English words — about 600 pages. A million-token window fits around 750,000 words: several full-length books, or a mid-sized codebase.
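
    The figures above imply two rules of thumb: about 0.75 English words per token and about 250 words per page. A small sketch of that conversion follows; the constants are back-derived from the numbers in this answer, not measured, and real ratios depend on the tokenizer, the language, and the page layout.

```python
WORDS_PER_TOKEN = 0.75  # implied by 200K tokens ~ 150,000 words
WORDS_PER_PAGE = 250    # implied by 150,000 words ~ 600 pages


def estimate_capacity(context_tokens):
    """Return (approximate words, approximate pages) for a context window."""
    words = context_tokens * WORDS_PER_TOKEN
    pages = words / WORDS_PER_PAGE
    return round(words), round(pages)


for window in (200_000, 1_000_000, 2_000_000):
    words, pages = estimate_capacity(window)
    print(f"{window:>9,} tokens ~ {words:,} words ~ {pages:,} pages")
```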

    Do models actually use the full window well?

    Not uniformly. Most models recall the start and end of a prompt better than the middle, and effective context is often smaller than the advertised maximum. For critical retrieval over huge inputs, test with your own data and consider chunking plus retrieval instead of one giant prompt.
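
    As a minimal illustration of "chunking plus retrieval", the sketch below splits a document into overlapping word windows and keeps only the chunks that share the most terms with the question. The function names, chunk sizes, and the term-overlap score are hypothetical stand-ins; a production setup would more likely use embeddings or a search index.

```python
import re


def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into overlapping chunks of at most chunk_size words."""
    step = chunk_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def top_chunks(chunks, query, k=3):
    """Rank chunks by how many distinct query terms they contain."""
    query_terms = set(re.findall(r"\w+", query.lower()))

    def score(chunk):
        return len(query_terms & set(re.findall(r"\w+", chunk.lower())))

    return sorted(chunks, key=score, reverse=True)[:k]


document = "alpha " * 300 + "the refund policy lasts thirty days " + "beta " * 300
best = top_chunks(chunk_words(document), "refund policy", k=1)
print(len(best[0].split()), "words sent instead of", len(document.split()))
```

    Only the selected chunks go into the prompt, which keeps the relevant passage away from the weakly recalled middle of a very long input.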

    How do I keep long-context costs down?

    Use cached input pricing: providers charge a fraction of the normal rate for re-sent, unchanged prefixes, which is exactly the shape of chatting over a large document or codebase. Structure prompts so the big static context comes first and only the question changes.
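
    The effect of prompt ordering can be sketched by measuring how much of two consecutive prompts is an identical prefix. This assumes, as the answer above suggests, that caching applies to unchanged prefixes; the helper names are hypothetical and the exact matching rules differ between providers.

```python
import os.path


def cache_friendly(context, question):
    """Static context first, changing question last."""
    return f"{context}\n\nQuestion: {question}"


def cache_hostile(context, question):
    """Changing question first, which breaks the shared prefix early."""
    return f"Question: {question}\n\n{context}"


def shared_prefix_chars(a, b):
    """Length of the longest common prefix of two prompts, in characters."""
    return len(os.path.commonprefix([a, b]))


context = "x" * 1000  # stand-in for a large document or codebase
friendly = shared_prefix_chars(
    cache_friendly(context, "What is A?"),
    cache_friendly(context, "Where is B?"),
)
hostile = shared_prefix_chars(
    cache_hostile(context, "What is A?"),
    cache_hostile(context, "Where is B?"),
)
print("context first: ", friendly, "shared characters")
print("question first:", hostile, "shared characters")
```

    With the context first, the whole static block is shared between requests; with the question first, the prompts diverge after a few characters and little or nothing can be reused.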


    © 2026 LLM Gateway. All rights reserved.