Long Context Models

Models with context windows of 200K tokens or more — up to 2M — for whole-codebase and multi-document workloads

Compare

152

Models

Providers

101

Vision Models

134

Tool-enabled

Free Models


NovitaAI	glm-5	$1.00	$3.20	$0.20
Z AI	glm-5	$1.00	$3.20	$0.20
Together AI	glm-5	$1.00	$3.20	—
Alibaba Cloud	glm-5	$0.57	$2.58	—
MiniMax	minimax-m2.5	$0.30	$1.20	$0.03
EmberCloud	minimax-m2.5	$0.20	$1.20	$0.04
Nebius AI	minimax-m2.5	$0.30	$1.20	—
NovitaAI	minimax-m2.5	$0.30	$1.20	$0.03
Together AI	minimax-m2.5	$0.30	$1.20	—
AWS Bedrock	claude-opus-4-6	$5.00	$25.00	$0.50
AWS Bedrock(eu)	claude-opus-4-6	$5.50	$27.50	$0.55
AWS Bedrock(eu-west-2)	claude-opus-4-6	$5.50	$27.50	$0.55
AWS Bedrock(au)	claude-opus-4-6	$5.50	$27.50	$0.55
AWS Bedrock(global)	claude-opus-4-6	$5.00	$25.00	$0.50
Anthropic	claude-opus-4-6	$5.00	$25.00	$0.50
AWS Bedrock(us)	claude-opus-4-6	$5.50	$27.50	$0.55
Vertex AI (Anthropic)	claude-opus-4-6	$5.00	$25.00	$0.50
Alibaba Cloud(cn-beijing)	kimi-k2.5	$0.57	$3.01	—
Moonshot AI	kimi-k2.5	$0.60	$3.00	$0.10
Alibaba Cloud	kimi-k2.5	$0.57	$3.01	—
EmberCloud	kimi-k2.5	$0.40	$1.98	$0.22
Nebius AI	kimi-k2.5	$0.50	$2.50	$0.02
DeepInfra	kimi-k2.5	$0.45	$2.25	$0.07
Alibaba Cloud(singapore)	qwen3-max-2026-01-23	$1.20	$6.00	$0.24
Alibaba Cloud(cn-beijing)	qwen3-max-2026-01-23	$0.36	$1.43	$0.07
Alibaba Cloud(us-virginia)	qwen3-max-2026-01-23	$0.36	$1.43	$0.07
Alibaba Cloud	qwen3-max-2026-01-23	$1.20	$6.00	$0.24
Alibaba Cloud(cn-beijing)	qwen3-vl-flash	$0.02	$0.21	$0.00
Alibaba Cloud(singapore)	qwen3-vl-flash	$0.05	$0.40	$0.01
Alibaba Cloud(us-virginia)	qwen3-vl-flash	$0.02	$0.21	$0.00
Alibaba Cloud	qwen3-vl-flash	$0.05	$0.40	$0.01
Alibaba Cloud	qwen3-vl-plus	$0.20	$1.60	$0.04
Alibaba Cloud(us-virginia)	qwen3-vl-plus	$0.14	$1.43	$0.03
Alibaba Cloud(cn-beijing)	qwen3-vl-plus	$0.14	$1.43	$0.03
Alibaba Cloud(singapore)	qwen3-vl-plus	$0.20	$1.60	$0.04
Alibaba Cloud	qwen3-coder-flash	$0.30	$1.50	$0.06
Alibaba Cloud(cn-beijing)	qwen3-coder-flash	$0.14	$0.57	$0.03
Alibaba Cloud(us-virginia)	qwen3-coder-flash	$0.14	$0.57	$0.03
Alibaba Cloud(singapore)	qwen3-coder-flash	$0.30	$1.50	$0.06
MiniMax	minimax-text-01	$0.20	$1.10	—
EmberCloud	glm-4.7-flash	$0.06	$0.40	$0.01
Z AI	glm-4.7-flashx	$0.07	$0.40	$0.01
ByteDance	seed-1-8-251228	$0.25	$2.00	$0.05
ByteDance	seed-1-6-flash-250715	$0.07	$0.30	$0.01
ByteDance	seed-1-6-250915	$0.25	$2.00	$0.05
ByteDance	seed-1-6-250615	$0.25	$2.00	$0.05
MiniMax	minimax-m2.1	$0.27	$1.10	—
NovitaAI	minimax-m2.1	$0.30	$1.20	$0.03
Vertex AI (OpenAI-compatible)	glm-4.7	$0.60	$2.20	—
Alibaba Cloud(cn-beijing)	glm-4.7	$0.43	$2.01	—

Every model on this page accepts at least 200,000 tokens of context — roughly 150,000 words — and the largest stretch much further: Grok 4.1 Fast at 2 million tokens, with Gemini, Claude Sonnet 5, GPT-5.4, DeepSeek V4, and GLM-5.2 at or above the million-token mark. That's enough to fit an entire codebase, a legal document set, or months of chat history into a single prompt.

Advertised size isn't everything: retrieval quality can degrade well before the window is full, and long prompts get expensive fast. Cached input pricing — shown in the list — matters more than the headline price when you re-send large contexts on every request.

Frequently asked questions

Which LLM has the largest context window?

Grok 4.1 Fast currently leads with a 2 million token window. Gemini models run just over 1 million, and Claude Sonnet 5, GPT-5.4, DeepSeek V4, GLM-5.2, and Qwen3.7 also offer million-token windows.

How many words fit in a 200K context window?

Roughly 150,000 English words — about 600 pages. A million-token window fits around 750,000 words: several full-length books, or a mid-sized codebase.

Do models actually use the full window well?

Not uniformly. Most models recall the start and end of a prompt better than the middle, and effective context is often smaller than the advertised maximum. For critical retrieval over huge inputs, test with your own data and consider chunking plus retrieval instead of one giant prompt.

How do I keep long-context costs down?

Use cached input pricing: providers charge a fraction of the normal rate for re-sent, unchanged prefixes, which is exactly the shape of chatting over a large document or codebase. Structure prompts so the big static context comes first and only the question changes.

Long Context Models

Models with context windows of 200K tokens or more — up to 2M — for whole-codebase and multi-document workloads

Compare

152

Models

Providers

101

Vision Models

134

Tool-enabled

Free Models


NovitaAI	glm-5	$1.00	$3.20	$0.20
Z AI	glm-5	$1.00	$3.20	$0.20
Together AI	glm-5	$1.00	$3.20	—
Alibaba Cloud	glm-5	$0.57	$2.58	—
MiniMax	minimax-m2.5	$0.30	$1.20	$0.03
EmberCloud	minimax-m2.5	$0.20	$1.20	$0.04
Nebius AI	minimax-m2.5	$0.30	$1.20	—
NovitaAI	minimax-m2.5	$0.30	$1.20	$0.03
Together AI	minimax-m2.5	$0.30	$1.20	—
AWS Bedrock	claude-opus-4-6	$5.00	$25.00	$0.50
AWS Bedrock(eu)	claude-opus-4-6	$5.50	$27.50	$0.55
AWS Bedrock(eu-west-2)	claude-opus-4-6	$5.50	$27.50	$0.55
AWS Bedrock(au)	claude-opus-4-6	$5.50	$27.50	$0.55
AWS Bedrock(global)	claude-opus-4-6	$5.00	$25.00	$0.50
Anthropic	claude-opus-4-6	$5.00	$25.00	$0.50
AWS Bedrock(us)	claude-opus-4-6	$5.50	$27.50	$0.55
Vertex AI (Anthropic)	claude-opus-4-6	$5.00	$25.00	$0.50
Alibaba Cloud(cn-beijing)	kimi-k2.5	$0.57	$3.01	—
Moonshot AI	kimi-k2.5	$0.60	$3.00	$0.10
Alibaba Cloud	kimi-k2.5	$0.57	$3.01	—
EmberCloud	kimi-k2.5	$0.40	$1.98	$0.22
Nebius AI	kimi-k2.5	$0.50	$2.50	$0.02
DeepInfra	kimi-k2.5	$0.45	$2.25	$0.07
Alibaba Cloud(singapore)	qwen3-max-2026-01-23	$1.20	$6.00	$0.24
Alibaba Cloud(cn-beijing)	qwen3-max-2026-01-23	$0.36	$1.43	$0.07
Alibaba Cloud(us-virginia)	qwen3-max-2026-01-23	$0.36	$1.43	$0.07
Alibaba Cloud	qwen3-max-2026-01-23	$1.20	$6.00	$0.24
Alibaba Cloud(cn-beijing)	qwen3-vl-flash	$0.02	$0.21	$0.00
Alibaba Cloud(singapore)	qwen3-vl-flash	$0.05	$0.40	$0.01
Alibaba Cloud(us-virginia)	qwen3-vl-flash	$0.02	$0.21	$0.00
Alibaba Cloud	qwen3-vl-flash	$0.05	$0.40	$0.01
Alibaba Cloud	qwen3-vl-plus	$0.20	$1.60	$0.04
Alibaba Cloud(us-virginia)	qwen3-vl-plus	$0.14	$1.43	$0.03
Alibaba Cloud(cn-beijing)	qwen3-vl-plus	$0.14	$1.43	$0.03
Alibaba Cloud(singapore)	qwen3-vl-plus	$0.20	$1.60	$0.04
Alibaba Cloud	qwen3-coder-flash	$0.30	$1.50	$0.06
Alibaba Cloud(cn-beijing)	qwen3-coder-flash	$0.14	$0.57	$0.03
Alibaba Cloud(us-virginia)	qwen3-coder-flash	$0.14	$0.57	$0.03
Alibaba Cloud(singapore)	qwen3-coder-flash	$0.30	$1.50	$0.06
MiniMax	minimax-text-01	$0.20	$1.10	—
EmberCloud	glm-4.7-flash	$0.06	$0.40	$0.01
Z AI	glm-4.7-flashx	$0.07	$0.40	$0.01
ByteDance	seed-1-8-251228	$0.25	$2.00	$0.05
ByteDance	seed-1-6-flash-250715	$0.07	$0.30	$0.01
ByteDance	seed-1-6-250915	$0.25	$2.00	$0.05
ByteDance	seed-1-6-250615	$0.25	$2.00	$0.05
MiniMax	minimax-m2.1	$0.27	$1.10	—
NovitaAI	minimax-m2.1	$0.30	$1.20	$0.03
Vertex AI (OpenAI-compatible)	glm-4.7	$0.60	$2.20	—
Alibaba Cloud(cn-beijing)	glm-4.7	$0.43	$2.01	—

Frequently asked questions

Which LLM has the largest context window?

Grok 4.1 Fast currently leads with a 2 million token window. Gemini models run just over 1 million, and Claude Sonnet 5, GPT-5.4, DeepSeek V4, GLM-5.2, and Qwen3.7 also offer million-token windows.

How many words fit in a 200K context window?

Roughly 150,000 English words — about 600 pages. A million-token window fits around 750,000 words: several full-length books, or a mid-sized codebase.

Long Context Models

Use Case

Capabilities

Provider

Input Price ($/M tokens)

Output Price ($/M tokens)

Context Size (tokens)

Frequently asked questions

Which LLM has the largest context window?

How many words fit in a 200K context window?

Do models actually use the full window well?

How do I keep long-context costs down?

Stay ahead of the curve

Support

Welcome!

Long Context Models

Use Case

Capabilities

Provider

Input Price ($/M tokens)

Output Price ($/M tokens)

Context Size (tokens)

Frequently asked questions

Which LLM has the largest context window?

How many words fit in a 200K context window?

Do models actually use the full window well?

How do I keep long-context costs down?

Stay ahead of the curve