Best Models for Translation

Multilingual models with strong translation quality across major and low-resource languages — compared by price and context

Provider	Model	Input Price ($/M tokens)	Output Price ($/M tokens)	Cached Input Price ($/M tokens)
AWS Bedrock(global)	claude-sonnet-5	$2.00	$10.00	$0.20
AWS Bedrock	claude-sonnet-5	$2.00	$10.00	$0.20
Anthropic	claude-sonnet-5	$2.00	$10.00	$0.20
AWS Bedrock(us)	claude-sonnet-5	$2.20	$11.00	$0.22
Cerebras	gemma-4-31b-it	$0.99	$1.49	—
NovitaAI	gemma-4-31b-it	$0.13	$0.38	—
DeepInfra	gemma-4-31b-it	$0.13	$0.38	—
Together AI	gemma-4-31b-it	$0.13	$0.38	—
Alibaba Cloud	qwen3.7-plus	$0.40	$1.60	$0.08
Alibaba Cloud(singapore)	qwen3.7-plus	$0.40	$1.60	$0.08
Alibaba Cloud	qwen3.7-max	$2.50	$7.50	$0.50
Granite	qwen3.7-max	$1.25 (was $2.50, 50% off)	$3.75 (was $7.50, 50% off)	$0.25 (was $0.50, 50% off)
Alibaba Cloud(singapore)	qwen3.7-max	$2.50	$7.50	$0.50
Alibaba Cloud(cn-beijing)	qwen3.7-max	$1.72	$5.17	$0.34
NovitaAI	qwen3.7-max	$1.25	$3.75	$0.13
Alibaba Cloud(singapore)	deepseek-v4-flash	$0.20	$0.40	$0.04
DeepSeek	deepseek-v4-flash	$0.14	$0.28	$0.00
DeepInfra	deepseek-v4-flash	$0.14	$0.28	$0.03
NovitaAI	deepseek-v4-flash	$0.14	$0.28	$0.03
Alibaba Cloud(cn-beijing)	deepseek-v4-flash	$0.14	$0.28	$0.03
Alibaba Cloud	deepseek-v4-flash	$0.20	$0.40	$0.04
Tundra	kimi-k2.6	$0.40	$2.20	$0.08
Together AI	kimi-k2.6	$1.20	$4.50	$0.20
CanopyWave	kimi-k2.6	$0.50	$2.80	$0.10
NovitaAI	kimi-k2.6	$0.95	$4.00	$0.16
Moonshot AI	kimi-k2.6	$0.95	$4.00	$0.16
OpenAI	gpt-5.4-mini	$0.75	$4.50	$0.07
Azure	gpt-5.4-mini	$0.75	$4.50	$0.07
Azure	gpt-5.4	$2.50	$15.00	$0.25
OpenAI	gpt-5.4	$2.50	$15.00	$0.25
Mistral AI	mistral-large-2512	$0.50	$1.50	—
Quartz	gemini-3.1-pro-preview	$2.00	$12.00	$0.20
Google AI Studio	gemini-3.1-pro-preview	$2.00	$12.00	$0.20
Google Vertex AI	gemini-3.1-pro-preview	$2.00	$12.00	$0.20
ByteDance	seed-1-8-251228	$0.25	$2.00	$0.05
AWS Bedrock(apac)	claude-haiku-4-5	$1.00	$5.00	$0.10
AWS Bedrock(global)	claude-haiku-4-5	$1.00	$5.00	$0.10
AWS Bedrock(jp)	claude-haiku-4-5	$1.10	$5.50	$0.11
Vertex AI (Anthropic)	claude-haiku-4-5	$1.00	$5.00	$0.10
AWS Bedrock(us)	claude-haiku-4-5	$1.10	$5.50	$0.11
AWS Bedrock(au)	claude-haiku-4-5	$1.10	$5.50	$0.11
AWS Bedrock(eu)	claude-haiku-4-5	$1.10	$5.50	$0.11
Anthropic	claude-haiku-4-5	$1.00	$5.00	$0.10
AWS Bedrock	claude-haiku-4-5	$1.00	$5.00	$0.10
Google Vertex AI	gemini-2.5-flash-lite	$0.10	$0.40	$0.01
Google AI Studio	gemini-2.5-flash-lite	$0.10	$0.40	$0.01
Nebius AI	qwen3-235b-a22b-instruct-2507	$0.20	$0.60	—
Vertex AI (OpenAI-compatible)	qwen3-235b-a22b-instruct-2507	$0.22	$0.88	—
Cerebras	qwen3-235b-a22b-instruct-2507	$0.60	$1.20	—
NovitaAI	qwen3-235b-a22b-instruct-2507	$0.09	$0.58	—

Modern LLMs now rival dedicated translation engines for most language pairs — and beat them on context awareness, tone, terminology consistency, and formatting. The strongest multilingual models are Google's Gemini line, OpenAI's GPT-5.4, Anthropic's Claude, and Alibaba's Qwen, which is particularly strong on Chinese and other Asian languages.

Long context windows also change how translation work gets done: instead of translating strings in isolation, you can put an entire document plus a glossary into one prompt and keep terminology consistent throughout. For bulk workloads, budget models like Gemini Flash-Lite and DeepSeek V4 Flash bring the cost per translated word down to fractions of a cent.

Frequently asked questions

What is the best LLM for translation?

Gemini 3.1 Pro and GPT-5.4 deliver the most consistent quality across a broad set of language pairs. Qwen3.7 Max is a top pick for Chinese, Japanese, and Korean, and Claude Sonnet 5 excels when tone and nuance matter. For bulk work, Gemini 2.5 Flash-Lite and DeepSeek V4 Flash offer the best cost per word.

Are LLMs better than Google Translate or DeepL?

For most content, yes — LLMs follow style guides, preserve formatting and placeholders, keep terminology consistent across a document, and adapt register on request. Dedicated engines still win on raw speed and per-character price for very simple, high-volume strings.

How do I translate long documents?

Use a long-context model and send the whole document in one call. A million-token window holds roughly 750,000 English words, and single-call translation keeps names and terminology consistent. Check the model's output limit too, since many models cap a single response well below their context size. If a document exceeds either limit, chunk it and include a running glossary in each prompt.
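
The chunk-and-glossary approach can be sketched as follows. This is a minimal illustration, not a production pipeline: `translate` and `extract_terms` are hypothetical caller-supplied functions (the first wraps whatever model API you use, the second proposes new glossary entries from a finished chunk), and the token estimate is a crude word count rather than a real tokenizer.

```python
from typing import Callable, Dict, List, Optional


def estimate_tokens(text: str) -> int:
    """Rough estimate at ~0.75 words per token (an assumption, not a tokenizer)."""
    return int(len(text.split()) / 0.75) + 1


def chunk_paragraphs(document: str, max_tokens: int) -> List[str]:
    """Group paragraphs into chunks whose estimated size stays within max_tokens.

    A single paragraph larger than max_tokens is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for para in (p.strip() for p in document.split("\n\n")):
        if not para:
            continue
        size = estimate_tokens(para)
        # Start a new chunk when this paragraph would push us over the budget.
        if current and current_tokens + size > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += size
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_prompt(chunk: str, glossary: Dict[str, str], target_lang: str) -> str:
    """Prepend the glossary so every chunk is translated with the same terms."""
    terms = "\n".join(f"- {src} => {tgt}" for src, tgt in glossary.items())
    return (
        f"Translate the text below into {target_lang}.\n"
        f"Use these glossary terms exactly:\n{terms or '(none yet)'}\n\n"
        f"Text:\n{chunk}"
    )


def translate_document(
    document: str,
    glossary: Dict[str, str],
    target_lang: str,
    translate: Callable[[str], str],  # hypothetical: prompt -> translation
    extract_terms: Optional[Callable[[str, str], Dict[str, str]]] = None,
    max_tokens: int = 2000,
) -> str:
    """Translate chunk by chunk, carrying a running glossary between prompts."""
    running = dict(glossary)  # copy so the caller's glossary is not mutated
    translated: List[str] = []
    for chunk in chunk_paragraphs(document, max_tokens):
        result = translate(build_prompt(chunk, running, target_lang))
        translated.append(result)
        if extract_terms is not None:
            # Hypothetical hook: add term pairs found in this chunk.
            running.update(extract_terms(chunk, result))
    return "\n\n".join(translated)
```

Splitting on paragraph boundaries keeps sentences intact, and passing the glossary in every prompt is what carries terminology decisions from one chunk to the next.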

Which models handle low-resource languages best?

Coverage drops for languages with little training data. Gemini Pro and GPT-5.4 generally hold up best, but always test with your actual language pair before committing volume — with one API key you can run the same text through several models in minutes and compare.
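
A side-by-side test like the one described above can be sketched as a small harness. The `translate` callable is hypothetical: it stands in for whatever client you use to send a prompt to a named model, and nothing here assumes a particular provider API.

```python
from typing import Callable, Dict, List


def compare_models(
    text: str,
    target_lang: str,
    models: List[str],
    translate: Callable[[str, str], str],  # hypothetical: (model, prompt) -> output
) -> Dict[str, str]:
    """Send the same prompt to each model and collect the outputs by model name."""
    prompt = f"Translate into {target_lang}:\n\n{text}"
    results: Dict[str, str] = {}
    for model in models:
        try:
            results[model] = translate(model, prompt)
        except Exception as exc:
            # Record the failure and keep going so one provider cannot block the run.
            results[model] = f"<error: {exc}>"
    return results


if __name__ == "__main__":
    # Stub in place of a real API client, for illustration only.
    def fake_translate(model: str, prompt: str) -> str:
        return f"[{model}] translation of {len(prompt)} prompt characters"

    outputs = compare_models(
        "Hello, world.", "Swahili", ["gemini-3.1-pro-preview", "gpt-5.4"], fake_translate
    )
    for name, output in outputs.items():
        print(f"{name}: {output}")
```

Using one identical prompt for every model keeps the comparison fair; a native speaker or a held-out reference translation is still needed to judge the results.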
