New Alibaba Qwen Models: Qwen3 Next, Max, Plus, Flash, Vision

Added support for qwen-max, qwen-max-latest, qwen-plus-latest, qwen-flash, qwen-vl-max, qwen-vl-plus and the new Qwen3 Next 80B A3B Instruct and Thinking models.

September 11, 2025

Alibaba Qwen models now available on LLM Gateway

We’ve added support for the latest Alibaba Qwen models through our unified API. These models bring strong reasoning, speed-focused variants, and multimodal (vision) capabilities.

🧠 New Qwen3 Next Generation Models

qwen3-next-80b-a3b-instruct — next-gen 80B instruct model

Pricing: Input $0.50 / 1M tokens, Output $2.00 / 1M tokens
Context: 129,024 tokens, Max output: 32,768 tokens
Capabilities: Streaming, Tools, JSON output Try in Chat Playground

qwen3-next-80b-a3b-thinking — next-gen reasoning model

Pricing: Input $0.50 / 1M tokens, Output $6.00 / 1M tokens
Context: 131,072 tokens, Max output: 32,768 tokens
Capabilities: Streaming, Reasoning, Tools, JSON output Try in Chat Playground

🚀 New Text Models

qwen-max — flagship performance

Pricing: Input $1.60 / 1M tokens, Output $6.40 / 1M tokens
Context: 131,072 tokens, Max output: 32,000 tokens
Capabilities: Streaming, Vision, Tools, JSON output Try in Chat Playground

qwen-max-latest — rolling latest for Max:

Pricing: Input $1.60 / 1M tokens, Output $6.40 / 1M tokens
Context: 131,072 tokens, Max output: 32,000 tokens
Capabilities: Streaming, Vision, Tools, JSON output Try in Chat Playground

qwen-plus-latest — balanced performance and cost

Pricing: Input $0.40 / 1M tokens, Output $1.20 / 1M tokens
Context: 1,000,000 tokens, Max output: 32,000 tokens
Capabilities: Streaming, Tools, JSON output Try in Chat Playground

qwen-flash — fast, cost‑efficient responses

Pricing: Input $0.05 / 1M tokens, Output $0.40 / 1M tokens
Context: 1,000,000 tokens, Max output: 32,000 tokens
Capabilities: Streaming, Tools, JSON output Try in Chat Playground

👀 New Vision Models

qwen-vl-max — high‑end multimodal (vision + text)

Pricing: Input $0.80 / 1M tokens, Output $3.20 / 1M tokens
Context: 131,072 tokens, Max output: 32,000 tokens
Capabilities: Streaming, Vision, JSON output Try in Chat Playground

qwen-vl-plus — balanced multimodal

Pricing: Input $0.21 / 1M tokens, Output $0.64 / 1M tokens
Context: 131,072 tokens, Max output: 32,000 tokens
Capabilities: Streaming, Vision, JSON output Try in Chat Playground

Use these models immediately via our OpenAI‑compatible endpoint — no extra setup required. Explore pricing and capabilities on the Models page.

New Alibaba Qwen Models: Qwen3 Next, Max, Plus, Flash, Vision

🧠 New Qwen3 Next Generation Models

🚀 New Text Models

👀 New Vision Models

Stay ahead of the curve

Support

Welcome!