Vertex AI (OpenAI-compatible) Provider

Access partner models (e.g. xAI Grok) via Google Cloud Vertex AI's OpenAI-compatible Chat Completions endpoint.

Data & Privacy

HQ:US
API Training:No
Consumer Training:No
Prompt Logging:No
Retention:0 days
GDPR:Compliant
SOC2:Certified
ISO 27001:Certified

Available Models

Grok 4.20 Reasoning

xai
grok-4-20-reasoning
Streaming
Vision
Tools
Reasoning
JSON Output
Vertex AI (OpenAI-compatible)
Context: 2M
Input
$1.25
/M tokens
Cached
$0.2
/M tokens
Output
$2.5
/M tokens

Grok 4.20 Non-Reasoning

xai
grok-4-20-non-reasoning
Streaming
Vision
Tools
JSON Output
Vertex AI (OpenAI-compatible)
Context: 2M
Input
$1.25
/M tokens
Cached
$0.2
/M tokens
Output
$2.5
/M tokens

GLM-5

zai
glm-5
Streaming
Tools
Reasoning
JSON Output
Vertex AI (OpenAI-compatible)
Context: 202.8k
Input
$1
/M tokens
Cached
$0.1
/M tokens
Output
$3.2
/M tokens

GLM-4.7

zai
glm-4.7
Streaming
Tools
Reasoning
JSON Output
Vertex AI (OpenAI-compatible)
Context: 202.8k
Input
$0.6
/M tokens
Cached
/M tokens
Output
$2.2
/M tokens

Kimi K2 Thinking

moonshot
kimi-k2-thinking
Streaming
Tools
Reasoning
JSON Output
Vertex AI (OpenAI-compatible)
Context: 262.1k
Input
$0.6
/M tokens
Cached
$0.06
/M tokens
Output
$2.5
/M tokens

DeepSeek V3.2

deepseek
deepseek-v3.2
Streaming
Tools
JSON Output
Vertex AI (OpenAI-compatible)
Context: 163.8k
Input
$0.56
/M tokens
Cached
$0.056
/M tokens
Output
$1.68
/M tokens

Qwen3 Next 80B A3B Thinking

alibaba
qwen3-next-80b-a3b-thinking
Streaming
Tools
Reasoning
Vertex AI (OpenAI-compatible)
Context: 131.1k
Input
$0.15
/M tokens
Cached
/M tokens
Output
$1.2
/M tokens

Qwen3 Next 80B A3B Instruct

alibaba
qwen3-next-80b-a3b-instruct
Streaming
Tools
JSON Output
Vertex AI (OpenAI-compatible)
Context: 131.1k
Input
$0.15
/M tokens
Cached
/M tokens
Output
$1.2
/M tokens

Qwen3 235B A22B Instruct 2507

alibaba
qwen3-235b-a22b-instruct-2507
Streaming
Tools
JSON Output
Vertex AI (OpenAI-compatible)
Context: 262.1k
Input
$0.22
/M tokens
Cached
/M tokens
Output
$0.88
/M tokens

Qwen3 Coder 480B A35B Instruct

alibaba
qwen3-coder-480b-a35b-instruct
Streaming
Tools
JSON Output
Vertex AI (OpenAI-compatible)
Context: 262.1k
Input
$0.22
/M tokens
Cached
$0.022
/M tokens
Output
$1.8
/M tokens