Stay up to date with the latest features, improvements, and fixes in LLM Gateway.

Multi-Region Routing, Content Filters & More

Route requests to regional providers, protect your apps with built-in content moderation, enforce API key rate limits, and explore new models.
Read more about Multi-Region Routing, Content Filters & More →

Generate videos via the API, track conversations with sessions, and more, plus new models and providers.
Read more about Video Generation, Sessions & More →

Access OpenAI's most capable models: GPT-5.4 for complex professional work and GPT-5.4 Pro for smarter, more precise responses, with 1.05M context windows and reasoning support.

Read more about GPT-5.4 and GPT-5.4 Pro Now Available →

A dedicated Image Studio in the Playground for gallery-based generation with multi-model comparison, an OpenAI-compatible /v1/images/edits endpoint, and a wave of image generation improvements.
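As a sketch of how an OpenAI-compatible images-edits endpoint could be called, the helper below builds the multipart request without sending it. The base URL and field names follow the OpenAI images-edits convention; treat them as assumptions and check the LLM Gateway docs for the authoritative values.

```typescript
// Sketch: build an OpenAI-style images-edits request for the gateway.
// The base URL below is an assumption for illustration, not a documented value.
const GATEWAY_IMAGE_EDITS_URL = "https://api.llmgateway.io/v1/images/edits";

function buildImageEditRequest(apiKey: string, prompt: string, image: Blob) {
  const form = new FormData();
  form.append("prompt", prompt);            // the edit instruction
  form.append("image", image, "input.png"); // the source image to edit
  return {
    url: GATEWAY_IMAGE_EDITS_URL,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Send it with fetch once you have a real key:
// const { url, init } = buildImageEditRequest(apiKey, "add a blue sky", imageBlob);
// const res = await fetch(url, init);
```

Separating request construction from sending keeps the shape easy to inspect and test before any network call.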
Read more about Image Studio, Image Edits API & More →

When a provider fails, LLM Gateway now automatically retries your request on another provider. Every attempt is logged with full routing visibility, so you always know what happened.
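The retry logic lives server-side in the gateway, but the core idea is straightforward to sketch. The snippet below illustrates the pattern (try providers in order, record every attempt, surface the full trail); it is not the gateway's actual implementation, and the provider shape is hypothetical.

```typescript
// Illustrative sketch of provider fallback: try each provider in order,
// recording every attempt, and return the first success with the full trail.
type Attempt = { provider: string; ok: boolean; error?: string };

async function completeWithFallback(
  providers: { name: string; call: () => Promise<string> }[],
): Promise<{ result: string; attempts: Attempt[] }> {
  const attempts: Attempt[] = [];
  for (const p of providers) {
    try {
      const result = await p.call();
      attempts.push({ provider: p.name, ok: true });
      return { result, attempts }; // first provider that succeeds wins
    } catch (e) {
      // Record the failed attempt and fall through to the next provider.
      attempts.push({ provider: p.name, ok: false, error: String(e) });
    }
  }
  throw new Error(`all providers failed: ${JSON.stringify(attempts)}`);
}
```

Returning the attempt list alongside the result is what gives callers the "full routing visibility" the entry describes.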
Read more about Automatic Retry & Fallback with Full Routing Transparency →

Build AI-powered applications faster with pre-built agents, production-ready templates, and a new CLI tool for scaffolding projects.

Read more about AI Agent skills, Agents, Templates & CLI →

A new unified reasoning object gives precise control over reasoning models. Specify exact token budgets with max_tokens or use effort levels, all in one consistent API.
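A minimal sketch of what such a request body could look like. Only `reasoning`, `max_tokens`, and `effort` come from the entry above; the effort level names and everything else here are assumptions for illustration.

```typescript
// Sketch: two ways to configure reasoning through one unified field,
// per the unified reasoning object described above. Effort level names
// are assumed, not confirmed values.
type Reasoning =
  | { max_tokens: number }                      // exact token budget
  | { effort: "low" | "medium" | "high" };      // effort level instead

function chatRequest(model: string, prompt: string, reasoning: Reasoning) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    reasoning, // same field regardless of which style you pick
  };
}

// Either style goes through the same field:
const budgeted = chatRequest("vendor/reasoning-model", "Plan a migration.", { max_tokens: 2048 });
const byEffort = chatRequest("vendor/reasoning-model", "Plan a migration.", { effort: "high" });
```

Modeling the two styles as a union type makes it a compile-time error to mix a token budget and an effort level in one request.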
Read more about Unified Reasoning Configuration →

Ship faster with Dev Plans: AI-powered development planning, now in beta. Plus native web search for real-time data, a MiniMax provider, structured outputs for Anthropic & Perplexity, and a redesigned models experience.

Read more about Dev Plans, Native Web Search, and MiniMax Provider →

Track all organization activity with comprehensive audit logs. See who did what, when, and to which resource. Available for Enterprise customers.

Read more about Enterprise Audit Logs →

Protect your LLM usage with content guardrails. Detect and block prompt injections, PII, secrets, and more. Available for Enterprise customers.

Read more about Enterprise Guardrails →

We're simplifying our pricing. All Pro plan features are now free for everyone: BYOK, team management, 30-day data retention, and more.
Read more about Pro Features Now Free for Everyone →

Introducing Alibaba Cloud's Qwen Image model family, powerful models for text-to-image generation and image editing, now available in four variants: Qwen Image, Qwen Image Max, Qwen Image Max 2025-12-30, and Qwen Image Plus.

Read more about Alibaba Cloud Qwen Image Models: Advanced Image Generation and Editing →

A new Cerebras provider with six high-performance models, including GPT-OSS 120B and Qwen 3, now available through LLM Gateway.

Read more about Cerebras: Ultra-Fast Inference with 6 New Models →

Google's latest Gemini 3 Pro Preview is now available with an exclusive 20% launch discount, featuring a 1M context window and prompt caching.

Read more about Gemini 3 Pro Preview: 20% Off Launch Discount →

Introducing Sherlock Dash Alpha and Sherlock Think Alpha (Grok 4.1): free stealth models with 1.8M context, reasoning, vision, and advanced capabilities.
Read more about Sherlock: Two New Stealth Alpha Models →

CanopyWave brings Kimi K2 Thinking to LLM Gateway with an exclusive 75% discount.

Read more about CanopyWave: 75% Off Kimi K2 Thinking →

CanopyWave brings Qwen3 Coder, MiniMax M2, and GLM-4.6 to LLM Gateway with an exclusive 75% discount on all three models.

Read more about CanopyWave: 3 New Models with 75% Off →

Added support for Moonshot AI's Kimi K2 Thinking model with a 262K context window, advanced reasoning capabilities, and prompt caching for cost-effective thinking tasks.

Read more about Kimi K2 Thinking Model Support →

Save 10% on all Z.ai models and 20% on all Google models through LLM Gateway.

Read more about Z.ai 10% Off & Google 20% Off All Models →

Added native support for AWS Bedrock, Google Vertex AI, and Microsoft Azure.

Read more about AWS Bedrock, Google Vertex AI and Microsoft Azure →

Invite teammates, assign roles (Owner, Admin, Developer), track included seats, and add more seats as your org grows. Pro includes team management; Enterprise adds SSO/SAML, SCIM, audit logs, and advanced permissions.

Read more about Team Members: Roles, Seats, and Access Controls →

An exclusive partnership with CanopyWave brings a massive 90% discount on DeepSeek v3.1, making advanced reasoning capabilities more accessible than ever.
Read more about CanopyWave Partnership: 90% Off DeepSeek v3.1 →

Added support for Anthropic's Claude Sonnet 4.5.

Read more about Claude Sonnet 4.5 Model Support →

Added support for the Grok 4 Fast Reasoning and Grok 4 Fast Non-Reasoning models via the xAI provider.

Read more about Grok 4 Fast Models: Flagship and Fast Variants Now Available →

Added support for qwen-max, qwen-max-latest, qwen-plus-latest, qwen-flash, qwen-vl-max, qwen-vl-plus, and the new Qwen3 Next 80B A3B Instruct and Thinking models.
Read more about New Alibaba Qwen Models: Qwen3 Next, Max, Plus, Flash, Vision →

Configure Claude Code to use any LLM model through LLM Gateway's unified API with a simple environment variable setup.

Read more about Claude Code Configuration Now Supported →

Access Alibaba's powerful Qwen3 Max model with a 256K context window, advanced reasoning capabilities, vision support, and function calling, all at competitive pricing.

Read more about Qwen3 Max Model Now Available →

Generate stunning images with Google's Gemini 2.5 Flash Image Preview, our first image generation model, with a 32.8k context window and competitive pricing.

Read more about Introducing Our First Image Generation Model: Gemini 2.5 Flash Image Preview →

Added support for DeepSeek's latest v3.1 model with a 128K context window and competitive pricing for advanced reasoning capabilities.
Read more about New Models Directory & Free Llama 3.1 70B via CloudRift →

Set individual credit limits for API keys to better control spending and prevent unexpected overages.

Read more about API Key Usage Limits & Credit Controls →

Get instant access to OpenAI's powerful new GPT-5 model family, including gpt-5, gpt-5-mini, gpt-5-nano, and gpt-5-chat-latest, with 400k context windows.

Read more about GPT-5 Model Family Now Available →

Released v2.0 of our @llmgateway/ai-sdk-provider npm package with improved Vercel AI SDK integration and simplified model access.

Read more about AI SDK Provider v2.0 Released →

Added support for Claude 4.1 models, including claude-opus-4-1, claude-opus-4-20250514, and claude-sonnet-4-20250514, via the Anthropic provider.

Read more about Claude 4.1 Models: Opus and Sonnet Now Available →

Added support for the GPT-OSS-120B and GPT-OSS-20B models via Groq, offering powerful open-source alternatives with extensive context windows and competitive pricing.
Read more about New GPT-OSS Models: 120B and 20B via Groq →

We've moved from TanStack Start to Next.js. Here's why it matters.

Read more about Next.js migration →

Added support for the Cloudrift, Moonshot AI, and Novita AI providers, all offering the powerful kimi-k2 model with extensive context windows and competitive pricing.
Read more about New Providers: Cloudrift, Moonshot AI and Novita AI Support →

Added support for the Groq and xAI providers with their latest models; plus, credits are now always visible in the sidebar for easy access.

Read more about New Providers: Groq and xAI Support + Always-Visible Credits →

Introducing organizations and projects for clearer controls and statistics.

Read more about Dashboard UI Improvements & Project Context →

Bring your own LLM provider keys or use credits with reduced gateway fees (2.5% vs 5%). Includes premium analytics, higher rate limits, and priority email support.
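To put the fee difference in concrete terms, here is the arithmetic implied by the rates quoted above (2.5% reduced vs 5% standard); the $1,000 usage figure is just an example.

```typescript
// Fee comparison using the rates quoted above: 2.5% (reduced) vs 5% (standard).
const REDUCED_FEE_RATE = 0.025;
const STANDARD_FEE_RATE = 0.05;

function gatewayFee(usageUsd: number, rate: number): number {
  return usageUsd * rate;
}

// On $1,000 of usage, the reduced rate halves the fee:
const reducedFee = gatewayFee(1000, REDUCED_FEE_RATE);   // $25
const standardFee = gatewayFee(1000, STANDARD_FEE_RATE); // $50
const savings = standardFee - reducedFee;                // $25 saved
```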
Read more about Pro Subscription Launch →

New and improved self-hosting documentation for teams and enterprises looking to deploy LLM Gateway on their own infrastructure.

Read more about Self-Hosting Just Got Easier →

Massive savings on Deepseek models and the arrival of Mistral models for all users. Discover new performance benchmarks at lower costs.

Read more about Deepseek Discount + Mistral Joins the Lineup →

The unified AI gateway is here! Access 30+ models from 8 providers through one OpenAI-compatible API with transparent pricing and powerful analytics.
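Because the API is OpenAI-compatible, a standard chat-completions call works with plain fetch. The base URL and model id below are assumptions for illustration; substitute the values from your LLM Gateway dashboard.

```typescript
// Sketch: OpenAI-compatible chat completions via plain fetch.
// The base URL and model id are assumptions, not documented values.
const GATEWAY_CHAT_URL = "https://api.llmgateway.io/v1/chat/completions";

// Pure helper: builds the OpenAI-style request body.
function chatPayload(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
  };
}

async function chat(apiKey: string, model: string, prompt: string): Promise<string> {
  const res = await fetch(GATEWAY_CHAT_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(chatPayload(model, prompt)),
  });
  if (!res.ok) throw new Error(`gateway error: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content; // OpenAI-style response shape
}
```

Since the request and response shapes match OpenAI's, existing OpenAI client libraries should also work by pointing their base URL at the gateway.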
Read more about LLM Gateway v1.0 Launch →