Speech Generation + Audio Studio

Text-to-speech is live: nine models from ElevenLabs, OpenAI, and Gemini behind the OpenAI-compatible /v1/audio/speech endpoint, plus a new Audio Studio in the Playground to compare voices side by side.

June 10, 2026

Speech generation and Audio Studio on LLM Gateway

LLM Gateway can now talk. Text-to-speech ships today with nine models across three providers — all behind the same API key, billing, and logs as your chat, image, and video requests.

The `/v1/audio/speech` Endpoint

A drop-in replacement for OpenAI's audio API — point your existing OpenAI client at the gateway and you're done:

1curl -X POST "https://api.llmgateway.io/v1/audio/speech" \2  -H "Authorization: Bearer $LLM_GATEWAY_API_KEY" \3  -H "Content-Type: application/json" \4  -d '{5    "model": "eleven-multilingual-v2",6    "input": "Hello, welcome to LLM Gateway!",7    "voice": "Sarah"8  }' \9  --output speech.mp3

1curl -X POST "https://api.llmgateway.io/v1/audio/speech" \2  -H "Authorization: Bearer $LLM_GATEWAY_API_KEY" \3  -H "Content-Type: application/json" \4  -d '{5    "model": "eleven-multilingual-v2",6    "input": "Hello, welcome to LLM Gateway!",7    "voice": "Sarah"8  }' \9  --output speech.mp3

Supports voice, response_format (mp3, wav, opus, aac, flac, pcm — varies by family), style instructions, and speed. See the speech generation docs for the full reference.

ElevenLabs Joins the Gateway

A brand-new provider with four models and 20 named voices:

eleven-multilingual-v2 — most lifelike, 29 languages · $0.11 / 1K characters
eleven-v3 — most expressive, 70+ languages · $0.11 / 1K characters
eleven-flash-v2-5 — ultra-low latency, 32 languages · $0.055 / 1K characters
eleven-turbo-v2-5 — fast and balanced · $0.055 / 1K characters

They join OpenAI's tts-1, tts-1-hd, and steerable gpt-4o-mini-tts, plus gemini-2.5-flash-preview-tts and gemini-2.5-pro-preview-tts — 60+ prebuilt voices in total. Browse them all on the models page.

Audio Studio

The Playground gets a third studio at /audio, joining Image and Video:

Compare mode — generate the same script with up to 3 models in parallel and listen side by side
Per-model controls — voice, output format, playback speed, and style instructions adapt to each model family
Org-scoped history — every generation is saved; revisit, rename, or delete past takes
One-click download straight from the player

Try Audio Studio →

Read the announcement → | Speech docs →

Speech Generation + Audio Studio

The `/v1/audio/speech` Endpoint

ElevenLabs Joins the Gateway

Audio Studio

Stay ahead of the curve

Support

Welcome!

The /v1/audio/speech Endpoint

ElevenLabs Joins the Gateway

Audio Studio

The `/v1/audio/speech` Endpoint