Real-time reliability for every provider serving Gemini 2.5 Flash Preview Thinking (04-17) on LLM Gateway. Compare success rates, time-to-first-token, throughput, and error breakdown across 2 providers so you can pick the fastest, most stable route for your workload.
Each card shows live traffic from the last 4 hours. Switch tabs to inspect requests, errors, latency, or token volume.
Every Gemini 2.5 Flash Preview Thinking (04-17) request flowing through LLM Gateway is scored on uptime, latency, and throughput. When an upstream provider degrades, traffic shifts to the next-best healthy endpoint without any client-side changes.
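Score-based failover routing can be sketched as follows. This is a hypothetical illustration, not LLM Gateway's actual implementation: the `Provider` schema, the `score` weighting, and the health flag are all assumptions made for the example.

```python
# Hypothetical sketch of score-based failover routing: pick the best healthy
# upstream, so a degraded provider is skipped without client-side changes.

from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    uptime: float   # fraction of successful requests, 0..1
    ttft_ms: float  # median time to first token, in milliseconds
    healthy: bool   # circuit-breaker-style health flag

def score(p: Provider) -> float:
    # Illustrative weighting: favor high uptime, penalize high latency.
    return p.uptime - p.ttft_ms / 10_000

def pick_provider(providers: list[Provider]) -> Provider:
    candidates = [p for p in providers if p.healthy]
    if not candidates:
        raise RuntimeError("no healthy upstream providers")
    return max(candidates, key=score)

# Example numbers are invented for illustration only.
providers = [
    Provider("Google AI Studio", uptime=0.999, ttft_ms=450, healthy=True),
    Provider("Google Vertex AI", uptime=0.997, ttft_ms=380, healthy=True),
]
print(pick_provider(providers).name)
```

If the selected provider's health flag flips, the next call to `pick_provider` routes around it automatically, which is the behavior the paragraph above describes.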
Use this page to verify SLA performance, debug regressions, or pick a primary provider for a self-hosted deployment.
Uptime is the share of requests that completed successfully on the upstream provider over the last 4 hours. Client errors (4xx responses caused by your request) and gateway errors are excluded, so the number reflects the provider's reliability, not user error.
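The exclusion rules above can be made concrete with a small sketch. The request-record schema (`status`, `origin`) is invented for this example and is not LLM Gateway's actual data model.

```python
# Minimal sketch of the uptime calculation: 4xx client errors and
# gateway-originated errors are excluded before computing the success rate.

def uptime(requests: list[dict]) -> float:
    eligible = [
        r for r in requests
        if not (400 <= r["status"] < 500)   # exclude client (4xx) errors
        and r.get("origin") != "gateway"    # exclude gateway-side errors
    ]
    if not eligible:
        return 1.0
    ok = sum(1 for r in eligible if r["status"] == 200)
    return ok / len(eligible)

sample = [
    {"status": 200, "origin": "provider"},
    {"status": 200, "origin": "provider"},
    {"status": 500, "origin": "provider"},  # counts against the provider
    {"status": 400, "origin": "provider"},  # client error: excluded
    {"status": 502, "origin": "gateway"},   # gateway error: excluded
]
print(f"{uptime(sample):.2%}")  # → 66.67%
```

Only three of the five sample requests are eligible, and two succeeded, so the provider scores 66.67% rather than being penalized for the caller's bad request or the gateway's own fault.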
Gemini 2.5 Flash Preview Thinking (04-17) is currently served by 2 providers: Google AI Studio, Google Vertex AI. LLM Gateway routes requests to the best healthy provider in real time.
TTFT (time to first token) is the latency between sending the request and receiving the first streamed token. Lower TTFT means the model starts responding faster — critical for chat UIs and agent loops.
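Measuring TTFT on the client side amounts to timestamping the first chunk of a streaming response. A minimal sketch, using a stand-in generator instead of a real streaming HTTP call:

```python
# Sketch of client-side TTFT measurement: time from issuing the request
# until the first streamed chunk arrives.

import time

def measure_ttft(stream) -> float:
    """Return seconds elapsed until the first chunk is yielded by `stream`."""
    start = time.monotonic()
    for _chunk in stream:
        return time.monotonic() - start  # stop at the first token/chunk
    raise RuntimeError("stream ended without producing any output")

def fake_stream():
    # Stand-in for a streaming model response; the 50 ms delay simulates
    # upstream time-to-first-token.
    time.sleep(0.05)
    yield "Hello"
    yield " world"

ttft = measure_ttft(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms")
```

`time.monotonic()` is used rather than wall-clock time so the measurement is unaffected by system clock adjustments.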
Charts refresh every minute and aggregate the most recent 4 hours of traffic across all LLM Gateway projects. Data points are bucketed by minute.
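Bucketing by minute just means truncating each request timestamp to its minute and counting per bucket. A sketch of that aggregation (illustrative only, not the gateway's actual pipeline):

```python
# Sketch of minute-bucketed aggregation: truncate each timestamp to its
# minute, then count requests per bucket.

from collections import Counter
from datetime import datetime, timezone

def minute_bucket(ts: datetime) -> datetime:
    return ts.replace(second=0, microsecond=0)

# Invented timestamps for illustration.
timestamps = [
    datetime(2025, 4, 17, 12, 0, 5, tzinfo=timezone.utc),
    datetime(2025, 4, 17, 12, 0, 42, tzinfo=timezone.utc),
    datetime(2025, 4, 17, 12, 1, 3, tzinfo=timezone.utc),
]
counts = Counter(minute_bucket(t) for t in timestamps)
for bucket, n in sorted(counts.items()):
    print(bucket.isoformat(), n)  # requests in each one-minute window
```

The same truncate-and-group pattern works for any of the charted metrics; for latency you would aggregate values per bucket (e.g. a median) instead of counting.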