Qwen2.5 72B Instruct – Uptime & Latency

Name: Qwen2.5 72B Instruct provider uptime — last 4 hours
Creator: LLM Gateway
License: https://llmgateway.io/legal/terms

Real-time reliability for every provider serving Qwen2.5 72B Instruct on LLM Gateway. Review success rate, time to first token, throughput, and error breakdown for the 1 provider currently serving the model so you can judge whether the route is fast and stable enough for your workload.

Uptime (per provider): share of requests with no upstream error.

Latency (TTFT + duration): time to first token and total duration, in ms.

Throughput (tokens/sec): sustained generation speed across requests.

Errors (client / gateway / upstream): breakdown of failures by source.

Provider performance

Each card shows live traffic from the last 4 hours. Switch tabs to inspect requests, errors, latency, or token volume.

How LLM Gateway uses these metrics

Routing happens automatically — these charts show the data behind it.

Every Qwen2.5 72B Instruct request flowing through LLM Gateway is scored on uptime, latency, and throughput. When an upstream provider degrades, traffic shifts to the next-best healthy endpoint without any client-side changes.
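The failover behavior described above can be illustrated with a small sketch. The page does not publish the actual scoring formula, so the health threshold (`min_uptime`), the tie-break order (uptime, then TTFT, then throughput), and the provider names below are all hypothetical and are not LLM Gateway's real routing logic.

```python
from dataclasses import dataclass

@dataclass
class ProviderStats:
    name: str
    uptime: float          # share of successful requests, 0.0 to 1.0
    ttft_ms: float         # time to first token in milliseconds
    tokens_per_sec: float  # sustained generation speed

def pick_provider(stats, min_uptime=0.95):
    """Return the name of the best healthy provider, or None if none is healthy.

    Assumption: "healthy" means uptime >= min_uptime, and providers are ranked
    by higher uptime, then lower TTFT, then higher throughput.
    """
    healthy = [s for s in stats if s.uptime >= min_uptime]
    if not healthy:
        return None
    best = max(healthy, key=lambda s: (s.uptime, -s.ttft_ms, s.tokens_per_sec))
    return best.name

# Hypothetical providers, for illustration only
providers = [
    ProviderStats("provider-a", uptime=0.990, ttft_ms=400.0, tokens_per_sec=50.0),
    ProviderStats("provider-b", uptime=0.999, ttft_ms=600.0, tokens_per_sec=40.0),
    ProviderStats("provider-c", uptime=0.800, ttft_ms=100.0, tokens_per_sec=100.0),
]
first_choice = pick_provider(providers)

# Simulate provider-b degrading: traffic shifts to the next-best healthy endpoint
providers[1].uptime = 0.50
fallback_choice = pick_provider(providers)
```

In this sketch the client never names a provider; only the stats change, which is the sense in which failover needs no client-side changes.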

Use this page to verify SLA performance, debug regressions, or pick a primary provider for a self-hosted deployment.

Frequently asked questions

How is Qwen2.5 72B Instruct uptime measured?

Uptime is the share of requests that completed successfully on the upstream provider over the last 4 hours. Client errors (4xx from your request) and gateway errors are excluded so the number reflects the provider's reliability, not user errors.
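As a rough illustration of this definition, the sketch below computes uptime from a list of request records. The `RequestRecord` type and its `error_source` field are hypothetical, and the sketch assumes that "excluded" means client and gateway errors are dropped from both the numerator and the denominator.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RequestRecord:
    # None for a successful request, else "client", "gateway", or "upstream"
    error_source: Optional[str] = None

def upstream_uptime(records):
    """Share of counted requests that had no upstream error.

    Client and gateway errors are skipped entirely (assumed interpretation),
    so only successes and upstream failures are counted.
    """
    counted = [r for r in records if r.error_source in (None, "upstream")]
    if not counted:
        return None  # no eligible traffic in the window
    successes = sum(1 for r in counted if r.error_source is None)
    return successes / len(counted)

# 8 successes, 1 client error, 1 gateway error, 1 upstream error
sample = (
    [RequestRecord() for _ in range(8)]
    + [RequestRecord("client"), RequestRecord("gateway"), RequestRecord("upstream")]
)
uptime = upstream_uptime(sample)  # 8 successes out of 9 counted requests
```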

Which providers serve Qwen2.5 72B Instruct?

Qwen2.5 72B Instruct is currently served by 1 provider: Nebius AI. LLM Gateway routes requests to the best healthy provider in real time.

What is TTFT and why does it matter?

TTFT (time to first token) is the latency between the request and the first streamed token. Lower TTFT means the model starts responding faster — critical for chat UIs and agent loops.
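A minimal way to measure TTFT on the client side is to time a streamed response as it is consumed. The helper below is a sketch, not LLM Gateway's own instrumentation: `measure_stream` is a hypothetical name, it accepts any iterable of tokens, and it assumes throughput is counted over the time after the first token arrives.

```python
import time

def measure_stream(token_iter, clock=time.perf_counter):
    """Consume a token stream and return (ttft_ms, duration_ms, tokens_per_sec).

    ttft_ms is None if the stream produced no tokens.
    """
    start = clock()
    first_token_at = None
    count = 0
    for _ in token_iter:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # the first streamed token has arrived
        count += 1
    end = clock()

    duration_ms = (end - start) * 1000.0
    if first_token_at is None:
        return None, duration_ms, 0.0

    ttft_ms = (first_token_at - start) * 1000.0
    generation_seconds = end - first_token_at
    tokens_per_sec = count / generation_seconds if generation_seconds > 0 else 0.0
    return ttft_ms, duration_ms, tokens_per_sec

# Deterministic demo with a fake clock: request at 0.0 s, tokens at 0.5, 1.0
# and 1.5 s, stream closed at 2.5 s
ticks = iter([0.0, 0.5, 1.0, 1.5, 2.5])
ttft_ms, duration_ms, tokens_per_sec = measure_stream(
    iter(["Hello", ",", " world"]), clock=lambda: next(ticks)
)
```

In real use, `token_iter` would be the streaming iterator returned by whichever client library is in use, and the default `time.perf_counter` clock would apply.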

How often does this page update?

Charts refresh every minute and aggregate the most recent 4 hours of traffic across all LLM Gateway projects. Data points are bucketed by minute.
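The per-minute bucketing over a 4-hour window can be sketched as follows. The event shape (a Unix timestamp in seconds plus a success flag) and the function name are assumptions made for illustration; the page does not describe the actual storage format.

```python
from collections import defaultdict

WINDOW_SECONDS = 4 * 60 * 60  # the most recent 4 hours
BUCKET_SECONDS = 60           # one data point per minute

def bucket_by_minute(events, now):
    """Group (unix_ts_seconds, ok) events into per-minute buckets.

    Returns {bucket_start_ts: (total_requests, successful_requests)} for
    events inside the window that ends at `now`.
    """
    cutoff = now - WINDOW_SECONDS
    buckets = defaultdict(lambda: [0, 0])
    for ts, ok in events:
        if ts < cutoff or ts > now:
            continue  # outside the 4-hour window
        bucket_start = int(ts // BUCKET_SECONDS) * BUCKET_SECONDS
        buckets[bucket_start][0] += 1
        buckets[bucket_start][1] += int(ok)
    return {start: tuple(counts) for start, counts in sorted(buckets.items())}

now = 100_000
events = [
    (99_990, True),
    (99_995, False),         # same minute as the event above
    (99_900, True),
    (now - 5 * 3600, True),  # older than 4 hours, dropped
]
series = bucket_by_minute(events, now)
```

With at most 240 buckets per window, each chart point is cheap to recompute on a one-minute refresh.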