Llama 3.2 3B Instruct

Compact Llama 3.2 3B for efficient inference.

Model ID: llama-3.2-3b-instruct
Status: Stable
Context window: 32,768 tokens
Input: from $0.03 per million tokens
Output: from $0.05 per million tokens
Features: streaming, JSON output


All Providers for Llama 3.2 3B Instruct

LLM Gateway routes each request to the best provider that can handle your prompt size and parameters.
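Gateways of this kind typically expose an OpenAI-compatible chat completions interface. A minimal sketch of a request body for this model, assuming that shape (the exact endpoint and auth scheme are not specified on this page):

```python
import json

def build_chat_request(prompt: str, stream: bool = False, json_mode: bool = False) -> str:
    """Build an OpenAI-compatible chat completions request body.

    The request shape is an assumption; only the model ID and the
    streaming/JSON-output capabilities come from this page.
    """
    body = {
        "model": "llama-3.2-3b-instruct",  # model ID from this page
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # streaming is listed as supported
    }
    if json_mode:
        # JSON output is listed as supported
        body["response_format"] = {"type": "json_object"}
    return json.dumps(body)

payload = build_chat_request("Summarize Llama 3.2 3B in one sentence.")
```

POSTing this body to the gateway's chat completions endpoint with your API key would then let the router pick a provider.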

NovitaAI
- Context: 32,768 tokens
- Input: $0.03 per million tokens
- Cached input: not listed
- Output: $0.05 per million tokens
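At the listed rates, per-request cost is straightforward to estimate. A small sketch using the NovitaAI prices above (rates are per million tokens; token counts are hypothetical inputs):

```python
INPUT_RATE = 0.03 / 1_000_000   # $ per input token, from the NovitaAI listing
OUTPUT_RATE = 0.05 / 1_000_000  # $ per output token, from the NovitaAI listing

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in dollars for one request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a full 32,768-token context plus a 1,000-token reply
cost = estimate_cost(32_768, 1_000)
```

Even a maximal-context request stays around a tenth of a cent at these rates.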