Support

AI-powered help

Welcome!

Please introduce yourself before we start.

    LLM Gateway
    • Docs
    • Pricing
    • Pricing
    • Docs
    • Models
      • AI Gateway
      • DevPass
      • Chat Playground
      • Observability
      • Enterprise
      • Blog
      • Changelog
      • Integrations
      • Reliability
      • Guardrails
      • Providers
      • Apps
      • Models
      • Model Timeline
      • Compare
      • Token Cost Calculator
      • Referral Program
      • MCP Server
      • Agents
      • AI SDK Provider
      • Agent Skills
      • Templates
      • Guides
    1.4k
    Log InGet Started

    Best Models for Roleplay

    Models with strong character consistency, creative prose, and long context windows — compared by price and context size

    Compare

    Use Case

    Capabilities

    Provider

    Input Price ($/M tokens)

    Output Price ($/M tokens)

    Context Size (tokens)

    20
    Models
    44
    Providers
    10
    Vision Models
    17
    Tool-enabled
    0
    Free Models
    Features
    AWS Bedrock(global)
    claude-sonnet-5
    $2.00$10.00$0.20
    AWS Bedrock
    claude-sonnet-5
    $2.00$10.00$0.20
    Anthropic
    claude-sonnet-5
    $2.00$10.00$0.20
    AWS Bedrock(us)
    claude-sonnet-5
    $2.20$11.00$0.22
    Granite
    glm-5.2
    $1.40$1.12
    -20% off
    $4.40$3.52
    -20% off
    $0.26$0.21
    -20% off
    Z AI
    glm-5.2
    $1.40$4.40$0.26
    EmberCloud
    glm-5.2
    $1.26$3.96$0.23
    MiniMax
    minimax-m3
    $0.60$2.40$0.12
    AWS Bedrock(jp)
    claude-opus-4-8
    $5.50$27.50$0.55
    AWS Bedrock(us)
    claude-opus-4-8
    $5.50$27.50$0.55
    Anthropic
    claude-opus-4-8
    $5.00$25.00$0.50
    AWS Bedrock
    claude-opus-4-8
    $5.00$25.00$0.50
    AWS Bedrock(global)
    claude-opus-4-8
    $5.00$25.00$0.50
    AWS Bedrock(au)
    claude-opus-4-8
    $5.50$27.50$0.55
    AWS Bedrock(eu)
    claude-opus-4-8
    $5.50$27.50$0.55
    AWS Bedrock(global)
    grok-4-3
    $1.25$2.50$0.20
    AWS Bedrock(us)
    grok-4-3
    $1.38$2.75$0.22
    xAI
    grok-4-3
    $1.25$2.50$0.31
    AWS Bedrock(us-west-2)
    grok-4-3
    $1.38$2.75$0.22
    AWS Bedrock
    grok-4-3
    $1.25$2.50$0.20
    Azure AI Foundry
    grok-4-3
    $1.25$2.50$0.20
    Alibaba Cloud(singapore)
    deepseek-v4-flash
    $0.20$0.40$0.04
    DeepSeek
    deepseek-v4-flash
    $0.14$0.28$0.00
    DeepInfra
    deepseek-v4-flash
    $0.14$0.28$0.03
    NovitaAI
    deepseek-v4-flash
    $0.14$0.28$0.03
    Alibaba Cloud(cn-beijing)
    deepseek-v4-flash
    $0.14$0.28$0.03
    Alibaba Cloud
    deepseek-v4-flash
    $0.20$0.40$0.04
    DeepSeek
    deepseek-v4-pro
    $0.43$0.87$0.00
    Alibaba Cloud(singapore)
    deepseek-v4-pro
    $2.40$4.80$0.20
    Together AI
    deepseek-v4-pro
    $1.74$3.48$0.20
    Alibaba Cloud(cn-beijing)
    deepseek-v4-pro
    $1.65$3.30$0.14
    Alibaba Cloud
    deepseek-v4-pro
    $2.40$4.80$0.20
    DeepInfra
    deepseek-v4-pro
    $1.74$3.48$0.14
    Tundra
    kimi-k2.6
    $0.40$2.20$0.08
    Together AI
    kimi-k2.6
    $1.20$4.50$0.20
    CanopyWave
    kimi-k2.6
    $0.50$2.80$0.10
    NovitaAI
    kimi-k2.6
    $0.95$4.00$0.16
    Moonshot AI
    kimi-k2.6
    $0.95$4.00$0.16
    Mistral AI
    mistral-small-2506
    $0.10$0.30—
    Mistral AI
    mistral-large-2512
    $0.50$1.50—
    Vertex AI (OpenAI-compatible)
    glm-5
    $1.00$3.20$0.10
    EmberCloud
    glm-5
    $0.72$2.30$0.14
    Alibaba Cloud(cn-beijing)
    glm-5
    $0.57$2.58—
    Nebius AI
    glm-5
    $1.00$3.20—
    NovitaAI
    glm-5
    $1.00$3.20$0.20
    Z AI
    glm-5
    $1.00$3.20$0.20
    Together AI
    glm-5
    $1.00$3.20—
    Alibaba Cloud
    glm-5
    $0.57$2.58—
    Alibaba Cloud(cn-beijing)
    kimi-k2.5
    $0.57$3.01—
    Moonshot AI
    kimi-k2.5
    $0.60$3.00$0.10
    Page 1 of 2

    A good roleplay model needs three things: prose that stays in character over hundreds of messages, a context window large enough to hold character cards and long chat histories, and per-token pricing that doesn't punish long sessions. This page lists the models the roleplay community actually uses — from budget favorites like DeepSeek and GLM to premium options like Claude — with live pricing and context sizes for every provider.

    Every model here is available through the same OpenAI-compatible endpoint, so you can plug LLM Gateway into SillyTavern, RisuAI, or your own frontend with one API key, switch models mid-conversation, and fall back automatically when a provider has an outage.

    Frequently asked questions

    What is the best AI model for roleplay?

    It depends on your budget. DeepSeek V4 and GLM-5 are the best value for money and rarely break character, Kimi K2.6 is known for expressive creative prose, and Claude Opus 4.8 and Claude Sonnet 5 write the highest-quality prose if you're willing to pay premium per-token rates. Grok's non-reasoning models are a popular fast middle ground.

    Can I use these models with SillyTavern or my own frontend?

    Yes. LLM Gateway exposes an OpenAI-compatible chat completions API, so any frontend that supports a custom base URL — SillyTavern, RisuAI, Agnai, or your own app — works by pointing it at the gateway and using your LLM Gateway API key.

    Which roleplay models have the largest context windows?

    Grok 4.1 Fast supports up to 2 million tokens, and Claude Sonnet 5, GLM-5.2, DeepSeek V4, and MiniMax Text-01 all reach 1 million tokens. That's enough to keep an entire long-running roleplay, including character cards and lorebooks, in context.

    How much does API roleplay cost compared to a subscription?

    Usually less. A typical roleplay exchange runs a few thousand tokens, so on a model like DeepSeek V4 Flash (about $0.14 per million input tokens) even heavy daily use costs a fraction of a fixed chatbot subscription — and you only pay for what you use.

    Newsletter

    Stay ahead of the curve

    Join developers who get weekly insights on LLM routing, new model launches, and cost optimization — straight to their inbox.

    • New models & providers as they drop
    • Tips to cut latency & costs
    • Early access to beta features

    No spam. Unsubscribe anytime.

    All systems operational
    AICPA SOC for Service Organizations badgeSOC 2 Type II
    compliant

    Product

    • Features
    • Models
    • Providers
    • Chat Playground
    • Changelog
    • DevPass
    • Compare Models
    • Enterprise

    Resources

    • Apps
    • Templates
    • Agents
    • MCP Server
    • Use Cases
    • Blog
    • Documentation
    • Integrations
    • Guides
    • Brand Assets
    • Token Cost Calculator
    • Referral Program
    • GitHub
    • Contact Us

    Community

    • Twitter
    • Discord

    Compliance

    • Trust Center
    • Security Portal
    • Terms
    • Privacy Policy
    • GDPR
    • SOC 2 Type II
    • Status

    Compare

    • OpenRouter
    • LiteLLM
    • Portkey
    • Migration Guides

    Models

    • Text Generation
    • Text to Image
    • Image to Image
    • Video Generation
    • Embeddings
    • Vision
    • Reasoning
    • Tool Calling
    • Web Search
    • Discounted
    • Best for Roleplay
    • Best for Coding
    • Best for Creative Writing
    • Best for Translation
    • Best for Math
    • Long Context
    • Cheapest
    • Open Source

    Providers

    • OpenAI
    • Anthropic
    • Google AI Studio
    • Glacier
    • Granite
    • Google Vertex AI
    • Vertex AI (OpenAI-compatible)
    • Vertex AI (Anthropic)
    • Quartz
    • Avalanche
    • Groq
    • Cerebras
    • xAI
    • DeepSeek
    • Alibaba Cloud
    • NovitaAI
    • AtlasCloud
    • AWS Bedrock
    • Azure
    • Azure AI Foundry
    • Z AI
    • Moonshot AI
    • Perplexity
    • Nebius AI
    • Mistral AI
    • CanopyWave
    • Inference.net
    • Together AI
    • Custom
    • NanoGPT
    • ByteDance
    • MiniMax
    • EmberCloud
    • Sakana AI
    • Tundra
    • Xiaomi
    • DeepInfra
    • Reve
    • ElevenLabs

    © 2026 LLM Gateway. All rights reserved.