DeepTokenInference Gateway
    Pay-as-you-go balance

    Add funds for metered model usage

    Recharge balance for API calls across routed providers. Small top-ups stay simple; production spend can move into invoices, commitments, and volume terms.

    Custom recharge rules: $10 self-serve minimum, up to $5,000 before enterprise review.

    Metered usage
    Every request is deducted by model, provider, and usage unit.
    Auto recharge
    Set threshold and refill amount before traffic spikes.
    Budget controls
    Control spend by org, project, member, and API key.
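As a rough illustration of the auto recharge option above, the sketch below deducts one request's cost and tops the balance up whenever it falls below a threshold. The `AutoRecharge` class, its field names, and the `charge` function are hypothetical and not part of any documented API; the dollar amounts are made up for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class AutoRecharge:
    """Hypothetical auto-recharge rule; field names are illustrative only."""
    threshold: Decimal      # refill when the balance drops below this value
    refill_amount: Decimal  # amount added by each refill


def charge(balance: Decimal, cost: Decimal, rule: AutoRecharge) -> tuple[Decimal, int]:
    """Deduct one request's cost, then refill until the balance reaches the threshold.

    Returns the new balance and the number of refills that were triggered.
    """
    if rule.refill_amount <= 0:
        raise ValueError("refill_amount must be positive")
    balance -= cost
    refills = 0
    while balance < rule.threshold:
        balance += rule.refill_amount
        refills += 1
    return balance, refills


# Assumed example values: refill $50.00 whenever the balance drops below $5.00.
rule = AutoRecharge(threshold=Decimal("5.00"), refill_amount=Decimal("50.00"))
new_balance, refills = charge(Decimal("5.20"), Decimal("0.35"), rule)
print(new_balance, refills)
```

Using `Decimal` rather than floats keeps currency arithmetic exact, which matters when many small per-request deductions accumulate.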
    Starter

    API testing and small experiments.

    $10
    prepaid balance
    Builder

    Individual development and early prototypes.

    $50
    prepaid balance
    Default
    Scale

    Early production traffic with predictable balance.

    $200
    prepaid balance
    Growth

    Team production workloads and finance review.

    $1,000
    prepaid balance
    Business

    Invoice, discount, and customer-success path.

    $5,000+
    prepaid balance
    Commit

    Annual commitment, SLA, and dedicated capacity.

    Custom
    prepaid balance

    Everything included in your account

    All platform features come standard, with no subscription fees. Just add credits and go.

    Unified Inference API

    One OpenAI-compatible endpoint for every model and provider.

    Routing & Fallback

    Multi-provider routing with health-aware automatic fallback.

    Usage & Billing

    Token-accurate metering with clean cost attribution.

    Budget Controls

    Hard caps and soft alerts by key, member, and project.

    Team & Organizations

    Shared balance pools, member roles, and org policies.

    Webhooks & Automation

    Order, subscription, and payout events with HMAC signing.

    Audit Logs

    Per-request logs with append-only audit trail.

    Support

    Email support included. Priority and SLA for committed spend.
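The webhook feature listed above mentions HMAC signing without naming the hash function or the signature format. A minimal verification sketch, assuming HMAC-SHA256 over the raw request body with a hex-encoded signature (both are assumptions, as are the secret value and the event payload), might look like this:

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Return True if the signature matches, using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), signature.encode())


# Hypothetical secret and payload, for illustration only.
secret = b"whsec_example"
body = b'{"event":"order.paid"}'
signature = sign(secret, body)
print(verify(secret, body, signature))
```

Verifying against the raw bytes received, rather than a re-serialized JSON object, avoids mismatches caused by key ordering or whitespace. The constant-time comparison avoids leaking how many leading characters of a guessed signature were correct.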

    Model pricing

    Model usage stays transparent

    The final catalog should be server-owned and show the current platform price per model, provider, and metering unit. These rows show the intended structure.

    Prices can change as providers change their rates. The billing ledger should record the price used for each request.
    Example model price grid
    Model                     Provider               Input          Cached input   Output         Unit
    GPT-4.1 mini              OpenAI-compatible      $0.40          $0.10          $1.60          per 1M tokens
    Claude Sonnet class       Anthropic-compatible   $3.00          $0.30          $15.00         per 1M tokens
    DeepSeek reasoning class  Routed provider        model based    model based    model based    per token / request
    Image / video models      Specialized providers  request based  N/A            request based  per image, video, or GPU second
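To make the per-1M-token rates in the example price grid above concrete, here is a small sketch that computes one request's cost and keeps a snapshot of the price used, in the spirit of the ledger note. The model keys, the `request_cost` function, and the ledger entry shape are hypothetical. It also assumes that cached and uncached input tokens are counted separately, which the page does not state.

```python
from decimal import Decimal

# Example rates from the price grid, in USD per 1M tokens.
# The dictionary keys are illustrative identifiers, not official model names.
PRICES = {
    "gpt-4.1-mini": {
        "input": Decimal("0.40"),
        "cached_input": Decimal("0.10"),
        "output": Decimal("1.60"),
    },
    "claude-sonnet-class": {
        "input": Decimal("3.00"),
        "cached_input": Decimal("0.30"),
        "output": Decimal("15.00"),
    },
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, cached_input_tokens: int,
                 output_tokens: int) -> dict:
    """Build a hypothetical ledger entry holding the cost and the price snapshot."""
    price = PRICES[model]
    cost = (
        input_tokens * price["input"]
        + cached_input_tokens * price["cached_input"]
        + output_tokens * price["output"]
    ) / TOKENS_PER_UNIT
    # Copy the price so later catalog changes do not alter past entries.
    return {"model": model, "price_snapshot": dict(price), "cost_usd": cost}


# 12,000 uncached input, 8,000 cached input, and 1,500 output tokens.
entry = request_cost("gpt-4.1-mini", 12_000, 8_000, 1_500)
print(entry["cost_usd"])
```

Storing the price snapshot alongside each entry means a later rate change does not retroactively alter what an earlier request cost.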
    FAQ

    Billing questions

    Answers about balance, model usage, auto recharge, budgets, and enterprise commitments.