TensorLoop

Rate limits & budgets

How budgets reset, how rate limits are bucketed, and what happens when you exceed either.

Rate limits

Every key has a requests-per-minute (RPM) cap. The bucket is a rolling 60-second window. Going over the cap returns:

HTTP 429 Too Many Requests

Free keys cap at 30 RPM; pro keys can go up to 300 RPM. Set the cap lower than the plan max to limit the blast radius if a key leaks.
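To stay under the cap instead of discovering it through 429s, a client can keep its own rolling window. The following is a minimal sketch, not TensorLoop code: the class name SlidingWindowLimiter and its parameters are hypothetical, and it assumes the server counts requests over the trailing 60 seconds in roughly the same way.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `rpm` calls per rolling window."""

    def __init__(self, rpm, window_s=60.0, clock=time.monotonic):
        self.rpm = rpm
        self.window_s = window_s
        self.clock = clock      # injectable so the limiter can be tested
        self._sent = deque()    # timestamps of calls still inside the window

    def try_acquire(self):
        """Record the call and return True if it fits under the cap."""
        now = self.clock()
        # Drop timestamps that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.rpm:
            self._sent.append(now)
            return True
        return False
```

Call try_acquire() before each request and skip or delay the request when it returns False. Because client and server clocks differ, setting rpm slightly below the key's real cap leaves some headroom.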

Rate limits are enforced server-side by the upstream LiteLLM proxy. TensorLoop does not queue rejected requests, so retry locally with exponential backoff.
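A retry loop with exponential backoff might look like the sketch below. The send callable is a hypothetical stand-in for whatever HTTP client you use; it is assumed to return a (status_code, body_text) tuple. The sketch also assumes a budget rejection can be recognized by the word "budget" in the response body, which this page does not confirm, so adjust that check to match the responses you actually see.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call `send()` until it stops returning a rate-limit 429.

    `send` is a hypothetical zero-argument callable that performs one
    request and returns a (status_code, body_text) tuple.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Assumption: a budget 429 mentions "budget" in its body. Waiting a
        # few seconds will not help there, so return it to the caller.
        if "budget" in body.lower():
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; fall through and return the last 429
        # Exponential backoff with full jitter: sleep 0..min(cap, base * 2^n).
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
    return status, body
```

The jitter spreads retries out so that several clients sharing one key do not all retry at the same instant.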

Budgets

Every key has a USD budget that resets every 30 days. The budget tracks spend across all calls made with that key, regardless of model.

When a key's spend reaches its budget:

  1. The next call returns 429 Budget exceeded.
  2. The key shows up in the dashboard as Budget exhausted (red badge).
  3. You can either wait for the 30-day budget window to reset, or mint a new key with a higher budget.

The 30-day window starts when the key is first minted, not on the first of the month.
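To estimate when a key's budget next resets, you can count 30-day windows forward from the mint time. This is a sketch under one assumption: that resets land on exact 30-day multiples of the mint timestamp. The function name next_budget_reset is hypothetical, and the dashboard remains the authoritative source.

```python
from datetime import datetime, timedelta, timezone

BUDGET_WINDOW = timedelta(days=30)


def next_budget_reset(minted_at, now):
    """Return the first 30-day boundary strictly after `now`."""
    if now < minted_at:
        raise ValueError("now is earlier than the mint time")
    # Whole windows that have fully elapsed since the key was minted.
    elapsed_windows = (now - minted_at) // BUDGET_WINDOW
    return minted_at + (elapsed_windows + 1) * BUDGET_WINDOW


minted = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
now = datetime(2024, 2, 20, 0, 0, tzinfo=timezone.utc)
print(next_budget_reset(minted, now))  # 2024-03-10 12:00:00+00:00
```

A key minted on 10 January therefore resets on 9 February and 10 March, not on the first of each month.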

Watching spend

The dashboard surfaces spend three ways:

  • Per key — the Budget column in the API keys table shows spend / cap and a fill bar.
  • Per day — the Analytics page shows a 30-day timeseries.
  • Per call — the Activity page lists the most recent calls with token counts and cost.

Spend is denominated in USD and rounded to four decimals.

Plan ceilings vs key settings

A free user can mint a key with a $2 budget and 10 RPM, well below the plan max. The plan numbers are only upper bounds:

Setting          Free max   Pro max
Budget per key   $5         $100
RPM per key      30         300
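If you mint keys from a script, you can check requested settings against these ceilings before sending anything. The sketch below hard-codes the values from the table; PLAN_CEILINGS and check_key_settings are hypothetical names, not part of any TensorLoop client.

```python
# Ceilings copied from the table above.
PLAN_CEILINGS = {
    "free": {"budget_usd": 5.0, "rpm": 30},
    "pro": {"budget_usd": 100.0, "rpm": 300},
}


def check_key_settings(plan, budget_usd, rpm):
    """Return a list of problems; an empty list means the settings fit."""
    ceiling = PLAN_CEILINGS[plan]
    problems = []
    if budget_usd > ceiling["budget_usd"]:
        problems.append(
            f"budget ${budget_usd} exceeds {plan} max ${ceiling['budget_usd']}"
        )
    if rpm > ceiling["rpm"]:
        problems.append(f"{rpm} RPM exceeds {plan} max {ceiling['rpm']} RPM")
    return problems


print(check_key_settings("free", budget_usd=2, rpm=10))  # []
```

Any value at or below the ceiling passes, which matches the $2 and 10 RPM free-plan example above.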
