TensorLoop

Troubleshooting

Common failure modes — what the symptom looks like, what's actually wrong, and how to fix it.

If you don't see your problem here, check Errors for the full code table.

401 Unauthorized

Symptom: every call returns 401, even with what looks like a valid key.

Most likely:

  • Wrong header format. The header must be Authorization: Bearer <key>, with Bearer capitalized exactly as shown and exactly one space between Bearer and the key.
  • Key was revoked. Check the keys table in the dashboard — revoked keys disappear from it. Mint a new one.
  • Key expired. Default expiry is 90 days from mint. Expired keys also disappear from the dashboard.
  • You're calling the dashboard API with a bearer token. tensorloop.tech/api/* uses session cookies, not keys. You want litellm.tensorloop.tech/v1/* for inference.
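The header-format rule in the first bullet can be checked locally before any request is sent. The sketch below is a hypothetical helper (build_auth_header and auth_header_problems are illustrative names, not part of any TensorLoop SDK). It only catches formatting mistakes; it cannot tell whether a key was revoked or has expired.

```python
def build_auth_header(key: str) -> dict:
    """Return the Authorization header in the form described above."""
    # Strip stray whitespace or newlines picked up from env files or copy-paste.
    return {"Authorization": f"Bearer {key.strip()}"}


def auth_header_problems(value: str) -> list:
    """List formatting problems found in an Authorization header value."""
    problems = []
    scheme, sep, key = value.partition(" ")
    if scheme != "Bearer":
        problems.append("scheme must be exactly 'Bearer' (case-sensitive)")
    if not sep or not key:
        problems.append("missing key after 'Bearer '")
    elif key != key.strip() or " " in key:
        problems.append("exactly one space is allowed before the key")
    return problems
```

An empty list from auth_header_problems means the header is well formed, so the remaining suspects are a revoked key, an expired key, or the wrong host.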

403 Forbidden on a model that worked yesterday

Symptom: a key that has been calling gpt-4o-mini happily starts returning 403.

Almost always: the request is targeting a different model ID than the one in the key's allowlist. Open the keys table and confirm the exact model strings the key is scoped to. Model IDs are case- and dash-sensitive.
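Because matching is exact, a near-miss such as a changed letter case or a dropped dash is easy to overlook by eye. Here is a minimal sketch of a client-side check, assuming you have copied the key's allowlist out of the dashboard by hand. The function name and the similarity cutoff are illustrative choices.

```python
import difflib


def check_model_id(requested: str, allowlist: list) -> str:
    """Explain whether `requested` exactly matches an allowlisted model ID."""
    if requested in allowlist:
        return "ok"
    # Compare case-insensitively to surface IDs that differ only slightly.
    lowered = [m.lower() for m in allowlist]
    near = difflib.get_close_matches(requested.lower(), lowered, n=1, cutoff=0.8)
    if near:
        original = next(m for m in allowlist if m.lower() == near[0])
        return f"not allowed; did you mean {original!r}?"
    return "not allowed"
```

For example, check_model_id("GPT-4o-mini", ["gpt-4o-mini"]) reports a near-miss rather than "ok", which is the situation that produces the 403.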

If you're on free, the only allowed model is gpt-4o-mini. Free keys cannot call any other model regardless of what you type.

429 Too Many Requests

Two distinct causes return the same code:

  • RPM cap hit. You sent more requests in the last 60 seconds than the key's rate limit allows. Back off and retry: the limit is a rolling 60-second window, so capacity frees up as older requests age out.
  • Budget exhausted. The key has spent its USD ceiling for the current 30-day window. The dashboard will mark the key with a red Budget exhausted badge. Wait for the window to roll, or mint a new key with a higher budget.

The error body distinguishes them:

{ "error": { "type": "rate_limit_error", "message": "..." } }
{ "error": { "type": "budget_exceeded", "message": "..." } }
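Since both causes share the 429 status, client code has to branch on error.type. Here is a minimal sketch, assuming the body always has the shape shown above. The function names and the backoff schedule (doubling from one second, capped at 60 seconds) are illustrative assumptions, not TensorLoop recommendations.

```python
import json


def classify_429(body: str) -> str:
    """Return 'retry' for an RPM cap, 'stop' for an exhausted budget, else 'unknown'."""
    try:
        error_type = json.loads(body)["error"]["type"]
    except (ValueError, KeyError, TypeError):
        return "unknown"  # body was not JSON or lacked error.type
    if error_type == "rate_limit_error":
        return "retry"  # capacity returns as the rolling window moves on
    if error_type == "budget_exceeded":
        return "stop"  # retrying cannot succeed until the budget window rolls
    return "unknown"


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list:
    """Exponential backoff schedule in seconds (assumed values)."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Retrying on budget_exceeded only wastes requests, so a client would typically sleep through backoff_delays(...) for 'retry' and surface an error immediately for 'stop'.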

"My spend looks wrong"

Spend is reported by the upstream LiteLLM proxy after the request completes. A few things cause confusion:

  • Streaming completions charge for the full output, not the bytes you actually read. Aborting mid-stream does not reduce cost.
  • Failed requests don't charge. If the upstream model errored, you don't pay.
  • Rounding. Spend is stored to four decimal places, so a single call costing a small fraction of a cent (under $0.0001) may show as $0.0000 in Activity but still count toward the budget bucket.

Cross-check by looking at the Activity page, which lists per-call tokens and cost.
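The rounding effect is easier to see with numbers. The sketch below uses a made-up per-call cost, and it assumes half-up rounding for display and a budget that accumulates unrounded cost. Neither detail is confirmed here, so treat it as an illustration of why summing displayed values can undercount.

```python
from decimal import Decimal, ROUND_HALF_UP

FOUR_PLACES = Decimal("0.0001")


def displayed(cost: Decimal) -> Decimal:
    """Round a cost to four decimal places (rounding mode is an assumption)."""
    return cost.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


# Hypothetical per-call cost, chosen only to illustrate the effect.
per_call = Decimal("0.00004")
calls = 1000

shown_per_call = displayed(per_call)        # what one Activity row would show
sum_of_shown = shown_per_call * calls       # adding up the displayed rows
actual_total = displayed(per_call * calls)  # what the calls cost together

print(shown_per_call, sum_of_shown, actual_total)
```

Under these assumptions, each row shows 0.0000 and the rows sum to zero, while the thousand calls together cost 0.0400.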

Streams disconnect early

Symptom: SSE stream cuts off before data: [DONE].

Possible causes:

  • Intermediate proxy timeout. Some serverless platforms close long-running responses after an idle timeout. Raise the timeout in your runtime configuration, or move the call to a non-edge function.
  • Client-side fetch buffering. Native fetch in some environments buffers until headers flush. Use a streaming library (eventsource-parser for JS, the OpenAI SDK's stream=True for Python).
  • The model finished naturally. Check finish_reason: stop, length, and content_filter are all valid terminations.
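To tell a natural finish from a dropped connection, it helps to log the raw SSE lines and inspect how they end. Here is a minimal sketch, assuming the stream uses OpenAI-style chat completion chunks (a choices array with a finish_reason field). diagnose_stream is a hypothetical helper name.

```python
import json

VALID_FINISH_REASONS = {"stop", "length", "content_filter"}


def diagnose_stream(lines: list) -> str:
    """Classify how a captured SSE chat stream ended."""
    saw_done = False
    finish_reason = None
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        try:
            chunk = json.loads(payload)
        except ValueError:
            continue  # truncated chunk, likely cut off mid-line
        if not isinstance(chunk, dict):
            continue
        for choice in chunk.get("choices", []):
            finish_reason = choice.get("finish_reason") or finish_reason
    if finish_reason in VALID_FINISH_REASONS:
        return f"finished normally ({finish_reason})"
    if saw_done:
        return "ended with [DONE] but no recognised finish_reason"
    return "disconnected early"
```

A result of "disconnected early" points at the proxy or client causes above. A finish_reason of length means the output hit its token limit rather than the connection dropping.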

"OpenAI SDK can't find the model"

Symptom: openai.NotFoundError: model not found.

The OpenAI SDK does a GET /v1/models under the hood for some helpers. If your key's plan filter doesn't include the model, you get a 404 even though chat completions would have worked. Verify with curl:

curl https://litellm.tensorloop.tech/v1/models \
  -H "Authorization: Bearer $TENSORLOOP_KEY"

If the model isn't in the list, you're on the wrong plan or the wrong key.
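The same check can be scripted against the saved output of that curl command. This sketch assumes the response follows the OpenAI-compatible list shape, a top-level data array of objects that each carry an id. model_is_listed is an illustrative name.

```python
import json


def model_is_listed(models_response: str, model_id: str) -> bool:
    """Return True if `model_id` appears in a /v1/models response body."""
    # Assumed shape: {"object": "list", "data": [{"id": "..."}, ...]}
    data = json.loads(models_response).get("data", [])
    return any(entry.get("id") == model_id for entry in data)
```

A False result for the model you are requesting matches the NotFoundError symptom and points at the plan or key rather than at the SDK.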

Base URL mistakes

The single most common integration bug. The correct base URL is:

https://litellm.tensorloop.tech/v1

Wrong:

  • https://tensorloop.tech/v1 — that's the marketing site.
  • https://api.tensorloop.tech/v1 — no such host.
  • https://litellm.tensorloop.tech (no /v1) — the SDK will build request paths without the /v1 prefix, so calls go to the wrong routes.

In the OpenAI SDK, this is baseURL (JS) or base_url (Python).
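A startup check can catch these mistakes before the first request goes out. This is a minimal sketch using only the standard library. base_url_problem is a hypothetical helper, and treating a trailing slash as acceptable is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit

EXPECTED_HOST = "litellm.tensorloop.tech"


def base_url_problem(base_url: str) -> Optional[str]:
    """Describe what is wrong with `base_url`, or return None if it looks right."""
    parts = urlsplit(base_url.rstrip("/"))
    if parts.scheme != "https":
        return "scheme must be https"
    if parts.hostname != EXPECTED_HOST:
        return f"wrong host {parts.hostname!r}; expected {EXPECTED_HOST!r}"
    if parts.path != "/v1":
        return "path must be /v1"
    return None
```

Run it on whatever value you pass as baseURL or base_url. Each of the three wrong URLs listed above produces a message, and the correct one returns None.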

CORS errors in the browser

You should not call TensorLoop from the browser at all — your key would be exposed in DevTools. Proxy through your own backend.

If a browser front end genuinely needs TensorLoop (e.g. internal tooling on a private network), have it call a same-origin route handler that holds the key and forwards the request server-side. See the JavaScript examples for a Next.js route handler.

Still stuck

Open the Activity page and find the failing call. The cost and tokens columns confirm whether the request even reached the model, and the timestamp pinpoints when it failed. Then re-check Errors for the exact code.
