Troubleshooting
Common failure modes — what the symptom looks like, what's actually wrong, and how to fix it.
If you don't see your problem here, check Errors for the full code table.
401 Unauthorized
Symptom: every call returns 401, even with what looks like a valid key.
Most likely:
- Wrong header format. The header must be Authorization: Bearer <key>, case-sensitive on Bearer, with exactly one space.
- Key was revoked. Check the keys table in the dashboard; revoked keys disappear from it. Mint a new one.
- Key expired. Default expiry is 90 days from mint. Expired keys also disappear from the dashboard.
- You're calling the dashboard API with a bearer token. tensorloop.tech/api/* uses session cookies, not keys. You want litellm.tensorloop.tech/v1/* for inference.
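The header-format rule can be checked before a request ever leaves your code. A minimal sketch in Python: build_auth_header and looks_like_valid_auth_header are hypothetical helper names, not part of any TensorLoop SDK, and the check covers only the format described above. It cannot tell whether a key has been revoked or has expired.

```python
import re

# Format described above: the literal word "Bearer", exactly one space, then the key.
_AUTH_RE = re.compile(r"Bearer \S+")


def build_auth_header(key: str) -> dict:
    """Return the Authorization header for a key, stripping stray whitespace."""
    key = key.strip()
    if not key:
        raise ValueError("empty API key")
    return {"Authorization": f"Bearer {key}"}


def looks_like_valid_auth_header(value: str) -> bool:
    """True if value is 'Bearer <key>' with correct case and a single space."""
    return _AUTH_RE.fullmatch(value) is not None
```

Keys read from environment files often carry a trailing newline, which is why the builder strips whitespace before formatting the header.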
403 Forbidden on a model that worked yesterday
Symptom: a key that has been calling gpt-4o-mini happily starts returning 403.
Almost always: the request is targeting a different model ID than the one in the key's allowlist. Open the keys table and confirm the exact model strings the key is scoped to. Model IDs are case- and dash-sensitive.
If you're on the free plan, the only allowed model is gpt-4o-mini. Free keys cannot call any other model regardless of what you type.
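Because model IDs are case- and dash-sensitive, a near-miss spelling is a common cause. The sketch below compares a requested ID against an allowlist you copy by hand from the keys table. diagnose_model_id is a hypothetical helper, and the near-miss rule (ignore case and dashes) is an assumption chosen for illustration, not how the server matches IDs.

```python
def _normalize(model_id: str) -> str:
    """Lowercase and drop dashes, so near-miss spellings compare equal."""
    return model_id.lower().replace("-", "")


def diagnose_model_id(requested: str, allowlist: list[str]) -> str:
    """Explain whether `requested` matches an allowlist entry exactly."""
    if requested in allowlist:
        return "ok"
    near = [m for m in allowlist if _normalize(m) == _normalize(requested)]
    if near:
        return f"near miss: did you mean {near[0]!r}?"
    return "not in allowlist"
```

For example, diagnose_model_id("GPT-4o-mini", ["gpt-4o-mini"]) reports a near miss, while "gpt-4o" against the same list is simply not allowed.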
429 Too Many Requests
Two distinct causes return the same code:
- RPM cap hit. You sent more requests in the last 60 seconds than the key's rate limit allows. Back off and retry — the bucket is rolling.
- Budget exhausted. The key has spent its USD ceiling for the current 30-day window. The dashboard will mark the key with a red Budget exhausted badge. Wait for the window to roll, or mint a new key with a higher budget.
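Since both causes share one status code, client code has to branch on the error type in the response body: rate_limit_error is worth retrying, budget_exceeded is not. A minimal sketch, assuming the body is JSON with an error.type field. The helper names and the backoff parameters (1 second base, 60 second cap, full jitter) are illustrative choices, not documented TensorLoop values.

```python
import json
import random


def should_retry_429(body: str) -> bool:
    """Retry only when the 429 is a rate limit, not an exhausted budget."""
    try:
        error_type = json.loads(body)["error"]["type"]
    except (ValueError, KeyError, TypeError):
        return False  # unknown body shape: don't loop blindly
    return error_type == "rate_limit_error"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter; attempt counts from 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would sleep for backoff_delay(attempt) between tries while should_retry_429 returns True, and surface the error immediately otherwise.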
The error body distinguishes them:
{ "error": { "type": "rate_limit_error", "message": "..." } }
{ "error": { "type": "budget_exceeded", "message": "..." } }
"My spend looks wrong"
Spend is reported by the upstream LiteLLM proxy after the request completes. A few things cause confusion:
- Streaming completions charge for the full output, not the bytes you actually read. Aborting mid-stream does not reduce cost.
- Failed requests don't charge. If the upstream model errored, you don't pay.
- Rounding. Spend is stored to four decimal places, so a single call costing well under a hundredth of a cent may show as $0.0000 in Activity but still count toward the budget bucket.
Cross-check by looking at the Activity page, which lists per-call tokens and cost.
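The rounding effect is easiest to see with numbers. The sketch below uses a hypothetical per-call cost and assumes the budget accumulates costs at finer precision than the four decimal places shown per call; that accounting detail is an assumption made for illustration, not something this page confirms.

```python
from decimal import Decimal

FOUR_PLACES = Decimal("0.0001")


def displayed_cost(cost: Decimal) -> Decimal:
    """Round one call's cost to four decimal places, as a per-call display would."""
    return cost.quantize(FOUR_PLACES)


# Hypothetical per-call cost, chosen to fall below the display precision.
per_call = Decimal("0.00004")
calls = 1000

shown_per_call = displayed_cost(per_call)  # each call displays as 0.0000
sum_of_shown = shown_per_call * calls      # adding up displayed values gives zero
sum_of_actual = per_call * calls           # summing unrounded costs gives 0.04
```

Under these assumptions, a thousand calls that each display as $0.0000 still add up to four cents of spend.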
Streams disconnect early
Symptom: SSE stream cuts off before data: [DONE].
Possible causes:
- Intermediate proxy timeout. Some serverless platforms idle out long-running responses. Tune your runtime or use a non-edge function.
- Client-side fetch buffering. Native fetch in some environments buffers until headers flush. Use a streaming library (eventsource-parser for JS, the OpenAI SDK's stream=True for Python).
- The model finished naturally. Check finish_reason: stop, length, and content_filter are all valid terminations.
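To tell a transport cut-off from a natural finish, record both the last finish_reason and whether data: [DONE] arrived. A minimal sketch that reads already-decoded SSE lines, assuming the OpenAI-style chunk shape (choices[].delta.content and choices[].finish_reason). read_stream is a hypothetical helper, not part of any SDK.

```python
import json


def read_stream(lines):
    """Consume SSE lines; return (text, finish_reason, saw_done)."""
    text, finish_reason, saw_done = [], None, False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            saw_done = True
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            text.append(delta.get("content") or "")
            # Keep the last non-null finish_reason seen on the stream.
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text), finish_reason, saw_done
```

If finish_reason is still None and saw_done is False when the loop ends, the connection dropped mid-stream. A finish_reason of length with saw_done True means the model hit its token limit, not that the stream broke.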
"OpenAI SDK can't find the model"
Symptom: openai.NotFoundError: model not found.
The OpenAI SDK does a GET /v1/models under the hood for some helpers. If your key's plan filter doesn't include the model, you get a 404 even though chat completions would have worked. Verify with curl:
curl https://litellm.tensorloop.tech/v1/models \
  -H "Authorization: Bearer $TENSORLOOP_KEY"
If the model isn't in the list, you're on the wrong plan or the wrong key.
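The same check can be scripted. The sketch below issues the same GET with Python's standard library and looks for a model ID in the result. It assumes the response follows the OpenAI-style list shape, a data array of objects with an id field. fetch_models and model_is_listed are hypothetical names.

```python
import json
import os
import urllib.request

MODELS_URL = "https://litellm.tensorloop.tech/v1/models"


def fetch_models(key: str) -> dict:
    """GET /v1/models with the key; the same request as the curl command."""
    req = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {key}"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def model_is_listed(models_response: dict, model_id: str) -> bool:
    """True if model_id appears in the listing (assumed OpenAI-style shape)."""
    return any(m.get("id") == model_id for m in models_response.get("data", []))


# Only touches the network when run as a script with the key set.
if __name__ == "__main__" and "TENSORLOOP_KEY" in os.environ:
    listing = fetch_models(os.environ["TENSORLOOP_KEY"])
    print(model_is_listed(listing, "gpt-4o-mini"))
```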
Base URL mistakes
The single most common integration bug. The correct base URL is:
https://litellm.tensorloop.tech/v1
Wrong:
- https://tensorloop.tech/v1 is the marketing site.
- https://api.tensorloop.tech/v1 points at a host that doesn't exist.
- https://litellm.tensorloop.tech (no /v1) makes the SDK build request paths incorrectly.
In the OpenAI SDK, this is baseURL (JS) or base_url (Python).
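A quick guard at startup catches all three mistakes before the first request. A minimal sketch: check_base_url is a hypothetical helper, and tolerating a trailing slash after /v1 is an assumption made here, not documented behaviour.

```python
from urllib.parse import urlsplit

EXPECTED_HOST = "litellm.tensorloop.tech"


def check_base_url(url: str) -> str:
    """Return 'ok' or a short description of what is wrong with the base URL."""
    parts = urlsplit(url)
    if parts.hostname != EXPECTED_HOST:
        return f"wrong host: {parts.hostname}"
    # Assumption: a trailing slash after /v1 is tolerated.
    if parts.path.rstrip("/") != "/v1":
        return "path must be /v1"
    return "ok"


# With the OpenAI Python SDK (v1+), the checked value is passed as base_url:
#   client = OpenAI(base_url="https://litellm.tensorloop.tech/v1", api_key=key)
```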
CORS errors in the browser
You should not call TensorLoop from the browser at all — your key would be exposed in DevTools. Proxy through your own backend.
If you absolutely must call from the browser (e.g. internal tooling on a private network), proxy through a same-origin route handler and forward the request server-side. See the JavaScript examples for a Next.js route handler.
Still stuck
Open the Activity page and find the failing call. The cost and tokens columns confirm whether the request even reached the model, and the timestamp pinpoints when it failed. Then re-check Errors for the exact code.