Production inference for every model
OpenAI-compatible API for Kimi, MiniMax, Z.AI GLM, GPT, Claude, DeepSeek, and open-weight models. Token pricing, per-key budgets, and usage analytics.
Capabilities
Everything you need to ship inference
How it works
Three steps to your first completion
Create an account
Free tier includes gpt-4o-mini and 2 API keys. No credit card required.
Mint a scoped key
Set budget, RPM, and allowed models per key. Kimi and MiniMax unlock on Pro.
Call the API
POST to litellm.tensorloop.tech/v1/chat/completions. Swap models with one parameter change.
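As a rough sketch of what "one parameter change" could look like, the helper below builds the same request for two models and differs only in the `model` field. The function name `build_chat_request` is hypothetical, and the base URL is passed in as an argument because the exact host should be confirmed in the docs. Nothing is sent over the network here.

```python
def build_chat_request(base_url, api_key, model, prompt, max_tokens=512):
    """Return the URL, headers, and JSON body for one chat completion call.

    Hypothetical helper for illustration; not part of any TensorLoop SDK.
    """
    return {
        "url": base_url.rstrip("/") + "/v1/chat/completions",
        "headers": {"Authorization": f"Bearer {api_key}"},
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
    }


# Assumed host for illustration; confirm the real endpoint in the docs.
BASE_URL = "https://api.tensorloop.dev"

# Same prompt, two models: only the "model" field changes.
first = build_chat_request(BASE_URL, "tl_your_key", "gpt-4o-mini", "Explain transformers")
second = build_chat_request(BASE_URL, "tl_your_key", "llama-3.1-70b", "Explain transformers")
```

Each dictionary uses keyword names that `requests.post` accepts, so a call such as `requests.post(**first)` would send it.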
Supported models
Frontier and open-weight models
Kimi, MiniMax, Z.AI GLM, GPT, Claude, DeepSeek, and more through one OpenAI-compatible endpoint.
Developer experience
One endpoint. Any model.
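import requests

# Send one chat completion request to the OpenAI-compatible endpoint.
response = requests.post(
    "https://api.tensorloop.dev/v1/chat/completions",
    headers={"Authorization": "Bearer tl_your_key"},
    json={
        "model": "llama-3.1-70b",
        "messages": [
            {"role": "user", "content": "Explain transformers"}
        ],
        "max_tokens": 512
    }
)
# Fail loudly on HTTP errors (bad key, rate limit) before reading the body.
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
Why TensorLoop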
Passthrough upstream pricing on every model
30-day spend cap per key · 30 RPM
Quickstart in the docs
Get started
Start building with TensorLoop
Free tier includes gpt-4o-mini, 2 API keys, and $5 budget per key. Upgrade to Pro for Kimi, MiniMax, and the full catalog.
Free
$0
Try gpt-4o-mini with scoped keys and usage caps.
- 2 API keys
- $5 budget per key / 30d
- 30 requests per minute
- gpt-4o-mini
- Streaming & tool calling
Pro
Usage-based
Full catalog including Kimi K2.5 and MiniMax M2.5.
- 10 API keys
- $100 budget per key / 30d
- 300 requests per minute
- All available models
- Vision on supported models
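To stay under a per-key requests-per-minute limit such as the 30 and 300 RPM figures above, a client can space its calls out evenly. The sketch below is one conservative way to do that. The class name `RequestPacer` is hypothetical, and even spacing is an assumption: this page does not say how the limit is enforced on the server side.

```python
import time


class RequestPacer:
    """Space calls so that no more than `requests_per_minute` start per minute.

    Hypothetical client-side helper; the clock and sleep functions are
    injectable so the pacing logic can be exercised without real waiting.
    """

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute  # seconds between call starts
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call may start; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Calling `pacer.wait()` before each request with `RequestPacer(30)` leaves at least two seconds between call starts. `RequestPacer(300)` leaves at least 0.2 seconds.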