Python

TensorLoop is OpenAI-compatible, so the official openai Python SDK works unchanged: point base_url at TensorLoop and pass your TensorLoop key as api_key.

Install

pip install openai

Basic call

from openai import OpenAI

client = OpenAI(
    base_url="https://api.tensorloop.tech/v1",
    api_key="sk-...",  # or os.environ["TENSORLOOP_KEY"]
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, briefly."}],
)
print(response.choices[0].message.content)
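
The comment in the call above suggests reading the key from the environment instead of hard-coding it. A minimal sketch of that, assuming the key is stored in TENSORLOOP_KEY as in that comment; the helper name client_settings is hypothetical and not part of the SDK:

```python
import os

# Base URL copied from the basic call above.
TENSORLOOP_BASE_URL = "https://api.tensorloop.tech/v1"

def client_settings(env=os.environ):
    """Build keyword arguments for OpenAI(...) from environment variables.

    Hypothetical helper: fails early with a clear message when the key is
    missing, rather than sending an unauthenticated request.
    """
    key = env.get("TENSORLOOP_KEY")
    if not key:
        raise RuntimeError("Set TENSORLOOP_KEY before creating the client")
    return {"base_url": TENSORLOOP_BASE_URL, "api_key": key}
```

With the SDK installed, the client can then be created as OpenAI(**client_settings()), which keeps the key out of source control.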

Streaming

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Count to ten."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
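
If you need the full reply as one string rather than printing it as it arrives, the deltas can be accumulated. The sketch below assumes chunks shaped like the SDK's streaming objects (chunk.choices[0].delta.content) and, as a defensive assumption, skips chunks whose choices list is empty. collect_text and fake_chunk are hypothetical names; the fake chunks only allow a dry run without a network call:

```python
from types import SimpleNamespace

def collect_text(stream):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # assumption: some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on the final chunk
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    """Stand-in for an SDK chunk, used only for the dry run below."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

demo = [fake_chunk("Hel"), fake_chunk(None), SimpleNamespace(choices=[]), fake_chunk("lo")]
print(collect_text(demo))  # prints "Hello"
```

With a real stream, collect_text(stream) replaces the for loop above.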

Tool use

A minimal end-to-end loop with one tool and two model turns. See Tool calling for the full guide.

import json

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
]

def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22, "unit": "celsius"}

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# First turn: model asks to call the tool
first = client.chat.completions.create(
    model="gpt-4o-mini", tools=tools, messages=messages
)
assistant = first.choices[0].message
messages.append(assistant)

# Execute every tool call and append the results
for call in assistant.tool_calls or []:
    args = json.loads(call.function.arguments)
    result = get_weather(**args)
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })

# Second turn: model uses the tool result to answer
final = client.chat.completions.create(
    model="gpt-4o-mini", tools=tools, messages=messages
)
print(final.choices[0].message.content)
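
The loop above always calls get_weather. Once there are several tools, dispatching by call.function.name and returning errors to the model as tool output keeps the loop from crashing on an unknown tool or malformed arguments. A sketch of one way to do that; TOOL_REGISTRY and run_tool_call are hypothetical names, and the SimpleNamespace object stands in for an SDK tool call so the example runs offline:

```python
import json
from types import SimpleNamespace

def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22, "unit": "celsius"}

# Maps the tool name advertised to the model to the function implementing it.
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool_call(call, registry=TOOL_REGISTRY):
    """Execute one tool call and return the matching "tool" message."""
    func = registry.get(call.function.name)
    if func is None:
        result = {"error": f"unknown tool: {call.function.name}"}
    else:
        try:
            result = func(**json.loads(call.function.arguments))
        except (json.JSONDecodeError, TypeError) as exc:
            # Invalid JSON, or arguments that do not match the signature.
            result = {"error": f"bad arguments: {exc}"}
    return {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}

# Dry run with a stand-in for an SDK tool call object.
demo_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Paris"}'),
)
print(run_tool_call(demo_call)["content"])
```

In the loop above, the body of the for statement would then reduce to messages.append(run_tool_call(call)).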

Retries

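By default the OpenAI SDK retries 429 and 5xx responses with exponential backoff (two retries unless you change max_retries on the client). For long-running scripts, wrap calls in try/except so that budget and rate-limit errors that survive those retries are logged instead of ending the script: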

from openai import RateLimitError

try:
    response = client.chat.completions.create(...)
except RateLimitError as e:
    # Raised after the SDK's own retries are used up:
    # budget exhausted or RPM cap hit, so back off
    print("rate limited:", e)
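
For batch jobs that should wait and try again rather than only log, an outer backoff loop can wrap the call. The sketch below is a generic, stdlib-only helper and not part of the SDK; call_with_backoff and its parameters are hypothetical. Because the SDK already retries internally, each outer attempt may itself cover several HTTP requests, so keep attempts small:

```python
import random
import time

def call_with_backoff(fn, retry_on, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retry_on exception, wait and try again.

    The wait doubles each time (base_delay * 2**attempt) plus up to 0.1 s of
    jitter. The last failure is re-raised once attempts are used up.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

With the SDK, this could be used as call_with_backoff(lambda: client.chat.completions.create(model=..., messages=...), RateLimitError). If the error means the budget is exhausted rather than a short-lived RPM cap, waiting will not help, so logging and stopping is the better response in that case.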