Examples
Python
Use the OpenAI Python SDK against TensorLoop.
TensorLoop is OpenAI-compatible, so the official openai Python SDK works with no code changes beyond setting base_url and api_key.
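To keep the key out of source code, the two settings can be read from the environment. The sketch below is one possible approach, not part of the SDK: the helper name `tensorloop_client_kwargs` and the `TENSORLOOP_BASE_URL` override variable are assumptions made for illustration, while `TENSORLOOP_KEY` and the default URL are the ones used on this page.

```python
import os

# Default endpoint as shown in this guide.
DEFAULT_BASE_URL = "https://litellm.tensorloop.tech/v1"


def tensorloop_client_kwargs(env=None):
    """Return the keyword arguments to pass to OpenAI(...).

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("TENSORLOOP_KEY")
    if not api_key:
        raise RuntimeError("Set TENSORLOOP_KEY to your TensorLoop API key")
    return {
        # TENSORLOOP_BASE_URL is a hypothetical override, not an official variable.
        "base_url": env.get("TENSORLOOP_BASE_URL", DEFAULT_BASE_URL),
        "api_key": api_key,
    }


# Usage, once the openai package is installed:
#   from openai import OpenAI
#   client = OpenAI(**tensorloop_client_kwargs())
```

Failing early when the key is missing gives a clearer error than an authentication failure on the first request.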
Install
pip install openai
Basic call
from openai import OpenAI
client = OpenAI(
    base_url="https://litellm.tensorloop.tech/v1",
    api_key="sk-...",  # or os.environ["TENSORLOOP_KEY"]
)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, briefly."}],
)
print(response.choices[0].message.content)
Streaming
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Count to ten."}],
    stream=True,
)
for chunk in stream:
    # Guard: a chunk can arrive with an empty choices list
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
Tool use
A minimal end-to-end loop — see Tool calling for the full guide.
import json
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
]
def get_weather(city: str) -> dict:
    return {"city": city, "temp": 22, "unit": "celsius"}
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
# First turn: model asks to call the tool
first = client.chat.completions.create(
    model="gpt-4o-mini", tools=tools, messages=messages
)
assistant = first.choices[0].message
messages.append(assistant)
# Execute every tool call and append the results
for call in assistant.tool_calls or []:
    args = json.loads(call.function.arguments)
    result = get_weather(**args)
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": json.dumps(result),
    })
# Second turn: model uses the tool result to answer
final = client.chat.completions.create(
    model="gpt-4o-mini", tools=tools, messages=messages
)
print(final.choices[0].message.content)
Retries
The OpenAI SDK retries 429 and 5xx responses with exponential backoff by default (two retries unless max_retries is changed). For long-running scripts, wrap calls in try/except to log budget and rate-limit errors:
from openai import RateLimitError
try:
    response = client.chat.completions.create(...)
except RateLimitError as e:
    # Budget exhausted or RPM cap hit: back off
    print("rate limited:", e)
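For a script that should keep going after the SDK's own retries are exhausted, the "back off" step can be made explicit. The following is a minimal sketch rather than an SDK feature: `call_with_backoff` and its parameters are hypothetical names, and the doubling delay with a little jitter is only one reasonable policy.

```python
import logging
import random
import time

log = logging.getLogger("tensorloop")


def call_with_backoff(fn, retry_on, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); if it raises `retry_on`, log the error, wait, and try again.

    The wait doubles after each failure, plus up to 10% random jitter.
    The final failure is re-raised so the caller can decide what to do.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            delay += random.uniform(0, delay / 10)
            log.warning("attempt %d failed (%s); sleeping %.1fs", attempt, exc, delay)
            sleep(delay)


# Usage with the client from this page (assumes the openai package is installed):
#   from openai import RateLimitError
#   response = call_with_backoff(
#       lambda: client.chat.completions.create(model="gpt-4o-mini", messages=messages),
#       retry_on=RateLimitError,
#   )
```

The SDK's built-in retry count can also be changed with the `max_retries` argument when constructing the client. A wrapper like this is only useful on top of that, for example to log each failure.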