Tool calling

Multi-turn function calling — define a tool, handle the model's request, return the result.

Tool calling lets the model ask your code to do something (look up weather, query a database, call another API) and then continue the conversation with the result. It's a three-step loop: define → respond → continue.

The loop

user message
     ↓
model decides to call a tool (first call)
     ↓
you execute the tool
     ↓
tool result goes back to the model (second call)
     ↓
model produces final answer

You make two calls to /v1/chat/completions. The first returns a tool_calls block; the second returns the final user-facing message.
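To make the two calls concrete, here is roughly what the messages list looks like just before the second call, assuming the weather tool used in the examples on this page. The id call_abc123 and the result values are made-up placeholders; the API generates the real id.

```python
import json

# Hypothetical id for illustration; the API returns the real value
call_id = "call_abc123"

conversation = [
    # Sent with the first call
    {"role": "user", "content": "What's the weather in Paris?"},
    # Returned by the first call: no text, only a request to run a tool
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": call_id,
                "type": "function",
                "function": {
                    "name": "get_weather",
                    # arguments is a JSON-encoded string, not a dict
                    "arguments": json.dumps({"city": "Paris"}),
                },
            }
        ],
    },
    # Added by your code after running the tool; sent with the second call
    {
        "role": "tool",
        "tool_call_id": call_id,
        "content": json.dumps({"city": "Paris", "temperature": 22}),
    },
]
```

The tool message points back at the request through tool_call_id, which is how the model pairs each result with the call that produced it.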

Python

import json
from openai import OpenAI

client = OpenAI(
    base_url="https://litellm.tensorloop.tech/v1",
    api_key="tl_...",
)

# 1. Define the tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "default": "celsius",
                    },
                },
                "required": ["city"],
            },
        },
    },
]

# 2. The function your code runs when the model calls the tool
def get_weather(city: str, unit: str = "celsius") -> dict:
    # Replace with your real implementation
    return {"city": city, "temperature": 22, "unit": unit, "summary": "Sunny"}

# 3. Build the conversation
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# 4. First call — model decides to invoke the tool
first = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,
    messages=messages,
)
assistant_msg = first.choices[0].message

# Append the assistant turn (with tool_calls) to the conversation
messages.append(assistant_msg)

# 5. Execute every tool call the model requested
for tool_call in assistant_msg.tool_calls or []:
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)

    if name == "get_weather":
        result = get_weather(**args)
    else:
        result = {"error": f"unknown tool: {name}"}

    # Append the tool result, tagged with the tool_call_id
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    })

# 6. Second call — model uses the tool result to write the final answer
final = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,  # keep tools available in case the model wants to call again
    messages=messages,
)
print(final.choices[0].message.content)
# → "It's currently 22°C and sunny in Paris."
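The second response can itself contain tool_calls, which is why the example keeps tools available. One way to handle that is a bounded loop that repeats until the model answers in plain text. This is a minimal sketch: run_tool_loop, its create and execute parameters, and the max_rounds cap are hypothetical names chosen for illustration, not part of any SDK.

```python
from typing import Any, Callable, Optional


def run_tool_loop(
    create: Callable[..., Any],
    messages: list,
    execute: Callable[[Any], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Call the model until it replies without requesting a tool.

    create:  callable that accepts messages=... and returns a chat completion
    execute: callable that runs one tool call and returns a string result
    """
    for _ in range(max_rounds):
        response = create(messages=messages)
        msg = response.choices[0].message
        messages.append(msg)  # the assistant turn goes in before any tool results

        if not msg.tool_calls:
            return msg.content  # no tool requested: this is the final answer

        for tool_call in msg.tool_calls:
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": execute(tool_call),
            })

    # The cap guards against a model that keeps requesting tools indefinitely
    raise RuntimeError(f"no final answer after {max_rounds} rounds")
```

With the client from the example above, create could be functools.partial(client.chat.completions.create, model="gpt-4o-mini", tools=tools), and execute a function that runs the requested tool and returns its result as a JSON string.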

A few things worth knowing:

  • The model may emit multiple tool_calls in one turn. Iterate over the list.
  • Always append the assistant message containing tool_calls to the conversation before the tool results, and append the results in the order the calls were listed.
  • Tool result content must be a string — JSON-encode any structured payload.
  • If the model decides not to call a tool, assistant_msg.tool_calls is empty (None in Python, undefined in JavaScript) and you can use assistant_msg.content directly.
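The points above can be folded into one helper. The sketch below assumes a hypothetical TOOL_REGISTRY dict and execute_tool_calls function (neither is part of the SDK). It walks the tool_calls list in order, returns string content for every call, and reports unknown tools or malformed arguments back to the model instead of raising.

```python
import json
from typing import Any, Callable


def get_weather(city: str, unit: str = "celsius") -> dict:
    # Same stub as in the example above
    return {"city": city, "temperature": 22, "unit": unit, "summary": "Sunny"}


# Hypothetical registry mapping tool names to Python callables
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def execute_tool_calls(assistant_msg: Any, registry: dict = TOOL_REGISTRY) -> list[dict]:
    """Run each requested tool in order and return one tool message per call."""
    tool_messages = []
    for tool_call in assistant_msg.tool_calls or []:  # tool_calls may be None
        name = tool_call.function.name
        handler = registry.get(name)
        if handler is None:
            result = {"error": f"unknown tool: {name}"}
        else:
            try:
                result = handler(**json.loads(tool_call.function.arguments))
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed JSON or arguments the function does not accept.
                # Note: this also catches a TypeError raised inside the tool.
                result = {"error": f"bad arguments for {name}: {exc}"}

        tool_messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),  # content must be a string
        })
    return tool_messages
```

After the first call you would run messages.extend(execute_tool_calls(assistant_msg)). Returning the error as a tool result, rather than raising, gives the model a chance to retry with corrected arguments.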

JavaScript / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://litellm.tensorloop.tech/v1",
  apiKey: process.env.TENSORLOOP_KEY!,
});

const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Get the current weather for a city.",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string" },
          unit: { type: "string", enum: ["celsius", "fahrenheit"] },
        },
        required: ["city"],
      },
    },
  },
];

// Stub implementation; replace with your real weather lookup
function getWeather(city: string, unit = "celsius") {
  return { city, temperature: 22, unit, summary: "Sunny" };
}

const messages: any[] = [
  { role: "user", content: "What's the weather in Paris?" },
];

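// First call: the model decides whether to invoke the tool
const first = await client.chat.completions.create({
  model: "gpt-4o-mini",
  tools,
  messages,
});

// Append the assistant turn (with tool_calls) before any tool results
const assistantMsg = first.choices[0].message;
messages.push(assistantMsg);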

// Execute every tool call the model requested
for (const call of assistantMsg.tool_calls ?? []) {
  const args = JSON.parse(call.function.arguments);
  const result =
    call.function.name === "get_weather"
      ? getWeather(args.city, args.unit)
      : { error: `unknown tool: ${call.function.name}` };

  messages.push({
    role: "tool",
    tool_call_id: call.id,
    content: JSON.stringify(result),
  });
}

// Second call: the model uses the tool result to write the final answer
const final = await client.chat.completions.create({
  model: "gpt-4o-mini",
  tools,
  messages,
});

console.log(final.choices[0].message.content);

Forcing or disabling tool use

Use tool_choice to override the model's decision:

Value                                                            Behavior
"auto" (default)                                                 Model picks whether to call a tool.
"none"                                                           Don't call any tool; answer directly.
"required"                                                       Must call at least one tool.
{ "type": "function", "function": { "name": "get_weather" } }    Must call this specific tool.

"required" and the object form are useful when you have a deterministic workflow (you know this turn needs a tool call) and don't want the model to second-guess.
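As a small illustration, the values in the table can be produced by a helper like the one below. tool_choice_for and its "function" mode are hypothetical conveniences, not SDK features; only the returned values come from the table above.

```python
from typing import Optional, Union


def tool_choice_for(mode: str = "auto", name: Optional[str] = None) -> Union[str, dict]:
    """Build a tool_choice value.

    mode is "auto", "none", "required", or "function" (force one named tool).
    """
    if mode in ("auto", "none", "required"):
        return mode
    if mode == "function" and name:
        return {"type": "function", "function": {"name": name}}
    raise ValueError(f"unsupported tool_choice: mode={mode!r}, name={name!r}")


# Usage with the client from the Python example (not executed here):
# client.chat.completions.create(
#     model="gpt-4o-mini",
#     tools=tools,
#     messages=messages,
#     tool_choice=tool_choice_for("function", "get_weather"),
# )
```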

Streaming tool calls

When you stream a response that contains a tool call, delta.tool_calls[i].function.arguments arrives as string fragments that are not parseable JSON until the stream completes. Buffer the fragments per tool-call index, wait for finish_reason == "tool_calls", then parse:

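import json

# Reuses client, tools, and messages from the Python example above
stream = client.chat.completions.create(
    model="gpt-4o-mini", tools=tools, messages=messages, stream=True
)

buffers: dict[int, str] = {}  # argument fragments, keyed by tool-call index
for chunk in stream:
    if not chunk.choices:  # some chunks (e.g. usage-only) carry no choices
        continue
    for tc in chunk.choices[0].delta.tool_calls or []:
        fragment = tc.function.arguments if tc.function else None
        buffers[tc.index] = buffers.get(tc.index, "") + (fragment or "")

# After the stream ends, every entry in buffers is complete JSON
args = {idx: json.loads(buf) for idx, buf in buffers.items()}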
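To execute a streamed tool call you also need its id and function name, which typically arrive in the first fragment for each index. A slightly fuller sketch follows, assuming the OpenAI-style chunk shape used above; PendingCall and collect_tool_calls are hypothetical names for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class PendingCall:
    id: str = ""
    name: str = ""
    arguments: str = ""  # raw JSON text, grown fragment by fragment


def collect_tool_calls(stream) -> list[dict]:
    """Reassemble complete tool calls from streamed chunks, ordered by index."""
    pending: dict[int, PendingCall] = {}
    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        for tc in chunk.choices[0].delta.tool_calls or []:
            call = pending.setdefault(tc.index, PendingCall())
            if tc.id:
                call.id = tc.id
            if tc.function:
                if tc.function.name:
                    call.name = tc.function.name
                call.arguments += tc.function.arguments or ""

    # Parse only after the stream has ended; treat empty arguments as {}
    return [
        {
            "id": pending[idx].id,
            "name": pending[idx].name,
            "arguments": json.loads(pending[idx].arguments or "{}"),
        }
        for idx in sorted(pending)
    ]
```

Each returned entry has what the non-streaming loop reads from tool_call.id, tool_call.function.name, and the parsed arguments, so the same dispatch code can handle both paths.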
