Tool calling
Multi-turn function calling — define a tool, handle the model's request, return the result.
Tool calling lets the model ask your code to do something (look up weather, query a database, call another API) and then continue the conversation with the result. It's a three-step loop: define → respond → continue.
The loop
user message
     ↓
model decides to call a tool  →  you execute the tool
     ↓                                   ↓
     ←──── tool result ──────────────────
     ↓
model produces final answer
You make two calls to /v1/chat/completions. The first returns a tool_calls block; the second returns the final user-facing message.
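To make the two calls concrete, here is a rough sketch of what the conversation might look like once the tool result has been appended, written as plain Python dicts. The shape follows the OpenAI-compatible chat format used throughout this page; the id `call_abc123` and the weather values are invented placeholders, not real output.

```python
import json

# Illustrative only: the id and the weather payload are invented placeholders.
messages = [
    # Your original request
    {"role": "user", "content": "What's the weather in Paris?"},
    # What the first call returns: no text, just a request to run a tool
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    # arguments is a JSON-encoded string, not a dict
                    "arguments": json.dumps({"city": "Paris"}),
                },
            }
        ],
    },
    # What you append after running the tool yourself
    {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": json.dumps(
            {"city": "Paris", "temperature": 22, "unit": "celsius"}
        ),
    },
]

# Every tool message should point back at a tool call from the assistant turn
requested_ids = {tc["id"] for tc in messages[1]["tool_calls"]}
answered_ids = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
print(requested_ids == answered_ids)
```

The second call sends this whole list back, and the model replies with an ordinary assistant message whose content is the final answer.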
Python
import json
from openai import OpenAI
client = OpenAI(
    base_url="https://litellm.tensorloop.tech/v1",
    api_key="tl_...",
)
# 1. Define the tool
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "default": "celsius",
                    },
                },
                "required": ["city"],
            },
        },
    },
]
# 2. The function the tool will call
def get_weather(city: str, unit: str = "celsius") -> dict:
    # Replace with your real implementation
    return {"city": city, "temperature": 22, "unit": unit, "summary": "Sunny"}
# 3. Build the conversation
messages = [{"role": "user", "content": "What's the weather in Paris?"}]
# 4. First call — model decides to invoke the tool
first = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,
    messages=messages,
)
assistant_msg = first.choices[0].message
# Append the assistant turn (with tool_calls) to the conversation
messages.append(assistant_msg)
# 5. Execute every tool call the model requested
for tool_call in assistant_msg.tool_calls or []:
    name = tool_call.function.name
    args = json.loads(tool_call.function.arguments)
    if name == "get_weather":
        result = get_weather(**args)
    else:
        result = {"error": f"unknown tool: {name}"}
    # Append the tool result, tagged with the tool_call_id
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    })
# 6. Second call — model uses the tool result to write the final answer
final = client.chat.completions.create(
    model="gpt-4o-mini",
    tools=tools,  # keep tools available in case the model wants to call again
    messages=messages,
)
print(final.choices[0].message.content)
# → for example: "It's currently 22°C and sunny in Paris."
A few things worth knowing:
- The model may emit multiple tool_calls in one turn. Iterate over the list.
- Always append the assistant message with tool_calls to the conversation before the tool results, and append the results in the same order as the calls.
- Tool result content must be a string, so JSON-encode any structured payload.
- If the model decides not to call a tool, assistant_msg.tool_calls is None (or empty) and you can use assistant_msg.content directly.
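The notes above can be folded into a small reusable loop. What follows is a minimal sketch, not a helper the SDK provides: `TOOL_REGISTRY`, `run_tool_call`, and `run_conversation` are hypothetical names, and `call_model` stands in for whatever function wraps the chat completions call and returns the assistant message as a dict. Here a stub `fake_model` plays that role so the example runs without a network call.

```python
import json
from typing import Any, Callable

def get_weather(city: str, unit: str = "celsius") -> dict:
    # Stub implementation, same values as the example above
    return {"city": city, "temperature": 22, "unit": unit, "summary": "Sunny"}

# Hypothetical registry: tool name -> Python callable
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and return the matching role="tool" message."""
    name = tool_call["function"]["name"]
    func = TOOL_REGISTRY.get(name)
    if func is None:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            args = json.loads(tool_call["function"]["arguments"])
            result = func(**args)
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON, or arguments the function does not accept
            result = {"error": f"bad arguments for {name}: {exc}"}
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),  # content must be a string
    }

def run_conversation(call_model, messages: list[dict], max_rounds: int = 5) -> str:
    """Call the model until it answers without requesting any tool."""
    for _ in range(max_rounds):
        assistant_msg = call_model(messages)
        messages.append(assistant_msg)  # assistant turn goes in first
        tool_calls = assistant_msg.get("tool_calls") or []
        if not tool_calls:
            return assistant_msg["content"]
        for tool_call in tool_calls:  # then one tool message per call, in order
            messages.append(run_tool_call(tool_call))
    raise RuntimeError(f"no final answer after {max_rounds} rounds")

def fake_model(messages: list[dict]) -> dict:
    # Stand-in for the real API call: requests the tool once, then answers
    if messages[-1]["role"] == "tool":
        data = json.loads(messages[-1]["content"])
        text = f"{data['temperature']} degrees and {data['summary'].lower()} in {data['city']}."
        return {"role": "assistant", "content": text}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }],
    }

history = [{"role": "user", "content": "What's the weather in Paris?"}]
answer = run_conversation(fake_model, history)
print(answer)
```

Returning an error payload instead of raising lets the model see what went wrong and possibly retry with corrected arguments. The `max_rounds` cap is a guard against a model that keeps requesting tools indefinitely.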
JavaScript / TypeScript
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://litellm.tensorloop.tech/v1",
  apiKey: process.env.TENSORLOOP_KEY!,
});
const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Get the current weather for a city.",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string" },
          unit: { type: "string", enum: ["celsius", "fahrenheit"] },
        },
        required: ["city"],
      },
    },
  },
];
function getWeather(city: string, unit = "celsius") {
  return { city, temperature: 22, unit, summary: "Sunny" };
}
const messages: any[] = [
  { role: "user", content: "What's the weather in Paris?" },
];
const first = await client.chat.completions.create({
  model: "gpt-4o-mini",
  tools,
  messages,
});
const assistantMsg = first.choices[0].message;
messages.push(assistantMsg);
for (const call of assistantMsg.tool_calls ?? []) {
  const args = JSON.parse(call.function.arguments);
  const result =
    call.function.name === "get_weather"
      ? getWeather(args.city, args.unit)
      : { error: `unknown tool: ${call.function.name}` };
  messages.push({
    role: "tool",
    tool_call_id: call.id,
    content: JSON.stringify(result),
  });
}
const final = await client.chat.completions.create({
  model: "gpt-4o-mini",
  tools,
  messages,
});
console.log(final.choices[0].message.content);
Forcing or disabling tool use
Use tool_choice to override the model's decision:
| Value | Behavior |
|---|---|
| "auto" (default) | Model picks whether to call a tool. |
| "none" | Don't call any tool; answer directly. |
| "required" | Must call at least one tool. |
| { "type": "function", "function": { "name": "get_weather" } } | Must call this specific tool. |
"required" and the object form are useful when you have a deterministic workflow (you know this turn needs a tool call) and don't want the model to second-guess.
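As a small illustration, the object form can be built from a tool name so that it stays in sync with your tools list. `force_tool` is a hypothetical helper, not part of the SDK; the value it returns would be passed as tool_choice alongside tools in the request.

```python
def force_tool(name: str, tools: list[dict]) -> dict:
    """Build the object form of tool_choice for one named tool (hypothetical helper)."""
    known = {t["function"]["name"] for t in tools}
    if name not in known:
        raise ValueError(f"{name!r} is not defined in tools: {sorted(known)}")
    return {"type": "function", "function": {"name": name}}

# Trimmed-down tool definition, just enough for the helper to inspect
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {"type": "object", "properties": {}},
        },
    }
]

# Keyword arguments for a turn that must call get_weather
request_kwargs = {
    "model": "gpt-4o-mini",
    "tools": tools,
    "tool_choice": force_tool("get_weather", tools),
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
}
print(request_kwargs["tool_choice"])
```

These keyword arguments could then be unpacked into the first call. On the follow-up call that carries the tool result, you would normally drop tool_choice or set it back to "auto"; leaving the forced value in place requires the model to call the tool again instead of writing the final answer.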
Streaming tool calls
When you stream a response that contains a tool call, delta.tool_calls[i].function.arguments arrives as concatenated string fragments — not parseable JSON until the stream completes. Buffer the fragments, wait for finish_reason === "tool_calls", then parse:
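# Assumes client, tools, and messages from the Python example above
stream = client.chat.completions.create(
    model="gpt-4o-mini", tools=tools, messages=messages, stream=True
)
buffers: dict[int, str] = {}
for chunk in stream:
    if not chunk.choices:  # some chunks (for example usage-only ones) have no choices
        continue
    for tc in chunk.choices[0].delta.tool_calls or []:
        fragment = tc.function.arguments if tc.function else None
        buffers[tc.index] = buffers.get(tc.index, "") + (fragment or "")
# After the stream ends, every entry in buffers is complete JSON
args = {idx: json.loads(buf) for idx, buf in buffers.items()}
See also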
- Chat completions reference: full tools and tool_choice schema.
- API → Streaming: SSE shape for streamed responses.