LLM Function Calling (Tool Calling)

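1. What Is Function Calling?

Function Calling, also known as Tool Calling, is a structured way for an LLM to interact with external systems. The OpenAI documentation describes it as follows:

Function calling provides a powerful and flexible way for OpenAI models to interface with external systems and access data outside their training data.

Put simply, an LLM on its own can only generate text. With Function Calling, the model can tell the application "I need to call this function", supply structured arguments, and continue its answer once it receives the result. That takes it beyond pure text generation.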

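Core concepts

Tools

Function descriptions (with parameters expressed as JSON Schema) that the developer defines and passes to the model, telling it which functions are available.

Tool Calls

A special response the model returns when it decides a tool is needed; it contains the function name and the arguments.

Tool Call Outputs

The result the application sends back to the model after running the function; the model uses it to produce the final answer.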
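To make the three concepts concrete, here is a minimal Python sketch of the data each one carries. The helper name make_function_tool and the sample values are illustrative assumptions; the field names follow the flat function-tool shape used in the JSON examples later in this article.

```python
import json


def make_function_tool(name: str, description: str,
                       properties: dict, required: list) -> dict:
    """Build a function-tool definition (hypothetical helper, for illustration)."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
        "strict": True,
    }


# 1. Tool: what the developer tells the model is available.
weather_tool = make_function_tool(
    "get_weather",
    "Get the current weather for a location",
    {"location": {"type": "string", "description": "City name, e.g. Paris, France"}},
    ["location"],
)

# 2. Tool call: what the model sends back (arguments arrive as a JSON string).
tool_call = {
    "type": "function_call",
    "call_id": "call_abc123",
    "name": "get_weather",
    "arguments": json.dumps({"location": "Paris, France"}),
}

# 3. Tool call output: what the application returns, linked by call_id.
tool_call_output = {
    "type": "function_call_output",
    "call_id": tool_call["call_id"],
    "output": json.dumps({"temperature": "25", "unit": "°C", "condition": "sunny"}),
}
```

The call_id is what ties a result back to the call that requested it, which matters once a single response contains several calls.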

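2. The Flow-Diagram View

Function Calling is a multi-step exchange. Take "look up the weather in Paris" as an example:

① Application → LLM (request)
Send the user message plus the tool definitions (tools).

② LLM → Application (tool call)
Return a tool call: get_weather("Paris")

③ Application runs the function
get_weather("Paris") → {"temperature":"25°C"}

④ Application → LLM (result returned)
Send the tool's result back to the model.

⑤ LLM generates the final answer
"It is currently 25°C in Paris."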

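Key point: the model never executes the function itself. It only produces a structured "call request"; the application performs the actual execution. The model acts as the decision-maker: it chooses which tool to call and which arguments to pass.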
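The split between "model decides" and "application executes" can be sketched as a small dispatcher. This is only an illustration under stated assumptions: TOOL_REGISTRY, execute_tool_call and the stub get_weather are hypothetical names, and the item shapes mirror the function_call / function_call_output JSON used elsewhere in this article.

```python
import json


def get_weather(location: str) -> dict:
    """Hypothetical stand-in for a real weather lookup."""
    return {"temperature": "25", "unit": "°C", "condition": "sunny"}


# Functions the application is willing to run on the model's behalf.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool_call(call: dict) -> dict:
    """Run one function_call item and wrap the result as a function_call_output item."""
    func = TOOL_REGISTRY.get(call["name"])
    if func is None:
        result = {"error": f"unknown tool: {call['name']}"}
    else:
        try:
            kwargs = json.loads(call["arguments"])  # arguments arrive as a JSON string
            result = func(**kwargs)
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or arguments that do not match the function signature.
            result = {"error": str(exc)}
    return {
        "type": "function_call_output",
        "call_id": call["call_id"],
        "output": json.dumps(result, ensure_ascii=False),
    }
```

Returning errors as ordinary tool output, rather than raising, gives the model a chance to correct its arguments on the next turn; whether that is desirable depends on the application.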

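Interaction walkthrough

The full exchange proceeds in six stages: the user sends a message, the LLM analyzes the intent, the LLM emits a tool_call, the application runs the function, the result is sent back to the LLM, and the LLM generates the answer.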

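3. The Front-End Chat Rendering View

In a chat interface, Function Calling involves messages from several roles. The walkthrough below shows what each role contributes at every step, including the structure of the LLM's raw output:

AI Chat: the complete Function Calling exchange

User

What's the weather like in Paris right now?

First request sent to the LLM (with the tools definitions attached)

LLM response (first)

Raw LLM output (the output array)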

[{
  "type": "function_call",
  "call_id": "call_abc123",
  "name": "get_weather",
  "arguments": "{\n    \"location\": \"Paris, France\"\n  }"
}]

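Instead of answering the user directly, the model emitted a structured request of type function_call.

The application intercepts the tool call and runs the function

Application (local execution)

// Executed by the application's own code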
const result = get_weather("Paris, France");
// => { temperature: "25", unit: "°C", condition: "sunny" }

Send the function result back to the LLM as a function_call_output item

Tool result (function_call_output)

{
  "type": "function_call_output",
  "call_id": "call_abc123",
  "output": "{\"temperature\":\"25\",\"unit\":\"°C\",\"condition\":\"sunny\"}"
}

Second request sent to the LLM (full history plus the tool result)

LLM response (second)

Raw LLM output (the output array)

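[{
  "type": "message",
  "role": "assistant",
  "content": [{ "type": "output_text", "text": "It is sunny in Paris right now, 25°C, a good day to be out." }]
}]

It is sunny in Paris right now, 25°C, a good day to be out.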

The complete input array

The full item list for the exchange above is shown below; it is what gets sent to the LLM as input in the second request:

[
  // 1. The user's original message
  { "role": "user", "content": "What's the weather like in Paris right now?" },

  // 2. The LLM's first response: a tool call, not a text answer
  {
    "type": "function_call",
    "call_id": "call_abc123",
    "name": "get_weather",
    "arguments": "{\"location\":\"Paris, France\"}"
  },

  // 3. The result after the application runs the function
  {
    "type": "function_call_output",
    "call_id": "call_abc123",
    "output": "{\"temperature\":\"25\",\"unit\":\"°C\",\"condition\":\"sunny\"}"
  }

  // → Given the full history above, the LLM generates the final text answer
]

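Rendering note: in an end-user interface, function_call and function_call_output items are usually not shown. The front end renders only the role: "user" messages and the final role: "assistant" text. In a developer or debugging view, however, showing every intermediate step makes problems much easier to trace.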
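The rendering rule above amounts to a filter over the conversation items. Here is a minimal sketch, assuming items shaped like the ones in this section; visible_messages and its debug flag are hypothetical names, and the assistant content is simplified to a plain string.

```python
def visible_messages(items: list, debug: bool = False) -> list:
    """Return the items a chat UI should render.

    End users see only user/assistant messages; a debug view sees everything.
    """
    if debug:
        return list(items)
    return [item for item in items if item.get("role") in ("user", "assistant")]


history = [
    {"role": "user", "content": "What's the weather like in Paris right now?"},
    {"type": "function_call", "call_id": "call_abc123", "name": "get_weather",
     "arguments": "{\"location\":\"Paris, France\"}"},
    {"type": "function_call_output", "call_id": "call_abc123",
     "output": "{\"temperature\":\"25\",\"unit\":\"°C\",\"condition\":\"sunny\"}"},
    {"type": "message", "role": "assistant",
     "content": "It is sunny in Paris right now, 25°C."},
]

for item in visible_messages(history):
    print(f"{item['role']}: {item['content']}")
```

Filtering on the presence of a role works here because the tool-call items carry a type but no role; a UI built on a different item schema would need a different predicate.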

4. The Raw HTTP Exchange

At the HTTP level, Function Calling takes at least two API requests:

First request: send the user message plus the tool definitions

POST /v1/responses
Content-Type: application/json

{
  "model": "gpt-4.1",
  "input": [
    {"role": "user", "content": "What's the weather like in Paris right now?"}
  ],
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get the current weather for a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "City name, e.g. Paris, France"
          }
        },
        "required": ["location"],
        "additionalProperties": false
      },
      "strict": true
    }
  ]
}

First response: the model returns a tool call

{
  "output": [
    {
      "type": "function_call",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\":\"Paris, France\"}"
    }
  ]
}

Second request: send the tool's result

{
  "model": "gpt-4.1",
  "input": [
    {"role": "user", "content": "What's the weather like in Paris right now?"},
    {
      "type": "function_call",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\":\"Paris, France\"}"
    },
    {
      "type": "function_call_output",
      "call_id": "call_abc123",
      "output": "{\"temperature\":\"25\",\"unit\":\"°C\",\"condition\":\"sunny\"}"
    }
  ],
  "tools": ["..."]
}

Second response: the model generates the final answer

{
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{ "type": "output_text", "text": "It is sunny in Paris right now, 25°C." }]
    }
  ]
}

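Each request must carry the full conversation history and the tool definitions, because the model itself is stateless and does not remember earlier requests (unless you let the API keep conversation state for you, for example through previous_response_id in the Responses API). Tool definitions also consume input tokens, so OpenAI suggests keeping the number of tools small (as a soft guideline, fewer than 20).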
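The two-request exchange generalizes to a loop: resend the growing history until the model stops asking for tools. The sketch below replaces the HTTP call with a stub so it runs offline; fake_model, run_conversation and get_weather are hypothetical names, and the stub's message content is simplified to a plain string.

```python
import json
from typing import Callable


def get_weather(location: str) -> dict:
    """Hypothetical local tool; a real one would call a weather service."""
    return {"temperature": "25", "unit": "°C", "condition": "sunny"}


def fake_model(request: dict) -> dict:
    """Stub standing in for POST /v1/responses: asks for the tool once, then answers."""
    outputs = [i for i in request["input"] if i.get("type") == "function_call_output"]
    if not outputs:
        return {"output": [{"type": "function_call", "call_id": "call_abc123",
                            "name": "get_weather",
                            "arguments": json.dumps({"location": "Paris, France"})}]}
    weather = json.loads(outputs[-1]["output"])
    text = f"Paris is {weather['condition']}, {weather['temperature']}{weather['unit']}."
    return {"output": [{"type": "message", "role": "assistant", "content": text}]}


def run_conversation(user_text: str, tools: list, send: Callable[[dict], dict],
                     max_rounds: int = 5) -> tuple:
    """Resend the full history each round until the model replies with a message."""
    history = [{"role": "user", "content": user_text}]
    for round_no in range(1, max_rounds + 1):
        response = send({"model": "gpt-4.1", "input": history, "tools": tools})
        calls = [i for i in response["output"] if i["type"] == "function_call"]
        if not calls:
            return response["output"][-1]["content"], round_no
        for call in calls:
            history.append(call)  # echo the model's own call back in the history
            result = get_weather(**json.loads(call["arguments"]))
            history.append({"type": "function_call_output",
                            "call_id": call["call_id"],
                            "output": json.dumps(result, ensure_ascii=False)})
    raise RuntimeError("model kept requesting tools")


answer, rounds = run_conversation(
    "What's the weather like in Paris right now?", [], fake_model)
print(rounds, answer)
```

The max_rounds cap is a design choice rather than part of the protocol: without it, a model that keeps requesting tools would loop indefinitely.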

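5. What Happens Inside the Model

From the model's point of view, Function Calling involves the following layers:

5.1 System-prompt injection

When you pass the tools parameter in an API request, OpenAI converts the tool definitions into a special format and injects them into the system prompt. The model has been trained to recognize and respond to that format:

- The tool definitions are converted into a special syntax the model understands.
- Through training, the model has learned to emit function_call output at the appropriate moment.
- Tool definitions take up context and are billed as input tokens.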

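5.2 The decision process

While generating a response, the model effectively works through these decisions:

Understand the intent

Determine whether the user's question needs external data or an action.

Match a tool

Pick the most suitable tool from the list of available tools.

Extract the arguments

Pull the arguments the tool needs out of the user's message.

Generate the call

Emit a structured call request in JSON.

With strict: true enabled, the model's output is constrained to conform exactly to the JSON Schema (using Structured Outputs), which ensures the arguments have the right types and structure.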
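Strict mode enforces the schema on the model side. When it is unavailable (or as a defensive check), an application can verify the arguments itself before running a function. The sketch below is a deliberately small client-side check, not how Structured Outputs works internally: it covers only required fields, unexpected fields, and top-level primitive types, and check_arguments and WEATHER_SCHEMA are hypothetical names.

```python
import json

# Mapping from JSON Schema type names to Python types (top level only).
JSON_TYPES = {"string": str, "number": (int, float), "integer": int,
              "boolean": bool, "object": dict, "array": list}

WEATHER_SCHEMA = {
    "type": "object",
    "properties": {"location": {"type": "string"}},
    "required": ["location"],
    "additionalProperties": False,
}


def check_arguments(arguments_json: str, schema: dict) -> list:
    """Return a list of problems; an empty list means the arguments fit the schema."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    problems = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        expected = JSON_TYPES.get(props[name].get("type"))
        if expected is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly for non-boolean types.
        bool_mismatch = isinstance(value, bool) and expected is not bool
        if bool_mismatch or not isinstance(value, expected):
            problems.append(f"wrong type for {name}")
    return problems
```

For anything beyond a sketch, a full JSON Schema validator library would be the more reliable choice, since nested objects, enums, and formats are not handled here.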

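5.3 How the capability is trained

Taking LLaMA 3.1 as an example, tool calling is introduced during post-training through multiple rounds of SFT (supervised fine-tuning) and DPO (direct preference optimization):

- SFT stage: about 21.19% of the training data involves reasoning and tool use.
- The training data includes both single-step and multi-step tool-calling examples.
- Synthetic data is used to teach the model to call built-in tools (such as brave_search, wolfram_alpha, and code_interpreter).
- DPO stage: about 5.89% of the preference data involves reasoning and tool use.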

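6. Function Calling Across Mainstream Models

OpenAI GPT series (closed source)

OpenAI pioneered Function Calling. Tool definitions are passed through the tools parameter and described with JSON Schema. The API supports strict mode, parallel tool calls, and tool_choice for controlling call behavior.

OpenAI tool definition example
{
  "type": "function",
  "name": "get_weather",
  "description": "Get the current weather",
  "parameters": {
    "type": "object",
    "properties": {
      "location": { "type": "string" }
    },
    "required": ["location"],
    "additionalProperties": false
  },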
  "strict": true
}

Anthropic Claude series (closed source)

Claude implements tool calling through tool_use and tool_result blocks inside the content array.

Claude tool-call format
// Assistant response
{
  "role": "assistant",
  "content": [{
    "type": "tool_use",
    "id": "toolu_01A09q90qw90lq917835lq9",
    "name": "get_weather",
    "input": {"location": "Paris, France"}
  }]
}

// Tool result returned
{
  "role": "user",
  "content": [{
    "type": "tool_result",
    "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
    "content": "25°C, sunny"
  }]
}
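Turning an assistant message with tool_use blocks into the matching tool_result message can be sketched as follows. The block shapes are taken from the example above; answer_tool_uses, TOOLS and the stub get_weather are hypothetical names, and error handling is left out for brevity.

```python
from typing import Optional


def get_weather(location: str) -> str:
    """Hypothetical local tool returning a short text result."""
    return "25°C, sunny"


TOOLS = {"get_weather": get_weather}


def answer_tool_uses(assistant_message: dict) -> Optional[dict]:
    """Build the user-role message carrying one tool_result per tool_use block."""
    results = []
    for block in assistant_message["content"]:
        if block.get("type") != "tool_use":
            continue  # skip text blocks
        output = TOOLS[block["name"]](**block["input"])  # input is already a dict
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # links the result to its tool_use block
            "content": output,
        })
    return {"role": "user", "content": results} if results else None


assistant_message = {
    "role": "assistant",
    "content": [{
        "type": "tool_use",
        "id": "toolu_01A09q90qw90lq917835lq9",
        "name": "get_weather",
        "input": {"location": "Paris, France"},
    }],
}
reply = answer_tool_uses(assistant_message)
```

Two differences from the OpenAI shape are visible here: the arguments arrive as a parsed object rather than a JSON string, and the result travels inside a user-role message.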

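LLaMA 3.1 (Meta, open weights)

LLaMA 3.1 introduces several special tokens to support tool calling, and supports two styles: JSON-format calls and built-in Python-style tool calls.

Token	Purpose
<|begin_of_text|>	Start of the prompt
<|start_header_id|> / <|end_header_id|>	Mark the role (system / user / assistant / ipython)
<|eom_id|>	End of message; a tool call may need to be executed before the turn continues
<|eot_id|>	End of turn
<|python_tag|>	Marks a built-in tool call

LLaMA 3.1 JSON-format tool-call template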
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Cutting Knowledge Date: December 2023

You have access to the following functions:
{"name": "get_weather", "parameters": {"location": {"type": "string"}}}

<|eot_id|><|start_header_id|>user<|end_header_id|>

What's the weather like in Paris?
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{"name": "get_weather", "parameters": {"location": "Paris, France"}}
<|eot_id|><|start_header_id|>ipython<|end_header_id|>

{"temperature": "25°C"}
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

It is currently 25°C in Paris.
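With an open-weights model, the application has to do the parsing that a hosted API would otherwise hide. The sketch below handles only the JSON-style call shown in the template above; parse_json_tool_call and ipython_turn are hypothetical helpers, and the exact whitespace around the special tokens is an assumption.

```python
import json


def parse_json_tool_call(assistant_text: str):
    """Return (name, parameters) if the assistant turn is a JSON tool call, else None."""
    body = assistant_text.replace("<|eot_id|>", "").replace("<|eom_id|>", "").strip()
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None  # ordinary text answer
    if isinstance(data, dict) and "name" in data and "parameters" in data:
        return data["name"], data["parameters"]
    return None


def ipython_turn(result: dict) -> str:
    """Format a tool result as an ipython-role turn to append to the prompt."""
    return ("<|start_header_id|>ipython<|end_header_id|>\n\n"
            + json.dumps(result, ensure_ascii=False)
            + "<|eot_id|>")


call = parse_json_tool_call(
    '{"name": "get_weather", "parameters": {"location": "Paris, France"}}\n<|eot_id|>'
)
```

Treating "parses as JSON with name and parameters keys" as the signal for a tool call is a heuristic; a model that answers a user with literal JSON of that shape would be misread.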

Built-in Python-style tool calls

For the built-in tools (brave_search, wolfram_alpha, code_interpreter), LLaMA uses Python-style calls:

<|python_tag|>wolfram_alpha.call(query="solve x^3 - 4x^2 + 6x - 24 = 0")<|eom_id|>

Mistral Large 2 (Mistral AI, open weights)

Mistral uses square-bracket control tokens to manage tool calls:

Mistral tool-call template
<s>[AVAILABLE_TOOLS] [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather",
      "parameters": {
        "type": "object",
        "properties": { "location": {"type": "string"} },
        "required": ["location"]
      }
    }
  }
][/AVAILABLE_TOOLS]

[INST] What's the weather like in Paris?[/INST]

[TOOL_CALLS] [
  {"name": "get_weather",
   "arguments": {"location": "Paris, France"},
   "id": "call_01"}
]</s>

[TOOL_RESULTS]
{"content": "25°C, sunny", "call_id": "call_01"}
[/TOOL_RESULTS]

It is currently 25°C and sunny in Paris.
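Parsing the Mistral-style completion comes down to finding the [TOOL_CALLS] marker and decoding the JSON array that follows. This is a minimal sketch based only on the template above; parse_mistral_tool_calls and format_tool_result are hypothetical helpers, not part of any Mistral library.

```python
import json


def parse_mistral_tool_calls(completion: str) -> list:
    """Extract the JSON array that follows [TOOL_CALLS]; return [] if there is none."""
    marker = "[TOOL_CALLS]"
    if marker not in completion:
        return []
    payload = completion.split(marker, 1)[1].lstrip()
    try:
        # raw_decode reads one JSON value and ignores trailing text such as </s>.
        calls, _end = json.JSONDecoder().raw_decode(payload)
    except json.JSONDecodeError:
        return []
    return calls if isinstance(calls, list) else []


def format_tool_result(call_id: str, content: str) -> str:
    """Wrap a tool result in the [TOOL_RESULTS] tags shown in the template."""
    body = json.dumps({"content": content, "call_id": call_id}, ensure_ascii=False)
    return f"[TOOL_RESULTS]{body}[/TOOL_RESULTS]"


completion = (
    '[TOOL_CALLS] [\n'
    '  {"name": "get_weather",\n'
    '   "arguments": {"location": "Paris, France"},\n'
    '   "id": "call_01"}\n'
    ']</s>'
)
calls = parse_mistral_tool_calls(completion)
```

In practice the tokenizer's chat template usually handles this formatting; the sketch is meant only to show what sits underneath.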

Qwen series (Alibaba, open weights)

Qwen uses a ChatML-style format, with tool calls wrapped in <tool_call> tags:

Qwen ChatML tool-call template
<|im_start|>system
You are a helpful assistant.

# Tools

You may call one or more functions to assist.

[{"type": "function", "function": {
    "name": "get_weather",
    "description": "Get the weather",
    "parameters": {
      "type": "object",
      "properties": {"location": {"type": "string"}},
      "required": ["location"]
    }
}}]<|im_end|>
<|im_start|>user
What's the weather like in Paris?<|im_end|>
<|im_start|>assistant
<tool_call>
{"name": "get_weather", "arguments": {"location": "Paris, France"}}
</tool_call><|im_end|>
<|im_start|>tool
<tool_response>
{"temperature": "25°C", "condition": "sunny"}
</tool_response><|im_end|>
<|im_start|>assistant
It is currently 25°C and sunny in Paris.<|im_end|>
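Because the calls are delimited by tags, a regular expression is enough to pull them out of the generated text. The sketch below assumes the tag and role layout shown in the template above; extract_tool_calls and wrap_tool_response are hypothetical helpers.

```python
import json
import re

# Non-greedy match so several <tool_call> blocks in one turn are found separately.
TOOL_CALL_BLOCK = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(assistant_text: str) -> list:
    """Return every well-formed tool call found in the assistant's text."""
    calls = []
    for raw in TOOL_CALL_BLOCK.findall(assistant_text):
        try:
            calls.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # skip blocks whose body is not valid JSON
    return calls


def wrap_tool_response(result: dict) -> str:
    """Format a tool result as a tool-role turn, following the template above."""
    body = json.dumps(result, ensure_ascii=False)
    return f"<|im_start|>tool\n<tool_response>\n{body}\n</tool_response><|im_end|>"


text = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "Paris, France"}}\n'
    "</tool_call>\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "Lyon, France"}}\n'
    "</tool_call><|im_end|>"
)
calls = extract_tool_calls(text)
```

Finding more than one block in a single turn is how parallel calls appear in this format; each call would then be executed and answered with its own tool response.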

7. Tool-Calling Comparison Summary

Feature	OpenAI GPT	Claude	LLaMA 3.1	Mistral	Qwen
Open weights	No	No	Yes	Yes	Yes
Call format	function_call item	tool_use block	JSON / Python	[TOOL_CALLS]	<tool_call>
Tool result carrier	function_call_output	tool_result	ipython	[TOOL_RESULTS]	tool
Parallel calls	Supported	Supported	Supported	Supported	Supported
Strict mode	Supported	-	-	-	-
Built-in tools	web_search, etc.	-	brave_search, etc.	-	code_interpreter

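8. Summary

Function Calling is the key capability that turns an LLM from a "text generator" into an "intelligent agent". At its core it is a structured protocol: the model expresses "I need to call this tool" by generating output in a specific format (typically JSON), and the application performs the actual execution and returns the result.

Models differ in the implementation details (special tokens, tag formats, role names, and so on), but the core idea is the same: let the model know which tools exist, decide when to call them, generate correct arguments, and use the returned results to complete the task.

As agent frameworks mature, Function Calling is becoming basic infrastructure for building AI applications. Understanding how it works is essential for anyone developing LLM-based applications.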