Single-turn chat -> multi-turn chat -> a simple agent example

Single-turn chat

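Below is a very basic way to call a large language model. Besides parameters such as model and temperature, the most important argument is messages, the conversation record. The model generates one new message based on the conversation record passed in.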


import os
from openai import OpenAI

default_client = OpenAI(
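    api_key=os.getenv("DASHSCOPE_API_KEY", "xxx"),  # assumes the key is exported as DASHSCOPE_API_KEY; "xxx" is a placeholder fallback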
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)


if __name__ == "__main__":

	completion = default_client.chat.completions.create(
		model="qwen-turbo",
		messages=[
			{'role': 'system', 'content': 'You are a helpful assistant.'},
			{'role': 'user', 'content': '你是谁？'}],  # "Who are you?"
		temperature=0.8,
		max_tokens=120,
		top_p=0.9,
		frequency_penalty=0.5,
		presence_penalty=0.0,
		)
	
	# print(completion.choices[0].message.content)
	print(completion.model_dump_json(indent=2))

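The result is as follows (the question '你是谁？' means "Who are you?"; the model answers in Chinese that it is Tongyi Qianwen, a large language model from Alibaba Group, and lists what it can help with):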

{
  "id": "chatcmpl-cf215b2e-e870-9acd-90cb-e5e6a6a531b0",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "我是通义千问，阿里巴巴集团旗下的超大规模语言模型。我能够帮助你回答问题、创作文字，如写故事、公文、技术文档等，还能表达观点，玩游戏等。如果你有任何问题或需要帮助，随时可以告诉我！",
        "refusal": null,
        "role": "assistant",
        "audio": null,
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1745847448,
  "model": "qwen-turbo",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 53,
    "prompt_tokens": 22,
    "total_tokens": 75,
    "completion_tokens_details": null,
    "prompt_tokens_details": {
      "audio_tokens": null,
      "cached_tokens": 0
    }
  }
}

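The most important field is choices[0].message.content, the text of the first returned candidate. Being first does not mean it is the most probable output: when sampling is used (here temperature=0.8 and top_p=0.9), the reply is drawn at random, although low-probability continuations are excluded.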
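
To make the sampling remark concrete, here is a minimal sketch of top-p (nucleus) filtering on a single next-token distribution. It describes the general technique only; how the hosted model applies top_p and temperature internally is not something this post can confirm. The distribution toy_probs and the helper names top_p_filter and sample_token are made up for illustration, and temperature is left out to keep the sketch short.

```python
import random


def top_p_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = {}, 0.0
    for token, p in ranked:
        kept[token] = p
        total += p
        if total >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1.
    return {token: p / total for token, p in kept.items()}


def sample_token(probs, top_p, rng):
    """Draw one token at random from the filtered distribution."""
    filtered = top_p_filter(probs, top_p)
    tokens = list(filtered)
    weights = [filtered[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token distribution, invented for this example.
toy_probs = {"Qwen": 0.5, "an": 0.3, "your": 0.15, "banana": 0.05}

rng = random.Random(0)
samples = [sample_token(toy_probs, top_p=0.9, rng=rng) for _ in range(1000)]

# Counts vary between tokens, but the low-probability tail never appears.
print({token: samples.count(token) for token in toy_probs})
```

With top_p=0.9 the first three tokens already cover 0.95 of the mass, so "banana" is cut before sampling. The remaining tokens still appear in random proportions, which is why the first choice is not guaranteed to be the most likely one.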

Multi-turn chat

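Single-turn chat only suits simple or one-off tasks. If you are not satisfied with the model's output, you do not have to start over. Instead, take the conversation so far (your original question or task plus the model's first answer), append your feedback, and pass all of these messages as history so that the model generates a new answer.

Multi-turn chat is really just repeatedly adding the model's output and the new user input to the conversation history, then feeding the entire history back to the model to produce the next reply. So all it takes is maintaining a conversation history and appending each new message to it.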


from openai import OpenAI
from openai_client import default_client


default_chat_config = {
	"model": "qwen-turbo",
	"temperature": 0.8,
	"max_tokens": 120,
	"top_p": 0.9,
	"frequency_penalty": 0.5,
	"presence_penalty": 0.0,
}


class ChatBot:

	# Constructor: sets the generation parameters for the model call and the base system prompt
	def __init__(self, 
				openai_client:OpenAI=default_client,
				system_prompt:str="You are a helpful assistant.",
				chat_config:dict=default_chat_config
				):
		
		self.openai_client = openai_client
		self.system_prompt = system_prompt
		self.chat_config = chat_config
		
		self.messages = []
		if self.system_prompt != "":
			self.messages.append({"role": "system", "content": system_prompt})

	# Run one generation: simply pass the maintained conversation to the model
	def execute(self):
		completion = self.openai_client.chat.completions.create(
			messages=self.messages,
			**self.chat_config
		)
		return completion.choices[0].message.content

	# Entry point for multi-turn chat: append the user input to messages as a new user message, call execute() to get the model's reply, then append that reply to messages
	def __call__(self, message):
		self.messages.append({"role": "user", "content": message})
		result = self.execute()
		self.messages.append({"role": "assistant", "content": result})
		return result
	
	# Reset messages, keeping only the system prompt (if any)
	def clear_chat_history(self):
		self.messages = []
		if self.system_prompt != "":
			self.messages.append({"role": "system", "content": self.system_prompt})


if __name__ == "__main__":

	chatbot = ChatBot()

	print(f"[System Prompt]:\n{chatbot.system_prompt}")

	while True:

		user_input = input("[User]:\n")
		if user_input.lower() == "exit":
			break
		
		response = chatbot(user_input)
		print(f"[AI]:\n{response}")

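Next, a quick try of multi-turn chat, mainly to verify that the model can remember the conversation history. To test this, I state my name first and then ask the model for it.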

[System Prompt]:
You are a helpful assistant.
[User]:
I am john, what's your name?
[AI]:
Hello John! My name is Qwen. Nice to meet you! How can I assist you today?
[User]:
What's my name?
[AI]:
I'm sorry, but I don't have access to any personal information about you, including your name. You mentioned that you are John, but I don't have any way of verifying that. Is there anything else you'd like to share or ask? 😊
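
The second reply hedges, but it does repeat "John", which suggests the first turn reached the model. The history bookkeeping itself can also be checked offline. The sketch below is an assumption-laden stand-in, not the ChatBot class: EchoClient is a hypothetical stub that only mimics the chat.completions.create call shape used in this post and records what it was sent, and chat_turn is a made-up helper that repeats the append / call / append pattern.

```python
from types import SimpleNamespace


class EchoClient:
    """Hypothetical stand-in for the OpenAI client: it records each request."""

    def __init__(self):
        self.calls = []
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, messages, **config):
        # Snapshot the history exactly as it looked when the call was made.
        self.calls.append(list(messages))
        message = SimpleNamespace(content=f"reply #{len(self.calls)}")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def chat_turn(client, history, user_text):
    """One turn: append the user message, call the model, append its reply."""
    history.append({"role": "user", "content": user_text})
    completion = client.chat.completions.create(messages=history, model="stub")
    reply = completion.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply


client = EchoClient()
history = [{"role": "system", "content": "You are a helpful assistant."}]
chat_turn(client, history, "I am john, what's your name?")
chat_turn(client, history, "What's my name?")

# Roles the stub received on the second call.
print([m["role"] for m in client.calls[1]])
# → ['system', 'user', 'assistant', 'user']
```

The second request carries the first user message and the first reply, so whatever the model says about the name, it was given the information.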

A simple agent

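An agent builds on multi-turn chat by adding the ability to use tools. The tool execution itself is controlled by our code; the model only produces the tool name and the arguments for the call.

The agent flow follows the ReAct idea: think before acting, then carry out the action (in my prompt, actions are split into a normal action and an answer). Without ReAct, the agent would simply keep executing actions and finish with an answer.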

question --> call tool --> enough to answer? (need not be an explicit check) --> answer
                 ^                 |
                 |-----------------|
The code is as follows:

import json
import re
from chat_bot import ChatBot
from typing import Callable, Dict


def calculator(expression: str) -> str:
	try:
		return str(eval(expression))
	except Exception as e:
		return f"Calculator Error: {str(e)}"


def reverse_text(text: str) -> str:
	return text[::-1]


class Agent:
	def __init__(
		self,
		chatbot: ChatBot,
		tools: Dict[str, Callable] = None,
	):
		self.chatbot = chatbot
		self.tools = tools or {
			"calculator": calculator,
			"reverse_text": reverse_text,
		}
		self.chatbot.clear_chat_history()
		self.chatbot.messages.append({
			"role": "system",
			"content": self._build_agent_prompt()
		})
	
	# Build the system prompt (tool descriptions, the required output format, and the agent's execution flow)
	def _build_agent_prompt(self):
		tool_descriptions = "\n".join([
			"- calculator: evaluates math expressions. Params: {'expression': str}",
			"- reverse_text: reverses a string. Params: {'text': str}",
		])
		return f"""You are a structured reasoning agent. You can use the following tools:

{tool_descriptions}

You run in a loop of Reasoning, Action, and Execution. If you have enough information to answer the question, you should do so. Otherwise, you will carefully observe, reason, and use one of the tools to get the information you need.

1. Reason and Action: In this step, you will reason about the question and decide which tool to use and call one of the tools with the required parameters. You must reply using this format:

<thought>
your current reasoning
</thought>
<action>
{{"function_name": "tool_name", "function_params": {{"param1": "...", "param2": "..."}}}}
</action>

After you call a tool, you will receive an observation. 

2. Observation: It is a passive step where the output of the tool will be returned in the role of User. The observation will be in the format:

<observation>
tool execution output here
</observation>

Only use one tool at a time. Always use valid JSON for <action>. After you receive the observation, you will reason again and decide. If you can answer the question based on existing information or you sense it is an insolvable question, then go to step 3. Otherwise, if you cannot answer the question based on existing information and you still think it is a solvable question, you will go to step 1.

3. Reason and Answer: After you receive the observation, you will reason again and decide if you can answer the question. If you can, reply with:

<thought>
your current reasoning
</thought>
<answer>
your final answer
</answer>
"""

	def __call__(self, user_input: str):

		# User input
		self.chatbot.messages.append({"role": "user", "content": f"User Input: {user_input}"})

		# Loop indefinitely; exits (via return) once the model outputs a final answer
		while True:

			# Model reply
			response = self.chatbot.execute()
			# print(response)

			# Parse the reply
			# The relevant parts are wrapped in HTML-like tags
			# Check if it's the final answer
			final_match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
			if final_match:
				print(f'---\n[AI final response]:\n{response}\n---')
				self.chatbot.messages.append({"role": "assistant", "content": response})
				return final_match.group(1).strip()

			# Extract tool call
			action_match = re.search(r"<action>\n\s*(\{.*?\})\s*\n</action>", response, re.DOTALL)
			if not action_match:
				return "Invalid format: <action> block not found."

			try:
				action_dict = json.loads(action_match.group(1))
				func_name = action_dict["function_name"]
				params = action_dict["function_params"]
				assert isinstance(params, dict)
			except Exception as e:
				return f"Error parsing <action>: {e}"

			if func_name not in self.tools:
				observation = f"Unknown tool: {func_name}"
			else:
				try:
					observation = self.tools[func_name](**params)
					observation = f"tool execution results: \n{observation}"
				except Exception as e:
					observation = f"Error during tool execution: {e}"

			# Append the model's reply and the tool result (observation) to the conversation history
			appending_new_ai_message = {
				"role": "assistant",
				"content": response
			}
			appending_new_tool_message = {
				"role": "user",
				"content": f"<observation>\n{observation}\n</observation>"
			}
            
			# Print the intermediate results for easier inspection
			print(f'---\n[AI intermediate response]:\n{appending_new_ai_message["content"]}\n---')
			print(f'---\n[tool intermediate response]:\n{appending_new_tool_message["content"]}\n---')
			self.chatbot.messages.append(appending_new_ai_message)
			self.chatbot.messages.append(appending_new_tool_message)

if __name__ == "__main__":

	agent = Agent(ChatBot())

	print(agent("What is the result of 12 * 7 + 3?"))
	# print(agent("Reverse the phrase 'Hello Agent'"))

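Below is the conversation log. Note that the [...] labels are decorations I added to tell the roles apart; the actual content is what follows each label.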

---
[AI intermediate response]:
<thought>
To solve this problem, I need to use the calculator tool to evaluate the expression.
</thought>
<action>
{"function_name": "calculator", "function_params": {"expression": "12 * 7 + 3"}}
</action>
---
---
[tool intermediate response]:
<observation>
tool execution results: 
87
</observation>
---
---
[AI final response]:
<thought>
The calculator tool returned the result of the expression 12 * 7 + 3 as 87.
</thought>
<answer>
The result of 12 * 7 + 3 is 87.
</answer>
---
The result of 12 * 7 + 3 is 87.
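
One caveat about the calculator tool in the code above: it hands model-generated text straight to eval, which will run arbitrary Python. A possible alternative, sketched here under my own assumptions, is to parse the expression with the standard ast module and evaluate only plain arithmetic. The name safe_calculator and the set of allowed operators are my choices for illustration, not part of the original agent.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate an AST made only of numbers and whitelisted operators."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def safe_calculator(expression: str) -> str:
    """Same contract as calculator(): always returns a string, never raises."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except Exception as e:
        return f"Calculator Error: {e}"


print(safe_calculator("12 * 7 + 3"))                  # → 87
print(safe_calculator("__import__('os').getcwd()"))   # → Calculator Error: unsupported expression element: Call
```

Because it keeps the same signature and error-string convention, it could be passed to the agent as tools={"calculator": safe_calculator, ...} without touching the loop.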

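Starting from the user input, the model produces structured output. Setting ReAct aside, each iteration just checks whether the model's output is an action or an answer.

If it is an action, it is parsed into a function name and arguments; the function is then executed and its result is returned to the model as an observation (here it is sent with the user role). This step (generating tool arguments + returning the tool result) may run several times, until the model decides it can answer the question.
If it is an answer, it is returned to the user; this is the last step of the agent flow.

Note that ReAct only adds an explicit intermediate reasoning step (the thought) before each action.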
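
The action/answer loop described above can also be exercised without a model. The following is a rough sketch under stated assumptions rather than a drop-in replacement for the Agent class: generate is any callable that maps the message list to a reply string (here a scripted stand-in), MAX_STEPS is a cap I added because the original while True loop has none, and format errors are fed back as observations instead of ending the run, which is a design choice of this sketch.

```python
import json
import re

MAX_STEPS = 5  # assumed cap; the original loop runs until an <answer> appears


def parse_reply(reply):
    """Return ("answer", text), ("action", name, params), or ("invalid", reason)."""
    answer = re.search(r"<answer>(.*?)</answer>", reply, re.DOTALL)
    if answer:
        return ("answer", answer.group(1).strip())
    action = re.search(r"<action>\s*(\{.*\})\s*</action>", reply, re.DOTALL)
    if not action:
        return ("invalid", "no <answer> or <action> block found")
    try:
        call = json.loads(action.group(1))
        return ("action", call["function_name"], call["function_params"])
    except (json.JSONDecodeError, KeyError) as e:
        return ("invalid", f"bad <action> payload: {e}")


def run_agent(generate, tools, question, max_steps=MAX_STEPS):
    """Alternate model replies and tool observations until an answer or the cap."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = generate(messages)
        messages.append({"role": "assistant", "content": reply})
        parsed = parse_reply(reply)
        if parsed[0] == "answer":
            return parsed[1]
        if parsed[0] == "action":
            _, name, params = parsed
            try:
                observation = tools[name](**params) if name in tools else f"Unknown tool: {name}"
            except Exception as e:
                observation = f"Error during tool execution: {e}"
        else:
            observation = f"Format error: {parsed[1]}"
        # The observation goes back to the model with the user role.
        messages.append({"role": "user", "content": f"<observation>\n{observation}\n</observation>"})
    return "Stopped: step limit reached without an <answer>."


# Scripted replies standing in for the model (hypothetical, for illustration).
script = iter([
    '<thought>Need the tool.</thought>\n<action>\n'
    '{"function_name": "reverse_text", "function_params": {"text": "abc"}}\n</action>',
    "<thought>Done.</thought>\n<answer>cba</answer>",
])
result = run_agent(
    lambda messages: next(script),
    {"reverse_text": lambda text: text[::-1]},
    "Reverse 'abc'",
)
print(result)  # → cba
```

The step cap means a model that never emits an answer block ends the run with a clear message instead of looping forever.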