ToolUserAgent
This agent is designed to use the available Tools to answer the user's question. Decisions like "which tool to use" and "how to use it" are made by the large language model you provide in the chatllm parameter.
It runs the following loop internally:
- (Decision 1) Choose a tool that could best answer the current question
- (Decision 2) Decide how to use the chosen tool
- Ask for human confirmation showing them the above decisions
- Run the tool with the chosen arguments and observe its result
- (Decision 3) Decide whether to ask itself another question (and continue the loop) or to return the output of the last tool run to the user
The loop runs for a maximum of max_steps steps or max_duration seconds (whichever happens first).
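The loop above can be sketched roughly as follows. This is a simplified, hypothetical outline, not Embedia's actual implementation; the injected callables stand in for the agent's internal decisions.

```python
import time

def run_agent(question, choose_tool, choose_args, confirm, next_step,
              max_steps=10, max_duration=60):
    # Illustrative sketch of ToolUserAgent's internal loop.
    start = time.monotonic()
    output = None
    for _ in range(max_steps):
        if time.monotonic() - start > max_duration:
            break  # time budget exhausted
        tool = choose_tool(question)                  # Decision 1: pick a tool
        args = choose_args(question, tool)            # Decision 2: pick its arguments
        if not confirm(tool, args):                   # human confirmation
            break
        output = tool(**args)                         # run the tool, observe result
        question, done = next_step(question, output)  # Decision 3: continue or stop?
        if done:
            break
    return output
```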
Since ToolUserAgent is a subclass of Tool, it can be used as a tool for another ToolUserAgent.
Parameters
- chatllm (ChatLLM): The ChatLLM instance used to choose the appropriate tool, choose the tool arguments, and decide the next step.
- tools (List[Tool]): The tools available to the agent.
- max_steps (int, optional): The maximum number of steps the agent can take. Defaults to 10.
- max_duration (int, optional): The maximum duration (in seconds) the agent is allowed to run. Defaults to 60.
Returns
The return value is of type ToolReturn (Learn more about: ToolReturn) with the following output and exit_code:
- output (str): Output of the last tool run.
- exit_code (int): 0 if success, 1 if failure.
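For illustration, the two fields can be read off the returned value like this. The ToolReturn below is a stand-in modelled on the field descriptions above, not the actual Embedia class:

```python
from typing import NamedTuple

class ToolReturn(NamedTuple):
    # Illustrative stand-in modelled on the docs; the real class lives in embedia.
    output: str     # output of the last tool run
    exit_code: int  # 0 on success, 1 on failure

resp = ToolReturn(output="34", exit_code=0)
if resp.exit_code == 0:
    print(resp.output)
```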
Usage
Using the ToolUserAgent is as simple as giving it the tools and asking it a question. It will then run in a loop, using tools and observing their results, until your question is answered.
What is important to note is the steps the agent takes in order to answer the question. To understand them, let us walk through the events that get published during one iteration of the agent's loop.
In the following example we are giving the ToolUserAgent access to the PythonInterpreterTool and TerminalTool (Learn more about PythonInterpreterTool, TerminalTool) and asking it to count the number of lines in the current file.
Note that the order in which you provide the tools might have an effect on the agent's choice. This is because an LLM gives more priority to tokens that appear earlier in the prompt.
import asyncio
import os

import openai
import tiktoken

from embedia import ChatLLM, Tokenizer
from embedia.agents import ToolUserAgent
from embedia.tools import PythonInterpreterTool, TerminalTool


class OpenAITokenizer(Tokenizer):
    def __init__(self):
        super().__init__()

    async def _tokenize(self, text):
        return tiktoken.encoding_for_model("gpt-3.5-turbo").encode(text)


class OpenAIChatLLM(ChatLLM):
    def __init__(self):
        super().__init__(tokenizer=OpenAITokenizer())
        openai.api_key = os.environ["OPENAI_API_KEY"]

    async def _reply(self, prompt):
        completion = await openai.ChatCompletion.acreate(
            model="gpt-3.5-turbo",
            temperature=0.1,
            messages=[
                {"role": msg.role, "content": msg.content} for msg in self.chat_history
            ],
        )
        return completion.choices[0].message.content


if __name__ == "__main__":
    tool_user_agent = ToolUserAgent(
        chatllm=OpenAIChatLLM(), tools=[PythonInterpreterTool(), TerminalTool()]
    )
    resp = asyncio.run(
        tool_user_agent(
            "Count the number of lines of code (without empty lines) in main.py"
        )
    )
Running the above code prints the following on the terminal (explanation below):
[time: 2023-10-05T16:08:11.107019+05:30] [id: 140095190135312] [event: Tool Start]
Tool: ToolUserAgent
Args: ('Count the number of lines of code (without empty lines) in main.py',)
Kwargs: {}
[time: 2023-10-05T16:08:11.107092+05:30] [id: 140095190135312] [event: Agent Start]
Main Question:
Count the number of lines of code (without empty lines) in main.py
[time: 2023-10-05T16:08:11.224911+05:30] [id: 140095207849232] [event: ChatLLM Init]
system (40 tokens):
You're an expert in choosing the best tool for answering the user's question.
The list of tools and their descriptions will be provided to you.
Reply with the name of the chosen tool and nothing else
[time: 2023-10-05T16:08:11.256197+05:30] [id: 140095207849232] [event: ChatLLM Start]
user (42 tokens):
Question: Count the number of lines of code (without empty lines) in main.py
Tools:
Python Interpreter: Runs the provided python code in the current interpreter
Terminal: Run the provided commands in the shell
[time: 2023-10-05T16:08:12.736856+05:30] [id: 140095207849232] [event: ChatLLM End]
assistant (2 tokens):
Python Interpreter
[time: 2023-10-05T16:08:12.781312+05:30] [id: 140095673279376] [event: ChatLLM Init]
system (81 tokens):
You're an expert in choosing the function arguments based on the user's question.
The question, the function description, the list of parameters and their descriptions
will be provided to you. Reply with all arguments in the following json format:
{
<parameter name>: <argument value>,
<parameter name>: <argument value>,
<parameter name>: <argument value>,
}
Do not reply with anything else
[time: 2023-10-05T16:08:12.825028+05:30] [id: 140095673279376] [event: ChatLLM Start]
user (73 tokens):
Question: Count the number of lines of code (without empty lines) in main.py
Function: Runs the provided python code in the current interpreter
Parameters:
code: Python code to be run (type: str)
vars: A python dictionary containing variables to be passed to the code
timeout: Timeout in seconds (type: int). Defaults to 60.
[time: 2023-10-05T16:08:17.067839+05:30] [id: 140095673279376] [event: ChatLLM End]
assistant (74 tokens):
{
"code": "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)",
"vars": {},
"timeout": 60
}
Tool: ToolUserAgent
Details: {'tool': 'Python Interpreter', 'args': {'code': "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)", 'vars': {}, 'timeout': 60}} Confirm (y/n): y
[time: 2023-10-05T16:08:31.952483+05:30] [id: 140095732346320] [event: Tool Start]
Tool: PythonInterpreterTool
Args: ()
Kwargs: {'code': "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)", 'vars': {}, 'timeout': 60}
[time: 2023-10-05T16:08:31.963782+05:30] [id: 140095732346320] [event: Tool End]
Tool: PythonInterpreterTool
ExitCode: 0
Output:
34
[time: 2023-10-05T16:08:32.025006+05:30] [id: 140095190135312] [event: Agent Step]
Question: Count the number of lines of code (without empty lines) in main.py
ToolChoice: Python Interpreter
ToolArgs: {'code': "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)", 'vars': {}, 'timeout': 60}
ToolExitCode: 0
ToolOutput: 34
[time: 2023-10-05T16:08:32.058926+05:30] [id: 140095190069328] [event: ChatLLM Init]
system (113 tokens):
You're an expert in deciding what next question should be asked (if any) to reach the final answer.
Your question will be acted upon and its result will be provided to you. This will repeat until we reach the final answer.
The main question, actions taken till now and their results will be provided to you.
If we've reached the final answer, reply with the answer in the following format:
Final Answer: <final answer>
If not, reply with the next question in the following format:
Question: <next question>
Do not reply with anything else
[time: 2023-10-05T16:08:32.094455+05:30] [id: 140095190069328] [event: ChatLLM Start]
user (42 tokens):
Main question: Count the number of lines of code (without empty lines) in main.py
Question: Count the number of lines of code (without empty lines) in main.py
Output: 34
[time: 2023-10-05T16:08:33.368412+05:30] [id: 140095190069328] [event: ChatLLM End]
assistant (5 tokens):
Final Answer: 34
[time: 2023-10-05T16:08:33.419216+05:30] [id: 140095190135312] [event: Agent End]
Final Answer:
34
[time: 2023-10-05T16:08:33.452890+05:30] [id: 140095190135312] [event: Tool End]
Tool: ToolUserAgent
ExitCode: 0
Output:
34
Let us go through the above output step by step:
(Line no 1-4) Since the agent is a subclass of Tool, the ToolStart event gets published first, just like for every other Tool in Embedia.
(Line no 6-8) Then the AgentStart event gets published, where the user's question is printed as the "Main Question".
Decision 1: Which tool to use?
Now, the first decision needs to be taken: "Which tool to use?".
(Line no 10-14) To do that, a copy of the provided ChatLLM instance is created and assigned the system prompt mentioned in Persona.ToolChooser (Learn more about: Persona). This publishes the ChatLLMInit event.
(Line no 16-18) The above created ChatLLM instance is then given the current "Question" it needs to answer and the tools it needs to choose from. This publishes the ChatLLMStart event.
(Line no 24-26) The ChatLLM instance then replies with the name of the chosen tool. This publishes the ChatLLMEnd event.
If the ChatLLM instance replies with a tool that is not in the list of available tools, the agent will raise an AgentError.
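That validation step might look something like the following sketch. Both validate_tool_choice and the AgentError stub are illustrative, not Embedia's actual code:

```python
class AgentError(Exception):
    # Illustrative stand-in for embedia's AgentError.
    pass

def validate_tool_choice(reply, tools):
    # tools: hypothetical mapping of tool name -> tool instance.
    name = reply.strip()
    if name not in tools:
        raise AgentError(f"Chosen tool {name!r} is not in the available tools")
    return tools[name]
```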
Decision 2: How to use the tool?
After the tool is chosen, the next decision that needs to be taken is: "How to use the tool?".
(Line no 28-38) For this, a separate copy of the provided ChatLLM instance is created and assigned the system prompt mentioned in Persona.ToolArgsChooser. This publishes another ChatLLMInit event.
(Line no 40-47) The above created ChatLLM instance is then given the current "Question" it needs to answer, the description of the chosen tool, and the parameters it needs along with their descriptions. This publishes the ChatLLMStart event.
(Line no 50-56) The ChatLLM instance then replies with the arguments to be passed to the chosen tool. This publishes the ChatLLMEnd event.
If the ChatLLM instance replies with a non-JSON response, or a JSON response that does not have all the required parameters, the agent will raise an AgentError.
Each chosen argument is passed through Python's eval function to convert it from a string to its actual Python type before it is sent to the next step. If the conversion fails, the string is passed as-is.
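The eval-with-fallback behaviour described above can be sketched like this (coerce is an illustrative name, not Embedia's):

```python
def coerce(value):
    # Try to recover the real Python type from the LLM's string argument;
    # if eval fails, fall back to the raw string, as described above.
    if not isinstance(value, str):
        return value
    try:
        return eval(value)  # e.g. "60" -> 60, "{'a': 1}" -> {'a': 1}
    except Exception:
        return value  # e.g. "main.py" is not valid Python, so it stays a string
```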
Human confirmation
After the tool and its arguments are chosen, the agent asks for approval from the user.
(Line no 58-59) This is done by printing the details of the chosen tool and its arguments and asking the user to confirm. You need to enter y to proceed further. Any other input is considered the same as n.
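A minimal sketch of this confirmation step (the function name and the injectable read_input parameter are illustrative, not Embedia's actual code):

```python
def ask_confirmation(tool_name, details, read_input=input):
    # Print the chosen tool and arguments, then wait for the user's answer.
    answer = read_input(f"Tool: {tool_name}\nDetails: {details} Confirm (y/n): ")
    return answer == "y"  # anything other than "y" counts as "n"
```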
Running the tool
(Line no 61-70) After the user confirms, the chosen tool is run with the chosen arguments. This publishes the ToolStart and ToolEnd events.
(Line no 72-77) This concludes the first iteration of the loop. This publishes the AgentStep event. All the important datapoints from the iteration are stored in the step_history attribute of the ToolUserAgent. The datapoints include:
- question: The question that was asked in the iteration.
- tool_choice: The tool that was chosen to answer the question.
- tool_args: The arguments that were chosen for the tool.
- tool_exit_code: Whether the tool run succeeded (0) or failed (1).
- tool_output: The output of the tool.
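One entry of step_history could be pictured as a record like the following. This is an illustrative dataclass built from the field descriptions above; the actual Embedia type may differ:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class AgentStep:
    # Illustrative shape of one step_history entry.
    question: str        # question asked in this iteration
    tool_choice: str     # tool chosen to answer it
    tool_args: dict      # arguments chosen for the tool
    tool_exit_code: int  # 0 on success, 1 on failure
    tool_output: Any     # output of the tool run

step = AgentStep(
    question="Count the number of lines of code (without empty lines) in main.py",
    tool_choice="Python Interpreter",
    tool_args={"vars": {}, "timeout": 60},
    tool_exit_code=0,
    tool_output="34",
)
```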
Decision 3: Continue the loop or return?
Now, the agent needs to decide whether to continue the loop or not.
(Line no 79-88) To do that, a copy of the provided ChatLLM instance is created and assigned the system prompt mentioned in Persona.Sys1Thinker. This publishes another ChatLLMInit event.
(Line no 90-95) The above created ChatLLM instance is then given the "Main Question" and the actions taken till now along with their results. This publishes the ChatLLMStart event.
(Line no 97-99) The ChatLLM instance can take one of two actions:
- If the answer to the "Main Question" is found, it replies with the "Final Answer".
- If the answer to the "Main Question" is not found, it replies with the next "Question" to be asked.
Either way, the ChatLLMEnd event is published.
(Line no 101-103) If the "Final Answer" is returned by the ChatLLM instance, the loop is broken and the AgentEnd event is published with the "Final Answer" as the output.
If a "Question" is returned by the ChatLLM instance, the loop will continue and print lines similar to lines 10-99 again. But this time the "Main Question" will be replaced by the "Question" returned by the ChatLLM instance.
If the ChatLLM reply doesn't start with "Final Answer:" or "Question:", the agent will raise an AgentError.
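The required reply format can be illustrated with a small parser sketch (parse_next_step is a hypothetical name, and the real agent raises AgentError rather than ValueError):

```python
def parse_next_step(reply):
    # Decision 3 replies must start with "Final Answer:" or "Question:".
    reply = reply.strip()
    if reply.startswith("Final Answer:"):
        return "final", reply[len("Final Answer:"):].strip()
    if reply.startswith("Question:"):
        return "question", reply[len("Question:"):].strip()
    raise ValueError("reply must start with 'Final Answer:' or 'Question:'")
```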
(Line no 105-109) After the AgentEnd event is published, the ToolEnd event is published with the same output.