ToolUserAgent
This agent is designed to use the available Tools to answer the user's question. Decisions like "which tool to use" and "how to use it" are made by the large language model you provide in the chatllm parameter.
It runs the following loop internally:
- (Decision 1) Choose a tool that could best answer the current question
- (Decision 2) Decide how to use the chosen tool
- Ask for human confirmation showing them the above decisions
- Run the tool with the chosen arguments and observe its result
- (Decision 3) Decide whether to ask itself another question (and continue the loop) or to return the output of the last tool run to the user
The loop runs for a maximum of max_steps steps or max_duration seconds (whichever happens first).
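The loop above can be sketched roughly as follows. This is a simplified, hypothetical outline, not Embedia's actual implementation; the injected callables stand in for the agent's internal decisions.

```python
import time

def run_agent(question, choose_tool, choose_args, confirm, next_step,
              max_steps=10, max_duration=60):
    # Illustrative sketch of ToolUserAgent's internal loop.
    start = time.monotonic()
    output = None
    for _ in range(max_steps):
        if time.monotonic() - start > max_duration:
            break  # time budget exhausted
        tool = choose_tool(question)                  # Decision 1: pick a tool
        args = choose_args(question, tool)            # Decision 2: pick its arguments
        if not confirm(tool, args):                   # human confirmation
            break
        output = tool(**args)                         # run the tool, observe result
        question, done = next_step(question, output)  # Decision 3: continue or stop?
        if done:
            break
    return output
```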
Since ToolUserAgent is a subclass of Tool, it can be used as a tool for another ToolUserAgent.
Parameters
- chatllm (ChatLLM): The ChatLLM instance used to choose the appropriate tool, choose the tool arguments, and decide the next step.
- tools (List[Tool]): The tools available to the agent.
- max_steps (int, optional): The maximum number of steps the agent can take. Defaults to 10.
- max_duration (int, optional): The maximum duration (in seconds) the agent is allowed to run. Defaults to 60.
Returns
The return value is of type ToolReturn (Learn more about: ToolReturn) with the following output and exit_code:
- output (str): Output of the last tool run.
- exit_code (int): 0 if success, 1 if failure.
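For illustration, the two fields can be read off the returned value like this. The ToolReturn below is a stand-in modelled on the field descriptions above, not the actual Embedia class:

```python
from typing import NamedTuple

class ToolReturn(NamedTuple):
    # Illustrative stand-in modelled on the docs; the real class lives in embedia.
    output: str     # output of the last tool run
    exit_code: int  # 0 on success, 1 on failure

resp = ToolReturn(output="34", exit_code=0)
if resp.exit_code == 0:
    print(resp.output)
```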
Usage
Using the ToolUserAgent is as simple as giving it the tools and asking it a question. It will then run in a loop, using tools and observing their results, until your question is answered.
What is important to note is the steps the agent takes in order to answer the question. To understand them, let us walk through the events that get published during one iteration of the agent's loop.
In the following example we are giving the ToolUserAgent access to the PythonInterpreterTool and TerminalTool (Learn more about PythonInterpreterTool, TerminalTool) and asking it to count the number of lines in the current file.
Note that the order in which you provide the tools might have an effect on the agent's choice. This is because an LLM gives more priority to tokens that appear earlier in the prompt.
import asyncio
import os

import openai
import tiktoken

from embedia import ChatLLM, Tokenizer
from embedia.agents import ToolUserAgent
from embedia.tools import PythonInterpreterTool, TerminalTool


class OpenAITokenizer(Tokenizer):
    def __init__(self):
        super().__init__()

    async def _tokenize(self, text):
        return tiktoken.encoding_for_model("gpt-3.5-turbo").encode(text)


class OpenAIChatLLM(ChatLLM):
    def __init__(self):
        super().__init__(tokenizer=OpenAITokenizer())
        openai.api_key = os.environ["OPENAI_API_KEY"]

    async def _reply(self, prompt):
        completion = await openai.ChatCompletion.acreate(
            model="gpt-3.5-turbo",
            temperature=0.1,
            messages=[
                {"role": msg.role, "content": msg.content} for msg in self.chat_history
            ],
        )
        return completion.choices[0].message.content


if __name__ == "__main__":
    tool_user_agent = ToolUserAgent(
        chatllm=OpenAIChatLLM(), tools=[PythonInterpreterTool(), TerminalTool()]
    )
    resp = asyncio.run(
        tool_user_agent(
            "Count the number of lines of code (without empty lines) in main.py"
        )
    )
Running the above code prints the following on the terminal (explanation below):
[time: 2023-10-05T16:08:11.107019+05:30] [id: 140095190135312] [event: Tool Start]
Tool: ToolUserAgent
Args: ('Count the number of lines of code (without empty lines) in main.py',)
Kwargs: {}
[time: 2023-10-05T16:08:11.107092+05:30] [id: 140095190135312] [event: Agent Start]
Main Question:
Count the number of lines of code (without empty lines) in main.py
[time: 2023-10-05T16:08:11.224911+05:30] [id: 140095207849232] [event: ChatLLM Init]
system (40 tokens):
You're an expert in choosing the best tool for answering the user's question.
The list of tools and their descriptions will be provided to you.
Reply with the name of the chosen tool and nothing else
[time: 2023-10-05T16:08:11.256197+05:30] [id: 140095207849232] [event: ChatLLM Start]
user (42 tokens):
Question: Count the number of lines of code (without empty lines) in main.py
Tools:
Python Interpreter: Runs the provided python code in the current interpreter
Terminal: Run the provided commands in the shell
[time: 2023-10-05T16:08:12.736856+05:30] [id: 140095207849232] [event: ChatLLM End]
assistant (2 tokens):
Python Interpreter
[time: 2023-10-05T16:08:12.781312+05:30] [id: 140095673279376] [event: ChatLLM Init]
system (81 tokens):
You're an expert in choosing the function arguments based on the user's question.
The question, the function description, the list of parameters and their descriptions
will be provided to you. Reply with all arguments in the following json format:
{
<parameter name>: <argument value>,
<parameter name>: <argument value>,
<parameter name>: <argument value>,
}
Do not reply with anything else
[time: 2023-10-05T16:08:12.825028+05:30] [id: 140095673279376] [event: ChatLLM Start]
user (73 tokens):
Question: Count the number of lines of code (without empty lines) in main.py
Function: Runs the provided python code in the current interpreter
Parameters:
code: Python code to be run (type: str)
vars: A python dictionary containing variables to be passed to the code
timeout: Timeout in seconds (type: int). Defaults to 60.
[time: 2023-10-05T16:08:17.067839+05:30] [id: 140095673279376] [event: ChatLLM End]
assistant (74 tokens):
{
"code": "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)",
"vars": {},
"timeout": 60
}
Tool: ToolUserAgent
Details: {'tool': 'Python Interpreter', 'args': {'code': "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)", 'vars': {}, 'timeout': 60}} Confirm (y/n): y
[time: 2023-10-05T16:08:31.952483+05:30] [id: 140095732346320] [event: Tool Start]
Tool: PythonInterpreterTool
Args: ()
Kwargs: {'code': "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)", 'vars': {}, 'timeout': 60}
[time: 2023-10-05T16:08:31.963782+05:30] [id: 140095732346320] [event: Tool End]
Tool: PythonInterpreterTool
ExitCode: 0
Output:
34
[time: 2023-10-05T16:08:32.025006+05:30] [id: 140095190135312] [event: Agent Step]
Question: Count the number of lines of code (without empty lines) in main.py
ToolChoice: Python Interpreter
ToolArgs: {'code': "with open('main.py', 'r') as file:\n lines = file.readlines()\n non_empty_lines = [line for line in lines if line.strip() != '']\n line_count = len(non_empty_lines)\n print(line_count)", 'vars': {}, 'timeout': 60}
ToolExitCode: 0
ToolOutput: 34
[time: 2023-10-05T16:08:32.058926+05:30] [id: 140095190069328] [event: ChatLLM Init]
system (113 tokens):
You're an expert in deciding what next question should be asked (if any) to reach the final answer.
Your question will be acted upon and its result will be provided to you. This will repeat until we reach the final answer.
The main question, actions taken till now and their results will be provided to you.
If we've reached the final answer, reply with the answer in the following format:
Final Answer: <final answer>
If not, reply with the next question in the following format:
Question: <next question>
Do not reply with anything else
[time: 2023-10-05T16:08:32.094455+05:30] [id: 140095190069328] [event: ChatLLM Start]
user (42 tokens):
Main question: Count the number of lines of code (without empty lines) in main.py
Question: Count the number of lines of code (without empty lines) in main.py
Output: 34
[time: 2023-10-05T16:08:33.368412+05:30] [id: 140095190069328] [event: ChatLLM End]
assistant (5 tokens):
Final Answer: 34
[time: 2023-10-05T16:08:33.419216+05:30] [id: 140095190135312] [event: Agent End]
Final Answer:
34
[time: 2023-10-05T16:08:33.452890+05:30] [id: 140095190135312] [event: Tool End]
Tool: ToolUserAgent
ExitCode: 0
Output:
34
Let us go through the above output step by step:
(Line no 1-4) Since the agent is a subclass of Tool, the ToolStart event gets published first, just like for every other Tool in Embedia.
(Line no 6-8) Then the AgentStart event gets published, where the user's question is printed as the "Main Question".
Decision 1: Which tool to use?
Now, the first decision needs to be taken: "Which tool to use?".
(Line no 10-14) To do that, a copy of the provided ChatLLM instance is created and assigned the system prompt mentioned in Persona.ToolChooser (Learn more about: Persona). This publishes the ChatLLMInit event.
(Line no 16-18) The above created ChatLLM instance is then given the current "Question" it needs to answer and the tools it needs to choose from. This publishes the ChatLLMStart event.
(Line no 24-26) The ChatLLM instance then replies with the name of the chosen tool. This publishes the ChatLLMEnd event.
If the ChatLLM instance replies with a tool that is not in the list of available tools, the agent will raise an AgentError.
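That validation step might look something like the following sketch. Both validate_tool_choice and the AgentError stub are illustrative, not Embedia's actual code:

```python
class AgentError(Exception):
    # Illustrative stand-in for embedia's AgentError.
    pass

def validate_tool_choice(reply, tools):
    # tools: hypothetical mapping of tool name -> tool instance.
    name = reply.strip()
    if name not in tools:
        raise AgentError(f"Chosen tool {name!r} is not in the available tools")
    return tools[name]
```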
Decision 2: How to use the tool?
After the tool is chosen, the next decision that needs to be taken is: "How to use the tool?".
(Line no 28-38) For this, a separate copy of the provided ChatLLM instance is created and assigned the system prompt mentioned in Persona.ToolArgsChooser. This publishes another ChatLLMInit event.
(Line no 40-47) The above created ChatLLM instance is then given the current "Question" it needs to answer, the description of the chosen tool, and the parameters it needs along with their descriptions. This publishes the ChatLLMStart event.
(Line no 50-56) The ChatLLM instance then replies with the arguments to be passed to the chosen tool. This publishes the ChatLLMEnd event.
If the ChatLLM instance replies with a non-JSON response, or a JSON response that does not have all the required parameters, the agent will raise an AgentError.
Each chosen argument is passed through Python's eval function to convert it from a string to its actual Python type before it is sent to the next step. If the conversion fails, the string is passed as-is.
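The eval-with-fallback behaviour described above can be sketched like this (coerce is an illustrative name, not Embedia's):

```python
def coerce(value):
    # Try to recover the real Python type from the LLM's string argument;
    # if eval fails, fall back to the raw string, as described above.
    if not isinstance(value, str):
        return value
    try:
        return eval(value)  # e.g. "60" -> 60, "{'a': 1}" -> {'a': 1}
    except Exception:
        return value  # e.g. "main.py" is not valid Python, so it stays a string
```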
Human confirmation
After the tool and its arguments are chosen, the agent asks for approval from the user.
(Line no 58-59) This is done by printing the details of the chosen tool and its arguments and asking the user to confirm. You need to enter y to proceed further. Any other input is considered the same as n.
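A minimal sketch of this confirmation step (the function name and the injectable read_input parameter are illustrative, not Embedia's actual code):

```python
def ask_confirmation(tool_name, details, read_input=input):
    # Print the chosen tool and arguments, then wait for the user's answer.
    answer = read_input(f"Tool: {tool_name}\nDetails: {details} Confirm (y/n): ")
    return answer == "y"  # anything other than "y" counts as "n"
```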
Running the tool
(Line no 61-70) After the user confirms, the chosen tool is run with the chosen arguments. This publishes the ToolStart and ToolEnd events.
(Line no 72-77) This concludes the first iteration of the loop. This publishes the AgentStep event. All the important datapoints from the iteration are stored in the step_history attribute of the ToolUserAgent. The datapoints include:
- question: The question that was asked in the iteration.
- tool_choice: The tool that was chosen to answer the question.
- tool_args: The arguments that were chosen for the tool.
- tool_exit_code: Whether the tool run succeeded (0) or failed (1).
- tool_output: The output of the tool.
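One entry of step_history could be pictured as a record like the following. This is an illustrative dataclass built from the field descriptions above; the actual Embedia type may differ:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class AgentStep:
    # Illustrative shape of one step_history entry.
    question: str        # question asked in this iteration
    tool_choice: str     # tool chosen to answer it
    tool_args: dict      # arguments chosen for the tool
    tool_exit_code: int  # 0 on success, 1 on failure
    tool_output: Any     # output of the tool run

step = AgentStep(
    question="Count the number of lines of code (without empty lines) in main.py",
    tool_choice="Python Interpreter",
    tool_args={"vars": {}, "timeout": 60},
    tool_exit_code=0,
    tool_output="34",
)
```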
Decision 3: Continue the loop or return?
Now, the agent needs to decide whether to continue the loop or not.
(Line no 79-88) To do that, a copy of the provided ChatLLM instance is created and assigned the system prompt mentioned in Persona.Sys1Thinker. This publishes another ChatLLMInit event.
(Line no 90-95) The above created ChatLLM instance is then given the "Main Question" and the actions taken till now along with their results. This publishes the ChatLLMStart event.
(Line no 97-99) The ChatLLM instance can take one of two actions:
- If the answer to the "Main Question" is found, it replies with the "Final Answer".
- If the answer to the "Main Question" is not found, it replies with the next "Question" to be asked.
Either way, the ChatLLMEnd event is published.
(Line no 101-103) If the "Final Answer" is returned by the ChatLLM instance, the loop is broken and the AgentEnd event is published with the "Final Answer" as the output.
If a "Question" is returned by the ChatLLM instance, the loop will continue and print lines similar to lines 10-99 again. But this time the "Main Question" will be replaced by the "Question" returned by the ChatLLM instance.
If the ChatLLM reply doesn't start with "Final Answer:" or "Question:", the agent will raise an AgentError.
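The required reply format can be illustrated with a small parser sketch (parse_next_step is a hypothetical name, and the real agent raises AgentError rather than ValueError):

```python
def parse_next_step(reply):
    # Decision 3 replies must start with "Final Answer:" or "Question:".
    reply = reply.strip()
    if reply.startswith("Final Answer:"):
        return "final", reply[len("Final Answer:"):].strip()
    if reply.startswith("Question:"):
        return "question", reply[len("Question:"):].strip()
    raise ValueError("reply must start with 'Final Answer:' or 'Question:'")
```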
(Line no 105-109) After the AgentEnd event is published, the ToolEnd event is published with the same output.