Minimal Docker Sandbox with GPT-3.5 Execution Example (All-Hands-AI#48)

* minimal docker sandbox
* make container_image an argument (fall back to ubuntu); increase timeout to avoid returning too early for long-running commands
* add a minimal working (imperfect) example
* fix typo
* change default container name
* attempt to fix "Bad file descriptor" error
* handle ctrl+D
* add Python gitignore
* push sandbox to shared dockerhub for ease of use
* move codeact example into research folder
* add README for opendevin
* change container image name to opendevin dockerhub
* move folder; change example to a more general agent
* update Message and Role
* update docker sandbox to support mounting folder and switch to user with correct permission
* make network as host
* handle errors when attrs are not set yet
* convert codeact agent into a compatible agent
* add workspace to gitignore
* make sure the agent interface adjustment works for langchain_agent
omok · Mar 21, 2024 · commit 2de75d4
1 parent a722f5c
Showing 12 changed files with 373 additions and 13 deletions.
diff --git a/.gitignore b/.gitignore
@@ -187,4 +187,4 @@ yarn-error.log*
 
 # agent
 .envrc
-agent/workspace
+/workspace
diff --git a/agenthub/__init__.py b/agenthub/__init__.py
@@ -1 +1,2 @@
 from . import langchains_agent
+from . import codeact_agent
diff --git a/agenthub/codeact_agent/README.md b/agenthub/codeact_agent/README.md
@@ -0,0 +1,21 @@
+# CodeAct-based Agent Framework
+
+This folder implements the [CodeAct idea](https://arxiv.org/abs/2402.13463), which relies on an LLM to autonomously perform actions in a Bash shell. It requires more from the LLM itself: the LLM needs to be capable enough to complete the whole task autonomously, instead of getting stuck in an infinite loop.
+
+A minimalistic example can be found at [research/codeact/examples/run_flask_server_with_bash.py](./examples/run_flask_server_with_bash.py):
+
+```bash
+mkdir workspace
+PYTHONPATH=`pwd`:$PYTHONPATH python3 opendevin/main.py -d ./workspace -c CodeActAgent -t "Please write a flask app that returns 'Hello, World\!' at the root URL, then start the app on port 5000. python3 has already been installed for you."
+```
+
+
+Example: prompting `gpt-3.5-turbo-0125` to write a flask server, install the `flask` library, and start the server.
+
+<img width="951" alt="image" src="https://github.com/OpenDevin/OpenDevin/assets/38853559/325c3115-a343-4cc5-a92b-f1e5d552a077">
+
+<img width="957" alt="image" src="https://github.com/OpenDevin/OpenDevin/assets/38853559/68ad10c1-744a-4e9d-bb29-0f163d665a0a">
+
+Most things work as expected, except that at the end the model did not follow the instruction to stop the interaction by outputting `<execute> exit </execute>`.
+
+**TODO**: This should be fixable by either (1) including a complete in-context example like [this](https://github.com/xingyaoww/mint-bench/blob/main/mint/tasks/in_context_examples/reasoning/with_tool.txt), OR (2) collecting some interaction data like this and fine-tuning a model (like [this](https://github.com/xingyaoww/code-act), a more complex route).
diff --git a/agenthub/codeact_agent/__init__.py b/agenthub/codeact_agent/__init__.py
@@ -0,0 +1,124 @@
+import os
+import re
+import argparse
+from litellm import completion
+from termcolor import colored
+from typing import List, Dict
+
+from opendevin.agent import Agent, Message, Role
+from opendevin.sandbox.docker import DockerInteractive
+
+assert (
+    "OPENAI_API_KEY" in os.environ
+), "Please set the OPENAI_API_KEY environment variable."
+
+
+
+SYSTEM_MESSAGE = """You are a helpful assistant. You will be provided access (as root) to a bash shell to complete user-provided tasks.
+You will be able to execute commands in the bash shell, interact with the file system, install packages, and receive the output of your commands.
+
+DO NOT provide code in ```triple backticks```. Instead, you should execute bash command on behalf of the user by wrapping them with <execute> and </execute>.
+For example:
+
+You can list the files in the current directory by executing the following command:
+<execute>ls</execute>
+
+You can also install packages using pip:
+<execute> pip install numpy </execute>
+
+You can also write a block of code to a file:
+<execute>
+echo "import math
+print(math.pi)" > math.py
+</execute>
+
+When you are done, execute "exit" to close the shell and end the conversation.
+"""
+
+INVALID_INPUT_MESSAGE = (
+    "I don't understand your input. \n"
+    "If you want to execute command, please use <execute> YOUR_COMMAND_HERE </execute>.\n"
+    "If you already completed the task, please exit the shell by generating: <execute> exit </execute>."
+)
+
+
+def parse_response(response) -> str:
+    action = response.choices[0].message.content
+    if "<execute>" in action and "</execute>" not in action:
+        action += "</execute>"
+    return action
+
+
+class CodeActAgent(Agent):
+    def __init__(
+        self,
+        instruction: str,
+        workspace_dir: str,
+        model_name: str,
+        max_steps: int = 100
+    ) -> None:
+        """
+        Initializes a new instance of the CodeActAgent class.
+
+        Parameters:
+        - instruction (str): The instruction for the agent to execute.
+        - max_steps (int): The maximum number of steps to run the agent.
+        """
+        super().__init__(instruction, workspace_dir, model_name, max_steps)
+        self._history = [Message(Role.SYSTEM, SYSTEM_MESSAGE)]
+        self._history.append(Message(Role.USER, instruction))
+        self.env = DockerInteractive(workspace_dir=workspace_dir)
+        print(colored("===USER:===\n" + instruction, "green"))
+
+    def _history_to_messages(self) -> List[Dict]:
+        return [message.to_dict() for message in self._history]
+
+    def run(self) -> None:
+        """
+        Starts the execution of the assigned instruction. This method should
+        be implemented by subclasses to define the specific execution logic.
+        """
+        for _ in range(self.max_steps):
+            response = completion(
+                messages=self._history_to_messages(),
+                model=self.model_name,
+                stop=["</execute>"],
+                temperature=0.0,
+                seed=42,
+            )
+            action = parse_response(response)
+            self._history.append(Message(Role.ASSISTANT, action))
+            print(colored("===ASSISTANT:===\n" + action, "yellow"))
+
+            command = re.search(r"<execute>(.*)</execute>", action, re.DOTALL)
+            if command is not None:
+                # a command was found
+                command = command.group(1)
+                if command.strip() == "exit":
+                    print(colored("Exit received. Exiting...", "red"))
+                    break
+                # execute the code
+                observation = self.env.execute(command)
+                self._history.append(Message(Role.ENVIRONMENT, observation))
+                print(colored("===ENV OBSERVATION:===\n" + observation, "blue"))
+            else:
+                # we could provide an error message for the model to continue similar to
+                # https://github.com/xingyaoww/mint-bench/blob/main/mint/envs/general_env.py#L18-L23
+                observation = INVALID_INPUT_MESSAGE
+                self._history.append(Message(Role.ENVIRONMENT, observation))
+                print(colored("===ENV OBSERVATION:===\n" + observation, "blue"))
+                print(colored("===ENV OBSERVATION:===\n" + observation, "blue"))
+
+        self.env.close()
+
+    def chat(self, message: str) -> None:
+        """
+        Optional method for interactive communication with the agent during its execution. Implementations
+        can use this method to modify the agent's behavior or state based on chat inputs.
+
+        Parameters:
+        - message (str): The chat message or command.
+        """
+        raise NotImplementedError
+
+
+Agent.register("CodeActAgent", CodeActAgent)
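
Reviewer note: the `<execute>` handling in `run()` can be exercised without an LLM or a container. The sketch below is a minimal, standalone approximation of that step. The helper names `close_execute_tag` and `extract_command` are hypothetical and do not exist in the commit. They mirror what `parse_response` and the `re.search` call do when the completion is cut off at the `</execute>` stop sequence.

```python
import re
from typing import Optional


def close_execute_tag(action: str) -> str:
    """Re-append the closing tag that the stop sequence strips from the completion."""
    if "<execute>" in action and "</execute>" not in action:
        action += "</execute>"
    return action


def extract_command(action: str) -> Optional[str]:
    """Return the text between <execute> tags, or None if there is no command."""
    match = re.search(r"<execute>(.*)</execute>", action, re.DOTALL)
    return match.group(1) if match else None


# A completion truncated at the stop sequence, as the agent loop would receive it.
raw = "Let me list the files:\n<execute> ls -la "
command = extract_command(close_execute_tag(raw))
print(command.strip())  # prints: ls -la
```

A reply without any `<execute>` tag yields `None`, which corresponds to the branch that sends `INVALID_INPUT_MESSAGE` back to the model.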
diff --git a/agenthub/langchains_agent/__init__.py b/agenthub/langchains_agent/__init__.py
@@ -69,6 +69,9 @@ def run(self) -> None:
         Starts the execution of the assigned instruction. This method should
         be implemented by subclasses to define the specific execution logic.
         """
+        print("Working in directory:", self.workspace_dir)
+        os.chdir(self.workspace_dir)
+
         agent = LangchainsAgentImpl(self.instruction)
         next_is_output = False
         for thought in INITIAL_THOUGHTS:

diff --git a/agenthub/langchains_agent/requirements.txt b/agenthub/langchains_agent/requirements.txt
@@ -4,3 +4,5 @@ langchain-community
 llama-index
 llama-index-vector-stores-chroma
 chromadb
+litellm
+termcolor
diff --git a/opendevin/README.md b/opendevin/README.md
@@ -0,0 +1,18 @@
+# OpenDevin Shared Abstraction and Components
+
+This is a Python package that contains all the shared abstractions (e.g., Agent) and components (e.g., sandbox, web browser, search API, selenium).
+
+## Sandbox component
+
+Run the docker-based sandbox interactively:
+
+```bash
+mkdir workspace
+python3 opendevin/sandbox/docker.py -d workspace
+```
+
+It will map `./workspace` into the docker container with the folder permissions correctly adjusted for the current user.
+
+Example screenshot:
+
+<img width="868" alt="image" src="https://github.com/OpenDevin/OpenDevin/assets/38853559/8dedcdee-437a-4469-870f-be29ca2b7c32">
diff --git a/opendevin/agent.py b/opendevin/agent.py
@@ -5,11 +5,11 @@
 
 
 class Role(Enum):
+    SYSTEM = "system"  # system message for LLM
     USER = "user"  # the user
     ASSISTANT = "assistant"  # the agent
     ENVIRONMENT = "environment"  # the environment (e.g., bash shell, web browser, etc.)
 
-
 @dataclass
 class Message:
     """
@@ -20,23 +20,46 @@ class Message:
     content: str
     # TODO: add more fields as needed
 
+    def to_dict(self) -> Dict:
+        """
+        Converts the message to a dictionary (OpenAI chat-completion format).
+
+        Returns:
+        - message (Dict): A dictionary representation of the message.
+        """
+        role = self.role.value
+        content = self.content
+        if self.role == Role.ENVIRONMENT:
+            content = f"Environment Observation:\n{content}"
+            role = "user"  # treat environment messages as user messages
+        return {"role": role, "content": content}
+
 
 class Agent(ABC):
     """
     This abstract base class is a general interface for an agent dedicated to
     executing a specific instruction and allowing human interaction with the
     agent during execution.
     It tracks the execution status and maintains a history of interactions.
+
+    :param instruction: The instruction for the agent to execute.
+    :param workspace_dir: The working directory for the agent.
+    :param model_name: The litellm name of the model to use for the agent.
+    :param max_steps: The maximum number of steps to run the agent.
     """
 
     _registry: Dict[str, Type['Agent']] = {}
 
     def __init__(
         self,
         instruction: str,
+        workspace_dir: str,
+        model_name: str,
         max_steps: int = 100
     ):
         self.instruction = instruction
+        self.workspace_dir = workspace_dir
+        self.model_name = model_name
         self.max_steps = max_steps
 
         self._complete = False
@@ -105,18 +128,16 @@ def register(cls, name: str, agent_cls: Type['Agent']):
         cls._registry[name] = agent_cls
 
     @classmethod
-    def create_instance(cls, name: str, instruction: str) -> 'Agent':
+    def get_cls(cls, name: str) -> Type['Agent']:
         """
-        Creates an instance of a registered agent class based on the given name.
+        Retrieves an agent class from the registry.
 
         Parameters:
-        - name (str): The name of the agent class to instantiate.
-        - instruction (str): The instruction for the new agent instance.
+        - name (str): The name of the class to retrieve
 
         Returns:
-        - An instance of the specified agent class.
+        - agent_cls (Type['Agent']): The class registered under the specified name.
         """
         if name not in cls._registry:
             raise ValueError(f"No agent class registered under '{name}'.")
-        agent_cls = cls._registry[name]
-        return agent_cls(instruction)
+        return cls._registry[name]
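
Reviewer note: the switch from `create_instance` to `get_cls` moves construction to the caller, so each agent can receive `workspace_dir` and `model_name` as keyword arguments. The sketch below is a simplified stand-in for the registry, not the real `opendevin.agent` module. `EchoAgent` and its `run()` return value are hypothetical, added only to make the example runnable.

```python
from typing import Dict, Type


class Agent:
    """Simplified stand-in for the abstract Agent class (assumption: no ABC machinery)."""

    _registry: Dict[str, Type["Agent"]] = {}

    def __init__(self, instruction: str, workspace_dir: str, model_name: str, max_steps: int = 100):
        self.instruction = instruction
        self.workspace_dir = workspace_dir
        self.model_name = model_name
        self.max_steps = max_steps

    @classmethod
    def register(cls, name: str, agent_cls: Type["Agent"]) -> None:
        cls._registry[name] = agent_cls

    @classmethod
    def get_cls(cls, name: str) -> Type["Agent"]:
        if name not in cls._registry:
            raise ValueError(f"No agent class registered under '{name}'.")
        return cls._registry[name]


class EchoAgent(Agent):
    """Hypothetical agent used only for this illustration."""

    def run(self) -> str:
        return f"[{self.model_name}] {self.instruction}"


Agent.register("EchoAgent", EchoAgent)

# The caller looks up the class, then constructs it with its own arguments.
agent_cls = Agent.get_cls("EchoAgent")
agent = agent_cls(
    instruction="say hi",
    workspace_dir="./workspace",
    model_name="gpt-3.5-turbo-0125",
)
print(agent.run())
```

Returning the class rather than an instance keeps the registry independent of constructor signatures, which is what lets `main.py` pass the new arguments without further changes to `Agent`.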
diff --git a/opendevin/main.py b/opendevin/main.py
@@ -9,10 +9,13 @@
     parser.add_argument("-d", "--directory", required=True, type=str, help="The working directory for the agent")
     parser.add_argument("-t", "--task", required=True, type=str, help="The task for the agent to perform")
     parser.add_argument("-c", "--agent-cls", default="LangchainsAgent", type=str, help="The agent class to use")
+    parser.add_argument("-m", "--model-name", default="gpt-3.5-turbo-0125", type=str, help="The (litellm) model name to use")
     args = parser.parse_args()
 
-    print("Working in directory:", args.directory)
-    os.chdir(args.directory)
-
-    agent = Agent.create_instance(args.agent_cls, args.task)
+    AgentCls: Agent = Agent.get_cls(args.agent_cls)
+    agent = AgentCls(
+        instruction=args.task,
+        workspace_dir=args.directory,
+        model_name=args.model_name
+    )
     agent.run()
diff --git a/opendevin/sandbox/Dockerfile b/opendevin/sandbox/Dockerfile
@@ -0,0 +1,20 @@
+FROM ubuntu:22.04
+
+# install basic packages
+RUN apt-get update && apt-get install -y \
+    curl \
+    wget \
+    git \
+    vim \
+    nano \
+    unzip \
+    zip \
+    python3 \
+    python3-pip \
+    python3-venv \
+    python3-dev \
+    build-essential \
+    && rm -rf /var/lib/apt/lists/*
+
+# docker build -f opendevin/sandbox/Dockerfile -t opendevin/sandbox:v0.1 .
+# docker push opendevin/sandbox:v0.1