An AutoGPT agent that controls Chrome on your desktop

Animesh-Barai/Chrome-GPT
🤖 Chrome-GPT: An experimental AutoGPT agent that interacts with Chrome

⚠️This is an experimental AutoGPT agent that might take incorrect actions and could lead to serious consequences. Please use it at your own discretion⚠️

Chrome-GPT is an AutoGPT experiment that uses LangChain and Selenium to let an AutoGPT agent take control of an entire Chrome session. Because it can interactively scroll, click, and input text on web pages, the agent can navigate and manipulate web content.

🖥️ Demo

Input Prompt: Find me a bar that can host a 20 person event near Chelsea, Manhattan evening of Apr 30th. Fill out contact us form if they have one with info: Name Richard, email [email protected].

DEMO.mov

Demo made by Richard He

🔮 Features

  • 🌎 Google search
  • 🧠 Long-term and short-term memory management
  • 🔨 Chrome actions: describe a webpage, scroll to element, click on buttons/links, input forms, switch tabs
  • 🤖 Supports multiple agent types: Zero-shot, BabyAGI and Auto-GPT
  • 🔥 (IN PROGRESS) Chrome plugin support
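
The Chrome actions listed above map onto ordinary Selenium WebDriver calls. The following is a minimal sketch of that idea, not Chrome-GPT's actual implementation: the helper names (scroll_to_element, click_element, input_text, switch_tab) are hypothetical, and the driver argument is assumed to be a Selenium WebDriver or any object with the same methods.

```python
"""Sketch of Chrome actions on top of a Selenium-style WebDriver.

The helper names are hypothetical. Only the driver and element methods
(find_element, execute_script, click, clear, send_keys, window_handles,
switch_to.window) are real Selenium WebDriver APIs.
"""

CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def scroll_to_element(driver, selector):
    """Scroll the first element matching `selector` into view and return it."""
    element = driver.find_element(CSS, selector)
    driver.execute_script(
        "arguments[0].scrollIntoView({block: 'center'});", element
    )
    return element


def click_element(driver, selector):
    """Scroll to a button or link, then click it."""
    scroll_to_element(driver, selector).click()


def input_text(driver, selector, text):
    """Scroll to a form field, clear it, and type `text` into it."""
    element = scroll_to_element(driver, selector)
    element.clear()
    element.send_keys(text)


def switch_tab(driver, index):
    """Switch to the tab at position `index` and return its window handle."""
    handle = driver.window_handles[index]
    driver.switch_to.window(handle)
    return handle
```

With Selenium installed, a real driver created with webdriver.Chrome() can be passed to these helpers. Taking the driver as a parameter keeps the helpers easy to exercise with a stand-in object.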

🧱 Known Limitations

  • Web crawling features are limited: buttons and input fields sometimes fail to appear in the prompt.
  • Responses are slow: each action takes 1-10 seconds to run.
  • At times, LangChain agents are unable to parse GPT outputs (see the LangChain discussion langchain-ai/langchain#4065). If you run into this, try specifying a different agent, e.g.: python -m chromegpt -a auto-gpt -v -t "{your request}"
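
The "try a different agent" workaround can be automated with a small wrapper. This is a rough sketch under one loud assumption: that a failed run (for example an output-parsing error) ends with a non-zero exit code, which this README does not confirm. The function names and the order in which agents are tried are illustrative only.

```python
import subprocess
import sys

# Agent names as listed in the --help output; the order is an arbitrary choice.
AGENTS = ("auto-gpt", "baby-agi", "zero-shot")


def run_chromegpt(task, agent):
    """Run one Chrome-GPT attempt and return its exit code."""
    cmd = [sys.executable, "-m", "chromegpt", "-a", agent, "-v", "-t", task]
    return subprocess.run(cmd).returncode


def run_with_fallback(task, agents=AGENTS, runner=run_chromegpt):
    """Try each agent in order and return the first one that exits with 0.

    ASSUMPTION: a failed run exits with a non-zero status.
    Returns None if every agent fails.
    """
    for agent in agents:
        if runner(task, agent) == 0:
            return agent
    return None
```

The runner parameter exists so the loop can be exercised without launching Chrome; in normal use the default is enough.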

Requirements

  • Chrome
  • Python >3.8
  • Install Poetry

🛠️ Setup

  1. Set up your OpenAI API key and expose it as the OPENAI_API_KEY environment variable
  2. Install the Python requirements via Poetry: poetry install
  3. Open a Poetry shell: poetry shell
  4. Run chromegpt: python -m chromegpt
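
Before running, it can help to check the prerequisites from the steps above. The sketch below is a hypothetical helper, not part of Chrome-GPT. It only flags Python versions older than 3.8 and leaves open whether 3.8.x itself satisfies the "Python >3.8" requirement.

```python
import os
import shutil
import sys


def preflight(env=None, python_version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if python_version is None else python_version
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # Conservative reading of "Python >3.8": only reject versions below 3.8.
    if version < (3, 8):
        problems.append("Python older than 3.8 is not supported")
    if which("poetry") is None:
        problems.append("poetry was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Problem:", problem)
```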

You can also start in your own codespace here:

Open in GitHub Codespaces

🧠 Usage

  • GPT-3.5 Usage (Default): python -m chromegpt -v -t "{your request}"
  • GPT-4 Usage (Recommended, needs GPT-4 access): python -m chromegpt -v -a auto-gpt -m gpt-4 -t "{your request}"
  • For help: python -m chromegpt --help

Usage: python -m chromegpt [OPTIONS]

  Run ChromeGPT: An AutoGPT agent that interacts with Chrome

Options:
  -t, --task TEXT                 The task to execute  [required]
  -a, --agent [auto-gpt|baby-agi|zero-shot]
                                  The agent type to use
  -m, --model TEXT                The model to use
  --headless                      Run in headless mode
  -v, --verbose                   Run in verbose mode
  --human-in-loop                 Run in human-in-loop mode, only available
                                  when using auto-gpt agent
  --help                          Show this message and exit.
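
When launching Chrome-GPT from another script, the options above can be assembled programmatically. This is a sketch only: build_command is a hypothetical helper, and it uses nothing beyond the flags shown in the help output. Because the README does not state the default agent, the helper requires agent="auto-gpt" to be passed explicitly before it accepts --human-in-loop.

```python
import shlex
import sys

AGENT_TYPES = ("auto-gpt", "baby-agi", "zero-shot")  # from the --help output


def build_command(task, agent=None, model=None, headless=False,
                  verbose=False, human_in_loop=False):
    """Build the argument list for `python -m chromegpt`."""
    if agent is not None and agent not in AGENT_TYPES:
        raise ValueError(f"unknown agent type: {agent!r}")
    if human_in_loop and agent != "auto-gpt":
        raise ValueError("--human-in-loop is only available with auto-gpt")
    cmd = [sys.executable, "-m", "chromegpt", "-t", task]
    if agent is not None:
        cmd += ["-a", agent]
    if model is not None:
        cmd += ["-m", model]
    if headless:
        cmd.append("--headless")
    if verbose:
        cmd.append("-v")
    if human_in_loop:
        cmd.append("--human-in-loop")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) to actually launch Chrome-GPT.
    print(shlex.join(build_command("{your request}", agent="auto-gpt",
                                   model="gpt-4", verbose=True)))
```

Returning a list rather than a single string avoids shell quoting problems when the task text contains spaces or quotes.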

Alternatively, update .env and run:

source .env && docker-compose up
