This tool lets you navigate the web from a chat window by taking screenshots and sending them to GPT-4 Vision through the OpenAI API.
The idea is to use it as an additional search tool when web scraping isn't successful. I added a WebSocket client to connect it to another LLM/bot.
This is a fork of https://github.com/unconv/gpt4v-browsing. Props to unconventional-coding for this cool project!
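To make the screenshot step concrete, here is a minimal sketch of how a screenshot could be packaged for the OpenAI Chat Completions API as a base64 data URL. The helper name `buildVisionRequest`, the model name `gpt-4-vision-preview`, and the `max_tokens` value are assumptions for illustration; they are not taken from `vision_browse.js`, and the sketch only builds the request body without sending it.

```javascript
// Hypothetical helper (not from this repo): builds a Chat Completions request
// body that carries one screenshot as a base64 data URL.
// The default model name and max_tokens value are assumptions.
function buildVisionRequest(screenshotBuffer, userPrompt, model = "gpt-4-vision-preview") {
  const base64Image = Buffer.from(screenshotBuffer).toString("base64");
  return {
    model,
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: [
          // The text part carries the user's question about the page.
          { type: "text", text: userPrompt },
          // The image part carries the screenshot inline as a data URL.
          { type: "image_url", image_url: { url: `data:image/jpeg;base64,${base64Image}` } },
        ],
      },
    ],
  };
}

// Usage with three stand-in bytes instead of a real screenshot.
const request = buildVisionRequest(Buffer.from([0xff, 0xd8, 0xff]), "What links are on this page?");
console.log(JSON.stringify(request, null, 2));
```

In a real run the buffer would come from the browser's screenshot call, and the returned object would be passed to whichever OpenAI client the script uses.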
What I've added:
-improved chat navigation/link selection.
-added a WebSocket chat feature to use with other projects.
-removed the Python implementations.
-works best with an open browser (non-headless mode).
-chat history is saved in a chatlog, to be used for context later.
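As a rough illustration of the chatlog idea in the list above, the sketch below appends each message to a file and reads the most recent ones back as context. The JSON Lines format, the file name `chatlog.jsonl`, and the helper names `appendToChatlog` and `loadRecentContext` are all assumptions; the actual chatlog format used by this project may differ.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: append one chat message to a JSON Lines log file.
function appendToChatlog(logPath, role, content) {
  const entry = { role, content, timestamp: new Date().toISOString() };
  fs.appendFileSync(logPath, JSON.stringify(entry) + "\n", "utf8");
}

// Hypothetical helper: read back the last `limit` messages, keeping only the
// fields a chat API would need as context.
function loadRecentContext(logPath, limit = 10) {
  if (!fs.existsSync(logPath)) return [];
  return fs
    .readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .slice(-limit)
    .map((line) => {
      const { role, content } = JSON.parse(line);
      return { role, content };
    });
}

// Usage with a throwaway file in the system temp directory.
const demoLog = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "chatlog-")), "chatlog.jsonl");
appendToChatlog(demoLog, "user", "Open example.com");
appendToChatlog(demoLog, "assistant", "The page shows a heading and one link.");
console.log(loadRecentContext(demoLog, 1));
```

One line per message keeps appends cheap and lets a later session load only the tail of the log instead of parsing the whole history.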
$ npm install
$ node vision_browse.js
$ npm install
$ pip install -r requirements.txt
$ python3 vision_crawl.py