Vision Browser

Notes

This is a tool that lets you navigate the web from a chat window by taking screenshots of the browser and sending them to GPT-4 Vision through the OpenAI API.
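As a rough illustration of the screenshot step, the sketch below builds an OpenAI Chat Completions request that carries one screenshot as a base64 data URL. It is not taken from vision_browse.js: the function names `buildVisionRequest` and `askAboutScreenshot`, the `max_tokens` value, and the default model name `gpt-4-vision-preview` are assumptions, so substitute whichever vision-capable model your account offers.

```javascript
// Sketch only: not the project's actual code. Function names, max_tokens,
// and the default model name are assumptions made for illustration.
const fs = require("node:fs");

// Build the JSON body for a Chat Completions call that includes one PNG screenshot.
function buildVisionRequest(pngBuffer, question, model = "gpt-4-vision-preview") {
  // Images can be passed inline as base64 data URLs.
  const dataUrl = `data:image/png;base64,${pngBuffer.toString("base64")}`;
  return {
    model,
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}

// Send a saved screenshot to the API and return the model's text reply.
// Requires Node 18+ (global fetch) and OPENAI_API_KEY in the environment.
async function askAboutScreenshot(screenshotPath, question) {
  const body = buildVisionRequest(fs.readFileSync(screenshotPath), question);
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`OpenAI API request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Keeping the request builder separate from the network call makes it easy to inspect or log the payload before anything is sent.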

The idea is to use this as an additional search tool when web scraping fails. I also added a WebSocket client so it can be connected to another LLM or bot.

This is a fork of https://github.com/unconv/gpt4v-browsing. Props to unconventional-coding for this cool project!

What I've added:
- Improved chat navigation and link selection.
- Added a WebSocket chat feature for use with other projects.
- Removed the Python implementations.
- Works best with an open browser window (non-headless mode).
- Chat history is saved in a chatlog, to be used as context later.
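One way the chatlog idea could work is sketched below: each message is appended to a file, and the most recent messages are loaded back as context for the next request. The JSON Lines format, the `time` field, and the function names `appendToChatlog` and `loadRecentContext` are assumptions for illustration; the project's actual chatlog format may differ.

```javascript
// Sketch only: the log format (one JSON object per line) and both
// function names are assumptions, not the project's actual code.
const fs = require("node:fs");

// Append one chat message to the log file, creating the file if needed.
function appendToChatlog(logPath, role, content) {
  const entry = { role, content, time: new Date().toISOString() };
  fs.appendFileSync(logPath, JSON.stringify(entry) + "\n", "utf8");
}

// Load the last `maxMessages` entries as { role, content } objects,
// ready to be placed at the start of a chat request's messages array.
function loadRecentContext(logPath, maxMessages = 10) {
  if (!fs.existsSync(logPath)) return [];
  return fs
    .readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .slice(-maxMessages)
    .map((line) => {
      const { role, content } = JSON.parse(line);
      return { role, content };
    });
}
```

Capping the number of messages loaded keeps the request small, which matters because each screenshot already uses a large share of the model's context.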

$ npm install
$ node vision_browse.js

$ npm install
$ pip install -r requirements.txt
$ python3 vision_crawl.py
