Vision Browser

Notes

This is a tool that lets you navigate the web from a chat window by taking screenshots and sending them to GPT-4 Vision through the OpenAI API.
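As a rough illustration of the screenshot step, the sketch below builds the JSON body for an OpenAI Chat Completions request that carries one screenshot as a base64 data URL. The helper name `buildVisionRequest`, the model name, and the `max_tokens` value are assumptions for illustration and may not match what `vision_browse.js` actually does.

```javascript
// Sketch only: builds the request body for one screenshot plus a user prompt.
// The helper name, model name, and max_tokens value are assumptions.
function buildVisionRequest(screenshotBase64, userPrompt, model = "gpt-4-vision-preview") {
  return {
    model,
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: userPrompt },
          {
            type: "image_url",
            // The screenshot is embedded inline as a base64 data URL.
            image_url: { url: `data:image/png;base64,${screenshotBase64}` },
          },
        ],
      },
    ],
  };
}

// Example: placeholder bytes standing in for a real PNG screenshot.
const fakeScreenshot = Buffer.from("not a real png").toString("base64");
const body = buildVisionRequest(fakeScreenshot, "What links are on this page?");
console.log(JSON.stringify(body, null, 2));
```

A body like this would be sent as a POST to `https://api.openai.com/v1/chat/completions` with an `Authorization: Bearer <key>` header; the network call is left out so the sketch runs without an API key.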

The idea was to use this as an additional search tool when web scraping isn't successful. I added a websocket client so it can be connected to another LLM/bot.
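One way such a websocket bridge could look is sketched below. The message shape (`{ type: "chat", text }`), the function names, and the use of the third-party `ws` package are all assumptions for illustration; the actual client in this repo may be wired differently.

```javascript
// Sketch only: relays chat prompts from another bot to the vision browser.
// Message shape and function names are hypothetical.
function encodeChatMessage(text) {
  return JSON.stringify({ type: "chat", text });
}

function decodeChatMessage(raw) {
  // Returns the text of a well-formed chat message, or null otherwise.
  try {
    const msg = JSON.parse(String(raw));
    return msg && msg.type === "chat" && typeof msg.text === "string" ? msg.text : null;
  } catch {
    return null;
  }
}

function connectToBot(url, onPrompt) {
  // Assumes the `ws` package is installed (npm install ws).
  const WebSocket = require("ws");
  const socket = new WebSocket(url);
  socket.on("message", async (raw) => {
    const prompt = decodeChatMessage(raw);
    if (prompt === null) return; // ignore anything that is not a chat message
    const answer = await onPrompt(prompt);
    socket.send(encodeChatMessage(answer));
  });
  socket.on("error", (err) => console.error("websocket error:", err.message));
  return socket;
}
```

Calling `connectToBot("ws://localhost:8080", askVisionBrowser)` would forward each incoming prompt to a handler and send back its answer; both the URL and `askVisionBrowser` are placeholders here.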

This is a fork of https://github.com/unconv/gpt4v-browsing. Props to unconventional-coding for this cool project!

What I've added:
- Improved chat navigation/link selection.
- Added a websocket chat feature to use with other projects.
- Removed the Python implementations.
- Works best with an open browser (non-headless mode).
- Chat history is saved in the chatlog, to be used for context later.
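The chat history item above could be implemented along the lines of the sketch below, which appends each turn to a JSON Lines file and reads the most recent turns back as context. The file name `chatlog.jsonl`, the record shape, and both function names are assumptions; the repo's actual chatlog format may differ.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Sketch only: file name and record shape are hypothetical.
function appendToChatLog(logPath, role, content) {
  const record = { time: new Date().toISOString(), role, content };
  fs.appendFileSync(logPath, JSON.stringify(record) + "\n");
}

function loadRecentContext(logPath, limit = 10) {
  // Returns the last `limit` turns as { role, content } objects.
  if (!fs.existsSync(logPath)) return [];
  return fs
    .readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .slice(-limit)
    .map((line) => {
      const { role, content } = JSON.parse(line);
      return { role, content };
    });
}

// Example run against a throwaway temp directory.
const logPath = path.join(fs.mkdtempSync(path.join(os.tmpdir(), "chatlog-")), "chatlog.jsonl");
appendToChatLog(logPath, "user", "Open example.com");
appendToChatLog(logPath, "assistant", "The page shows a heading and one link.");
console.log(loadRecentContext(logPath));
```

Storing one JSON record per line keeps appends cheap and lets a later session load only the last few turns instead of the whole history.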

$ npm install
$ node vision_browse.js
$ npm install
$ pip install -r requirements.txt
$ python3 vision_crawl.py

Examples

Screenshot from 2024-04-09 02-47-50

Screenshot from 2024-04-09 01-51-02