mrdavtan/Vision_Browse

Vision Browser

Notes

This tool lets you navigate the web from a chat window by taking screenshots of pages and sending them to GPT-4 Vision through the OpenAI API.
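
The screenshot step can be sketched roughly as follows. This is a minimal illustration rather than the repository's actual code: `buildVisionMessage` is a hypothetical helper, and it assumes the screenshot is already in memory as a PNG buffer. The message shape follows the OpenAI Chat Completions format for image input.

```javascript
// Hypothetical helper (not from the repository): wrap a PNG screenshot
// and a text prompt in a single user message for a vision-capable model.
function buildVisionMessage(pngBuffer, prompt) {
  // The image is sent inline as a base64 data URL.
  const dataUrl = "data:image/png;base64," + pngBuffer.toString("base64");
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: dataUrl } },
    ],
  };
}

// The 8-byte PNG file signature stands in for a real screenshot here.
const fakeScreenshot = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const message = buildVisionMessage(fakeScreenshot, "What links are visible on this page?");
console.log(message.content[1].image_url.url);
```

The resulting object would go into the `messages` array of a chat completion request. The model name and the rest of the request are left out of this sketch.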

The idea was to use this as an additional search tool when web scraping fails. I also added a websocket client so it can be connected to another LLM or bot.
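
As a rough sketch of how such a websocket link could work: the JSON envelope below (`type` plus `text`) is an assumed format, not the protocol the repository uses, and `replyTo` and `attach` are hypothetical names. The handler is kept synchronous for brevity, whereas a real browsing step would be asynchronous. A fake socket built on `EventTarget` stands in for a real connection so the example runs without a server.

```javascript
// Assumed envelope (not the repository's protocol):
//   {"type": "prompt", "text": "..."}  from the other bot
//   {"type": "answer", "text": "..."}  sent back
function replyTo(rawFrame, answerPrompt) {
  let msg;
  try {
    msg = JSON.parse(rawFrame);
  } catch {
    return null; // ignore frames that are not JSON
  }
  if (!msg || msg.type !== "prompt" || typeof msg.text !== "string") return null;
  return JSON.stringify({ type: "answer", text: answerPrompt(msg.text) });
}

// Wire the handler to any object with the standard WebSocket interface
// (addEventListener and send).
function attach(socket, answerPrompt) {
  socket.addEventListener("message", (event) => {
    const reply = replyTo(event.data, answerPrompt);
    if (reply !== null) socket.send(reply);
  });
}

// Demo with a fake socket instead of a network connection.
const sent = [];
const fakeSocket = new EventTarget();
fakeSocket.send = (data) => sent.push(data);
attach(fakeSocket, (text) => "You asked: " + text);

const frame = new Event("message");
frame.data = JSON.stringify({ type: "prompt", text: "open example.com" });
fakeSocket.dispatchEvent(frame);
console.log(sent);
```

Keeping the frame handling in a pure function (`replyTo`) separates the message format from the transport, so the same logic works with whichever websocket client is in use.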

This is a fork of https://github.com/unconv/gpt4v-browsing. Props to unconventional-coding for this cool project!

What I've added:
- Improved chat navigation and link selection.
- Added a websocket chat feature for use with other projects.
- Removed the Python implementations.
- Works best with an open browser (non-headless mode).
- Chat history is saved in chatlog, to be used as context later.
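
The chat history item above could look something like the following sketch. The on-disk layout (one JSON object per line) and the helper names `appendToChatlog` and `loadContext` are assumptions for illustration; the repository's actual chatlog format may differ.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Assumed format: one JSON object per line (JSON Lines).
function appendToChatlog(file, role, content) {
  const entry = { time: new Date().toISOString(), role, content };
  fs.appendFileSync(file, JSON.stringify(entry) + "\n");
}

// Return the last `maxMessages` entries as {role, content} pairs,
// ready to be prepended to a later conversation as context.
function loadContext(file, maxMessages) {
  if (!fs.existsSync(file) || maxMessages <= 0) return [];
  return fs
    .readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line))
    .slice(-maxMessages)
    .map(({ role, content }) => ({ role, content }));
}

// Demo in a temporary directory so no real log is touched.
const logDir = fs.mkdtempSync(path.join(os.tmpdir(), "chatlog-"));
const logFile = path.join(logDir, "chatlog.jsonl");
appendToChatlog(logFile, "user", "Find the pricing page");
appendToChatlog(logFile, "assistant", "Clicking the Pricing link");
appendToChatlog(logFile, "user", "What does the basic plan cost?");
const context = loadContext(logFile, 2);
console.log(context);
```

An append-only log keeps each write cheap, and trimming to the most recent messages on load keeps the later context small.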

$ npm install
$ node vision_browse.js
$ npm install
$ pip install -r requirements.txt
$ python3 vision_crawl.py

Examples

Screenshot from 2024-04-09 02-47-50

Screenshot from 2024-04-09 01-51-02
