🎞 VLog: Video as a Long Document

News

20 April 2023: We released our project on GitHub and Hugging Face!

To Do List

Done

Hugging Face Space
LLM Reasoner: ChatGPT (multilingual) + LangChain
Vision Captioner: BLIP2 + GRIT
ASR Translator: Whisper (multilingual)
Video Segmenter: KTS
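
The components above suggest a pipeline in which a video is split into segments, each segment gets a visual caption and a speech transcript, and the results are joined into one text document that an LLM can reason over. The sketch below illustrates only that final assembly step in plain Python. The `Segment` class, its fields, `format_time`, and `to_document` are hypothetical names chosen for illustration; they are not taken from the VLog codebase, and the real output format may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One video segment (hypothetical structure, not VLog's own)."""
    start: float      # segment start time in seconds
    end: float        # segment end time in seconds
    caption: str      # text a vision captioner might produce
    transcript: str   # text an ASR model might produce ("" if silent)


def format_time(seconds: float) -> str:
    """Render a time in seconds as MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def to_document(segments: List[Segment]) -> str:
    """Join per-segment text into one timestamped document."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start):
        lines.append(f"[{format_time(seg.start)}-{format_time(seg.end)}]")
        lines.append(f"Visual: {seg.caption}")
        if seg.transcript:  # omit the speech line for silent segments
            lines.append(f"Speech: {seg.transcript}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 12.5, "a person opens a laptop", "Let's get started."),
        Segment(12.5, 30.0, "a screen showing code", ""),
    ]
    print(to_document(demo))
```

A flat, timestamped text like this is one simple way to hand a whole video to a text-only model; the actual prompt construction in the project may be organized differently.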

Doing

There is a lot of room for improvement, and we are working on it.

Improve Vision Models: MiniGPT-4, LLaVA, Family of Segment-anything
Replace ChatGPT with our own trained LLM
Improve ASR Translator

🧸 Examples

🔨 Preparation

Please find installation instructions in install.md.

🌟 Start here

Run in cmd

python main.py --video_path "examples/demo.mp4"

The generated vlog is saved in examples/demo.log.
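
To process several videos, the command above can be wrapped in a small script. This is a sketch under two assumptions: that `main.py` needs only the `--video_path` flag shown above, and that the log is written next to the video with a `.log` extension, as the `examples/demo.log` path suggests. The helper names `build_command`, `expected_log_path`, and `run_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_command(video_path: Path) -> List[str]:
    """Build the command line for one video, using only --video_path."""
    return [sys.executable, "main.py", "--video_path", str(video_path)]


def expected_log_path(video_path: Path) -> Path:
    """Assumed log location: same name as the video, .log extension."""
    return video_path.with_suffix(".log")


def run_all(video_dir: Path) -> None:
    """Run main.py once per .mp4 file in video_dir, stopping on failure."""
    for video in sorted(video_dir.glob("*.mp4")):
        subprocess.run(build_command(video), check=True)
        print(f"{video} -> {expected_log_path(video)}")


# Example call, to be run from the repository root:
# run_all(Path("examples"))
```

Using `check=True` makes the loop stop at the first video that fails, which keeps a broken run from going unnoticed.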

Run in Gradio

python main_gradio.py

🙋 Suggestion

The project is under active development, so stay tuned 🔥

If you have suggestions or features you would like to see implemented in this codebase, feel free to email us at kevin.qh.lin@gmail, [email protected] or open an issue.

😊 Acknowledgment

This work is based on ChatGPT, BLIP2, GRIT, KTS, Whisper, LangChain, Image2Paragraph.

See other wonderful Video + LLM projects: Ask-anything, Socratic Models, Vid2Seq, LaViLa.
