
"Treating Time as a Friend" (《把时间当作朋友》)

2,446 992 Updated Jan 19, 2024

A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability

74 Updated Nov 28, 2024

This is the official repository for M2UGen

Jupyter Notebook 455 38 Updated Dec 23, 2024

Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".

Python 321 18 Updated Apr 20, 2024

MU-LLaMA: Music Understanding Large Language Model

Python 247 17 Updated Mar 25, 2024
Python 20 2 Updated Oct 10, 2024

Scripts to optimize NJU EasyConnect client routing rules.

Shell 61 10 Updated Jul 25, 2024

Machine Translation: Foundations and Models (《机器翻译:基础与模型》), by Xiao Tong and Zhu Jingbo

TeX 2,736 760 Updated Sep 14, 2024

Code for the paper "Modality-Aware Triplet Hard Mining for Zero-shot Sketch-Based Image Retrieval"

Python 7 1 Updated Mar 4, 2022

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Python 306 17 Updated May 27, 2024

Clash for Windows for Mac, Clash for Windows for Mac tutorial, Clash for Windows for Mac configuration guide, Clash for Mac

221 32 Updated Nov 23, 2024

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"

Python 132 4 Updated Sep 10, 2024

Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs".

Python 44 4 Updated Aug 23, 2024

Weakly Supervised Video Moment Localisation with Contrastive Negative Sample Mining

Python 25 4 Updated Apr 4, 2022

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Python 75 4 Updated Dec 15, 2024

[Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling

Python 53 Updated Nov 8, 2024

[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Python 80 1 Updated Dec 10, 2024

A paper list of recent works on token compression for ViT and VLM

243 12 Updated Dec 23, 2024

This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"

Python 141 10 Updated Dec 24, 2024

Video datasets

1,258 96 Updated Mar 8, 2023

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 2,135 163 Updated Dec 6, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 2,178 148 Updated Sep 3, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,860 2,301 Updated Aug 12, 2024

The official repository of "Video assistant towards large language model makes everything easy"

Python 211 14 Updated Dec 24, 2024

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python 3,828 235 Updated Dec 4, 2024

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,122 253 Updated Nov 26, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,854 265 Updated Jun 4, 2024

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Python 555 42 Updated Dec 18, 2024

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Python 317 27 Updated Nov 19, 2024

Official repository for the paper PLLaVA

Python 621 44 Updated Jul 28, 2024