Stars
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
This is the official repository for M2UGen
Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".
MU-LLaMA: Music Understanding Large Language Model
Scripts to optimize NJU EasyConnect client routing rules.
Machine Translation: Foundations and Models (《机器翻译:基础与模型》), by Xiao Tong and Zhu Jingbo
code for paper "Modality-Aware Triplet Hard Mining for Zero-shot Sketch-Based Image Retrieval"
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Clash for Windows for Mac: tutorial and configuration guide for Clash for Mac
[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs".
Weakly Supervised Video Moment Localisation with Contrastive Negative Sample Mining
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
[Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
A paper list of recent works on token compression for ViTs and VLMs
This is the official implementation of "Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams"
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
GPT4V-level open-source multi-modal model based on Llama3-8B
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repository of "Video assistant towards large language model makes everything easy"
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as MiniGPT-4, StableLM, and MOSS.
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding