任天昊 Ren-70637

Master of Computer Science, The University of Sydney 🤗

2 followers · 43 following

Sydney
UTC +11:00

Lists (1)

🔮 Future ideas

Stars

xinzhel / LLM-Agent-Survey

Survey on LLM Agents (published at COLING 2025)

195 9 Updated Mar 19, 2025

TianxingChen / Paper-List-For-EmbodiedAI

A paper list for Robotics / Embodied AI - Tianxing Chen

92 2 Updated Jan 8, 2025

StarCycle / Awesome-Embodied-AI-Job

DeepTimber Robotics Talent Call | DeepTimber Community Embodied AI Talent Board | A list of Embodied AI / Robotics jobs (PhD, RA, intern, full-time, etc.)

540 10 Updated Apr 3, 2025

gyxxyg / TRACE

[ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling

Python 78 Updated Jan 23, 2025

Breakthrough / PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 3,738 428 Updated Mar 28, 2025

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,012 2,615 Updated Mar 4, 2025

X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Python 2,446 180 Updated Apr 2, 2025

Vinit-R-Iyer / Study_Materials

I have always had a weird knack for writing notes for each and every topic I did. This is a repository dedicated to those notes. Please feel free to use them and pass them on to those who you think mi…

41 3 Updated Jun 12, 2024

EasonXiao-888 / UVCOM

[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Python 86 5 Updated Jul 17, 2024

fletcherjiang / LLMEPET

[MM'24 Oral] Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval

Python 124 11 Updated Aug 23, 2024

ChocoWu / SeTok

Code for the paper: Towards Semantic Equivalence of Tokenization in Multimodal LLM

53 Updated Oct 8, 2024

SkyworkAI / Vitron

NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Python 519 30 Updated Oct 20, 2024

EgoAlpha / prompt-in-context-learning

Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.

Jupyter Notebook 1,568 95 Updated Dec 26, 2024

RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Python 354 32 Updated Nov 19, 2024

mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

Python 1,327 111 Updated Mar 29, 2025

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 22,073 2,424 Updated Aug 12, 2024

jpthu17 / DiffusionRet

[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

Python 129 7 Updated Apr 9, 2024

m-bain / frozen-in-time

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]

Python 360 44 Updated May 19, 2022

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,727 436 Updated Aug 7, 2024

ruili33 / TPO

Python 30 1 Updated Jan 24, 2025

PKU-YuanGroup / Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Python 930 45 Updated Oct 16, 2024

52CV / CVPR-2024-Papers

996 55 Updated Jun 27, 2024

kohya-ss / sd-scripts

Python 5,959 974 Updated Mar 31, 2025

Con6924 / SPM

Official implementation of the paper "One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications".

Python 138 12 Updated Dec 28, 2023

yongliang-wu / DoCo

[AAAI2025] Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

Python 28 Updated Mar 28, 2025

franciscoliu / Awesome-GenAI-Unlearning

123 11 Updated Mar 31, 2025

Hanqer / Evaluate-SOD

One-key fast evaluation for salient object detection with a GPU implementation, including MAE, Max F-measure, S-measure, and E-measure.

Python 68 21 Updated Jul 11, 2020

mit-han-lab / nunchaku

[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Cuda 1,075 72 Updated Apr 1, 2025

xuyang-liu16 / Awesome-Generation-Acceleration

📚 Collection of awesome generation acceleration resources.

185 4 Updated Mar 28, 2025

rese1f / MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Python 606 42 Updated Jan 29, 2025