Stars
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
Tensors and Dynamic neural networks in Python with strong GPU acceleration
GPT as a Monte Carlo Language Tree: A Probabilistic Perspective
[CVPR 2025🔥] Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
Official Repository for the Uni-Mol Series Methods
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Machine Learning models for in vitro enzyme kinetic parameter prediction
Saprot: Protein Language Model with Structural Alphabet (AA+3Di)
⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~
Code for the ProteinMPNN paper
A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!
[ICLR 2024🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
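This project provides a learning roadmap for newcomers to the multimodal field, covering the field's classic papers, projects, and courses. It aims to help learners build a reasonably deep understanding of the field within a set period of time and become able to conduct independent research.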
Making Protein folding accessible to all!
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Mixture-of-Experts for Large Vision-Language Models