Stars
A curated list of papers & resources linked to data poisoning, backdoor attacks and defenses against them (no longer maintained)
High-Performance Symbolic Regression in Python and Julia
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
A generative world for general-purpose robotics & embodied AI learning.
CoTracker is a model for tracking any point (pixel) on a video.
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Janus-Series: Unified Multimodal Understanding and Generation Models
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
High-resolution models for human tasks.
[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.
MoVQGAN - a model for image encoding and reconstruction
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Code for 3D-LLM: Injecting the 3D World into Large Language Models
Research code for ACL 2024 paper: "Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline"
[NeurIPS 2023] Scalable 3D Captioning with Pretrained Models
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Official code for VEnhancer: Generative Space-Time Enhancement for Video Generation
Code repository for T2V-Turbo and T2V-Turbo-v2
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
[CVPR 2024 Highlight] VBench - We Evaluate Video Generation
[IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models