- Shenzhen Institutes of Advanced Technology, CAS
- Shenzhen, China
Stars
Wan: Open and Advanced Large-Scale Video Generative Models
Enjoy the magic of Diffusion models!
[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"
[CVPR 2025🔥] Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
[CVPR 2025] Assessing and Learning Alignment of Unimodal Vision and Language Models
Assistant tools for attention visualization in deep learning
Helpful tools and examples for working with flex-attention
An open source implementation of CLIP.
[ICLR 2025 Spotlight] Official Implementation for ToST (Token Statistics Transformer)
Memory-optimized training library for diffusion models
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling.
[CVPR 2025] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
🦜🔗 Build context-aware reasoning applications
PyTorch library for fast transformer implementations
[ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need
Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
Investigating CoT Reasoning in Autoregressive Image Generation
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
PyTorch Implementation of "Resource Efficient 3D Convolutional Neural Networks", with code and pretrained models.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE
Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
A generative world for general-purpose robotics & embodied AI learning.