Stars
Making large AI models cheaper, faster and more accessible
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Open-Sora: Democratizing Efficient Video Production for All
Generative Models by Stability AI
Deep learning for image processing, including classification, object detection, etc.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
PyTorch implementations of Generative Adversarial Networks.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Transfer learning / domain adaptation / domain generalization / multi-task learning, etc. Papers, code, datasets, applications, and tutorials on transfer learning.
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
🐍 Geometric Computer Vision Library for Spatial AI
A collaboration friendly studio for NeRFs
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Wan: Open and Advanced Large-Scale Video Generative Models
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
Pretrained ConvNets for PyTorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Enjoy the magic of Diffusion models!
A Collection of Variational Autoencoders (VAE) in PyTorch.
Official PyTorch implementation of StyleGAN3
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
Keras model to generate HTML code from hand-drawn website mockups. Applies an image captioning architecture to hand-drawn source images.
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
A collection of important graph embedding, classification and representation learning papers with implementations.
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation