Starred repositories
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling
[CVPR 2022--Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
[AAAI 2023] Adaptive Dynamic Filtering Network for Image Denoising.
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
[🔥 ICLR2025 Spotlight] On Disentangled Training for Nonlinear Transform in Learned Image Compression
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
🦜🔗 Build context-aware reasoning applications
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Deepak Ghimire, Kilho Lee, and Seong-heum Kim, Loss-aware automatic selection of structured pruning criteria for deep neural network acceleration, Image and Vision Computing, vol. 136, p. 104745, 2…
The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra-lightweight OCR system, supports recognition of 80+ languages, provides data annotation and synthesis tools, supports training and…
One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression
Official code for the paper "LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes".
Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction", CVPR 2024
Instant-ngp in PyTorch+CUDA, trained with pytorch-lightning (high quality with high speed, with only a few lines of legible code)
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Instant neural graphics primitives: lightning fast NeRF and more
[CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution'
Diffusion model tutorials