Stars
My learning notes and code for ML systems.
Give us minutes, we give back a faster Mamba. The official implementation of "Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training".
This is the official repo for the paper "Accelerating Parallel Sampling of Diffusion Models" Tang et al. ICML 2024 https://openreview.net/forum?id=CjVWen8aJL
PyTorch code for hierarchical k-means -- a data curation method for self-supervised learning
Welcome to the Awesome Feature Learning in Deep Learning Theory Reading Group! This repository serves as a collaborative platform for scholars, enthusiasts, and anyone interested in delving into th…
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
The official implementation of "Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" (NeurIPS 2024).
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.
Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Building on DAMO Academy's technical expertise in deep learning, computer vision, and geospatial analysis, and backed by Alibaba Cloud's computing power, this project provides cloud-based analysis services for multi-source Earth observation data: perceiving the world through data and letting AI empower scientific research.
DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
A machine learning compiler for GPUs, CPUs, and ML accelerators
A baseline repository of Auto-Parallelism in Training Neural Networks
The official implementation of ELSA: Enhanced Local Self-Attention for Vision Transformer
Transformer related optimization, including BERT, GPT
Tracking code for the winner of track 1 in the MMP-Tracking Challenge at ICCV 2021 Workshop.
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
[ICCV-2021] TransReID: Transformer-based Object Re-Identification
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB (int8) / 1.8 MB (fp16), running at 97 FPS on a cellphone🔥
1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking