View kentang-mit's full-sized avatar
Python 139 3 Updated Dec 17, 2024

A suite of image and video neural tokenizers

Jupyter Notebook 1,559 70 Updated Feb 11, 2025

HLS-based framework to accelerate the implementation of 2-D DP kernels on FPGA

C++ 7 1 Updated Dec 29, 2024

[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Python 229 5 Updated Jan 22, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 3,422 206 Updated Feb 12, 2025

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 418 20 Updated Oct 16, 2024

A sparse attention kernel supporting mixed sparse patterns

C++ 142 4 Updated Feb 13, 2025

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,291 70 Updated Sep 27, 2024
Python 145 7 Updated Jul 12, 2024

[ICML 2024] LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

Python 63 8 Updated May 31, 2024

Code for the paper DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, ICML 2024

Python 80 2 Updated Jun 12, 2024

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 246 24 Updated Nov 22, 2024

SEED-Voken: A Series of Powerful Visual Tokenizers

Python 834 32 Updated Feb 19, 2025

[NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning

Python 68 4 Updated Feb 11, 2025

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 310 7 Updated Nov 17, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,580 70 Updated Aug 15, 2024

Tile primitives for speedy kernels

Cuda 2,064 116 Updated Feb 22, 2025

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 663 34 Updated Jan 21, 2025

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Jupyter Notebook 562 30 Updated Oct 6, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python 1,502 152 Updated Oct 28, 2024

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 338 26 Updated Feb 21, 2025

[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

C++ 514 31 Updated Feb 21, 2025

PyTorch emulation library for Microscaling (MX)-compatible data formats

Python 199 30 Updated Sep 23, 2024
Jupyter Notebook 917 102 Updated Apr 29, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,641 435 Updated Jan 12, 2025

Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models.

Python 343 31 Updated Nov 26, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,241 281 Updated May 4, 2024

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 2,935 238 Updated Feb 10, 2025

Microsoft Collective Communication Library

C++ 335 31 Updated Sep 20, 2023

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Python 769 45 Updated Jul 29, 2024