View dongheuw's full-sized avatar
🎯
Focusing

Organizations

@uwdb @facebookexternal @fairinternal


Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,626 266 Updated Apr 14, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,723 289 Updated Mar 10, 2025

Analyze computation-communication overlap in V3/R1.

995 141 Updated Mar 21, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,220 563 Updated Apr 16, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,439 822 Updated Mar 1, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,442 712 Updated Apr 16, 2025

PyTorch extensions for high performance and large scale training.

Python 3,299 290 Updated Apr 8, 2025

Ongoing research training transformer models at scale

Python 12,104 2,715 Updated Apr 17, 2025

A PyTorch native library for large-scale model training

Python 3,600 335 Updated Apr 17, 2025

VOCAL-UDF: Self-Enhancing Video Data Management System for Compositional Events with Large Language Models

Jupyter Notebook 6 Updated Feb 10, 2025

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

2,865 323 Updated Aug 14, 2024

EB1A Full Application - I-140 and I-485

TeX 283 115 Updated Nov 20, 2023

Fast and memory-efficient exact attention

Python 16,927 1,611 Updated Apr 13, 2025

Code for paper "AI for radiographic COVID-19 detection selects shortcuts over signal"

Jupyter Notebook 29 7 Updated Mar 19, 2021

[NeurIPS 2024] A task generation and model evaluation system for multimodal language models.

Python 70 3 Updated Nov 27, 2024

Class file for University of Washington thesis formatting with LaTeX.

TeX 72 61 Updated Nov 9, 2021

Inference code for Llama models

Python 58,111 9,741 Updated Jan 26, 2025

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Python 296,524 49,299 Updated Dec 2, 2024

Awesome-LLM: a curated list of Large Language Models

22,769 1,890 Updated Mar 26, 2025

Snowflake dataset containing statistics for 70 million queries over a 14-day period

Jupyter Notebook 111 24 Updated Sep 27, 2021

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 6,048 523 Updated Sep 6, 2024

LLM papers I'm reading, mostly on inference and model compression

721 34 Updated Dec 21, 2023

Python 3 Updated Sep 4, 2023

Graph Compression using Quasi-stable Coloring

Julia 6 Updated Mar 1, 2023

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,960 4,337 Updated Apr 17, 2025

EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions

Jupyter Notebook 6 1 Updated Jun 24, 2024

JavaScript 3 Updated Apr 7, 2023

A Zsh theme

Shell 48,844 2,279 Updated Jan 29, 2025

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, B…

Jupyter Notebook 2,026 173 Updated Aug 15, 2024

JPEG encoder/decoder written in Python

Python 184 46 Updated Jun 26, 2018