Stars
nnScaler: Compiling DNN models for Parallel Training
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NeurIPS'24)
An Easy-to-understand TensorOp Matmul Tutorial
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
r2e: turn any GitHub repository into a programming agent environment
Hardware Acceleration of Long Read Pairwise Overlapping in Genome Sequencing: Open Source Repository
MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)
ICCAD'23 Best Paper Award candidate: Robust GNN-based Representation Learning for HLS
Allo: A Programming Model for Composable Accelerator Design
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
Python Sorted Container Types: Sorted List, Sorted Dict, and Sorted Set
DAC'22 paper: "Automated Accelerator Optimization Aided by Graph Neural Networks"
A playbook for systematically maximizing the performance of deep learning models.
Code base for OOPSLA'24 paper: UniSparse: An Intermediate Language for General Sparse Format Customization
An analytical framework that models hardware dataflow of tensor applications on spatial architectures using the relation-centric notation.