- Stevens Institute of Technology
- Hoboken, New Jersey, USA
- hanfeiyu.github.io
Stars
ROguelike Toolkit in JavaScript. Cool dungeon-related stuff, interactive manual, documentation, tests!
Next-generation datacenter OS built on kernel bypass to speed up unmodified code while improving platform density and security
User documentation for Knative components.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workl…
Reproduction of "Pre-warming is Not Enough" (SoCC'24)
PyTorch implementation of paper "Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning" (NeurIPS 2024)
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
The official implementation of the paper "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework".
Bayesian Modeling and Probabilistic Programming in Python
[ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?
Feature-rich and easy-to-use Jekyll website template for academic courses
This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Simple, safe way to store and distribute tensors
PyTorch library for cost-effective, fast and easy serving of MoE models.
ALPS: An Adaptive Learning, Priority OS Scheduler for Serverless Functions (USENIX ATC'24)
Serverless LLM Serving for Everyone.
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
A large-scale simulation framework for LLM inference
Cluster data collected from production clusters at Alibaba for cluster management research
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
A low-latency & high-throughput serving engine for LLMs
Development repository for the Triton language and compiler
A collection of AWESOME things about mixture-of-experts