BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 539 39 Updated Feb 14, 2025

We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstra…

C++ 176 11 Updated Jan 28, 2025
Python 23 4 Updated Dec 21, 2024

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 12,363 11,631 Updated Mar 2, 2025

[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable

Python 146 8 Updated Sep 21, 2024

nnScaler: Compiling DNN models for Parallel Training

Python 98 13 Updated Feb 14, 2025

Low-bit LLM inference on CPU with lookup table

C++ 691 54 Updated Jan 9, 2025

MSVBASE is a system that efficiently supports complex queries of both approximate similarity search and relational operators. It integrates high-dimensional vector indices into PostgreSQL, a relati…

C++ 90 12 Updated Nov 19, 2024
Python 136 12 Updated Jul 22, 2024
Python 73 1 Updated Feb 22, 2023

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

Python 14,139 1,820 Updated Jul 3, 2024

A unified 3D Transformer Pipeline for visual synthesis

2,809 164 Updated May 29, 2023

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Python 779 96 Updated Mar 7, 2025

A validation and profiling tool for AI infrastructure

Python 300 63 Updated Mar 5, 2025

System for AI education resources.

Python 3,875 483 Updated Oct 25, 2024

An experimental parallel training platform

54 14 Updated Mar 25, 2024

Antares: an automatic engine for multi-platform kernel generation and optimization. Supporting CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, Android CPU/GPU backends.

C++ 456 48 Updated Feb 19, 2025

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.

C++ 978 163 Updated Sep 19, 2024

A decoupled transaction component providing transaction processing for applications

C++ 6 2 Updated Jul 15, 2020

Resource scheduling and cluster management for AI

JavaScript 2,650 549 Updated Jun 6, 2024

OpenPAI SDK

TypeScript 19 16 Updated Dec 10, 2022

Extension to connect OpenPAI clusters, submit AI jobs, simulate jobs locally, manage files, and so on.

TypeScript 14 5 Updated Dec 10, 2022

A marketplace that stores examples and job templates for OpenPAI. Users can use openpaimarketplace to share their own jobs or to run and learn from jobs shared by others.

JavaScript 33 21 Updated Dec 13, 2022

Runtime for deep learning workload

Python 20 16 Updated May 24, 2022

Kubernetes Scheduler for Deep Learning

Go 260 39 Updated May 22, 2022

Resource scheduling and cluster management for AI

Java 2 Updated Jan 15, 2019

A distributed approximate nearest neighbor (ANN) search library that provides high-quality vector index build, search, and distributed online serving toolkits for large-scale vector search sc…

C++ 4,877 584 Updated Feb 17, 2025

General-Purpose Kubernetes Pod Controller

Go 174 44 Updated Apr 4, 2023

High performance container overlay networks on Linux. Enabling RDMA (on both InfiniBand and RoCE) and accelerating TCP to bare metal performance. Freeflow requires zero modification on application …

C 618 93 Updated Jun 12, 2023