kfpanda123
  • BUPT
  • Beijing, China

Artifact from "Hardware Compute Partitioning on NVIDIA GPUs". This is a fork of Bakita's repo.

C 18 2 Updated Dec 8, 2023

SGLang is a fast serving framework for large language models and vision language models.

Python 12,285 1,322 Updated Mar 22, 2025

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

350 8 Updated Mar 14, 2025

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 151 12 Updated Jul 5, 2024

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 759 57 Updated Mar 22, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,057 520 Updated Mar 16, 2025

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,265 543 Updated Feb 15, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,911 220 Updated Mar 4, 2025

Video stabilization using gyroscope data

Rust 7,186 318 Updated Mar 20, 2025

Curated collection of papers in machine learning systems

264 15 Updated Feb 28, 2025

Free and Open Source telemetry overlay application for racing simulation

Python 116 14 Updated Mar 22, 2025

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,277 551 Updated Mar 22, 2025

Python tool for converting files and office documents to Markdown.

Python 41,162 1,945 Updated Mar 22, 2025

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ 881 103 Updated Mar 14, 2025

Horizontal Fusion

C++ 22 8 Updated Jan 7, 2022

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 721 260 Updated Aug 19, 2024

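An anime collection app based on custom rules, with support for online streaming, danmaku (bullet comments), and real-time super-resolution.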

Dart 7,565 206 Updated Mar 22, 2025

Sharing the codebase and steps for artifact evaluation/reproduction for MICRO 2024 paper

Cuda 9 Updated Sep 1, 2024

DLRover: An Automatic Distributed Deep Learning System

Python 1,374 173 Updated Mar 21, 2025

Free images for EVE-NG and GNS3 containing routers, switches, firewalls, and other appliances, including Cisco, Fortigate, Palo Alto, Sophos, and more. Master the art of networking and improve your sk…

HTML 1,174 277 Updated Mar 14, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,151 1,174 Updated Mar 21, 2025

Penn CIS 5650 (GPU Programming and Architecture) Final Project

C++ 29 4 Updated Dec 11, 2023

User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)

TypeScript 33,548 3,199 Updated Mar 20, 2025

CUDA Kernel Benchmarking Library

Cuda 595 72 Updated Mar 12, 2025

A tool for examining GPU scheduling behavior.

Cuda 73 18 Updated Aug 17, 2024

Enable macOS HiDPI and have a native setting.

Shell 9,473 1,046 Updated Jul 3, 2024

Automatically switches between the dark and light theme of Windows 10 and Windows 11

C# 8,077 269 Updated Jan 24, 2025

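Many container images are hosted overseas, for example on gcr, so downloads within China are slow and need acceleration. This project aims to provide a stable, reliable, and secure container image service that connects the whole world.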

Shell 9,179 1,113 Updated Mar 20, 2025