GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,265 543 Updated Feb 15, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,911 220 Updated Mar 4, 2025

gyroflow / gyroflow

Video stabilization using gyroscope data

Rust 7,186 318 Updated Mar 20, 2025

deepseek-ai / DeepSeek-R1

87,155 11,254 Updated Feb 24, 2025

byungsoo-oh / ml-systems-papers

Curated collection of papers in machine learning systems

264 15 Updated Feb 28, 2025

deepseek-ai / DeepSeek-V3

Python 92,829 15,092 Updated Mar 16, 2025

s-victor / TinyPedal

Free and Open Source telemetry overlay application for racing simulation

Python 116 14 Updated Mar 22, 2025

pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

C++ 1,277 551 Updated Mar 22, 2025

microsoft / markitdown

Python tool for converting files and office documents to Markdown.

Python 41,162 1,945 Updated Mar 22, 2025

zhihu / ZhiLight

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ 881 103 Updated Mar 14, 2025

aoli-al / HFuse

Horizontal Fusion

C++ 22 8 Updated Jan 7, 2022

olcf / cuda-training-series

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 721 260 Updated Aug 19, 2024

Predidit / Kazumi

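An anime collection app based on custom rules, with support for online streaming, danmaku (bullet comments), and real-time super-resolution.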

Dart 7,565 206 Updated Mar 22, 2025

rishucoding / reproduce_MICRO24_GPU_DLRM_inference

Sharing the codebase and steps for artifact evaluation/reproduction for MICRO 2024 paper

Cuda 9 Updated Sep 1, 2024

intelligent-machine-learning / dlrover

DLRover: An Automatic Distributed Deep Learning System

Python 1,374 173 Updated Mar 21, 2025

hegdepavankumar / Cisco-Images-for-GNS3-and-EVE-NG

Free images for EVE-NG and GNS3 containing routers, switches, firewalls, and other appliances, including Cisco, Fortigate, Palo Alto, Sophos, and more. Master the art of networking and improve your sk…

HTML 1,174 277 Updated Mar 14, 2025

NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

C++ 7,151 1,174 Updated Mar 21, 2025

yinuotxie / Efficient-LLM-Inferencing-on-GPUs

Penn CIS 5650 (GPU Programming and Architecture) Final Project

C++ 29 4 Updated Dec 11, 2023

chatboxai / chatbox

User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)

TypeScript 33,548 3,199 Updated Mar 20, 2025

NVIDIA / nvbench

CUDA Kernel Benchmarking Library

Cuda 595 72 Updated Mar 12, 2025

yalue / cuda_scheduling_examiner_mirror

A tool for examining GPU scheduling behavior.

Cuda 73 18 Updated Aug 17, 2024

xzhih / one-key-hidpi

Enable macOS HiDPI and have a native setting.

Shell 9,473 1,046 Updated Jul 3, 2024

AutoDarkMode / Windows-Auto-Night-Mode

Automatically switches between the dark and light theme of Windows 10 and Windows 11

C# 8,077 269 Updated Jan 24, 2025

DaoCloud / public-image-mirror

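Many container images are hosted overseas (gcr, for example), so downloads within China are slow and need acceleration. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.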

Shell 9,179 1,113 Updated Mar 20, 2025

kfpanda123

Stars

atomicapple0 / libsmctrl

sgl-project / sglang

October2001 / Awesome-KV-Cache-Compression

interestingLSY / swiftLLM

tile-ai / tilelang

deepseek-ai / DeepGEMM

gpgpu-sim / gpgpu-sim_distribution