Stars
Thin, unified, C++-flavored wrappers for the CUDA APIs
An open-source, cross-platform terminal for seamless workflows
A shader-based software renderer written from scratch in C89
NVIDIA Linux open GPU kernel module source
SonicBOOM: The Berkeley Out-of-Order Machine
Ray tracing examples and tutorials using VK_KHR_ray_tracing
MSCCL++: A GPU-driven communication stack for scalable AI applications
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Some miscellaneous OpenSHMEM examples
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Automate browser-based workflows with LLMs and Computer Vision
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
Turns Data and AI algorithms into production-ready web applications in no time.
LightRAG: Simple and Fast Retrieval-Augmented Generation
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.
An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Rendering glTF scenes with ray tracer and raster (Vulkan)
chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.
Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
OpenAI-style API for open large language models: use LLMs just like ChatGPT! Supports LLaMA, LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, Xverse, SqlCoder, CodeLLaMA, ChatGLM, ChatGLM2, ChatGLM3, etc.…