Analyze computation-communication overlap in V3/R1.
DeepEP: an efficient expert-parallel communication library
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Wan: Open and Advanced Large-Scale Video Generative Models
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
An Obsidian template vault for tracking your academic life.
Cost-efficient and pluggable Infrastructure components for GenAI inference
A token pruning method that accelerates ViTs for various tasks while maintaining high performance.
[Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Official code for paper: F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Consistent Gaussian Splatting
LookHere position encoding for ViTs (NeurIPS 2024)
[3DV'25] 3D Reconstruction with Spatial Memory
Efficient End2End Compiler for Mixed-Precision Deep Learning
Reproducible evaluation of NeRF and 3DGS methods
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive A…
📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) o…
FlashMLA: Efficient MLA Decoding Kernel for Hopper GPUs
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.
MoH: Multi-Head Attention as Mixture-of-Head Attention