- I'm currently working on LLM-inference
- I'm currently learning AI-Infra
Pinned
- Awesome-LLM-Inference (Public, forked from xlite-dev/Awesome-LLM-Inference)
  A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
- EAGLE (Public, forked from SafeAILab/EAGLE)
  Official implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
  Language: Python
- how-to-optim-algorithm-in-cuda (Public, forked from BBuf/how-to-optim-algorithm-in-cuda)
  How to optimize some algorithms in CUDA.
  Language: Cuda