-
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python · Apache License 2.0 · Updated Sep 25, 2024
-
YHs_Sample Public
Forked from Yinghan-Li/YHs_Sample
Yinghan's Code Sample
Cuda · GNU General Public License v3.0 · Updated Feb 18, 2022
-
AutoKernel Public
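Forked from OAID/AutoKernel
AutoKernel is an easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.
C++ · Apache License 2.0 · Updated Dec 12, 2020
-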
wrox-pro-cuda-c Public
Forked from kriegalex/wrox-pro-cuda-c
Sample code from the book "Professional CUDA C Programming"
Cuda · MIT License · Updated Jul 28, 2020
-
ostep-translations Public
Forked from remzi-arpacidusseau/ostep-translations
Various translations of OSTEP can be found here. Help the cause and contribute!
Updated Jun 3, 2019
-
maxas Public
Forked from NervanaSystems/maxas
Assembler for NVIDIA Maxwell architecture
CSS · MIT License · Updated Sep 9, 2018
-
small-matrix-inverse Public
Forked from niswegmann/small-matrix-inverse
SIMD optimised library for matrix inversion of 2x2, 3x3, and 4x4 matrices.
C · Creative Commons Zero v1.0 Universal · Updated Feb 2, 2016