- CUDATutorial Public
  Forked from PaddleJitLab/CUDATutorial
  A self-learning tutorial for CUDA High Performance Programming.
  JavaScript · Apache License 2.0 · Updated Dec 17, 2024
- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python · Apache License 2.0 · Updated Aug 26, 2024
- pytorch_dlprim Public
  Forked from artyom-beilis/pytorch_dlprim
  DLPrimitives/OpenCL out-of-tree backend for PyTorch
  C++ · MIT License · Updated Aug 25, 2024
- onnxruntime Public
  Forked from microsoft/onnxruntime
  ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
  C++ · MIT License · Updated Mar 13, 2024
- DeepLearningSystem Public
  Forked from chenzomi12/AISystem
  Deep Learning System core principles introduction.
  Jupyter Notebook · Apache License 2.0 · Updated Feb 22, 2024
- xla Public
  Forked from openxla/xla
  A machine learning compiler for GPUs, CPUs, and ML accelerators
  C++ · Apache License 2.0 · Updated Dec 18, 2023
- llvm-project Public
  Forked from llvm/llvm-project
  The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
  Other · Updated Dec 16, 2023
- triton Public
  Forked from triton-lang/triton
  Development repository for the Triton language and compiler
  C++ · MIT License · Updated Dec 15, 2023
- TIM-VX Public
  Forked from VeriSilicon/TIM-VX
  VeriSilicon Tensor Interface Module
  C · MIT License · Updated Oct 11, 2023
- openmlsys-zh Public
  Forked from openmlsys/openmlsys-zh
  "Machine Learning Systems: Design and Implementation" (Chinese version)
  TeX · Updated Nov 3, 2022
- TensorRT-Developer_Guide_in_Chinese Public
  Forked from HeKun-NVIDIA/TensorRT-Developer_Guide_in_Chinese
  Updated May 11, 2022