Hi, I'm Baixin 👋

  • 📝 I'm currently working on LLM inference
  • 💻 I'm currently learning AI Infra

Pinned

  1. llama.cpp (Public)

    Forked from ggml-org/llama.cpp

    LLM inference in C/C++

    C++

  2. Awesome-LLM-Inference (Public)

    Forked from xlite-dev/Awesome-LLM-Inference

    📖 A curated list of Awesome LLM Inference papers with code, covering TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

  3. EAGLE (Public)

    Forked from SafeAILab/EAGLE

    Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

    Python

  4. cuda_practices (Public)

    Cuda

  5. how-to-optim-algorithm-in-cuda (Public)

    Forked from BBuf/how-to-optim-algorithm-in-cuda

    How to optimize some algorithms in CUDA.

    Cuda