Stars

AIinfer

7 repositories

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python · 42,109 stars · 5,156 forks · Updated Feb 25, 2025

[NeurIPS'24 Spotlight, ICLR'25] Speeds up long-context LLM inference by computing attention approximately with dynamic sparsity, which reduces pre-filling latency by up to 10x on an …

Python · 920 stars · 46 forks · Updated Feb 25, 2025

A Chinese translation of the CUDA programming guide

1,434 stars · 222 forks · Updated Nov 13, 2024

Learn CUDA Programming, published by Packt

Cuda · 1,110 stars · 249 forks · Updated Dec 30, 2023

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript · 18,591 stars · 2,257 forks · Updated Nov 13, 2024

Material for gpu-mode lectures

Jupyter Notebook · 3,818 stars · 387 forks · Updated Feb 9, 2025

CPT, SFT, and DPO training of LLMs with QLoRA

Python · 8 stars · 1 fork · Updated Jan 14, 2024