Stars
Making Long-Context LLM Inference 10x Faster and 10x Cheaper
Building blocks for foundation models.
My continuously updated Machine Learning, Probabilistic Models and Deep Learning notes and demos (2000+ slides), with links to the accompanying videos.
This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).
YaRN: Efficient Context Window Extension of Large Language Models (a RoPE position-scaling sketch follows this list).
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
This repository contains PyTorch implementations of various random feature maps for dot product kernels.
Simple NumPy implementation of the FAVOR+ attention mechanism, https://teddykoker.com/2020/11/performers/ (a minimal feature-map sketch follows this list).
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks (a cache-eviction sketch follows this list).
Awesome LLM compression research papers and tools.
Code for the ICML 2023 paper: Machine Learning Force Fields with Data Cost Aware Training
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX (a pipeline usage sketch follows this list).
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local Llama 2 backend for Generative Agents/Apps.
Windows build of bitsandbytes for use in text-generation-webui.
Dynamically determine the suggested clusters in the data for unsupervised learning.
QLoRA: Efficient Finetuning of Quantized LLMs (a 4-bit-plus-LoRA setup sketch follows this list).
A playbook for systematically maximizing the performance of deep learning models.
yifan1130 / PLATON
Forked from QingruZhang/PLATON. This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).
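The sketches below illustrate the general ideas behind a few of the starred projects; they are hedged reconstructions, not code taken from the repositories. First, for the YaRN entry: a minimal NumPy sketch of rotary position embeddings with plain linear position interpolation, the baseline that YaRN refines with NTK-by-parts frequency interpolation and an attention temperature term. The function names and the `scale` parameter are illustrative.

```python
import numpy as np

def rope_angles(seq_len, dim, base=10000.0, scale=1.0):
    # scale > 1 compresses positions so a longer sequence maps back into
    # the position range seen during training (linear interpolation).
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)  # (dim/2,)
    positions = np.arange(seq_len) / scale                 # (seq_len,)
    return np.outer(positions, inv_freq)                   # (seq_len, dim/2)

def apply_rope(x, angles):
    # x: (seq_len, dim) float array; rotate each (even, odd) feature pair.
    x1, x2 = x[:, 0::2], x[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```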
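For the FAVOR+ entry: a minimal NumPy sketch of the positive random feature map that lets (non-causal) softmax attention be approximated in linear time. The helper names and the feature count `m` are assumptions, not the linked post's code.

```python
import numpy as np

def favor_plus_features(x, omega):
    # x: (seq, d), omega: (d, m) Gaussian projection matrix.
    # phi(x) = exp(x @ omega - ||x||^2 / 2) / sqrt(m), which is always positive.
    norm = np.sum(x ** 2, axis=-1, keepdims=True) / 2.0
    return np.exp(x @ omega - norm) / np.sqrt(omega.shape[1])

def favor_plus_attention(q, k, v, m=256, seed=0):
    # Linear-time approximation of softmax(q k^T / sqrt(d)) v (non-causal case).
    d = q.shape[-1]
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((d, m))
    q_prime = favor_plus_features(q / d ** 0.25, omega)  # (seq, m)
    k_prime = favor_plus_features(k / d ** 0.25, omega)  # (seq, m)
    kv = k_prime.T @ v                                   # (m, d_v)
    normalizer = q_prime @ k_prime.sum(axis=0)           # (seq,)
    return (q_prime @ kv) / normalizer[:, None]
```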
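For the StreamingLLM entry: a sketch of the attention-sink eviction rule, which keeps the first few tokens plus a rolling window of the most recent tokens in the KV cache. The function, its arguments, and the default sizes are illustrative only, not the repository's API.

```python
def evict_kv_cache(keys, values, num_sink=4, window=1020):
    # keys/values: per-token cache entries in sequence order, oldest first.
    if len(keys) <= num_sink + window:
        return keys, values  # nothing to evict yet
    # Keep the initial "sink" tokens plus the most recent window tokens.
    kept_keys = keys[:num_sink] + keys[-window:]
    kept_values = values[:num_sink] + values[-window:]
    return kept_keys, kept_values
```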
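For the 🤗 Transformers entry: a short usage example with the `pipeline` API; the checkpoint name and prompt are placeholders.

```python
from transformers import pipeline

# Build a text-generation pipeline around an example checkpoint.
generator = pipeline("text-generation", model="gpt2")
result = generator("Long-context inference is", max_new_tokens=20)
print(result[0]["generated_text"])
```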
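For the QLoRA entry: a hedged sketch of a QLoRA-style setup via the Hugging Face bitsandbytes integration and PEFT, i.e. a frozen 4-bit NF4 base model with trainable LoRA adapters. The checkpoint name, target modules, and hyperparameters are placeholders, not values from the QLoRA repository.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                   # 4-bit NF4 base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,      # double quantization of the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",          # example checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)      # trainable low-rank adapters on the frozen 4-bit base
model.print_trainable_parameters()
```

Training then proceeds as usual, with only the adapter parameters receiving gradients.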