Some useful tips for faiss

Shell 607 46 Updated Nov 2, 2023

Making Long-Context LLM Inference 10x Faster and 10x Cheaper

Python 305 35 Updated Dec 25, 2024

Building blocks for foundation models.

424 17 Updated Jan 3, 2024

Systems for GenAI

74 5 Updated Nov 13, 2024

My continuously updated Machine Learning, Probabilistic Models and Deep Learning notes, demos, and video links (2000+ slides)

Jupyter Notebook 8,476 1,722 Updated Sep 29, 2024

This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).

Python 43 5 Updated Oct 17, 2022

YaRN: Efficient Context Window Extension of Large Language Models

Python 1,371 118 Updated Apr 17, 2024

Code and data for "Lost in the Middle: How Language Models Use Long Contexts"

Python 325 27 Updated Jan 4, 2024

This repository contains PyTorch implementations of various random feature maps for dot product kernels.

Jupyter Notebook 18 2 Updated Jul 13, 2024

Simple NumPy implementation of the FAVOR+ attention mechanism, https://teddykoker.com/2020/11/performers/

Python 37 4 Updated Dec 30, 2020
Python 32 2 Updated Apr 12, 2021

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,718 374 Updated Jul 11, 2024

PASTA: Post-hoc Attention Steering for LLMs

Python 109 8 Updated Nov 24, 2024

Sequence modeling with Mega.

Python 298 28 Updated Jan 28, 2023
Python 209 19 Updated Jun 11, 2024

Awesome LLM compression research papers and tools.

1,272 84 Updated Dec 24, 2024

Code for the ICML 2023 paper: Machine Learning Force Fields with Data Cost Aware Training

Python 4 Updated Jun 21, 2023

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 136,753 27,387 Updated Dec 25, 2024

This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.

Python 534 43 Updated Mar 10, 2024

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.

Jupyter Notebook 1,968 203 Updated Mar 22, 2024

Windows compile of bitsandbytes for use in text-generation-webui.

HTML 343 35 Updated Nov 18, 2023

Dynamically get the suggested clusters in the data for unsupervised learning.

Rust 218 46 Updated Jul 31, 2024

QLoRA: Efficient Finetuning of Quantized LLMs

Jupyter Notebook 10,128 826 Updated Jun 10, 2024

A playbook for systematically maximizing the performance of deep learning models.

27,649 2,287 Updated Jun 18, 2024

This PyTorch package implements PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance (ICML 2022).

Python 1 Updated Jul 20, 2022