
LMAOCaT: Low-Rank Mamba and gated Attention Optimization via Conversion and Transfer

Repository for the code used to linearize Llama-3.2-1B.

It currently contains code only for the linear attention + sliding window variant.
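
To make the approach concrete, below is a minimal, illustrative sketch of a linear attention + sliding window hybrid: exact exponential scores inside a causal window, kernelized (linear) scores for older tokens, with one shared normalizer. This is not the repository's implementation; the single-head simplification, shapes, and the ELU+1 feature map are all assumptions.

```python
import torch

def feature_map(x):
    # ELU + 1 keeps features positive, a common linear-attention kernel
    return torch.nn.functional.elu(x) + 1

def hybrid_attention(q, k, v, window=64):
    """Causal hybrid attention. q, k, v: (batch, seq, dim), one head for clarity.
    Exact exp-scores inside the sliding window, linear attention outside it."""
    b, n, d = q.shape
    qf, kf = feature_map(q), feature_map(k)
    out = torch.empty_like(v)
    kv_state = torch.zeros(b, d, d, device=q.device)  # running sum of kf^T v over old tokens
    k_state = torch.zeros(b, d, device=q.device)      # running sum of kf over old tokens
    for t in range(n):
        w0 = max(0, t - window + 1)
        if w0 > 0:  # token w0-1 just left the window; fold it into the linear state
            kv_state = kv_state + kf[:, w0 - 1].unsqueeze(-1) * v[:, w0 - 1].unsqueeze(1)
            k_state = k_state + kf[:, w0 - 1]
        # exact attention scores over the window [w0, t]
        scores = torch.exp(q[:, t : t + 1] @ k[:, w0 : t + 1].transpose(1, 2) / d**0.5)
        num = scores @ v[:, w0 : t + 1] + qf[:, t : t + 1] @ kv_state
        den = scores.sum(-1, keepdim=True) + (qf[:, t] * k_state).sum(-1).view(b, 1, 1)
        out[:, t] = (num / den).squeeze(1)
    return out
```

The per-token loop is only for clarity; real implementations vectorize it, and the constant-size (kv_state, k_state) pair is what replaces the growing KV cache at inference time.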

To run:

  1. Create and activate a virtual environment:

```bash
python -m venv .venv

# Activate virtual environment
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
```

  2. Install requirements:

```bash
pip install -r requirements.txt
```
  3. Run the notebooks in the following order (illustrative sketches of the attention-transfer and LoRA steps appear after this list):

```
# Attention Transfer
Llama_attn_transfer.ipynb

# LoRA finetune
llama_lora_finetune.ipynb

# Evaluation
Linear_llama_eval_inference_speed.ipynb
MMLU_eval-0shot.ipynb
MMLU_eval-5shot.ipynb
```
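
As a rough illustration of what the attention-transfer notebook does, the sketch below trains a linear-attention replacement to mimic the frozen softmax attention of the original layer via an MSE loss on the attention outputs. The function and argument names (`teacher_attn`, `student_attn`, `hidden_states`) are assumptions, not the notebook's API.

```python
import torch
import torch.nn.functional as F

def attention_transfer_step(teacher_attn, student_attn, hidden_states, optimizer):
    """One training step: the frozen softmax layer is the teacher,
    the trainable linear-attention layer is the student."""
    with torch.no_grad():
        target = teacher_attn(hidden_states)  # softmax attention output
    pred = student_attn(hidden_states)        # linear attention output
    loss = F.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```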

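And here is a minimal sketch of the kind of LoRA setup the finetuning notebook presumably performs, using Hugging Face peft; the rank, alpha, dropout, and target modules shown are placeholder assumptions, not the repo's values.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

lora_cfg = LoraConfig(
    r=8,             # low-rank update dimension (assumed)
    lora_alpha=16,   # LoRA scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Llama attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only adapter weights train; the base model stays frozen
```
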
Poster for the Project:

(Poster image: outdated and does not reflect accurate results.)
