reconstruct folder and change the readme

nightscape · Jul 22, 2024 · 990cea1
1 parent 7dc2a5a
commit 990cea1
Showing 5 changed files with 41 additions and 13 deletions.
diff --git a/accelerate_config.yaml b/HF_Trainer/accelerate_config.yaml
diff --git a/training_config.yaml b/HF_Trainer/training_config.yaml
diff --git a/README.md b/README.md
@@ -1,37 +1,65 @@
 # sound_instruct_llama3
 
-## Clone
-
+## Organize the input/output directory
+1. First, clone the repo from GitHub:
 ```
 git clone --single-branch --branch training_script https://github.com/janhq/llama3-s.git
 ```
+Before training, organize the folder structure as follows. Note that the tokenizer should be in the same directory as the model.
+```
+llama3-s
+├── HF_Trainer
+├── scripts
+├── torchtune
+├── model_zoo
+│   ├── LLM
+│   │   ├── Meta-Llama-3-8B-Instruct
+│   │   ├── Jan-Llama3s-cp-6520-intermediate
+│   │   ├── Meta-Llama-3-70B-Instruct
 
-## Install
 ```
-chmod +x install.sh
-./install.sh
+## Training with HF Trainer
+### Install Dependencies
+```
+python -m venv hf_trainer
+chmod +x scripts/install.sh
+./scripts/install.sh
 ```
 Restart shell now
 ```
-chmod +x setup.sh
-./setup.sh
+chmod +x scripts/setup.sh
+./scripts/setup.sh
 source myenv/bin/activate
 ```
-
-## Logging Huggingface
-
+### Log in to Hugging Face
 ```
 huggingface-cli login --token=<token>
 ```
-
-## Training
+### Training
 ```
 export CUTLASS_PATH="cutlass"
 export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
 accelerate launch --config_file ./accelerate_config.yaml train.py 
 ```
 ## Training with Torchtune
+### Install Package
 ```
+python -m venv torchtune
+pip install --pre torch==2.5.0.dev20240617 --index-url https://download.pytorch.org/whl/nightly/cu121  # or cu118
+pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly
 cd ./torchtune
-tune run --nproc_per_node 4 full_finetune_distributed --config llama2/8B_full
+pip install -e .
+```
+You can also download the model using tune:
+```
+tune download meta-llama/Meta-Llama-3-70b --hf-token <token> --output-dir ../model_zoo/Meta-Llama-3-70b --ignore-patterns "original/consolidated*"
+```
+Set up the dataset from its HF path by changing the dataset path and the model name in the following YAML file.
+```
+nano torchtune/recipes/configs/jan-llama3-s/8B_full.yaml
+```
+
+### Multi-GPU Training (1-8 GPUs Supported)
+```
+tune run --nproc_per_node 4 full_finetune_distributed --config janhq-llama3-s/8B_full
 ```
diff --git a/install.sh b/scripts/install.sh
diff --git a/setup.sh b/scripts/setup.sh
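
The folder layout that the updated README asks for could be checked before training with a small shell function. This is a minimal sketch and not part of the commit: the function name `check_layout` is hypothetical, and the directory names are taken from the tree shown in the README diff above.

```shell
# Hypothetical helper (not part of this commit): verify that the top-level
# directories named in the README tree exist under the given repo root.
check_layout() {
    local root="${1:-.}"   # repo root, defaults to the current directory
    local missing=0
    local dir
    for dir in HF_Trainer scripts torchtune model_zoo/LLM; do
        if [ ! -d "$root/$dir" ]; then
            # Report each missing directory on stderr and remember the failure.
            echo "missing directory: $root/$dir" >&2
            missing=1
        fi
    done
    # Exit status 0 when every directory is present, 1 otherwise.
    return "$missing"
}
```

Running `check_layout llama3-s` from the parent directory of the clone prints one line per missing directory and returns a non-zero status if any are absent. It does not check the individual model folders under `model_zoo/LLM`, since those depend on which checkpoints were downloaded.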