Taiwanese Instruction-following Language Models

Setup

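Install the required packages with the commands below. They install Miniconda, add CUDA to the path, create the open-instruct conda environment, and install a PyTorch nightly build, the project requirements, and flash-attn. Two variants follow: the first targets CUDA 12.1 with the latest nightly torch, and the second targets CUDA 12.2 with torch pinned below 2.2.0 and flash-attn pinned to 2.2.2.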

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
export PATH=/usr/local/cuda-12.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:${LD_LIBRARY_PATH}
echo 'export PATH=/usr/local/cuda-12.1/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:${LD_LIBRARY_PATH}' >> ~/.bashrc


conda create -n open-instruct python=3.10 -y
conda activate open-instruct
#pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121 
pip3 install --upgrade --force-reinstall --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121 
pip install -r llama2_requirements.txt && pip install "flash-attn>=2.0.0" --no-build-isolation

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
export PATH=/usr/local/cuda-12.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:${LD_LIBRARY_PATH}
echo 'export PATH=/usr/local/cuda-12.2/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:${LD_LIBRARY_PATH}' >> ~/.bashrc


conda create -n open-instruct python=3.10 -y
conda activate open-instruct
#pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121 
pip3 install --upgrade --force-reinstall --pre "torch<2.2.0" --index-url https://download.pytorch.org/whl/nightly/cu121 
pip install -r llama2_requirements.txt && pip install flash-attn==2.2.2 --no-build-isolation
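
After either set of commands finishes, it can help to confirm which versions actually landed in the environment, since the nightly index and the version pins may resolve differently over time. The following is a minimal sketch that uses only the Python standard library. The helper names `installed_version` and `report` are hypothetical and are not part of this repository.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "flash-attn")):
    """Map each package name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so a missing install is easy to spot.
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated open-instruct environment. A "not installed" line for torch or flash-attn suggests the corresponding pip step failed.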

Training

Pretrain

bash pretrain_sft_flash.sh 13 1 0

Instruction-tuning

./scripts/finetune_with_hf_trainer.sh

Demo and Model Checkpoints

We provide a number of model checkpoints as weight diffs. They are hosted on Hugging Face.
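
Distributing a checkpoint as a diff usually means publishing only the difference between the fine-tuned weights and the base model weights, so that users who already hold the base model can reconstruct the fine-tuned one. This README does not describe the exact recovery procedure, so the sketch below only illustrates the general idea under the assumption of an elementwise difference. The functions `make_diff` and `apply_diff` are hypothetical, and plain Python lists stand in for real tensors.

```python
def _check_same_layout(a, b):
    """Raise if two weight dicts differ in parameter names or sizes."""
    if a.keys() != b.keys():
        raise KeyError("parameter names do not match")
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")


def make_diff(base, tuned):
    """Compute diff = tuned - base, elementwise, for every parameter."""
    _check_same_layout(base, tuned)
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in tuned
    }


def apply_diff(base, diff):
    """Recover tuned = base + diff, elementwise, for every parameter."""
    _check_same_layout(base, diff)
    return {
        name: [b + d for b, d in zip(base[name], diff[name])]
        for name in base
    }


if __name__ == "__main__":
    # Toy weights (assumed values, for illustration only).
    base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
    tuned = {"layer.weight": [1.5, 1.0], "layer.bias": [0.25]}
    diff = make_diff(base, tuned)
    print(apply_diff(base, diff) == tuned)
```

With real checkpoints the same addition would be applied per tensor to the model's state dict, and the base model must be the exact one the diff was computed against.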

Licensing

This repository is licensed under Apache 2.0 as given in LICENSE.

Citation

If you use this repository or our models, please cite our work:

@inproceedings{lin-chen-2023-llm,
    title = "{LLM}-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models",
    author = "Lin, Yen-Ting  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.nlp4convai-1.5",
    pages = "47--58",
    abstract = "We propose LLM-Eval, a unified multi-dimensional automatic evaluation method for open-domain conversations with large language models (LLMs). Existing evaluation methods often rely on human annotations, ground-truth responses, or multiple LLM prompts, which can be expensive and time-consuming. To address these issues, we design a single prompt-based evaluation method that leverages a unified evaluation schema to cover multiple dimensions of conversation quality in a single model call. We extensively evaluate the performance of LLM-Eval on various benchmark datasets, demonstrating its effectiveness, efficiency, and adaptability compared to state-of-the-art evaluation methods. Our analysis also highlights the importance of choosing suitable LLMs and decoding strategies for accurate evaluation results. LLM-Eval offers a versatile and robust solution for evaluating open-domain conversation systems, streamlining the evaluation process and providing consistent performance across diverse scenarios.",
}
