
Multimodal Instruction Tuning with Conditional Mixture of LoRA (ACL 2024)

VT-NLP/MixLoRA

Requirements and Installation

To set up your environment, run the following commands:

conda create -n mixlora python=3.8 -y
conda activate mixlora
sh setup.sh

Data Preparation

Training Dataset

Please download the dataset from Vision-Flan.

Evaluation Dataset

The evaluation dataset we used can be downloaded from here.

Training & Inference

Set image_folder and data_path in the fine-tuning scripts to the paths prepared in the Data Preparation step.

Training

To fine-tune the model, run the following command:

sh scripts/finetune_mixlora.sh <routing-type> <num-experts> <num-rank>
  • <routing-type>: Specify the routing type (input for instance-based IFS routing alone, input_lora_a_param for combined instance-based IFS routing and CFS routing).
  • <num-experts>: Specify the number of experts (LoRA factors).
  • <num-rank>: Specify the rank of each LoRA factor.
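Conceptually, the router produces per-instance gate weights over the <num-experts> LoRA factors, and the gated low-rank update is added to the output of the frozen weight. Below is a minimal NumPy sketch of that idea; the shapes, variable names, and softmax router are illustrative assumptions, not the repository's actual implementation:

```python
# Illustrative sketch of a conditional mixture-of-LoRA forward pass.
# All names and shapes here are assumptions for exposition only.
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out = 16, 16       # frozen layer dimensions
num_experts = 4            # corresponds to <num-experts>
rank = 2                   # corresponds to <num-rank>

W = rng.normal(size=(d_out, d_in))                     # frozen base weight
A = rng.normal(size=(num_experts, rank, d_in)) * 0.01  # per-expert LoRA A factors
B = np.zeros((num_experts, d_out, rank))               # per-expert LoRA B factors (zero-init)
router = rng.normal(size=(num_experts, d_in))          # instance-level router

def mixlora_forward(x):
    """Gate the LoRA experts per instance, then apply W x + B_mix (A_mix x)."""
    logits = router @ x
    gates = np.exp(logits - logits.max())
    gates = gates / gates.sum()                # softmax over experts
    # Compose an instance-specific low-rank update from the gated factors.
    A_mix = np.einsum("e,erd->rd", gates, A)   # (rank, d_in)
    B_mix = np.einsum("e,edr->dr", gates, B)   # (d_out, rank)
    return W @ x + B_mix @ (A_mix @ x)

x = rng.normal(size=d_in)
y = mixlora_forward(x)
print(y.shape)  # → (16,)
```

Because the B factors start at zero (the usual LoRA initialization), the mixture contributes nothing at step zero and training grows the update from the identity behavior of the frozen layer.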

The projector weights mm_projector.bin can be downloaded from the original LLaVA repo.

The trained model checkpoints can be downloaded here.

Inference

To run inference on all the multimodal tasks:

sh scripts/run_eval.sh <model-path> <data-dir>
  • <model-path>: Specify the path to the model.
  • <data-dir>: Specify the path to the evaluation dataset.

To run inference on MME:

sh scripts/run_eval_mme.sh <model-path> <data-dir>
  • <model-path>: Specify the path to the model.
  • <data-dir>: Specify the path to the MME dataset.

Acknowledgement

The codebase is built upon LLaVA. We would like to thank the authors for publicly releasing their code.

Citation

@article{shen2024multimodal,
  title={Multimodal Instruction Tuning with Conditional Mixture of LoRA},
  author={Shen, Ying and Xu, Zhiyang and Wang, Qifan and Cheng, Yu and Yin, Wenpeng and Huang, Lifu},
  journal={arXiv preprint arXiv:2402.15896},
  year={2024}
}
