SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning) is an extensible framework designed to facilitate lightweight model fine-tuning. It integrates implementations of various efficient fine-tuning methods, embracing approaches that are parameter-efficient, memory-efficient, and time-efficient. SWIFT integrates seamlessly into the ModelScope ecosystem and can fine-tune a variety of models, with a primary emphasis on LLMs and vision models. Additionally, SWIFT is fully compatible with Peft, so users can fine-tune ModelScope models through the familiar Peft interface.
Currently supported approaches (and counting):
- LoRA: LoRA: Low-Rank Adaptation of Large Language Models
- Adapter: Parameter-Efficient Transfer Learning for NLP
- Prompt Tuning: Visual Prompt Tuning
- All tuners offered on Peft.
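To give a sense of why LoRA counts as parameter-efficient, the sketch below compares the number of trainable values in a full weight matrix with the number in its rank-r update. The function name and the 768 x 768 layer size are illustrative assumptions and are not taken from SWIFT; the count r * (d_out + d_in) follows from the LoRA paper's factorization of the weight update into two low-rank matrices.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (full, lora) trainable-parameter counts for one linear weight.

    Full fine-tuning updates all d_out * d_in entries; LoRA trains two
    factors, B (d_out x rank) and A (rank x d_in), instead.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical 768 x 768 projection with rank 10.
full, lora = lora_param_counts(768, 768, 10)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```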
Key features:
- By integrating the ModelScope library, models can be readily obtained via a model-id.
- Tuners provided by SWIFT can be combined, so that multiple tuners can be explored on one model for the best result.
- supported SFT methods: lora, qlora, full (full parameter fine-tuning)
- supported models:
- qwen series: qwen-7b, qwen-7b-chat
- qwen-vl series: qwen-vl, qwen-vl-chat
- baichuan series: baichuan-7b, baichuan-13b, baichuan-13b-chat, baichuan2-7b, baichuan2-7b-chat, baichuan2-13b, baichuan2-13b-chat
- chatglm2 series: chatglm2-6b, chatglm2-6b-32k
- llama series: llama2-7b, llama2-7b-chat, llama2-13b, llama2-13b-chat, llama2-70b, llama2-70b-chat
- openbuddy-llama series: openbuddy-llama2-13b, openbuddy-llama-65b, openbuddy-llama2-70b
- internlm series: internlm-7b, internlm-7b-chat, internlm-7b-chat-8k
- other: polylm-13b, seqgpt-560m
- supported features: quantization, DDP, model parallelism (device map), gradient checkpointing, gradient accumulation, pushing to modelscope hub, custom datasets, multimodal and agent SFT, multi-round chat, ...
- supported datasets:
- NLP: alpaca-en(gpt4), alpaca-zh(gpt4), finance-en, multi-alpaca-all, code-en, instinwild-en, instinwild-zh, cot-en, cot-zh, firefly-all-zh, poetry-zh, instruct-en, gpt4all-en, cmnli-zh, jd-zh, dureader-robust-zh, medical-en, medical-zh, medical-mini-zh, sharegpt-en, sharegpt-zh
- agent: damo-agent-zh, damo-agent-mini-zh
- multi-modal: coco-en
- other: cls-fudan-news-zh, ner-jave-zh
- supported templates: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, default-generation
SWIFT runs in a Python environment. Please make sure your Python version is 3.8 or later.
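A quick way to confirm the interpreter before installing is sketched below. Treating 3.8 itself as acceptable is an assumption, based on the py38 Docker image mentioned in this README; the helper name `python_is_supported` is hypothetical.

```python
import sys

MIN_PYTHON = (3, 8)  # assumed minimum, based on the requirement stated above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when the interpreter's (major, minor) is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not python_is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is older than "
          f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}; SWIFT may not install or run.")
```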
Install SWIFT with pip:
pip install ms-swift -U
To install SWIFT from source, run:
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .
If you are using the source code, remember to install the requirements:
pip install -r requirements/framework.txt
SWIFT requires torch>=1.13.
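The torch requirement can be checked without importing torch, as in the following sketch. It uses only the standard library; the helper names `version_tuple` and `meets_minimum` are hypothetical, and the parsing deliberately ignores local suffixes such as `+cu117`.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Extract leading numeric components, e.g. '2.0.1+cu117' -> (2, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple = (1, 13)) -> bool:
    """Return True when `version` is at least `minimum`."""
    return version_tuple(version) >= minimum


try:
    installed = metadata.version("torch")  # reads package metadata only
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("torch is not installed")
else:
    print("torch", installed, "meets requirement:", meets_minimum(installed))
```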
We also recommend using SWIFT in our docker image:
docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.0
SWIFT supports multiple tuners of its own, as well as the tuners provided by Peft. To use these tuners, simply call:
from swift import Swift
model = Swift.prepare_model(model, config, extra_state_keys=['...'])
The code snippet above initializes the tuner randomly. The input model is an instance of torch.nn.Module, config is an instance of a subclass of SwiftConfig or PeftConfig, and extra_state_keys lists the extra module weights (such as the linear head) to be trained and stored in the output dir.
You may combine multiple tuners by:
from swift import Swift, LoRAConfig, PromptConfig
model = Swift.prepare_model(model, {'lora': LoRAConfig(...), 'prompt': PromptConfig(...)})
You can call save_pretrained and push_to_hub after fine-tuning:
from swift import push_to_hub
model.save_pretrained('some-output-folder')
push_to_hub('my-group/some-repo-id-modelscope', 'some-output-folder', token='some-ms-token')
Here my-group/some-repo-id-modelscope is assumed to be the model-id in the hub, and some-ms-token is the token used for uploading.
Use the model-id for later inference:
from swift import Swift
model = Swift.from_pretrained(model, 'my-group/some-repo-id-modelscope')
Here is a runnable example:
import os
import tempfile
# Please install modelscope by `pip install modelscope`
from modelscope import Model
from swift import LoRAConfig, SwiftModel, Swift, push_to_hub
tmp_dir = tempfile.TemporaryDirectory().name
if not os.path.exists(tmp_dir):
    os.makedirs(tmp_dir)
model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
lora_config = LoRAConfig(target_modules=['q_proj', 'k_proj', 'v_proj'])
model: SwiftModel = Swift.prepare_model(model, lora_config)
# Do some finetuning here
model.save_pretrained(tmp_dir)
push_to_hub('my-group/swift_llama2', output_dir=tmp_dir)
model = Model.from_pretrained('modelscope/Llama-2-7b-ms', device_map='auto')
model = SwiftModel.from_pretrained(model, 'my-group/swift_llama2', device_map='auto')
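One detail of the example above: `tempfile.TemporaryDirectory().name` keeps only the path, and in CPython the directory is typically removed as soon as the unreferenced `TemporaryDirectory` object is collected, which is why the example recreates it with `os.makedirs`. A minimal alternative sketch, using only the standard library, holds the directory open with a context manager. The file written below is a placeholder standing in for whatever `save_pretrained` would produce.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # The directory exists for the whole `with` block.
    marker = os.path.join(tmp_dir, "adapter_config.json")  # placeholder file
    with open(marker, "w", encoding="utf-8") as f:
        f.write("{}")
    existed_inside = os.path.isdir(tmp_dir) and os.path.isfile(marker)

# On exit the directory and its contents are deleted.
existed_after = os.path.exists(tmp_dir)
print(existed_inside, existed_after)
```

With this pattern, saving and pushing to the hub would both need to happen inside the `with` block, before the directory is cleaned up.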
This is an example that uses transformers for model creation and SWIFT for efficient tuning.
from swift import Swift, LoRAConfig, AdapterConfig, PromptConfig
from transformers import AutoModelForImageClassification
# init vit model
model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
# init lora tuner config
lora_config = LoRAConfig(
    r=10,  # the rank of the LoRA module
    target_modules=['query', 'key', 'value'],  # the modules to be replaced, matched by the end of the module name
    merge_weights=False  # whether to merge weights
)
# init adapter tuner config
adapter_config = AdapterConfig(
    dim=768,  # the dimension of the hidden states
    hidden_pos=0,  # the position of the hidden state to be passed into the adapter
    target_modules=r'.*attention.output.dense$',  # the modules to be replaced, matched by regular expression
    adapter_length=10  # the length of the adapter
)
# init prompt tuner config
prompt_config = PromptConfig(
    dim=768,  # the dimension of the hidden states
    target_modules=r'.*layer\.\d+$',  # the modules to be replaced, matched by regular expression
    embedding_pos=0,  # the position of the embedding tensor
    prompt_length=10,  # the length of the prompt tokens
    attach_front=False  # whether the prompt is attached in front of the embedding
)
# create model with swift. In practice, you can use any of these tuners or a combination of them.
model = Swift.prepare_model(model, {"lora_tuner": lora_config, "adapter_tuner": adapter_config, "prompt_tuner": prompt_config})
# get the trainable parameters of model
model.get_trainable_parameters()
# 'trainable params: 838,776 || all params: 87,406,432 || trainable%: 0.9596273189597764'
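The comments in the configs above describe two ways of selecting modules: a list of name endings (LoRAConfig) and a regular expression (AdapterConfig, PromptConfig). The sketch below imitates that selection with the standard library only. It is an approximation and not SWIFT's actual matching code: the helper `select_modules`, the use of `str.endswith` and `re.fullmatch`, and the sample module names are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Union


def select_modules(names: Iterable[str], target: Union[str, List[str]]) -> List[str]:
    """Pick module names by suffix list or by regular expression (illustrative)."""
    if isinstance(target, str):
        pattern = re.compile(target)
        return [n for n in names if pattern.fullmatch(n)]
    return [n for n in names if any(n.endswith(t) for t in target)]


# Hypothetical module names in the style of a ViT encoder.
names = [
    "vit.encoder.layer.0",
    "vit.encoder.layer.0.attention.attention.query",
    "vit.encoder.layer.0.attention.attention.key",
    "vit.encoder.layer.0.attention.attention.value",
    "vit.encoder.layer.0.attention.output.dense",
    "vit.encoder.layer.0.output.dense",
]

print(select_modules(names, ["query", "key", "value"]))
print(select_modules(names, r".*attention.output.dense$"))
print(select_modules(names, r".*layer\.\d+$"))
```

To see which names a real model exposes, `model.named_modules()` on any `torch.nn.Module` yields `(name, module)` pairs that can be fed to a check like this one before choosing `target_modules`.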
You can use the features offered by Peft in SWIFT:
from swift import LoraConfig, Swift
from peft import TaskType
lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
model_wrapped = Swift.prepare_model(model, lora_config)
# or call from_pretrained to load weights in the modelhub
model_wrapped = Swift.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
or:
from swift import LoraConfig, get_peft_model, PeftModel
from peft import TaskType
lora_config = LoraConfig(target_modules=['query', 'key', 'value'], task_type=TaskType.CAUSAL_LM)
model_wrapped = get_peft_model(model, lora_config)
# or call from_pretrained to load weights in the modelhub
model_wrapped = PeftModel.from_pretrained(model, 'some-id-in-the-modelscope-modelhub')
Swift tuners and Peft tuners use slightly different saving strategies. You can name a SWIFT tuner as follows:
model = Swift.prepare_model(model, {'default': LoRAConfig(...)})
model.save_pretrained('./output')
In the output dir, you will have a dir structure like this:
output
    |-- default
        |-- adapter_config.json
        |-- adapter_model.bin
    |-- adapter_config.json
    |-- adapter_model.bin
The config and weights stored directly in the output dir are those of extra_state_keys. This differs from Peft, which stores the weights and config of the default tuner at that level.
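To make the layout concrete, the following sketch rebuilds the tree above with empty placeholder files and lists the subdirectories that hold a named tuner. It uses only the standard library; `list_named_tuners` is a hypothetical helper and not part of SWIFT, and the file names are the ones shown in the tree.

```python
import os
import tempfile

CONFIG_NAME = "adapter_config.json"  # file name taken from the tree above


def list_named_tuners(output_dir: str) -> list:
    """Return subdirectories of `output_dir` that contain a tuner config file."""
    return sorted(
        entry for entry in os.listdir(output_dir)
        if os.path.isfile(os.path.join(output_dir, entry, CONFIG_NAME))
    )


# Recreate the layout shown above with empty placeholder files.
with tempfile.TemporaryDirectory() as output_dir:
    for folder in (output_dir, os.path.join(output_dir, "default")):
        os.makedirs(folder, exist_ok=True)
        for filename in (CONFIG_NAME, "adapter_model.bin"):
            open(os.path.join(folder, filename), "w").close()
    tuners = list_named_tuners(output_dir)

print(tuners)
```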
- ModelScope Library is the model library of the ModelScope project, which contains a large number of popular models.
This project is licensed under the Apache License (Version 2.0).