
Commit

Merge branch 'THUDM:main' into main
ai-liuys authored Mar 29, 2024
2 parents 59ec5c6 + af9be9e commit e062715
Showing 4 changed files with 16 additions and 10 deletions.
finetune_demo/README.md: 2 changes (1 addition & 1 deletion)
@@ -26,7 +26,7 @@ pip install -r requirements.txt
 > 1. Unknown training issues / GPU memory usage that deviates from the figures above.
 > 2. A GPU architecture too old to support certain features.
 > 3. Problems with inference quality.
-> The above three cases are issues the community has run into before; although the probability is extremely low ("极地", a typo in the original), you can try to resolve them in the community if you encounter them.
+> The above three cases are issues the community has run into before; although the probability is fairly low ("较低"), you can try to resolve them in the community if you encounter them.
 ## Multi-turn Dialogue Format

finetune_demo/configs/lora.yaml: 12 changes (7 additions & 5 deletions)
@@ -3,19 +3,21 @@ data_config:
   val_file: dev.json
   test_file: dev.json
   num_proc: 16
-max_input_length: 128
-max_output_length: 256
+max_input_length: 256
+max_output_length: 512
 training_args:
   # see `transformers.Seq2SeqTrainingArguments`
   output_dir: ./output
   max_steps: 3000
+  # needed to be fit for the dataset
+  learning_rate: 5e-5
   # settings for data loading
-  per_device_train_batch_size: 1
+  per_device_train_batch_size: 4
   dataloader_num_workers: 16
   remove_unused_columns: false
   # settings for saving checkpoints
   save_strategy: steps
-  save_steps: 500
+  save_steps: 2000
   # settings for logging
   log_level: info
   logging_strategy: steps
@@ -31,7 +33,7 @@ training_args:
   predict_with_generate: true
   # see `transformers.GenerationConfig`
   generation_config:
-    max_new_tokens: 256
+    max_new_tokens: 512
   # set your absolute deepspeed path here
   #deepspeed: ds_zero_2.json
   # set to true if train with cpu.
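
The training_args block in these yaml files maps onto `transformers.Seq2SeqTrainingArguments`, as the in-file comment notes. Below is a minimal sketch of how a config laid out like the post-commit lora.yaml could be turned into that object; the yaml path, the loading code, and the learning-rate cast are illustrative assumptions, not the repo's actual finetune script.

```python
# Sketch only: assumes a lora.yaml laid out as in the diff above; this is not
# the repo's finetune script, just an illustration of the yaml -> arguments path.
import yaml
from transformers import GenerationConfig, Seq2SeqTrainingArguments

with open("finetune_demo/configs/lora.yaml") as f:  # hypothetical relative path
    config = yaml.safe_load(f)

train_cfg = dict(config["training_args"])
# PyYAML reads "5e-5" (no decimal point) as a string, so cast it explicitly.
train_cfg["learning_rate"] = float(train_cfg["learning_rate"])
# generation_config is nested under training_args in the yaml; pass it as a
# GenerationConfig instance so predict_with_generate uses max_new_tokens: 512.
gen_cfg = GenerationConfig(**train_cfg.pop("generation_config"))

training_args = Seq2SeqTrainingArguments(generation_config=gen_cfg, **train_cfg)
print(training_args.per_device_train_batch_size)  # 4 after this commit
print(training_args.save_steps)                   # 2000 after this commit
```

With save_steps raised from 500 to 2000, a 3000-step run writes far fewer intermediate checkpoints than before.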
finetune_demo/configs/ptuning_v2.yaml: 2 changes (2 additions & 0 deletions)
@@ -9,6 +9,8 @@ training_args:
   # see `transformers.Seq2SeqTrainingArguments`
   output_dir: ./output
   max_steps: 3000
+  # needed to be fit for the dataset
+  learning_rate: 5e-5
   # settings for data loading
   per_device_train_batch_size: 4
   dataloader_num_workers: 16
finetune_demo/configs/sft.yaml: 10 changes (6 additions & 4 deletions)
@@ -3,14 +3,16 @@ data_config:
   val_file: dev.json
   test_file: dev.json
   num_proc: 16
-max_input_length: 128
-max_output_length: 256
+max_input_length: 256
+max_output_length: 512
 training_args:
   # see `transformers.Seq2SeqTrainingArguments`
   output_dir: ./output
   max_steps: 3000
+  # needed to be fit for the dataset
+  learning_rate: 5e-5
   # settings for data loading
-  per_device_train_batch_size: 1
+  per_device_train_batch_size: 4
   dataloader_num_workers: 16
   remove_unused_columns: false
   # settings for saving checkpoints
@@ -30,6 +32,6 @@ training_args:
   # debug: underflow_overflow
   predict_with_generate: true
   generation_config:
-    max_new_tokens: 256
+    max_new_tokens: 512
   # set your absolute deepspeed path here
   deepspeed: ds_zero_3.json
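
All three configs raise max_input_length/max_output_length to 256/512 and max_new_tokens to 512. Below is a rough sketch of how such length caps are typically enforced when a sample is tokenized; the model id, the helper name, and the masking scheme are illustrative assumptions rather than code from finetune_demo.

```python
# Sketch only: shows how caps like max_input_length / max_output_length are
# commonly applied during preprocessing; not the repo's actual data pipeline.
from transformers import AutoTokenizer

MAX_INPUT_LENGTH = 256   # raised from 128 in this commit
MAX_OUTPUT_LENGTH = 512  # raised from 256 in this commit

# Hypothetical model id; any chat model tokenizer would do for the illustration.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)

def build_example(prompt: str, response: str) -> dict:
    """Truncate prompt and response to the configured caps, then concatenate."""
    input_ids = tokenizer.encode(prompt, add_special_tokens=False)[:MAX_INPUT_LENGTH]
    label_ids = tokenizer.encode(response, add_special_tokens=False)[:MAX_OUTPUT_LENGTH]
    # Loss is usually computed only on response tokens; prompt positions get -100.
    return {
        "input_ids": input_ids + label_ids,
        "labels": [-100] * len(input_ids) + label_ids,
    }
```

Raising max_new_tokens to 512 alongside max_output_length keeps generation during evaluation in line with the longer training targets.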
