#4119
Describe the bug
Running this script raises an error.
35GiB
CUDA_VISIBLE_DEVICES=0 \
swift sft \
    --model '/mnt/workspace/Qwen2.5-7B-Instruct' \
    --train_type full \
    --dataset '/mnt/workspace/dataset/traindata-self.json' \
    --agent_template hermes \
    --torch_dtype bfloat16 \
    --num_train_epochs 2 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-5 \
    --gradient_accumulation_steps 8 \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 8192 \
    --save_only_model true \
    --packing true \
    --use_liger_kernel true \
    --output_dir output \
    --warmup_ratio 0.05 \
    --attn_impl flash_attn \
    --dataloader_num_workers 4 \
    --dataset_num_proc 16
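For anyone reproducing this, the batch-related flags in the command above imply the following effective batch size and upper bound on tokens per optimizer step. This is only a rough sketch: it assumes a single GPU (inferred from `CUDA_VISIBLE_DEVICES=0`) and that with `--packing true` each packed sequence holds at most `--max_length` tokens. The variable names are illustrative and not part of swift.

```python
# Values copied from the command above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 8
max_length = 8192

# Assumption: CUDA_VISIBLE_DEVICES=0 exposes exactly one GPU.
num_gpus = 1

# Sequences contributing to one optimizer step.
sequences_per_step = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

# Upper bound on tokens per optimizer step, assuming every packed
# sequence is filled up to max_length.
max_tokens_per_step = sequences_per_step * max_length

print(f"sequences per optimizer step: {sequences_per_step}")
print(f"max tokens per optimizer step: {max_tokens_per_step}")
```

Only one sequence of up to 8192 tokens is processed per forward/backward pass, so under these assumptions gradient accumulation changes the effective batch but not the per-pass activation footprint.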
Your hardware and system info
transformers 4.50.0.dev0