fix doc (modelscope#1875)
tastelikefeet authored Aug 31, 2024
1 parent 923c7d8 commit 469d44c
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion docs/source/LLM/命令行参数.md
@@ -79,7 +79,7 @@
- `--use_dora`: Default is `False`; whether to use `DoRA`.
- `--use_rslora`: Default is `False`; whether to use `RS-LoRA`.
- `--neftune_noise_alpha`: The noise coefficient added by `NEFTune`, which can improve model performance during instruction fine-tuning; default is `None`. Typical values are 5, 10, or 15. See the [related paper](https://arxiv.org/abs/2310.05914).
-- `--neftune_backend`: The backend for `NEFTune`; the `transformers` library is used by default. Incompatibilities may occur when training VL models, in which case specifying `swift` is recommended.
+- `--neftune_backend`: The backend for `NEFTune`; supports both `transformers` and `swift`, default is `transformers`.
- `--gradient_checkpointing`: Whether to enable gradient checkpointing; default is `True`. This parameter can be used to save GPU memory at the cost of slightly slower training; its effect is most significant when max_length and batch_size are large.
- `--deepspeed`: The path to a deepspeed configuration file, or the configuration passed directly in JSON format; default is `None`, i.e. deepspeed is not enabled. Deepspeed can save GPU memory. We provide default [ZeRO-2 configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero2.json), [ZeRO-3 configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero3.json), [ZeRO-2 Offload configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero2_offload.json) and [ZeRO-3 Offload configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero3_offload.json); you only need to specify 'default-zero2', 'default-zero3', 'zero2-offload', or 'zero3-offload'.
- `--batch_size`: The batch_size during training; default is `1`. Increasing batch_size raises GPU utilization but does not necessarily speed up training, because shorter sentences in a batch are padded to the length of the longest sentence in that batch, introducing wasted computation. (A combined invocation sketch follows this list.)
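
The flags above compose into a single `swift sft` call. Below is a minimal sketch combining the NEFTune and checkpointing parameters documented in this list; the model and dataset names (`qwen-7b-chat`, `alpaca-zh`) are illustrative placeholders, not taken from this doc.

```bash
# Minimal sketch: instruction fine-tuning with NEFTune noise enabled.
# Model/dataset identifiers are assumed placeholders; substitute your own.
swift sft \
    --model_type qwen-7b-chat \
    --dataset alpaca-zh \
    --use_rslora true \
    --neftune_noise_alpha 5 \
    --neftune_backend transformers \
    --gradient_checkpointing true \
    --batch_size 1
```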
2 changes: 1 addition & 1 deletion docs/source_en/LLM/Command-line-parameters.md
@@ -80,7 +80,7 @@
- `--use_dora`: Default is `False`; whether to use `DoRA`.
- `--use_rslora`: Default is `False`; whether to use `RS-LoRA`.
- `--neftune_noise_alpha`: The noise coefficient added by `NEFTune`, which can improve performance during instruction fine-tuning; default is `None`. Typical values are 5, 10, or 15. See the [related paper](https://arxiv.org/abs/2310.05914).
-- `--neftune_backend`: The backend for `NEFTune`; the `transformers` library is used by default. Incompatibility may occur when training VL models, in which case it is recommended to specify `swift`.
+- `--neftune_backend`: The backend for `NEFTune`; supported values are `transformers` and `swift`, default is `transformers`.
- `--gradient_checkpointing`: Whether to enable gradient checkpointing; default is `True`. This can be used to save memory, although it slightly reduces training speed; the effect is most significant when max_length and batch_size are large.
- `--deepspeed`: Used to specify the path to a deepspeed configuration file or to pass JSON-formatted configuration directly; default is `None`, which means deepspeed is not enabled. Deepspeed can save GPU memory. We provide default [ZeRO-2 configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero2.json), [ZeRO-3 configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero3.json), [ZeRO-2 Offload configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero2_offload.json), and [ZeRO-3 Offload configuration file](https://github.com/modelscope/swift/blob/main/swift/llm/ds_config/zero3_offload.json). You only need to specify 'default-zero2', 'default-zero3', 'zero2-offload', or 'zero3-offload'.
- `--batch_size`: Batch size during training; default is `1`. Increasing batch_size can improve GPU utilization but will not necessarily improve training speed, because shorter sentences within a batch are padded to the length of the longest sentence in the batch, introducing wasted computation. (A deepspeed invocation sketch follows this list.)
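
Complementing the sketch above, the following is a minimal example of enabling one of the bundled ZeRO configs by its alias rather than by file path; again, the model and dataset names are assumed placeholders.

```bash
# Minimal sketch: enable the bundled ZeRO-2 config via its alias.
# Model/dataset identifiers are assumed placeholders; substitute your own.
swift sft \
    --model_type qwen-7b-chat \
    --dataset alpaca-en \
    --gradient_checkpointing true \
    --batch_size 1 \
    --deepspeed default-zero2
```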
