
Commit: minor fixes (rasbt#246)
* removed duplicated white spaces

* Update ch07/01_main-chapter-code/ch07.ipynb

* Update ch07/05_dataset-generation/llama3-ollama.ipynb

* removed duplicated white spaces

* fixed title again

---------

Co-authored-by: Sebastian Raschka <[email protected]>
d-kleine and rasbt authored Jun 25, 2024
1 parent 9a9b353 commit 81c843b
Showing 10 changed files with 19 additions and 19 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -115,7 +115,7 @@ Several folders contain optional materials as a bonus for interested readers:

### Citation

-If you find this book or code useful for your research, please consider citing it:
+If you find this book or code useful for your research, please consider citing it:

```
@book{build-llms-from-scratch-book,
4 changes: 2 additions & 2 deletions appendix-A/01_main-chapter-code/code-part1.ipynb
@@ -1263,7 +1263,7 @@
}
],
"source": [
"model = NeuralNetwork(2, 2) # needs to match the original model exactly\n",
"model = NeuralNetwork(2, 2) # needs to match the original model exactly\n",
"model.load_state_dict(torch.load(\"model.pth\"))"
]
},
@@ -1340,7 +1340,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.11"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion ch02/01_main-chapter-code/ch02.ipynb
@@ -710,7 +710,7 @@
"- `[UNK]` to represent works that are not included in the vocabulary\n",
"\n",
"- Note that GPT-2 does not need any of these tokens mentioned above but only uses an `<|endoftext|>` token to reduce complexity\n",
"- The `<|endoftext|>` is analogous to the `[EOS]` token mentioned above\n",
"- The `<|endoftext|>` is analogous to the `[EOS]` token mentioned above\n",
"- GPT also uses the `<|endoftext|>` for padding (since we typically use a mask when training on batched inputs, we would not attend padded tokens anyways, so it does not matter what these tokens are)\n",
"- GPT-2 does not use an `<UNK>` token for out-of-vocabulary words; instead, GPT-2 uses a byte-pair encoding (BPE) tokenizer, which breaks down words into subword units which we will discuss in a later section\n",
"\n"
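The tokenizer notes in the ch02.ipynb hunk above can be made concrete with a minimal sketch using OpenAI's `tiktoken` GPT-2 encoding (illustrative only, not part of this commit; it assumes `tiktoken` is installed). It shows that out-of-vocabulary words are split into subword units rather than mapped to an `<UNK>` token, and that `<|endoftext|>` must be explicitly allowed:

```python
import tiktoken

# GPT-2's BPE tokenizer; it needs no [UNK] token because any string
# can be decomposed into known subword/byte-level units
tokenizer = tiktoken.get_encoding("gpt2")

text = "Akwirw ier <|endoftext|> Hello, world."
ids = tokenizer.encode(text, allowed_special={"<|endoftext|>"})

print(ids)                    # the made-up word is split into several subword ids
print(tokenizer.decode(ids))  # decoding round-trips back to the original text
```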
4 changes: 2 additions & 2 deletions ch04/01_main-chapter-code/ch04.ipynb
@@ -520,7 +520,7 @@
"- Note that we also add a smaller value (`eps`) before computing the square root of the variance; this is to avoid division-by-zero errors if the variance is 0\n",
"\n",
"**Biased variance**\n",
"- In the variance calculation above, setting `unbiased=False` means using the formula $\\frac{\\sum_i (x_i - \\bar{x})^2}{n}$ to compute the variance where n is the sample size (here, the number of features or columns); this formula does not include Bessel's correction (which uses `n-1` in the denominator), thus providing a biased estimate of the variance \n",
"- In the variance calculation above, setting `unbiased=False` means using the formula $\\frac{\\sum_i (x_i - \\bar{x})^2}{n}$ to compute the variance where n is the sample size (here, the number of features or columns); this formula does not include Bessel's correction (which uses `n-1` in the denominator), thus providing a biased estimate of the variance \n",
"- For LLMs, where the embedding dimension `n` is very large, the difference between using n and `n-1`\n",
" is negligible\n",
"- However, GPT-2 was trained with a biased variance in the normalization layers, which is why we also adopted this setting for compatibility reasons with the pretrained weights that we will load in later chapters\n",
@@ -1498,7 +1498,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.11"
}
},
"nbformat": 4,
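The biased-variance note in the ch04.ipynb hunk above can be checked numerically with a short sketch (illustrative only, not part of this commit; it assumes PyTorch). It compares `unbiased=False` (divide by n) with the default Bessel-corrected estimate (divide by n-1):

```python
import torch

torch.manual_seed(123)
x = torch.randn(2, 5)  # 2 examples with n = 5 features each

var_biased = x.var(dim=-1, unbiased=False)  # divides by n
var_bessel = x.var(dim=-1, unbiased=True)   # divides by n - 1 (Bessel's correction)

# the two estimates differ only by the factor (n - 1) / n,
# which approaches 1 as the feature dimension n grows
n = x.shape[-1]
print(torch.allclose(var_biased, var_bessel * (n - 1) / n))  # True
```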
2 changes: 1 addition & 1 deletion ch04/02_performance-analysis/flops-analysis.ipynb
@@ -31,7 +31,7 @@
"metadata": {},
"source": [
"- FLOPs (Floating Point Operations Per Second) measure the computational complexity of neural network models by counting the number of floating-point operations executed\n",
"- High FLOPs indicate more intensive computation and energy consumption"
"- High FLOPs indicate more intensive computation and energy consumption"
]
},
{
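As a rough illustration of the FLOPs counting mentioned in the flops-analysis.ipynb hunk above, the following sketch (not part of this commit) assumes the third-party `thop` package, which reports multiply-accumulate operations (MACs); FLOPs are then commonly approximated as 2 × MACs:

```python
import torch
from thop import profile  # assumes `pip install thop`

model = torch.nn.Linear(768, 768)    # a stand-in layer, not the GPT model itself
example_input = torch.randn(1, 768)

# profile returns the MAC count and the number of parameters
macs, params = profile(model, inputs=(example_input,))
print(f"MACs: {macs:,.0f}, approx. FLOPs: {2 * macs:,.0f}, parameters: {params:,.0f}")
```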
4 changes: 2 additions & 2 deletions ch05/01_main-chapter-code/ch05.ipynb
@@ -1959,7 +1959,7 @@
"id": "10e4c7f9-592f-43d6-a00e-598fa01dfb82",
"metadata": {},
"source": [
"- The recommended way in PyTorch is to save the model weights, the so-called `state_dict` via by applying the `torch.save` function to the `.state_dict()` method:"
"- The recommended way in PyTorch is to save the model weights, the so-called `state_dict` via by applying the `torch.save` function to the `.state_dict()` method:"
]
},
{
@@ -2458,7 +2458,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.11"
}
},
"nbformat": 4,
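The `state_dict` recommendation in the ch05.ipynb hunk above corresponds to the usual save/load pattern sketched below (illustrative only, not part of this commit; the toy `Sequential` model stands in for the chapter's GPT model):

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 1))

# save only the learned parameters (the state_dict), not the whole model object
torch.save(model.state_dict(), "model.pth")

# to restore, re-create the same architecture and load the weights into it
restored = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 1))
restored.load_state_dict(torch.load("model.pth"))
restored.eval()  # switch to inference mode (e.g., disables dropout)
```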
12 changes: 6 additions & 6 deletions ch07/01_main-chapter-code/ch07.ipynb

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions ch07/01_main-chapter-code/exercise-solutions.ipynb
@@ -267,7 +267,7 @@
"Model saved as gpt2-medium355M-sft-phi3-prompt.pth\n",
"```\n",
"\n",
"For comparison, you can run the original chapter 7 finetuning code via `python exercise_experiments.py --exercise_solution baseline`. \n",
"For comparison, you can run the original chapter 7 finetuning code via `python exercise_experiments.py --exercise_solution baseline`. \n",
"\n",
"Note that on an Nvidia L4 GPU, the code above, using the Phi-3 prompt template, takes 1.5 min to run. In comparison, the Alpaca-style template takes 1.80 minutes to run. So, the Phi-3 template is approximately 17% faster since it results in shorter model inputs. \n",
"\n",
@@ -954,7 +954,7 @@
"Model saved as gpt2-medium355M-sft-lora.pth\n",
"```\n",
"\n",
"For comparison, you can run the original chapter 7 finetuning code via `python exercise_experiments.py --exercise_solution baseline`. \n",
"For comparison, you can run the original chapter 7 finetuning code via `python exercise_experiments.py --exercise_solution baseline`. \n",
"\n",
"Note that on an Nvidia L4 GPU, the code above, using LoRA, takes 1.30 min to run. In comparison, the baseline takes 1.80 minutes to run. So, LoRA is approximately 28% faster.\n",
"\n",
2 changes: 1 addition & 1 deletion ch07/03_model-evaluation/llm-instruction-eval-ollama.ipynb
@@ -138,7 +138,7 @@
"\n",
"- After the download has been completed, you will see a command line prompt that allows you to chat with the model\n",
"\n",
"- Try a prompt like \"What do llamas eat?\", which should return an output similar to the following:\n",
"- Try a prompt like \"What do llamas eat?\", which should return an output similar to the following:\n",
"\n",
"```\n",
">>> What do llamas eat?\n",
2 changes: 1 addition & 1 deletion ch07/05_dataset-generation/llama3-ollama.ipynb
@@ -139,7 +139,7 @@
"\n",
"- After the download has been completed, you will see a command line prompt that allows you to chat with the model\n",
"\n",
"- Try a prompt like \"What do llamas eat?\", which should return an output similar to the following:\n",
"- Try a prompt like \"What do llamas eat?\", which should return an output similar to the following:\n",
"\n",
"```\n",
">>> What do llamas eat?\n",
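For the Ollama setup described in the llama3-ollama.ipynb hunk above, the model can also be queried programmatically instead of through the interactive prompt. The sketch below (not part of this commit) assumes a local Ollama server listening on its default port 11434 with the `llama3` model already pulled:

```python
import json
import urllib.request

def query_model(prompt, model="llama3", url="http://localhost:11434/api/chat"):
    """Send a single chat message to a locally running Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return the complete response in one JSON object
    }
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), method="POST"
    )
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        result = json.loads(response.read().decode("utf-8"))
    return result["message"]["content"]

print(query_model("What do llamas eat?"))
```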
