more changes to attention and visuals
jamescalam committed Apr 10, 2021
1 parent 8fd24ee commit 1ff8a88
Showing 16 changed files with 44 additions and 4 deletions.
Binary file added assets/images/RECOVER_multihead_attention.fla
Binary file added assets/images/RECOVER_three_eras.fla
Binary file added assets/images/attention_overview.fla
Binary file added assets/images/attention_overview.png
Binary file added assets/images/bidirectional_attention.fla
Binary file added assets/images/bidirectional_attention.png
Binary file added assets/images/multihead_attention.fla
Binary file added assets/images/multihead_attention.png
Binary file added assets/images/multihead_attention_highlevel.fla
Binary file added assets/images/self_attention.fla
Binary file added assets/images/self_attention.png
Binary file added assets/images/three_eras.fla
Binary file added assets/images/three_eras.png
40 changes: 40 additions & 0 deletions course/attention/02_self_attention.ipynb
@@ -0,0 +1,40 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Self Attention\n",
"\n",
"With dot-product attention, we calculated the alignment between word vectors from two different sequences - perfect for translation. Self-attention takes a different approach, here we compare words to previous words in the *same sequence*. So, where with dot-product attention we took our queries **Q** and keys **K** from two different sequences, self-attention takes them from the same sequence. Transformer models that look at previous tokens and try to predict the next include both text generation, and summarization.\n",
"\n",
"So, just like before with dot-product attention, we calculate the dot-product again - this time taking **Q** and **K** from the same sequence.\n",
"\n",
"![Self attention visual](../../assets/images/self_attention.png)\n",
"\n",
"After calculating the dot-product across all items in the sequence, we apply a mask to remove all values calculated for future words - leaving us with the dot-product between past words only. Next, we take the softmax just as before, and multiply the result by **V** to get our attention **Z**."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ML",
"language": "python",
"name": "ml"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
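The masked self-attention step described in the notebook cell above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the course notebooks: the toy dimensions, random inputs, the scaling by the key dimension, and the omission of the learned **Q**/**K**/**V** projection matrices are all assumptions made for brevity.

```python
import numpy as np

np.random.seed(0)
seq_len, d_k = 4, 8                         # toy sizes: 4 tokens, 8-dim vectors

x = np.random.randn(seq_len, d_k)           # token vectors from one sequence
Q, K, V = x, x, x                           # self-attention: Q, K, V from the same sequence

scores = Q @ K.T / np.sqrt(d_k)             # dot-product alignment, scaled

# causal mask: block attention to future positions (upper triangle above the diagonal)
mask = np.triu(np.ones((seq_len, seq_len)), k=1).astype(bool)
scores[mask] = -np.inf

# softmax over the unmasked (past) positions only
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

Z = weights @ V                             # attention output
print(Z.shape)                              # (4, 8)
```

Each row of `weights` sums to one and assigns zero weight to later tokens, so row *i* of `Z` is a weighted mix of the value vectors for tokens 0..*i* only.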
8 changes: 4 additions & 4 deletions course/attention/04_multihead_attention.ipynb
@@ -6,13 +6,13 @@
"source": [
"# Multihead Attention\n",
"\n",
"Multihead attention allows us to build several representations of attention between words - so rather than calculating attention once, we calculate it several times, concatenate the results, and pass them through a linear layer. In a transformer model it would look like this:\n",
"\n",
"Tk image\n",
"\n",
"![Flow in multihead attention](../../assets/images/multihead_attention.png)\n",
"And if we were to look at the multi-head attention segment in more detail we would see this:\n",
"\n",
"## From Scratch in Numpy\n",
"\n",
"TK work through example in Numpy"
"![Flow in multihead attention](../../assets/images/multihead_attention.png)"
]
},
{
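For reference, the multi-head pattern described in the hunk above - several attention heads computed over the same input, concatenated, then passed through a linear layer - can be sketched in NumPy as follows. This is an illustrative sketch under assumed toy dimensions and randomly initialised weight matrices; it is not the notebook's own implementation.

```python
import numpy as np

def attention(q, k, v):
    # scaled dot-product attention followed by a row-wise softmax
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

np.random.seed(0)
seq_len, d_model, n_heads = 4, 16, 4
d_head = d_model // n_heads

x = np.random.randn(seq_len, d_model)        # one input sequence

heads = []
for _ in range(n_heads):
    # per-head projections of the same input (random placeholders here)
    w_q = np.random.randn(d_model, d_head)
    w_k = np.random.randn(d_model, d_head)
    w_v = np.random.randn(d_model, d_head)
    heads.append(attention(x @ w_q, x @ w_k, x @ w_v))

concat = np.concatenate(heads, axis=-1)      # (seq_len, d_model)
w_o = np.random.randn(d_model, d_model)      # final linear layer
z = concat @ w_o
print(z.shape)                               # (4, 16)
```

Each head attends over the full sequence with its own projections, so the concatenated output carries several independent "views" of the relationships between words before the final linear layer mixes them.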
