diff --git a/chapter_recurrent-modern/lstm.md b/chapter_recurrent-modern/lstm.md index ba26d4bbb3..e80b4eb217 100644 --- a/chapter_recurrent-modern/lstm.md +++ b/chapter_recurrent-modern/lstm.md @@ -9,7 +9,7 @@ became salient, with Bengio and Hochreiter discussing the problem :cite:`bengio1994learning,Hochreiter.Bengio.Frasconi.ea.2001`. Hochreiter had articulated this problem as early -as in his 1991 masters thesis, although the results +as 1991 in his masters thesis, although the results were not widely known because the thesis was written in German. While gradient clipping helps with exploding gradients, handling vanishing gradients appears