diff --git a/docs/en/week11/11-2.md b/docs/en/week11/11-2.md
index a67419b02..a6f7d5386 100644
--- a/docs/en/week11/11-2.md
+++ b/docs/en/week11/11-2.md
@@ -83,15 +83,13 @@ This margin-base loss allows for different inputs to have variable amounts of ta
 ### Hinge Embedding Loss - `nn.HingeEmbeddingLoss()`
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- x_n, &\quad y_n=1, \\
- \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
- \end{array}
+ \begin{array}{lr}
+ x_n, &\quad y_n=1, \\
+ \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
+ \end{array}
 \right.
-\end{equation}
 $$
 
 Hinge embedding loss used for semi-supervised learning by measuring whether two inputs are similar or dissimilar. It pulls together things that are similar and pushes away things are dissimilar. The $y$ variable indicates whether the pair of scores need to go in a certain direction. Using a hinge loss, the score is positive if $y$ is 1 and some margin $\Delta$ if $y$ is -1.
@@ -100,15 +98,13 @@ Hinge embedding loss used for semi-supervised learning by measuring whether two
 ### Cosine Embedding Loss - `nn.CosineEmbeddingLoss()`
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- 1-\cos(x_1,x_2), & \quad y=1, \\
- \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
- \end{array}
+ \begin{array}{lr}
+ 1-\cos(x_1,x_2), & \quad y=1, \\
+ \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
+ \end{array}
 \right.
-\end{equation}
 $$
 
 This loss is used for measuring whether two inputs are similar or dissimilar, using the cosine distance, and is typically used for learning nonlinear embeddings or semi-supervised learning.
diff --git a/docs/en/week15/15-2.md b/docs/en/week15/15-2.md
index e1502adc6..ea312ed0c 100644
--- a/docs/en/week15/15-2.md
+++ b/docs/en/week15/15-2.md
@@ -118,7 +118,7 @@ In technical terms, if free energy is
 
 Objective - Finding a well behaved energy function
 
-A loss functional, minimized during learning, is used to measure the quality of the available energy functions. In simple terms, loss functional is a scalar function that tells us how good our energy function is. A distinction should be made between the energy function, which is minimized by the inference process, and the loss functional (introduced in Section 2), which is minimized by the learning process.
+A loss functional, minimized during learning, is used to measure the quality of the available energy functions. In simple terms, loss functional is a scalar function that tells us how good our energy function is. A distinction should be made between the energy function, which is minimized by the inference process, and the loss functional, which is minimized by the learning process.
 
 $$\mathcal{L}(F(\cdot),Y) = \frac{1}{N} \sum_{n=1}^{N} l(F(\cdot),\vect{y}^{(n)}) \in \R$$
 
diff --git a/docs/es/week11/11-2.md b/docs/es/week11/11-2.md
index e707e010e..ac835dc2b 100644
--- a/docs/es/week11/11-2.md
+++ b/docs/es/week11/11-2.md
@@ -200,17 +200,16 @@ $$
 -->
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- x_n, &\quad y_n=1, \\
- \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
- \end{array}
+ \begin{array}{lr}
+ x_n, &\quad y_n=1, \\
+ \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
+ \end{array}
 \right.
-\end{equation}
 $$
+
@@ -237,15 +236,13 @@ $$
 -->
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- 1-\cos(x_1,x_2), & \quad y=1, \\
- \max(0,\cos(x_1,x_2)-\text{margen}), & \quad y=-1
- \end{array}
+ \begin{array}{lr}
+ 1-\cos(x_1,x_2), & \quad y=1, \\
+ \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
+ \end{array}
 \right.
-\end{equation}
 $$
 
-## Pérdida de Margen Generalizado
+## Pérdida de Margen Generalizado
@@ -118,15 +116,13 @@ $$
 ### Cosine Embedding Loss - `nn.CosineEmbeddingLoss()`
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- 1-\cos(x_1,x_2), & \quad y=1, \\
- \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
- \end{array}
+ \begin{array}{lr}
+ 1-\cos(x_1,x_2), & \quad y=1, \\
+ \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
+ \end{array}
 \right.
-\end{equation}
 $$
 
 ここで、$bar Y^i$は、*最も問題のある不正解*です。この損失は、正解と最も問題のある不正解の差が少なくとも$m$であることを要請します。
diff --git a/docs/ko/week11/11-2.md b/docs/ko/week11/11-2.md
index d8ae044c1..afdfef19a 100644
--- a/docs/ko/week11/11-2.md
+++ b/docs/ko/week11/11-2.md
@@ -106,15 +106,13 @@ $$
 ### 힌지 임베딩 손실 - `nn.HingeEmbeddingLoss()`
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- x_n, &\quad y_n=1, \\
- \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
- \end{array}
+ \begin{array}{lr}
+ x_n, &\quad y_n=1, \\
+ \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
+ \end{array}
 \right.
-\end{equation}
 $$
@@ -125,15 +123,13 @@ $$
 ### 코사인 임베딩 손실 - `nn.CosineEmbeddingLoss()`
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- 1-\cos(x_1,x_2), & \quad y=1, \\
- \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
- \end{array}
+ \begin{array}{lr}
+ 1-\cos(x_1,x_2), & \quad y=1, \\
+ \max(0,\cos(x_1,x_2)-\text{margin}), & \quad y=-1
+ \end{array}
 \right.
-\end{equation}
 $$
@@ -438,4 +434,4 @@
-우리는 $Y$가 이산이지만, 만약 연속이라면 그 합은 적분으로 바뀔 것이라 가정한다. 여기서 $E(W, Y^i,X^i)-E(W,y,X^i)$는 정답과 어떤 다른 답에서 측정된 $E$의 차이이다. $C(Y^i,y)$는 마진이며 일반적으로 $Y^i$와 $y$ 사이의 거리값을 나타낸다. 동기는 우리가 잘못된 표본 $y$에서 밀어올리고 싶은 합이 $y$와 옳은 표본 $Y_i$ 사이 거리에 따라 달라져야 한다는 것이다. 이것은 최적화하는데 더 어려운 손실이 될 수 있는 것이다.
\ No newline at end of file
+우리는 $Y$가 이산이지만, 만약 연속이라면 그 합은 적분으로 바뀔 것이라 가정한다. 여기서 $E(W, Y^i,X^i)-E(W,y,X^i)$는 정답과 어떤 다른 답에서 측정된 $E$의 차이이다. $C(Y^i,y)$는 마진이며 일반적으로 $Y^i$와 $y$ 사이의 거리값을 나타낸다. 동기는 우리가 잘못된 표본 $y$에서 밀어올리고 싶은 합이 $y$와 옳은 표본 $Y_i$ 사이 거리에 따라 달라져야 한다는 것이다. 이것은 최적화하는데 더 어려운 손실이 될 수 있는 것이다.
diff --git a/docs/tr/week11/11-2.md b/docs/tr/week11/11-2.md
index 51130e31f..9ee1e5736 100644
--- a/docs/tr/week11/11-2.md
+++ b/docs/tr/week11/11-2.md
@@ -141,7 +141,6 @@ Bu marj bazlı kayıp terimi, farklı girdilerin değişken miktarlarda hedefler
 ### Hinge Gömü Kayıp Terimi (Hinge Embedding Loss) - `nn.HingeEmbeddingLoss()`
 
 $$
-\begin{equation}
 l_n =
 \left\{
- \begin{array}{lr}
- x_n, &\quad y_n=1, \\
- \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
- \end{array}
+ \begin{array}{lr}
+ x_n, &\quad y_n=1, \\
+ \max\{0,\Delta-x_n\}, &\quad y_n=-1 \\
+ \end{array}
 \right.
-\end{equation}
 $$
 
 İki girdinin benzer veya farklı olup olmadığını ölçmek için yarı denetimli öğrenmede kullanılan bir gömme kaybıdır. Benzer olan şeyleri bir araya getirir ve farklı olan şeyleri uzaklaştırır. $y$ değişkeni, puan çiftinin belirli bir yönde gitmesi gerekip gerekmediğini gösterir. Hinge kaybı kullanıldığında, puan $y$ 1 ise pozitif, $y$ -1 ise $\Delta$ marjı elde edilir.
@@ -172,15 +168,13 @@
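
For reference, not part of the patch itself: a minimal PyTorch sketch of the two losses whose formulas the patch reformats, `nn.HingeEmbeddingLoss` and `nn.CosineEmbeddingLoss`. The tensor values below are made up purely for illustration.

```python
import torch
import torch.nn as nn

# Hinge embedding loss: l_n = x_n if y_n = 1, and max(0, Delta - x_n) if y_n = -1.
hinge = nn.HingeEmbeddingLoss(margin=1.0, reduction="none")
x = torch.tensor([0.3, 0.3])      # e.g. distances between pairs of embeddings (made-up values)
y = torch.tensor([1.0, -1.0])     # 1 = similar pair, -1 = dissimilar pair
print(hinge(x, y))                # tensor([0.3000, 0.7000])

# Cosine embedding loss: 1 - cos(x1, x2) if y = 1, and max(0, cos(x1, x2) - margin) if y = -1.
cosine = nn.CosineEmbeddingLoss(margin=0.0, reduction="none")
x1, x2 = torch.randn(2, 5), torch.randn(2, 5)   # two batches of 5-dimensional embeddings
print(cosine(x1, x2, y))          # per-pair losses driven by cosine similarity
```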