
Relation between scaling weights of paper and implementation #59

Closed
francois-rozet opened this issue Jan 15, 2021 · 7 comments

francois-rozet commented Jan 15, 2021

Hello @richzhang,

In the LPIPS paper, the 1x1 scaling convolution is applied to the difference of the activations before the squaring.

d(x, x_0) = \sum_l \frac{1}{H_l W_l} \sum_{h,w} \left\| w_l \odot (\hat{y}^l_{hw} - \hat{y}^l_{0hw}) \right\|_2^2    (Eq. 1 of the paper)

But in the implementation, the difference of the activations is first squared and then scaled.

diffs[kk] = (feats0[kk]-feats1[kk])**2  # square the difference of the (normalized) activations
...
self.lin[kk](diffs[kk])                 # then scale with the learned 1x1 convolution

Is this a mistake? If so, is it in the paper or in the implementation?


richzhang commented Jan 15, 2021

Yes, the weights in the implementation correspond to w^2 in the paper. Fig 10 is also plotting w^2
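
Concretely, if w denotes the per-channel weights of the paper, the implementation's weights hold w^2 and the two orderings give the same distance. A minimal numerical sketch of this (hypothetical code, not from the repository):

import torch

C, H, W = 64, 28, 28
y, y0 = torch.randn(C, H, W), torch.randn(C, H, W)  # activations of one layer for the two inputs
w = torch.rand(C, 1, 1)                             # per-channel weights w of the paper

# paper ordering: scale the difference by w, then square and sum over channels
paper = ((w * (y - y0)) ** 2).sum(dim=0)

# implementation ordering: square the difference, then scale by w^2 and sum over channels
impl = ((w ** 2) * (y - y0) ** 2).sum(dim=0)

print(torch.allclose(paper, impl))  # True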


francois-rozet commented Jan 15, 2021

Thanks @richzhang! Therefore, if I am not mistaken, the convolution could even be performed after the averaging.

Like,

[formula image]

or even

[formula image]


richzhang commented Jan 15, 2021

You cannot collapse the channel direction before multiplying by w (which is scaling each channel).

In other words, any of these are fine:
(w y - w yhat)^2
= (w (y-yhat))^2
= w^2 (y - yhat)^2

Hope that makes sense

francois-rozet commented

Ah yes sure, my notation isn't very accurate. The norm and MSE should be spatial only.

richzhang commented

Great, yes that seems correct then! (also add a sum over channel direction, in front of w_l)

francois-rozet commented


Oh my bad, I meant to write a dot product, not an element-wise product.

d(x, x_0) = \sum_l w_l^\top \left[ \frac{1}{H_l W_l} \sum_{h,w} (\hat{y}^l_{hw} - \hat{y}^l_{0hw})^2 \right]

(where the square is taken element-wise and w_l^\top denotes the dot product over channels)

In fact, it means that the nn.Conv2d could be replaced by an nn.Linear with the same weights. It might also make the code a bit faster, since the product is then performed on a single channel vector per layer.
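
As an illustration, a minimal sketch of that equivalence for a single layer (hypothetical code, not the repository's): a 1x1 nn.Conv2d followed by spatial averaging gives the same result as spatial averaging followed by an nn.Linear carrying the same weights.

import torch

C, H, W = 64, 28, 28
diff_sq = (torch.randn(1, C, H, W) - torch.randn(1, C, H, W)) ** 2  # squared activation differences

conv = torch.nn.Conv2d(C, 1, kernel_size=1, bias=False)  # 1x1 scaling convolution
lin = torch.nn.Linear(C, 1, bias=False)
lin.weight.data = conv.weight.data.view(1, C)            # same weights, reshaped for the linear layer

a = conv(diff_sq).mean(dim=(2, 3))      # scale each position, then average spatially
b = lin(diff_sq.mean(dim=(2, 3)))       # average spatially, then take the dot product

print(torch.allclose(a, b, atol=1e-6))  # True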

francois-rozet changed the title from "Implementation doesn't match paper" to "Relation between scaling weights of paper and implementation" on Jan 15, 2021
francois-rozet commented

Thank you, you can close the issue 👍
