Batch Norm momentum for MaxViT #2099
-
Hello, I'm trying to re-implement MaxViT for learning. Referencing both the original repo and your code, I noticed that the original repo uses Keras Batch Norm momentum=0.99 (equivalent to PyTorch Batch Norm momentum=0.01), while your code uses PyTorch's default of 0.1. This does not change inference results, but it may affect fine-tuning and training-from-scratch behavior, though probably not significantly. May I know whether you are aware of this small difference, and whether you decided to ignore it (since it's probably not that important)?
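
For context, a minimal sketch of the convention difference and a plain-PyTorch helper to override BatchNorm momentum on an existing model. The helper name and values here are just illustrative, not part of timm or the original repo:

```python
import torch.nn as nn

# The two frameworks use opposite momentum conventions for the running stats:
#   Keras:   running = momentum * running + (1 - momentum) * batch_stat   (momentum=0.99)
#   PyTorch: running = (1 - momentum) * running + momentum * batch_stat   (momentum=0.01)
# So Keras momentum=0.99 and PyTorch momentum=0.01 apply the same update.

def set_bn_momentum(model: nn.Module, momentum: float = 0.01) -> None:
    """Set the momentum of every BatchNorm layer in a model (hypothetical helper)."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.momentum = momentum

# Usage with any PyTorch model (toy module shown here):
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
set_bn_momentum(model, 0.01)  # match the original repo's Keras momentum=0.99
```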
Replies: 1 comment
-
@gau-nernst I'm aware, but yeah, I didn't think it was worth changing from the PyTorch defaults. EPS values are covered since they have a material impact on a given set of weights, but momentum interacts more with the training/fine-tune hparams than with the weights. I've generally used hparams close to Swin when working with these models.