# Antialiased CNNs [Project Page] [Paper] [Talk]

**Making Convolutional Networks Shift-Invariant Again**

Richard Zhang. In ICML, 2019.
Run:

```bash
pip install antialiased-cnns
```

```python
import antialiased_cnns
model = antialiased_cnns.resnet50(pretrained=True)
```

Now you are antialiased!
If you have a trained model and don't want to retrain the antialiased version from scratch, no problem! Simply load your old weights and fine-tune:

```python
import torchvision.models as models
old_model = models.resnet50(pretrained=True)  # old (aliased) model
antialiased_cnns.copy_params_buffers(old_model, model)  # copy the weights over
```
If you want to antialias your own model, use the BlurPool layer:

```python
import torch
import antialiased_cnns

C = 10  # example feature channel size
blurpool = antialiased_cnns.BlurPool(C, stride=2)  # BlurPool layer; use to downsample a feature map
ex_tens = torch.randn(1, C, 128, 128)
print(blurpool(ex_tens).shape)  # torch.Size([1, 10, 64, 64])
```
More information about our provided models and how to use BlurPool is below.
**Update (Sept 2020)** You can now `pip install antialiased-cnns` and load models with the `pretrained=True` flag. I have also added kernel-size-4 experiments. When downsampling an even-sized feature map (e.g., 128x128 --> 64x64), a kernel of size 4 is actually the correct size to use to keep the indices from drifting.
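To see why this works, here is a minimal sketch of blur-then-subsample (the operation a BlurPool-style layer performs), using a 4-tap binomial filter `[1, 3, 3, 1]`. This is an illustrative reimplementation in plain PyTorch, not the library's actual code; the padding choice is an assumption:

```python
import torch
import torch.nn.functional as F

def blur_downsample(x, stride=2):
    """Depthwise blur with a 4-tap binomial kernel, then stride-2 subsample."""
    C = x.shape[1]
    k1d = torch.tensor([1., 3., 3., 1.])
    k2d = torch.outer(k1d, k1d)
    k2d = k2d / k2d.sum()                       # normalize so the blur preserves the mean
    weight = k2d.expand(C, 1, 4, 4)             # one filter per channel (depthwise)
    x = F.pad(x, (1, 2, 1, 2), mode='reflect')  # asymmetric pad for the even kernel size
    return F.conv2d(x, weight, stride=stride, groups=C)

x = torch.randn(1, 10, 128, 128)
print(blur_downsample(x).shape)  # even-sized 128x128 input -> 64x64 output
```

With an even-sized input and an even kernel, the output lands exactly on half the resolution with no index drift.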
- More information about antialiased models
- Instructions for antialiasing your own model, using the BlurPool layer
- ImageNet training and evaluation code. Achieving better consistency, while maintaining or improving accuracy, is an open problem. Help improve the results!
Pip install this package:

```bash
pip install antialiased-cnns
```

Or clone this repository and install the requirements (notably, PyTorch):

```bash
git clone https://github.com/adobe/antialiased-cnns.git
cd antialiased-cnns
pip install -r requirements.txt
```
The following loads a pretrained antialiased model, perhaps as a backbone for your application:

```python
import antialiased_cnns
model = antialiased_cnns.resnet50(pretrained=True, filter_size=4)
```
We also provide weights for antialiased AlexNet, VGG16(bn), ResNet18/34/50/101, DenseNet121, and MobileNetv2 (see example_usage.py).
The antialiased_cnns module contains the BlurPool class, which does blur + subsampling. Run `pip install antialiased-cnns` or copy the antialiased_cnns subdirectory.
The methodology is simple -- first evaluate with stride 1, and then use our BlurPool layer to do antialiased downsampling. Make the following architectural changes (typically, blur kernel size M is 4):
```python
import torch.nn as nn
import antialiased_cnns

# C, Cin = output/input channel counts; M = blur kernel size (defined by your architecture)

# MaxPool --> MaxBlurPool
baseline = nn.MaxPool2d(kernel_size=2, stride=2)
antialiased = [nn.MaxPool2d(kernel_size=2, stride=1),
               antialiased_cnns.BlurPool(C, filt_size=M, stride=2)]

# Conv --> ConvBlurPool
baseline = [nn.Conv2d(Cin, C, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True)]
antialiased = [nn.Conv2d(Cin, C, kernel_size=3, stride=1, padding=1),
               nn.ReLU(inplace=True),
               antialiased_cnns.BlurPool(C, filt_size=M, stride=2)]

# AvgPool --> BlurPool
baseline = nn.AvgPool2d(kernel_size=2, stride=2)
antialiased = antialiased_cnns.BlurPool(C, filt_size=M, stride=2)
```
We assume the incoming tensor has C channels. Computing a layer at stride 1 instead of stride 2 adds memory and run-time. As such, we typically skip antialiasing at the highest resolution (early in the network) to prevent large increases in cost.
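The cost of the stride-1 evaluation is easy to see with a quick shape check (an illustration in plain PyTorch, not code from this repository): the stride-1 activation map is 4x larger than the stride-2 one, and it must be held in memory before BlurPool downsamples it.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 224, 224)
conv_s2 = nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1)  # baseline: downsampling conv
conv_s1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)  # antialiased: conv before BlurPool

print(conv_s2(x).shape)  # torch.Size([1, 64, 112, 112])
print(conv_s1(x).shape)  # torch.Size([1, 64, 224, 224]) -- 4x the activations before blurring
```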
If you already trained a model and then add antialiasing, you can fine-tune from that old model:

```python
antialiased_cnns.copy_params_buffers(old_model, antialiased_model)
```
If this doesn't work, you can copy just the parameters (and not the buffers). Adding antialiasing doesn't add any parameters, so the parameter lists are identical. (It does add buffers, so a heuristic is used to match the buffers, which may throw an error.)

```python
antialiased_cnns.copy_params(old_model, antialiased_model)
```
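Since the two parameter lists line up one-to-one, the copy can be pictured as zipping them together. The helper below is a simplified, hypothetical sketch of that idea, not the library's actual implementation:

```python
import torch
import torch.nn as nn

def copy_params_sketch(old_model, new_model):
    """Copy parameters in order, relying on identical parameter lists."""
    with torch.no_grad():
        for p_old, p_new in zip(old_model.parameters(), new_model.parameters()):
            p_new.copy_(p_old)

# toy usage with two architecturally identical models
a = nn.Linear(4, 2)
b = nn.Linear(4, 2)
copy_params_sketch(a, b)
print(torch.equal(a.weight, b.weight))  # True
```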
**Accuracy** (how often the image is classified correctly)

Model | Baseline | Antialiased | Delta |
---|---|---|---|
AlexNet | 56.55 | 56.72 | +0.17 |
VGG16 | 71.59 | 72.43 | +0.84 |
VGG16bn | 73.36 | 74.12 | +0.76 |
Resnet18 | 69.74 | 71.48 | +1.74 |
Resnet34 | 73.30 | 74.38 | +1.08 |
Resnet50 | 76.16 | 77.23 | +1.07 |
Resnet101 | 77.37 | 78.22 | +0.85 |
DenseNet121 | 74.43 | 75.29 | +0.86 |
MobileNetv2 | 71.88 | 72.72 | +0.84 |
**Consistency** (how often two shifts of the same image are classified the same)

Model | Baseline | Antialiased | Delta |
---|---|---|---|
AlexNet | 78.18 | 82.54 | +4.36 |
VGG16 | 88.52 | 89.92 | +1.40 |
VGG16bn | 89.24 | 91.22 | +1.98 |
Resnet18 | 85.11 | 88.07 | +2.96 |
Resnet34 | 87.56 | 89.53 | +1.97 |
Resnet50 | 89.20 | 91.29 | +2.09 |
Resnet101 | 89.81 | 91.85 | +2.04 |
DenseNet121 | 88.81 | 90.29 | +1.48 |
MobileNetv2 | 86.50 | 87.72 | +1.22 |
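Consistency can be estimated by classifying two shifted crops of the same image and counting how often the predicted labels agree. A minimal sketch follows; the crop sizes, shift offsets, and toy model are illustrative assumptions, not the paper's evaluation protocol:

```python
import torch
import torch.nn as nn

def consistency(model, images, shift=8, crop=192):
    """Fraction of images whose two shifted crops get the same predicted label."""
    model.eval()
    with torch.no_grad():
        crop_a = images[:, :, :crop, :crop]
        crop_b = images[:, :, shift:shift + crop, shift:shift + crop]
        pred_a = model(crop_a).argmax(dim=1)
        pred_b = model(crop_b).argmax(dim=1)
    return (pred_a == pred_b).float().mean().item()

# toy classifier so the sketch runs end-to-end
toy = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, 10))
score = consistency(toy, torch.randn(4, 3, 224, 224))
print(score)  # fraction of agreeing predictions, in [0, 1]
```

A perfectly shift-invariant classifier would score 1.0 on this metric.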
To reduce clutter, extended results are here. Help improve the results!
All material is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license by Adobe Inc. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made.
The repository builds off the PyTorch examples repository and torchvision models repository. These are BSD-style licensed.
If you find this useful for your research, please consider citing this bibtex. Please contact Richard Zhang <rizhang at adobe dot com> with any comments or feedback.