Antialiased CNNs [Project Page] [Paper]

Making Convolutional Networks Shift-Invariant Again
Richard Zhang.
To appear in ICML, 2019.

This repository contains examples of anti-aliased convnets.

Table of contents

Pretrained antialiased models
Instructions for antialiasing your own model, using the BlurPool layer
Results on ImageNet consistency + accuracy.
ImageNet training and evaluation code. Achieving better consistency, while maintaining or improving accuracy, is an open problem. Help improve the results!

Licenses

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

All material is made available under Creative Commons BY-NC-SA 4.0 license by Adobe Inc. You can use, redistribute, and adapt the material for non-commercial purposes, as long as you give appropriate credit by citing our paper and indicating any changes that you've made.

The repository builds off the PyTorch examples repository and torchvision models repository. These are BSD-style licensed.

(0) Getting started

PyTorch

Install PyTorch (pytorch.org)
pip install -r requirements.txt

Download anti-aliased models

Run bash weights/get_antialiased_models.sh

(1) Quickstart: load an antialiased model

The following loads a pretrained antialiased model, for example to use as a backbone for your application.

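import torch
import models_lpf.resnet

# Build an antialiased ResNet50 with a blur filter of size 3
model = models_lpf.resnet.resnet50(filter_size=3)
# Load the pretrained weights fetched by weights/get_antialiased_models.sh
model.load_state_dict(torch.load('weights/resnet50_lpf3.pth.tar')['state_dict'])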

We also provide weights for antialiased AlexNet, VGG16 (with and without batch norm), ResNet18, 34, 50, and 101, DenseNet121, and MobileNet-v2 (see example_usage.py).

(2) Antialias your own architecture

The methodology is simple -- first evaluate with stride 1, and then use our Downsample layer to do antialiased downsampling.
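To see why evaluating at stride 1 and then blurring helps, here is a minimal 1D sketch in plain Python. It is an illustration only, not the repository's code: the helper names (`shift`, `max_stride1`, `blur`, `max_pool`, `antialiased_max_pool`), the toy signal, the circular boundary handling, and the fixed [1, 2, 1] filter are all assumptions chosen for the demo.

```python
def shift(signal, k):
    """Circularly shift a 1D signal to the right by k positions."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


def max_stride1(signal, kernel_size=2):
    """Max over a sliding window, evaluated densely (stride 1, circular)."""
    n = len(signal)
    return [max(signal[(i + j) % n] for j in range(kernel_size)) for i in range(n)]


def blur(signal, taps=(1, 2, 1)):
    """Low-pass filter with normalized taps (circular boundary)."""
    n = len(signal)
    half = len(taps) // 2
    total = sum(taps)
    return [
        sum(t * signal[(i + j - half) % n] for j, t in enumerate(taps)) / total
        for i in range(n)
    ]


def max_pool(signal):
    """Ordinary max-pooling: dense max, then naive subsampling by 2."""
    return max_stride1(signal)[::2]


def antialiased_max_pool(signal):
    """Dense max, then blur, then subsampling by 2."""
    return blur(max_stride1(signal))[::2]


x = [0, 0, 1, 1, 0, 0, 1, 1]   # toy input (assumed for illustration)
x_shifted = shift(x, 1)        # same signal, moved by one sample

print(max_pool(x), max_pool(x_shifted))
print(antialiased_max_pool(x), antialiased_max_pool(x_shifted))
```

With plain max-pooling, the one-sample shift changes the output from [0, 1, 0, 1] to [1, 1, 1, 1]. With the blur inserted before subsampling, the two outputs are [0.5, 1.0, 0.5, 1.0] and [0.75, 0.75, 0.75, 0.75], which are much closer to each other.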

Copy the models_lpf directory into your codebase. It contains the Downsample class, which performs blur + subsampling. Add the following import at the top of your file.

from models_lpf import *

Make the following architectural changes to antialias your strided layers. Typically, the blur kernel size M is 3 or 5.

MaxPool (stride 2) → Max (stride 1) + BlurPool (stride 2)
Conv (stride 2) + ReLU → Conv (stride 1) + ReLU + BlurPool (stride 2)
AvgPool (stride 2) → BlurPool (stride 2)

Original	Anti-aliased replacement
`[nn.MaxPool2d(kernel_size=2, stride=2),]`	`[nn.MaxPool2d(kernel_size=2, stride=1),` `Downsample(channels=C, filt_size=M, stride=2)]`
`[nn.Conv2d(Cin,C,kernel_size=3,stride=2,padding=1),` `nn.ReLU(inplace=True)]`	`[nn.Conv2d(Cin,C,kernel_size=3,stride=1,padding=1),` `nn.ReLU(inplace=True),` `Downsample(channels=C, filt_size=M, stride=2)]`
`nn.AvgPool2d(kernel_size=2, stride=2)`	`Downsample(channels=C, filt_size=M, stride=2)`

We assume the incoming tensor has C channels. Note that this requires computing a layer at stride 1 instead of stride 2, which adds memory and run-time. We typically skip this step at the highest resolution (early in the network) to prevent large increases in memory and run-time.
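As a rough illustration of what a blur + subsample layer can look like, the following self-contained sketch implements one in PyTorch and uses it in the MaxPool replacement described above. The class name `BlurPoolSketch`, the reflect padding, and the binomial filter coefficients are assumptions made for this example; the `Downsample` class in models_lpf is the reference implementation and may differ in its details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BlurPoolSketch(nn.Module):
    """Illustrative blur + subsample layer (hypothetical, not the repo's Downsample)."""

    # Assumed 1D filter taps for sizes 2, 3, and 5 (rows of Pascal's triangle).
    _TAPS = {2: [1.0, 1.0], 3: [1.0, 2.0, 1.0], 5: [1.0, 4.0, 6.0, 4.0, 1.0]}

    def __init__(self, channels, filt_size=3, stride=2):
        super().__init__()
        self.channels = channels
        self.stride = stride
        a = torch.tensor(self._TAPS[filt_size])
        kernel = a[:, None] * a[None, :]   # separable 2D filter
        kernel = kernel / kernel.sum()     # normalize so constants are preserved
        # One copy of the filter per channel, applied depthwise (groups=channels).
        self.register_buffer("kernel", kernel[None, None].repeat(channels, 1, 1, 1))
        left = (filt_size - 1) // 2
        right = filt_size - 1 - left
        self.pad = (left, right, left, right)

    def forward(self, x):
        x = F.pad(x, self.pad, mode="reflect")
        return F.conv2d(x, self.kernel, stride=self.stride, groups=self.channels)


# MaxPool (stride 2) -> Max (stride 1) + blur + subsample (stride 2)
antialiased_pool = nn.Sequential(
    nn.MaxPool2d(kernel_size=2, stride=1),
    BlurPoolSketch(channels=16, filt_size=3, stride=2),
)

x = torch.randn(1, 16, 32, 32)
y = antialiased_pool(x)
print(y.shape)
```

For a 32x32 input, this produces the same 16x16 spatial size as the original `nn.MaxPool2d(kernel_size=2, stride=2)`, so the rest of the network should not need resizing.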

(3) Results

We show consistency (y-axis) vs accuracy (x-axis) for various networks. Up and to the right is good. Training and testing instructions are in README_IMAGENET.md.

We italicize a variant if it is not on the Pareto front -- that is, if another variant is strictly better in both accuracy and consistency. We bold a variant if it is on the Pareto front. We also bold the highest value in each column.
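The Pareto-front rule can be checked with a few lines of Python. This is a small sketch, not part of the repository; the function name `pareto_front` is hypothetical, and the numbers are the AlexNet results from the table in this section.

```python
def pareto_front(results):
    """Return the variants that no other variant beats in both metrics.

    `results` maps a variant name to an (accuracy, consistency) pair.
    """
    front = []
    for name, (acc, cons) in results.items():
        dominated = any(
            other_acc > acc and other_cons > cons
            for other, (other_acc, other_cons) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# AlexNet accuracy / consistency, copied from the results table.
alexnet = {
    "Baseline": (56.55, 78.18),
    "Rect-2": (57.24, 81.33),
    "Tri-3": (56.90, 82.15),
    "Bin-5": (56.58, 82.51),
}

print(pareto_front(alexnet))
```

For AlexNet, the baseline is beaten in both metrics by Rect-2, so only the three antialiased variants remain on the front.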

Note that the current arXiv paper is slightly out of date; we will update it soon.

AlexNet (plot)

	Accuracy	Consistency
Baseline	56.55	78.18
Rect-2	57.24	81.33
Tri-3	56.90	82.15
Bin-5	56.58	82.51

VGG16 (plot)

	Accuracy	Consistency
Baseline	71.59	88.52
Rect-2	72.15	89.24
Tri-3	72.20	89.60
Bin-5	72.33	90.19

VGG16bn (plot)

	Accuracy	Consistency
Baseline	73.36	89.24
Rect-2	74.01	90.72
Tri-3	73.91	91.10
Bin-5	74.05	91.35

ResNet18 (plot)

	Accuracy	Consistency
Baseline	69.74	85.11
Rect-2	71.39	86.90
Tri-3	71.69	87.51
Bin-5	71.38	88.25

ResNet34 (plot)

	Accuracy	Consistency
Baseline	73.30	87.56
Rect-2	74.46	89.14
Tri-3	74.33	89.32
Bin-5	74.20	89.49

ResNet50 (plot)

	Accuracy	Consistency
Baseline	76.16	89.20
Rect-2	76.81	89.96
Tri-3	76.83	90.91
Bin-5	77.04	91.31

ResNet101 (plot)

	Accuracy	Consistency
Baseline	77.37	89.81
Rect-2	77.82	91.04
Tri-3	78.13	91.62
Bin-5	77.92	91.74

DenseNet121 (plot)

	Accuracy	Consistency
Baseline	74.43	88.81
Rect-2	75.04	89.53
Tri-3	75.14	89.78
Bin-5	75.03	90.39

MobileNet-v2 (plot)

	Accuracy	Consistency
Baseline	71.88	86.50
Rect-2	72.63	87.33
Tri-3	72.59	87.46
Bin-5	72.50	87.79

(A) Acknowledgments

This repository is built off the PyTorch ImageNet training and torchvision models repositories.

(B) Citation, Contact

If you find this useful for your research, please consider citing this bibtex. Please contact Richard Zhang <rizhang at adobe dot com> with any comments or feedback.
