Official PyTorch implementation of DeBiFormer, from the following paper:
DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention. ACCV 2024.
Nguyen Huu Bao Long, Chenyu Zhang, Yuzhi Shi, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi, and Tohgoroh Matsui
- 2024-09-21: The paper has been accepted at ACCV 2024!
| name | resolution | acc@1 | #params | FLOPs | model | log |
| --- | --- | --- | --- | --- | --- | --- |
| DeBiFormer-T | 224x224 | 81.9 | 21.4 M | 2.6 G | model | log |
| DeBiFormer-S | 224x224 | 83.9 | 44 M | 5.4 G | model | log |
| DeBiFormer-B | 224x224 | 84.4 | 77 M | 11.8 G | model | log |
First, clone the repository locally and install the dependencies:

git clone https://github.com/maclong01/DeBiFormer.git
cd DeBiFormer
pip3 install -r requirements.txt
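As a quick sanity check after installation, you can instantiate a model and run a dummy forward pass. The sketch below assumes the DeBiFormer variants are registered with timm's model registry (as in the BiFormer codebase) once the repository's model definitions are imported; the `models` import path is an assumption and may differ in this repository.

```python
# Hypothetical sanity check: build a DeBiFormer model and run a dummy forward pass.
# Assumes "debiformer_small" is registered with timm when the repo's model
# definitions are imported; the "models" import path below is a guess.
import torch
import timm

import models  # noqa: F401  (assumed module that triggers timm registration)

model = timm.create_model('debiformer_small', pretrained=False, num_classes=1000)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # expected: torch.Size([1, 1000])
```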
Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's `datasets.ImageFolder`, with the training and validation images in the `train/` and `val/` folders, respectively:
/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
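For reference, this layout can be loaded directly with torchvision's `ImageFolder`. The sketch below is not part of the training pipeline; it only illustrates the expected structure, using the standard ImageNet normalization constants.

```python
# Minimal sketch: load the layout above with torchvision's ImageFolder.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    # standard ImageNet mean/std
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder('/path/to/imagenet/train', transform=transform)
val_set = datasets.ImageFolder('/path/to/imagenet/val', transform=transform)
print(len(train_set.classes))  # 1000 classes for ImageNet-1k
```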
To train DeBiFormer-S on ImageNet using 8 GPUs for 300 epochs, run:
cd classification/
bash train.sh 8 --model debiformer_small --batch-size 256 --lr 5e-4 --warmup-epochs 20 --weight-decay 0.1 --data-path your_imagenet_path
To evaluate the performance of DeBiFormer-S on ImageNet using 8 GPUs, run:
cd classification/
bash train.sh 8 --model debiformer_small --batch-size 256 --lr 5e-4 --warmup-epochs 20 --weight-decay 0.1 --data-path your_imagenet_path --resume ../checkpoints/debiformer_small_in1k_224.pth --eval
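If you prefer a standalone check outside of `train.sh`, a single-GPU top-1 evaluation can be sketched as below. This assumes the model variants are registered with timm as noted earlier and that the checkpoint stores its weights under the usual "model" key; both are assumptions and may differ for the released checkpoints.

```python
# Hedged sketch of a standalone single-GPU top-1 accuracy check.
import torch
import timm
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

import models  # noqa: F401  (assumed module that registers DeBiFormer with timm)

model = timm.create_model('debiformer_small', num_classes=1000)
ckpt = torch.load('../checkpoints/debiformer_small_in1k_224.pth', map_location='cpu')
model.load_state_dict(ckpt.get('model', ckpt))  # "model" key is an assumption
model.cuda().eval()

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
loader = DataLoader(datasets.ImageFolder('/path/to/imagenet/val', transform),
                    batch_size=128, num_workers=8, pin_memory=True)

correct = total = 0
with torch.no_grad():
    for images, targets in loader:
        preds = model(images.cuda(non_blocking=True)).argmax(dim=1).cpu()
        correct += (preds == targets).sum().item()
        total += targets.numel()
print(f'top-1 accuracy: {100.0 * correct / total:.1f}%')
```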
This repository is built using the timm library and the DAT and BiFormer repositories.
This project is released under the MIT license. Please see the LICENSE file for more information.
If you find this repository helpful, please consider citing:
@InProceedings{BaoLong_2024_ACCV,
author = {BaoLong, NguyenHuu and Zhang, Chenyu and Shi, Yuzhi and Hirakawa, Tsubasa and Yamashita, Takayoshi and Matsui, Tohgoroh and Fujiyoshi, Hironobu},
title = {DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention},
booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
month = {December},
year = {2024},
pages = {4455-4472}
}