ICCV2023-paper-code

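A continuously updated collection of ICCV 2023 papers, code, and related information; follow AIWalker for updates. The list focuses mainly on the areas below. For more CV/AI material, add the AIWalker assistant [AIWalker-zhushou] (you can scan the QR code at the bottom).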

Backbone
Detection
Segmentation
Knowledge Distillation
Diffusion
Restoration
Super-Resolution
Deblurring
Low-light Image Enhancement
IQA/IAA
Dataset
....

Backbone

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

Paper: https://arxiv.org/pdf/2303.09730.pdf

Rethinking Mobile Block for Efficient Attention-based Models

Paper: https://arxiv.org/abs/2301.01146
Code: https://github.com/zhangzjn/EMO

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Paper: https://arxiv.org/abs/2211.09552
Code: https://github.com/OpenGVLab/UniFormerV2

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Paper: https://arxiv.org/abs/2303.16058
Code: https://github.com/OpenGVLab/unmasked_teacher

A Unified Continual Learning Framework with General Parameter-Efficient Tuning

Paper: https://arxiv.org/abs/2303.10070
Code: https://github.com/gqk/LAE

Scale-Aware Modulation Meet Transformer

Paper: https://arxiv.org/abs/2307.08579
Code: https://github.com/AFeng-x/SMT

Improving Zero-Shot Generalization for CLIP with Synthesized Prompts

Paper: https://arxiv.org/abs/2307.07397
Code: https://github.com/mrflogs/SHIP

DreamTeacher: Pretraining Image Backbones with Deep Generative Models

ShiftNAS: Improving One-shot NAS via Probability Shift

Paper: https://arxiv.org/abs/2307.08300
Code: https://github.com/bestfleer/ShiftNAS

MULLER: Multilayer Laplacian Resizer for Vision

Paper: https://arxiv.org/pdf/2304.02859.pdf

FLatten Transformer: Vision Transformer with Focused Linear Attention

Paper: TODO
Code: https://github.com/LeapLabTHU/FLatten-Transformer

Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior

Paper: https://arxiv.org/abs/2304.01195
Code: https://github.com/yangyangyang127/APE

Tuning Pre-trained Model via Moment Probing

Paper: https://arxiv.org/abs/2307.11342
Code: https://github.com/mingzeG/Moment-Probing

Strip-MLP: Efficient Token Interaction for Vision MLP

Paper: https://arxiv.org/pdf/2307.11458.pdf
Code: https://github.com/Med-Process/Strip_MLP

Adaptive Frequency Filters As Efficient Global Token Mixers

Paper: https://arxiv.org/abs/2307.14008

Learning Concise and Descriptive Attributes for Visual Recognition

Paper: https://arxiv.org/abs/2308.03685

Detection

FemtoDet: An Object Detection Baseline for Energy Versus Performance Tradeoffs

Paper: https://arxiv.org/pdf/2301.06719.pdf
Code: https://github.com/yh-pengtu/FemtoDet

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

Paper: https://arxiv.org/pdf/2207.13085.pdf
Code: https://github.com/Atten4Vis/GroupDETR

Large Selective Kernel Network for Remote Sensing Object Detection

Paper: https://arxiv.org/abs/2303.09030
Code: https://github.com/zcablii/LSKNet

DiffusionDet: Diffusion Model for Object Detection

Paper: https://arxiv.org/abs/2211.09788
Code: https://github.com/ShoufaChen/DiffusionDet

DETRs with Collaborative Hybrid Assignments Training

Paper: https://arxiv.org/pdf/2211.12860.pdf
Code: https://github.com/Sense-X/Co-DETR

MIMDet: Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection

Paper: https://arxiv.org/abs/2204.02964
Code: https://github.com/hustvl/MIMDet

Detection Transformer with Stable Matching

Paper: https://arxiv.org/abs/2304.04742
Code: https://github.com/IDEA-Research/Stable-DINO

Random Boxes Are Open-world Object Detectors

Paper: https://arxiv.org/abs/2307.08249
Code: https://github.com/scuwyh2000/RandBox

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection

Paper: https://arxiv.org/abs/2307.11077
Home: https://liming-ai.github.io/AlignDet

Cascade-DETR: Delving into High-Quality Universal Object Detection

Paper: https://arxiv.org/abs/2307.11035
Code: https://github.com/SysCV/cascade-detr

Deep Directly-Trained Spiking Neural Networks for Object Detection

Paper: https://arxiv.org/abs/2307.11411

COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts

Less is More: Focus Attention for Efficient DETR

Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes

RecursiveDet: End-to-End Region-based Recursive Object Detection

Paper: https://arxiv.org/abs/2307.13619
Code: https://github.com/bravezzzzzz/RecursiveDet

Segmentation

Segment Anything

Home: https://segment-anything.com/
Paper: https://arxiv.org/abs/2304.02643
Code: https://github.com/facebookresearch/segment-anything

SegGPT: Segmenting Everything in Context

Paper: https://arxiv.org/abs/2304.03284
Code: https://github.com/baaivision/Painter

VLPart: Going Denser with Open-Vocabulary Part Segmentation

Paper: https://arxiv.org/abs/2305.11173
Code: https://github.com/facebookresearch/VLPart

Referring Image Segmentation Using Text Supervision

Paper:
Code: https://github.com/fawnliu/WRIS_ICCV2023

EfficientViT: Lightweight Multi-Scale Attention for On-Device Semantic Segmentation

Paper: https://arxiv.org/pdf/2205.14756.pdf
Code: https://github.com/mit-han-lab/efficientvit

A Simple Framework for Open-Vocabulary Segmentation and Detection

Paper: https://arxiv.org/pdf/2303.08131.pdf
Code: https://github.com/IDEA-Research/OpenSeeD

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation

Paper: https://arxiv.org/abs/2303.13399

Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation

Paper: https://arxiv.org/abs/2307.08388

OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation

Paper: https://arxiv.org/abs/2307.09356
Code: https://github.com/wudongming97/OnlineRefer

Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

Paper: https://arxiv.org/abs/2307.11545
Code: https://github.com/kkakkkka/ETRIS

Exploring Transformers for Open-world Instance Segmentation

Paper: https://arxiv.org/abs/2308.04206

Knowledge Distillation

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

Paper: https://arxiv.org/abs/2303.13005
Code: https://github.com/yzd-v/cls_KD

DOT: A Distillation-Oriented Trainer

Paper: https://arxiv.org/abs/2307.08436

Cumulative Spatial Knowledge Distillation for Vision Transformers

Paper: https://arxiv.org/abs/2307.08500

Class-relation Knowledge Distillation for Novel Class Discovery

Paper: https://arxiv.org/abs/2307.09158
Code: https://github.com/kleinzcy/Cr-KD-NCD

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization

Code: https://github.com/lilujunai/EMQ-series

Rethinking Data Distillation: Do Not Overlook Calibration

Paper: https://arxiv.org/abs/2307.12463

Diffusion

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Paper: https://arxiv.org/abs/2304.08465
Home: https://ljzycmd.github.io/projects/MasaCtrl/

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

Paper: https://arxiv.org/abs/2212.11565
Code: https://github.com/showlab/Tune-A-Video

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

Paper: https://arxiv.org/abs/2303.09535
Code: https://github.com/ChenyangQiQi/FateZero

Expressive Text-to-Image Generation with Rich Text

Paper: https://arxiv.org/abs/2304.06720
Home: https://rich-text-to-image.github.io/

Ablating Concepts in Text-to-Image Diffusion Models

Paper: https://arxiv.org/abs/2303.13516
Home: https://www.cs.cmu.edu/~concept-ablation/
Code: https://github.com/nupurkmr9/concept-ablation

Evaluating Data Attribution for Text-to-Image Models

Masked Diffusion Transformer is a Strong Image Synthesizer

Paper: TODO
Code: TODO

SVDiff: Compact Parameter Space for Diffusion Fine-tuning

Paper: https://arxiv.org/abs/2303.11305

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

Paper: https://arxiv.org/abs/2307.10816

TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition

Code: https://github.com/Shilin-LU/TF-ICON

Depth

Neural Video Depth Stabilizer

Paper: https://arxiv.org/abs/2307.08695

Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV

Paper: https://arxiv.org/abs/2307.10713
Code: https://github.com/jspenmar/slowtv_monodepth

MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation

Paper: https://arxiv.org/abs/2307.14336

Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching

Paper: https://arxiv.org/abs/2307.14071

VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation

Paper: https://arxiv.org/abs/2303.08340
Code: https://github.com/XiaoyuShi97/VideoFlow

Learning Depth Estimation for Transparent and Mirror Surfaces

Paper: https://arxiv.org/pdf/2307.15052.pdf

Restoration

Adaptive Nonlinear Latent Transformation for Conditional Face Editing

Paper: https://arxiv.org/abs/2307.07790
Code: https://github.com/Hzzone/AdaTrans

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond

Paper: https://arxiv.org/abs/2307.08996

DiffIR: Efficient Diffusion Model for Image Restoration

Paper: https://arxiv.org/abs/2303.09472
Code: https://github.com/Zj-BinXia/DiffIR

Physics-Driven Turbulence Image Restoration with Stochastic Refinement

Paper: https://arxiv.org/pdf/2307.10603
Code: https://github.com/VITA-Group/PiRN

Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration

Paper: https://arxiv.org/abs/2306.06513
Code: https://github.com/kechunl/AdaCode

Under-Display Camera Image Restoration with Scattering Effect

Paper: https://arxiv.org/abs/2308.04163
Code: https://github.com/NamecantbeNULL/SRUDC

From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal

Paper: https://arxiv.org/abs/2308.03867

GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild

Paper: https://arxiv.org/pdf/2211.12352.pdf

Super-Resolution

SRFormer: Permuted Self-Attention for Single Image Super-Resolution

Paper: https://arxiv.org/abs/2303.09735
Code: https://github.com/HVision-NKU/SRFormer

SAFMN: Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution

Paper: https://arxiv.org/pdf/2302.13800.pdf
Code: https://github.com/sunny2109/SAFMN

DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution

Paper: https://arxiv.org/pdf/2301.02031.pdf
Code: https://github.com/NeonLeexiang/DLGSANet

Dual Aggregation Transformer for Image Super-Resolution

Paper: https://arxiv.org/abs/2308.03364
Code: https://github.com/zhengchen1999/DAT

A Benchmark for Chinese-English Scene Text Image Super-resolution

Paper: https://arxiv.org/abs/2308.03262
Code: https://github.com/mjq11302010044/Real-CE

Deblurring

Multi-scale Residual Low-Pass Filter Network for Image Deblurring

Paper: TODO
Code: TODO

Low-light Image Enhancement

Implicit Neural Representation for Cooperative Low-light Image Enhancement

Paper: https://arxiv.org/abs/2303.11722
Code: https://github.com/Ysz2022/NeRCo

Iterative Prompt Learning for Unsupervised Backlit Image Enhancement

Paper: https://arxiv.org/abs/2303.17569
Home: https://zhexinliang.github.io/CLIP_LIT_page/

ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

Paper: https://arxiv.org/abs/2307.07710

IQA/IAA

Delegate Transformer for Image Color Aesthetics Assessment

Paper: TODO
Code: https://github.com/woshidandan/Image-Color-Aesthetics-Assessment

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

Paper: https://arxiv.org/abs/2211.04894
Code: https://github.com/VQAssessment/DOVER

On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement

Paper: https://arxiv.org/abs/2307.12027
Code: https://github.com/Luciennnnnnn/DualFormer

Other

Fast Full-frame Video Stabilization with Iterative Optimization

Paper: https://arxiv.org/abs/2307.12774

Dataset

LPFF: A Portrait Dataset for Face Generators Across Large Poses

Paper: https://arxiv.org/abs/2303.14407
Code: https://github.com/oneThousand1000/LPFF-dataset
