ECCV 2024 decisions are now available!
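Note 1: Everyone is welcome to submit issues to share ECCV 2024 papers and open-source projects!
Note 2: For papers from previous top CV conferences, other high-quality CV papers, and comprehensive roundups, see: https://github.com/amusi/daily-paper-computer-vision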
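Scan the QR code to join the [CVer Academic Exchange Group], the largest computer vision and AI Knowledge Planet community! It is updated daily and shares the latest learning materials on computer vision, AI painting, image processing, deep learning, autonomous driving, medical imaging, AIGC, and more. Start learning!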
- 3DGS(Gaussian Splatting)
- Mamba / SSM
- Avatars
- Backbone
- CLIP
- MAE
- Embodied AI
- GAN
- GNN
- Multimodal Large Language Models (MLLM)
- Large Language Models (LLM)
- NAS
- OCR
- NeRF
- DETR
- Prompt
- Diffusion Models
- ReID (Re-Identification)
- Long-Tail
- Vision Transformer
- Vision-Language
- Self-supervised Learning
- Data Augmentation
- Object Detection
- Anomaly Detection
- Visual Tracking
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Medical Image
- Medical Image Segmentation
- Video Object Segmentation
- Video Instance Segmentation
- Referring Image Segmentation
- Image Matting
- Image Editing
- Low-level Vision
- Super-Resolution
- Denoising
- Deblur
- Autonomous Driving
- 3D Point Cloud
- 3D Object Detection
- 3D Semantic Segmentation
- 3D Object Tracking
- 3D Semantic Scene Completion
- 3D Registration
- 3D Human Pose Estimation
- 3D Human Mesh Estimation
- Image Generation
- Video Generation
- 3D Generation
- Video Understanding
- Action Detection
- Text Detection
- Knowledge Distillation
- Model Pruning
- Image Compression
- 3D Reconstruction
- Depth Estimation
- Trajectory Prediction
- Lane Detection
- Image Captioning
- Visual Question Answering
- Sign Language Recognition
- Video Prediction
- Novel View Synthesis
- Zero-Shot Learning
- Stereo Matching
- Feature Matching
- Scene Graph Generation
- Implicit Neural Representations
- Image Quality Assessment
- Video Quality Assessment
- Datasets
- New Tasks
- Others
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
- Project: https://donydchen.github.io/mvsplat
- Paper: https://arxiv.org/abs/2403.14627
- Code: https://github.com/donydchen/mvsplat
CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
- Project: https://zehaozhu.github.io/FSGS/
- Paper: https://arxiv.org/abs/2312.00451
- Code: https://github.com/VITA-Group/FSGS
VideoMamba: State Space Model for Efficient Video Understanding
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
Fully Sparse 3D Occupancy Prediction
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
ControlCap: Controllable Region-level Captioning
GiT: Towards Generalist Vision Transformer through Universal Language Interface
GalLoP: Learning Global and Local Prompts for Vision-Language Models
Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries
- Project: https://zhang-tao-whu.github.io/projects/DVIS_DAQ/
- Paper: https://arxiv.org/abs/2404.00086
- Code: https://github.com/zhang-tao-whu/DVIS_Plus
Fully Sparse 3D Occupancy Prediction
milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing
- Paper: https://arxiv.org/abs/2306.17010
- Code: https://github.com/Toytiny/milliFlow/
3D Small Object Detection with Dynamic Spatial Pruning
- Project: https://xuxw98.github.io/DSPDet3D/
- Paper: https://arxiv.org/abs/2305.03716
- Code: https://github.com/xuxw98/DSPDet3D
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion
- Project: https://tencentarc.github.io/BrushNet/
- Paper: https://arxiv.org/abs/2403.06976
- Code: https://github.com/TencentARC/BrushNet
Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models
VideoMamba: State Space Model for Efficient Video Understanding