Updated on 2025.03.13

Table of Contents

6D Pose
Point Cloud Registration
Point Cloud Segmentation
Zero-shot

6D Pose

Publish Date	Title	Authors	PDF	Code
2025-03-11	Keypoint Detection and Description for Raw Bayer Images	Jiakai Lin et.al.	2503.08673v1	null
2025-03-11	SGNetPose+: Stepwise Goal-Driven Networks with Pose Information for Trajectory Prediction in Autonomous Driving	Akshat Ghiya et.al.	2503.08016v1	null
2025-03-10	Better Pose Initialization for Fast and Robust 2D/3D Pelvis Registration	Yehyun Suh et.al.	2503.07767v1	null
2025-03-10	HumanMM: Global Human Motion Recovery from Multi-shot Videos	Yuhong Zhang et.al.	2503.07597v1	null
2025-03-11	AthletePose3D: A Benchmark Dataset for 3D Human Pose Estimation and Kinematic Validation in Athletic Movements	Calvin Yeung et.al.	2503.07499v2	null
2025-03-10	Multi-Robot System for Cooperative Exploration in Unknown Environments: A Survey	Chuqi Wang et.al.	2503.07278v1	null
2025-03-12	Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion	Mona Sheikh Zeinoddin et.al.	2503.07204v2	null
2025-03-10	Multi-Modal 3D Mesh Reconstruction from Images and Text	Melvin Reka et.al.	2503.07190v1	null
2025-03-11	PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM	Alan Dao et.al.	2503.07111v2	null
2025-03-09	AxisPose: Model-Free Matching-Free Single-Shot 6D Object Pose Estimation via Axis Generation	Yang Zou et.al.	2503.06660v1	null
2025-03-08	NeuraLoc: Visual Localization in Neural Implicit Map with Dual Complementary Features	Hongjia Zhai et.al.	2503.06117v1	null
2025-03-08	Fish2Mesh Transformer: 3D Human Mesh Recovery from Egocentric Vision	David C. Jeong et.al.	2503.06089v1	null
2025-03-08	ReJSHand: Efficient Real-Time Hand Pose Estimation and Mesh Reconstruction Using Refined Joint and Skeleton Features	Shan An et.al.	2503.05995v1	null
2025-03-07	Differentiable Rendering-based Pose Estimation for Surgical Robotic Instruments	Zekai Liang et.al.	2503.05953v1	null
2025-03-07	Novel Object 6D Pose Estimation with a Single Reference View	Jian Liu et.al.	2503.05578v1	null
2025-03-07	Multi-Grained Feature Pruning for Video-Based Human Pose Estimation	Zhigang Wang et.al.	2503.05365v1	null
2025-03-07	Persistent Object Gaussian Splat (POGS) for Tracking Human and Robot Manipulation of Irregularly Shaped Objects	Justin Yu et.al.	2503.05189v1	null
2025-03-07	SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting	Linqi Yang et.al.	2503.05174v1	null
2025-03-07	GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting	Zheng Zhou et.al.	2503.05161v1	null
2025-03-06	MarsLGPR: Mars Rover Localization with Ground Penetrating Radar	Anja Sheppard et.al.	2503.04944v1	null
2025-03-09	ReynoldsFlow: Exquisite Flow Estimation via Reynolds Transport Theorem	Yu-Hsi Chen et.al.	2503.04500v2	null
2025-03-05	Active 6D Pose Estimation for Textureless Objects using Multi-View RGB Frames	Jun Yang et.al.	2503.03726v1	null
2025-03-05	Machine Learning in Biomechanics: Key Applications and Limitations in Walking, Running, and Sports Movements	Carlo Dindorf et.al.	2503.03717v1	null
2025-03-05	Improving 6D Object Pose Estimation of metallic Household and Industry Objects	Thomas Pöllabauer et.al.	2503.03655v1	null
2025-03-05	Tiny Lidars for Manipulator Self-Awareness: Sensor Characterization and Initial Localization Experiments	Giammarco Caroleo et.al.	2503.03449v1	null
2025-03-05	Direct Sparse Odometry with Continuous 3D Gaussian Maps for Indoor Environments	Jie Deng et.al.	2503.03373v1	null
2025-03-05	Supervised Visual Docking Network for Unmanned Surface Vehicles Using Auto-labeling in Real-world Water Environments	Yijie Chu et.al.	2503.03282v1	null
2025-03-05	SCORE: Saturated Consensus Relocalization in Semantic Line Maps	Haodong Jiang et.al.	2503.03254v1	null
2025-03-04	Monocular Person Localization under Camera Ego-motion	Yu Zhan et.al.	2503.02916v1	null
2025-03-04	PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers	Wooju Lee et.al.	2503.02388v1	null
2025-03-04	DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting	Haoyuan Li et.al.	2503.02223v1	null
2025-03-04	Zero-Shot Sim-to-Real Visual Quadrotor Control with Hard Constraints	Yan Miao et.al.	2503.02198v1	null
2025-03-03	Constraint-Based Modeling of Dynamic Entities in 3D Scene Graphs for Robust SLAM	Marco Giberna et.al.	2503.02050v1	null
2025-03-05	Category-level Meta-learned NeRF Priors for Efficient Object Mapping	Saad Ejaz et.al.	2503.01582v2	null
2025-03-03	RUSSO: Robust Underwater SLAM with Sonar Optimization against Visual Degradation	Shu Pan et.al.	2503.01434v1	null
2025-03-03	ecg2o: A Seamless Extension of g2o for Equality-Constrained Factor Graph Optimization	Anas Abdelkarim et.al.	2503.01311v1	null
2025-03-03	Convex Hull-based Algebraic Constraint for Visual Quadric SLAM	Xiaolong Yu et.al.	2503.01254v1	link
2025-03-04	Floorplan-SLAM: A Real-Time, High-Accuracy, and Long-Term Multi-Session Point-Plane SLAM for Efficient Floorplan Reconstruction	Haolin Wang et.al.	2503.00397v2	null
2025-03-01	BGM2Pose: Active 3D Human Pose Estimation with Non-Stationary Sounds	Yuto Shibata et.al.	2503.00389v1	null
2025-02-28	BST: Badminton Stroke-type Transformer for Skeleton-based Action Recognition in Racket Sports	Jing-Yuan Chang et.al.	2502.21085v1	null
2025-02-28	Two-Stream Spatial-Temporal Transformer Framework for Person Identification via Natural Conversational Keypoints	Masoumeh Chapariniya et.al.	2502.20803v1	null
2025-02-27	Cutting-edge 3D reconstruction solutions for underwater coral reef images: A review and comparison	Jiageng Zhong et.al.	2502.20154v1	null
2025-02-27	BEV-DWPVO: BEV-based Differentiable Weighted Procrustes for Low Scale-drift Monocular Visual Odometry on Ground	Yufei Wei et.al.	2502.20078v1	null
2025-02-28	SegLocNet: Multimodal Localization Network for Autonomous Driving via Bird's-Eye-View Segmentation	Zijie Zhou et.al.	2502.20077v2	link
2025-02-27	RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges	Thibaut Loiseau et.al.	2502.19955v1	null
2025-02-27	QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects	Elkhan Ismayilzada et.al.	2502.19769v1	null
2025-02-27	Accurate Pose Estimation for Flight Platforms based on Divergent Multi-Aperture Imaging System	Shunkun Liang et.al.	2502.19708v1	null
2025-02-26	Increasing the Task Flexibility of Heavy-Duty Manipulators Using Visual 6D Pose Estimation of Objects	Petri Mäkinen et.al.	2502.19169v1	null
2025-02-25	EgoSim: An Egocentric Multi-view Simulator and Real Dataset for Body-worn Cameras during Motion and Activity	Dominik Hollidt et.al.	2502.18373v1	null
2025-02-25	Learning Structure-Supporting Dependencies via Keypoint Interactive Transformer for General Mammal Pose Estimation	Tianyang Xu et.al.	2502.18214v1	link
2025-02-24	V-HOP: Visuo-Haptic 6D Object Pose Tracking	Hongyu Li et.al.	2502.17434v1	null
2025-02-23	Orchestrating Joint Offloading and Scheduling for Low-Latency Edge SLAM	Yao Zhang et.al.	2502.16495v1	null
2025-02-23	DeProPose: Deficiency-Proof 3D Human Pose Estimation via Adaptive Multi-View Fusion	Jianbin Jiao et.al.	2502.16419v1	link
2025-02-21	RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes	Sicheng Yu et.al.	2502.15633v1	null
2025-02-21	SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training	Nie Lin et.al.	2502.15251v1	null
2025-02-21	Nonlinear Dynamical Systems for Automatic Face Annotation in Head Tracking and Pose Estimation	Thoa Thieu et.al.	2502.15179v1	null
2025-02-20	Design of a Visual Pose Estimation Algorithm for Moon Landing	Atakan Süslü et.al.	2502.14942v1	null
2025-02-20	Hier-SLAM++: Neuro-Symbolic Semantic SLAM with a Hierarchically Categorical Gaussian Splatting	Boying Li et.al.	2502.14931v1	null
2025-02-19	EfficientPose 6D: Scalable and Efficient 6D Object Pose Estimation	Zixuan Fang et.al.	2502.14061v1	null
2025-02-19	Active Illumination for Visual Ego-Motion Estimation in the Dark	Francesco Crocetti et.al.	2502.13708v1	null
2025-02-19	Object-Pose Estimation With Neural Population Codes	Heiko Hoffmann et.al.	2502.13403v1	null
2025-02-18	Spatiotemporal Multi-Camera Calibration using Freely Moving People	Sang-Eun Lee et.al.	2502.12546v1	null
2025-02-18	Learning Transformation-Isomorphic Latent Space for Accurate Hand Pose Estimation	Kaiwen Ren et.al.	2502.12535v1	null
2025-02-19	FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views	Shangzhan Zhang et.al.	2502.12138v2	null
2025-02-17	Enhancing Transparent Object Pose Estimation: A Fusion of GDR-Net and Edge Detection	Tessa Pulli et.al.	2502.12027v1	null
2025-02-17	SurgPose: a Dataset for Articulated Robotic Surgical Tool Pose Estimation and Tracking	Zijian Wu et.al.	2502.11534v1	null
2025-02-18	VarGes: Improving Variation in Co-Speech 3D Gesture Generation via StyleCLIPS	Ming Meng et.al.	2502.10729v2	link
2025-02-15	Semantics-aware Test-time Adaptation for 3D Human Pose Estimation	Qiuxia Lin et.al.	2502.10724v1	null
2025-02-15	Learning semantical dynamics and spatiotemporal collaboration for human pose estimation in video	Runyang Feng et.al.	2502.10616v1	null
2025-02-14	HIPPo: Harnessing Image-to-3D Priors for Model-free Zero-shot 6D Pose Estimation	Yibo Liu et.al.	2502.10606v1	null
2025-02-14	Manual2Skill: Learning to Read Manuals and Acquire Robotic Skills for Furniture Assembly Using Vision-Language Models	Chenrui Tie et.al.	2502.10090v1	null
2025-02-13	Metamorphic Testing for Pose Estimation Systems	Matias Duran et.al.	2502.09460v1	null
2025-02-13	BevSplat: Resolving Height Ambiguity via Feature-Based Gaussian Primitives for Weakly-Supervised Cross-View Localization	Qiwei Wang et.al.	2502.09080v1	null
2025-02-14	Siren Song: Manipulating Pose Estimation in XR Headsets Using Acoustic Attacks	Zijian Huang et.al.	2502.08865v2	null
2025-02-12	LIR-LIVO: A Lightweight,Robust LiDAR/Vision/Inertial Odometry with Illumination-Resilient Deep Features	Shujie Zhou et.al.	2502.08676v1	link
2025-02-12	CordViP: Correspondence-based Visuomotor Policy for Dexterous Manipulation in Real-World	Yankai Fu et.al.	2502.08449v1	null
2025-02-11	GaRLIO: Gravity enhanced Radar-LiDAR-Inertial Odometry	Chiyun Noh et.al.	2502.07703v1	link
2025-02-11	Matrix3D: Large Photogrammetry Model All-in-One	Yuanxun Lu et.al.	2502.07685v1	null
2025-02-08	Vision-in-the-loop Simulation for Deep Monocular Pose Estimation of UAV in Ocean Environment	Maneesha Wickramasuriya et.al.	2502.05409v1	null
2025-02-06	Measuring Physical Plausibility of 3D Human Poses Using Physics Simulation	Nathan Louis et.al.	2502.04483v1	link
2025-02-06	GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation	Weihang Li et.al.	2502.04293v1	null
2025-02-06	Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks	Yuhui Jin et.al.	2502.03877v1	null
2025-02-05	Mapping and Localization Using LiDAR Fiducial Markers	Yibo Liu et.al.	2502.03510v1	null
2025-02-04	Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation	Jian Liu et.al.	2502.02525v1	link
2025-02-03	CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation	Xiao Lin et.al.	2502.01312v1	null
2025-02-03	Enhancing Feature Tracking Reliability for Visual Navigation using Real-Time Safety Filter	Dabin Kim et.al.	2502.01092v1	null
2025-02-03	ZeroBP: Learning Position-Aware Correspondence for Zero-shot 6D Pose Estimation in Bin-Picking	Jianqiu Chen et.al.	2502.01004v1	null
2025-01-31	A Direct Semi-Exhaustive Search Method for Robust, Partial-to-Full Point Cloud Registration	Richard Cheng et.al.	2502.00115v1	null
2025-01-31	XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses	Bo Lan et.al.	2501.19034v1	link
2025-01-30	SimpleDepthPose: Fast and Reliable Human Pose Estimation with RGBD-Images	Daniel Bermuth et.al.	2501.18478v1	link
2025-01-29	Online Trajectory Replanner for Dynamically Grasping Irregular Objects	Minh Nhat Vu et.al.	2501.17968v1	null
2025-01-28	DebugAgent: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging	Muxi Chen et.al.	2501.16751v1	null
2025-01-27	Toward Efficient Generalization in 3D Human Pose Estimation via a Canonical Domain Approach	Hoosang Lee et.al.	2501.16146v1	null
2025-01-27	NanoHTNet: Nano Human Topology Network for Efficient 3D Human Pose Estimation	Jialun Cai et.al.	2501.15763v1	null
2025-01-25	Towards Better Robustness: Progressively Joint Pose-3DGS Learning for Arbitrarily Long Videos	Zhen-Hui Dong et.al.	2501.15096v1	null
2025-01-25	SpatioTemporal Learning for Human Pose Estimation in Sparsely-Labeled Videos	Yingying Jiao et.al.	2501.15073v1	null
2025-01-24	3D/2D Registration of Angiograms using Silhouette-based Differentiable Rendering	Taewoong Lee et.al.	2501.14918v1	link
2025-01-24	Light3R-SfM: Towards Feed-forward Structure-from-Motion	Sven Elflein et.al.	2501.14914v1	null
2025-01-24	Glissando-Net: Deep sinGLe vIew category level poSe eStimation ANd 3D recOnstruction	Bo Sun et.al.	2501.14896v1	null
2025-01-24	Optimizing Grasping Precision for Industrial Pick-and-Place Tasks Through a Novel Visual Servoing Approach	Khairidine Benali et.al.	2501.14557v1	null
2025-01-24	LiDAR-Based Vehicle Detection and Tracking for Autonomous Racing	Marcello Cellina et.al.	2501.14502v1	null
2025-01-24	Optimizing Human Pose Estimation Through Focused Human and Joint Regions	Yingying Jiao et.al.	2501.14439v1	null
2025-01-24	Causal-Inspired Multitask Learning for Video-Based Human Pose Estimation	Haipeng Chen et.al.	2501.14356v1	null
2025-01-24	HAMMER: Heterogeneous, Multi-Robot Semantic Gaussian Splatting	Javier Yu et.al.	2501.14147v1	null
2025-01-23	Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass	Jianing Yang et.al.	2501.13928v1	null
2025-01-23	EgoHand: Ego-centric Hand Pose Estimation and Gesture Recognition with Head-mounted Millimeter-wave Radar and IMUs	Yizhe Lv et.al.	2501.13805v1	link
2025-01-23	VIGS SLAM: IMU-based Large-Scale 3D Gaussian Splatting SLAM	Gyuhyeon Pak et.al.	2501.13402v1	null
2025-01-22	Deep Learning-Based Image Recovery and Pose Estimation for Resident Space Objects	Louis Aberdeen et.al.	2501.13009v1	null
2025-01-21	BlanketGen2-Fit3D: Synthetic Blanket Augmentation Towards Improving Real-World In-Bed Blanket Occluded Human Pose Estimation	Tamás Karácsony et.al.	2501.12318v1	null
2025-01-19	Refinement Module based on Parse Graph of Feature Map for Human Pose Estimation	Shibang Liu et.al.	2501.11069v1	null
2025-01-18	RoMu4o: A Robotic Manipulation Unit For Orchard Operations Automating Proximal Hyperspectral Leaf Sensing	Mehrad Mortazavi et.al.	2501.10621v1	link
2025-01-17	landmarker: a Toolkit for Anatomical Landmark Localization in 2D/3D Images	Jef Jonkers et.al.	2501.10098v1	link
2025-01-16	A New Teacher-Reviewer-Student Framework for Semi-supervised 2D Human Pose Estimation	Wulian Yun et.al.	2501.09565v1	null
2025-01-21	Towards Robust and Realistic Human Pose Estimation via WiFi Signals	Yang Chen et.al.	2501.09411v2	link
2025-01-16	RoboReflect: Robotic Reflective Reasoning for Grasping Ambiguous-Condition Objects	Zhen Luo et.al.	2501.09307v1	null
2025-01-16	BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-modality Refinement Module	Dongzhihan Wang et.al.	2501.08659v2	null
2025-01-14	Poseidon: A ViT-based Architecture for Multi-Frame Pose Estimation with Adaptive Frame Weighting and Multi-Scale Feature Fusion	Cesare Davide Pace et.al.	2501.08446v1	link
2025-01-14	Leveraging 2D Masked Reconstruction for Domain Adaptation of 3D Pose Estimation	Hansoo Park et.al.	2501.08408v1	null
2025-01-14	Predicting 4D Hand Trajectory from Monocular Videos	Yufei Ye et.al.	2501.08329v1	null
2025-01-14	A Critical Synthesis of Uncertainty Quantification and Foundation Models in Monocular Depth Estimation	Steven Landgraf et.al.	2501.08188v1	null
2025-01-14	AgentPose: Progressive Distribution Alignment via Feature Agent for Human Pose Distillation	Feng Zhang et.al.	2501.08088v1	null
2025-01-14	Robust Low-Light Human Pose Estimation through Illumination-Texture Modulation	Feng Zhang et.al.	2501.08038v1	null
2025-01-14	BioPose: Biomechanically-accurate 3D Pose Estimation from Monocular Videos	Farnoosh Koleini et.al.	2501.07800v1	null
2025-01-13	Fixing the Scale and Shift in Monocular Depth For Camera Pose Estimation	Yaqing Ding et.al.	2501.07742v1	link
2025-01-13	Efficiently Closing Loops in LiDAR-Based SLAM Using Point Cloud Density Maps	Saurabh Gupta et.al.	2501.07399v1	null
2025-01-13	Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics	Tze Ho Elden Tse et.al.	2501.07100v1	null
2025-01-10	eKalibr: Dynamic Intrinsic Calibration for Event Cameras From First Principles of Events	Shuolong Chen et.al.	2501.05688v1	link
2025-01-09	Relative Pose Estimation through Affine Corrections of Monocular Depth Priors	Yifan Yu et.al.	2501.05446v1	link
2025-01-09	From Simple to Complex Skills: The Case of In-Hand Object Reorientation	Haozhi Qi et.al.	2501.05439v1	null
2025-01-11	Towards Balanced Continual Multi-Modal Learning in Human Pose Estimation	Jiaxuan Peng et.al.	2501.05264v2	null
2025-01-08	KN-LIO: Geometric Kinematics and Neural Field Coupled LiDAR-Inertial Odometry	Zhong Wang et.al.	2501.04263v1	null
2025-01-07	OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints	Mingjie Pan et.al.	2501.03841v1	null
2025-01-10	MC-VTON: Minimal Control Virtual Try-On Diffusion Transformer	Junsheng Luan et.al.	2501.03630v2	null
2025-01-07	TexHOI: Reconstructing Textures of 3D Unknown Objects in Monocular Hand-Object Interaction Scenes	Alakh Aggarwal et.al.	2501.03525v1	link
2025-01-06	Mobile Augmented Reality Framework with Fusional Localization and Pose Estimation	Songlin Hou et.al.	2501.03336v1	null
2025-01-06	SurgRIPE challenge: Benchmark of Surgical Robot Instrument Pose Estimation	Haozheng Xu et.al.	2501.02990v1	null
2025-01-06	HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos	Jinglei Zhang et.al.	2501.02973v1	null
2025-01-06	Spiking monocular event based 6D pose estimation for space application	Jonathan Courtois et.al.	2501.02916v1	null
2025-01-06	Universal Features Guided Zero-Shot Category-Level Object Pose Estimation	Wentian Qu et.al.	2501.02831v1	null
2025-01-06	Unsupervised Domain Adaptation for Occlusion Resilient Human Pose Estimation	Arindam Dutta et.al.	2501.02773v1	null
2025-01-06	WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation	Tianjian Jiang et.al.	2501.02771v1	null
2025-01-05	LP-ICP: General Localizability-Aware Point Cloud Registration for Robust Localization in Extreme Unstructured Environments	Haosong Yue et.al.	2501.02580v1	link
2025-01-04	ROLO-SLAM: Rotation-Optimized LiDAR-Only SLAM in Uneven Terrain with Ground Vehicle	Yinchuan Wang et.al.	2501.02166v1	link
2025-01-03	TCPFormer: Learning Temporal Correlation with Implicit Pose Proxy for 3D Human Pose Estimation	Jiajie Liu et.al.	2501.01770v1	link
2025-01-03	Laparoscopic Scene Analysis for Intraoperative Visualisation of Gamma Probe Signals in Minimally Invasive Cancer Surgery	Baoru Huang et.al.	2501.01752v1	null
2025-01-03	Free-Form Motion Control: A Synthetic Video Generation Dataset with Controllable Camera and Object Motions	Xincheng Shuai et.al.	2501.01425v2	null
2025-01-02	On Unifying Video Generation and Camera Pose Estimation	Chun-Hao Paul Huang et.al.	2501.01409v1	null
2025-01-02	L3D-Pose: Lifting Pose for 3D Avatars from a Single Camera in the Wild	Soumyaratna Debnath et.al.	2501.01174v1	null
2024-12-31	Relative Pose Observability Analysis Using Dual Quaternions	Nicholas B. Andrews et.al.	2501.00657v1	null
2024-12-31	VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception	Zhaoliang Wan et.al.	2501.00510v1	null
2024-12-30	Hierarchical Pose Estimation and Mapping with Multi-Scale Neural Feature Fields	Evgenii Kruzhkov et.al.	2412.20976v1	null
2024-12-30	ReFlow6D: Refraction-Guided Transparent Object 6D Pose Estimation via Intermediate Representation Learning	Hrishikesh Gupta et.al.	2412.20830v1	link
2024-12-30	Frequency-aware Event Cloud Network	Hongwei Ren et.al.	2412.20803v1	null
2024-12-30	KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences	Keng-Wei Chang et.al.	2412.20767v1	null
2024-12-30	Towards nation-wide analytical healthcare infrastructures: A privacy-preserving augmented knee rehabilitation case study	Boris Bačić et.al.	2412.20733v1	link
2024-12-29	Exploiting Aggregation and Segregation of Representations for Domain Adaptive Human Pose Estimation	Qucheng Peng et.al.	2412.20538v1	link
2024-12-28	MambaVO: Deep Visual Odometry Based on Sequential Matching Refinement and Training Smoothing	Shuo Wang et.al.	2412.20082v1	null
2024-12-28	GSplatLoc: Ultra-Precise Camera Localization via 3D Gaussian Splatting	Atticus J. Zeller et.al.	2412.20056v1	link
2024-12-27	Optimizing Local-Global Dependencies for Accurate 3D Human Pose Estimation	Guangsheng Xu et.al.	2412.19676v1	link
2024-12-27	Dust to Tower: Coarse-to-Fine Photo-Realistic Scene Reconstruction from Sparse Uncalibrated Images	Xudong Cai et.al.	2412.19518v1	null
2024-12-26	Humans as a Calibration Pattern: Dynamic 3D Scene Reconstruction from Unsynchronized and Uncalibrated Videos	Changwoon Choi et.al.	2412.19089v1	null
2024-12-23	Reconstructing People, Places, and Cameras	Lea Müller et.al.	2412.17806v1	null
2024-12-22	Leveraging Consistent Spatio-Temporal Correspondence for Robust Visual Odometry	Zhaoxing Zhang et.al.	2412.16923v1	null
2024-12-21	EasyVis2: A Real Time Multi-view 3D Visualization for Laparoscopic Surgery Training Enhanced by a Deep Neural Network YOLOv8-Pose	Yung-Hong Sun et.al.	2412.16742v1	null
2024-12-21	FACTS: Fine-Grained Action Classification for Tactical Sports	Christopher Lai et.al.	2412.16454v1	null
2024-12-20	Can Generative Video Models Help Pose Estimation?	Ruojin Cai et.al.	2412.16155v1	null
2024-12-20	Monkey Transfer Learning Can Improve Human Pose Estimation	Bradley Scott et.al.	2412.15966v1	null
2024-12-19	Scaling 4D Representations	João Carreira et.al.	2412.15212v1	null
2024-12-13	IMPROVE: Impact of Mobile Phones on Remote Online Virtual Education	Roberto Daza et.al.	2412.14195v1	link
2024-12-18	Level-Set Parameters: Novel Representation for 3D Shape Analysis	Huan Lei et.al.	2412.13502v1	null
2024-12-18	Pre-training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation	Xiaoqi An et.al.	2412.13454v1	link
2024-12-17	CondiMen: Conditional Multi-Person Mesh Recovery	Brégier Romain et.al.	2412.13058v1	null
2024-12-17	ShotVL: Human-Centric Highlight Frame Retrieval via Language Queries	Wangyu Xue et.al.	2412.12675v1	null
2024-12-16	Category Level 6D Object Pose Estimation from a Single RGB Image using Diffusion	Adam Bethell et.al.	2412.11420v1	null
2024-12-13	ExeChecker: Where Did I Go Wrong?	Yiwen Gu et.al.	2412.10573v1	null
2024-12-11	CUPS: Improving Human Pose-Shape Estimators with Conformalized Deep Uncertainty	Harry Zhang et.al.	2412.10431v1	null
2024-12-13	RP-SLAM: Real-time Photorealistic SLAM with Efficient 3D Gaussian Splatting	Lizhi Bai et.al.	2412.09868v1	null
2024-12-12	Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos	Linyi Jin et.al.	2412.09621v1	null
2024-12-12	FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction	Jiale Xu et.al.	2412.09573v1	null
2024-12-11	BLADE: Single-view Body Mesh Learning through Accurate Depth Estimation	Shengze Wang et.al.	2412.08640v1	null
2024-12-12	Drift-free Visual SLAM using Digital Twins	Roxane Merat et.al.	2412.08496v2	null
2024-12-11	Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization	Siyan Dong et.al.	2412.08376v1	link
2024-12-10	LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation Models	Ziqi Lu et.al.	2412.07746v1	null
2024-12-09	MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds	Zhenggang Tang et.al.	2412.06974v1	null
2024-12-09	An Efficient Scene Coordinate Encoding and Relocalization Method	Kuan Xu et.al.	2412.06488v1	link
2024-12-09	Attention-Enhanced Lightweight Hourglass Network for Human Pose Estimation	Marsha Mariya Kappan et.al.	2412.06227v1	null
2024-12-06	CCS: Continuous Learning for Customized Incremental Wireless Sensing Services	Qunhang Fu et.al.	2412.04821v1	null
2024-12-05	ProPLIKS: Probablistic 3D human body pose estimation	Karthik Shetty et.al.	2412.04665v1	null
2024-12-05	DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction	Ben Kaye et.al.	2412.04464v1	null
2024-12-05	Targeted Hard Sample Synthesis Based on Estimated Pose and Occlusion Error for Improved Object Pose Estimation	Alan Li et.al.	2412.04279v1	null
2024-12-04	Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis	Qitao Zhao et.al.	2412.03570v1	null
2024-12-06	NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images	Lingen Li et.al.	2412.03517v2	null
2024-12-05	A Bidirectional Siamese Recurrent Neural Network for Accurate Gait Recognition Using Body Landmarks	Proma Hossain Progga et.al.	2412.03498v2	null
2024-12-04	MCVO: A Generic Visual Odometry for Arbitrarily Arranged Multi-Cameras	Huai Yu et.al.	2412.03146v1	link
2024-12-04	An indoor DSO-based ceiling-vision odometry system for indoor industrial environments	Abdelhak Bougouffa et.al.	2412.02950v1	null
2024-12-03	EgoCast: Forecasting Egocentric Human Pose in the Wild	Maria Escobar et.al.	2412.02903v1	null
2024-12-02	emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation	Sasha Salter et.al.	2412.02725v1	link
2024-12-03	ProbPose: A Probabilistic Approach to 2D Human Pose Estimation	Miroslav Purkrabek et.al.	2412.02254v1	null
2024-12-03	Cascaded Multi-Scale Attention for Enhanced Multi-Scale Feature Extraction and Interaction with Low-Resolution Images	Xiangyong Lu et.al.	2412.02197v1	link
2024-12-03	CLERF: Contrastive LEaRning for Full Range Head Pose Estimation	Ting-Ruen Wei et.al.	2412.02066v1	null
2024-12-02	Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle	Miroslav Purkrabek et.al.	2412.01562v1	link
2024-12-02	6DOPE-GS: Online 6D Object Pose Estimation using Gaussian Splatting	Yufeng Jin et.al.	2412.01543v1	null
2024-12-02	HandOS: 3D Hand Reconstruction in One Stage	Xingyu Chen et.al.	2412.01537v1	null
2024-12-02	SF-Loc: A Visual Mapping and Geo-Localization System based on Sparse Visual Structure Frames	Yuxuan Zhou et.al.	2412.01500v1	link
2024-12-02	MamKPD: A Simple Mamba Baseline for Real-Time 2D Keypoint Detection	Yonghao Dang et.al.	2412.01422v1	null
2024-12-02	Cross-Modal Visual Relocalization in Prior LiDAR Maps Utilizing Intensity Textures	Qiyuan Shen et.al.	2412.01299v1	null
2024-12-02	CRISP: Object Pose and Shape Estimation with Test-Time Adaptation	Jingnan Shi et.al.	2412.01052v1	null
2024-11-29	Diorama: Unleashing Zero-shot Single-view 3D Scene Modeling	Qirui Wu et.al.	2411.19492v1	null
2024-11-29	Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning	Yang You et.al.	2411.19458v1	link
2024-11-28	GMS-VINS:Multi-category Dynamic Objects Semantic Segmentation for Enhanced Visual-Inertial Odometry Using a Promptable Foundation Model	Rui Zhou et.al.	2411.19289v1	null
2024-11-28	HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos	Prithviraj Banerjee et.al.	2411.19167v1	null
2024-11-28	Lost & Found: Updating Dynamic 3D Scene Graphs from Egocentric Observations	Tjark Behrens et.al.	2411.19162v1	link
2024-11-28	Distributed Dual Quaternion Extended Kalman Filtering for Spacecraft Pose Estimation	Mathias Hudoba de Badyn et.al.	2411.19033v1	null
2024-11-28	Waterfall Transformer for Multi-person Pose Estimation	Navin Ranjan et.al.	2411.18944v1	null
2024-12-02	AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers	Sherwin Bahmani et.al.	2411.18673v2	null
2024-11-27	XR-MBT: Multi-modal Full Body Tracking for XR through Self-Supervision with Learned Depth Point Cloud Registration	Denys Rozumnyi et.al.	2411.18377v1	null
2024-11-27	Manual-PA: Learning 3D Part Assembly from Instruction Diagrams	Jiahao Zhang et.al.	2411.18011v1	null
2024-11-26	Self-supervised Monocular Depth and Pose Estimation for Endoscopy with Generative Latent Priors	Ziang Xu et.al.	2411.17790v1	null
2024-11-26	Geometric Point Attention Transformer for 3D Shape Reassembly	Jiahan Li et.al.	2411.17788v1	null
2024-11-26	RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training	Raktim Gautam Goswami et.al.	2411.17662v1	null
2024-11-26	Communication-Efficient Cooperative SLAMMOT via Determining the Number of Collaboration Vehicles	Susu Fang et.al.	2411.17432v1	null
2024-11-26	Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration	Junyuan Deng et.al.	2411.17240v1	link
2024-11-28	SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting	Gyeongjin Kang et.al.	2411.17190v3	null
2024-11-26	GMFlow: Global Motion-Guided Recurrent Flow for 6D Object Pose Estimation	Xin Liu et.al.	2411.17174v1	null
2024-11-25	Diffusion Features for Zero-Shot 6DoF Object Pose Estimation	Bernd Von Gimborn et.al.	2411.16668v1	null
2024-11-25	Edge Weight Prediction For Category-Agnostic Pose Estimation	Or Hirschorn et.al.	2411.16665v1	link
2024-11-25	SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis	Hyojun Go et.al.	2411.16443v1	link
2024-11-25	One Diffusion to Generate Them All	Duong H. Le et.al.	2411.16318v1	link
2024-11-25	UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image	Xingyu Liu et.al.	2411.16106v1	null
2024-11-24	Generalizable Single-view Object Pose Estimation by Two-side Generating and Matching	Yujing Sun et.al.	2411.15860v1	link
2024-11-24	PEnG: Pose-Enhanced Geo-Localisation	Tavis Shore et.al.	2411.15742v1	null
2024-11-22	Personalization of Wearable Sensor-Based Joint Kinematic Estimation Using Computer Vision for Hip Exoskeleton Applications	Changseob Song et.al.	2411.15366v1	null
2024-11-22	Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation	Huy Le et.al.	2411.14913v1	null
2024-11-22	mmWave Radar for Sit-to-Stand Analysis: A Comparative Study with Wearables and Kinect	Shuting Hu et.al.	2411.14656v1	null
2024-11-21	DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding	Tianhe Ren et.al.	2411.14347v1	link
2024-11-21	SEMPose: A Single End-to-end Network for Multi-object Pose Estimation	Xin Liu et.al.	2411.14002v1	null
2024-11-21	Dehazing-aided Multi-Rate Multi-Modal Pose Estimation Framework for Mitigating Visual Disturbances in Extreme Underwater Domain	Vidya Sudevan et.al.	2411.13988v1	null
2024-11-21	Hybrid-Neuromorphic Approach for Underwater Robotics Applications: A Conceptual Framework	Vidya Sudevan et.al.	2411.13962v1	null
2024-11-20	Developing Normative Gait Cycle Parameters for Clinical Analysis Using Human Pose Estimation	Rahm Ranjan et.al.	2411.13716v1	null
2024-11-20	Robust SG-NeRF: Robust Scene Graph Aided Neural Surface Reconstruction	Yi Gu et.al.	2411.13620v1	null
2024-11-19	VioPose: Violin Performance 4D Pose Estimation by Hierarchical Audiovisual Inference	Seong Jong Yoo et.al.	2411.13607v1	link
2024-11-20	DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild	Weicai Ye et.al.	2411.13291v1	null
2024-11-20	X as Supervision: Contending with Depth Ambiguity in Unsupervised Monocular 3D Pose Estimation	Yuchen Yang et.al.	2411.13026v1	link
2024-11-19	IoT-Based 3D Pose Estimation and Motion Optimization for Athletes: Application of C3D and OpenPose	Fei Ren et.al.	2411.12676v1	null
2024-11-15	SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction	Yutao Tang et.al.	2411.12592v1	link
2024-11-19	GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping	Teli Ma et.al.	2411.12286v1	null
2024-11-18	IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos	Yunong Liu et.al.	2411.11409v1	link
2024-11-15	USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting	Kang Chen et.al.	2411.10504v1	link
2024-11-13	ReMP: Reusable Motion Prior for Multi-domain 3D Human Pose Estimation and Motion Inbetweening	Hojun Jang et.al.	2411.09435v1	null
2024-11-13	Generalized Pose Space Embeddings for Training In-the-Wild using Analysis-by-Synthesis	Dominik Borer et.al.	2411.08603v1	null
2024-11-13	DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization	Yueming Xu et.al.	2411.08373v1	null
2024-11-16	RINO: Accurate, Robust Radar-Inertial Odometry with Non-Iterative Estimation	Shuocheng Yang et.al.	2411.07699v2	link
2024-11-12	Human Arm Pose Estimation with a Shoulder-worn Force-Myography Device for Human-Robot Interaction	Rotem Atari et.al.	2411.07644v1	null
2024-11-12	Towards Seamless Integration of Magnetic Tracking into Fluoroscopy-guided Interventions	Shuwei Xing et.al.	2411.07495v1	null
2024-11-08	Acoustic-based 3D Human Pose Estimation Robust to Human Position	Yusuke Oumi et.al.	2411.07165v1	null
2024-11-11	CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models	Junho Kim et.al.	2411.06869v1	null
2024-11-11	GenZ-ICP: Generalizable and Degeneracy-Robust LiDAR Odometry Using an Adaptive Weighting	Daehan Lee et.al.	2411.06766v1	link
2024-11-11	GTA-Net: An IoT-Integrated 3D Human Pose Estimation System for Real-Time Adolescent Sports Posture Correction	Shizhe Yuan et.al.	2411.06725v1	null
2024-11-10	Magnetic Field Aided Vehicle Localization with Acceleration Correction	Mrunmayee Deshpande et.al.	2411.06543v1	null
2024-11-10	Visuotactile-Based Learning for Insertion with Compliant Hands	Osher Azulay et.al.	2411.06408v1	link
2024-11-08	Poze: Sports Technique Feedback under Data Constraints	Agamdeep Singh et.al.	2411.05734v1	null
2024-11-08	DeepArUco++: Improved detection of square fiducial markers in challenging lighting conditions	Rafael Berral-Soler et.al.	2411.05552v1	link
2024-11-08	Tightly-Coupled, Speed-aided Monocular Visual-Inertial Localization in Topological Map	Chanuk Yang et.al.	2411.05497v1	null
2024-11-08	Relative Pose Estimation for Nonholonomic Robot Formation with UWB-IO Measurements	Kunrui Ze et.al.	2411.05481v1	null
2024-11-07	Social EgoMesh Estimation	Luca Scofano et.al.	2411.04598v1	link
2024-11-07	Pose2Trajectory: Using Transformers on Body Pose to Predict Tennis Player's Trajectory	Ali K. AlShami et.al.	2411.04501v1	null
2024-11-08	SuperQ-GRASP: Superquadrics-based Grasp Pose Estimation on Larger Objects for Mobile-Manipulation	Xun Tu et.al.	2411.04386v2	null
2024-11-08	GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting	Jilan Mei et.al.	2411.03807v3	null
2024-11-06	Estimation of Psychosocial Work Environment Exposures Through Video Object Detection. Proof of Concept Using CCTV Footage	Claus D. Hansen et.al.	2411.03724v1	null
2024-11-05	Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data	Seunggeun Chi et.al.	2411.03561v1	null
2024-11-05	HFGaussian: Learning Generalizable Gaussian Human with Integrated Human Features	Arnab Dey et.al.	2411.03086v1	null
2024-11-04	Semantic Masking and Visual Feature Matching for Robust Localization	Luisa Mao et.al.	2411.01804v1	null
2024-11-03	Activating Self-Attention for Multi-Scene Absolute Pose Regression	Miso Lee et.al.	2411.01443v1	link
2024-11-04	3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction	Jongmin Lee et.al.	2411.00543v2	null
2024-10-31	Whole-Herd Elephant Pose Estimation from Drone Data for Collective Behavior Analysis	Brody McNutt et.al.	2411.00196v1	null
2024-10-31	No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images	Botao Ye et.al.	2410.24207v1	link
2024-11-06	SceneComplete: Open-World 3D Scene Completion in Complex Real World Environments for Robot Manipulation	Aditya Agarwal et.al.	2410.23643v2	null
2024-10-30	SCRREAM: SCan, Register, REnder And Map: A Framework for Annotating Accurate and Dense 3D Indoor Scenes with a Benchmark	HyunJun Jung et.al.	2410.22715v1	link
2024-10-29	LiVisSfM: Accurate and Robust Structure-from-Motion with LiDAR and Visual Cues	Hanqing Jiang et.al.	2410.22213v1	null
2024-10-29	PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting	Sunghwan Hong et.al.	2410.22128v1	link
2024-10-29	HRPVT: High-Resolution Pyramid Vision Transformer for medium and small-scale human pose estimation	Zhoujie Xu et.al.	2410.22079v1	null
2024-10-29	EI-Nexus: Towards Unmediated and Flexible Inter-Modality Local Feature Extraction and Matching for Event-Image Data	Zhonghua Yi et.al.	2410.21743v1	link
2024-10-28	Synthetica: Large Scale Synthetic Data for Robot Perception	Ritvik Singh et.al.	2410.21153v1	null
2024-10-29	BLAPose: Enhancing 3D Human Pose Estimation with Bone Length Adjustment	Chih-Hsiang Hsu et.al.	2410.20731v2	link
2024-11-01	RopeTP: Global Human Motion Recovery via Integrating Robust Pose Estimation with Diffusion Trajectory Prior	Mingjiang Liang et.al.	2410.20358v2	null
2024-10-27	Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions	Rawal Khirodkar et.al.	2410.20294v1	null
2024-10-26	Neural Fields in Robotics: A Survey	Muhammad Zubair Irshad et.al.	2410.20220v1	link
2024-10-25	DECADE: Towards Designing Efficient-yet-Accurate Distance Estimation Modules for Collision Avoidance in Mobile Advanced Driver Assistance Systems	Muhammad Zaeem Shahzad et.al.	2410.19336v1	null
2024-10-24	Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction	Junyi Chen et.al.	2410.18962v1	null
2024-10-24	VoxelKeypointFusion: Generalizable Multi-View Multi-Person Pose Estimation	Daniel Bermuth et.al.	2410.18723v1	link
2024-10-23	Robust Two-View Geometry Estimation with Implicit Differentiation	Vladislav Pyatov et.al.	2410.17983v1	link
2024-10-23	YOLOv11: An Overview of the Key Architectural Enhancements	Rahima Khanam et.al.	2410.17725v1	link
2024-10-21	Assisted Physical Interaction: Autonomous Aerial Robots with Neural Network Detection, Navigation, and Safety Layers	Andrea Berra et.al.	2410.15802v1	null
2024-10-21	ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos	Tao Tang et.al.	2410.15582v1	link
2024-10-20	Neural Active Structure-from-Motion in Dark and Textureless Environment	Kazuto Ichimaru et.al.	2410.15378v1	null
2024-10-20	POSE: Pose estimation Of virtual Sync Exhibit system	Hao-Tang Tsui et.al.	2410.15343v1	link
2024-10-18	Graph Optimality-Aware Stochastic LiDAR Bundle Adjustment with Progressive Spatial Smoothing	Jianping Li et.al.	2410.14565v1	null
2024-10-18	Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior	Calvin-Khang Ta et.al.	2410.14540v1	null
2024-10-18	Sim2real Cattle Joint Estimation in 3D point clouds	Okour Mohammad et.al.	2410.14419v1	null
2024-10-18	Unlabeled Action Quality Assessment Based on Multi-dimensional Adaptive Constrained Dynamic Time Warping	Renguang Chen et.al.	2410.14161v1	null
2024-10-15	From Real Artifacts to Virtual Reference: A Robust Framework for Translating Endoscopic Images	Junyang Wu et.al.	2410.13896v1	null
2024-10-17	DualQuat-LOAM: LiDAR Odometry and Mapping parametrized on Dual Quaternions	Edison P. Velasco-Sánchez et.al.	2410.13541v1	null
2024-10-17	Object Pose Estimation Using Implicit Representation For Transparent Objects	Varun Burde et.al.	2410.13465v1	null
2024-10-16	Optimizing Multi-Task Learning for Accurate Spacecraft Pose Estimation	Francesco Evangelisti et.al.	2410.12679v1	null
2024-10-15	Contrastive Touch-to-Touch Pretraining	Samanta Rodriguez et.al.	2410.11834v1	null
2024-10-18	X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing	Xinyan Chen et.al.	2410.10167v2	null
2024-10-13	Occluded Human Pose Estimation based on Limb Joint Augmentation	Gangtao Han et.al.	2410.09885v1	null
2024-10-12	Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors	Hritam Basak et.al.	2410.09467v1	null
2024-10-12	Towards Multi-Modal Animal Pose Estimation: An In-Depth Analysis	Qianyi Deng et.al.	2410.09312v1	link
2024-10-11	CVAM-Pose: Conditional Variational Autoencoder for Multi-Object Monocular Pose Estimation	Jianyu Zhao et.al.	2410.09010v1	link
2024-10-11	Look Gauss, No Pose: Novel View Synthesis using Gaussian Splatting without Accurate Pose Initialization	Christian Schmidt et.al.	2410.08743v1	link
2024-10-10	Generalizing Stochastic Smoothing for Differentiation and Gradient Estimation	Felix Petersen et.al.	2410.08125v1	null
2024-10-10	Robotic framework for autonomous manipulation of laboratory equipment with different degrees of transparency via 6D pose estimation	Maria Makarova et.al.	2410.07801v1	null
2024-10-10	Optimal-State Dynamics Estimation for Physics-based Human Motion Capture from Videos	Cuong Le et.al.	2410.07795v1	link
2024-10-12	Autonomous Driving in Unstructured Environments: How Far Have We Come?	Chen Min et.al.	2410.07701v2	link
2024-10-10	Invisibility Cloak: Disappearance under Human Pose Estimation via Backdoor Attacks	Minxing Zhang et.al.	2410.07670v1	null
2024-10-09	OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB	Yunzhi Lin et.al.	2410.06694v1	null
2024-10-08	SpecTrack: Learned Multi-Rotation Tracking via Speckle Imaging	Ziyang Chen et.al.	2410.06028v1	link
2024-10-08	AIVIO: Closed-loop, Object-relative Navigation of UAVs with AI-aided Visual Inertial Odometry	Thomas Jantos et.al.	2410.05996v1	null
2024-10-08	Are Minimal Radial Distortion Solvers Necessary for Relative Pose Estimation?	Charalambos Tzamos et.al.	2410.05984v1	link
2024-10-08	FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance	Ruocheng Wang et.al.	2410.05791v1	null
2024-10-07	Comparison of marker-less 2D image-based methods for infant pose estimation	Lennart Jahn et.al.	2410.04980v1	null
2024-10-06	Enhancing 3D Human Pose Estimation Amidst Severe Occlusion with Dual Transformer Fusion	Mehwish Ghafoor et.al.	2410.04574v1	link
2024-10-06	LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation	Jianhao Jiao et.al.	2410.04419v1	null
2024-10-05	Test-Time Adaptation for Keypoint-Based Spacecraft Pose Estimation Based on Predicted-View Synthesis	Juan Ignacio Bravo Pérez-Villar et.al.	2410.04298v1	link
2024-10-05	A Framework for Reproducible Benchmarking and Performance Diagnosis of SLAM Systems	Nikola Radulov et.al.	2410.04242v1	link
2024-10-04	Unsupervised Prior Learning: Discovering Categorical Pose Priors from Videos	Ziyu Wang et.al.	2410.03858v1	null
2024-10-04	Universal Global State Estimation for Inertial Navigation Systems	Sifeddine Benahmed et.al.	2410.03846v1	null
2024-10-04	MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion	Junyi Zhang et.al.	2410.03825v1	null
2024-10-04	Dessie: Disentanglement for Articulated 3D Horse Shape and Pose Estimation from Images	Ci Li et.al.	2410.03438v1	null
2024-10-04	HRVMamba: High-Resolution Visual State Space Model for Dense Prediction	Hao Zhang et.al.	2410.03174v1	null
2024-10-04	CLIP-Clique: Graph-based Correspondence Matching Augmented by Vision Language Models for Object-based Global Localization	Shigemichi Matsuzaki et.al.	2410.03054v1	null
2024-10-03	Why Sample Space Matters: Keyframe Sampling Optimization for LiDAR-based Place Recognition	Nikolaos Stathoulopoulos et.al.	2410.02643v1	link
2024-10-03	Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features	Chengkai Hou et.al.	2410.02237v1	null
2024-10-02	SGBA: Semantic Gaussian Mixture Model-Based LiDAR Bundle Adjustment	Xingyu Ji et.al.	2410.01618v1	null
2024-10-02	SurgeoNet: Realtime 3D Pose Estimation of Articulated Surgical Instruments from Stereo Images using a Synthetically-trained Network	Ahmed Tawfik Aboukhadra et.al.	2410.01293v1	null
2024-10-01	Pose Estimation of Buried Deep-Sea Objects using 3D Vision Deep Learning Models	Jerry Yan et.al.	2410.01061v1	null
2024-10-01	RAD: A Dataset and Benchmark for Real-Life Anomaly Detection with Robotic Observations	Kaichen Zhou et.al.	2410.00713v1	link
2024-10-01	GERA: Geometric Embedding for Efficient Point Registration Analysis	Geng Li et.al.	2410.00589v1	null
2024-09-30	Continual Human Pose Estimation for Incremental Integration of Keypoints and Pose Variations	Muhammad Saif Ullah Khan et.al.	2409.20469v1	null
2024-09-30	Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies	Shalini Sarode et.al.	2409.20237v1	null
2024-09-30	PuzzleBoard: A New Camera Calibration Pattern with Position Encoding	Peer Stelldinger et.al.	2409.20127v1	link
2024-09-30	Robust Gaussian Splatting SLAM by Leveraging Loop Closure	Zunjie Zhu et.al.	2409.20111v1	null
2024-09-30	GearTrack: Automating 6D Pose Estimation	Yu Deng et.al.	2409.19986v1	null
2024-09-29	PPLNs: Parametric Piecewise Linear Networks for Event-Based Temporal Modeling and Beyond	Chen Song et.al.	2409.19772v1	link
2024-09-29	GelSlim 4.0: Focusing on Touch and Reproducibility	Andrea Sipos et.al.	2409.19770v1	null
2024-09-27	Robust Proximity Operations using Probabilistic Markov Models	Deep Parikh et.al.	2409.19062v1	null
2024-09-27	Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras	Yipeng Lu et.al.	2409.18673v1	null
2024-09-27	DynaWeightPnP: Toward global real-time 3D-2D solver in PnP without correspondences	Jingwei Song et.al.	2409.18457v1	null
2024-09-30	Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation	Mengchen Zhang et.al.	2409.18261v2	link
2024-09-26	AI-Powered Augmented Reality for Satellite Assembly, Integration and Test	Alvaro Patricio et.al.	2409.18101v1	null
2024-09-27	Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes	Katja Ludwig et.al.	2409.17671v2	null
2024-09-25	Safe Leaf Manipulation for Accurate Shape and Pose Estimation of Occluded Fruits	Shaoxiong Yao et.al.	2409.17389v1	null
2024-09-25	Hierarchical Tri-manual Planning for Vision-assisted Fruit Harvesting with Quadrupedal Robots	Zhichao Liu et.al.	2409.17116v1	null
2024-09-25	Self-Sensing for Proprioception and Contact Detection in Soft Robots Using Shape Memory Alloy Artificial Muscles	Ran Jing et.al.	2409.17111v1	null
2024-09-25	Online 6DoF Pose Estimation in Forests using Cross-View Factor Graph Optimisation and Deep Learned Re-localisation	Lucas Carvalho de Lima et.al.	2409.16680v1	null
2024-09-25	FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation	Jingyi Tang et.al.	2409.16600v1	null
2024-09-25	Robo-Platform: A Robotic System for Recording Sensors and Controlling Robots	Masoud Dayani Najafabadi et.al.	2409.16595v1	link
2024-09-24	PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings	Sutharsan Mahendren et.al.	2409.15832v1	null
2024-09-24	LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation	Ruida Zhang et.al.	2409.15727v1	link
2024-09-23	Framework for Robust Localization of UUVs and Mapping of Net Pens	David Botta et.al.	2409.15475v1	null
2024-09-23	FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera	Guoyang Zhao et.al.	2409.15054v1	link
2024-09-23	BranchPoseNet: Characterizing tree branching with a deep learning-based pose estimation approach	Stefano Puliti et.al.	2409.14755v1	link
2024-09-23	ERPoT: Effective and Reliable Pose Tracking for Mobile Robots Based on Lightweight and Compact Polygon Maps	Haiming Gao et.al.	2409.14723v1	null
2024-09-22	Tactile Functasets: Neural Implicit Representations of Tactile Datasets	Sikai Li et.al.	2409.14592v1	null
2024-09-22	AR Overlay: Training Image Pose Estimation on Curved Surface in a Synthetic Way	Sining Huang et.al.	2409.14577v1	null
2024-09-22	DROP: Dexterous Reorientation via Online Planning	Albert H. Li et.al.	2409.14562v1	null
2024-09-21	Combining Absolute and Semi-Generalized Relative Poses for Visual Localization	Vojtech Panek et.al.	2409.14269v1	null
2024-09-18	SpotLight: Robotic Scene Understanding through Interaction and Affordance Detection	Tim Engelbracht et.al.	2409.11870v1	link
2024-09-18	End-to-End Probabilistic Geometry-Guided Regression for 6DoF Object Pose Estimation	Thomas Pöllabauer et.al.	2409.11819v1	null
2024-09-18	Bridging Domain Gap for Flight-Ready Spaceborne Vision	Tae Ha Park et.al.	2409.11661v1	null
2024-09-17	Good Grasps Only: A data engine for self-supervised fine-tuning of pose estimation using grasp poses for verification	Frederik Hagelskjær et.al.	2409.11512v1	null
2024-09-17	Training Datasets Generation for Machine Learning: Application to Vision Based Navigation	Jérémy Lebreton et.al.	2409.11383v1	null
2024-09-17	OmniGen: Unified Image Generation	Shitao Xiao et.al.	2409.11340v1	link
2024-09-17	ULOC: Learning to Localize in Complex Large-Scale Environments with Ultra-Wideband Ranges	Thien-Minh Nguyen et.al.	2409.11122v1	link
2024-09-17	Depth-based Privileged Information for Boosting 3D Human Pose Estimation on RGB	Alessandro Simoni et.al.	2409.11104v1	null
2024-09-21	HGSLoc: 3DGS-based Heuristic Camera Pose Refinement	Zhongyan Niu et.al.	2409.10925v2	null
2024-09-17	Pose estimation of CubeSats via sensor fusion and Error-State Extended Kalman Filter	Deep Parikh et.al.	2409.10815v1	null
2024-09-16	CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera	Jingpei Lu et.al.	2409.10441v1	null
2024-09-16	HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models	Vineet Bhat et.al.	2409.10419v1	null
2024-09-16	2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation?	Téo Guichoux et.al.	2409.10357v1	null
2024-09-16	Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference	Huy-Dung Nguyen et.al.	2409.10095v1	null
2024-09-15	Precise Pick-and-Place using Score-Based Diffusion Networks	Shih-Wei Guo et.al.	2409.09725v1	null
2024-09-15	Pre-Training for 3D Hand Pose Estimation with Contrastive Learning on Large-Scale Hand Images in the Wild	Nie Lin et.al.	2409.09714v1	null
2024-09-15	Proximity operations of CubeSats via sensor fusion of ultra-wideband range measurements with rate gyroscopes, accelerometers and monocular vision	Deep Parikh et.al.	2409.09665v1	null
2024-09-15	A Scalable Tabletop Satellite Automation Testbed: Design And Experiments	Deep Parikh et.al.	2409.09633v1	null
2024-09-14	MAC-VO: Metrics-aware Covariance for Learning-based Stereo Visual Odometry	Yuheng Qiu et.al.	2409.09479v1	null
2024-09-14	Distributed Invariant Kalman Filter for Object-level Multi-robot Pose SLAM	Haoying Li et.al.	2409.09410v1	null
2024-09-13	Causal Transformer for Fusion and Pose Estimation in Deep Visual Inertial Odometry	Yunus Bilge Kurt et.al.	2409.08769v1	link
2024-09-13	WheelPoser: Sparse-IMU Based Body Pose Estimation for Wheelchair Users	Yunzhi Li et.al.	2409.08494v1	link
2024-09-12	Bayesian Inverse Graphics for Few-Shot Concept Learning	Octavio Arriaga et.al.	2409.08351v1	link
2024-09-12	Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation	Samanta Rodriguez et.al.	2409.08269v1	null
2024-09-12	Covariance Intersection-based Invariant Kalman Filtering (DInCIKF) for Distributed Pose Estimation	Haoying Li et.al.	2409.07933v1	null
2024-09-12	GateAttentionPose: Enhancing Pose Estimation with Agent Attention and Improved Gated Convolutions	Liang Feng et.al.	2409.07798v1	null
2024-09-12	GatedUniPose: A Novel Approach for Pose Estimation Combining UniRepLKNet and Gated Convolution	Liang Feng et.al.	2409.07752v1	null
2024-09-11	FaVoR: Features via Voxel Rendering for Camera Relocalization	Vincenzo Polizzi et.al.	2409.07571v1	null
2024-09-11	Benchmarking 2D Egocentric Hand Pose Datasets	Olga Taran et.al.	2409.07337v1	null
2024-09-11	iKalibr-RGBD: Partially-Specialized Target-Free Visual-Inertial Spatiotemporal Calibration For RGBDs via Continuous-Time Velocity Estimation	Shuolong Chen et.al.	2409.07116v1	link
2024-09-11	Equivariant Filter for Tightly Coupled LiDAR-Inertial Odometry	Anbo Tao et.al.	2409.06948v1	null
2024-09-13	A Bayesian framework for active object recognition, pose estimation and shape transfer learning through touch	Haodong Zheng et.al.	2409.06912v2	null
2024-09-11	Alignist: CAD-Informed Orientation Distribution Estimation by Fusing Shape and Correspondences	Shishir Reddy Vutukur et.al.	2409.06683v2	link
2024-09-10	PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation	Ginger Delmas et.al.	2409.06535v1	null
2024-09-10	Test-Time Certifiable Self-Supervision to Bridge the Sim2Real Gap in Event-Based Satellite Pose Estimation	Mohsi Jawaid et.al.	2409.06240v1	null
2024-09-09	From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models	Tessa Pulli et.al.	2409.05413v1	null
2024-09-08	HelmetPoser: A Helmet-Mounted IMU Dataset for Data-Driven Estimation of Human Head Motion in Diverse Conditions	Jianping Li et.al.	2409.05006v1	null
2024-09-06	Casper DPM: Cascaded Perceptual Dynamic Projection Mapping onto Hands	Yotam Erel et.al.	2409.04397v1	null
2024-09-06	GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers	Lorenza Prospero et.al.	2409.04196v1	null
2024-09-06	Dense Hand-Object (HO) GraspNet with Full Grasping Taxonomy and Dynamics	Woojin Cho et.al.	2409.04033v1	null
2024-09-06	Matched Filtering based LiDAR Place Recognition for Urban and Natural Environments	Therese Joseph et.al.	2409.03998v1	null
2024-09-09	The Influence of Faulty Labels in Data Sets on Human Pose Estimation	Arnold Schwarz et.al.	2409.03887v2	null
2024-09-05	MaskVal: Simple but Effective Uncertainty Quantification for 6D Pose Estimation	Philipp Quentin et.al.	2409.03556v1	null
2024-09-05	UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking	Md. Mahfuzur Rahman et.al.	2409.03245v1	null
2024-09-01	Recoverable Anonymization for Pose Estimation: A Privacy-Enhancing Approach	Wenjun Huang et.al.	2409.02715v1	null
2024-09-04	Object Gaussian for Monocular 6D Pose Estimation from Sparse Views	Luqing Luo et.al.	2409.02581v1	null
2024-09-03	EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision	Yiming Zhao et.al.	2409.02224v1	null
2024-09-03	Deep learning for objective estimation of Parkinsonian tremor severity	Felipe Duque-Quiceno et.al.	2409.02011v1	null
2024-09-03	SPiKE: 3D Human Pose from Point Cloud Sequences	Irene Ballester et.al.	2409.01879v1	link
2024-09-02	Kalman Filtering for Precise Indoor Position and Orientation Estimation Using IMU and Acoustics on Riemannian Manifolds	Mohammed H. AlSharif et.al.	2409.01002v1	null
2024-09-01	Detection, Recognition and Pose Estimation of Tabletop Objects	Sanjuksha Nirgude et.al.	2409.00869v1	null
2024-09-01	DSLO: Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation	Huixin Zhang et.al.	2409.00744v1	link
2024-09-01	MoManifold: Learning to Measure 3D Human Motion via Decoupled Joint Acceleration Manifolds	Ziqiang Dang et.al.	2409.00736v1	null
2024-08-31	ActionPose: Pretraining 3D Human Pose Estimation with the Dark Knowledge of Action	Longyun Liao et.al.	2409.00449v1	null
2024-09-04	Augmented Reality without Borders: Achieving Precise Localization Without Maps	Albert Gassol Puigjaner et.al.	2408.17373v3	null
2024-08-30	BOP-D: Revisiting 6D Pose Estimation Benchmark for Better Evaluation under Visual Ambiguities	Boris Meden et.al.	2408.17297v1	null
2024-08-30	EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs	Zhen Fan et.al.	2408.17168v1	null
2024-09-01	Generic Objects as Pose Probes for Few-Shot View Synthesis	Zhirui Gao et.al.	2408.16690v2	null
2024-08-29	OP-Align: Object-level and Part-level Alignment for Self-supervised Category-level Articulated Object Pose Estimation	Yuchen Che et.al.	2408.16547v1	link
2024-08-29	GRPose: Learning Graph Relations for Human Image Generation with Pose Priors	Xiangchen Yin et.al.	2408.16540v1	link
2024-08-28	Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data Generation Toolkit for Auditing 3D Human Pose Estimators	Nikita Kister et.al.	2408.16536v1	null
2024-08-28	Multi-view Pose Fusion for Occlusion-Aware 3D Human Pose Estimation	Laura Bragagnolo et.al.	2408.15810v1	link
2024-08-30	Addressing the challenges of loop detection in agricultural environments	Nicolás Soncini et.al.	2408.15761v2	link
2024-08-28	Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph	Zherong Zhang et.al.	2408.15750v1	null
2024-08-28	Benchmarking ML Approaches to UWB-Based Range-Only Posture Recognition for Human-Robot Interaction	Salma Salimi et.al.	2408.15717v1	null
2024-08-26	Bengali Sign Language Recognition through Hand Pose Estimation using Multi-Branch Spatial-Temporal Attention Model	Abu Saleh Musa Miah et.al.	2408.14111v1	null
2024-08-25	InterTrack: Tracking Human Object Interaction without Object Templates	Xianghui Xie et.al.	2408.13953v1	null
2024-08-24	Temporally-consistent 3D Reconstruction of Birds	Johannes Hägerlind et.al.	2408.13629v1	null
2024-08-24	Explainable Convolutional Networks for Crater Detection and Lunar Landing Navigation	Jianing Song et.al.	2408.13587v1	null
2024-08-27	Sapiens: Foundation for Human Vision Models	Rawal Khirodkar et.al.	2408.12569v3	null
2024-08-21	GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting	Wanshui Gan et.al.	2408.11447v1	link
2024-08-20	GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting	Changkun Liu et.al.	2408.11085v1	link
2024-08-20	ZebraPose: Zebra Detection and Pose Estimation using only Synthetic Data	Elia Bonetto et.al.	2408.10831v1	null
2024-08-20	MPL: Lifting 3D Human Pose from Multi-view 2D Poses	Seyed Abolfazl Ghasemzadeh et.al.	2408.10805v1	link
2024-08-19	RUMI: Rummaging Using Mutual Information	Sheng Zhong et.al.	2408.10450v1	null
2024-08-19	SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views	Chao Xu et.al.	2408.10195v1	null
2024-08-19	SHARP: Segmentation of Hands and Arms by Range using Pseudo-Depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition	Wiktor Mucha et.al.	2408.10037v1	link
2024-08-19	Pose-GuideNet: Automatic Scanning Guidance for Fetal Head Ultrasound from Pose Estimation	Qianhui Men et.al.	2408.09931v1	null
2024-08-18	OPPH: A Vision-Based Operator for Measuring Body Movements for Personal Healthcare	Chen Long-fei et.al.	2408.09409v1	null
2024-08-17	An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface	Kevin Jose Thomas et.al.	2408.09311v1	link
2024-08-16	ADen: Adaptive Density Representations for Sparse-view Camera Pose Estimation	Hao Tang et.al.	2408.09042v1	null
2024-08-16	Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS	Wei Sun et.al.	2408.08723v1	null
2024-08-16	SketchRef: A Benchmark Dataset and Evaluation Metrics for Automated Sketch Synthesis	Xingyue Lin et.al.	2408.08623v1	null
2024-08-15	HyperTaxel: Hyper-Resolution for Taxel-Based Tactile Signals Through Contrastive Learning	Hongyu Li et.al.	2408.08312v1	null
2024-08-15	Comparative Evaluation of 3D Reconstruction Methods for Object Pose Estimation	Varun Burde et.al.	2408.08234v1	link
2024-08-15	Towards Practical Human Motion Prediction with LiDAR Point Clouds	Xiao Han et.al.	2408.08202v1	null
2024-08-15	Your Turn: Real-World Turning Angle Estimation for Parkinson's Disease Severity Assessment	Qiushuo Cheng et.al.	2408.08182v1	null
2024-08-15	Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models	Tianyu Wang et.al.	2408.07975v1	null
2024-08-15	GOReloc: Graph-based Object-Level Relocalization for Visual SLAM	Yutong Wang et.al.	2408.07917v1	link
2024-08-13	Grasping by Hanging: a Learning-Free Grasping Detection Method for Previously Unseen Objects	Wanze Li et.al.	2408.06734v1	null
2024-08-13	A Miniature Vision-Based Localization System for Indoor Blimps	Shicong Ma et.al.	2408.06648v1	null
2024-08-12	UniT: Unified Tactile Representation for Robot Learning	Zhengtong Xu et.al.	2408.06481v1	link
2024-08-12	Moo-ving Beyond Tradition: Revolutionizing Cattle Behavioural Phenotyping with Pose Estimation Techniques	Navid Ghassemi et.al.	2408.06336v1	null
2024-08-12	CAD-Mesher: A Convenient, Accurate, Dense Mesh-based Mapping Module in SLAM for Dynamic Environments	Yanpeng Jia et.al.	2408.05981v1	null
2024-08-12	PAFormer: Part Aware Transformer for Person Re-identification	Hyeono Jung et.al.	2408.05918v1	null
2024-08-11	SABER-6D: Shape Representation Based Implicit Object Pose Estimation	Shishir Reddy Vutukur et.al.	2408.05867v1	null
2024-08-10	Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis	Zhongche Qu et.al.	2408.05635v1	null
2024-08-10	Anticipation through Head Pose Estimation: a preliminary study	Federico Figari Tomenotti et.al.	2408.05516v1	null
2024-08-09	Mesh-based Object Tracking for Dynamic Semantic 3D Scene Graphs via Ray Tracing	Lennart Niecksch et.al.	2408.04979v1	null
2024-08-07	PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model	Yunlong Huang et.al.	2408.03540v1	link
2024-08-06	Line-based 6-DoF Object Pose Estimation and Tracking With an Event Camera	Zibin Liu et.al.	2408.03225v1	link
2024-08-06	Training on the Fly: On-device Self-supervised Learning aboard Nano-drones within 20 mW	Elia Cereda et.al.	2408.03168v1	null
2024-08-06	BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications	G. Manni et.al.	2408.03078v1	link
2024-08-07	Pose Magic: Efficient and Temporally Consistent Human Pose Estimation with a Hybrid Mamba-GCN Network	Xinyi Zhang et.al.	2408.02922v2	null
2024-08-05	Analyzing Data Efficiency and Performance of Machine Learning Algorithms for Assessing Low Back Pain Physical Rehabilitation Exercises	Aleksa Marusic et.al.	2408.02855v1	null
2024-08-05	Joint-Motion Mutual Learning for Pose Estimation in Videos	Sifan Wu et.al.	2408.02285v1	null
2024-08-04	AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos	Feichi Lu et.al.	2408.02110v1	null
2024-08-04	Generalized Maximum Likelihood Estimation for Perspective-n-Point Problem	Tian Zhan et.al.	2408.01945v1	null
2024-08-03	MotionTrace: IMU-based Field of View Prediction for Smartphone AR Interactions	Rahul Islam et.al.	2408.01850v1	null
2024-08-03	BEVPlace++: Fast, Robust, and Lightweight LiDAR Global Localization for Unmanned Ground Vehicles	Lun Luo et.al.	2408.01841v1	link
2024-08-03	E$^3$NeRF: Efficient Event-Enhanced Neural Radiance Fields from Blurry Images	Yunshan Qi et.al.	2408.01840v1	null
2024-08-03	Survey on Emotion Recognition through Posture Detection and the possibility of its application in Virtual Reality	Leina Elansary et.al.	2408.01728v1	null
2024-08-03	Stimulating Imagination: Towards General-purpose Object Rearrangement	Jianyang Wu et.al.	2408.01655v1	null
2024-08-02	Full-range Head Pose Geometric Data Augmentations	Huei-Chung Hu et.al.	2408.01566v1	null
2024-07-31	Adapting Skills to Novel Grasps: A Self-Supervised Approach	Georgios Papagiannis et.al.	2408.00178v1	null
2024-07-31	Certifying Robustness of Learning-Based Keypoint Detection and Pose Estimation Methods	Xusheng Luo et.al.	2408.00117v1	null
2024-07-30	StackFLOW: Monocular Human-Object Reconstruction by Stacked Normalizing Flow with Offset	Chaofan Huo et.al.	2407.20545v1	link
2024-07-30	HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation	Wencan Cheng et.al.	2407.20542v1	link
2024-07-30	Markers Identification for Relative Pose Estimation of an Uncooperative Target	Batu Candan et.al.	2407.20515v1	null
2024-07-29	BaseBoostDepth: Exploiting Larger Baselines For Self-supervised Monocular Depth Estimation	Kieran Saunders et.al.	2407.20437v1	null
2024-07-28	Skeleton-based Group Activity Recognition via Spatial-Temporal Panoramic Graph	Zhengcen Li et.al.	2407.19497v1	link
2024-07-26	Flexible graph convolutional network for 3D human pose estimation	Abu Taib Mohammed Shahjahan et.al.	2407.19077v1	link
2024-07-26	From 2D to 3D: AISG-SLA Visual Localization Challenge	Jialin Gao et.al.	2407.18590v1	null
2024-07-28	HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation	Zhenzhi Wang et.al.	2407.17438v2	link
2024-07-24	Active Loop Closure for OSM-guided Robotic Mapping in Large-Scale Urban Environments	Wei Gao et.al.	2407.17078v1	null
2024-07-30	DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction	Xiaobiao Du et.al.	2407.16988v2	link
2024-07-24	Pose Estimation from Camera Images for Underwater Inspection	Luyuan Peng et.al.	2407.16961v1	null
2024-07-23	COALA: A Practical and Vision-Centric Federated Learning Platform	Weiming Zhuang et.al.	2407.16560v1	link
2024-07-23	Probabilistic Parameter Estimators and Calibration Metrics for Pose Estimation from Image Features	Romeo Valentin et.al.	2407.16223v1	null
2024-07-23	Optimal camera-robot pose estimation in linear time from points and lines	Guangyang Zeng et.al.	2407.16151v1	null
2024-07-23	3D-UGCN: A Unified Graph Convolutional Network for Robust 3D Human Pose Estimation from Monocular RGB Images	Jie Zhao et.al.	2407.16137v1	null
2024-07-21	CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models	Zheng Chong et.al.	2407.15886v1	link
2024-07-22	RADA: Robust and Accurate Feature Learning with Domain Adaptation	Jingtai He et.al.	2407.15791v1	null
2024-07-22	Local Occupancy-Enhanced Object Grasping with Multiple Triplanar Projection	Kangqi Ma et.al.	2407.15771v1	null
2024-07-22	6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model	Matteo Bortolon et.al.	2407.15484v1	null
2024-07-23	Domain-Adaptive 2D Human Pose Estimation via Dual Teachers in Extremely Low-Light Conditions	Yihao Ai et.al.	2407.15451v2	link
2024-07-22	avaTTAR: Table Tennis Stroke Training with On-body and Detached Visualization in Augmented Reality	Dizhi Ma et.al.	2407.15373v1	null
2024-07-20	From Underground Mines to Offices: A Versatile and Robust Framework for Range-Inertial SLAM	Lorenzo Montano-Oliván et.al.	2407.14797v1	null
2024-07-19	ESCAPE: Energy-based Selective Adaptive Correction for Out-of-distribution 3D Human Pose Estimation	Luke Bidulka et.al.	2407.14605v1	null
2024-07-19	6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry	Sungho Chun et.al.	2407.14136v1	link
2024-07-18	RT-Pose: A 4D Radar Tensor-based 3D Human Pose Estimation and Localization Benchmark	Yuan-Hao Ho et.al.	2407.13930v1	null
2024-07-19	GlobalPointer: Large-Scale Plane Adjustment with Bi-Convex Relaxation	Bangyan Liao et.al.	2407.13537v2	link
2024-07-18	SCAPE: A Simple and Strong Category-Agnostic Pose Estimator	Yujia Liang et.al.	2407.13483v1	link
2024-07-17	SG-NeRF: Neural Surface Reconstruction with Scene Graph Optimization	Yiyang Chen et.al.	2407.12667v1	link
2024-07-17	Invertible Neural Warp for NeRF	Shin-Fang Chng et.al.	2407.12354v1	null
2024-07-16	NeuSurfEmb: A Complete Pipeline for Dense Correspondence-based 6D Object Pose Estimation without CAD Models	Francesco Milano et.al.	2407.12207v1	link
2024-07-16	Monocular pose estimation of articulated surgical instruments in open surgery	Robert Spektor et.al.	2407.12138v1	null
2024-07-17	GV-Bench: Benchmarking Local Feature Matching for Geometric Verification of Long-term Loop Closure Detection	Jingwen Yu et.al.	2407.11736v2	link
2024-07-16	TCFormer: Visual Recognition via Token Clustering Transformer	Wang Zeng et.al.	2407.11321v1	link
2024-07-15	A BlueROV2-based platform for underwater mapping experiments	Tudor Alinei-Poiana et.al.	2407.10901v1	link
2024-07-15	LVCP: LiDAR-Vision Tightly Coupled Collaborative Real-time Relative Positioning	Zhuozhu Jian et.al.	2407.10782v1	null
2024-07-15	Domain Generalization for 6D Pose Estimation Through NeRF-based Image Synthesis	Antoine Legrand et.al.	2407.10762v1	null
2024-07-16	GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation	Haonan Wang et.al.	2407.10756v2	null
2024-07-15	Learning to Estimate the Pose of a Peer Robot in a Camera Image by Predicting the States of its LEDs	Nicholas Carlotti et.al.	2407.10661v1	null
2024-07-15	Deep-Learning-Based Markerless Pose Estimation Systems in Gait Analysis: DeepLabCut Custom Training and the Refinement Function	Giulia Panconi et.al.	2407.10590v1	null
2024-07-14	3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects	Weiming Zhi et.al.	2407.10331v1	null
2024-07-16	psifx -- Psychological and Social Interactions Feature Extraction Package	Guillaume Rochette et.al.	2407.10266v2	null
2024-07-14	PAFUSE: Part-based Diffusion for 3D Whole-Body Pose Estimation	Nermin Samet et.al.	2407.10220v1	link
2024-07-14	3DEgo: 3D Editing on the Go!	Umar Khalid et.al.	2407.10102v1	null
2024-07-12	iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning	Tom Fischer et.al.	2407.09271v1	link
2024-07-12	HUP-3D: A 3D multi-view synthetic dataset for assisted-egocentric hand-ultrasound pose estimation	Manuel Birlo et.al.	2407.09215v1	null
2024-07-12	KGpose: Keypoint-Graph Driven End-to-End Multi-Object 6D Pose Estimation via Point-Wise Pose Voting	Andrew Jeong et.al.	2407.08909v1	null
2024-07-11	RTMW: Real-Time Multi-Person 2D and 3D Whole-body Pose Estimation	Tao Jiang et.al.	2407.08634v1	link
2024-07-11	SRPose: Two-view Relative Pose Estimation with Sparse Keypoints	Rui Yin et.al.	2407.08199v1	link
2024-07-11	SGLC: Semantic Graph-Guided Coarse-Fine-Refine Full Loop Closing for LiDAR SLAM	Neng Wang et.al.	2407.08106v1	link
2024-07-10	RoCap: A Robotic Data Collection Pipeline for the Pose Estimation of Appearance-Changing Objects	Jiahao Nick Li et.al.	2407.08081v1	null
2024-07-10	Hybrid Structure-from-Motion and Camera Relocalization for Enhanced Egocentric Localization	Jinjie Mai et.al.	2407.08023v1	link
2024-07-10	Greit-HRNet: Grouped Lightweight High-Resolution Network for Human Pose Estimation	Junjia Han et.al.	2407.07389v1	null
2024-07-09	Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images	Chuanrui Zhang et.al.	2407.06984v1	null
2024-07-09	Computer vision tasks for intelligent aerospace missions: An overview	Huilin Chen et.al.	2407.06513v1	null
2024-07-08	GeoNLF: Geometry guided Pose-Free Neural LiDAR Fields	Weiyi Xue et.al.	2407.05597v1	null
2024-07-10	On the power of data augmentation for head pose estimation	Michael Welter et.al.	2407.05357v2	link
2024-07-07	SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning	Yi Feng et.al.	2407.05283v1	link
2024-07-05	Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos	Leonhard Sommer et.al.	2407.04384v1	link
2024-07-04	Towards Cross-View-Consistent Self-Supervised Surround Depth Estimation	Laiyan Ding et.al.	2407.04041v1	link
2024-07-04	Markerless Multi-view 3D Human Pose Estimation: a survey	Ana Filipa Rodrigues Nogueira et.al.	2407.03817v1	null
2024-07-04	A Fast Dynamic Point Detection Method for LiDAR-Inertial Odometry in Driving Scenarios	Zikang Yuan et.al.	2407.03590v1	link
2024-07-03	Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation	Mengmeng Cui et.al.	2407.02990v1	null
2024-07-03	Free-SurGS: SfM-Free 3D Gaussian Splatting for Surgical Scene Reconstruction	Jiaxin Guo et.al.	2407.02918v1	link
2024-07-02	SUPER: Seated Upper Body Pose Estimation using mmWave Radars	Bo Zhang et.al.	2407.02455v1	null
2024-07-02	ReliaAvatar: A Robust Real-Time Avatar Animator with Integrated Motion Prediction	Bo Qian et.al.	2407.02129v1	null
2024-07-02	Joint-Dataset Learning and Cross-Consistent Regularization for Text-to-Motion Retrieval	Nicola Messina et.al.	2407.02104v1	null
2024-07-01	Active Human Pose Estimation via an Autonomous UAV Agent	Jingxi Chen et.al.	2407.01811v1	null
2024-07-01	RoDyn-SLAM: Robust Dynamic Dense RGB-D SLAM with Neural Radiance Fields	Haochen Jiang et.al.	2407.01303v1	link
2024-07-01	Collaborative Graph Exploration with Reduced Pose-SLAM Uncertainty via Submodular Optimization	Ruofei Bai et.al.	2407.01013v1	link
2024-06-30	Ego-to-Exo: Interfacing Third Person Visuals from Egocentric Views in Real-time for Improved ROV Teleoperation	Adnan Abdullah et.al.	2407.00848v1	null
2024-06-29	When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration	Philipp Allgeuer et.al.	2407.00518v1	link
2024-06-28	Assistive Image Annotation Systems with Deep Learning and Natural Language Capabilities: A Review	Moseli Mots'oehli et.al.	2407.00252v1	null
2024-06-28	EPOCH: Jointly Estimating the 3D Pose of Cameras and Humans	Nicola Garau et.al.	2406.19726v1	null
2024-06-28	CLOi-Mapper: Consistent, Lightweight, Robust, and Incremental Mapper With Embedded Systems for Commercial Robot Services	DongKi Noh et.al.	2406.19634v1	null
2024-06-27	Multimodal Visual-haptic pose estimation in the presence of transient occlusion	Michael Zechmair et.al.	2406.19323v1	null
2024-06-27	Human Modelling and Pose Estimation Overview	Pawel Knap et.al.	2406.19290v1	null
2024-06-26	Towards Human-Level 3D Relative Pose Estimation: Generalizable, Training-Free, with Single Reference	Yuan Gao et.al.	2406.18453v1	link
2024-06-27	Automatic infant 2D pose estimation from videos: comparing seven deep neural network methods	Filipe Gama et.al.	2406.17382v2	null
2024-06-24	High-resolution open-vocabulary object 6D pose estimation	Jaime Corsetti et.al.	2406.16384v1	null
2024-06-23	Breaking the Frame: Image Retrieval by Visual Overlap Prediction	Tong Wei et.al.	2406.16204v1	link
2024-06-21	Efficient Human Pose Estimation: Leveraging Advanced Techniques with MediaPipe	Sandeep Singh Sengar et.al.	2406.15649v1	link
2024-06-24	Investigating the impact of 2D gesture representation on co-speech gesture generation	Teo Guichoux et.al.	2406.15111v2	null
2024-06-20	Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data	Moira Shooter et.al.	2406.14412v1	null
2024-06-20	PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions	Sihan Ma et.al.	2406.14367v1	null
2024-06-19	NeRF-Feat: 6D Object Pose Estimation using Feature Rendering	Shishir Reddy Vutukur et.al.	2406.13796v1	null
2024-06-19	CNN Based Flank Predictor for Quadruped Animal Species	Vanessa Suessle et.al.	2406.13588v1	null
2024-06-19	MVSBoost: An Efficient Point Cloud-based 3D Reconstruction	Umair Haroon et.al.	2406.13515v1	null
2024-06-19	An Efficient yet High-Performance Method for Precise Radar-Based Imaging of Human Hand Poses	Johanna Bräunig et.al.	2406.13464v1	null
2024-06-18	Head Pose Estimation and 3D Neural Surface Reconstruction via Monocular Camera in situ for Navigation and Safe Insertion into Natural Openings	Ruijie Tang et.al.	2406.13048v1	null
2024-06-17	Matching Query Image Against Selected NeRF Feature for Efficient and Scalable Localization	Huaiji Zhou et.al.	2406.11766v1	null
2024-06-17	Domain Generalization for In-Orbit 6D Pose Estimation	Antoine Legrand et.al.	2406.11743v1	null
2024-06-17	SeamPose: Repurposing Seams as Capacitive Sensors in a Shirt for Upper-Body Pose Tracking	Tianhong Catherine Yu et.al.	2406.11645v1	null
2024-06-14	Galibr: Targetless LiDAR-Camera Extrinsic Calibration Method via Ground Plane Initialization	Wonho Song et.al.	2406.11599v1	null
2024-06-15	MMVR: Millimeter-wave Multi-View Radar Dataset and Benchmark for Indoor Perception	M. Mahbubur Rahman et.al.	2406.10708v1	link
2024-06-15	Improving Ab-Initio Cryo-EM Reconstruction with Semi-Amortized Pose Inference	Shayan Shekarforoush et.al.	2406.10455v1	null
2024-06-14	The BabyView dataset: High-resolution egocentric videos of infants' and young children's everyday experiences	Bria Long et.al.	2406.10447v1	null
2024-06-14	OpenCapBench: A Benchmark to Bridge Pose Estimation and Biomechanics	Yoni Gozlan et.al.	2406.09788v1	null
2024-06-13	ImageNet3D: Towards General-Purpose Object-Level 3D Understanding	Wufei Ma et.al.	2406.09613v1	link
2024-06-13	Deep Transformer Network for Monocular Pose Estimation of Ship-Based UAV	Maneesha Wickramasuriya et.al.	2406.09260v1	link
2024-06-14	Language-Driven Closed-Loop Grasping with Model-Predictive Trajectory Replanning	Huy Hoang Nguyen et.al.	2406.09039v2	null
2024-06-14	VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks	Jiannan Wu et.al.	2406.08394v2	link
2024-06-12	Asymptotic Unbiased Sample Sampling to Speed Up Sharpness-Aware Minimization	Jiaxin Deng et.al.	2406.08001v1	null
2024-06-12	IFTD: Image Feature Triangle Descriptor for Loop Detection in Driving Scenes	Fengtian Lang et.al.	2406.07937v1	link
2024-06-12	From Variance to Veracity: Unbundling and Mitigating Gradient Variance in Differentiable Bundle Adjustment Layers	Swaminathan Gurumurthy et.al.	2406.07785v1	link
2024-06-12	SPIN: Spacecraft Imagery for Navigation	Javier Montalvo et.al.	2406.07500v2	link
2024-06-11	Realistic Data Generation for 6D Pose Estimation of Surgical Instruments	Juan Antonio Barragan et.al.	2406.07328v1	link
2024-06-11	SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale	Shester Gueuwou et.al.	2406.06907v1	null
2024-06-10	Multicam-SLAM: Non-overlapping Multi-camera SLAM for Indirect Visual Localization and Navigation	Shenghao Li et.al.	2406.06374v1	link
2024-06-08	A preprocessing-based planning framework for utilizing contacts in high-precision insertion tasks	Muhammad Suhail Saleem et.al.	2406.05522v1	null
2024-06-06	GLACE: Global Local Accelerated Coordinate Encoding	Fangjinhua Wang et.al.	2406.04340v1	link
2024-06-06	Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking	Jiyao Zhang et.al.	2406.04316v1	null
2024-06-05	Hi5: 2D Hand Pose Estimation with Zero Human Annotation	Masum Hasan et.al.	2406.03599v1	null
2024-06-05	Sparse Color-Code Net: Real-Time RGB-Based 6D Object Pose Estimation on Edge Devices	Xingjian Yang et.al.	2406.02977v1	null
2024-06-04	CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation	Dejia Xu et.al.	2406.02509v1	null
2024-06-04	HPE-CogVLM: New Head Pose Grounding Task Exploration on Vision Language Model	Yu Tian et.al.	2406.01914v1	null
2024-06-03	A Robust Filter for Marker-less Multi-person Tracking in Human-Robot Interaction Scenarios	Enrico Martini et.al.	2406.01832v1	link
2024-06-01	Equivariant amortized inference of poses for cryo-EM	Larissa de Ruijter et.al.	2406.01630v1	null
2024-06-03	3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information	Sihan Wen et.al.	2406.01196v1	null
2024-06-01	CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation	Matan Rusanovsky et.al.	2406.00384v1	link
2024-05-30	Estimating Human Poses Across Datasets: A Unified Skeleton and Multi-Teacher Distillation Approach	Muhammad Saif Ullah Khan et.al.	2405.20084v1	null
2024-05-30	TAMBRIDGE: Bridging Frame-Centered Tracking and 3D Gaussian Splatting for Enhanced SLAM	Peifeng Jiang et.al.	2405.19614v1	null
2024-05-29	Real-Time Dynamic Robot-Assisted Hand-Object Interaction via Motion Primitives	Mingqi Yuan et.al.	2405.19531v1	null
2024-05-29	Exploring AI-based Anonymization of Industrial Image and Video Data in the Context of Feature Preservation	Sabrina Cynthia Triess et.al.	2405.19173v1	null
2024-05-28	World Models for General Surgical Grasping	Hongbin Lin et.al.	2405.17940v1	null
2024-05-27	MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds	Jiahui Lei et.al.	2405.17421v1	link
2024-05-27	Occlusion Handling in 3D Human Pose Estimation with Perturbed Positional Encoding	Niloofar Azizi et.al.	2405.17397v1	null
2024-05-27	$\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human Pose Estimation	Weiquan Wang et.al.	2405.17016v1	null
2024-05-27	Clustering-based Learning for UAV Tracking and Pose Estimation	Jiaping Xiao et.al.	2405.16867v1	null
2024-05-26	Multi-Modal UAV Detection, Classification and Tracking Algorithm -- Technical Report for CVPR 2024 UG2 Challenge	Tianchen Deng et.al.	2405.16464v1	link
2024-05-25	Intensity and Texture Correction of Omnidirectional Image Using Camera Images for Indirect Augmented Reality	Hakim Ikebayashi et.al.	2405.16008v1	null
2024-05-23	CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments	Yang Zhou et.al.	2405.14731v1	link
2024-05-23	Segformer++: Efficient Token-Merging Strategies for High-Resolution Semantic Segmentation	Daniel Kienzle et.al.	2405.14467v1	link
2024-05-21	Geometric Transformation Uncertainty for Improving 3D Fetal Brain Pose Prediction from Freehand 2D Ultrasound Videos	Jayroop Ramesh et.al.	2405.13235v1	link
2024-05-21	Leveraging Neural Radiance Fields for Pose Estimation of an Unknown Space Object during Proximity Operations	Antoine Legrand et.al.	2405.12728v1	null
2024-05-21	PoseGravity: Pose Estimation from Points and Lines with Axis Prior	Akshay Chandrasekhar et.al.	2405.12646v1	link
2024-05-19	Focus on Low-Resolution Information: Multi-Granular Information-Lossless Model for Low-Resolution Human Pose Estimation	Zejun Gu et.al.	2405.12247v1	null
2024-05-20	AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements	Calvin Yeung et.al.	2405.12070v1	link
2024-05-19	Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries	Christiaan G. A. Viviers et.al.	2405.11677v1	link
2024-05-19	Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation	Zejun Gu et.al.	2405.11448v1	null
2024-05-18	PS6D: Point Cloud Based Symmetry-Aware 6D Object Pose Estimation in Robot Bin-Picking	Yifan Yang et.al.	2405.11257v1	null
2024-05-18	MotionGS: Compact Gaussian Splatting SLAM by Motion Filter	Xinli Guo et.al.	2405.11129v1	link
2024-05-17	Resolving Symmetry Ambiguity in Correspondence-based Methods for Instance-level Object Pose Estimation	Yongliang Lin et.al.	2405.10557v1	null
2024-05-16	Diversity-Aware Sign Language Production through a Pose Encoding Variational Autoencoder	Mohamed Ilyes Lakhal et.al.	2405.10423v1	null
2024-05-17	Toon3D: Seeing Cartoons from a New Perspective	Ethan Weber et.al.	2405.10320v2	null
2024-05-15	Task-adaptive Q-Face	Haomiao Sun et.al.	2405.09059v1	null
2024-05-14	RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images	Zong-Wei Hong et.al.	2405.08483v1	link
2024-05-14	TP3M: Transformer-based Pseudo 3D Image Matching with Reference	Liming Han et.al.	2405.08434v1	null
2024-05-13	Deep Learning-Based Object Pose Estimation: A Comprehensive Survey	Jian Liu et.al.	2405.07801v1	link
2024-05-13	JointLoc: A Real-time Visual Localization Framework for Planetary UAVs Based on Joint Relative and Absolute Pose Estimation	Xubo Luo et.al.	2405.07429v1	link
2024-05-11	TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization	Zhen Tan et.al.	2405.07027v1	link
2024-05-11	AHPPEBot: Autonomous Robot for Tomato Harvesting based on Phenotyping and Pose Estimation	Xingxu Li et.al.	2405.06959v1	null
2024-05-10	CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras	James Tang et.al.	2405.06845v1	link
2024-05-10	MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization	Pengcheng Zhu et.al.	2405.06241v1	null
2024-05-10	Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera	Haixin Shi et.al.	2405.05858v2	null
2024-05-09	Semi-Autonomous Laparoscopic Robot Docking with Learned Hand-Eye Information Fusion	Huanyu Tian et.al.	2405.05817v1	null
2024-05-09	NeuRSS: Enhancing AUV Localization and Bathymetric Mapping with Neural Rendering for Sidescan SLAM	Yiping Xie et.al.	2405.05807v1	null
2024-05-09	Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview	Yuhang Ming et.al.	2405.05526v1	null
2024-05-08	Adversary-Guided Motion Retargeting for Skeleton Anonymization	Thomas Carr et.al.	2405.05428v1	null
2024-05-08	FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models	Jinglin Xu et.al.	2405.05216v1	link
2024-05-08	ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion	Bing Zhu et.al.	2405.05164v1	null
2024-05-08	GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation	Ivan Bilić et.al.	2405.04890v1	null
2024-05-07	Learning Distributional Demonstration Spaces for Task-Specific Cross-Pose Estimation	Jenny Wang et.al.	2405.04609v1	null
2024-05-07	Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map	Yuxuan Xia et.al.	2405.04290v1	null
2024-05-07	Speak the Same Language: Global LiDAR Registration on BIM Using Pose Hough Transform	Zhijian Qiao et.al.	2405.03969v1	null
2024-05-07	Joint Estimation of Identity Verification and Relative Pose for Partial Fingerprints	Xiongjun Guan et.al.	2405.03959v1	link
2024-05-06	Pose Priors from Language Models	Sanjay Subramanian et.al.	2405.03689v1	null
2024-05-06	Optimizing Hand Region Detection in MediaPipe Holistic Full-Body Pose Estimation to Improve Accuracy and Avoid Downstream Errors	Amit Moryossef et.al.	2405.03545v1	link
2024-05-05	Multi-hop graph transformer network for 3D human pose estimation	Zaedul Islam et.al.	2405.03055v1	null
2024-05-05	Blending Distributed NeRFs with Tri-stage Robust Pose Optimization	Baijun Ye et.al.	2405.02880v1	null
2024-05-03	WeightedPose: Generalizable Cross-Pose Estimation via Weighted SVD	Xuxin Cheng et.al.	2405.02241v1	link
2024-05-03	Probablistic Restoration with Adaptive Noise Sampling for 3D Human Pose Estimation	Xianzhou Zeng et.al.	2405.02114v1	link
2024-05-03	An Onboard Framework for Staircases Modeling Based on Point Clouds	Chun Qing et.al.	2405.01918v1	null
2024-05-06	ShadowNav: Autonomous Global Localization for Lunar Navigation in Darkness	Deegan Atha et.al.	2405.01673v2	null
2024-05-02	IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning	Ryan Hoque et.al.	2405.01472v1	null
2024-05-02	Behavior Imitation for Manipulator Control and Grasping with Deep Reinforcement Learning	Liu Qiyuan et.al.	2405.01284v1	null
2024-05-02	Sports Analysis and VR Viewing System Based on Player Tracking and Pose Estimation with Multimodal and Multiview Sensors	Wenxuan Guo et.al.	2405.01112v1	null
2024-05-02	CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications	Jan Blumenkamp et.al.	2405.01107v1	null
2024-05-04	HandSSCA: 3D Hand Mesh Reconstruction with State Space Channel Attention from RGB images	Zixun Jiao et.al.	2405.01066v2	null
2024-05-01	Radar-Based Localization For Autonomous Ground Vehicles In Suburban Neighborhoods	Andrew J. Kramer et.al.	2405.00600v1	null
2024-04-30	Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging	Rayan Armani et.al.	2404.19541v1	link
2024-04-30	UniFS: Universal Few-shot Instance Perception with Point Representations	Sheng Jin et.al.	2404.19401v1	link
2024-04-30	Quater-GCN: Enhancing 3D Human Pose Estimation with Orientation and Semi-supervised Training	Xingyu Song et.al.	2404.19279v1	link
2024-04-30	XFeat: Accelerated Features for Lightweight Image Matching	Guilherme Potje et.al.	2404.19174v1	null
2024-04-29	Self-Avatar Animation in Virtual Reality: Impact of Motion Signals Artifacts on the Full-Body Pose Reconstruction	Antoine Maiorca et.al.	2404.18628v1	null
2024-04-29	Mesh-based Photorealistic and Real-time 3D Mapping for Robust Visual Perception of Autonomous Underwater Vehicle	Jungwoo Lee et.al.	2404.18395v1	null
2024-04-29	Reconstructing Satellites in 3D from Amateur Telescope Images	Zhiming Chang et.al.	2404.18394v1	null
2024-04-27	Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs	Yiming Bao et.al.	2404.17837v1	null
2024-04-26	Localization Through Particle Filter Powered Neural Network Estimated Monocular Camera Poses	Yi Shen et.al.	2404.17685v1	null
2024-04-26	SLAM for Indoor Mapping of Wide Area Construction Environments	Vincent Ress et.al.	2404.17215v1	null
2024-04-25	WheelPose: Data Synthesis Techniques to Improve Pose Estimation Performance on Wheelchair Users	William Huang et.al.	2404.17063v1	link
2024-04-25	Transformer-Based Local Feature Matching for Multimodal Image Registration	Remi Delaunay et.al.	2404.16802v1	null
2024-04-25	DeepKalPose: An Enhanced Deep-Learning Kalman Filter for Temporally Consistent Monocular Vehicle Pose Estimation	Leandro Di Bella et.al.	2404.16558v1	null
2024-04-25	Efficient Solution of Point-Line Absolute Pose	Petr Hruby et.al.	2404.16552v1	link
2024-04-25	COBRA -- COnfidence score Based on shape Regression Analysis for method-independent quality assessment of object pose estimation from single images	Panagiotis Sapoutzoglou et.al.	2404.16471v1	link
2024-04-25	MegaParticles: Range-based 6-DoF Monte Carlo Localization with GPU-Accelerated Stein Particle Filter	Kenji Koide et.al.	2404.16370v1	null
2024-04-24	3D Human Pose Estimation with Occlusions: Introducing BlendMimic3D Dataset and GCN Refinement	Filipa Lino et.al.	2404.16136v1	link
2024-04-23	SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation	Xiangyu Xu et.al.	2404.15276v1	link
2024-04-25	Domain adaptive pose estimation via multi-level alignment	Yugan Chen et.al.	2404.14885v2	link
2024-04-23	Semi-supervised 2D Human Pose Estimation via Adaptive Keypoint Masking	Kexin Meng et.al.	2404.14835v1	null
2024-04-23	UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues	Vandad Davoodnia et.al.	2404.14634v1	null
2024-04-22	DHRNet: A Dual-Path Hierarchical Relation Network for Multi-Person Pose Estimation	Yonghao Dang et.al.	2404.14025v1	link
2024-04-23	CT-NeRF: Incremental Optimizing Neural Radiance Field and Poses with Complex Trajectory	Yunlong Ran et.al.	2404.13896v2	null
2024-04-21	Resampling-free Particle Filters in High-dimensions	Akhilan Boopathy et.al.	2404.13698v1	link
2024-04-20	EC-SLAM: Real-time Dense Neural RGB-D SLAM System with Effectively Constrained Global Bundle Adjustment	Guanghao Li et.al.	2404.13346v1	link
2024-04-18	Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds	Oliver Lemke et.al.	2404.12440v1	null
2024-04-18	Gait Recognition from Highly Compressed Videos	Andrei Niculae et.al.	2404.12183v1	null
2024-04-17	Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding	George Retsinas et.al.	2404.12144v1	link
2024-04-17	Kathakali Hand Gesture Recognition With Minimal Data	Kavitha Raju et.al.	2404.11205v1	null
2024-04-17	GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement	Linfang Zheng et.al.	2404.11139v1	null
2024-04-17	CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation	Lianyu Hu et.al.	2404.11111v1	link
2024-04-16	HumMUSS: Human Motion Understanding using State Space Models	Arnab Kumar Mondal et.al.	2404.10880v1	null
2024-04-16	Invariant Kalman Filtering with Noise-Free Pseudo-Measurements	Sven Goffin et.al.	2404.10687v1	null
2024-04-16	The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement	Gabriele Trivigno et.al.	2404.10438v1	null
2024-04-16	GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling	Huantao Ren et.al.	2404.10213v1	null
2024-04-16	LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark	Avinash Upadhyay et.al.	2404.10212v1	link
2024-04-15	LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives	Jiadi Cui et.al.	2404.09748v1	null
2024-04-14	In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition	Wiktor Mucha et.al.	2404.09308v1	link
2024-04-13	DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector	Johan Edstedt et.al.	2404.08928v1	link
2024-04-16	3D Human Scan With A Moving Event Camera	Kai Kohyama et.al.	2404.08504v2	null
2024-04-11	Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method	Tashmoy Ghosh et.al.	2404.07649v1	null
2024-04-11	GLID: Pre-training a Generalist Encoder-Decoder Vision Model	Jihao Liu et.al.	2404.07603v1	null
2024-04-10	Measuring proximity to standard planes during fetal brain ultrasound scanning	Chiara Di Vece et.al.	2404.07124v1	null
2024-04-10	MoCap-to-Visual Domain Adaptation for Efficient Human Mesh Estimation from 2D Keypoints	Bedirhan Uguz et.al.	2404.07094v1	null
2024-04-10	Gaussian-LIC: Photo-realistic LiDAR-Inertial-Camera SLAM with 3D Gaussian Splatting	Xiaolei Lang et.al.	2404.06926v1	null
2024-04-09	Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences	Axel Barroso-Laguna et.al.	2404.06337v1	link
2024-04-09	Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes	Tianchen Deng et.al.	2404.06050v1	null
2024-04-08	Learning 3D-Aware GANs from Unposed Images with Template Feature Field	Xinya Chen et.al.	2404.05705v1	null
2024-04-08	Learning a Category-level Object Pose Estimator without Pose Annotations	Fengrui Tian et.al.	2404.05626v1	null
2024-04-08	DepthMOT: Depth Cues Lead to a Strong Multi-Object Tracker	Jiapeng Wu et.al.	2404.05518v1	link
2024-04-08	Two Hands Are Better Than One: Resolving Hand to Hand Intersections via Occupancy Networks	Maksym Ivashechkin et.al.	2404.05414v1	null
2024-04-08	STITCH: Augmented Dexterity for Suture Throws Including Thread Coordination and Handoffs	Kush Hari et.al.	2404.05151v1	null
2024-04-05	ToolEENet: Tool Affordance 6D Pose Estimation	Yunlong Wang et.al.	2404.04193v1	null
2024-04-04	SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation	Sichen Chen et.al.	2404.03518v1	link
2024-04-04	Multi Positive Contrastive Learning with Pose-Consistent Generated Images	Sho Inayoshi et.al.	2404.03256v1	null
2024-04-04	HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud	Wencan Cheng et.al.	2404.03159v1	link
2024-04-03	Fusing Multi-sensor Input with State Information on TinyML Brains for Autonomous Nano-drones	Luca Crupi et.al.	2404.02567v1	null
2024-04-03	Semi-Supervised Unconstrained Head Pose Estimation in the Wild	Huayi Zhou et.al.	2404.02544v1	link
2024-04-02	3D Congealing: 3D-Aware Image Alignment in the Wild	Yunzhi Zhang et.al.	2404.02125v1	null
2024-04-02	SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation	Vinkle Srivastav et.al.	2404.02041v1	link
2024-04-01	Marrying NeRF with Feature Matching for One-step Pose Estimation	Ronghan Chen et.al.	2404.00891v1	null
2024-03-31	Graph-Based vs. Error State Kalman Filter-Based Fusion Of 5G And Inertial Data For MAV Indoor Pose Estimation	Meisam Kabiri et.al.	2404.00691v1	null
2024-03-31	OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos	Dongyoung Choi et.al.	2404.00676v1	null
2024-04-02	KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation	Jihua Peng et.al.	2404.00658v2	link
2024-03-29	FetalDiffusion: Pose-Controllable 3D Fetal MRI Synthesis with Conditional Diffusion Model	Molin Zhang et.al.	2404.00132v1	null
2024-03-29	Latent Embedding Clustering for Occlusion Robust Head Pose Estimation	José Celestino et.al.	2403.20251v1	null
2024-03-29	A Unified Framework for Human-centric Point Cloud Video Understanding	Yiteng Xu et.al.	2403.20031v1	null
2024-04-01	Video-Based Human Pose Regression via Decoupled Space-Time Aggregation	Jijie He et.al.	2403.19926v2	link
2024-03-28	Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation	Xiao Lin et.al.	2403.19527v1	link
2024-03-27	Object Pose Estimation via the Aggregation of Diffusion Features	Tianfu Wang et.al.	2403.18791v1	link
2024-03-27	RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation	Yang Tian et.al.	2403.18259v1	null
2024-03-26	Mathematical Foundation and Corrections for Full Range Head Pose Estimation	Huei-Chung Hu et.al.	2403.18104v1	null
2024-03-26	EgoPoseFormer: A Simple Baseline for Egocentric 3D Human Pose Estimation	Chenhongyi Yang et.al.	2403.18080v1	link
2024-03-26	A Survey on 3D Egocentric Human Pose Estimation	Md Mushfiqur Azam et.al.	2403.17893v1	link
2024-03-26	GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction	Hrishav Bakul Barua et.al.	2403.17837v1	link
2024-03-26	DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions	Sammy Christen et.al.	2403.17827v1	null
2024-03-26	System Calibration of a Field Phenotyping Robot with Multiple High-Precision Profile Laser Scanners	Felix Esser et.al.	2403.17788v1	null
2024-03-25	Animal Avatars: Reconstructing Animatable 3D Animals from Casual Videos	Remy Sabathier et.al.	2403.17103v1	link
2024-03-25	Characterisation of the Intel RealSense D415 Stereo Depth Camera for Motion-Corrected CT Perfusion Imaging	Mahdieh Dashtbani Moghari et.al.	2403.16490v1	null
2024-03-25	Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects	Zicong Fan et.al.	2403.16428v1	link
2024-03-25	A Geometric Perspective on Fusing Gaussian Distributions on Lie Groups	Yixiao Ge et.al.	2403.16411v1	null
2024-03-25	ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation	Hannah Schieber et.al.	2403.16400v1	link
2024-03-24	KITchen: A Real-World Benchmark and Dataset for 6D Object Pose Estimation in Kitchen Environments	Abdelrahman Younes et.al.	2403.16238v1	null
2024-03-24	Diffusion Model is a Good Pose Estimator from 3D RF-Vision	Junqiao Fan et.al.	2403.16198v1	null
2024-03-23	UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation	Yuliang Guo et.al.	2403.15705v1	link
2024-03-22	InterFusion: Text-Driven Generation of 3D Human-Object Interaction	Sisi Dai et.al.	2403.15612v1	link
2024-03-22	Augmented Reality Warnings in Roadway Work Zones: Evaluating the Effect of Modality on Worker Reaction Times	Sepehr Sabeti et.al.	2403.15571v1	null
2024-03-22	Gesture-Controlled Aerial Robot Formation for Human-Swarm Interaction in Safety Monitoring Applications	Vít Krátký et.al.	2403.15333v1	null
2024-03-22	WSCLoc: Weakly-Supervised Sparse-View Camera Relocalization	Jialu Wang et.al.	2403.15272v1	null
2024-03-22	DITTO: Demonstration Imitation by Trajectory Transformation	Nick Heppert et.al.	2403.15203v1	link
2024-03-22	Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning	Bumsoo Kim et.al.	2403.15048v1	null
2024-03-22	Trajectory Regularization Enhances Self-Supervised Geometric Representation	Jiayun Wang et.al.	2403.14973v1	link
2024-03-21	VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding	Ahmad Mahmood et.al.	2403.14743v1	link
2024-03-21	Visibility-Aware Keypoint Localization for 6DoF Object Pose Estimation	Ruyi Lian et.al.	2403.14559v1	null
2024-03-23	Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset	Andrea Avogaro et.al.	2403.14447v2	null
2024-03-21	Evaluation and Deployment of LiDAR-based Place Recognition in Dense Forests	Haedam Oh et.al.	2403.14326v1	null
2024-03-21	Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation	Francesco Di Felice et.al.	2403.14279v1	null
2024-03-20	DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses	Chen Zhao et.al.	2403.13683v1	link
2024-03-20	Meta-Point Learning and Refining for Category-Agnostic Pose Estimation	Junjie Chen et.al.	2403.13647v1	link
2024-03-20	Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery	Mayura Manawadu et.al.	2403.13434v1	null
2024-03-20	DOR3D-Net: Dense Ordinal Regression Network for 3D Hand Pose Estimation	Yamin Mao et.al.	2403.13405v1	null
2024-03-20	ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics	Qiaojun Yu et.al.	2403.13365v1	null
2024-03-20	MULAN-WC: Multi-Robot Localization Uncertainty-aware Active NeRF with Wireless Coordination	Weiying Wang et.al.	2403.13348v1	null
2024-03-19	FaceXFormer: A Unified Transformer for Facial Analysis	Kartik Narayan et.al.	2403.12960v1	link
2024-03-19	WHAC: World-grounded Humans and Cameras	Wanqi Yin et.al.	2403.12959v1	link
2024-03-19	Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation	Jingtao Sun et.al.	2403.12728v1	link
2024-03-19	IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model	Matteo Bortolon et.al.	2403.12682v1	null
2024-03-19	In-Hand Following of Deformable Linear Objects Using Dexterous Fingers with Tactile Sensing	Mingrui Yu et.al.	2403.12676v1	null
2024-03-19	Self-learning Canonical Space for Multi-view 3D Human Pose Estimation	Xiaoben Li et.al.	2403.12440v1	null
2024-03-20	Human Mesh Recovery from Arbitrary Multi-view Images	Xiaoben Li et.al.	2403.12434v2	link
2024-03-19	XPose: eXplainable Human Pose Estimation	Luyu Qiu et.al.	2403.12370v1	null
2024-03-18	HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data	Mengqi Zhang et.al.	2403.12011v1	null
2024-03-18	Normalized Validity Scores for DNNs in Regression based Eye Feature Extraction	Wolfgang Fuhl et.al.	2403.11665v1	null
2024-03-18	An Accurate and Real-time Relative Pose Estimation from Triple Point-line Images by Decoupling Rotation and Translation	Zewen Xu et.al.	2403.11639v1	null
2024-03-18	LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models	Yang Yang et.al.	2403.11627v1	link
2024-03-18	GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects	Sungphill Moon et.al.	2403.11510v1	null
2024-03-17	A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation	Qucheng Peng et.al.	2403.11310v1	link
2024-03-17	Compact 3D Gaussian Splatting For Dense Visual SLAM	Tianchen Deng et.al.	2403.11247v1	link
2024-03-16	Robotic Task Success Evaluation Under Multi-modal Non-Parametric Object Pose Uncertainty	Lakshadeep Naik et.al.	2403.10874v1	null
2024-03-16	DPPE: Dense Pose Estimation in a Plenoxels Environment using Gradient Approximation	Christopher Kolios et.al.	2403.10773v1	null
2024-03-15	GS-Pose: Cascaded Framework for Generalizable Segmentation-based 6D Object Pose Estimation	Dingding Cai et.al.	2403.10683v1	null
2024-03-15	CLOSURE: Fast Quantification of Pose Uncertainty Sets	Yihuai Gao et.al.	2403.09990v1	null
2024-03-14	ThermoHands: A Benchmark for 3D Hand Pose Estimation from Egocentric Thermal Image	Fangqiang Ding et.al.	2403.09871v1	null
2024-03-14	BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects	Tomas Hodan et.al.	2403.09799v1	null
2024-03-14	Scalable Autonomous Drone Flight in the Forest with Visual-Inertial SLAM and Dense Submaps Built without LiDAR	Sebastián Barbas Laina et.al.	2403.09596v1	null
2024-03-14	Improving Real-Time Omnidirectional 3D Multi-Person Human Pose Estimation with People Matching and Unsupervised 2D-3D Lifting	Pawel Knap et.al.	2403.09437v1	null
2024-03-14	LM2D: Lyrics- and Music-Driven Dance Synthesis	Wenjie Yin et.al.	2403.09407v1	null
2024-03-14	SD-Net: Symmetric-Aware Keypoint Prediction and Domain Adaptation for 6D Pose Estimation In Bin-picking Scenarios	Ding-Tao Huang et.al.	2403.09317v1	link
2024-03-14	MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion	Arul Selvam Periyasamy et.al.	2403.09309v1	null
2024-03-13	Data Augmentation in Human-Centric Vision	Wentao Jiang et.al.	2403.08650v1	null
2024-03-15	PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections	Matteo Taiana et.al.	2403.08586v2	null
2024-03-13	NeRF-Supervised Feature Point Detection and Description	Ali Youssef et.al.	2403.08156v1	link
2024-03-12	Q-SLAM: Quadric Representations for Monocular SLAM	Chensheng Peng et.al.	2403.08125v1	null
2024-03-12	MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation	Yuelong Li et.al.	2403.08019v1	link
2024-03-12	Uncertainty Quantification with Deep Ensembles for 6D Object Pose Estimation	Kira Wursthorn et.al.	2403.07741v1	null
2024-03-12	Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving	JunDa Cheng et.al.	2403.07535v1	link
2024-03-12	Category-Agnostic Pose Estimation for Point Clouds	Bowen Liu et.al.	2403.07437v1	null
2024-03-12	Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery	Yike Zhang et.al.	2403.07219v1	null
2024-03-11	Real-Time Simulated Avatar from Head-Mounted Sensors	Zhengyi Luo et.al.	2403.06862v1	null
2024-03-11	Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition	Erkut Akdag et.al.	2403.06577v1	null
2024-03-10	Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation	Paweł A. Pierzchlewicz et.al.	2403.06164v1	link
2024-03-10	Diffusion Models Trained with Large Data Are Transferable Visual Models	Guangkai Xu et.al.	2403.06090v1	link
2024-03-08	Prepared for the Worst: A Learning-Based Adversarial Attack for Resilience Analysis of the ICP Algorithm	Ziyu Zhang et.al.	2403.05666v1	null
2024-03-11	Exploiting polar symmetry in designing equivariant observers for vision-based motion estimation	Tarek Bouazza et.al.	2403.05450v2	null
2024-03-07	Real-Time Planning Under Uncertainty for AUVs Using Virtual Maps	Ivana Collado-Gonzalez et.al.	2403.04936v1	null
2024-03-07	That's My Point: Compact Object-centric LiDAR Pose Estimation for Large-scale Outdoor Localisation	Georgi Pramatarov et.al.	2403.04755v1	null
2024-03-07	Disentangled Diffusion-Based 3D Human Pose Estimation with Hierarchical Spatial and Temporal Denoiser	Qingyuan Cai et.al.	2403.04444v1	link
2024-03-09	Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation	Ruicong Liu et.al.	2403.04381v2	link
2024-03-05	FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation	Chris Rockwell et.al.	2403.03221v1	null
2024-03-05	NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors	Yannan He et.al.	2403.03122v1	null
2024-03-05	Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection	Mohamed Afifi et.al.	2403.03111v1	null
2024-03-05	Splat-Nav: Safe Real-Time Robot Navigation in Gaussian Splatting Maps	Timothy Chen et.al.	2403.02751v1	link
2024-03-04	PowerSkel: A Device-Free Framework Using CSI Signal for Human Skeleton Estimation in Power Station	Cunyi Yin et.al.	2403.01913v1	link
2024-03-04	A Simple Baseline for Efficient Hand Mesh Reconstruction	Zhishan Zhou et.al.	2403.01813v1	null
2024-03-03	MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images	Junwen Huang et.al.	2403.01517v1	null
2024-03-02	Single-image camera calibration with model-free distortion correction	Katia Genovese et.al.	2403.01263v1	null
2024-03-02	Grid-based Fast and Structural Visual Odometry	Zhang Zhihe et.al.	2403.01110v1	null
2024-03-01	Optimal Robot Formations: Balancing Range-Based Observability and User-Defined Configurations	Syed Shabbir Ahmed et.al.	2403.00988v1	null
2024-03-04	TEXterity -- Tactile Extrinsic deXterity: Simultaneous Tactile Estimation and Control for Extrinsic Dexterity	Sangwoon Kim et.al.	2403.00049v2	null
2024-03-01	Graph Convolutional Neural Networks for Automated Echocardiography View Recognition: A Holistic Approach	Sarina Thomas et.al.	2402.19062v2	null
2024-02-29	Deep Learning for 3D Human Pose Estimation and Mesh Recovery: A Survey	Yang Liu et.al.	2402.18844v1	link
2024-02-28	Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting	Taeho Kang et.al.	2402.18330v1	link
2024-02-28	Location-guided Head Pose Estimation for Fisheye Image	Bing Li et.al.	2402.18320v1	null
2024-02-28	NToP: NeRF-Powered Large-scale Dataset Generation for 2D and 3D Human Pose Estimation in Top-View Fisheye Images	Jingrui Yu et.al.	2402.18196v1	link
2024-02-28	Six-Point Method for Multi-Camera Systems with Reduced Solution Space	Banglei Guan et.al.	2402.18066v1	link
2024-02-27	Real-Time Estimation of Relative Pose for UAVs Using a Dual-Channel Feature Association	Zhaoying Wang et.al.	2402.17504v1	null
2024-02-26	HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields	Haozhe Qi et.al.	2402.17062v1	link
2024-02-26	DRSI-Net: Dual-Residual Spatial Interaction Network for Multi-Person Pose Estimation	Shang Wu et.al.	2402.16640v1	null
2024-02-26	GEA: Reconstructing Expressive 3D Gaussian Avatar from Monocular Video	Xinqi Liu et.al.	2402.16607v1	null
2024-02-26	DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer	Yizhe Wu et.al.	2402.16308v1	null
2024-02-25	XAI-based gait analysis of patients walking with Knee-Ankle-Foot orthosis using video cameras	Arnav Mishra et.al.	2402.16175v1	null
2024-02-25	VOLoc: Visual Place Recognition by Querying Compressed Lidar Map	Xudong Cai et.al.	2402.15961v1	link
2024-02-24	CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge	Xiao Lin et.al.	2402.15726v1	null
2024-02-23	Optimized Deployment of Deep Neural Networks for Visual Pose Estimation on Nano-drones	Matteo Risso et.al.	2402.15273v1	null
2024-02-22	Cameras as Rays: Pose Estimation via Ray Diffusion	Jason Y. Zhang et.al.	2402.14817v1	null
2024-02-22	S^2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR	Jialun Pei et.al.	2402.14461v1	link
2024-02-22	VLPose: Bridging the Domain Gap in Pose Estimation with Language-Vision Tuning	Jingyao Li et.al.	2402.14456v1	null
2024-02-22	Modeling 3D Infant Kinetics Using Adaptive Graph Convolutional Networks	Daniel Holmberg et.al.	2402.14400v1	link
2024-02-22	Secure Navigation using Landmark-based Localization in a GPS-denied Environment	Ganesh Sapkota et.al.	2402.14280v1	null
2024-02-21	SecurePose: Automated Face Blurring and Human Movement Kinematics Extraction from Videos Recorded in Clinical Settings	Rishabh Bajpai et.al.	2402.14143v1	null
2024-02-21	High-throughput Visual Nano-drone to Nano-drone Relative Localization using Onboard Fully Convolutional Networks	Luca Crupi et.al.	2402.13756v1	null
2024-02-21	EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization	Zhendong Xiao et.al.	2402.13537v1	null
2024-02-20	DiffusionNOCS: Managing Symmetry and Uncertainty in Sim2Real Multi-Modal Category-level Pose Estimation	Takuya Ikeda et.al.	2402.12647v1	link
2024-02-19	Landmark-based Localization using Stereo Vision and Deep Learning in GPS-Denied Battlefield Environment	Ganesh Sapkota et.al.	2402.12551v1	null
2024-02-18	Boosting Semi-Supervised 2D Human Pose Estimation by Revisiting Data Augmentation and Consistency Training	Huayi Zhou et.al.	2402.11566v1	link
2024-02-17	Enhancing Surgical Performance in Cardiothoracic Surgery with Innovations from Computer Vision and Artificial Intelligence: A Narrative Review	Merryn D. Constable et.al.	2402.11288v1	null
2024-02-17	Dense Matchers for Dense Tracking	Tomáš Jelínek et.al.	2402.11287v1	null
2024-02-16	Occlusion Resilient 3D Human Pose Estimation	Soumava Kumar Roy et.al.	2402.11036v1	null
2024-02-16	3D Diffuser Actor: Policy Diffusion with 3D Scene Representations	Tsung-Wei Ke et.al.	2402.10885v1	null
2024-02-15	Lester: rotoscope animation through video object segmentation and tracking	Ruben Tous et.al.	2402.09883v1	link
2024-02-15	Foul prediction with estimated poses from soccer broadcast video	Jiale Fang et.al.	2402.09650v1	null
2024-02-16	IMUOptimize: A Data-Driven Approach to Optimal IMU Placement for Human Pose Estimation with Transformer Architecture	Varun Ramani et.al.	2402.08923v2	null
2024-02-13	Are Semi-Dense Detector-Free Methods Good at Matching Local Features?	Matthieu Vilain et.al.	2402.08671v1	null
2024-02-13	Gaussian-Sum Filter for Range-based 3D Relative Pose Estimation in the Presence of Ambiguities	Syed S. Ahmed et.al.	2402.08566v1	null
2024-02-13	Learning to Produce Semi-dense Correspondences for Visual Localization	Khang Truong Giang et.al.	2402.08359v1	link
2024-02-12	Extending 3D body pose estimation for robotic-assistive therapies of autistic children	Laura Santos et.al.	2402.08006v1	null
2024-02-12	GBOT: Graph-Based 3D Object Tracking for Augmented Reality-Assisted Assembly Guidance	Shiyu Li et.al.	2402.07677v1	link
2024-02-12	UAV-assisted Visual SLAM Generating Reconstructed 3D Scene Graphs in GPS-denied Environments	Ahmed Radwan et.al.	2402.07537v1	null
2024-02-09	Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation	Peter Hönig et.al.	2402.06436v1	null
2024-02-08	Real-time Holistic Robot Pose Estimation with Unknown States	Shikun Ban et.al.	2402.05655v1	link
2024-02-08	Extending 6D Object Pose Estimators for Stereo Vision	Thomas Pöllabauer et.al.	2402.05610v1	null
2024-02-09	NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction	Zhongqun Zhang et.al.	2402.05532v2	null
2024-02-07	Detection and Pose Estimation of flat, Texture-less Industry Objects on HoloLens using synthetic Training	Thomas Pöllabauer et.al.	2402.04979v1	null
2024-02-07	4-Dimensional deformation part model for pose estimation using Kalman filter constraints	Enrique Martinez-Berti et.al.	2402.04953v1	null
2024-02-07	STAR: Shape-focused Texture Agnostic Representations for Improved Object Detection and 6D Pose Estimation	Peter Hönig et.al.	2402.04878v1	link
2024-02-05	A Computer Vision Based Approach for Stalking Detection Using a CNN-LSTM-MLP Hybrid Fusion Model	Murad Hasan et.al.	2402.03417v1	null
2024-02-05	SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM	Mingrui Li et.al.	2402.03246v1	link
2024-02-05	Extreme Two-View Geometry From Object Poses with Diffusion Models	Yujing Sun et.al.	2402.02800v1	link
2024-02-04	Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation	Ti Wang et.al.	2402.02339v1	null
2024-02-01	mmID: High-Resolution mmWave Imaging for Human Identification	Sakila S. Jayaweera et.al.	2402.00996v1	null
2024-02-01	In-Bed Pose Estimation: A Review	Ziya Ata Yazıcı et.al.	2402.00700v1	null
2024-02-01	WayFASTER: a Self-Supervised Traversability Prediction for Increased Navigation Awareness	Mateus Valverde Gasparino et.al.	2402.00683v1	link
2024-02-02	CMRNext: Camera to LiDAR Matching in the Wild for Localization and Extrinsic Calibration	Daniele Cattaneo et.al.	2402.00129v2	null
2024-01-31	Improved Scene Landmark Detection for Camera Localization	Tien Do et.al.	2401.18083v1	link
2024-01-30	Navigating the Unknown: Uncertainty-Aware Compute-in-Memory Autonomy of Edge Robotics	Nastaran Darabi et.al.	2401.17481v1	null
2024-01-30	MESA: Matching Everything by Segmenting Anything	Yesheng Zhang et.al.	2401.16741v1	null
2024-01-30	Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers	Jianbin Jiao et.al.	2401.16700v1	link
2024-01-29	Leveraging Positional Encoding for Robust Multi-Reference-Based Object 6D Pose Estimation	Jaewoo Park et.al.	2401.16284v1	null
2024-01-29	Reconstructing Close Human Interactions from Multiple Views	Qing Shuai et.al.	2401.16173v1	link
2024-01-28	Multi-Person 3D Pose Estimation from Multi-View Uncalibrated Depth Cameras	Yu-Jhe Li et.al.	2401.15616v1	null
2024-01-30	Multi-Robot Relative Pose Estimation in SE(2) with Observability Analysis: A Comparison of Extended Kalman Filtering and Robust Pose Graph Optimization	Kihoon Shin et.al.	2401.15313v2	null
2024-01-26	Adaptive Deep Learning for Efficient Visual Pose Estimation aboard Ultra-low-power Nano-drones	Beatrice Alessandra Motetti et.al.	2401.15236v1	null
2024-01-26	SimpleEgo: Predicting Probabilistic Body Pose from Egocentric Cameras	Hanz Cuevas-Velasquez et.al.	2401.14785v1	null
2024-01-24	Synthetic data enables faster annotation and robust segmentation for multi-object grasping in clutter	Dongmyoung Lee et.al.	2401.13405v1	null
2024-01-24	Linear Relative Pose Estimation Founded on Pose-only Imaging Geometry	Qi Cai et.al.	2401.13357v1	null
2024-01-23	SemanticSLAM: Learning based Semantic Map Construction and Robust Camera Localization	Mingyang Li et.al.	2401.13076v1	link
2024-01-24	RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos	Hongchi Xia et.al.	2401.12592v2	null
2024-01-26	MobileARLoc: On-device Robust Absolute Localisation for Pervasive Markerless Mobile AR	Changkun Liu et.al.	2401.11511v2	null
2024-01-19	SCENES: Subpixel Correspondence Estimation With Epipolar Supervision	Dominik A. Kloepfer et.al.	2401.10886v1	null
2024-01-19	Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation	Prakhar Kaushik et.al.	2401.10848v1	null
2024-01-22	TEXterity: Tactile Extrinsic deXterity	Antonia Bronars et.al.	2401.10230v2	null
2024-01-18	Exploring Latent Cross-Channel Embedding for Accurate 3D Human Pose Reconstruction in a Diffusion Framework	Junkun Jiang et.al.	2401.09836v1	link
2024-01-17	DK-SLAM: Monocular Visual SLAM with Deep Keypoints Adaptive Learning, Tracking and Loop-Closing	Hao Qu et.al.	2401.09160v1	null
2024-01-17	PIN-SLAM: LiDAR SLAM Using a Point-Based Implicit Neural Representation for Achieving Global Map Consistency	Yue Pan et.al.	2401.09101v1	link
2024-01-16	AdaSem: Adaptive Goal-Oriented Semantic Communications for End-to-End Camera Relocalization	Qi Liao et.al.	2401.08360v1	null
2024-01-16	S3M: Semantic Segmentation Sparse Mapping for UAVs with RGB-D Camera	Thanh Nguyen Canh et.al.	2401.08134v1	null
2024-01-15	Collaboratively Self-supervised Video Representation Learning for Action Recognition	Jie Zhang et.al.	2401.07584v1	null
2024-01-14	3D Landmark Detection on Human Point Clouds: A Benchmark and A Dual Cascade Point Transformer Framework	Fan Zhang et.al.	2401.07251v1	null
2024-01-11	On the representation and methodology for wide and short range head pose estimation	Alejandro Cobo et.al.	2401.05807v1	link
2024-01-10	Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects	Tianhang Cheng et.al.	2401.05236v1	link
2024-01-10	Video-based Automatic Lameness Detection of Dairy Cows using Pose Estimation and Multiple Locomotion Traits	Helena Russello et.al.	2401.05202v1	null
2024-01-10	Diffusion-based Pose Refinement and Muti-hypothesis Generation for 3D Human Pose Estimaiton	Hongbo Kang et.al.	2401.04921v1	link
2024-01-15	Towards Real-World Aerial Vision Guidance with Categorical 6D Pose Tracker	Jingtao Sun et.al.	2401.04377v2	link
2024-01-07	RHOBIN Challenge: Reconstruction of Human Object Interaction	Xianghui Xie et.al.	2401.04143v1	null
2024-01-08	D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement	Danqi Yan et.al.	2401.03914v1	null
2024-01-07	Big Data and Deep Learning in Smart Cities: A Comprehensive Dataset for AI-Driven Traffic Accident Detection and Computer Vision Systems	Victor Adewopo et.al.	2401.03587v1	null
2024-01-04	Survey of 3D Human Body Pose and Shape Estimation Methods for Contemporary Dance Applications	Darshan Venkatrayappa et.al.	2401.02383v1	null
2024-01-04	Fit-NGP: Fitting Object Models to Neural Graphics Primitives	Marwan Taher et.al.	2401.02357v1	null
2024-01-04	PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DOF Object Pose Dataset Generation	Lukas Meyer et.al.	2401.02281v1	link
2024-01-03	Real-Time Human Fall Detection using a Lightweight Pose Estimation Technique	Ekram Alam et.al.	2401.01587v1	link
2024-01-05	PLE-SLAM: A Visual-Inertial SLAM Based on Point-Line Features and Efficient IMU Initialization	Jiaming He et.al.	2401.01081v2	link
2023-12-30	3D Human Pose Perception from Egocentric Stereo Videos	Hiroyasu Akada et.al.	2401.00889v1	null
2024-01-01	Geometry Depth Consistency in RGBD Relative Pose Estimation	Sourav Kumar et.al.	2401.00639v1	null
2023-12-30	A comprehensive framework for occluded human pose estimation	Linhao Xu et.al.	2401.00155v1	null
2024-01-02	6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation	Li Xu et.al.	2401.00029v2	null
2023-12-29	MURP: Multi-Agent Ultra-Wideband Relative Pose Estimation with Constrained Communications in 3D Environments	Andrew Fishberg et.al.	2312.17731v1	link
2023-12-28	iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views	Chin-Hsuan Wu et.al.	2312.17250v1	link
2023-12-28	EvPlug: Learn a Plug-and-Play Module for Event and Image Fusion	Jianping Jiang et.al.	2312.16933v1	null
2023-12-28	SR-LIVO: LiDAR-Inertial-Visual Odometry and Mapping with Sweep Reconstruction	Zikang Yuan et.al.	2312.16800v1	link
2023-12-28	L-LO: Enhancing Pose Estimation Precision via a Landmark-Based LiDAR Odometry	Feiya Li et.al.	2312.16787v1	null
2023-12-27	HMP: Hand Motion Priors for Pose and Shape Estimation from Video	Enes Duran et.al.	2312.16737v1	null
2023-12-27	Camera calibration for the surround-view system: a benchmark and dataset	L Qin et.al.	2312.16499v1	null
2023-12-24	TEMP3D: Temporally Continuous 3D Human Pose Estimation Under Occlusions	Rohit Lal et.al.	2312.16221v1	link
2023-12-26	Graph Context Transformation Learning for Progressive Correspondence Pruning	Junwen Guo et.al.	2312.15971v1	link
2023-12-25	Lifting by Image -- Leveraging Image Cues for Accurate 3D Human Pose Estimation	Feng Zhou et.al.	2312.15636v1	null
2023-12-25	APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond	Yuxiang Yang et.al.	2312.15612v1	link
2023-12-23	PACE: Pose Annotations in Cluttered Environments	Yang You et.al.	2312.15130v1	link
2023-12-22	PoseGen: Learning to Generate 3D Human Pose Dataset with NeRF	Mohsen Gholami et.al.	2312.14915v1	link
2023-12-22	Harnessing Diffusion Models for Visual Perception with Meta Prompts	Qiang Wan et.al.	2312.14733v1	link
2023-12-22	Pola4All: survey of polarimetric applications and an open-source toolkit to analyze polarization	Joaquin Rodriguez et.al.	2312.14697v1	link
2023-12-22	PoseViNet: Distracted Driver Action Recognition Framework Using Multi-View Pose Estimation and Vision Transformer	Neha Sengar et.al.	2312.14577v1	null
2023-12-22	Scalable 3D Reconstruction From Single Particle X-Ray Diffraction Images Based on Online Machine Learning	Jay Shenoy et.al.	2312.14432v1	null
2023-12-21	3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera	Christen Millerdurai et.al.	2312.14157v1	null
2023-12-21	DUSt3R: Geometric 3D Vision Made Easy	Shuzhe Wang et.al.	2312.14132v1	link
2023-12-20	NeRF-VO: Real-Time Sparse Visual Odometry with Neural Radiance Fields	Jens Naumann et.al.	2312.13471v1	null
2023-12-20	Brain-Inspired Visual Odometry: Balancing Speed and Interpretability through a System of Systems Approach	Habib Boloorchi Tabrizi et.al.	2312.13162v1	link
2023-12-18	Unified framework for diffusion generative models in SO(3): applications in computer vision and astrophysics	Yesukhei Jagvaral et.al.	2312.11707v1	null
2023-12-18	Underwater Robot Pose Estimation Using Acoustic Methods and Intermittent Position Measurements at the Surface	Vicu-Mihalis Maer et.al.	2312.11401v1	null
2023-12-17	SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation	Xiaoqi An et.al.	2312.10758v1	link
2023-12-17	PNeRFLoc: Visual Localization with Point-based Neural Radiance Fields	Boming Zhao et.al.	2312.10649v1	null
2023-12-15	SoloPose: One-Shot Kinematic 3D Human Pose Estimation with Video Data Augmentation	David C. Jeong et.al.	2312.10195v1	link
2023-12-14	iComMa: Inverting 3D Gaussians Splatting for Camera Pose Estimation via Comparing and Matching	Yuan Sun et.al.	2312.09031v1	null
2023-12-14	Scene 3-D Reconstruction System in Scattering Medium	Zhuoyifan Zhang et.al.	2312.09005v1	null
2023-12-14	CattleEyeView: A Multi-task Top-down View Cattle Dataset for Smarter Precision Livestock Farming	Kian Eng Ong et.al.	2312.08764v1	link
2023-12-20	PnP for Two-Dimensional Pose Estimation	Joshua Wang et.al.	2312.08488v2	link
2023-12-13	Pose and shear-based tactile servoing	John Lloyd et.al.	2312.08411v1	null
2023-12-13	FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects	Bowen Wen et.al.	2312.08344v1	link
2023-12-13	Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation	Arul Selvam Periyasamy et.al.	2312.08268v1	null
2023-12-13	CenterGrasp: Object-Aware Implicit Representation Learning for Simultaneous Shape Reconstruction and 6-DoF Grasp Estimation	Eugenio Chisari et.al.	2312.08240v1	null
2023-12-13	C-BEV: Contrastive Bird's Eye View Training for Cross-View Image Retrieval and 3-DoF Pose Estimation	Florian Fervers et.al.	2312.08060v1	null
2023-12-13	Three-Filters-to-Normal+: Revisiting Discontinuity Discrimination in Depth-to-Normal Translation	Jingwei Yang et.al.	2312.07964v1	null
2023-12-13	Diffusion Models Enable Zero-Shot Pose Estimation for Lower-Limb Prosthetic Users	Tianxun Zhou et.al.	2312.07854v1	null
2023-12-12	RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation	Peng Lu et.al.	2312.07526v1	link
2023-12-12	COLMAP-Free 3D Gaussian Splatting	Yang Fu et.al.	2312.07504v1	link
2023-12-12	RMS: Redundancy-Minimizing Point Cloud Sampling for Real-Time Pose Estimation in Degenerated Environments	Pavel Petracek et.al.	2312.07337v1	link
2023-12-12	Unifying Correspondence, Pose and NeRF for Pose-Free Novel View Synthesis from Stereo Pairs	Sunghwan Hong et.al.	2312.07246v1	link
2023-12-12	Mask as Supervision: Leveraging Unified Mask Information for Unsupervised 3D Pose Estimation	Yuchen Yang et.al.	2312.07051v1	link
2023-12-12	Towards Enhanced Human Activity Recognition through Natural Language Generation and Pose Estimation	Nikhil Kashyap et.al.	2312.06965v1	null
2023-12-12	Exploring Novel Object Recognition and Spontaneous Location Recognition Machine Learning Analysis Techniques in Alzheimer's Mice	Soham Bafana et.al.	2312.06914v1	link
2023-12-11	Keypoint-based Stereophotoclinometry for Characterizing and Navigating Small Bodies: A Factor Graph Approach	Travis Driver et.al.	2312.06865v1	link
2023-12-11	Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input	Trung-Hieu Hoang et.al.	2312.06797v1	null
2023-12-11	3D Hand Pose Estimation in Egocentric Images in the Wild	Aditya Prakash et.al.	2312.06583v1	null
2023-12-11	PointVoxel: A Simple and Effective Pipeline for Multi-View Multi-Modal 3D Human Pose Estimation	Zhiyu Pan et.al.	2312.06409v1	null
2023-12-11	ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation	Cédric Rommel et.al.	2312.06386v1	link
2023-12-10	From Correspondences to Pose: Non-minimal Certifiably Optimal Relative Pose without Disambiguation	Javier Tirado-Garín et.al.	2312.05995v1	link
2023-12-09	You Only Learn One Query: Learning Unified Human Query for Single-Stage Multi-Person Multi-Task Human-Centric Perception	Sheng Jin et.al.	2312.05525v1	link
2023-12-07	Image and AIS Data Fusion Technique for Maritime Computer Vision Applications	Emre Gülsoylu et.al.	2312.05270v1	link
2023-12-07	Correspondences of the Third Kind: Camera Pose Estimation from Object Reflection	Kohei Yamashita et.al.	2312.04527v1	null
2023-12-07	Detecting and Restoring Non-Standard Hands in Stable Diffusion Generated Images	Yiqun Zhang et.al.	2312.04236v1	null
2023-12-06	Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning	Xinshun Wang et.al.	2312.03703v1	link
2023-12-06	Cooperative Probabilistic Trajectory Forecasting under Occlusion	Anshul Nayak et.al.	2312.03296v1	null
2023-12-05	A Unified Simulation Framework for Visual and Behavioral Fidelity in Crowd Analysis	Niccolò Bisagno et.al.	2312.02613v1	null
2023-12-05	6D Assembly Pose Estimation by Point Cloud Registration for Robot Manipulation	K. Samarawickrama et.al.	2312.02593v1	link
2023-12-05	PolyFit: A Peg-in-hole Assembly Framework for Unseen Polygon Shapes via Sim-to-real Adaptation	Geonhyup Lee et.al.	2312.02531v1	null
2023-12-04	GenEM: Physics-Informed Generative Cryo-Electron Microscopy	Jiakai Zhang et.al.	2312.02235v1	null
2023-12-02	Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors	Yu Zhang et.al.	2312.02196v1	link
2023-12-04	iMatching: Imperative Correspondence Learning	Zitong Zhan et.al.	2312.02141v1	link
2023-12-04	SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM	Nikhil Keetha et.al.	2312.02126v1	link
2023-12-04	Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection	Xubin Zhong et.al.	2312.01713v1	null
2023-12-05	Hulk: A Universal Knowledge Translator for Human-Centric Tasks	Yizhou Wang et.al.	2312.01697v2	link
2023-12-04	Multi-View Person Matching and 3D Pose Estimation with Arbitrary Uncalibrated Camera Networks	Yan Xu et.al.	2312.01561v1	null
2023-12-01	Object 6D pose estimation meets zero-shot learning	Andrea Caraffa et.al.	2312.00947v1	null
2023-12-01	Open-vocabulary object 6D pose estimation	Jaime Corsetti et.al.	2312.00690v1	null
2023-12-01	Global Localization: Utilizing Relative Spatio-Temporal Geometric Constraints from Adjacent and Distant Cameras	Mohammad Altillawi et.al.	2312.00500v1	null
2023-12-01	Learning Unorthogonalized Matrices for Rotation Estimation	Kerui Gu et.al.	2312.00462v1	null
2023-11-30	PoseGPT: Chatting about 3D Human Pose	Yao Feng et.al.	2311.18836v1	null
2023-11-30	FoundPose: Unseen Object Pose Estimation with Foundation Features	Evin Pınar Örnek et.al.	2311.18809v1	null
2023-11-30	Pose Estimation and Tracking for ASIST	Ari Goodman et.al.	2311.18665v1	null
2023-11-29	A Stochastic-Geometrical Framework for Object Pose Estimation based on Mixture Models Avoiding the Correspondence Problem	Wolfgang Hoegele et.al.	2311.18107v1	null
2023-11-29	Pose Anything: A Graph-Based Approach for Category-Agnostic Pose Estimation	Or Hirschorn et.al.	2311.17891v1	link
2023-11-29	Cinematic Behavior Transfer via NeRF-based Differentiable Filming	Xuekun Jiang et.al.	2311.17754v1	null
2023-11-29	PViT-6D: Overclocking Vision Transformers for 6D Pose Estimation with Confidence-Level Prediction and Pose Tokens	Sebastian Stapf et.al.	2311.17504v1	null
2023-11-28	On the Calibration of Human Pose Estimation	Kerui Gu et.al.	2311.17105v1	null
2023-11-28	Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence	Junyi Zhang et.al.	2311.17034v1	link
2023-11-28	HandyPriors: Physically Consistent Perception of Hand-Object Interactions with Differentiable Priors	Shutong Zhang et.al.	2311.16552v1	null
2023-11-28	Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement	Jian Wang et.al.	2311.16495v1	null
2023-11-24	UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning	Zhongyu Jiang et.al.	2311.16477v1	null
2023-11-27	DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization	Zhaoyang Xia et.al.	2311.16060v1	link
2023-11-27	Uncertainty Quantification of Set-Membership Estimation in Control and Perception: Revisiting the Minimum Enclosing Ellipsoid	Yukai Tang et.al.	2311.15962v1	null
2023-11-27	Computer Vision for Carriers: PATRIOT	Ari Goodman et.al.	2311.15914v1	null
2023-11-27	SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation	Jiehong Lin et.al.	2311.15707v1	link
2023-11-24	RSB-Pose: Robust Short-Baseline Binocular 3D Human Pose Estimation with Occlusion Handling	Xiaoyue Wan et.al.	2311.14242v1	null
2023-11-23	Appearance-based gaze estimation enhanced with synthetic images using deep neural networks	Dmytro Herashchenko et.al.	2311.14175v1	link
2023-11-23	GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence	Van Nguyen Nguyen et.al.	2311.14155v1	link
2023-11-23	GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence	Pengyuan Wang et.al.	2311.13777v1	null
2023-11-22	HEViTPose: High-Efficiency Vision Transformer for Human Pose Estimation	Chengpeng Wu et.al.	2311.13615v1	link
2023-11-24	Calibration System and Algorithm Design for a Soft Hinged Micro Scanning Mirror with a Triaxial Hall Effect Sensor	Di Wang et.al.	2311.12778v2	null
2023-11-21	HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation	Yongliang Lin et.al.	2311.12588v1	link
2023-11-21	CoVOR-SLAM: Cooperative SLAM using Visual Odometry and Ranges for Multi-Robot Systems	Young-Hee Lee et.al.	2311.12580v1	null
2023-11-21	HCA-Net: Hierarchical Context Attention Network for Intervertebral Disc Semantic Labeling	Afshin Bozorgpour et.al.	2311.12486v1	link
2023-11-21	Two Views Are Better than One: Monocular 3D Pose Estimation with Multiview Consistency	Christian Keilstrup Ingwersen et.al.	2311.12421v1	null
2023-11-20	Fingerspelling PoseNet: Enhancing Fingerspelling Translation with Pose-Based Transformer Models	Pooya Fayyazsanavi et.al.	2311.12128v1	link
2023-11-20	Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation	Wenhao Li et.al.	2311.12028v1	link
2023-11-20	SniffyArt: The Dataset of Smelling Persons	Mathias Zinnen et.al.	2311.11888v1	null
2023-11-21	Robot Hand-Eye Calibration using Structure-from-Motion	Nicolas Andreff et.al.	2311.11808v2	null
2023-11-18	SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation	Yamei Chen et.al.	2311.11125v1	link
2023-11-18	Synthetic Data Generation for Bridging Sim2Real Gap in a Production Environment	Parth Rawal et.al.	2311.11039v1	null
2023-11-18	Multiple View Geometry Transformers for 3D Human Pose Estimation	Ziwei Liao et.al.	2311.10983v1	link
2023-11-18	Jenga Stacking Based on 6D Pose Estimation for Architectural Form Finding Process	Zixun Huang et.al.	2311.10918v1	null
2023-11-17	BiHRNet: A Binary high-resolution network for Human Pose Estimation	Zhicheng Zhang et.al.	2311.10296v1	null
2023-11-16	Match and Locate: low-frequency monocular odometry based on deep feature matching	Stepan Konev et.al.	2311.10034v1	null
2023-11-16	LIO-EKF: High Frequency LiDAR-Inertial Odometry using Extended Kalman Filters	Yibin Wu et.al.	2311.09887v1	link
2023-11-16	Improved TokenPose with Sparsity	Anning Li et.al.	2311.09653v1	null
2023-11-16	Pseudo-keypoints RKHS Learning for Self-supervised 6DoF Pose Estimation	Yangzheng Wu et.al.	2311.09500v1	null
2023-11-15	NormNet: Scale Normalization for 6D Pose Estimation in Stacked Scenarios	En-Te Lin et.al.	2311.09269v1	link
2023-11-15	Range-Visual-Inertial Sensor Fusion for Micro Aerial Vehicle Localization and Navigation	Abhishek Goudar et.al.	2311.09056v1	link
2023-11-14	LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping	Sujal Vijayaraghavan et.al.	2311.08438v1	null
2023-11-13	SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models	Ziyi Lin et.al.	2311.07575v1	link
2023-11-13	Bio-Inspired Grasping Controller for Sensorized 2-DoF Grippers	Luca Lach et.al.	2311.07257v1	link
2023-11-10	CESPED: a new benchmark for supervised particle pose estimation in Cryo-EM	Ruben Sanchez-Garcia et.al.	2311.06194v1	link
2023-11-10	2D Image head pose estimation via latent space regression under occlusion settings	José Celestino et.al.	2311.06038v1	link
2023-11-10	Robust Adversarial Attacks Detection for Deep Learning based Relative Pose Estimation for Space Rendezvous	Ziwei Wang et.al.	2311.05992v1	null
2023-11-10	A Practical Guide to Implementing Off-Axis Stereo Projection Using Existing Ray Tracing Libraries	Stefan Zellmann et.al.	2311.05887v1	link
2023-11-09	Visually Guided Model Predictive Robot Control via 6D Object Pose Localization and Tracking	Mederic Fourmy et.al.	2311.05344v1	null
2023-11-09	Spatial Attention-based Distribution Integration Network for Human Pose Estimation	Sihan Gao et.al.	2311.05323v1	null
2023-11-09	SPADES: A Realistic Spacecraft Pose Estimation Dataset using Event Sensing	Arunkumar Rathinam et.al.	2311.05310v1	null
2023-11-09	Differentiable Cloth Parameter Identification and State Estimation in Manipulation	Dongzhe Zheng et.al.	2311.05141v1	null
2023-11-09	POISE: Pose Guided Human Silhouette Extraction under Occlusions	Arindam Dutta et.al.	2311.05077v1	link
2023-11-08	Active Transfer Learning for Efficient Video-Specific Human Pose Estimation	Hiromu Taketsugu et.al.	2311.05041v1	link
2023-11-08	3D Pose Estimation of Tomato Peduncle Nodes using Deep Keypoint Detection and Point Cloud	Jianchao Ci et.al.	2311.04699v1	null
2023-11-09	Rethinking Human Pose Estimation for Autonomous Driving with 3D Event Representations	Xiaoting Yin et.al.	2311.04591v2	link
2023-11-08	Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images	Nishant Jain et.al.	2311.04521v1	null
2023-11-08	PLV-IEKF: Consistent Visual-Inertial Odometry using Points, Lines, and Vanishing Points	Tong Hua et.al.	2311.04477v1	null
2023-11-08	UP-NeRF: Unconstrained Pose-Prior-Free Neural Radiance Fields	Injae Kim et.al.	2311.03784v2	link
2023-11-06	A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation	Qitao Zhao et.al.	2311.03312v1	null
2023-11-06	Enabling In-Situ Resources Utilisation by leveraging collaborative robotics and astronaut-robot interaction	Silvia Romero-Azpitarte et.al.	2311.03146v1	null
2023-11-06	Simultaneous Time Synchronization and Mutual Localization for Multi-robot System	Xiangyong Wen et.al.	2311.02948v1	null
2023-11-06	Initialisation of Autonomous Aircraft Visual Inspection Systems via CNN-Based Camera Pose Estimation	Xueyan Oh et.al.	2311.02900v1	null
2023-11-06	Efficient, Self-Supervised Human Pose Estimation with Inductive Prior Tuning	Nobline Yoo et.al.	2311.02815v1	link
2023-11-03	Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression	Jiaqi Wu et.al.	2311.01782v1	link
2023-11-03	Modeling the Uncertainty with Maximum Discrepant Students for Semi-supervised 2D Pose Estimation	Jiaqi Wu et.al.	2311.01770v1	null
2023-11-02	Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors	Gabriele M. Caddeo et.al.	2311.01380v1	link
2023-11-01	A Spatial-Temporal Transformer based Framework For Human Pose Assessment And Correction in Education Scenarios	Wenyang Hu et.al.	2311.00401v1	null
2023-10-31	HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception	Junkun Yuan et.al.	2310.20695v1	link
2023-10-31	Pose-to-Motion: Cross-Domain Motion Retargeting with Pose Prior	Qingqing Zhao et.al.	2310.20249v1	null
2023-10-30	FetusMapV2: Enhanced Fetal Pose Estimation in 3D Ultrasound	Chaoyu Chen et.al.	2310.19293v1	null
2023-10-29	Distributed Nonlinear Filtering using Triangular Transport Maps	Daniel Grange et.al.	2310.19000v1	null
2023-10-29	TIC-TAC: A Framework To Learn And Evaluate Your Covariance	Megh Shukla et.al.	2310.18953v1	link
2023-10-29	Improving Multi-Person Pose Tracking with A Confidence Network	Zehua Fu et.al.	2310.18920v1	null
2023-10-29	HDMNet: A Hierarchical Matching Network with Double Attention for Large-scale Outdoor LiDAR Point Cloud Registration	Weiyi Xue et.al.	2310.18874v1	null
2023-10-28	Enhancing Grasping Performance of Novel Objects through an Improved Fine-Tuning Process	Xiao Hu et.al.	2310.18569v1	null
2023-10-27	ProcNet: Deep Predictive Coding Model for Robust-to-occlusion Visual Segmentation and Pose Estimation	Michael Zechmair et.al.	2310.18009v1	null
2023-10-26	Learning Extrinsic Dexterity with Parameterized Manipulation Primitives	Shih-Min Yang et.al.	2310.17785v1	null
2023-10-26	6-DoF Stability Field via Diffusion Models	Takuma Yoneda et.al.	2310.17649v1	null
2023-10-26	SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation	Haobo Jiang et.al.	2310.17359v1	null
2023-10-26	Automatic Edge Error Judgment in Figure Skating Using 3D Pose Estimation from a Monocular Camera and IMUs	Ryota Tanaka et.al.	2310.17193v1	link
2023-10-25	Real-time 6-DoF Pose Estimation by an Event-based Camera using Active LED Markers	Gerald Ebmer et.al.	2310.16618v1	null
2023-10-25	ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors	Xiaoxuan Ma et.al.	2310.16447v1	link
2023-10-25	MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network	Soroush Mehraban et.al.	2310.16288v1	link
2023-10-25	TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer	Xiao Lin et.al.	2310.16279v1	null
2023-10-23	Converting Depth Images and Point Clouds for Feature-based Pose Estimation	Robert Lösch et.al.	2310.14924v1	link
2023-10-23	Object Pose Estimation Annotation Pipeline for Multi-view Monocular Camera Systems in Industrial Settings	Hazem Youssef et.al.	2310.14914v1	null
2023-10-23	Player Re-Identification Using Body Part Appearences	Mahesh Bhosale et.al.	2310.14469v1	null
2023-10-20	LanPose: Language-Instructed 6D Object Pose Estimation for Robotic Assembly	Bowen Fu et.al.	2310.13819v1	null
2023-10-20	FMRT: Learning Accurate Feature Matching with Reconciliatory Transformer	Xinyu Zhang et.al.	2310.13605v1	null
2023-10-20	ColAG: A Collaborative Air-Ground Framework for Perception-Limited UGVs' Navigation	Zhehan Li et.al.	2310.13324v1	link
2023-10-20	CylinderTag: An Accurate and Flexible Marker for Cylinder-Shape Objects Pose Estimation Based on Projective Invariants	Shaoan Wang et.al.	2310.13320v1	link
2023-10-19	Human Pose-based Estimation, Tracking and Action Recognition with Deep Learning: A Survey	Lijuan Zhou et.al.	2310.13039v1	null
2023-10-19	FSD: Fast Self-Supervised Single RGB-D to Categorical 3D Objects	Mayank Lunayach et.al.	2310.12974v1	link
2023-10-18	Mesh Represented Recycle Learning for 3D Hand Pose and Mesh Estimation	Bosang Kim et.al.	2310.12189v1	null
2023-10-18	One-Shot Imitation Learning: A Pose Estimation Perspective	Pietro Vitiello et.al.	2310.12077v1	null
2023-10-18	ShapeGraFormer: GraFormer-Based Network for Hand-Object Reconstruction from a Single Depth Map	Ahmed Tawfik Aboukhadra et.al.	2310.11811v1	null
2023-10-17	Holistic Parking Slot Detection with Polygon-Shaped Representations	Lihao Wang et.al.	2310.11629v1	null
2023-10-17	Diver Interest via Pointing in Three Dimensions: 3D Pointing Reconstruction for Diver-AUV Communication	Chelsey Edge et.al.	2310.11536v1	null
2023-10-18	AP$n$P: A Less-constrained P$n$P Solver for Pose Estimation with Unknown Anisotropic Scaling or Focal Lengths	Jiaxin Wei et.al.	2310.09982v2	link
2023-10-15	Tabletop Transparent Scene Reconstruction via Epipolar-Guided Optical Flow with Monocular Depth Completion Prior	Xiaotong Chen et.al.	2310.09956v1	null
2023-10-15	Socially reactive navigation models for mobile robots in dynamic environments	Ricarte Ribeiro et.al.	2310.09916v1	link
2023-10-15	MoEmo Vision Transformer: Integrating Cross-Attention and Movement Vectors in 3D Pose Estimation for HRI Emotion Detection	David C. Jeong et.al.	2310.09757v1	link
2023-10-16	IMU Preintegration for Multi-Robot Systems in the Presence of Bias and Communication Constraints	Mohammed Ayman Shalaby et.al.	2310.08686v2	null
2023-10-12	Towards Design and Development of an ArUco Markers-Based Quantitative Surface Tactile Sensor	Ozdemir Can Kara et.al.	2310.08398v1	null
2023-10-12	Multimodal Active Measurement for Human Mesh Recovery in Close Proximity	Takahiro Maeda et.al.	2310.08116v1	link
2023-10-12	X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention	Yixuan Zhou et.al.	2310.08042v1	link
2023-10-12	PoRF: Pose Residual Field for Accurate Neural Surface Reconstruction	Jia-Wang Bian et.al.	2310.07449v2	link
2023-10-11	SAGE-ICP: Semantic Information-Assisted ICP	Jiaming Cui et.al.	2310.07237v1	link
2023-10-11	DeepSimHO: Stable Pose Estimation for Hand-Object Interaction via Physics Simulation	Rong Wang et.al.	2310.07206v1	link
2023-10-12	FABind: Fast and Accurate Protein-Ligand Binding	Qizhi Pei et.al.	2310.06763v2	link
2023-10-10	EARL: Eye-on-Hand Reinforcement Learner for Dynamic Grasping with Active Pose Estimation	Baichuan Huang et.al.	2310.06751v1	null
2023-10-09	Augmenting Vision-Based Human Pose Estimation with Rotation Matrix	Milad Vazan et.al.	2310.06068v1	null
2023-10-07	Federated Self-Supervised Learning of Monocular Depth Estimators for Autonomous Vehicles	Elton F. de S. Soares et.al.	2310.04837v1	null
2023-10-10	1st Place Solution of Egocentric 3D Hand Pose Estimation Challenge 2023 Technical Report: A Concise Pipeline for Egocentric Hand Pose Reconstruction	Zhishan Zhou et.al.	2310.04769v2	null
2023-10-06	SwimXYZ: A large-scale dataset of synthetic swimming motions and videos	Fiche Guénolé et.al.	2310.04360v1	null
2023-10-05	BID-NeRF: RGB-D image pose estimation with inverted Neural Radiance Fields	Ágoston István Csehi et.al.	2310.03563v1	null
2023-10-05	3D-Aware Hypothesis & Verification for Generalizable Relative Object Pose Estimation	Chen Zhao et.al.	2310.03534v1	null
2023-10-05	RGBManip: Monocular Image-based Robotic Manipulation through Active Object Pose Estimation	Boshi An et.al.	2310.03478v1	null
2023-10-05	Cyber Physical System Information Collection: Robot Location and Navigation Method Based on QR Code	Hongwei Li et.al.	2310.03470v1	null
2023-10-04	Condition numbers in multiview geometry, instability in relative pose estimation, and RANSAC	Hongyi Fan et.al.	2310.02719v1	null
2023-10-05	USB-NeRF: Unrolling Shutter Bundle Adjusted Neural Radiance Fields	Moyang Li et.al.	2310.02687v2	link
2023-10-03	Beyond the Benchmark: Detecting Diverse Anomalies in Videos	Yoav Arad et.al.	2310.01904v1	link
2023-10-03	MFOS: Model-Free & One-Shot Object Pose Estimation	JongMin Lee et.al.	2310.01897v1	null
2023-10-02	LEAP: Liberate Sparse-view 3D Modeling from Camera Poses	Hanwen Jiang et.al.	2310.01410v1	link
2023-10-02	H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation	Yanjie Ze et.al.	2310.01404v1	link
2023-10-04	Self-supervised Learning of Contextualized Local Visual Embeddings	Thalles Santos Silva et.al.	2310.00527v3	link
2023-09-30	Diff-DOPE: Differentiable Deep Object Pose Estimation	Jonathan Tremblay et.al.	2310.00463v1	null
2023-09-29	Diver Identification Using Anthropometric Data Ratios for Underwater Multi-Human-Robot Collaboration	Jungseok Hong et.al.	2310.00146v1	null
2023-09-29	Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation	Zhuoran Yu et.al.	2310.00099v1	null
2023-09-29	Revisiting Cephalometric Landmark Detection from the view of Human Pose Estimation with Lightweight Super-Resolution Head	Qian Wu et.al.	2309.17143v1	link
2023-09-29	AdaPose: Towards Cross-Site Device-Free Human Pose Estimation with Commodity WiFi	Yunjiao Zhou et.al.	2309.16964v1	null
2023-09-28	End-to-End (Instance)-Image Goal Navigation through Correspondence as an Emergent Phenomenon	Guillaume Bono et.al.	2309.16634v1	null
2023-09-28	Off-the-shelf bin picking workcell with visual pose estimation: A case study on the world robot summit 2018 kitting task	Frederik Hagelskjær et.al.	2309.16221v1	null
2023-09-28	Cloth2Body: Generating 3D Human Body Mesh from 2D Clothing	Lu Dai et.al.	2309.16189v1	null
2023-09-28	Laboratory Automation: Precision Insertion with Adaptive Fingers utilizing Contact through Sliding with Tactile-based Pose Estimation	Sameer Pai et.al.	2309.16170v1	null
2023-09-28	CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting	Shaoxiang Guo et.al.	2309.16140v1	null
2023-09-28	A Modular Bio-inspired Robotic Hand with High Sensitivity	Chao Liu et.al.	2309.16081v1	null
2023-09-27	Handbook on Leveraging Lines for Two-View Relative Pose Estimation	Petr Hruby et.al.	2309.16040v1	null
2023-09-27	Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature	Shengze Jin et.al.	2309.16023v1	null
2023-09-27	Analysis on Multi-robot Relative 6-DOF Pose Estimation Error Based on UWB Range	Xinran Li et.al.	2309.15367v1	null
2023-09-26	Unsupervised Reconstruction of 3D Human Pose Interactions From 2D Poses Alone	Peter Hardy et.al.	2309.14865v1	null
2023-09-26	Learning Vision-Based Bipedal Locomotion for Challenging Terrain	Helei Duan et.al.	2309.14594v1	null
2023-09-25	Spring-IMU Fusion Based Proprioception for Feedback Control of Soft Manipulators	Yinan Meng et.al.	2309.14279v1	null
2023-09-25	Industrial Application of 6D Pose Estimation for Robotic Manipulation in Automotive Internal Logistics	Philipp Quentin et.al.	2309.14265v1	null
2023-09-25	BoIR: Box-Supervised Instance Representation for Multi-Person Pose Estimation	Uyoung Jeong et.al.	2309.14072v1	link
2023-09-24	Towards Subcentimeter Accuracy Digital-Twin Tracking via An RGBD-based Transformer Model and A Comprehensive Mobile Dataset	Zixun Huang et.al.	2309.13570v1	link
2023-09-21	ORTexME: Occlusion-Robust Human Shape and Pose via Temporal Average Texture and Mesh Encoding	Yu Cheng et.al.	2309.12183v1	null
2023-09-21	ZS6D: Zero-shot 6D Object Pose Estimation using Vision Transformers	Philipp Ausserlechner et.al.	2309.11986v1	null
2023-09-21	Ego3DPose: Capturing 3D Cues from Binocular Egocentric Views	Taeho Kang et.al.	2309.11962v1	link
2023-09-21	A Real-Time Multi-Task Learning System for Joint Detection of Face, Facial Landmark and Head Pose	Qingtian Wu et.al.	2309.11773v1	null
2023-09-20	Understanding Pose and Appearance Disentanglement in 3D Human Pose Estimation	Krishna Kanth Nakka et.al.	2309.11667v1	null
2023-09-20	Online Supervised Training of Spaceborne Vision during Proximity Operations using Adaptive Kalman Filtering	Tae Ha Park et.al.	2309.11645v1	null
2023-09-20	OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving	Heng Li et.al.	2309.11011v1	link
2023-09-19	Language-Conditioned Affordance-Pose Detection in 3D Point Clouds	Toan Nguyen et.al.	2309.10911v1	null
2023-09-19	MAGIC-TBR: Multiview Attention Fusion for Transformer-based Bodily Behavior Recognition in Group Settings	Surbhi Madan et.al.	2309.10765v1	link
2023-09-19	SHOWMe: Benchmarking Object-agnostic Hand-Object 3D Reconstruction	Anilkumar Swamy et.al.	2309.10748v1	null
2023-09-20	GloPro: Globally-Consistent Uncertainty-Aware 3D Human Pose Estimation & Tracking in the Wild	Simon Schaefer et.al.	2309.10369v2	null
2023-09-19	RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery	Jiaxin Wei et.al.	2309.10255v1	link
2023-09-18	Hierarchical Attention and Graph Neural Networks: Toward Drift-Free Pose Estimation	Kathia Melbouci et.al.	2309.09934v1	null
2023-09-18	Application-driven Validation of Posteriors in Inverse Problems	Tim J. Adler et.al.	2309.09764v1	null
2023-09-18	RIDE: Self-Supervised Learning of Rotation-Equivariant Keypoint Detection and Invariant Description for Endoscopy	Mert Asim Karaoglu et.al.	2309.09563v1	null
2023-09-18	Sparse and Privacy-enhanced Representation for Human Pose Estimation	Ting-Ying Lin et.al.	2309.09515v1	null
2023-09-19	RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation	Lijun Li et.al.	2309.09301v2	link
2023-09-16	Optimal Initialization Strategies for Range-Only Trajectory Estimation	Abhishek Goudar et.al.	2309.09011v1	null
2023-09-16	DynaMoN: Motion-Aware Fast And Robust Camera Localization for Dynamic NeRF	Mert Asim Karaoglu et.al.	2309.08927v1	link
2023-09-16	Outram: One-shot Global Localization via Triangulated Scene Graph and Global Outlier Pruning	Pengyu Yin et.al.	2309.08914v1	link
2023-09-15	Towards Robust and Smooth 3D Multi-Person Pose Estimation from Monocular Videos in the Wild	Sungchan Park et.al.	2309.08644v1	null
2023-09-15	YCB-Ev: Event-vision dataset for 6DoF object pose estimation	Pavel Rojtberg et.al.	2309.08482v1	link
2023-09-15	Fast and Accurate Deep Loop Closing and Relocalization for Reliable LiDAR SLAM	Chenghao Shi et.al.	2309.08086v1	null
2023-09-14	Gradient based Grasp Pose Optimization on a NeRF that Approximates Grasp Success	Gergely Sóti et.al.	2309.08040v1	null
2023-09-14	TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting	Rohan Choudhury et.al.	2309.07910v1	null
2023-09-14	Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation	Thorsten Hempel et.al.	2309.07654v1	link
2023-09-14	EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization	Minjung Kim et.al.	2309.07471v1	link
2023-09-14	Unleashing the Power of Depth and Pose Estimation Neural Networks by Designing Compatible Endoscopic Images	Junyang Wu et.al.	2309.07390v1	null
2023-09-13	LInKs "Lifting Independent Keypoints" -- Partial Pose Lifting for Occlusion Handling with Improved Accuracy in 2D-3D Human Pose Estimation	Peter Hardy et.al.	2309.07243v1	null
2023-09-13	3D Active Metric-Semantic SLAM	Yuezhan Tao et.al.	2309.06950v1	null
2023-09-11	ViHOPE: Visuotactile In-Hand Object 6D Pose Estimation with Shape Completion	Hongyu Li et.al.	2309.05662v1	null
2023-09-11	Towards Intuitive HMI for UAV Control	Filip Zoric et.al.	2309.05460v1	null
2023-09-12	FreeMan: Towards Benchmarking 3D Human Pose Estimation in the Wild	Jiong Wang et.al.	2309.05073v2	link
2023-09-09	Probabilistic Triangulation for Uncalibrated Multi-View 3D Human Pose Estimation	Boyuan Jiang et.al.	2309.04756v1	link
2023-09-09	Mirror-Aware Neural Humans	Daniel Ajisafe et.al.	2309.04750v1	link
2023-09-08	Robot Localization and Mapping Final Report -- Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry	Akankshya Kar et.al.	2309.04147v1	null
2023-09-07	ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation	Hui Zhang et.al.	2309.03891v1	null
2023-09-05	An automated, high-resolution phenotypic assay for adult Brugia malayi and microfilaria	Upender Kalwa et.al.	2309.03235v1	null
2023-09-05	A Robust Localization Solution for an Uncrewed Ground Vehicle in Unstructured Outdoor GNSS-Denied Environments	W. Jacob Wagner et.al.	2309.02569v1	null
2023-09-05	GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction	Youmin Zhang et.al.	2309.02436v1	link
2023-09-05	DR-Pose: A Two-stage Deformation-and-Registration Pipeline for Category-level 6D Object Pose Estimation	Lei Zhou et.al.	2309.01925v1	link
2023-09-04	On the Query Strategies for Efficient Online Active Distillation	Michele Boldo et.al.	2309.01612v1	null
2023-09-04	DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion	Cédric Rommel et.al.	2309.01575v1	null
2023-09-06	Refined Temporal Pyramidal Compression-and-Amplification Transformer for 3D Human Pose Estimation	Hanbing Liu et.al.	2309.01365v2	link
2023-09-04	SKoPe3D: A Synthetic Dataset for Vehicle Keypoint Perception in 3D from Traffic Monitoring Cameras	Himanshu Pahadia et.al.	2309.01324v1	null
2023-09-03	BodySLAM++: Fast and Tightly-Coupled Visual-Inertial Camera and Human Motion Tracking	Dorian F. Henning et.al.	2309.01236v1	null
2023-09-02	Mitigating Motion Blur for Robust 3D Baseball Player Pose Modeling for Pitch Analysis	Jerrin Bright et.al.	2309.01010v1	null
2023-09-01	Fusing Monocular Images and Sparse IMU Signals for Real-time Human Motion Capture	Shaohua Pan et.al.	2309.00310v1	link
2023-08-31	EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild	Manuel Kaufmann et.al.	2308.16894v1	link
2023-08-31	SA6D: Self-Adaptive Few-Shot 6D Pose Estimator for Novel and Occluded Objects	Ning Gao et.al.	2308.16528v1	null
2023-08-30	Two-Stage Violence Detection Using ViTPose and Classification Models at Smart Airports	İrem Üstek et.al.	2308.16325v1	link
2023-08-30	SignDiff: Learning Diffusion Models for American Sign Language Production	Sen Fang et.al.	2308.16082v1	null
2023-08-30	Learning Structure-from-Motion with Graph Attention Networks	Lucas Brynte et.al.	2308.15984v1	link
2023-08-30	Reconstructing Groups of People with Hypergraph Relational Reasoning	Buzhen Huang et.al.	2308.15844v1	link
2023-08-29	3D-MuPPET: 3D Multi-Pigeon Pose Estimation and Tracking	Urs Waldmann et.al.	2308.15316v1	link
2023-08-29	Spatio-temporal MLP-graph network for 3D human pose estimation	Tanvir Hassan et.al.	2308.15313v1	link
2023-08-29	Pose-Free Neural Radiance Fields via Implicit Pose Regularization	Jiahui Zhang et.al.	2308.15049v1	null
2023-08-28	R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras	Aron Schmied et.al.	2308.14713v1	null
2023-08-28	Video-Based Hand Pose Estimation for Remote Assessment of Bradykinesia in Parkinson's Disease	Gabriela T. Acevedo Trebbau et.al.	2308.14679v1	null
2023-08-28	Active Pose Refinement for Textureless Shiny Objects using the Structured Light Camera	Jun Yang et.al.	2308.14665v1	null
2023-08-28	CPFES: Physical Fitness Evaluation Based on Canadian Agility and Movement Skill Assessment	Pengcheng Dong et.al.	2308.14324v1	null
2023-08-27	LDL: Line Distance Functions for Panoramic Localization	Junho Kim et.al.	2308.13989v1	link
2023-08-26	Prior-guided Source-free Domain Adaptation for Human Pose Estimation	Dripta S. Raychaudhuri et.al.	2308.13954v1	null
2023-08-26	Vision-Based Human Pose Estimation via Deep Learning: A Survey	Gongjin Lan et.al.	2308.13872v1	null
2023-08-24	POCO: 3D Pose and Shape Estimation with Confidence	Sai Kumar Dwivedi et.al.	2308.12965v1	link
2023-08-24	Robot Pose Nowcasting: Forecast the Future to Improve the Present	Alessandro Simoni et.al.	2308.12914v1	null
2023-08-23	Certifiably Optimal Rotation and Pose Estimation Based on the Cayley Map	Timothy D Barfoot et.al.	2308.12418v1	null
2023-08-22	Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape	Jiacong Xu et.al.	2308.11737v1	null
2023-08-22	TrackFlow: Multi-Object Tracking with Normalizing Flows	Gianluca Mancusi et.al.	2308.11513v1	null
2023-08-22	A LiDAR-Inertial SLAM Tightly-Coupled with Dropout-Tolerant GNSS Fusion for Autonomous Mine Service Vehicles	Yusheng Wang et.al.	2308.11492v1	null
2023-08-22	PoseGraphNet++: Enriching 3D Human Pose with Orientation Estimation	Soubarna Banik et.al.	2308.11440v1	null
2023-08-22	Novel-view Synthesis and Pose Estimation for Hand-Object Interaction from Sparse Views	Wentian Qu et.al.	2308.11198v1	null
2023-08-21	Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images	Tze Ho Elden Tse et.al.	2308.11015v1	null
2023-08-21	Polarimetric Information for Multi-Modal 6D Pose Estimation of Photometrically Challenging Objects with Limited Data	Patrick Ruhkamp et.al.	2308.10627v1	null
2023-08-21	GaitPT: Skeletons Are All You Need For Gait Recognition	Andy Catruna et.al.	2308.10623v1	null
2023-08-21	Approximately Equivariant Graph Networks	Ningyuan Huang et.al.	2308.10436v1	link
2023-08-21	In-Rack Test Tube Pose Estimation Using RGB-D Data	Hao Chen et.al.	2308.10411v1	null
2023-08-20	Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video	Yingxuan You et.al.	2308.10305v1	link
2023-08-20	OCHID-Fi: Occlusion-Robust Hand Pose Estimation in 3D via RF-Vision	Shujie Zhang et.al.	2308.10146v1	link
2023-08-19	3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation	Yi Zhang et.al.	2308.10123v1	link
2023-08-19	Pseudo Flow Consistency for Self-Supervised 6D Object Pose Estimation	Yang Hai et.al.	2308.10016v1	link
2023-08-19	UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning	Meiqi Sun et.al.	2308.09953v1	null
2023-08-22	Scene-Aware Feature Matching	Xiaoyong Lu et.al.	2308.09949v2	null
2023-08-18	PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation	Hanbing Liu et.al.	2308.09678v1	link
2023-08-18	Improving 3D Pose Estimation for Sign Language	Maksym Ivashechkin et.al.	2308.09525v1	null
2023-08-18	Denoising Diffusion for 3D Hand Pose Estimation from Images	Maksym Ivashechkin et.al.	2308.09523v1	null
2023-08-18	ResQ: Residual Quantization for Video Perception	Davide Abati et.al.	2308.09511v1	null
2023-08-17	MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices	Dongyang Yu et.al.	2308.09084v1	null
2023-08-17	Pedestrian Environment Model for Automated Driving	Adrian Holzbock et.al.	2308.09080v1	link
2023-08-17	Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction	Yuhao Yang et.al.	2308.08518v2	null
2023-08-16	View Consistent Purification for Accurate Cross-View Localization	Shan Wang et.al.	2308.08110v1	null
2023-08-15	Learning Better Keypoints for Multi-Object 6DoF Pose Estimation	Yangzheng Wu et.al.	2308.07827v1	link
2023-08-14	Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation	Huan Liu et.al.	2308.07313v1	link
2023-08-12	4DRVO-Net: Deep 4D Radar-Visual Odometry Using Multi-Modal and Multi-Scale Adaptive Fusion	Guirong Zhuo et.al.	2308.06573v1	null
2023-08-17	EgoPoser: Robust Real-Time Ego-Body Pose Estimation in Large Scenes	Jiaxi Jiang et.al.	2308.06493v2	null
2023-08-11	Aggressive Aerial Grasping using a Soft Drone with Onboard Perception	Samuel Ubellacker et.al.	2308.06351v1	null
2023-08-11	VERF: Runtime Monitoring of Pose Estimation with Neural Radiance Fields	Dominic Maggio et.al.	2308.05939v1	null
2023-08-10	Toward Globally Optimal State Estimation Using Automatically Tightened Semidefinite Relaxations	Frederike Dümbgen et.al.	2308.05783v1	link
2023-08-10	KS-APR: Keyframe Selection for Robust Absolute Pose Regression	Changkun Liu et.al.	2308.05459v1	null
2023-08-10	How-to Augmented Lagrangian on Factor Graphs	Barbara Bazzana et.al.	2308.05444v1	null
2023-08-10	Deep Fusion Transformer Network with Weighted Vector-Wise Keypoints Voting for Robust 6D Object Pose Estimation	Jun Zhou et.al.	2308.05438v1	link
2023-08-10	Robust Localization with Visual-Inertial Odometry Constraints for Markerless Mobile AR	Changkun Liu et.al.	2308.05394v1	null
2023-08-10	Double-chain Constraints for 3D Human Pose Estimation in Images and Videos	Hongbo Kang et.al.	2308.05298v1	link
2023-08-09	ACE-HetEM for ab initio Heterogenous Cryo-EM 3D Reconstruction	Weijie Chen et.al.	2308.04956v1	null
2023-08-07	SEM-GAT: Explainable Semantic Pose Estimation using Learned Graph Attention	Efimia Panagiotaki et.al.	2308.03718v1	link
2023-08-07	A Horse with no Labels: Self-Supervised Horse Pose Estimation from Unlabelled Images and Synthetic Prior	Jose Sosa et.al.	2308.03411v1	null
2023-08-06	Source-free Domain Adaptive Human Pose Estimation	Qucheng Peng et.al.	2308.03202v1	link
2023-08-04	Diffusion-Augmented Depth Prediction with Sparse Annotations	Jiaqi Li et.al.	2308.02283v1	null
2023-08-04	DTF-Net: Category-Level Pose Estimation and Shape Reconstruction via Deformable Template Field	Haowen Wang et.al.	2308.02239v1	null
2023-08-07	Robust Self-Supervised Extrinsic Self-Calibration	Takayuki Kanai et.al.	2308.02153v2	null
2023-08-03	Sim-to-Real Vision-depth Fusion CNNs for Robust Pose Estimation Aboard Autonomous Nano-quadcopter	Luca Crupi et.al.	2308.01833v1	null
2023-08-03	Active Acoustic Sensing for Robot Manipulation	Shihan Lu et.al.	2308.01600v1	null
2023-08-02	HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions	Andrew Guo et.al.	2308.01477v1	null
2023-08-06	Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes	Bohao Fan et.al.	2308.00628v2	link
2023-08-01	Markerless human pose estimation for biomedical applications: a survey	Andrea Avogaro et.al.	2308.00519v1	null
2023-08-01	Kidnapping Deep Learning-based Multirotors using Optimized Flying Adversarial Patches	Pia Hanfeld et.al.	2308.00344v1	link
2023-08-01	Fine-Grained Sports, Yoga, and Dance Postures Recognition: A Benchmark Analysis	Asish Bera et.al.	2308.00323v1	null
2023-08-01	Robust Single-view Cone-beam X-ray Pose Estimation with Neural Tuned Tomography (NeTT) and Masked Neural Radiance Fields (mNeRF)	Chaochao Zhou et.al.	2308.00214v1	null
2023-07-31	Lightweight Super-Resolution Head for Human Pose Estimation	Haonan Wang et.al.	2307.16765v1	link
2023-07-31	DiffPose: SpatioTemporal Diffusion Model for Video-Based Human Pose Estimation	Runyang Feng et.al.	2307.16687v1	null
2023-07-30	Touch if it's transparent! ACTOR: Active Tactile-based Category-Level Transparent Object Reconstruction	Prajval Kumar Murali et.al.	2307.16254v1	null
2023-07-30	Successive Pose Estimation and Beam Tracking for mmWave Vehicular Communication Systems	Cen Liu et.al.	2307.16117v1	link
2023-07-29	Iterative Graph Filtering Network for 3D Human Pose Estimation	Zaedul Islam et.al.	2307.16074v1	link
2023-07-29	HandMIM: Pose-Aware Self-Supervised Learning for 3D Hand Mesh Estimation	Zuyan Liu et.al.	2307.16061v1	null
2023-07-29	Effective Whole-body Pose Estimation with Two-stages Distillation	Zhendong Yang et.al.	2307.15880v1	link
2023-07-28	TrackAgent: 6D Object Tracking via Reinforcement Learning	Konstantin Röhrl et.al.	2307.15671v1	null
2023-07-28	Revisiting Fully Convolutional Geometric Features for Object 6D Pose Estimation	Jaime Corsetti et.al.	2307.15514v1	link
2023-07-28	Robust Visual Sim-to-Real Transfer for Robotic Manipulation	Ricardo Garcia et.al.	2307.15320v1	null
2023-07-27	Weakly Supervised Multi-Modal 3D Human Body Pose Estimation for Autonomous Driving	Peter Bauer et.al.	2307.14889v1	null
2023-07-26	Attention of Robot Touch: Tactile Saliency Prediction for Robust Sim-to-Real Tactile Control	Yijiong Lin et.al.	2307.14510v1	null
2023-07-28	CBGL: Fast Monte Carlo Passive Global Localisation of 2D LIDAR Sensor	Alexandros Filotheou et.al.	2307.14247v2	link
2023-07-26	Deep Robust Multi-Robot Re-localisation in Natural Environments	Milad Ramezani et.al.	2307.13950v1	null
2023-07-25	Of Mice and Pose: 2D Mouse Pose Estimation from Unlabelled Data and Synthetic Prior	Jose Sosa et.al.	2307.13361v1	null
2023-07-23	TransNet: Transparent Object Manipulation Through Category-Level Pose Estimation	Huijie Zhang et.al.	2307.12400v1	null
2023-07-25	FDCT: Fast Depth Completion for Transparent Objects	Tianan Li et.al.	2307.12274v2	link
2023-07-22	Challenges for Monocular 6D Object Pose Estimation in Robotics	Stefan Thalhammer et.al.	2307.12172v1	null
2023-07-22	Pyramid Semantic Graph-based Global Point Cloud Registration with Low Overlap	Zhijian Qiao et.al.	2307.12116v1	link
2023-07-22	Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence	Yang Tian et.al.	2307.12106v1	link
2023-07-26	LAMP: Leveraging Language Prompts for Multi-person Pose Estimation	Shengnan Hu et.al.	2307.11934v2	link
2023-07-21	YOLOPose V2: Understanding and Improving Transformer-based 6D Pose Estimation	Arul Selvam Periyasamy et.al.	2307.11550v1	null
2023-07-21	KVN: Keypoints Voting Network with Differentiable RANSAC for Stereo Pose Estimation	Ivano Donadi et.al.	2307.11543v1	link
2023-07-21	Semantically-enhanced Deep Collision Prediction for Autonomous Navigation using Aerial Robots	Mihir Kulkarni et.al.	2307.11522v1	null
2023-07-20	SimCol3D -- 3D Reconstruction during Colonoscopy Challenge	Anita Rau et.al.	2307.11261v1	link
2023-07-20	MSQNet: Actor-agnostic Action Recognition with Multi-modal Query	Anindya Mondal et.al.	2307.10763v1	link
2023-07-19	POV-Surgery: A Dataset for Egocentric Hand and Tool Pose Estimation During Surgical Activities	Rui Wang et.al.	2307.10387v1	link
2023-07-18	ActionPrompt: Action-Guided 3D Human Pose Estimation With Text and Pose Prompting	Hongwei Zheng et.al.	2307.09026v1	null
2023-07-17	Human Emergency Detection during Autonomous Hospital Transports	Andreas Zachariae et.al.	2307.08359v1	link
2023-07-17	Self-supervised Monocular Depth Estimation: Let's Talk About The Weather	Kieran Saunders et.al.	2307.08357v1	null
2023-07-20	Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer	Yujiao Shi et.al.	2307.08015v3	link
2023-07-15	Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents	Ke Cao et.al.	2307.07763v1	null
2023-07-13	Haptic-guided assisted telemanipulation approach for grasping desired objects from heaps	Maxime Adjigble et.al.	2307.07053v1	null
2023-07-13	Improving 2D Human Pose Estimation across Unseen Camera Views with Synthetic Data	Miroslav Purkrábek et.al.	2307.06737v1	link
2023-07-12	Deep learning-based estimation of whole-body kinematics from multi-view images	Kien X. Nguyen et.al.	2307.05896v1	link
2023-07-12	GLA-GCN: Global-local Adaptive Graph Convolutional Network for 3D Human	Bruce X. B. Yu et.al.	2307.05853v1	link
2023-07-09	TransPose: A Transformer-based 6D Object Pose Estimation Network with Depth Refinement	Mahmoud Abdulsalam et.al.	2307.05561v1	null
2023-07-11	ResMatch: Residual Attention Learning for Local Feature Matching	Yuxin Deng et.al.	2307.05180v1	link
2023-07-07	Proximity and Visuotactile Point Cloud Fusion for Contact Patches in Extreme Deformation	Jessica Yin et.al.	2307.03839v1	null
2023-07-07	Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation	Zhongyu Jiang et.al.	2307.03833v1	link
2023-07-07	Equivariant Single View Pose Prediction Via Induced and Restricted Representations	Owen Howell et.al.	2307.03704v1	null
2023-07-07	RCDN -- Robust X-Corner Detection Algorithm based on Advanced CNN Model	Ben Chen et.al.	2307.03505v1	null
2023-07-06	Self-supervised Optimization of Hand Pose Estimation using Anatomical Features and Iterative Learning	Christian Jauch et.al.	2307.03007v1	null
2023-07-06	Recognition and Estimation of Human Finger Pointing with an RGB Camera for Robot Directive	Eran Bamani et.al.	2307.02949v1	null
2023-07-06	A Real-time Human Pose Estimation Approach for Optimal Sensor Placement in Sensor-based Human Activity Recognition	Orhan Konak et.al.	2307.02906v1	null
2023-07-04	Secure Deep Learning-based Distributed Intelligence on Pocket-sized Drones	Elia Cereda et.al.	2307.01559v1	null
2023-07-03	Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach	Dongyang Yu et.al.	2307.01004v1	null
2023-07-01	Automatic Solver Generator for Systems of Laurent Polynomial Equations	Evgeniy Martyushev et.al.	2307.00320v1	link
2023-07-01	SyMFM6D: Symmetry-aware Multi-directional Fusion for Multi-View 6D Object Pose Estimation	Fabian Duffhauss et.al.	2307.00306v1	link
2023-06-30	GIRA: Gaussian Mixture Models for Inference and Robot Autonomy	Kshitij Goel et.al.	2307.00071v1	link
2023-06-30	Towards the extraction of robust sign embeddings for low resource sign language recognition	Mathieu De Coster et.al.	2306.17558v1	null
2023-06-30	Fusion of Visual-Inertial Odometry with LiDAR Relative Localization for Cooperative Guidance of a Micro-Scale Aerial Vehicle	Václav Pritzl et.al.	2306.17544v1	link
2023-06-30	Locking On: Leveraging Dynamic Vehicle-Imposed Motion Constraints to Improve Visual Localization	Stephen Hausler et.al.	2306.17529v1	null
2023-06-29	ID-Pose: Sparse-view Camera Pose Estimation by Inverting Diffusion Models	Weihao Cheng et.al.	2306.17140v1	null
2023-06-29	Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation	Zhongwei Qiu et.al.	2306.17074v1	null
2023-06-28	Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-hand Objects	Alireza Rezazadeh et.al.	2306.15858v1	null
2023-06-09	Data-Link: High Fidelity Manufacturing Datasets for Model2Real Transfer under Industrial Settings	Sunny Katyara et.al.	2306.05766v1	null
2023-05-28	Counter-Hypothetical Particle Filters for Single Object Pose Tracking	Elizabeth A. Olson et.al.	2305.17828v1	null
2023-05-25	Enhanced 6D Pose Estimation for Robotic Fruit Picking	Marco Costanzo et.al.	2305.15856v1	null
2023-05-22	You Only Look at One: Category-Level Object Representations for Pose Estimation From a Single Example	Walter Goodwin et.al.	2305.12626v1	null
2023-05-18	Manifold-Aware Self-Training for Unsupervised Domain Adaptation on Regressing 6D Object Pose	Yichen Zhang et.al.	2305.10808v1	link
2023-05-08	RelPose++: Recovering 6D Poses from Sparse-view Observations	Amy Lin et.al.	2305.04926v1	link
2023-04-17	Uncovering the Background-Induced bias in RGB based 6-DoF Object Pose Estimation	Elena Govi et.al.	2304.08230v1	link
2023-03-28	CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects	Nick Heppert et.al.	2303.15782v1	link
2023-03-23	Prior-free Category-level Pose Estimation with Implicit Space Transformation	Jianhui Liu et.al.	2303.13479v1	link
2023-06-21	6D Object Pose Estimation from Approximate 3D Models for Orbital Robotics	Maximilian Ulmer et.al.	2303.13241v3	null
2023-03-22	Rigidity-Aware Detection for 6D Object Pose Estimation	Yang Hai et.al.	2303.12396v1	link
2023-03-22	Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation	Heng Yang et.al.	2303.12246v1	link
2023-03-21	Linear-Covariance Loss for End-to-End Learning of 6D Pose Estimation	Fulin Liu et.al.	2303.11516v1	link
2023-03-18	SOCS: Semantically-aware Object Coordinate Space for Category-Level 6D Object Pose Estimation under Large Shape Variations	Boyan Wan et.al.	2303.10346v1	null
2023-03-12	Module-Wise Network Quantization for 6D Object Pose Estimation	Saqib Javed et.al.	2303.06753v1	link
2023-03-09	SpyroPose: Importance Sampling Pyramids for Object Pose Distribution Estimation in SE(3)	Rasmus Laurvig Haugaard et.al.	2303.05308v1	null
2023-03-03	Depth-based 6DoF Object Pose Estimation using Swin Transformer	Zhujun Li et.al.	2303.02133v1	link
2023-03-02	Canonical mapping as a general-purpose object descriptor for robotic manipulation	Benjamin Joffe et.al.	2303.01331v1	null
2023-02-14	MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation	Dingding Cai et.al.	2302.07300v1	null
2023-02-14	Model-Based Underwater 6D Pose Estimation from RGB	Davide Sapienza et.al.	2302.06821v1	null
2023-02-02	A Projective Geometric View for 6D Pose Estimation in mmWave MIMO Systems	Shengqiang Shen et.al.	2302.00227v2	null
2023-01-31	Collision-aware In-hand 6D Object Pose Estimation using Multiple Vision-based Tactile Sensors	Gabriele M. Caddeo et.al.	2301.13667v1	link
2023-01-19	Learning ultrasound plane pose regression: assessing generalized pose coordinates in the fetal brain	Chiara Di Vece et.al.	2301.08317v1	null
2023-01-19	RGB-D-Based Categorical Object Pose and Shape Estimation: Methods, Datasets, and Evaluation	Leonard Bruns et.al.	2301.08147v1	link
2022-12-21	HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Pose Dataset with Household Objects in Realistic Scenarios	HyunJun Jung et.al.	2212.10428v2	link
2022-12-13	MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare	Yann Labbé et.al.	2212.06870v1	null
2022-12-11	Context-aware 6D Pose Estimation of Known Objects using RGB-D data	Ankit Kumar et.al.	2212.05560v1	null
2023-01-30	Category-Level 6D Object Pose Estimation with Flexible Vector-Based Rotation Representation	Wei Chen et.al.	2212.04632v2	null

Point Cloud Registration

Publish Date	Title	Authors	PDF	Code
2025-03-11	BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes	Minkyun Seo et.al.	2503.07940v1	null
2025-03-10	SANDRO: a Robust Solver with a Splitting Strategy for Point Cloud Registration	Michael Adlerstein et.al.	2503.07743v1	null
2025-03-10	HybridReg: Robust 3D Point Cloud Registration with Hybrid Motions	Keyu Du et.al.	2503.07019v1	null
2025-03-07	Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration	Qianliang Wu et.al.	2503.04127v2	null
2025-03-04	HyperGCT: A Dynamic Hyper-GNN-Learned Geometric Constraint for 3D Registration	Xiyu Zhang et.al.	2503.02195v1	null
2025-03-02	Semantic-ICP: Iterative Closest Point for Non-rigid Multi-Organ Point Cloud Registration	Wanwen Chen et.al.	2503.00972v1	null
2025-02-26	BEV-LIO(LC): BEV Image Assisted LiDAR-Inertial Odometry with Loop Closure	Haoxin Cai et.al.	2502.19242v1	link
2025-02-15	Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy	Mingyang Zhao et.al.	2502.10704v1	link
2025-02-12	Fully-Geometric Cross-Attention for Point Cloud Registration	Weijie Wang et.al.	2502.08285v1	null
2025-02-11	Multiview Point Cloud Registration Based on Minimum Potential Energy for Free-Form Blade Measurement	Zijie Wu et.al.	2502.07680v1	null
2025-02-10	DefTransNet: A Transformer-based Method for Non-Rigid Point Cloud Registration in the Simulation of Soft Tissue Deformation	Sara Monji-Azad et.al.	2502.06336v1	null
2025-02-05	Mapping and Localization Using LiDAR Fiducial Markers	Yibo Liu et.al.	2502.03510v1	null
2025-01-31	A Direct Semi-Exhaustive Search Method for Robust, Partial-to-Full Point Cloud Registration	Richard Cheng et.al.	2502.00115v1	null
2025-01-18	PSReg: Prior-guided Sparse Mixture of Experts for Point Cloud Registration	Xiaoshui Huang et.al.	2501.07762v2	null
2025-01-10	LPRnet: A self-supervised registration network for LiDAR and photogrammetric point clouds	Chen Wang et.al.	2501.05669v1	null
2025-01-09	LP-ICP: General Localizability-Aware Point Cloud Registration for Robust Localization in Extreme Unstructured Environments	Haosong Yue et.al.	2501.02580v2	link
2025-01-03	MRG: A Multi-Robot Manufacturing Digital Scene Generation Method Using Multi-Instance Point Cloud Registration	Songjie Han et.al.	2501.02041v1	null
2024-12-29	Towards Explaining Uncertainty Estimates in Point Cloud Registration	Ziyuan Qin et.al.	2412.20612v1	null
2024-12-26	Resolving the Ambiguity of Complete-to-Partial Point Cloud Registration for Image-Guided Liver Surgery with Patches-to-Partial Matching	Zixin Yang et.al.	2412.19328v1	null
2024-12-25	Cross-PCR: A Robust Cross-Source Point Cloud Registration Framework	Guiyu Zhao et.al.	2412.18873v1	null
2024-12-23	PointVoxelFormer -- Reviving point cloud networks for 3D medical imaging	Mattias Paul Heinrich et.al.	2412.17390v1	null
2024-12-19	3D Registration in 30 Years: A Survey	Jiaqi Yang et.al.	2412.13735v2	link
2024-12-13	TrafficLoc: Localizing Traffic Surveillance Cameras in 3D Scenes	Yan Xia et.al.	2412.10308v1	null
2024-12-10	A Real-time Degeneracy Sensing and Compensation Method for Enhanced LiDAR SLAM	Zongbo Liao et.al.	2412.07513v1	null
2024-12-07	AutoURDF: Unsupervised Robot Modeling from Point Cloud Frames Using Cluster Registration	Jiong Lin et.al.	2412.05507v1	null
2024-12-06	GS-Matching: Reconsidering Feature Matching task in Point Cloud Registration	Yaojie Zhang et.al.	2412.04855v1	null
2024-12-04	AffordDP: Generalizable Diffusion Policy with Transferable Affordance	Shijie Wu et.al.	2412.03142v1	null
2024-12-04	QuadricsReg: Large-Scale Point Cloud Registration using Quadric Primitives	Ji Wu et.al.	2412.02998v1	null
2024-12-01	FlashSLAM: Accelerated RGB-D SLAM for Real-Time 3D Scene Reconstruction with Gaussian Splatting	Phu Pham et.al.	2412.00682v1	null
2024-11-27	XR-MBT: Multi-modal Full Body Tracking for XR through Self-Supervision with Learned Depth Point Cloud Registration	Denys Rozumnyi et.al.	2411.18377v1	null
2024-11-22	EADReg: Probabilistic Correspondence Generation with Efficient Autoregressive Diffusion Model for Outdoor Point Cloud Registration	Linrui Gong et.al.	2411.15271v1	null
2024-11-20	Automatic marker-free registration based on similar tetrahedras for single-tree point clouds	Jing Ren et.al.	2411.13069v1	null
2024-11-19	3D Reconstruction by Looking: Instantaneous Blind Spot Detector for Indoor SLAM through Mixed Reality	Hanbeom Chang et.al.	2411.12514v1	null
2024-11-16	Deep Loss Convexification for Learning Iterative Models	Ziming Zhang et.al.	2411.10649v1	null
2024-11-12	3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration	Liyuan Zhang et.al.	2411.07740v1	link
2024-11-04	Mining and Transferring Feature-Geometry Coherence for Unsupervised Point Cloud Registration	Kezheng Xiong et.al.	2411.01870v1	link
2024-10-30	UniRiT: Towards Few-Shot Non-Rigid Point Cloud Registration	Geng Li et.al.	2410.22909v1	null
2024-10-29	Micro-Structures Graph-Based Point Cloud Registration for Balancing Efficiency and Accuracy	Rongling Zhang et.al.	2410.21857v1	null
2024-10-29	Memory-Efficient Point Cloud Registration via Overlapping Region Sampling	Tomoyasu Shimada et.al.	2410.21753v1	null
2024-10-21	RANSAC Back to SOTA: A Two-stage Consensus Filtering for Real-time 3D Registration	Pengcheng Shi et.al.	2410.15682v1	link
2024-10-14	A Consistency-Aware Spot-Guided Transformer for Versatile and Hierarchical Point Cloud Registration	Renlang Huang et.al.	2410.10295v1	link
2024-10-14	Kinematic-ICP: Enhancing LiDAR Odometry with Kinematic Constraints for Wheeled Mobile Robots Moving on Planar Surfaces	Tiziano Guadagnino et.al.	2410.10277v1	null
2024-10-10	LiPO: LiDAR Inertial Odometry for ICP Comparison	Darwin Mick et.al.	2410.08097v1	null
2024-10-08	Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration	Xueyang Kang et.al.	2410.05729v1	link
2024-10-07	Enhanced Multi-Robot SLAM System with Cross-Validation Matching and Exponential Threshold Keyframe Selection	Ang He et.al.	2410.05017v1	null
2024-10-03	LoGDesc: Local geometric features aggregation for robust point cloud registration	Karim Slimani et.al.	2410.02420v1	link
2024-10-01	GERA: Geometric Embedding for Efficient Point Registration Analysis	Geng Li et.al.	2410.00589v1	null
2024-10-01	TFCT-I2P: Three stream fusion network with color aware transformer for image-to-point cloud registration	Muyao Peng et.al.	2410.00360v1	link
2024-10-06	KISS-Matcher: Fast and Robust Point Cloud Registration Revisited	Hyungtae Lim et.al.	2409.15615v2	link
2024-09-23	MATCH POLICY: A Simple Pipeline from Point Cloud Registration to Manipulation Policies	Haojie Huang et.al.	2409.15517v1	null
2024-09-22	SynBench: A Synthetic Benchmark for Non-rigid 3D Point Cloud Registration	Sara Monji-Azad et.al.	2409.14474v1	null
2024-09-27	FracGM: A Fast Fractional Programming Technique for Geman-McClure Robust Estimator	Bang-Shien Chen et.al.	2409.13978v2	link
2024-09-17	Enhancing the Reliability of LiDAR Point Cloud Sampling: A Colorization and Super-Resolution Approach Based on LiDAR-Generated Images	Sier Ha et.al.	2409.11532v1	null
2024-09-14	Registration between Point Cloud Streams and Sequential Bounding Boxes via Gradient Descent	Xuesong Li et.al.	2409.09312v1	null
2024-09-11	Unsupervised Point Cloud Registration with Self-Distillation	Christian Löwens et.al.	2409.07558v1	link
2024-09-10	Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations	Tejas Anvekar et.al.	2409.06267v1	link
2024-09-09	From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models	Tessa Pulli et.al.	2409.05413v1	null
2024-09-08	Sight View Constraint for Robust Point Cloud Registration	Yaojie Zhang et.al.	2409.05065v1	null
2024-08-23	UMERegRobust - Universal Manifold Embedding Compatible Features for Robust Point Cloud Registration	Yuval Haitman et.al.	2408.12380v2	link
2024-08-21	Informed, Constrained, Aligned: A Field Analysis on Degeneracy-aware Point Cloud Registration in the Wild	Turcan Tuna et.al.	2408.11809v1	null
2024-08-20	LoopSplat: Loop Closure by Registering 3D Gaussian Splats	Liyuan Zhu et.al.	2408.10154v2	link
2024-08-05	CMR-Agent: Learning a Cross-Modal Agent for Iterative Image-to-Point Cloud Registration	Gongxin Yao et.al.	2408.02394v1	null
2024-08-05	MaFreeI2P: A Matching-Free Image-to-Point Cloud Registration Paradigm with Active Camera Pose Retrieval	Gongxin Yao et.al.	2408.02392v1	null
2024-07-29	Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning	Ray Zhang et.al.	2407.20223v1	null
2024-07-24	Robust Point Cloud Registration in Robotic Inspection with Locally Consistent Gaussian Mixture Model	Lingjie Su et.al.	2407.17183v1	null
2024-07-23	SE3ET: SE(3)-Equivariant Transformer for Low-Overlap Point Cloud Registration	Chien Erh Lin et.al.	2407.16823v1	link
2024-07-19	PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training	Suyi Chen et.al.	2407.14054v1	link
2024-07-19	GlobalPointer: Large-Scale Plane Adjustment with Bi-Convex Relaxation	Bangyan Liao et.al.	2407.13537v2	link
2024-07-22	Snail-Radar: A large-scale diverse dataset for the evaluation of 4D-radar-based SLAM systems	Jianzhu Huai et.al.	2407.11705v2	null
2024-07-14	PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration	Runzhao Yao et.al.	2407.10142v1	link
2024-07-13	ML-SemReg: Boosting Point Cloud Registration with Multi-level Semantic Consistency	Shaocheng Yan et.al.	2407.09862v1	link
2024-07-11	BiEquiFormer: Bi-Equivariant Representations for Global Point Cloud Registration	Stefanos Pertigkiozoglou et.al.	2407.08729v1	null
2024-07-10	Incremental Multiview Point Cloud Registration with Two-stage Candidate Retrieval	Shiqi Li et.al.	2407.07525v1	null
2024-07-08	SGOR: Outlier Removal by Leveraging Semantic and Geometric Information for Robust Point Cloud Registration	Guiyu Zhao et.al.	2407.06297v1	link
2024-07-08	GeoNLF: Geometry guided Pose-Free Neural LiDAR Fields	Weiyi Xue et.al.	2407.05597v1	null
2024-07-07	GaussReg: Fast 3D Registration with Gaussian Splatting	Jiahao Chang et.al.	2407.05254v1	null
2024-07-06	Incremental Multiview Point Cloud Registration	Xiaoya Cheng et.al.	2407.05021v1	link
2024-06-25	Point Tree Transformer for Point Cloud Registration	Meiling Wang et.al.	2406.17530v1	null
2024-06-17	Correspondence Free Multivector Cloud Registration using Conformal Geometric Algebra	Francisco Xavier Vasconcelos et.al.	2406.11732v1	link
2024-06-05	L-PR: Exploiting LiDAR Fiducial Marker for Unordered Low Overlap Multiview Point Cloud Registration	Yibo Liu et.al.	2406.03298v1	link
2024-05-25	Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration	Junjie Gao et.al.	2405.16085v1	null
2024-05-26	NV-LIO: LiDAR-Inertial Odometry using Normal Vectors Towards Robust SLAM in Multifloor Environments	Dongha Chung et.al.	2405.12563v2	link
2024-05-13	RGBD-Glue: General Feature Combination for Robust RGB-D Point Cloud Registration	Congjia Chen et.al.	2405.07594v1	null
2024-05-10	Benchmarking Classical and Learning-Based Multibeam Point Cloud Registration	Li Ling et.al.	2405.06279v1	link
2024-05-09	Rotation Initialization and Stepwise Refinement for Universal LiDAR Calibration	Yifan Duan et.al.	2405.05589v1	null
2024-05-07	Speak the Same Language: Global LiDAR Registration on BIM Using Pose Hough Transform	Zhijian Qiao et.al.	2405.03969v1	null
2024-05-06	Deep Learning-based Point Cloud Registration for Augmented Reality-guided Surgery	Maximilian Weber et.al.	2405.03314v1	null
2024-04-27	FRAME: A Modular Framework for Autonomous Map-merging: Advancements in the Field	Nikolaos Stathoulopoulos et.al.	2404.18006v1	null
2024-04-22	PointDifformer: Robust Point Cloud Registration With Neural Diffusion and Transformer	Rui She et.al.	2404.14034v1	null
2024-04-22	A Comprehensive Survey and Taxonomy on Point Cloud Registration Based on Deep Learning	Yu-Xin Zhang et.al.	2404.13830v1	link
2024-04-09	Efficient and Robust Point Cloud Registration via Heuristics-guided Parameter Search	Tianyu Huang et.al.	2404.06155v1	link
2024-04-08	Rendering-Enhanced Automatic Image-to-Point Cloud Registration for Roadside Scenes	Yu Sheng et.al.	2404.05164v1	null
2024-04-06	Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes	Zhiyuan Yu et.al.	2404.04557v1	link
2024-04-05	A Ground Mobile Robot for Autonomous Terrestrial Laser Scanning-Based Field Phenotyping	Javier Rodriguez-Sanchez et.al.	2404.04404v1	null
2024-04-01	FPGA-Accelerated Correspondence-free Point Cloud Registration with PointNet Features	Keisuke Sugiura et.al.	2404.01237v1	null
2024-03-28	SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks	Yaxu Xie et.al.	2403.19474v1	link
2024-03-26	Global Point Cloud Registration Network for Large Transformations	Hanz Cuevas-Velasquez et.al.	2403.18040v1	link
2024-03-28	Exploring Accurate 3D Phenotyping in Greenhouse through Neural Radiance Fields	Junhong Zhao et.al.	2403.15981v2	null
2024-03-15	VRHCF: Cross-Source Point Cloud Registration via Voxel Representation and Hierarchical Correspondence Filtering	Guiyu Zhao et.al.	2403.10085v1	link
2024-03-15	MEDPNet: Achieving High-Precision Adaptive Registration for Complex Die Castings	Yu Du et.al.	2403.09996v1	null
2024-03-15	CLOSURE: Fast Quantification of Pose Uncertainty Sets	Yihuai Gao et.al.	2403.09990v1	null
2024-03-13	FastMAC: Stochastic Spectral Sampling of Correspondence Graph	Yifei Zhang et.al.	2403.08770v1	link
2024-03-13	NeRF-Supervised Feature Point Detection and Description	Ali Youssef et.al.	2403.08156v1	link
2024-03-10	PSS-BA: LiDAR Bundle Adjustment with Progressive Spatial Smoothing	Jianping Li et.al.	2403.06124v1	null
2024-03-27	Extend Your Own Correspondences: Unsupervised Distant Point Cloud Registration by Progressive Distance Extension	Quan Liu et.al.	2403.03532v2	link
2024-03-15	RELEAD: Resilient Localization with Enhanced LiDAR Odometry in Adverse Environments	Zhiqiang Chen et.al.	2402.18934v2	null
2024-02-28	PCR-99: A Practical Method for Point Cloud Registration with 99% Outliers	Seong Hun Lee et.al.	2402.16598v2	link
2024-02-23	CLIPPER+: A Fast Maximal Clique Algorithm for Robust Global Registration	Kaveh Fathian et.al.	2402.15464v1	link
2024-02-11	CLIPPER: Robust Data Association without an Initial Guess	Parker C. Lusk et.al.	2402.07284v1	null
2024-02-08	Tightly Coupled Range Inertial Localization on a 3D Prior Map Based on Sliding Window Factor Graph Optimization	Kenji Koide et.al.	2402.05540v1	null
2024-01-16	Registration of algebraic varieties using Riemannian optimization	Florentin Goyens et.al.	2401.08562v1	link
2024-01-09	Iterative Feedback Network for Unsupervised Point Cloud Registration	Yifan Xie et.al.	2401.04357v1	link
2024-01-06	PosDiffNet: Positional Neural Diffusion for Point Cloud Registration in a Large Field of View with Perturbations	Rui She et.al.	2401.03167v1	null
2024-01-04	OptFlow: Fast Optimization-based Scene Flow Estimation without Supervision	Rahul Ahuja et.al.	2401.02550v1	null
2024-01-17	Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration	Qianliang Wu et.al.	2401.00436v4	null
2023-12-22	On Partial Optimal Transport: Revising the Infeasibility of Sinkhorn and Efficient Gradient Methods	Anh Duc Nguyen et.al.	2312.13970v2	link
2023-12-20	D3Former: Jointly Learning Repeatable Dense Detectors and Feature-enhanced Descriptors via Saliency-guided Transformer	Junjie Gao et.al.	2312.12970v1	null
2023-12-14	SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration	Kezheng Xiong et.al.	2312.08664v1	null
2023-12-11	PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration	Yue Wu et.al.	2312.06063v1	null
2023-12-05	DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration	Zhi Chen et.al.	2312.03053v1	null
2023-12-08	Zero-Shot Point Cloud Registration	Weijie Wang et.al.	2312.03032v2	null
2023-12-05	A Dynamic Network for Efficient Point Cloud Registration	Yang Ai et.al.	2312.02877v1	null
2023-12-05	6D Assembly Pose Estimation by Point Cloud Registration for Robot Manipulation	K. Samarawickrama et.al.	2312.02593v1	link
2023-12-04	Rotation-Invariant Rapid TRISO-Fueled Pebble Identification Based on Feature Matching and Point Cloud Registration	Ming Fang et.al.	2312.02006v1	null
2023-12-27	E2PNet: Event to Point Cloud Registration with Spatio-Temporal Representation Learning	Xiuhong Lin et.al.	2311.18433v2	link
2023-11-15	Nothing Stands Still: A Spatiotemporal Benchmark on 3D Point Cloud Registration Under Large Geometric and Temporal Change	Tao Sun et.al.	2311.09346v1	null
2023-11-02	Transformation Decoupling Strategy based on Screw Theory for Deterministic Point Cloud Registration with Gravity Prior	Xinyi Li et.al.	2311.01432v1	null
2023-11-02	Cross-Modal Information-Guided Network using Contrastive Learning for Point Cloud Registration	Yifan Xie et.al.	2311.01202v1	link
2023-10-29	HDMNet: A Hierarchical Matching Network with Double Attention for Large-scale Outdoor LiDAR Point Cloud Registration	Weiyi Xue et.al.	2310.18874v1	null
2023-10-27	Do we need scan-matching in radar odometry?	Vladimír Kubelka et.al.	2310.18117v1	link
2023-10-26	SE(3) Diffusion Model-based Point Cloud Registration for Robust 6D Object Pose Estimation	Haobo Jiang et.al.	2310.17359v1	null
2023-10-18	DBDNet: Partial-to-Partial Point Cloud Registration with Dual Branches Decoupling	Shiqi Li et.al.	2310.11733v1	null
2023-10-15	OAAFormer: Robust and Efficient Point Cloud Registration Through Overlapping-Aware Attention in Transformer	Junjie Gao et.al.	2310.09817v1	null
2023-10-09	FeatSense -- A Feature-based Registration Algorithm with GPU-accelerated TSDF-Mapping Backend for NVIDIA Jetson Boards	Julian Gaal et.al.	2310.05766v1	link
2023-10-09	Colmap-PCD: An Open-source Tool for Fine Image-to-point cloud Registration	Chunge Bai et.al.	2310.05504v1	link
2023-10-06	Light-LOAM: A Lightweight LiDAR Odometry and Mapping based on Graph-Matching	Shiquan Yi et.al.	2310.04162v1	link
2023-10-05	FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators	Haiping Wang et.al.	2310.03420v1	link
2023-10-02	COIN-LIO: Complementary Intensity-Augmented LiDAR Inertial Odometry	Patrick Pfreundschuh et.al.	2310.01235v1	link
2023-09-27	Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature	Shengze Jin et.al.	2309.16023v1	null
2023-09-27	Partial Transport for Point-Cloud Registration	Yikun Bai et.al.	2309.15787v1	null
2023-09-27	KDD-LOAM: Jointly Learned Keypoint Detector and Descriptors Assisted LiDAR Odometry and Mapping	Renlang Huang et.al.	2309.15394v1	null
2023-09-26	CoFiI2P: Coarse-to-Fine Correspondences for Image-to-Point Cloud Registration	Shuhao Kang et.al.	2309.14660v1	null
2023-09-20	AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration	Zheng Dang et.al.	2309.11170v1	null
2023-09-19	LiDAR-Generated Images Derived Keypoints Assisted Point Cloud Registration Scheme in Odometry Estimation	Haizhou Zhang et.al.	2309.10436v1	link
2023-09-17	Hamiltonian Dynamics Learning from Point Cloud Observations for Nonholonomic Mobile Robot Control	Abdullah Altawaitan et.al.	2309.09163v1	link
2023-09-16	FF-LOGO: Cross-Modality Point Cloud Registration with Feature Filtering and Local to Global Optimization	Nan Ma et.al.	2309.08966v1	null
2023-09-16	Outram: One-shot Global Localization via Triangulated Scene Graph and Global Outlier Pruning	Pengyu Yin et.al.	2309.08914v1	link
2023-09-15	A Ground Segmentation Method Based on Point Cloud Map for Unstructured Roads	Zixuan Li et.al.	2309.08164v1	null
2023-09-15	Fast and Accurate Deep Loop Closing and Relocalization for Reliable LiDAR SLAM	Chenghao Shi et.al.	2309.08086v1	null
2023-09-14	EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization	Minjung Kim et.al.	2309.07471v1	link
2023-09-12	SGFeat: Salient Geometric Feature for Point Cloud Registration	Qianliang Wu et.al.	2309.06207v1	null
2023-09-01	Point-TTA: Test-Time Adaptation for Point Cloud Registration Using Multitask Meta-Auxiliary Learning	Ahmed Hatem et.al.	2308.16481v2	null
2023-08-21	In-Rack Test Tube Pose Estimation Using RGB-D Data	Hao Chen et.al.	2308.10411v1	null
2023-08-18	DReg-NeRF: Deep Registration for Neural Radiance Fields	Yu Chen et.al.	2308.09386v1	link
2023-08-18	Overlap Bias Matching is Necessary for Point Cloud Registration	Pengcheng Shi et.al.	2308.09364v1	null
2023-08-10	Deep Semantic Graph Matching for Large-scale Outdoor Point Clouds Registration	Shaocong Liu et.al.	2308.05314v1	null
2023-08-09	PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised RGB-D Point Cloud Registration	Mingzhi Yuan et.al.	2308.04782v1	link
2023-07-25	GeoTransformer: Fast and Robust Point Cloud Registration with Geometric Transformer	Zheng Qin et.al.	2308.03768v1	link
2023-07-26	One-Nearest Neighborhood Guides Inlier Estimation for Unsupervised Point Cloud Registration	Yongzhe Yuan et.al.	2307.14019v1	null
2023-07-22	Pyramid Semantic Graph-based Global Point Cloud Registration with Low Overlap	Zhijian Qiao et.al.	2307.12116v1	link
2023-09-12	ELiOT: End-to-end Lidar Odometry using Transformer Framework	Daegyu Lee et.al.	2307.11998v4	null
2023-08-08	Density-invariant Features for Distant Point Cloud Registration	Quan Liu et.al.	2307.09788v2	link
2023-07-18	SphereNet: Learning a Noise-Robust and General Descriptor for Point Cloud Registration	Guiyu Zhao et.al.	2307.09351v1	null
2023-07-14	CFI2P: Coarse-to-Fine Cross-Modal Correspondence Learning for Image-to-Point Cloud Registration	Gongxin Yao et.al.	2307.07142v1	null
2023-07-11	Exact Point Cloud Downsampling for Fast and Accurate Global Trajectory Optimization	Kenji Koide et.al.	2307.02948v2	link
2023-07-03	Direct Superpoints Matching for Fast and Robust Point Cloud Registration	Aniket Gupta et.al.	2307.01362v1	link
2023-07-04	A denoised Mean Teacher for domain adaptive point cloud registration	Alexander Bigalke et.al.	2306.14749v2	link
2023-06-20	End-to-end 2D-3D Registration between Image and LiDAR Point Cloud for Vehicle Localization	Guangming Wang et.al.	2306.11346v1	null
2023-06-14	ICET Online Accuracy Characterization for Geometry-Based Laser Scan Matching	Matthew McDermott et.al.	2306.08690v1	link
2023-06-12	Volume-DROID: A Real-Time Implementation of Volumetric Mapping with DROID-SLAM	Peter Stratton et.al.	2306.06850v1	link
2023-06-11	PWR-Align: Leveraging Part-Whole Relationships for Part-wise Rigid Point Cloud Registration in Mixed Reality Applications	Manorama Jha et.al.	2306.06717v1	null
2023-06-07	Robust-DefReg: A Robust Deformable Point Cloud Registration Method based on Graph Convolutional Neural Networks	Sara Monji-Azad et.al.	2306.04701v1	null
2023-05-23	Cross-source Point Cloud Registration: Challenges, Progress and Prospects	Xiaoshui Huang et.al.	2305.13570v1	null
2023-05-19	Efficient and Deterministic Search Strategy Based on Residual Projections for Point Cloud Registration	Xinyi Li et.al.	2305.11716v1	null
2023-05-18	3D Registration with Maximal Cliques	Xiyu Zhang et.al.	2305.10854v1	link
2023-05-05	HD2Reg: Hierarchical Descriptors and Detectors for Point Cloud Registration	Canhui Tang et.al.	2305.03487v1	link
2023-05-08	APR: Online Distant Point Cloud Registration Through Aggregated Point Cloud Reconstruction	Quan Liu et.al.	2305.02893v2	link
2023-04-27	RegHEC: Hand-Eye Calibration via Simultaneous Multi-view Point Clouds Registration of Arbitrary Object	Shiyu Xing et.al.	2304.14092v1	link
2023-04-26	Non-rigid Point Cloud Registration for Middle Ear Diagnostics with Endoscopic Optical Coherence Tomography	Peng Liu et.al.	2304.13618v1	link
2023-04-25	BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization	Harel Biggie et.al.	2304.13114v1	link
2023-04-18	SDFReg: Learning Signed Distance Functions for Point Cloud Registration	Leida Zhang et.al.	2304.08929v1	null
2023-04-12	SiLK -- Simple Learned Keypoints	Pierre Gleize et.al.	2304.06194v1	link
2023-04-11	TT-SDF2PC: Registration of Point Cloud and Compressed SDF Directly in the Memory-Efficient Tensor Train Domain	Alexey I. Boyko et.al.	2304.05342v1	null
2023-04-10	HybridFusion: LiDAR and Vision Cross-Source Point Cloud Fusion	Yu Wang et.al.	2304.04508v1	null
2023-04-09	Self-Supervised Learning of Object Segmentation from Unlabeled RGB-D Videos	Shiyang Lu et.al.	2304.04325v1	null
2023-04-09	DSMNet: Deep High-precision 3D Surface Modeling from Sparse Point Cloud Frames	Changjie Qiu et.al.	2304.04200v1	null
2023-04-02	Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting	Haiping Wang et.al.	2304.00467v1	link
2023-03-31	kNN-Res: Residual Neural Network with kNN-Graph coherence for point cloud registration	Muhammad S. Battikh et.al.	2304.00050v1	link
2023-03-31	RDMNet: Reliable Dense Matching Based Point Cloud Registration for Autonomous Driving	Chenghao Shi et.al.	2303.18084v1	null
2023-04-23	HybridPoint: Point Cloud Registration Based on Hybrid Point Sampling and Matching	Yiheng Li et.al.	2303.16526v2	link
2023-03-27	Learnable Graph Matching: A Practical Paradigm for Data Association	Jiawei He et.al.	2303.15414v1	link
2023-03-23	Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration	Guofeng Mei et.al.	2303.13290v1	link
2023-03-22	RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration	Jiuming Liu et.al.	2303.12384v1	link
2023-03-17	Deep Graph-based Spatial Consistency for Robust Non-rigid Point Cloud Registration	Zheng Qin et.al.	2303.09950v1	link
2023-03-14	RoCNet: 3D Robust Registration of Point-Clouds using Deep Learning	Karim Slimani et.al.	2303.07963v1	null
2023-03-07	GMCR: Graph-based Maximum Consensus Estimation for Point Cloud Registration	Michael Gentner et.al.	2303.04032v1	null
2023-03-02	Neural Intrinsic Embedding for Non-rigid Point Cloud Matching	Puhua Jiang et.al.	2303.01038v1	null
2023-03-14	A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation	Lin Li et.al.	2302.14511v2	link
2023-02-28	PCR-CG: Point Cloud Registration via Deep Color and Geometry	Yu Zhang et.al.	2302.14418v1	link
2023-02-28	Efficient Implicit Neural Reconstruction Using LiDAR	Dongyu Yan et.al.	2302.14363v1	link
2023-02-25	Accurate Gaussian Process Distance Fields with applications to Echolocation and Mapping	Cedric Le Gentil et.al.	2302.13005v1	null
2023-02-14	Point Cloud Registration for LiDAR and Photogrammetric Data: a Critical Synthesis and Performance Analysis on Classic and Deep Learning Algorithms	Ningli Xu et.al.	2302.07184v1	link

(back to top)

Point Cloud Segmentation

Publish Date	Title	Authors	PDF	Code
2025-03-07	Joint 3D Point Cloud Segmentation using Real-Sim Loop: From Panels to Trees and Branches	Tian Qiu et.al.	2503.05630v1	null
2025-03-05	Label-Efficient LiDAR Semantic Segmentation with 2D-3D Vision Transformer Adapters	Julia Hindel et.al.	2503.03299v1	null
2025-03-01	Explainable LiDAR 3D Point Cloud Segmentation and Clustering for Detecting Airplane-Generated Wind Turbulence	Zhan Qu et.al.	2503.00518v1	null
2025-02-26	PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments	Yueting Liu et.al.	2502.15342v3	link
2025-02-18	An Experimental Study of SOTA LiDAR Segmentation Models	Bike Chen et.al.	2502.12860v1	null
2025-01-30	Ground Awareness in Deep Learning for Large Outdoor Point Cloud Segmentation	Kevin Qiu et.al.	2501.18246v1	null
2025-01-29	3DSES: an indoor Lidar point cloud segmentation dataset with real and pseudo-labels from a 3D model	Maxime Mérizette et.al.	2501.17534v1	null
2025-01-24	LiDAR-Based Vehicle Detection and Tracking for Autonomous Racing	Marcello Cellina et.al.	2501.14502v1	null
2025-01-06	The 2nd Place Solution from the 3D Semantic Segmentation Track in the 2024 Waymo Open Dataset Challenge	Qing Wu et.al.	2501.05472v1	null
2025-01-03	MRG: A Multi-Robot Manufacturing Digital Scene Generation Method Using Multi-Instance Point Cloud Registration	Songjie Han et.al.	2501.02041v1	null
2025-01-18	Impact of color and mixing proportion of synthetic point clouds on semantic segmentation	Shaojie Zhou et.al.	2412.19145v2	link
2024-12-02	The Bare Necessities: Designing Simple, Effective Open-Vocabulary Scene Graphs	Christina Kassab et.al.	2412.01539v1	null
2024-11-30	Density-aware Global-Local Attention Network for Point Cloud Segmentation	Chade Li et.al.	2412.00489v1	null
2024-11-28	Textured As-Is BIM via GIS-informed Point Cloud Segmentation	Mohamed S. H. Alabassy et.al.	2411.18898v1	null
2024-11-27	Towards Cross-device and Training-free Robotic Grasping in 3D Open World	Weiguang Zhao et.al.	2411.18133v1	null
2024-11-20	BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation	Umamaheswaran Raman Kumar et.al.	2411.13251v1	null
2024-11-13	Biomass phenotyping of oilseed rape through UAV multi-view oblique imaging with 3DGS and SAM model	Yutao Shen et.al.	2411.08453v1	null
2024-11-13	Multiscale Graph Construction Using Non-local Cluster Features	Reina Kaneko et.al.	2411.08371v1	null
2024-10-30	Automated Image-Based Identification and Consistent Classification of Fire Patterns with Quantitative Shape Analysis and Spatial Location Identification	Pengkun Liu et.al.	2410.23105v1	null
2024-11-03	Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation	Zhaochong An et.al.	2410.22489v2	null
2024-10-28	Exploring contextual modeling with linear complexity for point cloud segmentation	Yong Xien Chng et.al.	2410.21211v1	null
2024-10-14	Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies	Yanjie Ze et.al.	2410.10803v1	link
2024-10-09	Evaluating the Impact of Point Cloud Colorization on Semantic Segmentation Accuracy	Qinfeng Zhu et.al.	2410.06725v1	null
2024-09-24	Underground Mapping and Localization Based on Ground-Penetrating Radar	Jinchang Zhang et.al.	2409.16446v1	null
2024-09-22	Lidar Panoptic Segmentation in an Open World	Anirudh S Chakravarthy et.al.	2409.14273v1	link
2024-09-03	When 3D Partial Points Meets SAM: Tooth Point Cloud Segmentation with Sparse Labels	Yifan Liu et.al.	2409.01691v1	null
2024-09-03	Efficiently Expanding Receptive Fields: Local Split Attention and Parallel Aggregation for Enhanced Large-scale Point Cloud Semantic Segmentation	Haodong Wang et.al.	2409.01662v1	null
2024-08-29	Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment	Liyao Tang et.al.	2408.16520v1	link
2024-08-21	GSTran: Joint Geometric and Semantic Coherence for Point Cloud Segmentation	Abiao Li et.al.	2408.11558v1	link
2024-08-02	Trainable Pointwise Decoder Module for Point Cloud Segmentation	Bike Chen et.al.	2408.01548v1	null
2024-07-31	Fine-grained Metrics for Point Cloud Semantic Segmentation	Zhuheng Lu et.al.	2407.21289v1	null
2024-07-19	Scale Disparity of Instances in Interactive Point Cloud Segmentation	Chenrui Han et.al.	2407.14009v1	null
2024-07-18	SegPoint: Segment Any Point Cloud via Large Language Model	Shuting He et.al.	2407.13761v1	null
2024-07-17	Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation	Ruijie Xu et.al.	2407.12489v1	link
2024-07-17	HGL: Hierarchical Geometry Learning for Test-time Adaptation in 3D Point Cloud Segmentation	Tianpei Zou et.al.	2407.12387v1	link
2024-07-17	Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model	Tao Wang et.al.	2407.12319v1	null
2024-07-12	Uplifting Range-View-based 3D Semantic Segmentation in Real-Time with Multi-Sensor Fusion	Shiqi Tan et.al.	2407.09697v1	null
2024-07-01	fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence	Francis Williams et.al.	2407.01781v1	null
2024-06-25	Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model	Zhuoyuan Li et.al.	2406.17442v1	null
2024-08-04	Twin Deformable Point Convolutions for Point Cloud Semantic Segmentation in Remote Sensing Scenes	Yong-Qiang Mao et.al.	2405.19735v2	null
2024-05-24	3D Unsupervised Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving	Boyi Sun et.al.	2405.15286v1	link
2024-05-25	Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation	Bike Chen et.al.	2405.10175v2	null
2024-04-16	ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation	Iaroslav Melekhov et.al.	2404.10699v1	link
2024-04-04	OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views	Francis Engelmann et.al.	2404.03650v1	null
2024-03-28	RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation	Chongkai Gao et.al.	2403.19460v1	null
2024-05-30	CurbNet: Curb Detection Framework Based on LiDAR Point Cloud Segmentation	Guoyang Zhao et.al.	2403.16794v2	link
2024-03-18	EffiPerception: an Efficient Framework for Various Perception Tasks	Xinhao Xiang et.al.	2403.12317v1	null
2024-03-11	3DRef: 3D Dataset and Benchmark for Reflection Detection in RGB and Lidar Data	Xiting Zhao et.al.	2403.06538v1	null
2024-03-11	Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation	Peng Zhang et.al.	2403.06401v1	null
2024-03-03	Region-Transformer: Self-Attention Region Based Class-Agnostic Point Cloud Segmentation	Dipesh Gyawali et.al.	2403.01407v1	null
2024-01-29	Dynamic Prototype Adaptation with Distillation for Few-shot Point Cloud Segmentation	Jie Liu et.al.	2401.16051v1	link
2024-01-19	Symbol as Points: Panoptic Symbol Spotting via Point-based Representation	Wenlong Liu et.al.	2401.10556v1	link
2023-12-29	Multi-modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation	Xiawei Li et.al.	2312.16578v2	link
2023-12-19	Point Cloud Segmentation Using Transfer Learning with RandLA-Net: A Case Study on Urban Areas	Alperen Enes Bayar et.al.	2312.11880v1	null
2023-12-15	T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning	Weijie Wei et.al.	2312.10217v1	link
2023-12-14	FAPP: Fast and Adaptive Perception and Planning for UAVs in Dynamic Cluttered Environments	Minghao Lu et.al.	2312.08743v1	null
2023-12-12	Transferring CLIP's Knowledge into Zero-Shot Point Cloud Semantic Segmentation	Yuanbin Wang et.al.	2312.07221v1	null
2023-12-11	Densify Your Labels: Unsupervised Clustering with Bipartite Matching for Weakly Supervised Point Cloud Segmentation	Shaobo Xia et.al.	2312.06799v1	null
2024-01-15	Provable Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More	Jan Schuchardt et.al.	2312.02708v2	null
2023-11-24	OneFormer3D: One Transformer for Unified Point Cloud Segmentation	Maxim Kolodiazhnyi et.al.	2311.14405v1	null
2023-11-18	DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields	Yu Chi et.al.	2311.12063v1	link
2023-11-10	U3DS$^3$: Unsupervised 3D Semantic Scene Segmentation	Jiaxu Liu et.al.	2311.06018v1	null
2023-11-06	Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation	Shichao Dong et.al.	2311.01989v2	null
2023-10-19	2D-3D Interlaced Transformer for Point Cloud Segmentation with Scene-Level Supervision	Cheng-Kun Yang et.al.	2310.12817v1	null
2023-10-11	PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation	Haibo Qiu et.al.	2310.07743v1	link
2023-09-26	Addressing Data Misalignment in Image-LiDAR Fusion on Point Cloud Segmentation	Wei Jong Yang et.al.	2309.14932v1	null
2023-09-20	Towards Robust Few-shot Point Cloud Semantic Segmentation	Yating Xu et.al.	2309.11228v1	link
2023-09-20	Generalized Few-Shot Point Cloud Segmentation Via Geometric Words	Yating Xu et.al.	2309.11222v1	link
2023-08-29	Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation	Cristiano Saltori et.al.	2308.14619v2	link
2023-08-22	Hierarchical Point-based Active Learning for Semi-supervised Point Cloud Semantic Segmentation	Zongyi Xu et.al.	2308.11166v1	link
2023-08-14	Autonomous Point Cloud Segmentation for Power Lines Inspection in Smart Grid	Alexander Kyuroson et.al.	2308.07283v1	null
2023-08-08	Boosting Few-shot 3D Point Cloud Segmentation via Query-Guided Enhancement	Zhenhua Ning et.al.	2308.03177v2	link
2023-07-31	pCTFusion: Point Convolution-Transformer Fusion with Semantic Aware Loss for Outdoor LiDAR Point Cloud Segmentation	Abhishek Kuriyal et.al.	2307.14777v2	link
2023-07-27	Clustering based Point Cloud Representation Learning for 3D Analysis	Tuo Feng et.al.	2307.14605v1	link
2023-07-20	See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data	Yuhang Lu et.al.	2307.10782v1	null
2023-07-14	Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar	Runwei Guan et.al.	2307.07102v1	link
2023-07-08	BPNet: Bézier Primitive Segmentation on 3D Point Clouds	Rao Fu et.al.	2307.04013v1	link
2023-06-28	Point2Point: A Framework for Efficient Deep Learning on Hilbert sorted Point Clouds with applications in Spatio-Temporal Occupancy Prediction	Athrva Atul Pandhare et.al.	2306.16306v1	null
2023-05-30	Dynamic Clustering Transformer Network for Point Cloud Segmentation	Dening Lu et.al.	2306.08073v1	null
2023-05-23	Prototype Adaption and Projection for Few- and Zero-shot 3D Point Cloud Semantic Segmentation	Shuting He et.al.	2305.14335v1	link
2023-05-22	Contrastive Predictive Autoencoders for Dynamic Point Cloud Self-Supervised Learning	Xiaoxiao Sheng et.al.	2305.12959v1	null
2023-05-17	Tinto: Multisensor Benchmark for 3D Hyperspectral Point Cloud Segmentation in the Geosciences	Ahmed J. Afifi et.al.	2305.09928v1	null
2023-05-08	OctFormer: Octree-based Transformers for 3D Point Clouds	Peng-Shuai Wang et.al.	2305.03045v2	link
2023-05-22	Urban GeoBIM construction by integrating semantic LiDAR point clouds with as-designed BIM models	Jie Shao et.al.	2304.11719v2	null
2023-04-22	Knowledge Distillation from 3D to Bird's-Eye-View for LiDAR Semantic Segmentation	Feng Jiang et.al.	2304.11393v1	link
2023-06-02	Transformer-Based Visual Segmentation: A Survey	Xiangtai Li et.al.	2304.09854v2	link
2023-04-11	Feature-assisted interactive geometry reconstruction in 3D point clouds using incremental region growing	Attila Szabo et.al.	2304.05109v1	null

(back to top)

Zero-shot

Publish Date	Title	Authors	PDF	Code
2025-03-11	Exploring the Word Sense Disambiguation Capabilities of Large Language Models	Pierpaolo Basile et.al.	2503.08662v1	null
2025-03-11	CellStyle: Improved Zero-Shot Cell Segmentation via Style Transfer	Rüveyda Yilmaz et.al.	2503.08603v1	null
2025-03-11	NSF-SciFy: Mining the NSF Awards Database for Scientific Claims	Delip Rao et.al.	2503.08600v1	null
2025-03-11	MMRL: Multi-Modal Representation Learning for Vision-Language Models	Yuncheng Guo et.al.	2503.08497v1	link
2025-03-11	Controlling Latent Diffusion Using Latent CLIP	Jason Becker et.al.	2503.08455v1	null
2025-03-11	Embodied Crowd Counting	Runling Long et.al.	2503.08367v1	null
2025-03-11	Reasoning in visual navigation of end-to-end trained agents: a dynamical systems approach	Steeven Janny et.al.	2503.08306v1	null
2025-03-12	Large Language Model as Meta-Surrogate for Data-Driven Many-Task Optimization: A Proof-of-Principle Study	Xian-Rong Zhang et.al.	2503.08301v2	null
2025-03-11	Investigating the Effectiveness of a Socratic Chain-of-Thoughts Reasoning Method for Task Planning in Robotics, A Case Study	Veronica Bot et.al.	2503.08174v1	null
2025-03-12	Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning	Lizhen Xu et.al.	2503.08101v2	link
2025-03-10	PE3R: Perception-Efficient 3D Reconstruction	Jie Hu et.al.	2503.07507v1	null
2025-03-10	Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts	Shiu-hong Kao et.al.	2503.07503v1	null
2025-03-10	LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition?	Bangyan Li et.al.	2503.07487v1	null
2025-03-10	YOLOE: Real-Time Seeing Anything	Ao Wang et.al.	2503.07465v1	link
2025-03-10	REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding	Yan Tai et.al.	2503.07413v1	link
2025-03-10	Dynamic Path Navigation for Motion Agents with LLM Reasoning	Yubo Zhao et.al.	2503.07323v1	null
2025-03-10	Automatic Curriculum Design for Zero-Shot Human-AI Coordination	Won-Sang You et.al.	2503.07275v1	null
2025-03-11	AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis	Zhangyu Lai et.al.	2503.07253v2	null
2025-03-10	Cross-Lingual IPA Contrastive Learning for Zero-Shot NER	Jimin Sohn et.al.	2503.07214v1	null
2025-03-10	A Zero-shot Learning Method Based on Large Language Models for Multi-modal Knowledge Graph Embedding	Bingchen Liu et.al.	2503.07202v1	null
2025-03-10	Multi-Modal 3D Mesh Reconstruction from Images and Text	Melvin Reka et.al.	2503.07190v1	null
2025-03-10	Generative AI in Transportation Planning: A Survey	Longchao Da et.al.	2503.07158v1	null
2025-03-07	Joint 3D Point Cloud Segmentation using Real-Sim Loop: From Panels to Trees and Branches	Tian Qiu et.al.	2503.05630v1	null
2025-03-07	InDRiVE: Intrinsic Disagreement based Reinforcement for Vehicle Exploration through Curiosity Driven Generalized World Model	Feeza Khan Khanzada et.al.	2503.05573v1	null
2025-03-07	Stereo Any Video: Temporally Consistent Stereo Matching	Junpeng Jing et.al.	2503.05549v1	null
2025-03-07	Data-Efficient Generalization for Zero-shot Composed Image Retrieval	Zining Chen et.al.	2503.05204v1	null
2025-03-06	Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation	Bryan Li et.al.	2503.05010v1	null
2025-03-06	Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning	Albert Wilcox et.al.	2503.04877v1	null
2025-03-06	Memory Is All You Need: Testing How Model Memory Affects LLM Performance in Annotation Tasks	Joan C. Timoneda et.al.	2503.04874v1	null
2025-03-06	Enough Coin Flips Can Make LLMs Act Bayesian	Ritwik Gupta et.al.	2503.04722v1	null
2025-03-06	A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning	Qing Zhou et.al.	2503.04592v1	null
2025-03-06	SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks	Yijie Guo et.al.	2503.04538v1	null
2025-03-06	AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM	Sunghyun Ahn et.al.	2503.04504v1	null
2025-03-06	Semantic Alignment of Unimodal Medical Text and Vision Representations	Maxime Di Folco et.al.	2503.04478v1	null
2025-03-06	EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images	Rohit Menon et.al.	2503.04441v1	null
2025-03-06	A Dataset for Analysing News Framing in Chinese Media	Owen Cook et.al.	2503.04439v1	null
2025-03-06	Comparative Study of Zero-Shot Cross-Lingual Transfer for Bodo POS and NER Tagging Using Gemini 2.0 Flash Thinking Experimental Model	Sanjib Narzary et.al.	2503.04405v1	null
2025-03-06	Large Language Models for Zero-shot Inference of Causal Structures in Biology	Izzy Newsham et.al.	2503.04347v1	null
2025-03-06	Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior	Haitao Wu et.al.	2503.04207v1	null
2025-03-05	OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction	Huang Huang et.al.	2503.03734v1	null
2025-03-05	CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP	Songlong Xing et.al.	2503.03613v1	link
2025-03-05	Scaling Crowdsourced Election Monitoring: Construction and Evaluation of Classification Models for Multilingual and Cross-Domain Classification Settings	Jabez Magomere et.al.	2503.03582v1	null
2025-03-05	iNews: A Multimodal Dataset for Modeling Personalized Affective Responses to News	Tiancheng Hu et.al.	2503.03335v1	null
2025-03-04	ArticuBot: Learning Universal Articulated Object Manipulation Policy via Large Scale Simulation	Yufei Wang et.al.	2503.03045v1	null
2025-03-04	Zero-Shot Multi-Label Classification of Bangla Documents: Large Decoders Vs. Classic Encoders	Souvika Sarkar et.al.	2503.02993v1	null
2025-03-04	RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks	Yimin Tang et.al.	2503.02992v1	null
2025-03-06	Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training	Vaibhav Singh et.al.	2503.02844v2	null
2025-03-04	SeqFusion: Sequential Fusion of Pre-Trained Models for Zero-Shot Time-Series Forecasting	Ting-Ji Huang et.al.	2503.02836v1	link
2025-03-04	Bridging VLM and KMP: Enabling Fine-grained robotic manipulation via Semantic Keypoints Representation	Junjie Zhu et.al.	2503.02748v1	null
2025-03-04	Evaluating Knowledge Generation and Self-Refinement Strategies for LLM-based Column Type Annotation	Keti Korini et.al.	2503.02718v1	null
2025-03-04	FlowPlan: Zero-Shot Task Planning with LLM Flow Engineering for Robotic Instruction Following	Zijun Lin et.al.	2503.02698v1	null
2025-03-04	Zero-Shot Complex Question-Answering on Long Scientific Documents	Wanting Wang et.al.	2503.02695v1	null
2025-03-04	Towards Event Extraction with Massive Types: LLM-based Collaborative Annotation and Partitioning Extraction	Wenxuan Liu et.al.	2503.02628v1	null
2025-03-04	Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection	Wei Luo et.al.	2503.02424v1	null
2025-03-04	EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports	Lama Moukheiber et.al.	2503.02365v1	null
2025-03-04	Towards Explainable Doctor Recommendation with Large Language Models	Ziyang Zeng et.al.	2503.02298v1	null
2025-02-28	Assessing zero-shot generalisation behaviour in graph-neural-network interatomic potentials	Chiheb Ben Mahmoud et.al.	2502.21317v1	null
2025-02-28	LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging	Maximilian Rokuss et.al.	2502.20985v1	null
2025-02-28	WebFAQ: A Multilingual Collection of Natural Q&A Datasets for Dense Retrieval	Michael Dinzinger et.al.	2502.20936v1	null
2025-02-28	Less is More? Revisiting the Importance of Frame Rate in Real-Time Zero-Shot Surgical Video Segmentation	Utku Ozbulak et.al.	2502.20934v1	null
2025-02-28	DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping	Yifan Zhong et.al.	2502.20900v1	null
2025-02-28	Better Benchmarking LLMs for Zero-Shot Dependency Parsing	Ana Ezquerro et.al.	2502.20866v1	null
2025-02-28	MESC-3D: Mining Effective Semantic Cues for 3D Reconstruction from a Single Image	Shaoming Li et.al.	2502.20861v1	null
2025-02-28	Hierarchical and Modular Network on Non-prehensile Manipulation in General Environments	Yoonyoung Cho et.al.	2502.20843v1	null
2025-02-28	CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval	Zelong Sun et.al.	2502.20826v1	null
2025-02-28	MedHallTune: An Instruction-Tuning Benchmark for Mitigating Medical Hallucination in Vision-Language Models	Qiao Yan et.al.	2502.20780v1	null
2025-02-27	InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions	Sirui Xu et.al.	2502.20390v1	null
2025-02-27	Physics-Driven Data Generation for Contact-Rich Manipulation via Trajectory Optimization	Lujie Yang et.al.	2502.20382v1	null
2025-02-27	Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners	Daniele Paliotta et.al.	2502.20339v1	null
2025-02-27	UniTok: A Unified Tokenizer for Visual Generation and Understanding	Chuofan Ma et.al.	2502.20321v1	link
2025-02-27	FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction	Siyu Jiao et.al.	2502.20313v1	link
2025-02-27	Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription	Benjamin Gutteridge et.al.	2502.20295v1	link
2025-02-27	Visual Adaptive Prompting for Compositional Zero-Shot Learning	Kyle Stein et.al.	2502.20292v1	null
2025-02-27	An Extensive Evaluation of PDDL Capabilities in off-the-shelf LLMs	Kaustubh Vyas et.al.	2502.20175v1	null
2025-02-27	Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models	Itay Benou et.al.	2502.20134v1	null
2025-02-27	UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler	Luigi Piccinelli et.al.	2502.20110v1	link
2025-02-26	ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models	Danae Sánchez Villegas et.al.	2502.19409v1	null
2025-02-26	Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator	Xiankang He et.al.	2502.19204v1	link
2025-02-26	A Sliding Layer Merging Method for Efficient Depth-Wise Pruning in LLMs	Xuan Ding et.al.	2502.19159v1	null
2025-02-26	A Survey on Foundation-Model-Based Industrial Defect Detection	Tianle Yang et.al.	2502.19106v1	null
2025-02-26	Foundation Inference Models for Stochastic Differential Equations: A Transformer-based Approach for Zero-shot Function Estimation	Patrick Seifner et.al.	2502.19049v1	null
2025-02-26	FungalZSL: Zero-Shot Fungal Classification with Image Captioning Using a Synthetic Data Approach	Anju Rani et.al.	2502.19038v1	null
2025-02-26	Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis	Ziyue Jiang et.al.	2502.18924v1	null
2025-02-26	Think on your feet: Seamless Transition between Human-like Locomotion in Response to Changing Commands	Huaxing Huang et.al.	2502.18901v1	null
2025-02-26	Hierarchical corpus encoder: Fusing generative retrieval and dense indices	Tongfei Chen et.al.	2502.18877v1	null
2025-02-26	Data-Efficient Multi-Agent Spatial Planning with LLMs	Huangyuan Su et.al.	2502.18822v1	null
2025-02-25	Evaluating the Effectiveness of Small Language Models in Detecting Refactoring Bugs	Rohit Gheyi et.al.	2502.18454v1	null
2025-02-25	LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation	Pengzhi Li et.al.	2502.18302v1	null
2025-02-25	Synthesizing Consistent Novel Views via 3D Epipolar Attention without Re-Training	Botao Ye et.al.	2502.18219v1	null
2025-02-25	Task-Agnostic Semantic Communication with Multimodal Foundation Models	Jiangjing Hu et.al.	2502.18200v1	null
2025-02-25	CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification	Mingkun Zhang et.al.	2502.18176v1	link
2025-02-25	Progressive Local Alignment for Medical Multimodal Pre-training	Huimin Yan et.al.	2502.18047v1	null
2025-02-26	From planning to policy: distilling $\texttt{Skill-RRT}$ for long-horizon prehensile and non-prehensile manipulation	Haewon Jung et.al.	2502.18015v2	null
2025-02-25	Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs	Che Liu et.al.	2502.17900v1	null
2025-02-25	FetchBot: Object Fetching in Cluttered Shelves via Zero-Shot Sim2Real	Weiheng Liu et.al.	2502.17894v1	null
2025-02-25	UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting	Haoyuan Li et.al.	2502.17860v1	null
2025-02-24	X-Dancer: Expressive Music to Human Dance Video Generation	Zeyuan Chen et.al.	2502.17414v1	null
2025-02-24	FIG: Forward-Inverse Generation for Low-Resource Domain-specific Event Detection	Tanmay Parekh et.al.	2502.17394v1	null
2025-02-24	Improving the Inclusivity of Dutch Speech Recognition by Fine-tuning Whisper on the JASMIN-CGN Corpus	Golshid Shekoufandeh et.al.	2502.17284v1	null
2025-02-24	VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing	Xiangpeng Yang et.al.	2502.17258v1	null
2025-02-24	Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search	Boyan Li et.al.	2502.17248v1	null
2025-02-24	A Reinforcement Learning Approach to Non-prehensile Manipulation through Sliding	Hamidreza Raei et.al.	2502.17221v1	null
2025-02-24	DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications	Ibrahim Fayad et.al.	2502.17066v1	null
2025-02-24	LLM-QE: Improving Query Expansion by Aligning Large Language Models with Ranking Preferences	Sijia Yao et.al.	2502.17057v1	link
2025-02-24	MA2RL: Masked Autoencoders for Generalizable Multi-Agent Reinforcement Learning	Jinyuan Feng et.al.	2502.17046v1	null
2025-02-24	Reasoning Does Not Necessarily Improve Role-Playing Ability	Xiachong Feng et.al.	2502.16940v1	null
2025-02-21	ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval	Guanqi Zhan et.al.	2502.15682v1	null
2025-02-21	One-step Diffusion Models with $f$-Divergence Distribution Matching	Yilun Xu et.al.	2502.15681v1	null
2025-02-21	Pick-and-place Manipulation Across Grippers Without Retraining: A Learning-optimization Diffusion Policy Approach	Xiangtong Yao et.al.	2502.15613v1	link
2025-02-21	FaultGPT: Industrial Fault Diagnosis Question Answering System by Vision Language Models	Jiao Chen et.al.	2502.15481v1	null
2025-02-21	Multimodal Graph-Based Variational Mixture of Experts Network for Zero-Shot Multimodal Information Extraction	Baohang Zhou et.al.	2502.15290v1	link
2025-02-21	From Documents to Dialogue: Building KG-RAG Enhanced AI Assistants	Manisha Mukherjee et.al.	2502.15237v1	null
2025-02-21	GNN-Coder: Boosting Semantic Code Retrieval with Combined GNNs and Transformer	Yufan Ye et.al.	2502.15202v1	null
2025-02-21	Extreme Speech Classification in the Era of LLMs: Exploring Open-Source and Proprietary Models	Sarthak Mahajan et.al.	2502.15155v1	null
2025-02-20	A Meta-Evaluation of Style and Attribute Transfer Metrics	Amalie Brogaard Pauli et.al.	2502.15022v1	null
2025-02-20	Using tournaments to calculate AUROC for zero-shot classification with LLMs	Wonjin Yoon et.al.	2502.15018v1	null
2025-02-20	Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models	Vlad Sobal et.al.	2502.14819v1	null
2025-02-20	Dynamic Low-Rank Sparse Adaptation for Large Language Models	Weizhong Huang et.al.	2502.14816v1	link
2025-02-20	RendBEV: Semantic Novel View Synthesis for Self-Supervised Bird's Eye View Segmentation	Henrique Piñeiro Monteagudo et.al.	2502.14792v1	null
2025-02-20	SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features	Michael Tschannen et.al.	2502.14786v1	link
2025-02-20	Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective	Weizhong Huang et.al.	2502.14770v1	null
2025-02-20	Entity Framing and Role Portrayal in the News	Tarek Mahmoud et.al.	2502.14718v1	null
2025-02-20	Exploring RWKV for Sentence Embeddings: Layer-wise Analysis and Baseline Comparison for Semantic Similarity	Xinghan Pan et.al.	2502.14620v1	link
2025-02-20	Noisy Test-Time Adaptation in Vision-Language Models	Chentao Cao et.al.	2502.14604v1	link
2025-02-20	LLM-based User Profile Management for Recommender System	Seunghwan Bang et.al.	2502.14541v1	null
2025-02-20	Generative adversarial networks vs large language models: a comparative study on synthetic tabular data generation	Austin A. Barr et.al.	2502.14523v1	link
2025-02-20	Where's the Bug? Attention Probing for Scalable Fault Localization	Adam Stein et.al.	2502.13966v2	null
2025-02-19	A Training-Free Framework for Precise Mobile Manipulation of Small Everyday Objects	Arjun Gupta et.al.	2502.13964v1	null
2025-02-19	NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants	Yiran Qin et.al.	2502.13894v1	null
2025-02-19	Quantifying Memorization and Retriever Performance in Retrieval-Augmented Vision-Language Models	Peter Carragher et.al.	2502.13836v1	null
2025-02-19	MMTEB: Massive Multilingual Text Embedding Benchmark	Kenneth Enevoldsen et.al.	2502.13595v1	null
2025-02-19	Extracting Social Connections from Finnish Karelian Refugee Interviews Using LLMs	Joonatan Laato et.al.	2502.13566v1	null
2025-02-19	PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference	Burc Gokden et.al.	2502.13502v1	link
2025-02-19	Enhancing Chest X-ray Classification through Knowledge Injection in Cross-Modality Learning	Yang Yan et.al.	2502.13447v1	null
2025-02-19	MaizeEar-SAM: Zero-Shot Maize Ear Phenotyping	Hossein Zaremehrjerdi et.al.	2502.13399v1	link
2025-02-19	$\mathtt{GeLLM^3O}$: Generalizing Large Language Models for Multi-property Molecule Optimization	Vishal Dey et.al.	2502.13398v1	link
2025-02-18	LAMD: Context-driven Android Malware Detection and Classification with LLMs	Xingzhi Qian et.al.	2502.13055v1	null
2025-02-18	Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms	Kangning Cui et.al.	2502.13023v1	null
2025-02-18	A Survey of Text Classification Under Class Distribution Shift	Adriana Valentina Costache et.al.	2502.12965v1	null
2025-02-18	Performance of Zero-Shot Time Series Foundation Models on Cloud Data	William Toner et.al.	2502.12944v1	null
2025-02-18	Commonsense Reasoning in Arab Culture	Abdelrahman Sadallah et.al.	2502.12788v1	null
2025-02-18	High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion	Xiang Zhang et.al.	2502.12752v1	null
2025-02-18	Self-Enhanced Reasoning Training: Activating Latent Reasoning in Small Models for Enhanced Reasoning Distillation	Yong Zhang et.al.	2502.12744v1	null
2025-02-18	SATA: Safe and Adaptive Torque-Based Locomotion Policies Inspired by Animal Learning	Peizhuo Li et.al.	2502.12674v1	null
2025-02-18	Label Drop for Multi-Aspect Relation Modeling in Universal Information Extraction	Lu Yang et.al.	2502.12614v1	link
2025-02-18	Enhancing Semi-supervised Learning with Noisy Zero-shot Pseudolabels	Jichan Chung et.al.	2502.12584v1	null
2025-02-17	SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs	Yige Xu et.al.	2502.12134v1	null
2025-02-17	Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation	Zhongyi Qiu et.al.	2502.12073v1	null
2025-02-17	Model Generalization on Text Attribute Graphs: Principles with Large Language Models	Haoyu Wang et.al.	2502.11836v1	link
2025-02-17	Text Classification in the LLM Era - Where do we stand?	Sowmya Vajjala et.al.	2502.11830v1	null
2025-02-17	video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model	Guangzhi Sun et.al.	2502.11775v1	null
2025-02-17	Multi-Modal Retrieval Augmentation for Open-Ended and Knowledge-Intensive Video Question Answering	Md Zarif Ul Alam et.al.	2502.11747v1	null
2025-02-17	Adversarially Robust CLIP Models Can Induce Better (Robust) Perceptual Metrics	Francesco Croce et.al.	2502.11725v1	link
2025-02-17	MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction	Jingcheng Ni et.al.	2502.11663v1	link
2025-02-17	Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance	Birger Moell et.al.	2502.11578v1	null
2025-02-17	Improving Rare-Word Recognition in Zero-Shot Settings	Yash Jogi et.al.	2502.11572v1	null
2025-02-14	Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction	WonJin Yoon et.al.	2502.10388v1	null
2025-02-14	SPIRIT: Short-term Prediction of solar IRradIance for zero-shot Transfer learning using Foundation Models	Aditya Mishra et.al.	2502.10307v1	null
2025-02-14	Are Large Language Models the future crowd workers of Linguistics?	Iris Ferrazzo et.al.	2502.10266v1	null
2025-02-14	Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers	Aivin V. Solatorio et.al.	2502.10263v1	null
2025-02-14	PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention Control	Kunal Swami et.al.	2502.10258v1	null
2025-02-14	Cooperative Multi-Agent Planning with Adaptive Skill Synthesis	Zhiyuan Li et.al.	2502.10148v1	null
2025-02-14	AutoS$^2$earch: Unlocking the Reasoning Potential of Large Models for Web-based Source Search	Zhengqiu Zhu et.al.	2502.09913v1	null
2025-02-14	Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction to Generation and Beyond	Kehan Guo et.al.	2502.09897v1	null
2025-02-13	Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data	Yu Leng et.al.	2502.09715v1	null
2025-02-13	Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights	Jonathan Kahana et.al.	2502.09619v1	null
2025-02-13	Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs	Siyan Zhao et.al.	2502.09597v1	link
2025-02-14	Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering	Mark Beliaev et.al.	2502.09573v2	null
2025-02-13	Zero-shot generation of synthetic neurosurgical data with large language models	Austin A. Barr et.al.	2502.09566v1	link
2025-02-13	AnomalyGFM: Graph Foundation Model for Zero/Few-shot Anomaly Detection	Hezhe Qiao et.al.	2502.09254v1	null
2025-02-13	E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization	Trung X. Pham et.al.	2502.09164v1	null
2025-02-13	Zero-shot Concept Bottleneck Models	Shin'ya Yamaguchi et.al.	2502.09018v1	link
2025-02-13	Hope vs. Hate: Understanding User Interactions with LGBTQ+ News Content in Mainstream US News Media through the Lens of Hope Speech	Jonathan Pofcher et.al.	2502.09004v1	null
2025-02-13	Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning	Hyundong Cho et.al.	2502.08972v1	null
2025-02-12	MuJoCo Playground	Kevin Zakka et.al.	2502.08844v1	null
2025-02-12	Re$^3$Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation	Xiaoshen Han et.al.	2502.08645v1	null
2025-02-12	Rhythmic sharing: A bio-inspired paradigm for zero-shot adaptation and learning in neural networks	Hoony Kang et.al.	2502.08644v1	link
2025-02-12	From Haystack to Needle: Label Space Reduction for Zero-shot Classification	Nathan Vandemoortele et.al.	2502.08436v1	null
2025-02-12	Salience-Invariant Consistent Policy Learning for Generalization in Visual Reinforcement Learning	Sun Jingbo et.al.	2502.08336v1	null
2025-02-12	FixDrive: Automatically Repairing Autonomous Vehicle Driving Behaviour for $0.08 per Violation	Yang Sun et.al.	2502.08260v1	link
2025-02-12	HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses	Sujeong Lee et.al.	2502.08109v1	null
2025-02-12	Franken-Adapter: Cross-Lingual Adaptation of LLMs by Embedding Surgery	Fan Jiang et.al.	2502.08037v1	null
2025-02-11	Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models	Jiacong Xu et.al.	2502.07601v1	null
2025-02-11	LoRP-TTS: Low-Rank Personalized Text-To-Speech	Łukasz Bondaruk et.al.	2502.07562v1	null
2025-02-12	O1 Embedder: Let Retrievers Think Before Action	Ruiran Yan et.al.	2502.07555v2	null
2025-02-11	Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech Prediction	Leying Zhang et.al.	2502.07345v1	null
2025-02-11	TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation	Navid Rajabi et.al.	2502.07306v1	null
2025-02-11	Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement	Xueyao Zhang et.al.	2502.07243v1	null
2025-02-11	PDV: Prompt Directional Vectors for Zero-shot Composed Image Retrieval	Osman Tursun et.al.	2502.07215v1	null
2025-02-11	Perceived Confidence Scoring for Data Annotation with Zero-Shot LLMs	Sina Salimian et.al.	2502.07186v1	null
2025-02-11	Don't Just Demo, Teach Me the Principles: A Principle-Based Multi-Agent Prompting Strategy for Text Classification	Peipei Wei et.al.	2502.07165v1	null
2025-02-10	Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations	Yong Cao et.al.	2502.07068v1	link
2025-02-10	Visual Agentic AI for Spatial Reasoning with a Dynamic API	Damiano Marsili et.al.	2502.06787v1	null
2025-02-10	Boosting Self-Efficacy and Performance of Large Language Models via Verbal Efficacy Stimulations	Rui Chen et.al.	2502.06669v1	null
2025-02-10	MaterialFusion: High-Quality, Zero-Shot, and Controllable Material Transfer with Diffusion Models	Kamil Garifullin et.al.	2502.06606v1	null
2025-02-10	CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers	D. She et.al.	2502.06527v1	null
2025-02-10	Learning Clustering-based Prototypes for Compositional Zero-shot Learning	Hongyu Qu et.al.	2502.06501v1	link
2025-02-10	Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences	Riccardo Cadei et.al.	2502.06343v1	null
2025-02-10	Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior	Lee Hyoseok et.al.	2502.06338v1	null
2025-02-10	Find Central Dogma Again	Wang Liang et.al.	2502.06253v1	null
2025-02-10	Scaling Public Health Text Annotation: Zero-Shot Learning vs. Crowdsourcing for Improved Efficiency and Labeling Accuracy	Kamyar Kazari et.al.	2502.06150v1	null
2025-02-09	Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning	Bidipta Sarkar et.al.	2502.06060v1	link
2025-02-07	QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation	Yue Zhao et.al.	2502.05178v1	null
2025-02-07	AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting	Chung-Ho Wu et.al.	2502.05176v1	null
2025-02-07	DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions	Gorkem Can Ates et.al.	2502.05091v1	null
2025-02-07	Aligning Black-box Language Models with Human Judgments	Gerrit J. J. van den Burg et.al.	2502.04997v1	null
2025-02-07	OccGS: Zero-shot 3D Occupancy Reconstruction with Semantic and Geometric-Aware Gaussian Splatting	Xiaoyu Zhou et.al.	2502.04981v1	null
2025-02-07	STRIDE: Automating Reward Design, Deep Reinforcement Learning Training and Feedback Optimization in Humanoid Robotics Locomotion	Zhenwei Wu et.al.	2502.04692v1	null
2025-02-07	ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning	Yuwei Yin et.al.	2502.04689v1	link
2025-02-07	Mechanistic Understandings of Representation Vulnerabilities and Engineering Robust Vision Transformers	Chashi Mahiul Islam et.al.	2502.04679v1	null
2025-02-06	Zero-shot Meta-learning for Tabular Prediction Tasks with Adversarially Pre-trained Transformer	Yulun Wu et.al.	2502.04573v1	null
2025-02-06	GenVC: Self-Supervised Zero-Shot Voice Conversion	Zexin Cai et.al.	2502.04519v1	null
2025-02-06	ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features	Alec Helbling et.al.	2502.04320v1	link
2025-02-06	Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion	Marco Mistretta et.al.	2502.04263v1	link
2025-02-06	LR0.FM: Low-Resolution Zero-shot Classification Benchmark For Foundation Models	Priyank Pathak et.al.	2502.03950v1	link
2025-02-06	DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation	Dongya Jia et.al.	2502.03930v1	null
2025-02-06	It's All in The [MASK]: Simple Instruction-Tuning Enables BERT-like Masked Language Models As Generative Classifiers	Benjamin Clavié et.al.	2502.03793v1	null
2025-02-05	DynVFX: Augmenting Real Videos with Dynamic Content	Danah Yatim et.al.	2502.03621v1	null
2025-02-05	SKI Models: Skeleton Induced Vision-Language Embeddings for Understanding Activities of Daily Living	Arkaprava Sinha et.al.	2502.03459v1	null
2025-02-05	Think or Step-by-Step? UnZIPping the Black Box in Zero-Shot Prompts	Nikta Gohari Sadr et.al.	2502.03418v1	null
2025-02-05	Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications	Issar Arab et.al.	2502.03395v1	null
2025-02-05	CAPE: Covariate-Adjusted Pre-Training for Epidemic Time Series Forecasting	Zewen Liu et.al.	2502.03393v1	null
2025-02-05	ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models	Ying Zhang et.al.	2502.03266v1	link
2025-02-05	SimSort: A Powerful Framework for Spike Sorting by Large-Scale Electrophysiology Simulation	Yimu Zhang et.al.	2502.03198v1	null
2025-02-05	Metis: A Foundation Speech Generation Model with Masked Generative Pre-training	Yuancheng Wang et.al.	2502.03128v1	link
2025-02-05	IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates	Aissatou Diallo et.al.	2502.03080v1	null
2025-02-05	Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech	Jixun Yao et.al.	2502.02950v1	null
2025-02-04	RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2	Bin Xie et.al.	2502.02741v1	null
2025-02-04	IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning	Quan Zhang et.al.	2502.02454v1	null
2025-02-04	Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects	Henrique Nunes et.al.	2502.02368v1	null
2025-02-04	LoRA-TTT: Low-Rank Test-Time Training for Vision-Language Models	Yuto Kojima et.al.	2502.02069v1	null
2025-02-04	VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play	Zelai Xu et.al.	2502.01932v1	null
2025-02-03	AquaticCLIP: A Vision-Language Foundation Model for Underwater Scene Analysis	Basit Alawode et.al.	2502.01785v1	null
2025-02-03	Expected Return Symmetries	Darius Muglich et.al.	2502.01711v1	null
2025-02-03	Scalable Language Models with Posterior Inference of Latent Thought Vectors	Deqian Kong et.al.	2502.01567v1	null
2025-02-03	Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning	Kaixi Bao et.al.	2502.01521v1	null
2025-02-03	Embrace Collisions: Humanoid Shadowing for Deployable Contact-Agnostics Motions	Ziwen Zhuang et.al.	2502.01465v1	null
2025-02-03	A Framework for Double-Blind Federated Adaptation of Foundation Models	Nurbek Tastan et.al.	2502.01289v1	null
2025-01-31	MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems	Anirudh Chari et.al.	2501.19318v1	null
2025-01-31	Differentially Private In-context Learning via Sampling Few-shot Mixed with Zero-shot Outputs	James Flemings et.al.	2501.19287v1	null
2025-01-31	A Zero-Shot Generalization Framework for LLM-Driven Cross-Domain Sequential Recommendation	Yunzhe Li et.al.	2501.19232v1	null
2025-01-31	Autonomous Legacy Web Application Upgrades Using a Multi-Agent System	Valtteri Ala-Salmi et.al.	2501.19204v1	link
2025-01-31	Efficient Reasoning with Hidden Thinking	Xuan Shen et.al.	2501.19201v1	link
2025-01-31	Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected	Yingtao Zhang et.al.	2501.19107v1	null
2025-01-31	Fairness Analysis of CLIP-Based Foundation Models for X-Ray Image Classification	Xiangyu Sun et.al.	2501.19086v1	null
2025-02-03	Contrast-Aware Calibration for Fine-Tuned CLIP: Leveraging Image-Text Alignment	Song-Lin Lv et.al.	2501.19060v2	null
2025-01-31	TV-Dialogue: Crafting Theme-Aware Video Dialogues with Immersive Interaction	Sai Wang et.al.	2501.18940v1	null
2025-01-31	Test-time Loss Landscape Adaptation for Zero-Shot Generalization in Vision-Language Models	Aodi Li et.al.	2501.18864v1	null
2025-01-30	DeltaLLM: Compress LLMs with Low-Rank Deltas between Shared Weights	Liana Mikaelyan et.al.	2501.18596v1	null
2025-01-30	Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models	Guanqun Cao et.al.	2501.18516v1	null
2025-01-30	CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering	Yumeng Wang et.al.	2501.18457v1	null
2025-01-30	ReactEmbed: A Cross-Domain Framework for Protein-Molecule Representation Learning via Biochemical Reaction Networks	Amitay Sicherman et.al.	2501.18278v1	link
2025-01-30	Unraveling the Capabilities of Language Models in News Summarization	Abdurrahman Odabaşı et.al.	2501.18128v1	link
2025-01-30	LLMs can see and hear without any training	Kumar Ashutosh et.al.	2501.18096v1	link
2025-01-29	Hybrid Graphs for Table-and-Text based Question Answering using LLMs	Ankush Agarwal et.al.	2501.17767v1	null
2025-01-29	VoicePrompter: Robust Zero-Shot Voice Conversion with Voice Prompt and Conditional Flow Matching	Ha-Yeong Choi et.al.	2501.17612v1	null
2025-01-29	LLM Assistance for Pediatric Depression	Mariia Ignashina et.al.	2501.17510v1	null
2025-01-29	General Scene Adaptation for Vision-and-Language Navigation	Haodong Hong et.al.	2501.17403v1	link
2025-01-28	RLPP: A Residual Method for Zero-Shot Real-World Autonomous Racing on Scaled Platforms	Edoardo Ghignone et.al.	2501.17311v1	link
2025-01-28	Mitigating Hallucinated Translations in Large Language Models with Hallucination-focused Preference Optimization	Zilu Tang et.al.	2501.17295v1	null
2025-01-28	Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding	Akash Kumar et.al.	2501.17053v1	null
2025-01-28	Image-based Geo-localization for Robotics: Are Black-box Vision-Language Models there yet?	Sania Waheed et.al.	2501.16947v1	null
2025-01-28	Irony Detection, Reasoning and Understanding in Zero-shot Learning	Peiling Yi et.al.	2501.16884v1	null
2025-01-28	LLM Assisted Anomaly Detection Service for Site Reliability Engineers: Enhancing Cloud Infrastructure Resilience	Nimesh Jha et.al.	2501.16744v1	null
2025-01-28	B-RIGHT: Benchmark Re-evaluation for Integrity in Generalized Human-Object Interaction Testing	Yoojin Jang et.al.	2501.16724v1	link
2025-01-28	Polyp-Gen: Realistic and Diverse Polyp Image Generation for Endoscopic Dataset Expansion	Shengyuan Liu et.al.	2501.16679v1	link
2025-01-27	How well can LLMs Grade Essays in Arabic?	Rayed Ghazawi et.al.	2501.16516v1	null
2025-01-27	Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM	Payal Kamboj et.al.	2501.16481v1	link
2025-01-28	Upside Down Reinforcement Learning with Policy Generators	Jacopo Di Ventura et.al.	2501.16288v2	link
2025-01-27	Zero-Shot Decision Tree Construction via Large Language Models	Lucas Carrasco et.al.	2501.16247v1	null
2025-01-27	CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation	Xiaochuan Ma et.al.	2501.16246v1	null
2025-01-27	SPECIAL: Zero-shot Hyperspectral Image Classification With CLIP	Li Pang et.al.	2501.16222v1	link
2025-01-27	Solving Turbulent Rayleigh-Bénard Convection using Fourier Neural Operators	Michiel Straat et.al.	2501.16209v1	null
2025-01-27	TimeHF: Billion-Scale Time Series Models Guided by Human Feedback	Yongzhi Qi et.al.	2501.15942v1	null
2025-01-27	SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model	Delin Qu et.al.	2501.15830v1	null
2025-01-27	MM-Retinal V2: Transfer an Elite Knowledge Spark into Fundus Vision-Language Pretraining	Ruiqi Wu et.al.	2501.15798v1	link
2025-01-27	GraphICL: Unlocking Graph Learning Potential in LLMs through Structured Prompt Design	Yuanfu Sun et.al.	2501.15755v1	null
2025-01-26	StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces	Kyeongmin Yeo et.al.	2501.15445v1	null
2025-01-24	Calibrating Wireless AI via Meta-Learned Context-Dependent Conformal Prediction	Seonghoon Yoo et.al.	2501.14566v1	null
2025-01-24	Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding	Zhongyi Shui et.al.	2501.14548v1	link
2025-01-24	On Correlating Factors for Domain Adaptation Performance	Goksenin Yuksel et.al.	2501.14466v1	null
2025-01-24	Interpretability Analysis of Domain Adapted Dense Retrievers	Goksenin Yuksel et.al.	2501.14459v1	null
2025-01-24	Remining Hard Negatives for Generative Pseudo Labeled Domain Adaptation	Goksenin Yuksel et.al.	2501.14434v1	null
2025-01-24	GraphBC: Improving LLMs for Better Graph Data Processing	Xu Chu et.al.	2501.14427v1	null
2025-01-24	Kolmogorov Arnold Neural Interpolator for Downscaling and Correcting Meteorological Fields from In-Situ Observations	Zili Liu et.al.	2501.14404v1	null
2025-01-24	Learning Primitive Relations for Compositional Zero-Shot Learning	Insu Lee et.al.	2501.14308v1	null
2025-01-24	A Zero-Shot LLM Framework for Automatic Assignment Grading in Higher Education	Calvin Yeung et.al.	2501.14305v1	link
2025-01-24	PuzzleGPT: Emulating Human Puzzle-Solving Ability for Time and Location Prediction	Hammad Ayyubi et.al.	2501.14210v1	null
2025-01-23	Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference	Shuqi Dai et.al.	2501.13870v1	null
2025-01-23	Dual-Modal Prototype Joint Learning for Compositional Zero-Shot Learning	Shiyu Zhang et.al.	2501.13859v1	null
2025-01-23	Training-Free Zero-Shot Temporal Action Detection with Vision-Language Models	Chaolei Han et.al.	2501.13795v1	null
2025-01-23	Tune In, Act Up: Exploring the Impact of Audio Modality-Specific Edits on Large Audio Language Models in Jailbreak	Erjia Xiao et.al.	2501.13772v1	null
2025-01-23	Training-Free Consistency Pipeline for Fashion Repose	Potito Aghilar et.al.	2501.13692v1	null
2025-01-23	Text-driven Online Action Detection	Manuel Benavent-Lledo et.al.	2501.13518v1	link
2025-01-23	Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks	Ruijia Liu et.al.	2501.13457v1	null
2025-01-23	Scalable Evaluation Framework for Foundation Models in Musculoskeletal MRI Bridging Computational Innovation with Clinical Utility	Gabrielle Hoyer et.al.	2501.13376v1	link
2025-01-23	Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement	Jae-Sung Bae et.al.	2501.13372v1	null
2025-01-22	State Combinatorial Generalization In Decision Making With Conditional Diffusion Models	Xintong Duan et.al.	2501.13241v1	null
2025-01-22	Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation	Akshay Krishnan et.al.	2501.13087v1	null
2025-01-22	Evolution and The Knightian Blindspot of Machine Learning	Joel Lehman et.al.	2501.13075v1	null
2025-01-22	Beyond the Lungs: Extending the Field of View in Chest CT with Latent Diffusion Models	Lianrui Zuo et.al.	2501.13068v1	null
2025-01-22	Correctness Assessment of Code Generated by Large Language Models Using Internal Representations	Tuan-Dung Bui et.al.	2501.12934v1	link
2025-01-22	Patent Figure Classification using Large Vision-language Models	Sushil Awale et.al.	2501.12751v1	link
2025-01-22	Training Dialogue Systems by AI Feedback for Improving Overall Dialogue Impression	Kai Yoshida et.al.	2501.12698v1	null
2025-01-22	Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering	Qian Tao et.al.	2501.12697v1	null
2025-01-22	Can masking background and object reduce static bias for zero-shot action recognition?	Takumi Fukuzawa et.al.	2501.12681v1	null
2025-01-21	fabSAM: A Farmland Boundary Delineation Method Based on the Segment Anything Model	Yufeng Xie et.al.	2501.12487v1	null
2025-01-21	Slot-BERT: Self-supervised Object Discovery in Surgical Video	Guiqiu Liao et.al.	2501.12477v1	null
2025-01-21	Video Depth Anything: Consistent Depth Estimation for Super-Long Videos	Sili Chen et.al.	2501.12375v1	null
2025-01-21	Zero-shot Bias Correction: Efficient MR Image Inhomogeneity Reduction Without Any Data	Hongxu Yang et.al.	2501.12244v1	null
2025-01-21	Survey on Monocular Metric Depth Estimation	Jiuling Zhang et.al.	2501.11841v1	null
2025-01-20	SimLabel: Consistency-Guided OOD Detection with Pretrained Vision-Language Models	Shu Zou et.al.	2501.11485v1	link
2025-01-20	MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching	Yepeng Liu et.al.	2501.11299v1	null
2025-01-20	KPL: Training-Free Medical Knowledge Mining of Vision-Language Models	Jiaxiang Liu et.al.	2501.11231v1	link
2025-01-20	Embedding-Driven Diversity Sampling to Improve Few-Shot Synthetic Data Generation	Ivan Lopez et.al.	2501.11199v1	null
2025-01-19	CART-MPC: Coordinating Assistive Devices for Robot-Assisted Transferring with Multi-Agent Model Predictive Control	Ruolin Ye et.al.	2501.11149v1	null
2025-01-19	Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective	Yiyao Yu et.al.	2501.11110v1	null
2025-01-19	Can LLM Generate Regression Tests for Software Commits?	Jing Liu et.al.	2501.11086v1	null
2025-01-17	FaceXBench: Evaluating Multimodal LLMs on Face Understanding	Kartik Narayan et.al.	2501.10360v1	link
2025-01-17	Zero-Shot Monocular Scene Flow Estimation in the Wild	Yiqing Liang et.al.	2501.10357v1	null
2025-01-17	Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics	Chenhao Li et.al.	2501.10100v1	null
2025-01-17	FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization	Zhaopeng Gu et.al.	2501.10067v1	link
2025-01-17	X-Dyna: Expressive Dynamic Human Image Animation	Di Chang et.al.	2501.10021v1	link
2025-01-17	Attention-guided Self-reflection for Zero-shot Hallucination Detection in Large Language Models	Qiang Liu et.al.	2501.09997v1	null
2025-01-17	GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions	Heda Zuo et.al.	2501.09972v1	null
2025-01-17	MultiPruner: Balanced Structure Removal in Foundation Models	J. Pablo Muñoz et.al.	2501.09949v1	link
2025-01-17	FoundationStereo: Zero-Shot Stereo Matching	Bowen Wen et.al.	2501.09898v1	link
2025-01-17	FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis	Zhe Chen et.al.	2501.09887v1	null
2025-01-16	Comparative Insights from 12 Machine Learning Models in Extracting Economic Ideology from Political Text	Jihed Ncib et.al.	2501.09719v1	null
2025-01-16	DEFOM-Stereo: Depth Foundation Model Based Stereo Matching	Hualie Jiang et.al.	2501.09466v1	link
2025-01-16	Double Visual Defense: Adversarial Pre-training and Instruction Tuning for Improving Vision-Language Model Robustness	Zeyu Wang et.al.	2501.09446v1	null
2025-01-16	Efficient Few-Shot Medical Image Analysis via Hierarchical Contrastive Vision-Language Learning	Harrison Fuller et.al.	2501.09294v1	null
2025-01-16	Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding	Kohei Torimi et.al.	2501.09278v1	null
2025-01-15	Few-Shot Adaptation of Training-Free Foundation Model for 3D Medical Image Segmentation	Xingxin He et.al.	2501.09138v1	null
2025-01-15	Tracking the Takes and Trajectories of English-Language News Narratives across Trustworthy and Worrisome Websites	Hans W. A. Hanley et.al.	2501.09102v1	link
2025-01-15	Multimodal LLMs Can Reason about Aesthetics in Zero-Shot	Ruixiang Jiang et.al.	2501.09012v1	link
2025-01-15	Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot in-Context Learning	Alain Komaty et.al.	2501.08799v1	null
2025-01-15	StereoGen: High-quality Stereo Image Generation from a Single Image	Xianqi Wang et.al.	2501.08654v1	null
2025-01-15	MonSter: Marry Monodepth to Stereo Unleashes Power	Junda Cheng et.al.	2501.08643v1	link
2025-01-15	Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement	Qianniu Chen et.al.	2501.08566v1	null
2025-01-14	FLAVARS: A Multimodal Foundational Language and Vision Alignment Model for Remote Sensing	Isaac Corley et.al.	2501.08490v1	null
2025-01-14	Towards Zero-Shot & Explainable Video Description by Reasoning over Graphs of Events in Space and Time	Mihai Masala et.al.	2501.08460v1	null
2025-01-14	Toward Zero-Shot User Intent Recognition in Shared Autonomy	Atharv Belsare et.al.	2501.08389v1	null
2025-01-14	I Can Find You in Seconds! Leveraging Large Language Models for Code Authorship Attribution	Soohyeon Choi et.al.	2501.08165v1	null
2025-01-14	HydroelasticTouch: Simulation of Tactile Sensors with Hydroelastic Contact Surfaces	David P. Leins et.al.	2501.08077v1	null
2025-01-14	Skeleton and Font Generation Network for Zero-shot Chinese Character Generation	Mobai Xue et.al.	2501.08062v1	null
2025-01-14	Zero-shot Video Moment Retrieval via Off-the-shelf Multimodal Large Language Models	Yifang Xu et.al.	2501.07972v1	null
2025-01-13	Constructing Set-Compositional and Negated Representations for First-Stage Ranking	Antonios Minas Krasakis et.al.	2501.07679v1	null
2025-01-13	BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations	Weixi Feng et.al.	2501.07647v1	null
2025-01-13	Investigating Large Language Models in Inferring Personality Traits from User Conversations	Jianfeng Zhu et.al.	2501.07532v1	null
2025-01-13	Zero-Shot Scene Understanding for Automatic Target Recognition Using Large Vision-Language Models	Yasiru Ranasinghe et.al.	2501.07396v1	null
2025-01-13	Exploring the Use of Contrastive Language-Image Pre-Training for Human Posture Classification: Insights from Yoga Pose Analysis	Andrzej D. Dobrzycki et.al.	2501.07221v1	null
2025-01-14	BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature	Alejandro Lozano et.al.	2501.07171v2	link
2025-01-13	Duplex: Dual Prototype Learning for Compositional Zero-Shot Learning	Zhong Peng et.al.	2501.07114v1	null
2025-01-10	OpenFOAMGPT: a RAG-Augmented LLM Agent for OpenFOAM-Based Computational Fluid Dynamics	Sandeep Pandey et.al.	2501.06327v1	null
2025-01-10	Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks	Kevin Fu et.al.	2501.06058v1	link
2025-01-10	Generate, Transduct, Adapt: Iterative Transduction with VLMs	Oindrila Saha et.al.	2501.06031v1	null
2025-01-10	Low-Resource Text-to-Speech Synthesis Using Noise-Augmented Training of ForwardTacotron	Kishor Kayyar Lakshminarayana et.al.	2501.05976v1	null
2025-01-10	MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model	Matthew Baas et.al.	2501.05787v1	null
2025-01-10	Super-class guided Transformer for Zero-Shot Attribute Classification	Sehyung Kim et.al.	2501.05728v1	link
2025-01-10	Zero-shot Shark Tracking and Biometrics from Aerial Imagery	Chinmay K Lalgudi et.al.	2501.05717v1	null
2025-01-10	The Impact of Model Scaling on Seen and Unseen Language Performance	Rhitabrat Pokharel et.al.	2501.05629v1	null
2025-01-09	FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion	Alef Iury Siqueira Ferreira et.al.	2501.05586v1	link
2025-01-09	Vision-Language Models for Autonomous Driving: CLIP-Based Dynamic Scene Understanding	Mohammed Elhenawy et.al.	2501.05566v1	null
2025-01-09	Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence	Hung Huy Nguyen et.al.	2501.05555v1	link
2025-01-09	CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models	Fabian Hörst et.al.	2501.05269v1	link
2025-01-09	Harnessing Large Language and Vision-Language Models for Robust Out-of-Distribution Detection	Pei-Kang Lee et.al.	2501.05228v1	null
2025-01-09	Leveraging Large Language Models for Zero-shot Lay Summarisation in Biomedicine and Beyond	Tomas Goldsack et.al.	2501.05224v1	null
2025-01-09	SpaLLM-Guard: Pairing SMS Spam Detection Using Open-source and Commercial LLMs	Muhammad Salman et.al.	2501.04985v1	null
2025-01-08	Test-Time Optimization for Domain Adaptive Open Vocabulary Segmentation	Ulindu De Silva et.al.	2501.04696v1	link
2025-01-08	Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding	Joshua Jones et.al.	2501.04693v1	null
2025-01-08	DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests	Charles Corbière et.al.	2501.04671v1	null
2025-01-08	A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI	Kazusato Oko et.al.	2501.04641v1	link
2025-01-09	OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis	Run Luo et.al.	2501.04561v2	link
2025-01-08	Hidden Entity Detection from GitHub Leveraging Large Language Models	Lu Gan et.al.	2501.04455v1	link
2025-01-08	Dual-Force: Enhanced Offline Diversity Maximization under Imitation Constraints	Pavel Kolev et.al.	2501.04426v1	null
2025-01-08	ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training	Xinfa Zhu et.al.	2501.04416v1	null
2025-01-08	DispFormer: Pretrained Transformer for Flexible Dispersion Curve Inversion from Global Synthesis to Regional Applications	Feng Liu et.al.	2501.04366v1	link
2025-01-08	Online Gaussian Test-Time Adaptation of Vision-Language Models	Clément Fuchs et.al.	2501.04352v1	link
2025-01-07	Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection	Pablo Miralles-González et.al.	2501.03940v1	null
2025-01-07	ZDySS -- Zero-Shot Dynamic Scene Stylization using Gaussian Splatting	Abhishek Saroha et.al.	2501.03875v1	null
2025-01-07	Improving Dialectal Slot and Intent Detection with Auxiliary Tasks: A Multi-Dialectal Bavarian Case Study	Xaver Maria Krückl et.al.	2501.03863v1	link
2025-01-07	OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints	Mingjie Pan et.al.	2501.03841v1	null
2025-01-07	MADation: Face Morphing Attack Detection with Foundation Models	Eduarda Caldeira et.al.	2501.03800v1	link
2025-01-07	KAnoCLIP: Zero-Shot Anomaly Detection through Knowledge-Driven Prompt Learning and Enhanced Cross-Modal Integration	Chengyuan Li et.al.	2501.03786v1	null
2025-01-07	Context-Alignment: Activating and Enhancing LLM Capabilities in Time Series	Yuxiao Hu et.al.	2501.03747v1	null
2025-01-07	Realistic Test-Time Adaptation of Vision-Language Models	Maxime Zanella et.al.	2501.03729v1	link
2025-01-07	Exploring Optimal Latent Trajetory for Zero-shot Image Editing	Maomao Li et.al.	2501.03631v1	null
2025-01-07	LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment	Gaoussou Youssouf Kebe et.al.	2501.03624v1	null
2025-01-06	Gaussian Masked Autoencoders	Jathushan Rajasegaran et.al.	2501.03229v1	null
2025-01-06	GLiREL -- Generalist Model for Zero-Shot Relation Extraction	Jack Boylan et.al.	2501.03172v1	link
2025-01-06	Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy	Risha Goel et.al.	2501.03153v1	link
2025-01-07	Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild	Wanpeng Hu et.al.	2501.02964v2	link
2025-01-06	Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots	Sahar Salimpour et.al.	2501.02902v1	link
2025-01-06	Universal Features Guided Zero-Shot Category-Level Object Pose Estimation	Wentian Qu et.al.	2501.02831v1	null
2025-01-06	Holistic Semantic Representation for Navigational Trajectory Generation	Ji Cao et.al.	2501.02737v1	link
2025-01-06	EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models	Andrés Villa et.al.	2501.02699v1	null
2025-01-05	LLMs Help Alleviate the Cross-Subject Variability in Brain Signal and Language Alignment	Yifei Liu et.al.	2501.02621v1	null
2025-01-05	CHAIR-Classifier of Hallucination as Improver	Ao Sun et.al.	2501.02518v1	link
2025-01-03	IGAF: Incremental Guided Attention Fusion for Depth Super-Resolution	Athanasios Tragakis et.al.	2501.01723v1	null
2025-01-03	LLMs & Legal Aid: Understanding Legal Needs Exhibited Through User Queries	Michal Kuk et.al.	2501.01711v1	null
2025-01-03	GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models	Zhangyang Qi et.al.	2501.01428v2	null
2025-01-02	VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control	Yuanpeng Tu et.al.	2501.01427v1	null
2025-01-02	Unifying Specialized Visual Encoders for Video Language Models	Jihoon Chung et.al.	2501.01426v1	link
2025-01-03	AdaptVC: High Quality Voice Conversion with Adaptive Learning	Jaehun Kim et.al.	2501.01347v2	null
2025-01-02	Digital Guardians: Can GPT-4, Perspective API, and Moderation API reliably detect hate speech in reader comments of German online newspapers?	Manuel Weber et.al.	2501.01256v1	null
2025-01-02	Automated Self-Refinement and Self-Correction for LLM-based Product Attribute Value Extraction	Alexander Brinkmann et.al.	2501.01237v1	link
2025-01-02	Symmetries-enhanced Multi-Agent Reinforcement Learning	Nikolaos Bousias et.al.	2501.01136v1	null
2025-01-03	MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization	Haina Zhu et.al.	2501.01108v2	link
2025-01-02	Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice	Federico Ravenda et.al.	2501.00982v1	link
2025-01-01	Text2Earth: Unlocking Text-driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model	Chenyang Liu et.al.	2501.00895v1	null
2024-12-30	QuantumLLMInstruct: A 500k LLM Instruction-Tuning Dataset with Problem-Solution Pairs for Quantum Computing	Shlomo Kashani et.al.	2412.20956v1	null
2024-12-30	Navigating Chemical-Linguistic Sharing Space with Heterogeneous Molecular Encoding	Liuzhenghao Lv et.al.	2412.20888v1	link
2024-12-30	TimeRAF: Retrieval-Augmented Foundation model for Zero-shot Time Series Forecasting	Huanyu Zhang et.al.	2412.20810v1	null
2024-12-30	Learning to Rank Pre-trained Vision-Language Models for Downstream Tasks	Yuhe Ding et.al.	2412.20682v1	null
2024-12-29	Zero-Shot Image Restoration Using Few-Step Guidance of Consistency Models (and Beyond)	Tomer Garber et.al.	2412.20596v1	link
2024-12-27	Zero-shot Hazard Identification in Autonomous Driving: A Case Study on the COOOL Benchmark	Lukas Picek et.al.	2412.19944v1	null
2024-12-27	EEG-Reptile: An Automatized Reptile-Based Meta-Learning Library for BCIs	Daniil A. Berdyshev et.al.	2412.19725v1	link
2024-12-30	VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models	Tao Wu et.al.	2412.19645v2	null
2024-12-27	MINIMA: Modality Invariant Image Matching	Xingyu Jiang et.al.	2412.19412v1	link
2024-12-26	Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment	Ziang Yan et.al.	2412.19326v1	link
2024-12-26	RecLM: Recommendation Instruction Tuning	Yangqin Jiang et.al.	2412.19302v1	link
2024-12-26	Time Series Foundational Models: Their Role in Anomaly Detection and Prediction	Chathurangi Shyalika et.al.	2412.19286v1	link
2024-12-26	Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval	Yang Du et.al.	2412.19178v1	link
2024-12-26	CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting	Siyu Jiao et.al.	2412.19142v1	null
2024-12-26	Semantic Residual for Multimodal Unified Discrete Representation	Hai Huang et.al.	2412.19128v1	null
2024-12-26	Advanced Knowledge Transfer: Refined Feature Distillation for Zero-Shot Quantization in Edge Computing	Inpyo Hong et.al.	2412.19125v1	link
2024-12-24	Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models	Zehan Wang et.al.	2412.18605v1	link
2024-12-24	ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation	Hongjie Li et.al.	2412.18600v1	null
2024-12-24	Distilling Fine-grained Sentiment Understanding from Large Language Models	Yice Zhang et.al.	2412.18552v1	link
2024-12-24	The Key of Understanding Vision Tasks: Explanatory Instructions	Yang Shen et.al.	2412.18525v1	link
2024-12-24	Multilingual Mathematical Reasoning: Advancing Open-Source LLMs in Hindi and English	Avinash Anand et.al.	2412.18415v1	link
2024-12-24	Extract Free Dense Misalignment from CLIP	JeongYeon Nam et.al.	2412.18404v1	link
2024-12-24	A Zero-Shot Physics-Informed Dictionary Learning Approach for Sound Field Reconstruction	Stefano Damiano et.al.	2412.18348v1	link
2024-12-24	Efficient and Context-Aware Label Propagation for Zero-/Few-Shot Training-Free Adaptation of Vision-Language Model	Yushu Li et.al.	2412.18303v1	null
2024-12-24	Quo Vadis, Anomaly Detection? LLMs and VLMs in the Spotlight	Xi Ding et.al.	2412.18298v1	link
2024-12-24	Improved Feature Generating Framework for Transductive Zero-shot Learning	Zihan Ye et.al.	2412.18282v1	null
2024-12-23	CiteBART: Learning to Generate Citations for Local Citation Recommendation	Ege Yiğit Çelik et.al.	2412.17534v1	link
2024-12-23	Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio	Gongyu Chen et.al.	2412.17306v1	null
2024-12-23	Discriminative Image Generation with Diffusion Models for Zero-Shot Learning	Dingjie Fu et.al.	2412.17219v1	null
2024-12-22	Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis	Ye-Xin Lu et.al.	2412.16977v1	null
2024-12-22	Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation	Quan Dao et.al.	2412.16906v1	null
2024-12-22	Autoregressive Speech Synthesis with Next-Distribution Prediction	Xinfa Zhu et.al.	2412.16846v1	null
2024-12-21	RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing	Zhipeng Huang et.al.	2412.16778v1	null
2024-12-21	HyperCLIP: Adapting Vision-Language models with Hypernetworks	Victor Akinwande et.al.	2412.16777v1	null
2024-12-21	Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval	Luo Ji et.al.	2412.16615v1	link
2024-12-21	Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling	Daichi Yashima et.al.	2412.16576v1	link
2024-12-20	Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts	Muhammad Abdullah Sohail et.al.	2412.16119v1	link
2024-12-20	CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up	Songhua Liu et.al.	2412.16112v1	link
2024-12-20	Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers	Yifan Yang et.al.	2412.16102v1	null
2024-12-20	Fearful Falcons and Angry Llamas: Emotion Category Annotations of Arguments by Humans and LLMs	Lynn Greschner et.al.	2412.15993v1	null
2024-12-20	Watertox: The Art of Simplicity in Universal Attacks A Cross-Model Framework for Robust Adversarial Generation	Zhenghao Gao et.al.	2412.15924v1	null
2024-12-20	On the Suitability of pre-trained foundational LLMs for Analysis in German Legal Education	Lorenz Wendlinger et.al.	2412.15902v1	null
2024-12-20	AutoLife: Automatic Life Journaling with Smartphones and LLMs	Huatao Xu et.al.	2412.15714v1	null
2024-12-20	Cracking the Code: Evaluating Zero-Shot Prompting Methods for Providing Programming Feedback	Niklas Ippisch et.al.	2412.15702v1	null
2024-12-20	SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training	Wenxi Chen et.al.	2412.15649v1	link
2024-12-20	A New Method to Capturing Compositional Knowledge in Linguistic Space	Jiahe Wan et.al.	2412.15632v1	null
2024-12-19	Face the Facts! Evaluating RAG-based Fact-checking Pipelines in Realistic Settings	Daniel Russo et.al.	2412.15189v1	link
2024-12-19	STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning	Marius Memmel et.al.	2412.15182v1	null
2024-12-19	Adaptive Pruning for Large Language Models with Structural Importance Awareness	Haotian Zheng et.al.	2412.15127v1	null
2024-12-19	Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling	Leying Zhang et.al.	2412.14890v1	null
2024-12-19	Zero-Shot Artifact2Artifact: Self-incentive artifact removal for photoacoustic imaging without any data	Shuang Li et.al.	2412.14873v1	link
2024-12-19	Extending TWIG: Zero-Shot Predictive Hyperparameter Selection for KGEs based on Graph Structure	Jeffrey Sardina et.al.	2412.14801v1	null
2024-12-19	Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning	Kepu Zhang et.al.	2412.14588v1	null
2024-12-19	MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval	Junjie Zhou et.al.	2412.14475v1	null
2024-12-19	WildSAT: Learning Satellite Image Representations from Wildlife Observations	Rangel Daroya et.al.	2412.14428v1	null
2024-12-18	I0T: Embedding Standardization Method Towards Zero Modality Gap	Na Min An et.al.	2412.14384v1	link
2024-12-18	Autoregressive Video Generation without Vector Quantization	Haoge Deng et.al.	2412.14169v1	link
2024-12-18	Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation	Jianyu Zhang et.al.	2412.14145v1	null
2024-12-18	Foundation Models Meet Low-Cost Sensors: Test-Time Adaptation for Rescaling Disparity for Zero-Shot Metric Depth Estimation	Rémi Marsal et.al.	2412.14103v1	null
2024-12-18	FarExStance: Explainable Stance Detection for Farsi	Majid Zarharan et.al.	2412.14008v1	link
2024-12-18	Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition	Ethan Baron et.al.	2412.13947v1	null
2024-12-18	Memorizing SAM: 3D Medical Segment Anything Model with Memorizing Transformer	Xinyuan Shao et.al.	2412.13908v1	link
2024-12-18	Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models	Anna Scius-Bertrand et.al.	2412.13859v1	null
2024-12-18	SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor	Chenyu Yang et.al.	2412.13786v1	null
2024-12-18	G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o	Tony Cheng Tong et.al.	2412.13647v1	link
2024-12-18	Reverse Region-to-Entity Annotation for Pixel-Level Visual Entity Linking	Zhengfei Xu et.al.	2412.13614v1	null
2024-12-17	GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding	Haoyi Jiang et.al.	2412.13193v1	link
2024-12-17	A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis	Xiao Zhou et.al.	2412.13126v1	null
2024-12-17	Enabling Low-Resource Language Retrieval: Establishing Baselines for Urdu MS MARCO	Umer Butt et.al.	2412.12997v1	link
2024-12-17	An Agentic Approach to Automatic Creation of P&ID Diagrams from Natural Language Descriptions	Shreeyash Gowaikar et.al.	2412.12898v1	null
2024-12-17	Question: How do Large Language Models perform on the Question Answering tasks? Answer:	Kevin Fischer et.al.	2412.12893v1	null
2024-12-17	MIVE: New Design and Benchmark for Multi-Instance Video Editing	Samuel Teodoro et.al.	2412.12877v1	null
2024-12-17	Comparative Analysis of Zero-Shot Capability of Time-Series Foundation Models in Short-Term Load Prediction	Nan Lin et.al.	2412.12834v1	null
2024-12-17	FocusChat: Text-guided Long Video Understanding via Spatiotemporal Information Filtering	Zheng Cheng et.al.	2412.12833v1	null
2024-12-17	Cross-Dialect Information Retrieval: Information Access in Low-Resource and High-Variance Languages	Robert Litschko et.al.	2412.12806v1	link
2024-12-17	ZoRI: Towards Discriminative Zero-Shot Remote Sensing Instance Segmentation	Shiqi Huang et.al.	2412.12798v1	link
2024-12-16	Causal Diffusion Transformers for Generative Modeling	Chaorui Deng et.al.	2412.12095v1	link
2024-12-16	CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology	Yuxuan Sun et.al.	2412.12077v1	null
2024-12-16	A LoRA is Worth a Thousand Pictures	Chenxi Liu et.al.	2412.12048v1	null
2024-12-16	Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps	Linfeng Zhao et.al.	2412.12024v1	null
2024-12-16	Cost-Effective Label-free Node Classification with LLMs	Taiyan Zhang et.al.	2412.11983v1	null
2024-12-16	Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning	Yuti Liu et.al.	2412.11952v1	null
2024-12-16	Stepwise Reasoning Error Disruption Attack of LLMs	Jingyu Peng et.al.	2412.11934v1	null
2024-12-16	PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection	Sepideh Mamooler et.al.	2412.11923v1	null
2024-12-16	Improved Models for Media Bias Detection and Subcategorization	Tim Menzner et.al.	2412.11835v1	null
2024-12-16	A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation	Tian-Yi Che et.al.	2412.11832v1	null
2024-12-13	UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities	Muhammad Uzair Khattak et.al.	2412.10372v1	link
2024-12-13	Reasoner Outperforms: Generative Stance Detection with Rationalization for Social Media	Jiaqing Yuan et.al.	2412.10266v1	null
2024-12-13	Efficient Generative Modeling with Residual Vector Quantization-Based Tokens	Jaehyeon Kim et.al.	2412.10208v1	null
2024-12-13	Constraint-Aware Zero-Shot Vision-Language Navigation in Continuous Environments	Kehan Chen et.al.	2412.10137v1	null
2024-12-13	Familiarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data	Jonas Golde et.al.	2412.10121v1	link
2024-12-13	Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP	Yating Yu et.al.	2412.09895v1	link
2024-12-13	CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection	Qibo Chen et.al.	2412.09799v1	null
2024-12-12	Toward Foundation Model for Multivariate Wearable Sensing of Physiological Signals	Yunfei Luo et.al.	2412.09758v1	link
2024-12-12	Should We Learn Contact-Rich Manipulation Policies from Sampling-Based Planners?	Huaijiang Zhu et.al.	2412.09743v1	null
2024-12-12	TransferLight: Zero-Shot Traffic Signal Control on any Road-Network	Johann Schmidt et.al.	2412.09719v1	null
2024-12-12	EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM	Zhuofan Zong et.al.	2412.09618v1	null
2024-12-12	Learning to Adapt: Bio-Inspired Gait Strategies for Versatile Quadruped Locomotion	Joseph Humphreys et.al.	2412.09440v1	null
2024-12-12	Distribution free uncertainty quantification in neuroscience-inspired deep operators	Shailesh Garg et.al.	2412.09369v1	null
2024-12-12	Towards Open-Vocabulary Video Semantic Segmentation	Xinhao Li et.al.	2412.09329v1	link
2024-12-12	T-SVG: Text-Driven Stereoscopic Video Generation	Qiao Jin et.al.	2412.09323v1	null
2024-12-12	Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine	Xiaoshuang Huang et.al.	2412.09278v1	link
2024-12-12	Pinpoint Counterfactuals: Reducing social bias in foundation models via localized counterfactual generation	Kirill Sirotkin et.al.	2412.09160v1	null
2024-12-12	Evaluating Pixel Language Models on Non-Standardized Languages	Alberto Muñoz-Ortiz et.al.	2412.09084v1	null
2024-12-12	Cross-View Completion Models are Zero-shot Correspondence Estimators	Honggyu An et.al.	2412.09072v1	null
2024-12-13	An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques	Chunxiao Li et.al.	2412.09063v2	null
2024-12-11	RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation	Mingfei Han et.al.	2412.08591v1	null
2024-12-11	SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting	Pallavi Jain et.al.	2412.08536v1	link
2024-12-11	SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation	Tapas Kumar Dutta et.al.	2412.08482v1	link
2024-12-11	Assessing Personalized AI Mentoring with Large Language Models in the Computing Field	Xiao Luo et.al.	2412.08430v1	null
2024-12-11	Zero-Shot Mono-to-Binaural Speech Synthesis	Alon Levkovitch et.al.	2412.08356v1	null
2024-12-11	BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language	Nikolay Banar et.al.	2412.08329v1	null
2024-12-11	Lightweight Method for Interactive 3D Medical Image Segmentation with Multi-Round Result Fusion	Bingzhi Shen et.al.	2412.08315v1	null
2024-12-11	2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset	Marta R. Costa-jussà et.al.	2412.08274v1	null
2024-12-11	Large Language Models for Scholarly Ontology Generation: An Extensive Analysis in the Engineering Field	Tanay Aggarwal et.al.	2412.08258v1	link
2024-12-11	Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?	Zihao Li et.al.	2412.08174v1	null
2024-12-10	Video Motion Transfer with Diffusion Transformers	Alexander Pondaven et.al.	2412.07776v1	link
2024-12-10	From Slow Bidirectional to Fast Causal Video Generators	Tianwei Yin et.al.	2412.07772v1	null
2024-12-11	Test-time Correction with Human Feedback: An Online 3D Detection System via Visual Prompting	Zetong Yang et.al.	2412.07768v2	null
2024-12-10	SAT: Spatial Aptitude Training for Multimodal Language Models	Arijit Ray et.al.	2412.07755v1	null
2024-12-10	Zero-Shot ATC Coding with Large Language Models for Clinical Assessments	Zijian Chen et.al.	2412.07743v1	null
2024-12-10	DriveMM: All-in-One Large Multimodal Model for Autonomous Driving	Zhijian Huang et.al.	2412.07689v1	link
2024-12-10	Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions	Anant Prakash Awasthi et.al.	2412.07687v1	null
2024-12-10	FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing	Yingying Deng et.al.	2412.07517v1	link
2024-12-10	ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning	Hongshu Guo et.al.	2412.07507v1	null
2024-12-10	Bilingual BSARD: Extending Statutory Article Retrieval to Dutch	Ehsan Lotfi et.al.	2412.07462v1	null
2024-12-09	Visual Lexicon: Rich Image Features in Language Space	XuDong Wang et.al.	2412.06774v1	null
2024-12-09	JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM	Takuro Fujii et.al.	2412.06738v1	link
2024-12-09	You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale	Baorui Ma et.al.	2412.06699v1	link
2024-12-09	Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation	Shun Zhang et.al.	2412.06664v1	null
2024-12-09	LLM-BIP: Structured Pruning for Large Language Models with Block-Wise Forward Importance Propagation	Haihang Wu et.al.	2412.06419v1	null
2024-12-09	Continual Learning for Segment Anything Model Adaptation	Jinglong Yang et.al.	2412.06418v1	link
2024-12-09	ZeroKey: Point-Level Reasoning and Zero-Shot 3D Keypoint Detection from Large Language Models	Bingchen Gong et.al.	2412.06292v1	null
2024-12-09	No Annotations for Object Detection in Art through Stable Diffusion	Patrick Ramos et.al.	2412.06286v1	link
2024-12-09	DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction	Yunheng Li et.al.	2412.06244v1	null
2024-12-09	Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings	Zhao Liu et.al.	2412.06134v1	link
2024-12-06	DenseMatcher: Learning 3D Semantic Correspondence for Category-Level Manipulation from a Single Demo	Junzhe Zhu et.al.	2412.05268v1	null
2024-12-06	Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization	Luca Masserano et.al.	2412.05244v1	null
2024-12-06	Towards Understanding the Role of Sharpness-Aware Minimization Algorithms for Out-of-Distribution Generalization	Samuel Schapiro et.al.	2412.05169v1	null
2024-12-06	A Practical Examination of AI-Generated Text Detectors for Large Language Models	Brian Tufts et.al.	2412.05139v1	null
2024-12-06	Can Large Language Models Serve as Effective Classifiers for Hierarchical Multi-Label Classification of Scientific Documents at Industrial Scale?	Seyed Amin Tabatabaei et.al.	2412.05137v1	null
2024-12-06	The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation	Ruoyu Wang et.al.	2412.05101v1	null
2024-12-06	HOLa: HoloLens Object Labeling	Michael Schwimmbeck et.al.	2412.04945v1	link
2024-12-06	$S^3$: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models	Xiaojie Yin et.al.	2412.04925v1	null
2024-12-06	StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching	Jixun Yao et.al.	2412.04724v1	null
2024-12-06	LLM-Align: Utilizing Large Language Models for Entity Alignment in Knowledge Graphs	Xuan Chen et.al.	2412.04690v1	null
2024-12-05	Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail	Luca Bartolomei et.al.	2412.04472v1	link
2024-12-05	Grounding Descriptions in Images informs Zero-Shot Visual Recognition	Shaunak Halbe et.al.	2412.04429v1	link
2024-12-05	SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding	Rong Li et.al.	2412.04383v1	null
2024-12-05	Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting	Edoardo Cetin et.al.	2412.04368v1	null
2024-12-05	Towards Zero-shot 3D Anomaly Localization	Yizhou Wang et.al.	2412.04304v1	null
2024-12-05	3D Part Segmentation via Geometric Aggregation of 2D Visual Features	Marco Garosi et.al.	2412.04247v1	null
2024-12-05	Quantifying the Limits of Segment Anything Model: Analyzing Challenges in Segmenting Tree-Like and Low-Contrast Structures	Yixin Zhang et.al.	2412.04243v1	link
2024-12-05	Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image	Shuang Xu et.al.	2412.04201v1	null
2024-12-05	Unified Framework for Open-World Compositional Zero-shot Learning	Hirunima Jayasekara et.al.	2412.04083v1	link
2024-12-05	Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning	Shicheng Zhou et.al.	2412.04078v1	link
2024-12-04	The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control	Ruili Feng et.al.	2412.03568v1	null
2024-12-04	FLAIR: VLM with Fine-grained Language-informed Image Representations	Rui Xiao et.al.	2412.03561v1	link
2024-12-04	Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression	Junjie Wen et.al.	2412.03293v1	null
2024-12-04	Expanding Event Modality Applications through a Robust CLIP-Based Encoder	Sungheon Jeong et.al.	2412.03093v1	null
2024-12-04	ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction	Victor Junqiu Wei et.al.	2412.03075v1	null
2024-12-04	UTSD: Unified Time Series Diffusion Model	Xiangkai Ma et.al.	2412.03068v1	null
2024-12-03	A Novel Compact LLM Framework for Local, High-Privacy EHR Data Applications	Yixiang Qu et.al.	2412.02868v1	null
2024-12-03	Is Large-Scale Pretraining the Secret to Good Domain Generalization?	Piotr Teterwak et.al.	2412.02856v1	null
2024-12-03	Enhancing Robustness of CLIP to Common Corruptions through Bimodal Test-Time Adaptation	Sarthak Kumar Maharana et.al.	2412.02837v1	null
2024-12-03	Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects	Abdurrahman Zeybey et.al.	2412.02803v1	null
2024-12-03	FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation	Kefan Chen et.al.	2412.02690v1	null
2024-12-03	Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks	Jinjin Cai et.al.	2412.02531v1	null
2024-12-03	LoRA Diffusion: Zero-Shot LoRA Synthesis for Diffusion Model Personalization	Ethan Smith et.al.	2412.02352v1	null
2024-12-03	Improving Language Transfer Capability of Decoder-only Architecture in Multilingual Neural Machine Translation	Zhi Qu et.al.	2412.02101v1	link
2024-12-03	Gaussian Object Carver: Object-Compositional Gaussian Splatting with surfaces completion	Liu Liu et.al.	2412.02075v1	link
2024-12-02	PKRD-CoT: A Unified Chain-of-thought Prompting for Multi-Modal Large Language Models in Autonomous Driving	Xuewen Luo et.al.	2412.02025v1	null
2024-12-04	The use of large language models to enhance cancer clinical trial educational materials	Mingye Gao et.al.	2412.01955v2	null
2024-12-02	RandAR: Decoder-only Autoregressive Visual Generation in Random Orders	Ziqi Pang et.al.	2412.01827v1	null
2024-12-02	COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training	Sanghwan Kim et.al.	2412.01814v1	link
2024-12-02	Hard Constraint Guided Flow Matching for Gradient-Free Generation of PDE Solutions	Chaoran Cheng et.al.	2412.01786v1	null
2024-12-02	T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs	Shukang Yin et.al.	2411.19951v2	link
2024-11-29	Reverse Thinking Makes LLMs Stronger Reasoners	Justin Chih-Yao Chen et.al.	2411.19865v1	null
2024-11-29	Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures	Alain Riou et.al.	2411.19806v1	null
2024-11-29	Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models	Kaican Li et.al.	2411.19757v1	link
2024-11-29	Multimodal Whole Slide Foundation Model for Pathology	Tong Ding et.al.	2411.19666v1	link
2024-11-29	LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification	Taja Kuzman et.al.	2411.19638v1	link
2024-11-29	Diorama: Unleashing Zero-shot Single-view 3D Scene Modeling	Qirui Wu et.al.	2411.19492v1	null
2024-11-29	Proto Successor Measure: Representing the Space of All Possible Solutions of Reinforcement Learning	Siddhant Agarwal et.al.	2411.19418v1	null
2024-11-28	CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections	Mohamed Fazli Imam et.al.	2411.19346v1	link
2024-11-28	OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration	Yiming Zuo et.al.	2411.19278v1	link
2024-11-27	Diffusion Self-Distillation for Zero-Shot Customized Image Generation	Shengqu Cai et.al.	2411.18616v1	null
2024-11-27	Isolating authorship from content with semantic embeddings and contrastive learning	Javier Huertas-Tato et.al.	2411.18472v1	null
2024-11-27	SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation	Duc-Hai Pham et.al.	2411.18229v1	null
2024-11-27	DRS: Deep Question Reformulation With Structured Output	Zhecheng Li et.al.	2411.17993v1	link
2024-11-26	Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient	Zigeng Chen et.al.	2411.17787v1	link
2024-11-26	MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation	Harsh Singh et.al.	2411.17636v1	null
2024-11-26	ShowUI: One Vision-Language-Action Model for GUI Visual Agent	Kevin Qinghong Lin et.al.	2411.17465v1	link
2024-11-26	FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval	Jingyou Xie et.al.	2411.17454v1	null
2024-11-26	PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning	Zhen Sun et.al.	2411.17453v1	null
2024-11-26	CoA: Chain-of-Action for Generative Semantic Labels	Meng Wei et.al.	2411.17406v1	link
2024-11-26	vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation	Bastian Wittmann et.al.	2411.17386v1	link
2024-11-26	2D Matryoshka Training for Information Retrieval	Shuai Wang et.al.	2411.17299v1	link
2024-11-26	APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents	Jun Yu Chen et.al.	2411.17255v1	link
2024-11-26	Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors	Zhengfei Kuang et.al.	2411.17249v1	null
2024-11-26	Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration	Junyuan Deng et.al.	2411.17240v1	link
2024-11-25	Diffusion Features for Zero-Shot 6DoF Object Pose Estimation	Bernd Von Gimborn et.al.	2411.16668v1	null
2024-11-25	Generating Out-Of-Distribution Scenarios Using Language Models	Erfan Aasi et.al.	2411.16554v1	null
2024-11-25	TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation	Linqing Zhong et.al.	2411.16425v1	null
2024-11-25	Poster: Could Large Language Models Perform Network Management?	Zine el abidine Kherroubi et.al.	2411.16232v1	null
2024-11-25	SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context	Jungang Li et.al.	2411.16213v1	null
2024-11-25	Learn from Foundation Model: Fruit Detection Model without Manual Annotation	Yanan Wang et.al.	2411.16196v1	link
2024-11-25	Language Driven Occupancy Prediction	Zhu Yu et.al.	2411.16072v1	link
2024-11-25	Style-Pro: Style-Guided Prompt Learning for Generalizable Vision-Language Models	Niloufar Alipour Talemi et.al.	2411.16018v1	null
2024-11-24	PIANIST: Learning Partially Observable World Models with LLMs for Multi-Agent Decision Making	Jonathan Light et.al.	2411.15998v1	null
2024-11-24	Segment to Recognize Robustly -- Enhancing Recognition by Image Decomposition	Klara Janouskova et.al.	2411.15933v1	null
2024-11-22	Context-Aware Multimodal Pretraining	Karsten Roth et.al.	2411.15099v1	null
2024-11-22	Task-Aware Robotic Grasping by evaluating Quality Diversity Solutions through Foundation Models	Aurel X. Appius et.al.	2411.14917v1	null
2024-11-22	Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation	Huy Le et.al.	2411.14913v1	null
2024-11-22	Leveraging Hierarchical Prototypes as the Verbalizer for Implicit Discourse Relation Recognition	Wanqiu Long et.al.	2411.14880v1	null
2024-11-22	VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models	Camilo Chacón Sartori et.al.	2411.14832v1	null
2024-11-22	De-biased Multimodal Electrocardiogram Analysis	Haitao Li et.al.	2411.14795v1	null
2024-11-22	Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers	Hongbo Liu et.al.	2411.14789v1	null
2024-11-21	Solving Zero-Shot 3D Visual Grounding as Constraint Satisfaction Problems	Qihao Yuan et.al.	2411.14594v1	link
2024-11-21	Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding	Yiming Zhang et.al.	2411.14401v1	null
2024-11-21	DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding	Tianhe Ren et.al.	2411.14347v1	link
2024-11-21	StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart	Jian Shi et.al.	2411.14295v1	null
2024-11-21	Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models	Iacopo Ghinassi et.al.	2411.14272v1	link
2024-11-21	Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs	Zeyu Dong et.al.	2411.14256v1	null
2024-11-21	Evaluating the Robustness of Analogical Reasoning in Large Language Models	Martha Lewis et.al.	2411.14215v1	link
2024-11-21	Stereo Anything: Unifying Stereo Matching with Large-Scale Mixed Data	Xianda Guo et.al.	2411.14053v1	link
2024-11-21	Zero-Shot Low-Light Image Enhancement via Joint Frequency Domain Priors Guided Diffusion	Jinhong He et.al.	2411.13961v1	link
2024-11-21	Learning to Cooperate with Humans using Generative Agents	Yancheng Liang et.al.	2411.13934v1	link
2024-11-21	CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation	Lin Sun et.al.	2411.13836v1	link
2024-11-20	Find Any Part in 3D	Ziqi Ma et.al.	2411.13550v1	null
2024-11-20	BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework	Xu Zou et.al.	2411.13237v1	null
2024-11-20	Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding	Nabeel Seedat et.al.	2411.13163v1	null
2024-11-20	Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM	Jiawei Yu et.al.	2411.13159v1	null
2024-11-20	Learning Time-Optimal and Speed-Adjustable Tactile In-Hand Manipulation	Johannes Pitz et.al.	2411.13148v1	null
2024-11-20	TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models	Xin Wang et.al.	2411.13136v1	null
2024-11-20	Training Physics-Driven Deep Learning Reconstruction without Raw Data Access for Equitable Fast MRI	Yaşar Utku Alçalar et.al.	2411.13022v1	null
2024-11-20	Evaluating LLMs Capabilities Towards Understanding Social Dynamics	Anique Tahir et.al.	2411.13008v1	null
2024-11-19	Improving Controllability and Editability for Pretrained Text-to-Music Generation Models	Yixiao Zhang et.al.	2411.12641v1	null
2024-11-19	Instant Policy: In-Context Imitation Learning via Graph Diffusion	Vitalis Vosylius et.al.	2411.12633v1	null
2024-11-19	SAM Carries the Burden: A Semi-Supervised Approach Refining Pseudo Labels for Medical Segmentation	Ron Keuth et.al.	2411.12602v1	link
2024-11-19	Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing	Ruyi Ding et.al.	2411.12508v1	null
2024-11-19	Predicting User Intents and Musical Attributes from Music Discovery Conversations	Daeyong Kwon et.al.	2411.12254v1	link
2024-11-19	Zero-Shot Crate Digging: DJ Tool Retrieval Using Speech Activity, Music Structure And CLAP Embeddings	Iroro Orife et.al.	2411.12209v1	link
2024-11-19	A More Advanced Group Polarization Measurement Approach Based on LLM-Based Agents and Graphs	Zixin Liu et.al.	2411.12196v1	null
2024-11-19	UrbanDiT: A Foundation Model for Open-World Urban Spatio-Temporal Learning	Yuan Yuan et.al.	2411.12164v1	link
2024-11-19	HEIGHT: Heterogeneous Interaction Graph Transformer for Robot Navigation in Crowded and Constrained Environments	Shuijing Liu et.al.	2411.12150v1	null
2024-11-18	VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation	Bangguo Yu et.al.	2411.11609v1	null
2024-11-18	Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting	Hongjun Wang et.al.	2411.11448v1	link
2024-11-18	Scalable Autoregressive Monocular Depth Estimation	Jinhong Wang et.al.	2411.11361v1	null
2024-11-18	Text-guided Zero-Shot Object Localization	Jingjing Wang et.al.	2411.11357v1	null
2024-11-18	Visual-Semantic Graph Matching Net for Zero-Shot Learning	Bowen Duan et.al.	2411.11351v1	link
2024-11-18	Zero-Shot Load Forecasting with Large Language Models	Wenlong Liao et.al.	2411.11350v1	null
2024-11-18	Transcending Language Boundaries: Harnessing LLMs for Low-Resource Language Translation	Peng Shu et.al.	2411.11295v1	null
2024-11-18	Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition	Yang Chen et.al.	2411.11288v1	null
2024-11-18	Zero-Shot Automatic Annotation and Instance Segmentation using LLM-Generated Datasets: Eliminating Field Imaging and Manual Annotation for Deep Learning Model Development	Ranjan Sapkota et.al.	2411.11285v1	null
2024-11-18	ZeFaV: Boosting Large Language Models for Zero-shot Fact Verification	Son T. Luu et.al.	2411.11247v1	link
2024-11-15	Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting	Ziqi Xie et.al.	2411.10309v1	link
2024-11-15	CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation	Dengke Zhang et.al.	2411.10086v1	null
2024-11-15	'What did the Robot do in my Absence?' Video Foundation Models to Enhance Intermittent Supervision	Kavindie Katuwandeniya et.al.	2411.10016v1	null
2024-11-15	Zero-shot Voice Conversion with Diffusion Transformers	Songting Liu et.al.	2411.09943v1	link
2024-11-14	LLM Hallucination Reasoning with Zero-shot Knowledge Test	Seongmin Lee et.al.	2411.09689v1	null
2024-11-14	Script-centric behavior understanding for assisted autism spectrum disorder diagnosis	Wenxing Liu et.al.	2411.09413v1	null
2024-11-14	Less is More: Unseen Domain Fake News Detection via Causal Propagation Substructures	Shuzhi Gong et.al.	2411.09389v1	null
2024-11-14	Exploring Zero-Shot Anomaly Detection with CLIP in Medical Imaging: Are We There Yet?	Aldo Marzullo et.al.	2411.09310v1	null
2024-11-14	Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching	Yuran Wang et.al.	2411.09151v1	null
2024-11-15	UniHOI: Learning Fast, Dense and Generalizable 4D Reconstruction for Egocentric Hand Object Interaction Videos	Chengbo Yuan et.al.	2411.09145v2	null
2024-11-13	Zero-shot Cross-lingual Transfer Learning with Multiple Source and Target Languages for Information Extraction: Language Selection and Adversarial Training	Nghia Trung Ngo et.al.	2411.08785v1	null
2024-11-13	Measuring similarity between embedding spaces using induced neighborhood graphs	Tiago F. Tavares et.al.	2411.08687v1	null
2024-11-13	Zero-shot capability of SAM-family models for bone segmentation in CT scans	Caroline Magg et.al.	2411.08629v1	null
2024-11-13	Grammarization-Based Grasping with Deep Multi-Autoencoder Latent Space Exploration by Reinforcement Learning Agent	Leonidas Askianakis et.al.	2411.08566v1	null
2024-11-13	CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs	Suhas S Kowshik et.al.	2411.08553v1	null
2024-11-13	An Information Theoretic Approach to Operationalize Right to Data Protection	Abhinav Java et.al.	2411.08506v1	null
2024-11-13	Enhancing Multimodal Query Representation via Visual Dialogues for End-to-End Knowledge Retrieval	Yeong-Joon Ju et.al.	2411.08334v1	link
2024-11-12	Retrieval Augmented Time Series Forecasting	Kutay Tire et.al.	2411.08249v1	link
2024-11-12	Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing	Zitao Shuai et.al.	2411.08196v1	null
2024-11-12	LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models	Anoop Cherian et.al.	2411.08027v1	null
2024-11-12	Semantic Sleuth: Identifying Ponzi Contracts via Large Language Models	Cong Wu et.al.	2411.07498v1	null
2024-11-11	Untangling Hate Speech Definitions: A Semantic Componential Analysis Across Cultures and Domains	Katerina Korre et.al.	2411.07417v1	null
2024-11-11	Warmstarting for Scaling Language Models	Neeratyoy Mallik et.al.	2411.07340v1	null
2024-11-11	DeepONet as a Multi-Operator Extrapolation Model: Distributed Pretraining with Physics-Informed Fine-Tuning	Zecheng Zhang et.al.	2411.07239v1	null
2024-11-11	The Super Weight in Large Language Models	Mengxia Yu et.al.	2411.07191v1	link
2024-11-11	NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics	David Robinson et.al.	2411.07186v1	null
2024-11-11	SAMPart3D: Segment Any Part in 3D Objects	Yunhan Yang et.al.	2411.07184v1	link
2024-11-11	Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models	Yanchen Wang et.al.	2411.07121v1	link
2024-11-11	Transformer verbatim in-context retrieval across time and scale	Kristijan Armeni et.al.	2411.07075v1	link
2024-11-11	MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps	Xue Xia et.al.	2411.06971v1	link
2024-11-11	Robust Fine-tuning of Zero-shot Models via Variance Reduction	Beier Zhu et.al.	2411.06966v1	link
2024-11-11	UMFC: Unsupervised Multi-Domain Feature Calibration for Vision-Language Models	Jiachen Liang et.al.	2411.06921v1	link
2024-11-11	Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning	Hongsheng Zhang et.al.	2411.06764v1	null
2024-11-08	End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering	Dylan Goetting et.al.	2411.05755v1	link
2024-11-08	Asterisk: Keep it Simple*	Andrew Semenov et.al.	2411.05691v1	null
2024-11-08	Assessing Open-Source Large Language Models on Argumentation Mining Subtasks	Mohammad Yeghaneh Abkenar et.al.	2411.05639v1	null
2024-11-08	An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking	Zijian Chen et.al.	2411.05508v1	null
2024-11-08	WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models	Shengda Fan et.al.	2411.05451v1	link
2024-11-08	Enhancing Visual Classification using Comparative Descriptors	Hankyeol Lee et.al.	2411.05357v1	link
2024-11-08	ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving	Tao Ma et.al.	2411.05311v1	null
2024-11-07	Abstract2Appendix: Academic Reviews Enhance LLM Long-Context Capabilities	Shengzhi Li et.al.	2411.05232v1	link
2024-11-07	Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation	Mu Yang et.al.	2411.05141v1	null
2024-11-07	SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation	Koichi Namekata et.al.	2411.04989v1	null
2024-11-07	DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning	Gaoyue Zhou et.al.	2411.04983v1	null
2024-11-07	Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games	Usman Anwar et.al.	2411.04976v1	link
2024-11-07	In the Era of Prompt Learning with Vision-Language Models	Ankit Jha et.al.	2411.04892v1	null
2024-11-07	Zero-Shot Temporal Resolution Domain Adaptation for Spiking Neural Networks	Sanja Karilanova et.al.	2411.04760v1	null
2024-11-07	Vision Language Models are In-Context Value Learners	Yecheng Jason Ma et.al.	2411.04549v1	null
2024-11-07	Best Practices for Distilling Large Language Models into BERT for Web Search Ranking	Dezhi Ye et.al.	2411.04539v1	null
2024-11-07	Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models	Xinyu Zhang et.al.	2411.04530v1	null
2024-11-07	Enabling Adaptive Agent Training in Open-Ended Simulators by Targeting Diversity	Robby Costales et.al.	2411.04466v1	link
2024-11-07	AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering	Yungeng Liu et.al.	2411.04440v1	link
2024-11-06	RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models	Maya Varma et.al.	2411.04097v1	link
2024-11-06	Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models	Minh Duc Bui et.al.	2411.03888v1	link
2024-11-06	SA3DIP: Segment Any 3D Instance with Potential 3D Priors	Xi Yang et.al.	2411.03819v1	link
2024-11-06	No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages	Youssef Mohamed et.al.	2411.03769v1	link
2024-11-06	Zero-shot Dynamic MRI Reconstruction with Global-to-local Diffusion Model	Yu Guan et.al.	2411.03723v1	link
2024-11-06	Fine-Tuning Vision-Language Model for Automated Engineering Drawing Information Extraction	Muhammad Tayyab Khan et.al.	2411.03707v1	null
2024-11-06	3DGS-CD: 3D Gaussian Splatting-based Change Detection for Physical Object Rearrangement	Ziqi Lu et.al.	2411.03706v1	link
2024-11-06	Towards Scalable Automated Grading: Leveraging Large Language Models for Conceptual Question Evaluation in Engineering	Rujun Gao et.al.	2411.03659v1	null
2024-11-05	Exploring the Benefits of Domain-Pretraining of Generative Large Language Models for Chemistry	Anurag Acharya et.al.	2411.03542v1	null
2024-11-05	A Mamba Foundation Model for Time Series Forecasting	Haoyu Ma et.al.	2411.02941v1	null
2024-11-05	DDFAV: Remote Sensing Large Vision Language Models Dataset and Evaluation Benchmark	Haodong Li et.al.	2411.02733v1	link
2024-11-04	EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector	Deok-Hyeon Cho et.al.	2411.02625v1	link
2024-11-04	MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs	Sheng-Chieh Lin et.al.	2411.02571v1	null
2024-11-04	TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives	Maitreya Patel et.al.	2411.02545v1	null
2024-11-04	A Comparative Analysis of Instruction Fine-Tuning LLMs for Financial Text Classification	Sorouralsadat Fatemi et.al.	2411.02476v1	null
2024-11-04	Do Advanced Language Models Eliminate the Need for Prompt Engineering in Software Engineering?	Guoqing Wang et.al.	2411.02093v1	null
2024-11-04	CTEFM-VC: Zero-Shot Voice Conversion Based on Content-Aware Timbre Ensemble Modeling and Flow Matching	Yu Pan et.al.	2411.02026v1	null
2024-11-04	Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models	Sharat Agarwal et.al.	2411.01925v1	null
2024-11-04	ManiBox: Enhancing Spatial Grasping Generalization via Scalable Simulation Data Generation	Hengkai Tan et.al.	2411.01850v1	null
2024-11-04	DiffuMask-Editor: A Novel Paradigm of Integration Between the Segmentation Diffusion Model and Image Editing to Improve Segmentation Ability	Bo Gao et.al.	2411.01819v1	null
2024-11-03	Investigating Large Language Models for Complex Word Identification in Multilingual and Multidomain Setups	Răzvan-Alexandru Smădu et.al.	2411.01706v1	link
2024-11-03	Object segmentation from common fate: Motion energy processing enables human-like zero-shot generalization to random dot stimuli	Matthias Tangemann et.al.	2411.01505v1	link
2024-11-02	Task-Oriented Hierarchical Object Decomposition for Visuomotor Control	Jianing Qian et.al.	2411.01284v1	null
2024-11-02	MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction	Wang Zhao et.al.	2411.01226v1	link
2024-11-02	Transfer Learning for Finetuning Large Language Models	Tobias Strangmann et.al.	2411.01195v1	null
2024-10-31	DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models	Heng-Jui Chang et.al.	2410.24177v1	null
2024-11-02	$π_0$: A Vision-Language-Action Flow Model for General Robot Control	Kevin Black et.al.	2410.24164v2	null
2024-10-31	Scaling Concept With Text-Guided Diffusion Models	Chao Huang et.al.	2410.24151v1	null
2024-10-31	Matchmaker: Self-Improving Large Language Model Programs for Schema Matching	Nabeel Seedat et.al.	2410.24105v1	null
2024-10-31	In-Context Fine-Tuning for Time-Series Foundation Models	Abhimanyu Das et.al.	2410.24087v1	null
2024-10-31	GAMap: Zero-Shot Object Goal Navigation with Multi-Scale Geometric-Affordance Guidance	Shuaihang Yuan et.al.	2410.23978v1	null
2024-10-31	Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model	Hao Zhang et.al.	2410.23905v1	link
2024-10-31	EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection	Qinqian Lei et.al.	2410.23904v1	link
2024-10-31	The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge	Dake Guo et.al.	2410.23815v1	null
2024-10-31	RealMind: Zero-Shot EEG-Based Visual Decoding and Captioning Using Multi-Modal Models	Dongyang Li et.al.	2410.23754v1	null
2024-10-30	Multi-student Diffusion Distillation for Better One-step Generators	Yanke Song et.al.	2410.23274v1	null
2024-10-30	Partial Channel Dependence with Channel Masks for Time Series Foundation Models	Seunghan Lee et.al.	2410.23222v1	null
2024-10-30	Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks	Michael Matthews et.al.	2410.23208v1	link
2024-10-30	FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities	Jingge Xiao et.al.	2410.23160v1	link
2024-10-30	DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes	Jialiang Zhang et.al.	2410.23004v1	null
2024-10-30	SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset	Ngoc Dung Huynh et.al.	2410.22648v1	null
2024-10-30	SleepNetZero: Zero-Burden Zero-Shot Reliable Sleep Staging With Neural Networks Based on Ballistocardiograms	Shuzhen Li et.al.	2410.22646v1	null
2024-10-29	RealCQA-V2 : Visual Premise Proving	Saleem Ahmed et.al.	2410.22492v1	null
2024-10-29	Local Policies Enable Zero-shot Long-horizon Manipulation	Murtaza Dalal et.al.	2410.22332v1	null
2024-10-29	Are Decoder-Only Large Language Models the Silver Bullet for Code Search?	Yuxuan Chen et.al.	2410.22240v1	link
2024-10-29	Active Learning for Vision-Language Models	Bardia Safaei et.al.	2410.22187v1	null
2024-10-29	Data Generation for Hardware-Friendly Post-Training Quantization	Lior Dikstein et.al.	2410.22110v1	link
2024-10-29	PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement	Shutong Jin et.al.	2410.22059v1	null
2024-10-29	Reliable Semantic Understanding for Real World Zero-shot Object Goal Navigation	Halil Utku Unlu et.al.	2410.21926v1	null
2024-10-30	Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models	Lu Yu et.al.	2410.21802v2	link
2024-10-29	Pushing the Limits of All-Atom Geometric Graph Neural Networks: Pre-Training, Scaling and Zero-Shot Transfer	Zihan Pengmei et.al.	2410.21683v1	null
2024-10-28	SandboxAQ's submission to MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval	Isidora Chara Tourni et.al.	2410.21501v1	null
2024-10-28	SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization	Wanhua Li et.al.	2410.21411v1	link
2024-10-28	Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback	Nour Jedidi et.al.	2410.21242v1	null
2024-10-28	Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments	Marharyta Domnich et.al.	2410.21131v1	link
2024-10-28	Retrieval-Enhanced Mutation Mastery: Augmenting Zero-Shot Prediction of Protein Language Model	Yang Tan et.al.	2410.21127v1	link
2024-10-28	Zero-Shot Action Recognition in Surveillance Videos	Joao Pereira et.al.	2410.21113v1	null
2024-10-28	Exploring the Reliability of Foundation Model-Based Frontier Selection in Zero-Shot Object Goal Navigation	Shuaihang Yuan et.al.	2410.21037v1	null
2024-10-28	Reference-Free Formula Drift with Reinforcement Learning: From Driving Data to Tire Energy-Inspired, Real-World Policies	Franck Djeumou et.al.	2410.20990v1	null
2024-10-28	DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning	Xun Guo et.al.	2410.20964v1	link
2024-10-28	MrT5: Dynamic Token Merging for Efficient Byte-level Language Models	Julie Kallini et.al.	2410.20771v1	link
2024-10-28	Face-MLLM: A Large Face Perception Model	Haomiao Sun et.al.	2410.20717v1	null
2024-10-28	Reprogramming Pretrained Target-Specific Diffusion Models for Dual-Target Drug Design	Xiangxin Zhou et.al.	2410.20688v1	link
2024-10-25	Adversarial Environment Design via Regret-Guided Diffusion Models	Hojun Chung et.al.	2410.19715v1	null
2024-10-25	TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning	Xiangyu Zeng et.al.	2410.19702v1	null
2024-10-25	IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation	Kaixian Qu et.al.	2410.19697v1	null
2024-10-25	Context-Based Visual-Language Place Recognition	Soojin Woo et.al.	2410.19341v1	link
2024-10-25	Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting	Xingyu Zhu et.al.	2410.19294v1	null
2024-10-24	Label Set Optimization via Activation Distribution Kurtosis for Zero-shot Classification with Generative Models	Yue Li et.al.	2410.19195v1	null
2024-10-24	AlignCap: Aligning Speech Emotion Captioning to Human Preferences	Ziqi Liang et.al.	2410.19134v1	null
2024-10-24	ConceptDrift: Uncovering Biases through the Lens of Foundational Models	Cristian Daniel Păduraru et.al.	2410.18970v1	null
2024-10-24	BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning	Yujuan Velvin Fu et.al.	2410.18955v1	null
2024-10-24	SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment	Caelan Garrett et.al.	2410.18907v1	null
2024-10-24	Probabilistic Language-Image Pre-Training	Sanghyuk Chun et.al.	2410.18857v1	link
2024-10-24	Task Calibration: Calibrating Large Language Models on Inference Tasks	Yingjie Li et.al.	2410.18764v1	null
2024-10-24	Data Scaling Laws in Imitation Learning for Robotic Manipulation	Fanqi Lin et.al.	2410.18647v1	link
2024-10-24	Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data	Anup Shirgaonkar et.al.	2410.18588v1	null
2024-10-24	Zero-shot Object Navigation with Vision-Language Models Reasoning	Congcong Wen et.al.	2410.18570v1	null
2024-10-24	Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics	Jinghao Hu et.al.	2410.18537v1	null
2024-10-24	Scaling up Masked Diffusion Models on Text	Shen Nie et.al.	2410.18514v1	link
2024-10-23	Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases	Anna Glazkova et.al.	2410.18040v1	null
2024-10-23	Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models	Nils Blank et.al.	2410.17772v1	null
2024-10-23	Learning Versatile Skills with Curriculum Masking	Yao Tang et.al.	2410.17744v1	link
2024-10-23	Entity-based Reinforcement Learning for Autonomous Cyber Defence	Isaac Symes Thompson et.al.	2410.17647v1	link
2024-10-23	Incremental Learning of Affordances using Markov Logic Networks	George Potter et.al.	2410.17624v1	null
2024-10-23	Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective	Rui Yang et.al.	2410.17600v1	link
2024-10-23	Multimodal Information Bottleneck for Deep Reinforcement Learning with Multiple Sensors	Bang You et.al.	2410.17551v1	null
2024-10-23	Generalizable Motion Planning via Operator Learning	Sharath Matada et.al.	2410.17547v1	null
2024-10-23	X-MOBILITY: End-To-End Generalizable Navigation via World Modeling	Wei Liu et.al.	2410.17491v1	link
2024-10-22	Denoise-I2W: Mapping Images to Denoising Words for Accurate Zero-Shot Composed Image Retrieval	Yuanmin Tang et.al.	2410.17393v1	null
2024-10-22	Altogether: Image Captioning via Re-aligning Alt-text	Hu Xu et.al.	2410.17251v1	link
2024-10-22	LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias	Haian Jin et.al.	2410.17242v1	null
2024-10-22	Are Visual-Language Models Effective in Action Recognition? A Comparative Study	Mahmoud Ali et.al.	2410.17149v1	null
2024-10-22	LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging	Ke Wang et.al.	2410.17146v1	link
2024-10-22	SG-FSM: A Self-Guiding Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Based on Finite State Machine	Xiaochen Wang et.al.	2410.17021v1	null
2024-10-22	Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations	Cheng Lei et.al.	2410.16953v1	null
2024-10-22	DNAHLM -- DNA sequence and Human Language mixed large language Model	Wang Liang et.al.	2410.16917v1	link
2024-10-22	AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models	Yongjian Wu et.al.	2410.16820v1	link
2024-10-22	PLDR-LLM: Large Language Model from Power Law Decoder Representations	Burc Gokden et.al.	2410.16703v1	link
2024-10-22	GE2E-KWS: Generalized End-to-End Training and Evaluation for Zero-shot Keyword Spotting	Pai Zhu et.al.	2410.16647v1	null
2024-10-21	MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report	Samrajya Thapa et.al.	2410.16239v1	link
2024-10-21	IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems	Yihuan Mao et.al.	2410.16237v1	null
2024-10-21	Continuous Speech Synthesis using per-token Latent Diffusion	Arnon Turetzky et.al.	2410.16048v1	null
2024-10-21	Few-shot target-driven instance detection based on open-vocabulary object detection models	Ben Crulis et.al.	2410.16028v1	null
2024-10-21	Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly	Junsheng Zhou et.al.	2410.15971v1	null
2024-10-21	Mitigating Object Hallucination via Concentric Causal Attention	Yun Xing et.al.	2410.15926v1	link
2024-10-21	MI-VisionShot: Few-shot adaptation of vision-language models for slide-level classification of histopathological images	Pablo Meseguer et.al.	2410.15881v1	null
2024-10-21	Triplane Grasping: Efficient 6-DoF Grasping with Single RGB Images	Yiming Li et.al.	2410.15879v1	null
2024-10-21	FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL	Woosung Koh et.al.	2410.15876v1	null
2024-10-21	Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment	Yankai Jiang et.al.	2410.15744v1	null
2024-10-18	BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities	Shaozhe Hao et.al.	2410.14672v1	link
2024-10-18	Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum	Ryan Soh-Eun Shim et.al.	2410.14589v1	null
2024-10-18	SylloBio-NLI: Evaluating Large Language Models on Biomedical Syllogistic Reasoning	Magdalena Wysocka et.al.	2410.14399v1	null
2024-10-18	AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios	Ziming Huang et.al.	2410.14379v1	link
2024-10-18	Zero-shot Action Localization via the Confidence of Large Vision-Language Models	Josiah Aklilu et.al.	2410.14340v1	null
2024-10-18	Storyboard guided Alignment for Fine-grained Video Action Recognition	Enqi Liu et.al.	2410.14238v1	null
2024-10-18	Assessing Open-world Forgetting in Generative Image Model Customization	Héctor Laria et.al.	2410.14159v1	null
2024-10-17	Measuring and Modifying the Readability of English Texts with GPT-4	Sean Trott et.al.	2410.14028v1	link
2024-10-17	Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens	Lijie Fan et.al.	2410.13863v1	null
2024-10-17	VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding	Runsen Xu et.al.	2410.13860v1	link
2024-10-17	DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control	Yujie Wei et.al.	2410.13830v1	null
2024-10-17	AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents	Ke Yang et.al.	2410.13825v1	null
2024-10-17	Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers	Yuchen Liang et.al.	2410.13746v1	null
2024-10-17	ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions	Shailaja Keyur Sampat et.al.	2410.13662v1	link
2024-10-17	Help Me Identify: Is an LLM+VQA System All We Need to Identify Visual Concepts?	Shailaja Keyur Sampat et.al.	2410.13651v1	link
2024-10-18	Enhanced Prompt-leveraged Weakly Supervised Cancer Segmentation based on Segment Anything	Joonhyeon Song et.al.	2410.13621v2	link
2024-10-17	Large Language Models as Narrative-Driven Recommenders	Lukas Eberhard et.al.	2410.13604v1	null
2024-10-17	Representing Model Weights with Language using Tree Experts	Eliahu Horwitz et.al.	2410.13569v1	null
2024-10-16	In-Context Learning Enables Robot Action Prediction in LLMs	Yida Yin et.al.	2410.12782v1	null
2024-10-16	Towards Zero-Shot Camera Trap Image Categorization	Jiří Vyskočil et.al.	2410.12769v1	null
2024-10-16	Towards Graph Foundation Models: The Perspective of Zero-shot Reasoning on Knowledge Graphs	Kai Wang et.al.	2410.12609v1	null
2024-10-16	A Claim Decomposition Benchmark for Long-form Answer Verification	Zhihao Zhang et.al.	2410.12558v1	link
2024-10-16	SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling	Loris Gaven et.al.	2410.12481v1	null
2024-10-16	SF-Speech: Straightened Flow for Zero-Shot Voice Clone on Small-Scale Dataset	Xuyuan Li et.al.	2410.12399v1	null
2024-10-16	ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs	Rui-Chen Zheng et.al.	2410.12359v1	null
2024-10-16	MAX: Masked Autoencoder for X-ray Fluorescence in Geological Investigation	An-Sheng Lee et.al.	2410.12330v1	link
2024-10-16	Evaluating Cascaded Methods of Vision-Language Models for Zero-Shot Detection and Association of Hardhats for Increased Construction Safety	Lucas Choi et.al.	2410.12225v1	null
2024-10-15	Planning Anything with Rigor: General-Purpose Zero-Shot Planning with LLM-based Formalized Programming	Yilun Hao et.al.	2410.12112v1	null
2024-10-15	FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting	Zhe Li et.al.	2410.11802v1	null
2024-10-15	Time-Series Foundation Model for Value-at-Risk	Anubha Goel et.al.	2410.11773v1	link
2024-10-15	Zero-shot Model-based Reinforcement Learning using Large Language Models	Abdelhakim Benechehab et.al.	2410.11711v1	link
2024-10-15	PSVMA+: Exploring Multi-granularity Semantic-visual Adaption for Generalized Zero-shot Learning	Man Liu et.al.	2410.11560v1	null
2024-10-15	AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data	Xinjie Zhao et.al.	2410.11531v1	null
2024-10-15	Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction	Renhang Liu et.al.	2410.11522v1	link
2024-10-15	Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement	Zhi Wang et.al.	2410.11448v1	link
2024-10-15	DRACO: A Denoising-Reconstruction Autoencoder for Cryo-EM	Yingjun Shen et.al.	2410.11373v1	null
2024-10-15	Enhance Graph Alignment for Large Language Models	Haitong Luo et.al.	2410.11370v1	null
2024-10-15	In-Context Learning for Long-Context Sentiment Analysis on Infrastructure Project Opinions	Alireza Shamshiri et.al.	2410.11265v1	null
2024-10-14	Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models	Jingzhi Bao et.al.	2410.10821v1	link
2024-10-14	Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations	Litu Rout et.al.	2410.10792v1	null
2024-10-14	SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators	Rasoul Shafipour et.al.	2410.10714v1	null
2024-10-14	MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer	Minghao Zhu et.al.	2410.10589v1	link
2024-10-14	Recipe for Zero-shot POS Tagging: Is It Useful in Realistic Scenarios?	Zeno Vandenbulcke et.al.	2410.10576v1	null
2024-10-14	Continual Learning Improves Zero-Shot Action Recognition	Shreyank N Gowda et.al.	2410.10497v1	null
2024-10-14	Learning to Ground VLMs without Forgetting	Aritra Bhowmik et.al.	2410.10491v1	null
2024-10-14	Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts	Xu Liu et.al.	2410.10469v1	null
2024-10-14	4DStyleGaussian: Zero-shot 4D Style Transfer with Gaussian Splatting	Wanlin Liang et.al.	2410.10412v1	null
2024-10-14	GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation	Taha Aksu et.al.	2410.10393v1	link
2024-10-11	Extra Global Attention Designation Using Keyword Detection in Sparse Transformer Architectures	Evan Lucas et.al.	2410.08971v1	null
2024-10-11	NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models	Zheng Yi Ho et.al.	2410.08970v1	null
2024-10-11	Zero-Shot Pupil Segmentation with SAM 2: A Case Study of Over 14 Million Images	Virmarie Maquiling et.al.	2410.08926v1	null
2024-10-11	SegGrasp: Zero-Shot Task-Oriented Grasping via Semantic and Geometric Guided Segmentation	Haosheng Li et.al.	2410.08901v1	null
2024-10-11	A Benchmark for Cross-Domain Argumentative Stance Classification on Social Media	Jiaqing Yuan et.al.	2410.08900v1	null
2024-10-11	RoRA-VLM: Robust Retrieval-Augmented Vision Language Models	Jingyuan Qi et.al.	2410.08876v1	null
2024-10-11	One-shot Generative Domain Adaptation in 3D GANs	Ziqiang Li et.al.	2410.08824v1	link
2024-10-11	Zero-Shot Offline Imitation Learning via Optimal Transport	Thomas Rupf et.al.	2410.08751v1	link
2024-10-11	Chain-of-Restoration: Multi-Task Image Restoration Models are Zero-Shot Step-by-Step Universal Image Restorers	Jin Cao et.al.	2410.08688v1	link
2024-10-11	Boosting Open-Vocabulary Object Detection by Handling Background Samples	Ruizhe Zeng et.al.	2410.08645v1	null
2024-10-10	LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts	Anh-Quan Cao et.al.	2410.08211v1	null
2024-10-10	SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation	Hang Yin et.al.	2410.08189v1	null
2024-10-10	On the Evaluation of Generative Robotic Simulations	Feng Chen et.al.	2410.08172v1	null
2024-10-10	ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion	Zitian Zhang et.al.	2410.08168v1	link
2024-10-10	Constrained Skill Discovery: Quadruped Locomotion with Unsupervised Reinforcement Learning	Vassil Atanassov et.al.	2410.07877v1	null
2024-10-10	RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation	Songming Liu et.al.	2410.07864v1	link
2024-10-10	Rewriting Conversational Utterances with Instructed Large Language Models	Elnara Galimzhanova et.al.	2410.07797v1	null
2024-10-10	The Power of Input: Benchmarking Zero-Shot Sim-To-Real Transfer of Reinforcement Learning Control Policies for Quadrotor Control	Alberto Dionigi et.al.	2410.07686v1	null
2024-10-10	Parallel Digital Twin-driven Deep Reinforcement Learning for User Association and Load Balancing in Dynamic Wireless Networks	Zhenyu Tao et.al.	2410.07611v1	null
2024-10-10	CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features	Po-han Li et.al.	2410.07610v1	null
2024-10-09	AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation	Yukang Cao et.al.	2410.07164v1	null
2024-10-09	Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy	Tagore Rao Kosireddy et.al.	2410.07118v1	link
2024-10-09	Collusion Detection with Graph Neural Networks	Lucas Gomes et.al.	2410.07091v1	null
2024-10-09	Stanceformer: Target-Aware Transformer for Stance Detection	Krishna Garg et.al.	2410.07083v1	link
2024-10-09	Compositional Entailment Learning for Hyperbolic Vision-Language Models	Avik Pal et.al.	2410.06912v1	link
2024-10-09	F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching	Yushen Chen et.al.	2410.06885v1	link
2024-10-09	K-SAM: A Prompting Method Using Pretrained U-Net to Improve Zero Shot Performance of SAM on Lung Segmentation in CXR Images	Mohamed Deriche et.al.	2410.06825v1	null
2024-10-09	Toward Physics-guided Time Series Embedding	Jiaxi Hu et.al.	2410.06651v1	null
2024-10-09	Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments	Meng Yu et.al.	2410.06626v1	null
2024-10-09	DCP: Learning Accelerator Dataflow for Neural Network via Propagation	Peng Xu et.al.	2410.06553v1	null
2024-10-07	Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality	Youngtaek Oh et.al.	2410.05210v1	link
2024-10-07	ZEBRA: Zero-Shot Example-Based Retrieval Augmentation for Commonsense Question Answering	Francesco Maria Molfese et.al.	2410.05077v1	link
2024-10-07	PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing	Feng Tian et.al.	2410.04844v1	link
2024-10-07	LPZero: Language Model Zero-cost Proxy Search from Zero	Peijie Dong et.al.	2410.04808v1	null
2024-10-07	Building Damage Assessment in Conflict Zones: A Deep Learning Approach Using Geospatial Sub-Meter Resolution Data	Matteo Risso et.al.	2410.04802v1	null
2024-10-07	Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering	Kazumoto Nakamura et.al.	2410.04801v1	null
2024-10-07	Document-level Causal Relation Extraction with Knowledge-guided Binary Question Answering	Zimu Wang et.al.	2410.04752v1	null
2024-10-07	ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction	Hyungjin Chung et.al.	2410.04721v1	null
2024-10-07	Demo of Zero-Shot Guitar Amplifier Modelling: Enhancing Modeling with Hyper Neural Networks	Yu-Hua Chen et.al.	2410.04702v1	null
2024-10-07	SegINR: Segment-wise Implicit Neural Representation for Sequence Alignment in Neural Text-to-Speech	Minchan Kim et.al.	2410.04690v1	null
2024-10-04	GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs	Pu Hua et.al.	2410.03645v1	null
2024-10-04	What Matters for Model Merging at Scale?	Prateek Yadav et.al.	2410.03617v1	null
2024-10-04	Table Question Answering for Low-resourced Indic Languages	Vaishali Pal et.al.	2410.03576v1	link
2024-10-04	STREAMS: An Assistive Multimodal AI Framework for Empowering Biosignal Based Robotic Controls	Ali Rabiee et.al.	2410.03486v1	null
2024-10-04	Zero-Shot Fact Verification via Natural Logic and Large Language Models	Marek Strong et.al.	2410.03341v1	link
2024-10-04	Selective Test-Time Adaptation for Unsupervised Anomaly Detection using Neural Implicit Representations	Sameer Ambekar et.al.	2410.03306v1	link
2024-10-04	Comparing zero-shot self-explanations with human rationales in multilingual text classification	Stephanie Brandl et.al.	2410.03296v1	null
2024-10-04	Enhanced Transformer architecture for in-context learning of dynamical systems	Matteo Rufolo et.al.	2410.03291v1	null
2024-10-04	What do Large Language Models Need for Machine Translation Evaluation?	Shenbin Qian et.al.	2410.03278v1	link
2024-10-04	PersoBench: Benchmarking Personalized Response Generation in Large Language Models	Saleh Afzoon et.al.	2410.03198v1	null
2024-10-03	Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations	Nick Jiang et.al.	2410.02762v1	link
2024-10-03	Training Language Models on Synthetic Edit Sequences Improves Code Synthesis	Ulyana Piterbarg et.al.	2410.02749v1	link
2024-10-03	Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers	Shijie Chen et.al.	2410.02642v1	null
2024-10-03	Plots Unlock Time-Series Understanding in Multimodal Models	Mayank Daswani et.al.	2410.02637v1	null
2024-10-03	LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model	Duy M. H. Nguyen et.al.	2410.02615v1	null
2024-10-03	Dog-IQA: Standard-guided Zero-shot MLLM for Mix-grained Image Quality Assessment	Kai Liu et.al.	2410.02505v1	link
2024-10-03	Cross-Embodiment Dexterous Grasping with Reinforcement Learning	Haoqi Yuan et.al.	2410.02479v1	null
2024-10-03	Learning Diverse Bimanual Dexterous Manipulation Skills from Human Demonstrations	Bohan Zhou et.al.	2410.02477v1	null
2024-10-03	Unsupervised Meta-Learning via Dynamic Head and Heterogeneous Task Construction for Few-Shot Classification	Yunchuan Guan et.al.	2410.02267v1	link
2024-10-03	Visual Prompting in LLMs for Enhancing Emotion Recognition	Qixuan Zhang et.al.	2410.02244v1	null
2024-10-02	An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings	Soham Govande et.al.	2410.01704v1	link
2024-10-02	Saliency-Guided DETR for Moment Retrieval and Highlight Detection	Aleksandr Gordeev et.al.	2410.01615v1	link
2024-10-02	Coordinate-Based Neural Representation Enabling Zero-Shot Learning for 3D Multiparametric Quantitative MRI	Guoyan Lao et.al.	2410.01577v1	null
2024-10-03	EUFCC-CIR: a Composed Image Retrieval Dataset for GLAM Collections	Francesc Net et.al.	2410.01536v2	link
2024-10-02	Toward a Holistic Evaluation of Robustness in CLIP Models	Weijie Tu et.al.	2410.01534v1	null
2024-10-02	SinkSAM: A Monocular Depth-Guided SAM Framework for Automatic Sinkhole Segmentation	Osher Rafaeli et.al.	2410.01473v1	link
2024-10-02	The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs	Hong Li et.al.	2410.01417v1	null
2024-10-02	AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment	Umair Nawaz et.al.	2410.01407v1	link
2024-10-02	Toward Zero-Shot Learning for Visual Dehazing of Urological Surgical Robots	Renkai Wu et.al.	2410.01395v1	link
2024-10-02	Takin-VC: Zero-shot Voice Conversion via Jointly Hybrid Content and Memory-Augmented Context-Aware Timbre Modeling	Yuguang Yang et.al.	2410.01350v1	null
2024-09-30	Uni$^2$Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection	Yubin Wang et.al.	2409.20558v1	null
2024-09-30	Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos	Md Mohaiminul Islam et.al.	2409.20557v1	null
2024-09-30	Robi Butler: Remote Multimodal Interactions with Household Robot Assistant	Anxing Xiao et.al.	2409.20548v1	null
2024-09-30	FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing	Lingling Cai et.al.	2409.20500v1	null
2024-10-01	Instance-adaptive Zero-shot Chain-of-Thought Prompting	Xiaosong Yuan et.al.	2409.20441v2	null
2024-09-30	VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs	Ruotong Liao et.al.	2409.20365v1	link
2024-09-30	CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset	Akshatha Arodi et.al.	2409.20353v1	link
2024-09-30	RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning	Yuxuan Wu et.al.	2409.20291v1	null
2024-09-30	Analysing Zero-Shot Readability-Controlled Sentence Simplification	Abdullah Barayan et.al.	2409.20246v1	null
2024-09-30	VMAD: Visual-enhanced Multimodal Large Language Model for Zero-Shot Anomaly Detection	Huilin Deng et.al.	2409.20146v1	null
2024-09-27	Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs	Yanyuan Qiao et.al.	2409.18794v1	null
2024-09-27	When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation	Yuli Zhou et.al.	2409.18653v1	link
2024-09-27	Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations	Nicolò Penzo et.al.	2409.18602v1	link
2024-09-27	"Oh LLM, I'm Asking Thee, Please Give Me a Decision Tree": Zero-Shot Decision Tree Induction and Embedding with Large Language Models	Ricardo Knauer et.al.	2409.18594v1	null
2024-09-27	EmoPro: A Prompt Selection Strategy for Emotional Expression in LM-based Speech Synthesis	Haoyu Wang et.al.	2409.18512v1	null
2024-09-27	Exploring Language Model Generalization in Low-Resource Extractive QA	Saptarshi Sengupta et.al.	2409.18446v1	link
2024-09-26	AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models	Xin Hong et.al.	2409.18339v1	null
2024-09-26	Learning to Drive via Asymmetric Self-Play	Chris Zhang et.al.	2409.18218v1	null
2024-09-26	Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction	Jing He et.al.	2409.18124v1	null
2024-09-26	GSON: A Group-based Social Navigation Framework with Large Multimodal Model	Shangyi Luo et.al.	2409.18084v1	null
2024-09-26	FreeEdit: Mask-free Reference-based Image Editing with Multi-modal Instruction	Runze He et.al.	2409.18071v1	null
2024-09-26	DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving	Dingrui Wang et.al.	2409.18053v1	link
2024-09-26	IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning	Soeun Lee et.al.	2409.18046v1	link
2024-09-26	Learning to Love Edge Cases in Formative Math Assessment: Using the AMMORE Dataset and Chain-of-Thought Prompting to Improve Grading Accuracy	Owen Henkel et.al.	2409.17904v1	null
2024-09-26	Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models	Hui-Po Wang et.al.	2409.17836v1	link
2024-09-27	Few-shot Pairwise Rank Prompting: An Effective Non-Parametric Retrieval Model	Nilanjan Sinhababu et.al.	2409.17745v2	null
2024-09-26	AnyLogo: Symbiotic Subject-Driven Diffusion System with Gemini Status	Jinghao Zhang et.al.	2409.17740v1	null
2024-09-26	Robust Ladder Climbing with a Quadrupedal Robot	Dylan Vogel et.al.	2409.17731v1	null
2024-09-25	Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning?	Bowen Zhao et.al.	2409.17080v1	link
2024-09-25	ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis	Fangshuo Zhou et.al.	2409.17049v1	link
2024-09-25	Detecting Temporal Ambiguity in Questions	Bhawna Piryani et.al.	2409.17046v1	link
2024-09-25	Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness	Shixuan Ma et.al.	2409.16914v1	link
2024-09-25	Pruning Multilingual Large Language Models for Multilingual Inference	Hwichan Kim et.al.	2409.16911v1	link
2024-09-25	Multi-objective Evolution of Heuristic Using Large Language Model	Shunyu Yao et.al.	2409.16867v1	null
2024-09-25	Towards General Text-guided Image Synthesis for Customized Multimodal Brain MRI Generation	Yulin Wang et.al.	2409.16818v1	link
2024-09-25	Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification	Ming Li et.al.	2409.16718v1	link
2024-09-24	Unsupervised Text Representation Learning via Instruction-Tuning for Zero-Shot Dense Retrieval	Qiuhai Zeng et.al.	2409.16497v1	null
2024-09-24	BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes	Kasun Weerakoon et.al.	2409.16484v1	null
2024-09-24	Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation	Homanga Bharadhwaj et.al.	2409.16283v1	null
2024-09-24	Fields of The World: A Machine Learning Benchmark Dataset For Global Agricultural Field Boundary Segmentation	Hannah Kerner et.al.	2409.16252v1	link
2024-09-24	Facial Expression-Enhanced TTS: Combining Face Representation and Emotion Intensity for Adaptive Speech	Yunji Chu et.al.	2409.16203v1	null
2024-09-24	HA-FGOVD: Highlighting Fine-grained Attributes via Explicit Linear Composition for Open-Vocabulary Object Detection	Yuqi Ma et.al.	2409.16136v1	null
2024-09-24	Evaluation of state-of-the-art ASR Models in Child-Adult Interactions	Aditya Ashvin et.al.	2409.16135v1	null
2024-09-24	Bridging Environments and Language with Rendering Functions and Vision-Language Models	Theo Cachet et.al.	2409.16024v1	null
2024-09-24	Finetuning LLMs for Comparative Assessment Tasks	Vatsal Raina et.al.	2409.15979v1	null
2024-09-24	StyleSinger 2: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control	Yu Zhang et.al.	2409.15977v1	link
2024-09-24	SLIMER-IT: Zero-Shot NER on Italian Language	Andrew Zamai et.al.	2409.15933v1	link
2024-09-24	Zero-Shot Detection of AI-Generated Images	Davide Cozzolino et.al.	2409.15875v1	null
2024-09-24	Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models	Sijing Chen et.al.	2409.12139v3	null
2024-09-18	IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition	Rui Liu et.al.	2409.12092v1	null
2024-09-18	Efficacy of Synthetic Data as a Benchmark	Gaurav Maheshwari et.al.	2409.11968v1	null
2024-09-18	GauTOAO: Gaussian-based Task-Oriented Affordance of Objects	Jiawen Wang et.al.	2409.11941v1	null
2024-09-18	LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Foundation Models	Amaia Cardiel et.al.	2409.11919v1	null
2024-09-18	ABHINAW: A method for Automatic Evaluation of Typography within AI-Generated Images	Abhinaw Jagtap et.al.	2409.11874v1	null
2024-09-18	One Map to Find Them All: Real-time Open-Vocabulary Mapping for Zero-shot Multi-Object Navigation	Finn Lukas Busch et.al.	2409.11764v1	null
2024-09-18	Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation	Haohan Guo et.al.	2409.11630v1	null
2024-09-17	Good Grasps Only: A data engine for self-supervised fine-tuning of pose estimation using grasp poses for verification	Frederik Hagelskjær et.al.	2409.11512v1	null
2024-09-17	Enriching Datasets with Demographics through Large Language Models: What's in a Name?	Khaled AlNuaimi et.al.	2409.11491v1	null
2024-09-17	Says Who? Effective Zero-Shot Annotation of Focalization	Rebecca M. M. Hicke et.al.	2409.11390v1	null
2024-09-17	Towards Time Series Reasoning with LLMs	Winnie Chow et.al.	2409.11376v1	null
2024-09-17	Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think	Gonzalo Martin Garcia et.al.	2409.11355v1	**[link](https://github.com/VisualComputingInstitute/diffusion-

Jianqiuer/Awesome6DPoseEstimation
