We consider the problem of segmenting objects in videos based on their motion and no other forms of supervision. Prior work has often approached this problem by using the principle of common fate, namely the fact that the motion of points that belong to the same object is strongly correlated. However, most authors have only considered instantaneous motion from optical flow. In this work, we present a way to train a segmentation network using long-term point trajectories as a supervisory signal to complement optical flow. The key difficulty is that long-term motion, unlike instantaneous motion, is difficult to model -- any parametric approximation is unlikely to capture complex motion patterns over long periods of time. We instead draw inspiration from subspace clustering approaches, proposing a loss function that seeks to group the trajectories into low-rank matrices where the motion of object points can be approximately explained as a linear combination of other point tracks. Our method outperforms the prior art on motion-based segmentation, which shows the utility of long-term motion and the effectiveness of our formulation.
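For intuition, here is one way such an objective can be written down (a sketch only, not necessarily the exact loss used in the paper): stack the $N$ point trajectories over $T$ frames into a matrix $P \in \mathbb{R}^{N \times 2T}$, let $m_k \in [0,1]^N$ denote the predicted soft mask of group $k$ sampled at the tracked points, and penalize a convex surrogate of the rank of each masked trajectory matrix,

$$
\mathcal{L}(m, P) = \sum_{k} \lVert \operatorname{diag}(m_k)\, P \rVert_{*},
$$

where $\lVert \cdot \rVert_{*}$ is the nuclear norm. This term is small when the trajectories assigned to a group span a low-dimensional subspace, i.e. when each track in the group can be approximately written as a linear combination of the other tracks in that group.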
The following packages are required to run the code:
- opencv-python (imported as cv2)
- numpy
- torch==2.0.1
- torchvision==0.15.2
- einops
- timm
- wandb
- tqdm
- scikit-learn
- scipy
- Pillow (imported as PIL)
- detectron2
See environment.yaml for exact versions and the full list of dependencies (a snapshot of the environment state).
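One typical way to set this up, assuming environment.yaml is a conda environment specification (the environment name is whatever its name: field defines):

```bash
# Create the environment from the provided spec file and activate it.
# Replace <env-name> with the name defined in environment.yaml.
conda env create -f environment.yaml
conda activate <env-name>
```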
Datasets should be placed under data/<dataset_name>, e.g. data/DAVIS2016.
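With the datasets used below, the expected layout is:

```
data/
├── DAVIS2016/
├── SegTrackv2/
└── FBMS_clean/
```

The Tracks/ subdirectories referenced below are created by the trajectory extraction commands.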
For video segmentation, we follow the dataset preparation steps of MotionGrouping, including obtaining optical flow.
For trajectories, we use CoTrackerV2. To generate trajectories, run the following commands:
```bash
# DAVIS
python extract_trajectories.py data/DAVIS2016/ data/DAVIS2016/Tracks/cotrackerv2_rel_stride4_aux2 --grid_step 1 --height 480 --width 854 --max_frames 100 --grid_stride 4 --precheck

# SegTrackV2
python extract_trajectories.py data/SegTrackv2/ data/SegTrackv2/Tracks/cotrackerv2_rel_stride4_aux2 --grid_step 1 --height 480 --width 854 --grid_stride 4 --max_frames 100 --seq-search-path JPEGImages --precheck

# FBMS
python extract_trajectories.py data/FBMS_clean/ data/FBMS_clean/Tracks/ --grid_step 1 --height 480 --width 854 --grid_stride 4 --max_frames 100 --seq-search-path JPEGImages --precheck
```
Note that calculating trajectories takes a long time and requires a lot of memory, because we track very many points (we observed that this led to more accurate trajectories with CoTracker). We used SLURM arrays to distribute the workload across many GPUs, on machines with at least 64 GB of RAM and 48 GB of GPU memory. The script has additional options to resume, checkpoint, and skip already-processed sequences, as well as options for debugging.
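As a rough sketch, a SLURM array job for this could look like the following; it assumes the script's resume/skip-already-processed behaviour lets concurrent workers avoid redoing each other's sequences, and the resource requests should be adapted to your cluster:

```bash
#!/bin/bash
#SBATCH --array=0-7          # number of parallel workers; adjust as needed
#SBATCH --gres=gpu:1         # one GPU per worker (>= 48 GB recommended)
#SBATCH --mem=64G            # >= 64 GB of RAM per worker
#SBATCH --cpus-per-task=8

# Every array task runs the same extraction command and relies on the script's
# resume / skip-already-processed functionality, so the sequences end up being
# split between the workers.
python extract_trajectories.py data/DAVIS2016/ \
    data/DAVIS2016/Tracks/cotrackerv2_rel_stride4_aux2 \
    --grid_step 1 --height 480 --width 854 --max_frames 100 --grid_stride 4 --precheck
```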
Experiments are controlled through a mix of config files and command-line arguments. See the config files and src/config.py for a list of all available options. For example, to train a model on the DAVIS dataset:

```bash
python main.py GWM.DATASET DAVIS LOG_ID davis_training
```
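The same pattern applies to the other datasets; for example (the exact dataset identifiers accepted by GWM.DATASET are defined in src/config.py, so FBMS below is only an illustrative assumption):

```bash
# Illustrative only: confirm the dataset name against src/config.py
python main.py GWM.DATASET FBMS LOG_ID fbms_training
```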
We provide trained checkpoints for the main experiments in the paper. These can be downloaded from the following links:
This repository builds on MaskFormer, MotionGrouping, guess-what-moves, and dino-vit-features.
```bibtex
@inproceedings{karazija24learning,
  title={Learning segmentation from point trajectories},
  author={Karazija, Laurynas and Laina, Iro and Rupprecht, Christian and Vedaldi, Andrea},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}
```