Official PyTorch Implementation of our CVPR 2025 paper.
Authors: Nikola Zubić, Davide Scaramuzza
Figure: Overview of the GG-SSM pipeline applied to various tasks, such as event-based vision, time series forecasting, image classification, and optical flow estimation.
State Space Models (SSMs) are powerful tools for modeling sequential data in computer vision and time series analysis. However, traditional SSMs are limited by fixed, one-dimensional sequential processing, which restricts their ability to model non-local interactions in high-dimensional data. While methods like Mamba and VMamba introduce selective and flexible scanning strategies, they rely on predetermined paths and therefore fail to efficiently capture complex dependencies.
We introduce Graph-Generating State Space Models (GG-SSMs), a novel framework that overcomes these limitations by dynamically constructing graphs based on feature relationships. Using Chazelle's Minimum Spanning Tree algorithm, GG-SSMs adapt to the inherent data structure, enabling robust feature propagation across dynamically generated graphs and efficiently modeling complex dependencies.
We validate GG-SSMs on 11 diverse datasets, including event-based eye-tracking, ImageNet classification, optical flow estimation, and six time series datasets. GG-SSMs achieve state-of-the-art performance across all tasks, surpassing existing methods by significant margins. Specifically, GG-SSM attains a top-1 accuracy of 84.9% on ImageNet, outperforming prior SSMs by 1%, reducing the KITTI-15 error rate to 2.77%, and improving eye-tracking detection rates by up to 0.33% with fewer parameters. These results demonstrate that dynamic scanning based on feature relationships significantly improves SSMs' representational power and efficiency, offering a versatile tool for various applications in computer vision and beyond.
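To make the core idea concrete, below is a minimal, illustrative sketch of dynamic graph construction (not the paper's implementation): pairwise feature dissimilarities define edge weights, a minimum spanning tree is extracted, and its edges define the scan topology. SciPy's standard MST routine stands in for Chazelle's algorithm, and the function name is our own.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def build_feature_mst(features: np.ndarray) -> np.ndarray:
    """Derive a tree topology from pairwise feature dissimilarity.

    features: (N, D) array of token/patch features.
    Returns an (N, N) symmetric 0/1 adjacency matrix of the MST edges.
    """
    # Pairwise Euclidean distances serve as edge weights of a dense graph.
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # SciPy's MST stands in here for the paper's Chazelle-based construction.
    mst = minimum_spanning_tree(dist).toarray()
    return ((mst + mst.T) > 0).astype(np.float32)

# Toy usage: 16 tokens with 8-dim features; a tree over N nodes has N - 1 edges.
adj = build_feature_mst(np.random.randn(16, 8))
print(int(adj.sum()) // 2)  # -> 15
```

State propagation then follows the tree edges instead of a fixed one-dimensional scan order.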
If you find this work helpful, please cite our paper:
```bibtex
@inproceedings{Zubic_2025_CVPR,
    title     = {Graph-Generating State Space Models (GG-SSMs)},
    author    = {Zubic, Nikola and Scaramuzza, Davide},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2025}
}
```
Below are the commands to set up a conda environment and install all necessary dependencies, including custom libraries for graph-based state scanning:
```bash
# 1. Create and activate conda environment
conda create -y -n gg_ssms python=3.11
conda activate gg_ssms

# 2. Install PyTorch and CUDA
conda install -y pytorch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 pytorch-cuda=12.4 -c pytorch -c nvidia
conda install -y nvidia::cuda-toolkit

# 3. Install custom dependencies (TreeScan and TreeScanLan)
cd core/convolutional_graph_ssm/third-party/TreeScan/
pip install -v -e .
cd $(git rev-parse --show-toplevel)
cd core/graph_ssm/third-party/TreeScanLan/
pip install -v -e .
```
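After these steps, a quick sanity check confirms that PyTorch and the CUDA toolkit are wired up as expected before you run any of the tasks below:

```python
# Sanity check for the freshly created gg_ssms environment.
import torch

print(torch.__version__)          # expect 2.5.0
print(torch.version.cuda)         # expect 12.4
print(torch.cuda.is_available())  # expect True on a CUDA-capable machine
```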
Depending on which tasks or modules you want to run, you may need extra Python packages beyond the core requirements listed above. Below is a breakdown of recommended installations for each sub-project:
- INI-30 dataset event-based eye tracking (`eye_tracking_ini_30`):

  ```bash
  cd eye_tracking_ini_30
  pip install dv-processing sinabs tonic thop samna fire
  ```

- LPW dataset event-based eye tracking (`eye_tracking_lpw`):

  ```bash
  cd eye_tracking_lpw
  pip install matplotlib opencv-python tqdm tables easydict wandb timm einops
  ```

- MambaTS (Time Series): check the `requirements.txt` inside the `MambaTS` folder:

  ```bash
  cd MambaTS
  pip install -r requirements.txt
  ```
We provide a Convolutional Graph-Generating SSM for image-based feature extraction and classification in:
```
core/convolutional_graph_ssm/classification/models/graph_ssm.py
```
- Choosing Model Size: On line 545, you can set `config_path` to one of `base`, `small`, or `tiny` to pick the desired model variant.
- Pretrained Weights: Place the corresponding pretrained weight files (e.g., `gg_ssm_base.pth`, `gg_ssm_small.pth`, `gg_ssm_tiny.pth`) inside `core/convolutional_graph_ssm/classification/weights/`. These weights can be downloaded from the Releases page.
To run a forward pass on an image:

```bash
python core/convolutional_graph_ssm/classification/models/graph_ssm.py
```

- By default, this script will load the base model from `config_path='base'`.
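If you prefer to call the model from your own code instead of the script, the pattern looks roughly like the sketch below. The class name `GraphSSM` and its constructor signature are assumptions for illustration; consult `graph_ssm.py` for the actual interface.

```python
import torch
# Hypothetical import: the real class lives in
# core/convolutional_graph_ssm/classification/models/graph_ssm.py and may be
# named differently -- check that file before copying this.
from graph_ssm import GraphSSM  # assumed name and signature

model = GraphSSM(config_path='base')  # one of 'base' | 'small' | 'tiny'
weights = torch.load(
    'core/convolutional_graph_ssm/classification/weights/gg_ssm_base.pth',
    map_location='cpu',
)
model.load_state_dict(weights)
model.eval()

with torch.no_grad():
    image = torch.randn(1, 3, 224, 224)  # dummy ImageNet-sized input
    logits = model(image)
print(logits.shape)
```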
A purely temporal Graph-Generating SSM (for sequential or time-series data) is available in:
```
core/graph_ssm/main.py
```
- This module focuses on modeling temporal dependencies using dynamically constructed graphs.
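A hedged usage sketch follows; the import path matches the file above, but the constructor argument (`d_model`) is an assumption, so check `core/graph_ssm/main.py` for the real interface.

```python
import torch
# TemporalGraphSSM is the module this README refers to throughout; the
# argument names here are assumptions, not the verbatim interface.
from core.graph_ssm.main import TemporalGraphSSM

ssm = TemporalGraphSSM(d_model=128)   # hypothetical signature
seq = torch.randn(4, 96, 128)         # (batch, time_steps, features)
out = ssm(seq)                        # features mixed along a generated tree
print(out.shape)                      # expected to match the input shape
```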
You can combine the Convolutional Graph SSM (for spatial modeling) and the Temporal Graph SSM (for sequential/temporal modeling) to create a unified spatio-temporal pipeline. Our event-based eye tracking tasks (see Ini-30 Eye Tracking or LPW Dataset Eye Tracking) demonstrate exactly how these two components are integrated for end-to-end training.
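Conceptually, the combination is a per-frame spatial encoder followed by a temporal SSM over the resulting frame features. The wrapper below is our own illustrative sketch, not the repo's actual integration (for that, see the eye tracking code):

```python
import torch
import torch.nn as nn

class SpatioTemporalGGSSM(nn.Module):
    """Illustrative wiring only; the module roles mirror the README's
    spatial_backbone=ConvGraphSSM / temporal_ssm=TemporalGraphSSM setup."""

    def __init__(self, spatial_backbone: nn.Module, temporal_ssm: nn.Module):
        super().__init__()
        self.spatial = spatial_backbone
        self.temporal = temporal_ssm

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, channels, height, width)
        b, t, c, h, w = clips.shape
        feats = self.spatial(clips.reshape(b * t, c, h, w))  # per-frame encoding
        feats = feats.reshape(b, t, -1)                      # (batch, time, dim)
        return self.temporal(feats)                          # temporal graph SSM
```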
We incorporate Graph-Generating SSMs into the MambaTS codebase by replacing the default encoder in `MambaTS/models/MambaTS.py` with our `TemporalGraphSSM`. This allows graph-based temporal modeling for long-horizon forecasting.
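The integration amounts to swapping a single module. The toy stand-ins below only illustrate that swap; the real classes are in `MambaTS/models/MambaTS.py` and `core/graph_ssm/main.py`:

```python
import torch
import torch.nn as nn

class TemporalGraphSSMStub(nn.Module):
    """Stands in for the real TemporalGraphSSM from core/graph_ssm/main.py."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x  # identity placeholder

class ForecastModel(nn.Module):
    def __init__(self, d_model: int = 64, pred_len: int = 96):
        super().__init__()
        # In MambaTS this attribute holds the default encoder; the GG-SSM
        # integration replaces it with the temporal graph SSM.
        self.encoder = TemporalGraphSSMStub()
        self.head = nn.Linear(d_model, pred_len)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))

out = ForecastModel()(torch.randn(2, 336, 64))
print(out.shape)  # torch.Size([2, 336, 96])
```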
- Scripts Location: All relevant scripts can be found here.
- Adjusting Paths & Parameters: In each script (e.g., `run.py`), you can modify:
  - `CUDA_VISIBLE_DEVICES`: Set to your GPU index (e.g., `export CUDA_VISIBLE_DEVICES=3`).
  - `root_path` / `data_path`: Point these to the folder containing your time series dataset.
  - `model_id` / `model_name`: Namespacing for checkpoints and logging.
  - `seq_len`, `pred_len`: Sequence length and prediction horizon you want to experiment with.
  - Hyperparameters: Adjust `e_layers`, `d_layers`, `batch_size`, `learning_rate`, etc.
- Datasets Download: All datasets can be downloaded from here.
To run, do `cd MambaTS/` from the root and then `bash ./scripts/MambaTS_ETTh2.sh` to train on the ETTh2 dataset. Corresponding scripts for each of the six time series datasets are available. All logs and outputs will be generated inside the `MambaTS` folder.
Our implementation for Ini-30 event-based eye tracking can be found in the `retina` folder:

- `/training/models/baseline_3et.py`: Contains the code where our GG-SSM architecture is integrated for eye tracking, with `spatial_backbone=ConvGraphSSM` and `temporal_ssm=TemporalGraphSSM`.
- From the root you can run `CUDA_VISIBLE_DEVICES=i python retina/scripts/train.py --run_name=graph_ssm --device=i`, where `i` is the GPU ID. The script will automatically log and create a project in Weights & Biases (wandb) named `eye_tracking_ini_30`.
When installing Tonic (needed for event-based data processing), you may encounter a pip dependency error like:
```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.
...
python-tsp 0.5.0 requires numpy<3.0.0,>=2.0.0, but you have numpy 1.26.4 which is incompatible.
```
This means Tonic and python-tsp (used for certain time series tasks) have conflicting NumPy requirements. If you plan to run time series tasks in the same environment, you can:

- Uninstall Tonic once finished with eye tracking (`pip uninstall tonic`),
- Reinstall NumPy at a version that satisfies python-tsp (`pip install "numpy>=2.0.0,<3.0.0"`), and
- Reinstall `python-tsp` for time series (`pip install python-tsp`).
Alternatively, keep separate environments for each task to avoid conflicts.
Our integration for the LPW dataset eye tracking is located in the `eye_tracking_lpw` folder.

- Data Preparation: Follow the instructions provided by cb-convlstm-eyetracking to download and prepare the LPW dataset.
- Path Configuration: In the `eye_tracking_lpw/graph_ssm_train.py` file, set `DATA_DIR_ROOT = "/path/to/your/LPW/dataset"` so that it points to the root directory containing the LPW dataset.
- Run Training: From the project root directory, simply execute `python eye_tracking_lpw/graph_ssm_train.py`. This will start the training process for LPW eye tracking with the Graph-Generating SSM architecture.
This project has used code from the following projects:
- MambaTS - Improved Selective State Space Models for Long-term Time Series Forecasting
- Retina - Low-Power Eye Tracking with Event Camera and Spiking Hardware
- 3ET - Efficient Event-based Eye Tracking using a Change-Based ConvLSTM Network
- MemFlow - Optical Flow Estimation and Prediction with Memory
- GrootVL - Tree Topology is All You Need in State Space Model