add setuptools_scm and build and publish workflow (#3)
* add setuptools_scm and build and publish workflow

* add version to __init__.py

* update README and .gitignore
ekorman authored Dec 29, 2023
1 parent 9bf3adb commit 968884b
Showing 5 changed files with 67 additions and 19 deletions.
21 changes: 21 additions & 0 deletions .github/workflows/build-and-publish.yml
@@ -0,0 +1,21 @@
name: Build and publish python package

on:
  push:
    tags:
      - "v*"

jobs:
  build-and-publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.9"
      - name: Build wheel
        run: pip install build && python -m build
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_API_TOKEN }}
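
The `v*` tag filter means that pushing a tag such as `v0.2.0` triggers the build, and the tag is what the published version is derived from. A minimal Python sketch of that mapping, assuming tags follow a `v<major>.<minor>.<patch>` convention; `tag_to_version` is a hypothetical helper and is far simpler than the tag parsing setuptools_scm actually performs:

```python
import re
from typing import Optional

# Assumed tag convention: a literal "v" followed by three dot-separated integers.
_TAG_RE = re.compile(r"v(\d+\.\d+\.\d+)")


def tag_to_version(tag: str) -> Optional[str]:
    """Return the version encoded in a release tag, or None if the tag does not fit."""
    match = _TAG_RE.fullmatch(tag)
    return match.group(1) if match else None


if __name__ == "__main__":
    for tag in ("v0.2.0", "release-1"):
        print(tag, "->", tag_to_version(tag))
```

Tags that do not fit the assumed convention return `None` here, so a caller can decide whether to skip or fail the release.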
3 changes: 2 additions & 1 deletion .gitignore
@@ -9,4 +9,5 @@ __pycache__/
wandb/
models/
.vscode
-*.egg-info
+*.egg-info
+events.out.tfevents*
49 changes: 33 additions & 16 deletions README.md
@@ -1,27 +1,37 @@
# neurve

This is the repository to accompany the paper [*Self-supervised representation learning on manifolds*](https://openreview.net/forum?id=EofGDIGAhvR), to be presented at the *ICLR 2021 Workshop on Geometrical and Topological Representation Learning*.
This is the repository to accompany the paper [_Self-supervised representation learning on manifolds_](https://openreview.net/forum?id=EofGDIGAhvR), to be presented at the _ICLR 2021 Workshop on Geometrical and Topological Representation Learning_.

Additionally, we implement a manifold version of triplet training, which will be expounded on in an upcoming preprint.

## Notebooks

[MSimCLR Inference](https://github.com/ekorman/neurve/blob/master/notebooks/msimclr-inference.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ekorman/neurve/blob/master/notebooks/msimclr-inference.ipynb)

This notebook will run inference using a pre-trained Manifold SimCLR model (trained on either CIFAR10, FashionMNIST, or MNIST).

## Installation

Install by cloning this repository and then running, from the repo root, the command
```
pip install .
Install via

```shell
pip install neurve
```

or, to install with [Weights & Biases](https://wandb.ai/) support, run:

```shell
pip install "neurve[wandb]"
```
pip install .[wandb]

You can also install from source by cloning this repository and then running, from the repo root, the command

```shell
pip install . # or pip install .[wandb]
```

The dependencies are

```
numpy>=1.17.4
torch>=1.3.1
@@ -32,18 +42,22 @@ tensorboardX
```

### Datasets

To get the datasets for metric learning (the datasets we use for representation learning are included in `torchvision.datasets`):

* CUB dataset: Download the file `CUB_200_2011.tgz` from http://www.vision.caltech.edu/visipedia/CUB-200-2011.html and decompress in the `data` folder. The folder structure should be `data/CUB_200_2011/images/`.
* cars196 dataset: run `make data/cars`.
- CUB dataset: Download the file `CUB_200_2011.tgz` from http://www.vision.caltech.edu/visipedia/CUB-200-2011.html and decompress in the `data` folder. The folder structure should be `data/CUB_200_2011/images/`.
- cars196 dataset: run `make data/cars`.

## Training commands

### Tracking with Weights & Biases

To use [Weights & Biases](https://wandb.ai/) to log training/validation metrics and for storing model checkpoints, set the environment variable `NEURVE_TRACKER` to `wandb`. Otherwise [tensorboardX](https://github.com/lanpa/tensorboardX) will be used for metric logging and model checkpoints will be saved locally.
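
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `select_tracker` is a hypothetical helper, not a function from `neurve`, and the only behavior taken from the text is that `NEURVE_TRACKER=wandb` selects Weights & Biases and anything else falls back to tensorboardX.

```python
import os
from typing import Mapping, Optional


def select_tracker(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the name of the tracking backend implied by the environment."""
    # Default to the real process environment when no mapping is supplied.
    if env is None:
        env = os.environ
    # Only the exact value "wandb" switches trackers; everything else falls back.
    return "wandb" if env.get("NEURVE_TRACKER") == "wandb" else "tensorboardX"


if __name__ == "__main__":
    print(select_tracker())
```

Passing the environment as a mapping keeps the rule easy to check without modifying the real process environment.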

### Manifold SimCLR

For self-supervised training, run the command

```bash
python experiments/simclr.py \
--dataset $DATASET \
@@ -54,16 +68,18 @@ python experiments/simclr.py \
--tau $TAU \
--out_path $OUT_PATH # if not using Weights & Biases for tracking
```

where

* `$DATASET` is one of `"cifar"`, `"mnist"`, `"fashion_mnist"`.
* `$BACKBONE` is the name of the backbone network (in the paper we used `"resnet50"` for CIFAR10 and `"resnet18"` for MNIST and FashionMNIST).
* `$DIM_Z` and `$N_CHARTS` are the dimension and number of charts, respectively, for the manifold.
* `$N_EPOCHS` is the number of epochs to train for (in the paper we used 1,000 for CIFAR10 and 100 for MNIST and FashionMNIST).
* `$TAU` is the temperature parameter for the contrastive loss function (in the paper we used 0.5 for CIFAR10 and 1.0 for MNIST and FashionMNIST).
* `$OUT_PATH` is the path to save model checkpoints and tensorboard output.
- `$DATASET` is one of `"cifar"`, `"mnist"`, `"fashion_mnist"`.
- `$BACKBONE` is the name of the backbone network (in the paper we used `"resnet50"` for CIFAR10 and `"resnet18"` for MNIST and FashionMNIST).
- `$DIM_Z` and `$N_CHARTS` are the dimension and number of charts, respectively, for the manifold.
- `$N_EPOCHS` is the number of epochs to train for (in the paper we used 1,000 for CIFAR10 and 100 for MNIST and FashionMNIST).
- `$TAU` is the temperature parameter for the contrastive loss function (in the paper we used 0.5 for CIFAR10 and 1.0 for MNIST and FashionMNIST).
- `$OUT_PATH` is the path to save model checkpoints and tensorboard output.

### Manifold metric learning

To train metric learning, run the command

```bash
@@ -73,11 +89,12 @@ python experiments/triplet.py \
--n_charts $N_CHARTS \
--out_path $OUT_PATH # if not using Weights & Biases for tracking
```

where

* `$DATA_ROOT` is the path to the data (e.g. `data/CUB_200_2011/images/` or `data/cars/`), which should be a folder of subfolders, where each subfolder has the images for one class.
* `$DIM_Z` and `$N_CHARTS` are the dimension and number of charts, respectively, for the manifold.
* `$OUT_PATH` is the path to save model checkpoints and tensorboard output.
- `$DATA_ROOT` is the path to the data (e.g. `data/CUB_200_2011/images/` or `data/cars/`), which should be a folder of subfolders, where each subfolder has the images for one class.
- `$DIM_Z` and `$N_CHARTS` are the dimension and number of charts, respectively, for the manifold.
- `$OUT_PATH` is the path to save model checkpoints and tensorboard output.

## Citation

7 changes: 7 additions & 0 deletions neurve/__init__.py
@@ -0,0 +1,7 @@
from importlib import metadata

try:
    __version__ = metadata.version("neurve")
except Exception:
    __version__ = "0.0.0-dev"
del metadata
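
For readers who want to try the lookup-with-fallback pattern outside the package, here is a standalone sketch. It is not the code in this commit: `resolve_version` is a hypothetical helper, and it catches only `PackageNotFoundError` where the committed `__init__.py` catches `Exception`.

```python
from importlib import metadata


def resolve_version(dist_name: str, fallback: str = "0.0.0-dev") -> str:
    """Return the installed version of dist_name, or fallback if it is not installed."""
    try:
        # Reads the version recorded in the installed distribution's metadata.
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Typical when running from a source checkout that was never pip-installed.
        return fallback


if __name__ == "__main__":
    print(resolve_version("neurve"))
```

Narrowing the exception is a design choice: it keeps the fallback for the "not installed" case while letting unexpected errors surface.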
6 changes: 4 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "neurve"
-version = "0.1.0"
+dynamic = ["version"]
description = "atlas learning"
readme = "README.md"
requires-python = ">=3.8"
@@ -16,7 +16,7 @@ dependencies = [
[project.optional-dependencies]
test = ["pytest"]
wandb = ["wandb"]
-dev = ["pre-commit"]
+dev = ["setuptools_scm", "pre-commit"]

[build-system]
requires = ["setuptools>=61.0", "setuptools_scm[toml]>=6.2"]
@@ -25,6 +25,8 @@ build-backend = "setuptools.build_meta"
[tool.setuptools]
packages = ["neurve"]

+[tool.setuptools_scm]
+
[tool.black]
line-length = 79
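
With `dynamic = ["version"]` and an empty `[tool.setuptools_scm]` table, the version is computed from git metadata at build time instead of being hard-coded. The sketch below approximates what the default scheme produces for a clean working tree, as far as I understand it: an exact tag gives the tag's version, and commits past the tag give the next patch version with a `.devN+g<hash>` suffix. `describe_version` and `bump_patch` are hypothetical helpers, the hash is a placeholder, and dirty trees and other schemes are ignored.

```python
def bump_patch(version: str) -> str:
    """Increment the last component of a three-part version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def describe_version(tag_version: str, distance: int, node: str) -> str:
    """Approximate a derived version for a clean working tree.

    tag_version: version from the most recent tag, without the leading "v".
    distance: number of commits since that tag.
    node: abbreviated commit hash (placeholder values are fine for illustration).
    """
    if distance == 0:
        # Building exactly at the tag: the version is the tag itself.
        return tag_version
    # Past the tag: a development pre-release of the next patch version.
    return f"{bump_patch(tag_version)}.dev{distance}+g{node}"


if __name__ == "__main__":
    print(describe_version("0.1.0", 0, "abc1234"))
    print(describe_version("0.1.0", 3, "abc1234"))
```

This is why the publish workflow runs only on tags: a build made exactly at a tag gets a plain release version rather than a development one.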

