The anchor-free models used in this benchmark are based on the centermask2 architecture, and the anchor-based models are based on the detectron2-ResNeSt architecture. Both are built on the detectron2 library.
The models in the LIVECell paper were trained on 8 Nvidia V100 GPUs. To help others reproduce our results and use the models for further research, we provide pre-trained models and config files.
Architecture | Dataset | Box mAP% | Mask mAP% | Download (config) | Download (model)
---|---|---|---|---|---
Anchor free | LIVECell | 48.45 | 47.78 | config | model
Anchor free | A172 | 31.49 | 34.57 | config | model
Anchor free | BT-474 | 42.12 | 42.60 | config | model
Anchor free | BV-2 | 42.62 | 45.69 | config | model
Anchor free | Huh7 | 42.44 | 45.85 | config | model
Anchor free | MCF7 | 36.53 | 37.30 | config | model
Anchor free | SH-SY5Y | 25.20 | 23.91 | config | model
Anchor free | SkBr3 | 64.35 | 65.85 | config | model
Anchor free | SK-OV-3 | 46.43 | 49.39 | config | model
Anchor based | LIVECell | 48.43 | 47.89 | config | model
Anchor based | A172 | 36.37 | 38.02 | config | model
Anchor based | BT-474 | 43.25 | 43.00 | config | model
Anchor based | BV-2 | 54.36 | 52.60 | config | model
Anchor based | Huh7 | 52.79 | 51.83 | config | model
Anchor based | MCF7 | 37.53 | 37.94 | config | model
Anchor based | SH-SY5Y | 27.87 | 24.92 | config | model
Anchor based | SkBr3 | 64.41 | 65.39 | config | model
Anchor based | SK-OV-3 | 53.29 | 54.12 | config | model
The box and mask AP values presented here are obtained by training on either the whole LIVECell dataset or a cell-type-specific subset, and then evaluating on the corresponding test dataset.
To use our fully trained models, download them from our S3 bucket and use them together with the appropriate config file, as described below in the [training and evaluation section](#training-and-evaluation).
The installation takes approximately 30 minutes.
- Linux or macOS with Python ≥ 3.6
- PyTorch ≥ 1.3
- torchvision that matches the PyTorch installation. You can install them together at pytorch.org to make sure of this.
- OpenCV, optional, needed by demo and visualization
- pycocotools: pip install cython; pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
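Before building detectron2, it can help to confirm that the requirements listed above are visible to the Python interpreter you intend to use. The following is a minimal sketch and not part of the LIVECell code: the helper name `check_requirements` is hypothetical, and it only tests that each module can be located, not that the torch and torchvision versions match each other.

```python
import importlib.util
import sys

# Import names of the packages listed above; cv2 is the import name of OpenCV.
REQUIRED = ["torch", "torchvision", "pycocotools"]
OPTIONAL = ["cv2"]


def check_requirements():
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {"python>=3.6": sys.version_info >= (3, 6)}
    for name in REQUIRED + OPTIONAL:
        # find_spec returns None when a top-level module cannot be located.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        kind = "optional" if name in OPTIONAL else "required"
        print(f"{name:<12} {kind:<9} {'ok' if ok else 'MISSING'}")
```

A missing optional entry (OpenCV) only affects the demo and visualization, while a missing required entry needs to be installed before continuing.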
Build from source
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
Or, to install it from a local clone:
git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2
Or, if you are on macOS:
CC=clang CXX=clang++ python -m pip install -e detectron2
To install a pre-built detectron2 for different torch and CUDA versions, and for further information, see the detectron2 install document.
Retrieve the centermask2 code:
git clone https://github.com/youngwanLEE/centermask2.git
For further information on installation and usage, see the centermask2 documentation.
Retrieve the detectron2-ResNeSt code:
git clone https://github.com/chongruo/detectron2-ResNeSt
For further information on installation and usage, see the detectron2-ResNeSt documentation.
- "Not compiled with GPU support" or "Detectron2 CUDA Compiler: not available".
CUDA was not found when building detectron2. Make sure that
python -c 'import torch; from torch.utils.cpp_extension import CUDA_HOME; print(torch.cuda.is_available(), CUDA_HOME)'
prints valid outputs at the time you build detectron2.
To use a custom dataset such as LIVECell with the detectron2 code base, first register the dataset via the detectron2 Python API (see https://detectron2.readthedocs.io/en/latest/tutorials/datasets.html). In practice, this can be done by adding the following code to the train_net.py file in the cloned centermask2 repo:
from detectron2.data.datasets import register_coco_instances
# Replace the dataset name and the two paths with your own values.
register_coco_instances("dataset_name", {}, "/path/to/coco/annotations.json", "/path/to/image/dir")
Here, dataset_name is the name of your dataset, and it is the name you use to select the dataset in your config file. By default, the config files point to TRAIN and TEST, so registering a test dataset as TEST will work directly with the provided config files. For other names, make sure to update your config file accordingly.
- In the config file, change the dataset entries to the name used to register the dataset.
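Since the provided config files expect datasets named TRAIN and TEST, both splits can be registered in one place. The sketch below is one possible way to do this, not the authors' code: the helper names and the directory layout (`annotations/train.json`, `images/train`, and so on) are hypothetical placeholders to be replaced with your own paths. Only `register_coco_instances` comes from detectron2.

```python
from pathlib import Path


def livecell_registrations(root):
    """Build register_coco_instances arguments for the TRAIN and TEST splits.

    The directory layout and file names below are hypothetical placeholders;
    adjust them to wherever your annotations and images are stored.
    """
    root = Path(root)
    return [
        # (name, metadata, COCO annotation file, image directory)
        ("TRAIN", {}, str(root / "annotations" / "train.json"), str(root / "images" / "train")),
        ("TEST", {}, str(root / "annotations" / "test.json"), str(root / "images" / "test")),
    ]


def register_livecell(root):
    """Register both splits with detectron2 (requires detectron2 to be installed)."""
    # Imported lazily so the path helper above can be used without detectron2.
    from detectron2.data.datasets import register_coco_instances

    for name, metadata, json_file, image_root in livecell_registrations(root):
        register_coco_instances(name, metadata, json_file, image_root)
```

Calling `register_livecell("/data/livecell")` near the top of train_net.py would then make both names available to the config file, assuming that root directory matches your setup.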
- Set the output directory in the config file to save the models and results.
To train a model, change the OUTPUT directory in the config file to where the models and checkpoints should be saved. Make sure you have followed the previous step and registered a TRAIN and a TEST dataset, then cd into the cloned directory (centermask2 or detectron2-ResNeSt) and run the following code:
python tools/train_net.py --num-gpus 8 --config-file your_config.yaml
This trains a model on the dataset defined in your_config.yaml using 8 GPUs.
To fine-tune a model on your own dataset, set MODEL.WEIGHTS in the config file to point at one of our weight files. For instance, to fine-tune our centermask2 model:
MODEL:
  WEIGHTS: "http://livecell-dataset.s3.eu-central-1.amazonaws.com/LIVECell_dataset_2021/models/Anchor_free/ALL/LIVECell_anchor_free_model.pth"
To evaluate a model, register a TEST dataset, point to it in your config file, and cd into the cloned directory (centermask2 or detectron2-ResNeSt). Then run the following code:
python train_net.py --config-file <your_config.yaml> --eval-only MODEL.WEIGHTS </path/to/checkpoint_file.pth>
This will evaluate the model defined in your_config.yaml with the weights saved in /path/to/checkpoint_file.pth.
To evaluate one of our models, such as the anchor-free centermask2 model, you can point directly at the URL of the weight file:
python train_net.py --config-file livecell_config.yaml --eval-only MODEL.WEIGHTS http://livecell-dataset.s3.eu-central-1.amazonaws.com/LIVECell_dataset_2021/models/Anchor_free/ALL/LIVECell_anchor_free_model.pth
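The training and evaluation invocations above differ only in a few arguments, so they can be assembled programmatically when scripting several runs. The following is a minimal sketch under that assumption. The helper names `build_command` and `to_shell` are hypothetical, while the flags and the trailing `MODEL.WEIGHTS` override are the ones used in the commands above.

```python
import shlex

WEIGHTS_URL = ("http://livecell-dataset.s3.eu-central-1.amazonaws.com/"
               "LIVECell_dataset_2021/models/Anchor_free/ALL/"
               "LIVECell_anchor_free_model.pth")


def build_command(config_file, num_gpus=None, eval_only=False, weights=None,
                  script="train_net.py"):
    """Assemble a train_net.py invocation as a list of arguments."""
    cmd = ["python", script]
    if num_gpus is not None:
        cmd += ["--num-gpus", str(num_gpus)]
    cmd += ["--config-file", config_file]
    if eval_only:
        cmd.append("--eval-only")
    if weights is not None:
        # Trailing KEY VALUE pairs override entries in the config file.
        cmd += ["MODEL.WEIGHTS", weights]
    return cmd


def to_shell(cmd):
    """Render an argument list as a single, safely quoted shell line."""
    return " ".join(shlex.quote(arg) for arg in cmd)


train = build_command("your_config.yaml", num_gpus=8, script="tools/train_net.py")
evaluate = build_command("livecell_config.yaml", eval_only=True, weights=WEIGHTS_URL)
print(to_shell(train))
print(to_shell(evaluate))
```

The returned lists can also be passed directly to `subprocess.run(cmd, check=True)` from inside the cloned repository, which avoids shell quoting altogether.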
The original evaluation script available in the centermask2 and detectron2 repos assumes that there are no more than 100 detections in an image. In our case, an image can have thousands of annotations, so the AP evaluation would be off. We therefore provide the coco_evaluation.py evaluation script in the code folder.
To use this script, go into the train_net.py file and remove (or comment out) the current import of COCOEvaluator. Then import COCOEvaluator from the provided coco_evaluation.py file instead. This results in AP evaluation supporting up to 2000 instances in one image.
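To see why the detection limit matters, consider that each kept detection can match at most one ground-truth instance, so the limit caps the recall that is achievable on a crowded image. The sketch below illustrates this with a simplified per-image bound. The function name and the figure of 1500 instances are hypothetical, chosen only for illustration.

```python
def recall_upper_bound(num_ground_truth, max_dets):
    """Best achievable recall for one image when only max_dets detections are kept."""
    if num_ground_truth == 0:
        return 1.0
    # Each kept detection can match at most one ground-truth instance.
    return min(1.0, max_dets / num_ground_truth)


# Hypothetical image with 1500 annotated cells.
for max_dets in (100, 500, 2000):
    bound = recall_upper_bound(num_ground_truth=1500, max_dets=max_dets)
    print(f"maxDets={max_dets:>4}: recall <= {bound:.3f}")
```

With a limit of 100, at most about 7% of the cells in such an image could ever be counted as found, regardless of model quality, which is why the limit is raised to 2000.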
The evaluation script takes approximately 30 minutes to run on our test dataset with a Tesla V100 GPU. The output of the evaluation appears in the terminal, beginning with information about the environment, data and architecture used. It then evaluates all the images and summarizes the results in the following manner:
.
.
.
[11/18 17:19:06 d2.evaluation.evaluator]: Inference done 1557/1564. 0.1733 s / img. ETA=0:00:06
[11/18 17:19:11 d2.evaluation.evaluator]: Inference done 1561/1564. 0.1734 s / img. ETA=0:00:02
[11/18 17:19:14 d2.evaluation.evaluator]: Total inference time: 0:22:23.437057 (0.861730 s / img per device, on 1 devices)
[11/18 17:19:14 d2.evaluation.evaluator]: Total inference pure compute time: 0:04:30 (0.173426 s / img per device, on 1 devices)
Loading and preparing results...
DONE (t=1.12s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Running per image evaluation...
Evaluate annotation type *bbox*
COCOeval_opt.evaluate() finished in 119.67 seconds.
Accumulating evaluation results...
COCOeval_opt.accumulate() finished in 5.86 seconds.
In method
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=2000 ] = 0.485
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=2000 ] = 0.830
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=2000 ] = 0.504
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.483
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.494
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.507
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.212
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=500 ] = 0.480
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=2000 ] = 0.569
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.531
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.602
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.672
Loading and preparing results...
DONE (t=11.04s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Running per image evaluation...
Evaluate annotation type *segm*
COCOeval_opt.evaluate() finished in 135.80 seconds.
Accumulating evaluation results...
COCOeval_opt.accumulate() finished in 5.78 seconds.
In method
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=2000 ] = 0.478
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=2000 ] = 0.816
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=2000 ] = 0.509
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.451
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.491
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.570
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.210
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=500 ] = 0.470
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=2000 ] = 0.547
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.516
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.565
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.649
[11/18 17:25:07 d2.engine.defaults]: Evaluation results for cell_phase_test in csv format:
[11/18 17:25:07 d2.evaluation.testing]: copypaste: Task: bbox
[11/18 17:25:07 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/18 17:25:07 d2.evaluation.testing]: copypaste: 48.4529,82.9806,50.4426,48.3240,49.4476,50.7434
[11/18 17:25:07 d2.evaluation.testing]: copypaste: Task: segm
[11/18 17:25:07 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/18 17:25:07 d2.evaluation.testing]: copypaste: 47.7810,81.6260,50.8958,45.1110,49.0684,56.9874
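When comparing several models, it can be convenient to collect the final "copypaste" lines of this output into a dictionary instead of reading them by eye. The parser below is a minimal sketch written for the log format shown above. The function name `parse_copypaste` is hypothetical, and the parser assumes each task block consists of a `Task:` line, a header line and one line of values.

```python
import re

# Matches ANSI colour codes such as "\x1b[32m" that may be present in terminal logs.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def parse_copypaste(log_text):
    """Return {task: {metric: value}} from the 'copypaste:' lines of a log."""
    results = {}
    task, names = None, None
    for line in log_text.splitlines():
        line = ANSI_RE.sub("", line)
        if "copypaste:" not in line:
            continue
        payload = line.split("copypaste:", 1)[1].strip()
        if payload.startswith("Task:"):
            # A new task block starts; the next line holds the metric names.
            task, names = payload.split(":", 1)[1].strip(), None
        elif names is None:
            names = payload.split(",")
        else:
            results[task] = dict(zip(names, map(float, payload.split(","))))
    return results


sample = """\
[11/18 17:25:07 d2.evaluation.testing]: copypaste: Task: bbox
[11/18 17:25:07 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/18 17:25:07 d2.evaluation.testing]: copypaste: 48.4529,82.9806,50.4426,48.3240,49.4476,50.7434
[11/18 17:25:07 d2.evaluation.testing]: copypaste: Task: segm
[11/18 17:25:07 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[11/18 17:25:07 d2.evaluation.testing]: copypaste: 47.7810,81.6260,50.8958,45.1110,49.0684,56.9874
"""
metrics = parse_copypaste(sample)
print(metrics["bbox"]["AP"], metrics["segm"]["AP"])
```

The `AP` entries for `bbox` and `segm` correspond to the Box mAP% and Mask mAP% columns reported for the anchor-free LIVECell model.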
For further details on training, testing and inference, visit the centermask2 or detectron2-ResNeSt docs.
For the LIVECell experiments with zero-shot learning of EVICAN and Cellpose, the input images were preprocessed using the preprocessing script preprocessing.py found in the code folder.