forked from facebookresearch/maskrcnn-benchmark
Commit 61ffdb3 (0 parents). Showing 135 changed files with 9,869 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
# This is an example .flake8 config, used when developing *Black* itself. | ||
# Keep in sync with setup.cfg which is used for source packages. | ||
|
||
[flake8] | ||
ignore = E203, E266, E501, W503 | ||
max-line-length = 80 | ||
max-complexity = 18 | ||
select = B,C,E,F,W,T4,B9 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# compilation and distribution | ||
__pycache__ | ||
_ext | ||
*.pyc | ||
*.so | ||
torch_detectron.egg-info/ | ||
torch_detectron/legacy/ | ||
build/ | ||
dist/ | ||
|
||
# pytorch/python/numpy formats | ||
*.pth | ||
*.pkl | ||
*.npy | ||
|
||
# ipython/jupyter notebooks | ||
*.ipynb | ||
|
||
# Editor temporaries | ||
*.swn | ||
*.swo | ||
*.swp | ||
*~ | ||
|
||
# project dirs | ||
/datasets | ||
/models |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
## Abstractions | ||
The main abstractions introduced by `maskrcnn_benchmark` that are useful to | ||
have in mind are the following: | ||
|
||
### ImageList | ||
In PyTorch, the first dimension of the input to the network generally represents | ||
the batch dimension, and thus all elements of the same batch have the same | ||
height / width. | ||
In order to support images with different sizes and aspect ratios in the same | ||
batch, we created the `ImageList` class, which holds internally a batch of | ||
images (of possibly different sizes). The images are padded with zeros such that | ||
they have the same final size and batched over the first dimension. The original | ||
sizes of the images before padding are stored in the `image_sizes` attribute, | ||
and the batched tensor in `tensors`. | ||
We provide a convenience function `to_image_list` that accepts a few different | ||
input types, including a list of tensors, and returns an `ImageList` object. | ||
|
||
```python | ||
import torch | ||
from maskrcnn_benchmark.structures.image_list import to_image_list | ||
|
||
images = [torch.rand(3, 100, 200), torch.rand(3, 150, 170)] | ||
batched_images = to_image_list(images) | ||
|
||
# it is also possible to make the final batched image be a multiple of a number | ||
batched_images_32 = to_image_list(images, size_divisible=32) | ||
``` | ||
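The padding rule described above can be sketched without PyTorch. This is a minimal illustration, not the library's implementation: `padded_batch_shape` is a hypothetical helper, and it assumes the batch shape is the element-wise maximum of the image shapes, with height and width rounded up to the next multiple of `size_divisible` when that value is positive.

```python
import math


def padded_batch_shape(shapes, size_divisible=0):
    """Return the (C, H, W) shape every image would be padded to.

    shapes: list of (channels, height, width) tuples.
    size_divisible: if > 0, round H and W up to a multiple of this value.
    """
    # element-wise maximum over all image shapes
    max_shape = [max(dims) for dims in zip(*shapes)]
    if size_divisible > 0:
        # round the spatial dimensions up, never down
        max_shape[1] = math.ceil(max_shape[1] / size_divisible) * size_divisible
        max_shape[2] = math.ceil(max_shape[2] / size_divisible) * size_divisible
    return tuple(max_shape)


shapes = [(3, 100, 200), (3, 150, 170)]
print(padded_batch_shape(shapes))                     # (3, 150, 200)
print(padded_batch_shape(shapes, size_divisible=32))  # (3, 160, 224)
```

Under these assumptions, the two example images would share a 150 x 200 canvas, or 160 x 224 when the size must be divisible by 32.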
|
||
### BoxList | ||
The `BoxList` class holds a set of bounding boxes (represented as a `Nx4` tensor) for | ||
a specific image, as well as the size of the image as a `(width, height)` tuple. | ||
It also provides a set of methods for applying geometric | ||
transformations to the bounding boxes (such as cropping, scaling and flipping). | ||
The class accepts bounding boxes in two different input formats: | ||
- `xyxy`, where each box is encoded as `x1`, `y1`, `x2` and `y2` coordinates | ||
- `xywh`, where each box is encoded as `x1`, `y1`, `w` and `h`. | ||
|
||
Additionally, each `BoxList` instance can also hold arbitrary additional information | ||
for each bounding box, such as labels, visibility, probability scores etc. | ||
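The relation between the two box encodings can be sketched in plain Python. The helper names are hypothetical, and the sketch assumes the continuous convention `w = x2 - x1`; some detection codebases use `w = x2 - x1 + 1` instead, so check the library source before relying on either.

```python
def xyxy_to_xywh(box):
    """Convert [x1, y1, x2, y2] to [x1, y1, w, h] (assumes w = x2 - x1)."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


def xywh_to_xyxy(box):
    """Convert [x1, y1, w, h] back to [x1, y1, x2, y2]."""
    x1, y1, w, h = box
    return [x1, y1, x1 + w, y1 + h]


box = [50, 20, 90, 60]
print(xyxy_to_xywh(box))                # [50, 20, 40, 40]
print(xywh_to_xyxy(xyxy_to_xywh(box)))  # [50, 20, 90, 60]
```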
|
||
Here is an example of how to create a `BoxList` from a list of coordinates: | ||
```python | ||
import torch | ||
from maskrcnn_benchmark.structures.bounding_box import BoxList, FLIP_LEFT_RIGHT | ||
|
||
width = 100 | ||
height = 200 | ||
boxes = [ | ||
[0, 10, 50, 50], | ||
[50, 20, 90, 60], | ||
[10, 10, 50, 50] | ||
] | ||
# create a BoxList with 3 boxes | ||
bbox = BoxList(boxes, size=(width, height), mode='xyxy') | ||
|
||
# perform some box transformations; the API is similar to PIL.Image | ||
bbox_scaled = bbox.resize((width * 2, height * 3)) | ||
bbox_flipped = bbox.transpose(FLIP_LEFT_RIGHT) | ||
|
||
# add labels for each bbox | ||
labels = torch.tensor([0, 10, 1]) | ||
bbox.add_field('labels', labels) | ||
|
||
# bbox also supports a few operations, like indexing | ||
# here, we select boxes 0 and 2 | ||
bbox_subset = bbox[[0, 2]] | ||
``` |
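The geometry behind resizing and horizontal flipping can be sketched on plain `xyxy` lists. This is an assumption-laden illustration, not the `BoxList` code: `resize_box` and `flip_box_left_right` are hypothetical helpers, and the flip uses the continuous convention `x' = width - x`, which may differ from the library by a one-pixel offset.

```python
def resize_box(box, old_size, new_size):
    """Scale an xyxy box from old_size to new_size, both (width, height)."""
    scale_x = new_size[0] / old_size[0]
    scale_y = new_size[1] / old_size[1]
    x1, y1, x2, y2 = box
    return [x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y]


def flip_box_left_right(box, width):
    """Mirror an xyxy box horizontally inside an image of the given width."""
    x1, y1, x2, y2 = box
    # the old right edge becomes the new left edge, and vice versa
    return [width - x2, y1, width - x1, y2]


width, height = 100, 200
box = [0, 10, 50, 50]
print(resize_box(box, (width, height), (width * 2, height * 3)))  # [0.0, 30.0, 100.0, 150.0]
print(flip_box_left_right(box, width))                            # [50, 10, 100, 50]
```

Flipping twice returns the original box, which is a quick sanity check for any flip convention.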
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Code of Conduct | ||
|
||
Facebook has adopted a Code of Conduct that we expect project participants to adhere to. | ||
Please read the [full text](https://code.fb.com/codeofconduct/) | ||
so that you can understand what actions will and will not be tolerated. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
# Contributing to Mask-RCNN Benchmark | ||
We want to make contributing to this project as easy and transparent as | ||
possible. | ||
|
||
## Our Development Process | ||
Minor changes and improvements will be released on an ongoing basis. Larger changes (e.g., changesets implementing a new paper) will be released on a more periodic basis. | ||
|
||
## Pull Requests | ||
We actively welcome your pull requests. | ||
|
||
1. Fork the repo and create your branch from `master`. | ||
2. If you've added code that should be tested, add tests. | ||
3. If you've changed APIs, update the documentation. | ||
4. Ensure the test suite passes. | ||
5. Make sure your code lints. | ||
6. If you haven't already, complete the Contributor License Agreement ("CLA"). | ||
|
||
## Contributor License Agreement ("CLA") | ||
In order to accept your pull request, we need you to submit a CLA. You only need | ||
to do this once to work on any of Facebook's open source projects. | ||
|
||
Complete your CLA here: <https://code.facebook.com/cla> | ||
|
||
## Issues | ||
We use GitHub issues to track public bugs. Please ensure your description is | ||
clear and has sufficient instructions to be able to reproduce the issue. | ||
|
||
Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe | ||
disclosure of security bugs. In those cases, please go through the process | ||
outlined on that page and do not file a public issue. | ||
|
||
## Coding Style | ||
* 4 spaces for indentation rather than tabs | ||
* 80 character line length | ||
* PEP8 formatting following [Black](https://black.readthedocs.io/en/stable/) | ||
|
||
## License | ||
By contributing to Mask-RCNN Benchmark, you agree that your contributions will be licensed | ||
under the LICENSE file in the root directory of this source tree. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
## Installation | ||
|
||
### Requirements: | ||
- PyTorch 1.0 from a nightly release. Installation instructions can be found at https://pytorch.org/get-started/locally/ | ||
- torchvision from master | ||
- cocoapi | ||
- yacs | ||
- (optional) OpenCV for the webcam demo | ||
|
||
|
||
### Step-by-step installation | ||
|
||
```bash | ||
# maskrcnn_benchmark and coco api dependencies | ||
pip install ninja yacs cython | ||
|
||
# follow PyTorch installation in https://pytorch.org/get-started/locally/ | ||
# we give the instructions for CUDA 9.0 | ||
conda install pytorch-nightly -c pytorch | ||
|
||
# install torchvision | ||
cd ~/github | ||
git clone git@github.com:pytorch/vision.git | ||
cd vision | ||
python setup.py install | ||
|
||
# install pycocotools | ||
cd ~/github | ||
git clone git@github.com:cocodataset/cocoapi.git | ||
cd cocoapi/PythonAPI | ||
python setup.py build_ext install | ||
|
||
# install PyTorch Detection | ||
cd ~/github | ||
git clone git@github.com:facebookresearch/maskrcnn-benchmark.git | ||
cd maskrcnn-benchmark | ||
# the following will install the lib with | ||
# symbolic links, so that you can modify | ||
# the files if you want and won't need to | ||
# re-build it | ||
python setup.py build develop | ||
|
||
# or if you are on macOS | ||
# MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py build develop | ||
``` | ||
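After following the steps above, a quick way to see which requirements are importable is sketched below. The `find_missing` helper is hypothetical; the module names are the usual import names of the listed packages (`pycocotools` for cocoapi, `cv2` for OpenCV). The script only checks that each module can be located, not that its version is suitable.

```python
import importlib.util

# import names of the packages listed under Requirements
REQUIRED = ["torch", "torchvision", "pycocotools", "yacs"]
OPTIONAL = ["cv2"]  # OpenCV, only needed for the webcam demo


def find_missing(module_names):
    """Return the module names that cannot be located in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


print("missing required modules:", find_missing(REQUIRED))
print("missing optional modules:", find_missing(OPTIONAL))
```

An empty list for the required modules suggests the environment is ready for `python setup.py build develop`.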
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
MIT License | ||
|
||
Copyright (c) 2018 Facebook | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
## Model Zoo and Baselines | ||
|
||
### Hardware | ||
- 8 NVIDIA V100 GPUs | ||
|
||
### Software | ||
- PyTorch version: 1.0.0a0+dd2c487 | ||
- CUDA 9.2 | ||
- CUDNN 7.1 | ||
- NCCL 2.2.13-1 | ||
|
||
### End-to-end Faster and Mask R-CNN baselines | ||
|
||
All the baselines were trained using the exact same experimental setup as in Detectron. | ||
We initialize the detection models with ImageNet weights from Caffe2, the same as used by Detectron. | ||
|
||
The pre-trained models can be downloaded via the links in the model id column. | ||
|
||
backbone | type | lr sched | im / gpu | train mem(GB) | train time (s/iter) | total train time(hr) | inference time(s/im) | box AP | mask AP | model id | ||
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | ||
R-50-C4 | Faster | 1x | 1 | 5.8 | 0.4036 | 20.2 | 0.17130 | 34.8 | - | [6358800](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_R_50_C4_1x.pth) | ||
R-50-FPN | Faster | 1x | 2 | 4.4 | 0.3530 | 8.8 | 0.12580 | 36.8 | - | [6358793](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_R_50_FPN_1x.pth) | ||
R-101-FPN | Faster | 1x | 2 | 7.1 | 0.4591 | 11.5 | 0.143149 | 39.1 | - | [6358804](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_R_101_FPN_1x.pth) | ||
X-101-32x8d-FPN | Faster | 1x | 1 | 7.6 | 0.7007 | 35.0 | 0.209965 | 41.2 | - | [6358717](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_X_101_32x8d_FPN_1x.pth) | ||
R-50-C4 | Mask | 1x | 1 | 5.8 | 0.4520 | 22.6 | 0.17796 + 0.028 | 35.6 | 31.5 | [6358801](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_R_50_C4_1x.pth) | ||
R-50-FPN | Mask | 1x | 2 | 5.2 | 0.4536 | 11.3 | 0.12966 + 0.034 | 37.8 | 34.2 | [6358792](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_R_50_FPN_1x.pth) | ||
R-101-FPN | Mask | 1x | 2 | 7.9 | 0.5665 | 14.2 | 0.15384 + 0.034 | 40.1 | 36.1 | [6358805](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_R_101_FPN_1x.pth) | ||
X-101-32x8d-FPN | Mask | 1x | 1 | 7.8 | 0.7562 | 37.8 | 0.21739 + 0.034 | 42.2 | 37.8 | [6358718](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_X_101_32x8d_FPN_1x.pth) | ||
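The time columns in the table above are related by simple arithmetic: total hours times 3600, divided by seconds per iteration, gives the number of iterations a run took. The sketch below back-computes that count for a few rows. The `implied_iterations` helper is hypothetical, and the iteration counts it prints are inferred from the rounded table values, not stated by the source.

```python
def implied_iterations(total_hours, seconds_per_iter):
    """Estimate the iteration count from total train time and time per iteration."""
    return total_hours * 3600.0 / seconds_per_iter


# (total train time in hours, train time in s/iter, images per GPU), copied from the table
rows = {
    "e2e_faster_rcnn_R_50_C4_1x": (20.2, 0.4036, 1),
    "e2e_faster_rcnn_R_50_FPN_1x": (8.8, 0.3530, 2),
    "e2e_mask_rcnn_R_50_C4_1x": (22.6, 0.4520, 1),
    "e2e_mask_rcnn_R_50_FPN_1x": (11.3, 0.4536, 2),
}

for name, (hours, sec_per_iter, im_per_gpu) in rows.items():
    iters = implied_iterations(hours, sec_per_iter)
    print(f"{name}: about {iters:,.0f} iterations at {im_per_gpu} im / gpu")
```

For these rows, the runs with 1 image per GPU come out at roughly twice the iterations of the runs with 2 images per GPU, which is what one would expect if every model sees the same total number of images.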
|
||
|
||
## Comparison with Detectron and mmdetection | ||
|
||
In the following section, we compare our implementation with [Detectron](https://github.com/facebookresearch/Detectron) | ||
and [mmdetection](https://github.com/open-mmlab/mmdetection). | ||
The same remarks from [mmdetection](https://github.com/open-mmlab/mmdetection/blob/master/MODEL_ZOO.md#training-speed) | ||
about different hardware apply here. | ||
|
||
### Training speed | ||
|
||
The numbers here are in seconds / iteration. The lower, the better. | ||
|
||
type | Detectron (P100) | mmdetection (V100) | maskrcnn_benchmark (V100) | ||
-- | -- | -- | -- | ||
Faster R-CNN R-50 C4 | 0.566 | - | 0.4036 | ||
Faster R-CNN R-50 FPN | 0.544 | 0.554 | 0.3530 | ||
Faster R-CNN R-101 FPN | 0.647 | - | 0.4591 | ||
Faster R-CNN X-101-32x8d FPN | 0.799 | - | 0.7007 | ||
Mask R-CNN R-50 C4 | 0.620 | - | 0.4520 | ||
Mask R-CNN R-50 FPN | 0.889 | 0.690 | 0.4536 | ||
Mask R-CNN R-101 FPN | 1.008 | - | 0.5665 | ||
Mask R-CNN X-101-32x8d FPN | 0.961 | - | 0.7562 | ||
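To read the training-speed table as relative speed, the Detectron time can be divided by the maskrcnn_benchmark time for each row. The sketch below does this with the numbers copied from the table; `speed_ratio` is a hypothetical helper. Because the two columns were measured on different GPUs (P100 versus V100), the ratios mix implementation and hardware effects and should be read as rough indications only.

```python
def speed_ratio(detectron_sec, benchmark_sec):
    """Ratio of seconds per iteration; values above 1 mean maskrcnn_benchmark is faster."""
    return detectron_sec / benchmark_sec


# (Detectron on P100, maskrcnn_benchmark on V100), in seconds per iteration
SECONDS_PER_ITER = {
    "Faster R-CNN R-50 C4": (0.566, 0.4036),
    "Faster R-CNN R-50 FPN": (0.544, 0.3530),
    "Faster R-CNN R-101 FPN": (0.647, 0.4591),
    "Faster R-CNN X-101-32x8d FPN": (0.799, 0.7007),
    "Mask R-CNN R-50 C4": (0.620, 0.4520),
    "Mask R-CNN R-50 FPN": (0.889, 0.4536),
    "Mask R-CNN R-101 FPN": (1.008, 0.5665),
    "Mask R-CNN X-101-32x8d FPN": (0.961, 0.7562),
}

for model, (detectron, benchmark) in SECONDS_PER_ITER.items():
    print(f"{model}: {speed_ratio(detectron, benchmark):.2f}x")
```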
|
||
### Training memory | ||
|
||
The numbers here are in GB. The lower, the better. | ||
|
||
type | Detectron (P100) | mmdetection (V100) | maskrcnn_benchmark (V100) | ||
-- | -- | -- | -- | ||
Faster R-CNN R-50 C4 | 6.3 | - | 5.8 | ||
Faster R-CNN R-50 FPN | 7.2 | 4.9 | 4.4 | ||
Faster R-CNN R-101 FPN | 8.9 | - | 7.1 | ||
Faster R-CNN X-101-32x8d FPN | 7.0 | - | 7.6 | ||
Mask R-CNN R-50 C4 | 6.6 | - | 5.8 | ||
Mask R-CNN R-50 FPN | 8.6 | 5.9 | 5.2 | ||
Mask R-CNN R-101 FPN | 10.2 | - | 7.9 | ||
Mask R-CNN X-101-32x8d FPN | 7.7 | - | 7.8 | ||
|
||
### Accuracy | ||
|
||
The numbers here are box AP; for Mask R-CNN they are given as box AP & mask AP. The higher, the better. | ||
|
||
type | Detectron (P100) | mmdetection (V100) | maskrcnn_benchmark (V100) | ||
-- | -- | -- | -- | ||
Faster R-CNN R-50 C4 | 34.8 | - | 34.8 | ||
Faster R-CNN R-50 FPN | 36.7 | 36.7 | 36.8 | ||
Faster R-CNN R-101 FPN | 39.4 | - | 39.1 | ||
Faster R-CNN X-101-32x8d FPN | 41.3 | - | 41.2 | ||
Mask R-CNN R-50 C4 | 35.8 & 31.4 | - | 35.6 & 31.5 | ||
Mask R-CNN R-50 FPN | 37.7 & 33.9 | 37.5 & 34.4 | 37.8 & 34.2 | ||
Mask R-CNN R-101 FPN | 40.0 & 35.9 | - | 40.1 & 36.1 | ||
Mask R-CNN X-101-32x8d FPN | 42.1 & 37.3 | - | 42.2 & 37.8 | ||
|