
NvPluginFasterRCNN Plugin

Table Of Contents

Description
- Structure
Parameters
Additional resources
License
Changelog
Known issues

Description

The NvPluginFasterRCNN performs object detection for the Faster R-CNN model. This plugin is included in TensorRT and used in sampleFasterRCNN to perform inference.

NvPluginFasterRCNN decodes predicted bounding boxes, extracts their corresponding objectness scores, selects regions of interest from the predicted bounding boxes using non-maximum suppression (NMS), and extracts the feature map of each region of interest (ROI) using ROI pooling for downstream object classification tasks.

This plugin is optimized for the above steps and it allows you to do Faster R-CNN inference in TensorRT.

Structure

The NvPluginFasterRCNN takes four inputs: `scores`, `deltas`, `fmap`, and `iinfo`.

`scores`: Bounding box (region proposal) objectness scores. `scores` has shape [N, A x 2, H, W], where N is the batch size, A is the number of anchor boxes per pixel on the feature map, H is the height of the feature map, and W is the width of the feature map. The second dimension is A x 2 because Faster R-CNN uses binary Softmax (the probability of containing an object and the probability of not containing an object) to classify the objectness of each bounding box.

`deltas`: Predicted bounding box offsets. `deltas` has shape [N, A x 4, H, W], where N is the batch size, A is the number of anchor boxes per pixel on the feature map, H is the height of the feature map, and W is the width of the feature map. The second dimension is A x 4 because each anchor box or bounding box consists of four parameters.

`fmap`: Feature map used for bounding box regression and classification. `fmap` has shape [N, C, H, W], where N is the batch size, C is the number of channels in the feature map, H is the height of the feature map, and W is the width of the feature map.

`iinfo`: Original image input information. `iinfo` has shape [N, 3], where N is the batch size and the 3 values are the height, width, and resize scale (the same as featureStride) of the original input image.

The NvPluginFasterRCNN generates the following two outputs:

`rois`: Coordinates of the region of interest bounding boxes on the original input image. `rois` has shape [N, 1, nmsMaxOut, 4], where N is the batch size, nmsMaxOut is the maximum number of region of interest bounding boxes, and 4 represents the region of interest bounding box coordinates [x_1, y_1, x_2, y_2]. Here, x_1 and y_1 are the coordinates of the top-left corner of the bounding box, and x_2 and y_2 are the coordinates of the bottom-right corner.

`pfmap`: ROI pooled feature map corresponding to each region of interest. `pfmap` has shape [N, nmsMaxOut, C, poolingH, poolingW], where N is the batch size, nmsMaxOut is the maximum number of region of interest bounding boxes, C is the number of channels in the feature map, poolingH is the height of the ROI pooled feature map, and poolingW is the width of the ROI pooled feature map.
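
As a quick check of the tensor shapes described above, the following sketch computes them from the dimensions named in the text. `RpnDims`, `PluginShapes`, and `describeShapes` are hypothetical names introduced for illustration only; they are not part of the plugin or the TensorRT API.

```cpp
#include <array>

// Hypothetical helper types for illustration; not part of the plugin API.
struct RpnDims
{
    int N; // batch size
    int A; // anchor boxes per feature-map pixel
    int C; // feature-map channels
    int H; // feature-map height
    int W; // feature-map width
};

struct PluginShapes
{
    std::array<int, 4> scores; // [N, A x 2, H, W]
    std::array<int, 4> deltas; // [N, A x 4, H, W]
    std::array<int, 4> fmap;   // [N, C, H, W]
    std::array<int, 2> iinfo;  // [N, 3]
    std::array<int, 4> rois;   // [N, 1, nmsMaxOut, 4]
    std::array<int, 5> pfmap;  // [N, nmsMaxOut, C, poolingH, poolingW]
};

// Derive every input and output shape from the quantities used in the text.
PluginShapes describeShapes(const RpnDims& d, int nmsMaxOut, int poolingH, int poolingW)
{
    PluginShapes s{};
    s.scores = {d.N, d.A * 2, d.H, d.W};
    s.deltas = {d.N, d.A * 4, d.H, d.W};
    s.fmap = {d.N, d.C, d.H, d.W};
    s.iinfo = {d.N, 3};
    s.rois = {d.N, 1, nmsMaxOut, 4};
    s.pfmap = {d.N, nmsMaxOut, d.C, poolingH, poolingW};
    return s;
}
```

For example, calling `describeShapes(RpnDims{1, 9, 512, 50, 100}, 300, 7, 7)` with these arbitrary illustrative values gives a `scores` shape of [1, 18, 50, 100] and a `pfmap` shape of [1, 300, 512, 7, 7].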

NvPluginFasterRCNN essentially does region proposal inference followed by region of interest (ROI) pooling.

The proposal inference step consists of three sub-steps: extracting objectness scores from the scores input, decoding predicted bounding boxes from the deltas input, and applying non-maximum suppression to the decoded bounding boxes, ranked by their objectness scores, to obtain the region of interest bounding boxes.
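
The decoding and suppression steps above can be sketched on the CPU as follows. This is a minimal illustration, not the plugin's implementation: it assumes the center/size offset parameterization from the Faster R-CNN paper and a plain greedy NMS, and the plugin's exact arithmetic (for example, pixel-inclusive widths or box clipping) may differ. `Box`, `decodeBox`, `iou`, and `greedyNms` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

struct Box { float x1, y1, x2, y2; }; // top-left and bottom-right corners

// Decode one predicted box from its anchor and four regression offsets
// (assumed parameterization: center shift scaled by anchor size, log-space size change).
Box decodeBox(const Box& anchor, float dx, float dy, float dw, float dh)
{
    const float w = anchor.x2 - anchor.x1;
    const float h = anchor.y2 - anchor.y1;
    const float cx = anchor.x1 + 0.5f * w;
    const float cy = anchor.y1 + 0.5f * h;
    const float pcx = dx * w + cx;
    const float pcy = dy * h + cy;
    const float pw = std::exp(dw) * w;
    const float ph = std::exp(dh) * h;
    return {pcx - 0.5f * pw, pcy - 0.5f * ph, pcx + 0.5f * pw, pcy + 0.5f * ph};
}

// Intersection over union of two boxes.
float iou(const Box& a, const Box& b)
{
    const float iw = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float ih = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = iw * ih;
    const float uni = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: sort by objectness, keep the top preNmsTop candidates, then keep
// up to nmsMaxOut boxes whose IoU with every already-kept box is <= iouThreshold.
std::vector<int> greedyNms(const std::vector<Box>& boxes, const std::vector<float>& scores,
                           int preNmsTop, int nmsMaxOut, float iouThreshold)
{
    std::vector<int> order(boxes.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return scores[a] > scores[b]; });
    if (preNmsTop >= 0 && order.size() > static_cast<std::size_t>(preNmsTop))
        order.resize(static_cast<std::size_t>(preNmsTop));

    std::vector<int> keep;
    for (int idx : order)
    {
        if (static_cast<int>(keep.size()) >= nmsMaxOut)
            break;
        bool suppressed = false;
        for (int k : keep)
        {
            if (iou(boxes[idx], boxes[k]) > iouThreshold)
            {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
            keep.push_back(idx);
    }
    return keep;
}
```

The returned indices refer to the input `boxes` vector and are ordered from highest to lowest objectness. In this sketch, `scores` is assumed to already hold the per-box foreground probability.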

The ROI pooling step uses the inferred region of interest bounding boxes to extract the corresponding regions of the feature map, and applies ROI pooling to produce uniformly shaped features from differently shaped region of interest bounding boxes.
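
To make the ROI pooling step concrete, here is a simplified single-channel sketch of max pooling one ROI into a fixed poolingH x poolingW grid. It follows the general scheme of the ROI pooling definition from Fast R-CNN (project the ROI onto the feature map, split it into bins, take the maximum per bin); the rounding and bin-boundary choices here are assumptions and may not match the plugin exactly. `roiMaxPool` is a hypothetical function name.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Max-pool one ROI on a single-channel, row-major H x W feature map.
// ROI corners are given in original-image coordinates and are projected
// onto the feature map with spatialScale (= 1.0 / featureStride).
std::vector<float> roiMaxPool(const std::vector<float>& fmap, int H, int W,
                              float x1, float y1, float x2, float y2,
                              float spatialScale, int poolingH, int poolingW)
{
    const int rx1 = static_cast<int>(std::round(x1 * spatialScale));
    const int ry1 = static_cast<int>(std::round(y1 * spatialScale));
    const int rx2 = static_cast<int>(std::round(x2 * spatialScale));
    const int ry2 = static_cast<int>(std::round(y2 * spatialScale));

    // Force the projected ROI to cover at least one feature-map cell.
    const int roiW = std::max(rx2 - rx1 + 1, 1);
    const int roiH = std::max(ry2 - ry1 + 1, 1);
    const float binW = static_cast<float>(roiW) / static_cast<float>(poolingW);
    const float binH = static_cast<float>(roiH) / static_cast<float>(poolingH);

    std::vector<float> out(static_cast<std::size_t>(poolingH) * poolingW, 0.0f);
    for (int ph = 0; ph < poolingH; ++ph)
    {
        for (int pw = 0; pw < poolingW; ++pw)
        {
            // Bin extent on the feature map, clamped to the map bounds.
            int hs = static_cast<int>(std::floor(ph * binH)) + ry1;
            int he = static_cast<int>(std::ceil((ph + 1) * binH)) + ry1;
            int ws = static_cast<int>(std::floor(pw * binW)) + rx1;
            int we = static_cast<int>(std::ceil((pw + 1) * binW)) + rx1;
            hs = std::min(std::max(hs, 0), H);
            he = std::min(std::max(he, 0), H);
            ws = std::min(std::max(ws, 0), W);
            we = std::min(std::max(we, 0), W);
            if (he <= hs || we <= ws)
                continue; // empty bin keeps the default value 0

            float m = fmap[static_cast<std::size_t>(hs) * W + ws];
            for (int h = hs; h < he; ++h)
                for (int w = ws; w < we; ++w)
                    m = std::max(m, fmap[static_cast<std::size_t>(h) * W + w]);
            out[static_cast<std::size_t>(ph) * poolingW + pw] = m;
        }
    }
    return out;
}
```

Whatever the size of the ROI, the output always has poolingH x poolingW elements, which is what lets the downstream classifier consume regions of different shapes.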

Parameters

NvPluginFasterRCNN has plugin creator class RPROIPluginCreator and plugin class RPROIPlugin.

The RPROIParams data structure is used to create an RPROIPlugin instance. The data structure is defined below and consists of the following attributes:

struct RPROIParams
{
	int poolingH, poolingW, featureStride, preNmsTop,
		nmsMaxOut, anchorsRatioCount, anchorsScaleCount;
	float iouThreshold, minBoxSize, spatialScale;
};

Type	Parameter	Description
`int`	`poolingH`	The height of the output in pixels after ROI pooling on the feature map.
`int`	`poolingW`	The width of the output in pixels after ROI pooling on the feature map.
`int`	`featureStride`	The ratio of the input image size to the feature map size, assuming the max pooling layers in the neural network use square filters. For example, if the input image size is `[1600, 800]`, then after max pooling of size `[4, 4]` twice the feature map becomes `[100, 50]`, and `featureStride = 4^2 = 16`. In the Faster R-CNN settings from the paper, the value is `16`.
`int`	`preNmsTop`	The number of region proposals kept before applying NMS. The region proposals are sorted by objectness, which is the probability that a region proposal contains an object. If the number of region proposals from the preceding region proposal network (RPN) is greater than `preNmsTop`, the excess region proposals with the lowest objectness are ignored. This value is particularly useful during training to control the number of bounding boxes for regression, but is theoretically unnecessary during inference. In the Faster R-CNN settings from the paper, the value is `6000`.
`int`	`nmsMaxOut`	The maximum number of region proposals kept after applying NMS. The region proposals are sorted by objectness and NMS is then applied. At most `nmsMaxOut` of the region proposals that remain after NMS are treated as regions of interest.
`int`	`anchorsRatioCount`	The number of anchor box ratios. For example, if the anchor box ratios (aspect ratios) are 1:1, 1:2, and 2:1, then `anchorsRatioCount = 3`.
`int`	`anchorsScaleCount`	The number of anchor box scales. If the anchor box scales (scale factors) are 8, 16, and 32, then `anchorsScaleCount = 3`.
`float`	`iouThreshold`	The IOU threshold used for the NMS step.
`float`	`minBoxSize`	The minimum box size used for the anchor box calculation.
`float`	`spatialScale`	The inverse of `featureStride`, in other words, `spatialScale = 1.0 / featureStride`.
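
The relationships between these parameters can be sketched as below. The struct is repeated here only so the example is self-contained; in practice it comes from the TensorRT plugin headers. `makeExampleParams` and `anchorsPerPixel` are hypothetical helpers. Only `featureStride = 16`, `preNmsTop = 6000`, and the two anchor counts are taken from the table above; every other value is an illustrative assumption, not a recommended setting. Treating the number of anchors per pixel as the product of the ratio and scale counts follows common Faster R-CNN practice and is also an assumption here.

```cpp
// Mirror of the RPROIParams definition shown above, repeated for a
// self-contained sketch.
struct RPROIParams
{
    int poolingH, poolingW, featureStride, preNmsTop,
        nmsMaxOut, anchorsRatioCount, anchorsScaleCount;
    float iouThreshold, minBoxSize, spatialScale;
};

// Hypothetical helper that fills the struct with example values.
RPROIParams makeExampleParams()
{
    RPROIParams p{};
    p.poolingH = 7;            // assumption: illustrative pooled height
    p.poolingW = 7;            // assumption: illustrative pooled width
    p.featureStride = 16;      // from the table: 1600 / 100 = 800 / 50 = 16
    p.preNmsTop = 6000;        // from the table
    p.nmsMaxOut = 300;         // assumption: illustrative cap on ROIs
    p.anchorsRatioCount = 3;   // from the table: ratios 1:1, 1:2, 2:1
    p.anchorsScaleCount = 3;   // from the table: scales 8, 16, 32
    p.iouThreshold = 0.7f;     // assumption: illustrative NMS threshold
    p.minBoxSize = 16.0f;      // assumption: illustrative minimum box size
    // spatialScale is defined as the inverse of featureStride.
    p.spatialScale = 1.0f / static_cast<float>(p.featureStride);
    return p;
}

// Assumed relationship: one anchor per (ratio, scale) pair at each pixel.
int anchorsPerPixel(const RPROIParams& p)
{
    return p.anchorsRatioCount * p.anchorsScaleCount;
}
```

Deriving `spatialScale` from `featureStride` in one place, rather than setting both by hand, keeps the two values from drifting apart.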

Additional resources

The following resources provide a deeper understanding of the NvPluginFasterRCNN plugin:

Networks:

Faster R-CNN

Documentation:

ROI Pooling Definition from Fast R-CNN

License

For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement documentation.

Changelog

May 2019 This is the first release of this README.md file.

Known issues

There are no known issues in this plugin.
