This page provide details about mmdet2trt.
opt_shape_param
is used to set the min/optimize/max shape of the input tensor. For each dimension in it, min<=optimize<=max. For example:
opt_shape_param=[
[
[1,3,320,320],
[2,3,800,1312],
[4,3,1344,1344],
]
]
trt_model = mmdet2trt( ...,
opt_shape_param=opt_shape_param, # set the opt shape
...)
this config will give you input tensor size between (320, 320) to (1344, 1344), max batch_size=4
warning: Dynamic input shape and batch support might need more memory. Use fixed shape to avoid unnecessary memory usage(min=optimize=max).
fp16 mode can easily accelerate the model, just set the fp16_mode=True
to enable it.
trt_model = mmdet2trt( ...,
fp16_mode=True, # enable fp16 mode
...)
int8 mode need more configs.
- set
input8_mode=True
. - provide calibrate dataset, the
__getitem__()
method of dataset should return a list of tensor with shape (C,H,W), the shape must be the same asopt_shape_param[0][1][1:]
(optimize shape). The tensor should do the same preprocess as the model. There is a default dataset, you can also set your custom one. - set the calibrate algrithm, support
entropy
andminmax
.
from mmdet2trt import mmdet2trt, Int8CalibDataset
cfg_path="..." # mmdetection config path
model_path="..." # mmdetection checkpoint path
image_path_list = [...] # lists of image pathes
opt_shape_param=[
[
[...],
[...],
[...],
]
]
calib_dataset = Int8CalibDataset(image_path_list, cfg_path, opt_shape_param)
trt_model = mmdet2trt(cfg_path, model_path,
opt_shape_param=opt_shape_param,
int8_mode=True,
int8_calib_dataset=calib_dataset,
int8_calib_alg="entropy")
warning: Not all model support int8 mode. If it doesn't works, try fix shape/batch size.
Some layer need extra gpu memory. Any some optimize tactic also need more space. enlarge max_workspace_size
may potentially accelerate your model with the cost of more memory.
The converted model is a python warp on engine.
first, get the serialized engine from trt_model:
with open(engine_path, mode='wb') as f:
f.write(model_trt.state_dict()['engine'])
Link the ${AMIRSTAN_PLUGIN_DIR}/build/lib/libamirstan_plugin.so
in your project. compile and load the engine. enjoy.
warning:
might need to invode initLibAmirstanInferPlugins()
in amirInferPlugin.h to load the plugins.
The engine only contain inference forward. Preprocess(resize, normalize) and postprocess(divid scale factor) should be done in your project.
when converting model, set the output names:
trt_model = mmdet2trt( ...,
output_names=["num_detections", "boxes", "scores", "classes"], # output names
...)
Create engine file:
with open(engine_path, mode='wb') as f:
f.write(model_trt.state_dict()['engine'])
in the deepstream model config file, set some config
[property]
...
net-scale-factor=0.0173 # compute from mean, std
offsets=123.675;116.28;103.53 # compute from mean, std
model-engine-file=trt.engine # the engine file created by mmdet2trt
labelfile-path=labels.txt # label file
...
in the same config file, set the plugin and parse function
[property]
...
parse-bbox-func-name=NvDsInferParseMmdet # parse funtion name(amirstan plugin buildin)
output-blob-names=num_detections;boxes;scores;classes # output blob names, same as convert output_names
custom-lib-path=libamirstan_plugin.so # amirstan plugin lib path
...
you might also need to set group_threshold=0
(not sure why, please tell me if you know.)
[class-attrs-all]
...
group-threshold=0
...
enjoy the model in deepstream.
warning: I am not so familiar with deepstream, if you find any thing wrong with above, please let me know.
set flag enable_mask
to True
# enable mask
trt_model = mmdet2trt(... , enable_mask = True)
inference with trt_model will return 5 tensors
num_detections, bboxes, scores, class_ids, masks = trt_model(input_tensor)
convert and do postprocess if you needed.
# convert post process
mask_postprocessor = mask_processor2trt(max_width = max_width,
max_height = max_height,
max_batch_size = 1,
max_box_per_batch = 10,
mask_size = [28, 28]
)
# do postprocess
final_masks = mask_postprocessor(masks[0:1, :10, ...], bboxes[0:1, :10, ...])