This example demonstrates how to run two-stage inference with the DepthAI library. It recognizes whether each detected face in the frame is wearing a face mask. The demo uses the face-detection-retail-0004 model to detect faces, crops them on the device using the Script node, and then sends the face frames to the sbd_mask_classification_224x224 model, which performs mask/no-mask classification.
We wrote a Deploying Custom Models tutorial that provides step-by-step details on how to convert, compile, and deploy the SBD-Mask custom model. Its deployment section focuses on how this demo was coded.
[Demo video: mask-rec-final.mp4]
- The color camera produces high-res frames and sends them to the host, the Script node, and the downscale ImageManip node (the full pipeline wiring is sketched after this list)
- The downscale ImageManip node downscales the high-res frame to 300x300, the input size required by the first NN in this pipeline, the object detection model
- The 300x300 frames are sent from the downscale ImageManip node to the object detection model (MobileNetSpatialDetectionNetwork)
- The object detections are sent to the Script node
- The Script node first syncs each object detections message with its frame. It then iterates over all detections and creates an ImageManipConfig for each detected face. These configs are then sent to the crop ImageManip node together with the synced high-res frame (see the Script node sketch after this list)
- The crop ImageManip node crops only the face out of the original frame and resizes the face crop to 224x224, the input size required by the SBD-Mask classification NN model
- The face frames are sent to the 2nd NN, the SBD-Mask model, and the NN recognition results are sent back to the host
- Frames, object detections, and recognition results are all synced on the host side and then displayed to the user (a minimal host-side sketch follows below)
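To make the flow above concrete, here is a minimal sketch of how such a pipeline could be wired up with the DepthAI Python API. The blob paths and preview size are assumptions, and for brevity it uses the plain MobileNetDetectionNetwork rather than the spatial variant (which would additionally require stereo depth inputs); the actual demo's code may differ in details.

```python
import depthai as dai

pipeline = dai.Pipeline()

# Color camera produces high-res frames
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(1080, 1080)  # high-res preview; exact size is an assumption
cam.setInterleaved(False)

# Downscale ImageManip: high-res frame -> 300x300 for the face detector
downscale = pipeline.create(dai.node.ImageManip)
downscale.initialConfig.setResize(300, 300)
cam.preview.link(downscale.inputImage)

# 1st stage: face detection
face_det = pipeline.create(dai.node.MobileNetDetectionNetwork)
face_det.setConfidenceThreshold(0.5)
face_det.setBlobPath("face-detection-retail-0004.blob")  # path is an assumption
downscale.out.link(face_det.input)

# Script node receives both the high-res frames and the detections
script = pipeline.create(dai.node.Script)
cam.preview.link(script.inputs["frames"])
face_det.out.link(script.inputs["detections"])

# Crop ImageManip: waits for a config (one per detected face) from the Script node
crop = pipeline.create(dai.node.ImageManip)
crop.inputConfig.setWaitForMessage(True)
script.outputs["manip_cfg"].link(crop.inputConfig)
script.outputs["manip_img"].link(crop.inputImage)

# 2nd stage: mask/no-mask classification on 224x224 face crops
mask_nn = pipeline.create(dai.node.NeuralNetwork)
mask_nn.setBlobPath("sbd_mask_classification_224x224.blob")  # path is an assumption
crop.out.link(mask_nn.input)

# Stream frames, detections, and recognition results to the host
for name, source in (("frame", cam.preview), ("det", face_det.out), ("rec", mask_nn.out)):
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName(name)
    source.link(xout.input)
```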
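Inside the Script node, the sync-and-crop logic could look roughly like this. This is a simplified sketch whose input/output names match the wiring above; the real demo also needs to clean up stale buffer entries.

```python
script.setScript("""
# Runs on the device. Buffer messages by sequence number until a frame
# and its detections are paired, then emit one crop config per face.
msgs = {}

def add_and_sync(seq, name, msg):
    entry = msgs.setdefault(seq, {})
    entry[name] = msg
    return msgs.pop(seq) if len(entry) == 2 else None

while True:
    for name in ('frames', 'detections'):
        msg = node.io[name].tryGet()
        if msg is None:
            continue
        synced = add_and_sync(msg.getSequenceNum(), name, msg)
        if synced is None:
            continue
        frame = synced['frames']
        for det in synced['detections'].detections:
            cfg = ImageManipConfig()
            # detection coordinates are normalized (0..1), as setCropRect expects
            cfg.setCropRect(det.xmin, det.ymin, det.xmax, det.ymax)
            cfg.setResize(224, 224)
            cfg.setKeepAspectRatio(False)
            node.io['manip_cfg'].send(cfg)
            node.io['manip_img'].send(frame)
""")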
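On the host side, a minimal (and deliberately naive) way to pair the three streams is to rely on blocking queues and in-order arrival: one detections message arrives per frame, and one recognition result arrives per detected face. The index order of the mask/no-mask scores in the SBD-Mask output tensor is an assumption here; the actual demo uses a more robust sequence-number sync.

```python
import cv2
import depthai as dai

with dai.Device(pipeline) as device:
    q_frame = device.getOutputQueue("frame", maxSize=4, blocking=True)
    q_det = device.getOutputQueue("det", maxSize=4, blocking=True)
    q_rec = device.getOutputQueue("rec", maxSize=4, blocking=True)

    while True:
        frame = q_frame.get().getCvFrame()
        detections = q_det.get().detections  # one ImgDetections msg per frame
        h, w = frame.shape[:2]
        for det in detections:
            rec = q_rec.get()  # one classification result per detected face
            scores = rec.getFirstLayerFp16()
            # which index means "mask" is an assumption about the model output
            label = "mask" if scores[0] > scores[1] else "no mask"
            bbox = (int(det.xmin * w), int(det.ymin * h),
                    int(det.xmax * w), int(det.ymax * h))
            cv2.rectangle(frame, bbox[:2], bbox[2:], (0, 255, 0), 2)
            cv2.putText(frame, label, (bbox[0], bbox[1] - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("preview", frame)
        if cv2.waitKey(1) == ord('q'):
            break
```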
[Pipeline graph image, generated with the DepthAI Pipeline Graph tool]
Install the requirements:

python3 -m pip install -r requirements.txt

Run the demo:

python3 main.py