Mean Average Precision 0.000 for "mask_rcnn_coco.h5" for instance_custom_training. #97

Closed
fsultana44 opened this issue Sep 1, 2021 · 5 comments

Comments

@fsultana44

I am testing PixelLib for instance segmentation on our custom dataset. At first, I ran the model on the CPU only, for 18 epochs with batch size 1 (we have a very small dataset: four training images). The model trained without problems and detected all the objects in the images, and test performance was 44%. Then I ran the same code and model on the GPU for 250 epochs to get better performance, and the result surprised me: I got a Mean Average Precision of 0.0000, and the model could not detect anything. I tried several variations, changing the batch size and the network backbone (resnet101, resnet50) and running the code on Google Colab, but I always get a Mean Average Precision of 0.0000. I did not get any errors while running the code, only some warnings. How can I solve this problem?
Here are some snippets of the warnings.
[Screenshots: train_1, train_2, train_3, model-eva_pro2]

@ayoolaolafenwa
Owner

@fsultana44 PixelLib has an issue with the latest version of TensorFlow (2.6.0). Downgrade to any TensorFlow version between 2.0 and 2.5.0.
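
For anyone checking their environment against this advice, here is a minimal sketch of a version check. It assumes the suggested range means releases 2.0.x through 2.5.x, and that TensorFlow is installed under one of the distribution names `tensorflow`, `tensorflow-gpu`, or `tensorflow-cpu`. The helper names are hypothetical and are not part of PixelLib or TensorFlow.

```python
import re
from importlib import metadata

# Assumption: "between 2.0 and 2.5.0" is read as any 2.0.x to 2.5.x release.
MIN_VERSION = (2, 0)
MAX_VERSION = (2, 5)


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '2.5.0' or '2.6.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def in_suggested_range(version):
    """True if the version falls inside the range suggested in this thread."""
    return MIN_VERSION <= parse_major_minor(version) <= MAX_VERSION


def installed_tensorflow_version():
    """Look up the installed TensorFlow version without importing TensorFlow."""
    for name in ("tensorflow", "tensorflow-gpu", "tensorflow-cpu"):
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    found = installed_tensorflow_version()
    if found is None:
        print("TensorFlow does not appear to be installed")
    elif in_suggested_range(found):
        print("TensorFlow %s is inside the suggested range" % found)
    else:
        print("TensorFlow %s is outside the suggested range" % found)
```

Reading the version from package metadata avoids importing TensorFlow itself, so the check also works in an environment where the import fails.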

@fsultana44
Author

@ayoolaolafenwa, I also checked with TensorFlow version 2.5.0 and got the same result. I trained for 200 epochs and observed that the model is saved only for the first 2 epochs; after that it is not saved, although the loss keeps decreasing.
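
One possible explanation for this symptom, offered as an assumption and not as a confirmed diagnosis: if checkpoints are written only when the validation loss improves (Keras `ModelCheckpoint` behaves this way when `save_best_only=True`), then a model that overfits a tiny dataset can keep lowering its training loss while never saving again. The sketch below simulates that rule in plain Python. The function name and the loss values are hypothetical illustrations, not logs from this issue.

```python
def epochs_with_checkpoint(val_losses):
    """Return the 1-based epochs at which a save-best-only rule would write a file.

    A checkpoint is written only when the validation loss is strictly lower
    than the best value seen so far.
    """
    best = float("inf")
    saved_epochs = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            saved_epochs.append(epoch)
    return saved_epochs


# Hypothetical validation losses: they improve for two epochs and then rise,
# which is a typical overfitting pattern on a very small training set.
hypothetical_val_losses = [1.90, 1.72, 1.75, 1.81, 1.95, 2.10]

print(epochs_with_checkpoint(hypothetical_val_losses))  # [1, 2]
```

Under this assumption, comparing the validation loss (not the training loss) across epochs would show whether the missing checkpoints are expected behaviour.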

@ayoolaolafenwa
Owner

ayoolaolafenwa commented Oct 14, 2021

> @ayoolaolafenwa, I also checked with TensorFlow version 2.5.0 and got the same result. I trained for 200 epochs and observed that the model is saved only for the first 2 epochs; after that it is not saved, although the loss keeps decreasing.

Can you share your trained model with me?

@fsultana44
Author

fsultana44 commented Oct 14, 2021

The problem is solved now: when I tested again, it worked. The earlier failure may have had some other cause.
Thank you @ayoolaolafenwa, for your effort.

@gdeepank

Hello @fsultana44 @ayoolaolafenwa. I am currently facing the same issue: I am getting mAP 0.000, although I was able to complete the training phase. Could you please let me know how you fixed it?
Your help would be very much appreciated.
