
The test results in train.py are inconsistent with those in test.py and detect.py #4

Open
jsago opened this issue Dec 8, 2022 · 2 comments

Comments

jsago commented Dec 8, 2022

The mAP reported in train.py is high, while the mAP in test.py is low. In addition, the visualizations from test.py contain many overlapping boxes with different angles. When I turn the NMS threshold down, even the prediction boxes with the correct angle get suppressed, whereas the visualizations from detect.py have no overlapping boxes with different angles, even though the parameter settings of the two files are the same.

mAP in train.py: (screenshot: 20221208212301)

Results in test.py: (images: P_curve, R_curve, PR_curve, test_batch1_pred, test_batch6_pred, test_batch9_pred)

Results in detect.py: (two images)

jsago (Author) commented Dec 23, 2022

I found that the mAP during training is higher, generally above 0.8, and the detection results of detect.py also look normal. However, the visualizations from test.py show an mAP of only about 0.5, and its results usually have multiple prediction boxes with different angles overlapping each other. More importantly, the confidence of the box with the correct angle prediction is usually lower than that of the box with the wrong angle prediction. If conf-thres in test.py is turned up and iou-thres is turned down, the final detection result is likely to be only a box with the wrong angle. This question has troubled me for a long time; I hope it can be answered. Thanks.
(image: test_batch15_pred)
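The threshold behavior described above can be illustrated with a minimal sketch. This uses plain axis-aligned NMS with made-up boxes and confidences (the repo's actual pipeline uses rotated boxes, so this is only a simplified model of the effect): when the wrong-angle box happens to score higher, raising conf-thres and lowering iou-thres leaves only that box.

```python
# Minimal sketch (hypothetical boxes/scores) of how conf-thres and iou-thres
# interact in NMS. Boxes are (x1, y1, x2, y2, conf), axis-aligned for
# simplicity; the repo's rotated-box NMS behaves analogously.

def iou(a, b):
    # Intersection-over-union of two axis-aligned boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, conf_thres, iou_thres):
    # Drop boxes below conf_thres, then greedily suppress overlaps.
    boxes = sorted((b for b in boxes if b[4] >= conf_thres),
                   key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) < iou_thres for k in kept):
            kept.append(b)
    return kept

# Two overlapping predictions of the same object: the "wrong-angle" box
# happens to have the higher confidence, as observed in the issue.
wrong_angle = (10, 10, 50, 50, 0.9)
right_angle = (12, 8, 52, 48, 0.6)

# Lenient thresholds keep both -> overlapping duplicates in test.py output:
print(len(nms([wrong_angle, right_angle], conf_thres=0.25, iou_thres=0.9)))  # 2
# High conf-thres + low iou-thres -> only the wrong-angle box survives:
print(nms([wrong_angle, right_angle], conf_thres=0.7, iou_thres=0.3))
# [(10, 10, 50, 50, 0.9)]
```

This matches the reported symptom: because the correct-angle box scores lower, no threshold combination can keep it alone, which points at the scoring rather than at NMS itself.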

lx-cly (Owner) commented Feb 11, 2023


Thank you for your question! I'm sorry to only see it now. The difference between test.py and train.py during evaluation is in how the model is loaded; I don't know the reason for this discrepancy.
For now, you can save the detection results from detect.py and then evaluate them with DOTA_devkit. That way you can obtain accurate evaluation results.
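As a sketch of the suggested workaround: DOTA_devkit's Task1 evaluation reads one `Task1_<classname>.txt` file per class, where each line is `<image_id> <confidence> <x1> <y1> ... <x4> <y4>`. The detection tuples and the `write_dota_results` helper below are hypothetical; adapt them to however you collect detect.py's outputs.

```python
# Hypothetical helper: group detect.py outputs into the per-class result
# files that DOTA_devkit's Task1 evaluation expects. Each detection is
# assumed to be (image_id, classname, conf, x1, y1, x2, y2, x3, y3, x4, y4).
import os
from collections import defaultdict

def write_dota_results(detections, out_dir):
    """Write Task1_<classname>.txt files; return the sorted class names."""
    os.makedirs(out_dir, exist_ok=True)
    per_class = defaultdict(list)
    for image_id, classname, conf, *poly in detections:
        coords = " ".join(f"{c:.1f}" for c in poly)
        per_class[classname].append(f"{image_id} {conf:.4f} {coords}")
    for classname, lines in per_class.items():
        path = os.path.join(out_dir, f"Task1_{classname}.txt")
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
    return sorted(per_class)

# Example with made-up detections:
dets = [
    ("P0001", "plane", 0.92, 10, 10, 60, 12, 58, 40, 8, 38),
    ("P0002", "ship", 0.81, 5, 5, 25, 5, 25, 15, 5, 15),
]
print(write_dota_results(dets, "dota_results"))  # ['plane', 'ship']
```

These files can then be passed to DOTA_devkit's evaluation script along with the ground-truth annotations to get mAP numbers that bypass test.py's model-loading path.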
