
Issue with Reproduction -- Awaiting Response #18

Closed
ShuoYang-1998 opened this issue Mar 22, 2021 · 24 comments

Comments

@ShuoYang-1998

Hi authors,
I failed to reproduce the reported ORE performance; I couldn't even reproduce the results of the Faster R-CNN + fine-tuning baseline. In Task 3, I got 32 and 11 for previous and current AP50, whereas the paper reports 37 and 12 respectively for T3 previous and current. The Task 3 logs log.txt and t3_val_test_log.txt are attached. Could you kindly help me figure out what is going wrong?
Thanks!

@JosephKJ
Owner

Hi @ShuoYang-1998, thanks for sharing the log. It is interesting to see that you get a much better A-OSE than what we report in the paper. Did you train for the same number of iterations that we shared?

Further, did you use the pre-trained models that we shared, or have you trained it from scratch?

@ShuoYang-1998
Author

I trained for exactly the same number of iterations as in your provided model_backup. I tried both training from scratch and using your provided pre-trained models, but in neither case could I get the same results as in your paper. The attached t4_log.txt is the T4 log from evaluating your provided models: you can see that I got a Known AP50 of 24.88, while the result in your paper is 26.66. Additionally, in this t2_log.txt, you can see that I got an A-OSE of 8714, while your paper reports 7772. Note that both log files above were generated using your provided pre-trained models. When I try to re-train, the performance gap becomes even more unacceptable. Am I doing something wrong?


@JosephKJ
Owner

What machine are you using? From the logs, I presume it is also a DGX-2.

@ShuoYang-1998
Author

8 × V100 (16 GB)

@JosephKJ
Owner

Let me check whether the models shared were correct.

@ShuoYang-1998
Author

Hi, I uploaded the experimental script run_OWOD.txt and all config files OWOD_configs.zip, along with the running log log.txt. As shown in the log, the re-training results are far from the reported results. It would be much appreciated if you could help me fix the problem. Thanks!

@AmingWu

AmingWu commented Mar 28, 2021

@ShuoYang-1998 @JosephKJ, when I ran the code, I obtained the following result. Is it similar to your results?

python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

[screenshot of results attached]

@ShuoYang-1998
Author

> @ShuoYang-1998 @JosephKJ, when I ran the code, I obtained the following result. Is it similar to your results?
>
> python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005
>
> [screenshot of results attached]

@AmingWu You should run train-val-test, and report the test results.

@AmingWu

AmingWu commented Mar 28, 2021

@ShuoYang-1998, thanks for your reply. This is the test result [screenshot attached]. Is it similar to yours?

@ShuoYang-1998
Author

> @ShuoYang-1998, thanks for your reply. This is the test result [screenshot attached]. Is it similar to yours?

Task 1 is not class-incremental; can you share your Task 3 results?

@AmingWu

AmingWu commented Mar 28, 2021

OK, now I'm running Task 3. Is this order correct?

I first run:

python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

Then I run:

python tools/train_net.py --num-gpus 4 --eval-only --config-file configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

@ShuoYang-1998
Author

> OK, now I'm running Task 3. Is this order correct?
>
> I first run:
>
> python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005
>
> Then I run:
>
> python tools/train_net.py --num-gpus 4 --eval-only --config-file configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

I think it should be train-ft-val-test; only Task 1 doesn't need the fine-tuning (ft) step (see the sketch below). You can refer to my uploaded 'run_OWOD.txt', which contains all training and testing scripts. But I don't know whether the script is correct, because the authors haven't provided running scripts yet.
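
For illustration only, a full Task 3 pass along those train-ft-val-test lines might look like the commands below. The t3_ft.yaml, t3_val.yaml, and t3_test.yaml config names are my assumptions, patterned after the t3_train.yaml and t1_test.yaml configs already quoted in this thread; the real names should be taken from the repository's configs/OWOD folder.

# Step 1: Task 3 training (same command AmingWu used above)
python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t3/t3_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

# Step 2: fine-tuning on a balanced set of old and new class examples (config name assumed)
python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t3/t3_ft.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

# Step 3: validation pass, presumably used to fit ORE's unknown-identification thresholds (config name assumed)
python tools/train_net.py --num-gpus 4 --eval-only --config-file configs/OWOD/t3/t3_val.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005

# Step 4: final evaluation on the test split (config name assumed)
python tools/train_net.py --num-gpus 4 --eval-only --config-file configs/OWOD/t3/t3_test.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005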

@ShuoYang-1998
Author

@JosephKJ Other people also have this reproduction problem. Can you fix it, or will you keep ignoring it?

ShuoYang-1998 changed the title from "Regarding the reported performance" to "The results in the paper are unreproducible" on Mar 30, 2021
salman-h-khan changed the title from "The results in the paper are unreproducible" to "Issue with Reproduction -- Awaiting Response" on Mar 31, 2021
@salman-h-khan
Collaborator

Hi Shuo,

The first author has been traveling recently and will soon provide the rerun models.

Best,
Salman

@JosephKJ
Owner

JosephKJ commented Apr 1, 2021

Continuing the discussion in #26

JosephKJ closed this as completed on Apr 1, 2021
@JosephKJ
Owner

JosephKJ commented May 8, 2021

I have added 'replicate.py' to replicate the results from the pre-trained models shared before. You can find the binaries and logs here, if you want to verify the authenticity of the results.

Please find my results below:
[image: Replicated Results]

@iFighting

iFighting commented May 10, 2021

> I have added 'replicate.py' to replicate the results from the pre-trained models shared before. You can find the binaries and logs here, if you want to verify the authenticity of the results.
>
> Please find my results below:
> [image: Replicated Results]

The results from the pre-trained models in your figure are not consistent with the results in your paper.
In addition, we still cannot reproduce the results with the training schedule.
I think you should take this problem seriously.

@dyabel

dyabel commented May 13, 2021

> @ShuoYang-1998, thanks for your reply. This is the test result [screenshot attached]. Is it similar to yours?

I got a similar result to yours in Task 1; the mAP is only 52.

@JosephKJ
Owner

@dyabel: How many GPUs are you using?

@dyabel

dyabel commented May 14, 2021

> @dyabel: How many GPUs are you using?

I used 4 GPUs, and I have found the reason: I should multiply the iterations and steps by 2 (see the sketch below). But the WI in Task 1 is still 0.05092. I have attached my config.yaml and log.txt.
dyabel.zip
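
As a minimal sketch of that adjustment, using the usual Detectron2 command-line overrides, the 4-GPU run could double the schedule as shown below. The MAX_ITER and STEPS numbers are placeholders for illustration, not the values from the released 8-GPU configs; read those from the shared config and multiply them by 2.

# Placeholder schedule: take MAX_ITER and STEPS from the released 8-GPU config and double them
python tools/train_net.py --num-gpus 4 --config-file configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.005 SOLVER.MAX_ITER 36000 SOLVER.STEPS "(24000,32000)"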


@JosephKJ
Owner

Are the other metrics matching up?

@dyabel

dyabel commented May 14, 2021

> Are the other metrics matching up?

yes

@dyabel

dyabel commented Sep 12, 2021

> Are the other metrics matching up?
>
> yes

I mean I can only reproduce Task 1, except for WI. Tasks 2-4 do not match.

@akshitac8

Hello @JosephKJ @yuandhu, I was able to successfully reproduce the results; the table is attached below.

[screenshot of the reproduced results table attached]
