Questions about the spatial loss when training? #24

Open
onlyinheaven opened this issue Jan 2, 2024 · 0 comments


Dear NVDS authors,

Thank you for publishing this outstanding work. I have a few questions after reading the paper. Since the depth prediction network is frozen while the stabilization network is trained, I would like to understand why the loss includes a spatial term for frame t-1, L(t-1). As I understand it, at inference the stabilization network takes the depth maps of four frames as input and outputs the depth of the target frame only; it does not explicitly output a depth for t-1. Why, then, does training include the spatial loss term L(t-1)? Does the stabilization network output stabilized depth for all four frames at once? If not, does each training step run the network twice before the backward pass: once on frames t-4 to t-1 to produce the depth for t-1, and once on frames t-3 to t to produce the depth for t, with the loss computed from both outputs?
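To make the second possibility concrete, here is a minimal sketch of the two-pass scheme I have in mind. It is only my guess, not the actual NVDS training code: `stabilize`, `mean_abs_diff`, and `two_pass_losses` are hypothetical names, the network is replaced by a simple average over the window, depth maps are flattened lists of floats, and the temporal term omits the optical-flow warping that a real temporal loss would need.

```python
def stabilize(window):
    """Hypothetical stand-in for the stabilization network.

    Takes 4 depth maps (3 reference frames followed by the target frame)
    and returns one depth map for the target. Here it is just an average.
    """
    assert len(window) == 4
    n = len(window[0])
    return [sum(frame[i] for frame in window) / 4.0 for i in range(n)]


def mean_abs_diff(a, b):
    """Mean absolute difference between two flattened depth maps."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def two_pass_losses(initial_depths, gt_depths, t):
    """Assumed two-pass scheme: one forward pass per target frame."""
    assert t >= 4, "need frames t-4 .. t"
    # Pass 1: window t-4 .. t-1, target frame is t-1.
    pred_prev = stabilize(initial_depths[t - 4:t])
    # Pass 2: window t-3 .. t, target frame is t.
    pred_cur = stabilize(initial_depths[t - 3:t + 1])

    spatial_prev = mean_abs_diff(pred_prev, gt_depths[t - 1])  # L(t-1)
    spatial_cur = mean_abs_diff(pred_cur, gt_depths[t])        # L(t)
    # Temporal term between the two outputs (no flow warping in this sketch).
    temporal = mean_abs_diff(pred_cur, pred_prev)
    return spatial_prev, spatial_cur, temporal


# Toy data: frame i is a 4-pixel depth map filled with the value i.
frames = [[float(i)] * 4 for i in range(6)]
spatial_prev, spatial_cur, temporal = two_pass_losses(frames, frames, t=5)
```

If training works like this, both spatial terms and the temporal term would come from two separate forward passes over overlapping windows. Is that what the implementation does?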

Apart from this question, I would also like to understand how the t-1 depth used by the temporal loss during training is obtained.

Thank you for your clarification.
