Hi, from your description, this method mostly relies on two-view geometry to get poses and a point cloud from two unposed images, and then runs 3DGS. Can it be extended to an arbitrary number of images? And a small question: does the "sparse view" input really matter in reconstruction? For example, if I have sparse views (2 or 3 images), why wouldn't I just take more images (say 200 or more, to reconstruct a whole house)? From my perspective, taking 200 or 300 images (or a video sequence) is not much different from taking just 2 or 3 images in terms of time and money. So I think that if the method could be extended to arbitrary views, it would be much more applicable in industry.
It can be extended to more input views. We show results for 3 input views in Table 5 and Figure 9. However, it is difficult to scale to an arbitrary number of images because GPU memory consumption grows with the number of input images. In our experiments, the maximum number of input views during inference on a 24 GB 4090 GPU is about 45.
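To get a rough idea of how many views would fit on a different GPU, one option is to measure peak memory at a few small view counts and extrapolate. The sketch below is only an illustration under a strong assumption: that peak memory grows roughly linearly with the number of views, which the reply above does not state and which may not hold for this model. The function names and the sample numbers are hypothetical placeholders, not measurements from the paper. In PyTorch, the peak values could be read with `torch.cuda.max_memory_allocated()` after each inference run.

```python
import math


def fit_linear_memory(samples):
    """Least-squares fit of peak_gb = base + per_view * n_views.

    `samples` is a list of (n_views, peak_gb) pairs that you measured yourself.
    The linear model is an assumption, not something confirmed by the authors.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        raise ValueError("need measurements at two or more distinct view counts")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    per_view = cov_xy / var_x
    base = mean_y - per_view * mean_x
    return base, per_view


def estimate_max_views(samples, budget_gb):
    """Largest view count whose predicted peak memory stays within budget_gb."""
    base, per_view = fit_linear_memory(samples)
    if per_view <= 0:
        raise ValueError("memory does not grow with view count in these samples")
    return max(0, math.floor((budget_gb - base) / per_view))


# Placeholder measurements (hypothetical, replace with your own):
# (number of input views, peak GPU memory in GB)
measurements = [(2, 5.0), (4, 6.0), (8, 8.0)]
print(estimate_max_views(measurements, budget_gb=24.2))
```

If memory grows faster than linearly, this estimate will be too optimistic, so it is worth confirming the predicted limit with an actual run before relying on it.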
I agree that dense reconstruction is important. However, sparse-view input is also an important setting. For example, when a user has taken only a few images, a sparse-view method can still reconstruct the scene. Moreover, reconstruction from sparse input is much faster than from a large number of input views.