Is it possible to run DBSCAN in distributed memory? #1149
Comments
Hi @lykos98. In the paper, the application (HACC) was responsible for the distributed memory. It constructs local domains with halos that guarantee that the clusters will be fully contained in the local data. That is what ArborX's algorithms were run on. We do not have a distributed DBSCAN implementation at the moment.
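To illustrate the idea of local domains with halos, here is a minimal sketch in plain Python. It is not HACC's or ArborX's code: the function name `decompose_with_halos`, the 1D slab layout, and the fixed `halo_width` are all assumptions made for illustration. It also assumes the halo is at least as wide as the largest cluster, so that a cluster straddling a boundary appears in full in at least one domain.

```python
def decompose_with_halos(x, n_domains, box_size, halo_width):
    """Hypothetical helper: for each 1D slab, return the indices of the
    points it owns plus the points copied in from its halo region."""
    slab = box_size / n_domains
    local = []
    for d in range(n_domains):
        lo, hi = d * slab, (d + 1) * slab
        # Owned points lie in [lo, hi); halo copies lie within
        # halo_width outside that interval (non-periodic box assumed).
        local.append([i for i, xi in enumerate(x)
                      if lo - halo_width <= xi < hi + halo_width])
    return local


# Points 1, 2, 3 form a tight group straddling the boundary at x = 5.
x = [0.5, 4.8, 5.0, 5.2, 9.0]
local = decompose_with_halos(x, n_domains=2, box_size=10.0, halo_width=0.5)
print(local)
```

Each process could then run a purely local DBSCAN on its own index list. Clusters found in more than one domain because of the halo copies would still need to be deduplicated by the application, which this sketch does not do.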
Hi, thank you for the quick answer! I understand that implementing it in a truly distributed fashion is quite difficult. Nevertheless, the performance reported in the paper is impressive. Congratulations!
I'm sorry for not getting back to you sooner. I missed the notification. Thank you for the reference! I had already come across that paper a while ago. By "difficult" in my previous comment I did not mean difficult in the algorithmic sense, but on the implementation side, which needs a non-negligible amount of effort to make it work. I'll keep an eye on this repository. Thank you again!
We are going to implement this.
Hi!
I found this repository by reading the newly published paper on the advancements of the library. The paper mentions that you were able to run DBSCAN on 2*10^12 points, but the documentation does not make clear whether, and how, the actual DBSCAN implementation works in distributed memory.
Can you provide an example of the algorithm applied with MPI when the data is scattered across different processes?
Thank you!