[SPARK-25149][GRAPHX] Update Parallel Personalized Page Rank to test with large vertexIds
## What changes were proposed in this pull request?
`runParallelPersonalizedPageRank` in GraphX checks that every vertex ID in `sources` is <= `Int.MaxValue.toLong`, but this is not actually required. The check seems to have been added because the implementation uses sparse vectors, and sparse vectors cannot be indexed by values greater than `Int.MaxValue`. However, the sparse vectors are never indexed by the source vertex IDs, so this is not an issue. I've added a test with large vertex IDs to confirm this works as expected.
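The reasoning above can be illustrated with a minimal, Spark-free sketch. It is not the GraphX source. It assumes each source gets a one-hot vector with one slot per source, addressed by the source's position in the `sources` array rather than by its vertex ID. `SourceIndexSketch`, `OneHot`, and `initialVectors` are hypothetical names introduced only for this illustration.

```scala
// Sketch only: illustrates the indexing argument, not the actual GraphX code.
object SourceIndexSketch {
  // GraphX defines VertexId as a 64-bit Long.
  type VertexId = Long

  // Hypothetical stand-in for a sparse one-hot vector:
  // `size` slots, with a single 1.0 at position `index`.
  final case class OneHot(size: Int, index: Int)

  // Assumption: the vector index is the position of the source in `sources`,
  // so it is bounded by sources.length, never by the vertex ID itself.
  def initialVectors(sources: Array[VertexId]): Map[VertexId, OneHot] =
    sources.zipWithIndex.map { case (vid, i) =>
      vid -> OneHot(sources.length, i)
    }.toMap

  def main(args: Array[String]): Unit = {
    // Vertex IDs well above Int.MaxValue are fine as map keys.
    val sources: Array[VertexId] =
      Array(Int.MaxValue.toLong + 1L, Long.MaxValue - 5L, 3L)
    initialVectors(sources).foreach { case (vid, v) =>
      println(s"vertex $vid -> slot ${v.index} of ${v.size}")
    }
  }
}
```

Under this assumption, the largest index ever used is `sources.length - 1`, which is why a bound on the vertex IDs themselves would be unnecessary.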
## How was this patch tested?
Unit tests.
Please review http://spark.apache.org/contributing.html before opening a pull request.
Closes apache#22139 from MrBago/remove-veretexId-check-pppr.
Authored-by: Bago Amirbekian <[email protected]>
Signed-off-by: Joseph K. Bradley <[email protected]>