Spark user-defined aggregate function that calculates a percentile with the quickselect algorithm.
It performs linear interpolation between adjacent ranks.
Quickselect gives linear time complexity on average.
Null values on the input are ignored.
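The behaviour described above can be illustrated with a minimal standalone sketch. It is written in plain Python rather than as a Spark aggregate function, and it is not the code from this repository. The function names `quickselect` and `percentile` are hypothetical, and the rank convention `p * (n - 1)` with `p` in `[0, 1]` is an assumption, since the description does not say which convention the function uses.

```python
import random
from typing import List, Optional, Sequence


def quickselect(values: List[float], k: int) -> float:
    """Return the k-th smallest element (0-based); reorders `values` in place."""
    lo, hi = 0, len(values) - 1
    while True:
        if lo == hi:
            return values[lo]
        # A random pivot keeps the expected running time linear.
        pivot = values[random.randint(lo, hi)]
        # Three-way partition of values[lo..hi]: < pivot, == pivot, > pivot.
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if values[i] < pivot:
                values[lt], values[i] = values[i], values[lt]
                lt += 1
                i += 1
            elif values[i] > pivot:
                values[i], values[gt] = values[gt], values[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1      # answer is among the smaller elements
        elif k > gt:
            lo = gt + 1      # answer is among the larger elements
        else:
            return pivot     # k falls inside the block equal to the pivot


def percentile(data: Sequence[Optional[float]], p: float) -> Optional[float]:
    """Percentile with linear interpolation between adjacent ranks.

    Assumption: the fractional rank is p * (n - 1), with p in [0, 1].
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    # Null (None) inputs are ignored.
    values = [x for x in data if x is not None]
    if not values:
        return None
    rank = p * (len(values) - 1)
    lower = int(rank)
    frac = rank - lower
    low_val = quickselect(values, lower)
    if frac == 0.0:
        return low_val
    # Interpolate between the two neighbouring order statistics.
    high_val = quickselect(values, lower + 1)
    return low_val + frac * (high_val - low_val)
```

For example, `percentile([3, None, 1, 2, 4], 0.5)` skips the null, finds the two middle values 2 and 3 without fully sorting the data, and interpolates halfway between them.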
Two other benchmark reports were produced on the same machine under the same conditions and can be used to compare performance with the percentile calculations:
I developed this code during my work in SBDA Group.