This script is a Flask app that fetches a JSON dataset from a link, converts it into a PySpark DataFrame, performs a cleaning operation on it with PySpark, and returns the result as JSON.
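The flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the dataset URL, the `/clean` route, the helper names (`parse_records`, `clean`, `create_app`), and the cleaning rule itself (dropping rows with nulls and exact duplicates) are all assumptions, since the description does not say which link is used or which cleaning operation is applied.

```python
import json
from urllib.request import urlopen

# Hypothetical placeholder; the real dataset link is not given here.
DATASET_URL = "https://example.com/dataset.json"


def parse_records(raw):
    """Turn fetched JSON text into a list of row dicts."""
    data = json.loads(raw)
    # Accept either a single JSON object or an array of objects.
    if isinstance(data, dict):
        data = [data]
    # Keep only object-like entries, since each row must map column -> value.
    return [row for row in data if isinstance(row, dict)]


def clean(df):
    """Assumed cleaning step: drop rows with nulls, then exact duplicates."""
    return df.na.drop().dropDuplicates()


def create_app(url=DATASET_URL):
    # Imported here so the helpers above can be used without Flask or Spark.
    from flask import Flask, jsonify
    from pyspark.sql import SparkSession

    app = Flask(__name__)
    spark = SparkSession.builder.appName("cleaning-sketch").getOrCreate()

    @app.route("/clean")
    def clean_endpoint():
        # Fetch the dataset from the web as JSON text.
        with urlopen(url) as response:
            records = parse_records(response.read().decode("utf-8"))
        # Build a PySpark DataFrame from the fetched JSON rows.
        lines = spark.sparkContext.parallelize([json.dumps(r) for r in records])
        df = spark.read.json(lines)
        # Clean, collect to the driver, and return the rows as JSON.
        rows = [row.asDict(recursive=True) for row in clean(df).collect()]
        return jsonify(rows)

    return app
```

With Flask and PySpark installed, calling `create_app().run()` would start the development server, and a GET request to the assumed `/clean` route would return the cleaned rows. Collecting to the driver is only reasonable for small datasets; a larger one would need a different way of returning results.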