Problem: For the given dataset (data_2d.csv attached),
- Find the number of "natural clusters" in the dataset (i.e., the number of clusters you would set for k-means).
- Cluster the data using k-means. Output can be a file in the format of your choice.
- Find the outliers/noise (i.e., points which may not belong to any cluster). Output can be a file in the format of your choice.
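
One possible way to approach the three tasks above is sketched below, using only the Python standard library. Several things here are assumptions rather than part of the problem statement: that data_2d.csv holds two numeric columns per row (non-numeric rows such as a header are skipped), that the mean silhouette score is an acceptable way to pick the number of clusters, and that a point counts as an outlier when its distance to its centroid exceeds 3 times the median distance within its cluster. The function names (load_points, kmeans, silhouette, choose_k, find_outliers, write_results) and the output columns are illustrative choices, and the code assumes the data contains at least k distinct points.

```python
import csv
import math
import random
import statistics


def load_points(path):
    """Read (x, y) pairs; assumes two numeric columns, skips rows that do not parse."""
    points = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            try:
                points.append((float(row[0]), float(row[1])))
            except (ValueError, IndexError):
                continue
    return points


def kmeans(points, k, n_init=10, max_iter=100, seed=0):
    """Lloyd's algorithm with k-means++ style seeding; keeps the lowest-inertia restart."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        # Seeding: later centroids are drawn with probability proportional to squared distance.
        centroids = [rng.choice(points)]
        while len(centroids) < k:
            d2 = [min(math.dist(p, c) for c in centroids) ** 2 for p in points]
            centroids.append(rng.choices(points, weights=d2)[0])
        labels = None
        for _ in range(max_iter):
            new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
            if new_labels == labels:
                break  # assignments stopped changing
            labels = new_labels
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:  # an empty cluster keeps its previous centroid
                    centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
        inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
        if best is None or inertia < best[0]:
            best = (inertia, list(centroids), labels)
    return best[1], best[2]


def silhouette(points, labels, k):
    """Mean silhouette coefficient; O(n^2), so only practical for small datasets."""
    scores = []
    for i, p in enumerate(points):
        sums, counts = [0.0] * k, [0] * k
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] += math.dist(p, q)
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:  # a singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sums[own] / counts[own]
        b = min((sums[c] / counts[c] for c in range(k) if c != own and counts[c]), default=a)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def choose_k(points, candidates=range(2, 7)):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)[1], k) for k in candidates}
    return max(scores, key=scores.get), scores


def find_outliers(points, centroids, labels, factor=3.0):
    """Indices of points farther than factor * median distance from their own centroid."""
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    flagged = []
    for j in range(len(centroids)):
        cluster = [d for d, lab in zip(dists, labels) if lab == j]
        if not cluster:
            continue
        cutoff = factor * statistics.median(cluster)
        flagged += [i for i, (d, lab) in enumerate(zip(dists, labels)) if lab == j and d > cutoff]
    return sorted(flagged)


def write_results(path, points, labels, outliers):
    """Write one row per point: x, y, cluster label, and a 0/1 outlier flag."""
    flagged = set(outliers)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y", "cluster", "is_outlier"])
        for i, ((x, y), lab) in enumerate(zip(points, labels)):
            writer.writerow([x, y, lab, int(i in flagged)])
```

A typical run under these assumptions would load the points with load_points("data_2d.csv"), call choose_k to pick k, pass that k to kmeans, pass the resulting centroids and labels to find_outliers, and finally call write_results with an output path of your choice. The candidate range for k and the outlier factor are tuning knobs, not values implied by the dataset; plotting the points before trusting either is a reasonable sanity check. Because strong outliers can also distort the centroids themselves, re-running k-means after removing the flagged points may give cleaner clusters.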