- Euclidean Distance: The straight-line (shortest) distance between two points in space.
- Manhattan Distance: Measures the distance between two points by adding up the absolute differences of their coordinates.
- Mahalanobis Distance: Represents the distance between a point P and a distribution D by measuring how many standard deviations away P is from the mean of D.
- Anomaly and Fraud Detection
- Face Recognition
- Malware Categorization
- Disease Classification
- Chebyshev Distance: Also known as chessboard distance, because in chess it equals the minimum number of moves a king needs to go from one square to another.
- Minkowski Distance: A generalized distance metric, D(x, y) = (Σᵢ |xᵢ − yᵢ|^p)^(1/p), which reduces to other named metrics depending on the value of ‘p’:
- If p = 1, Manhattan Distance (L1 norm)
- If p = 2, Euclidean Distance (L2 norm)
- If p = ∞, Chebyshev Distance (L∞ norm)
- Cosine Distance: Measures the angle between two vectors (often used to compare word-frequency vectors of documents). It is used when the magnitude of the vectors does not matter, only their orientation. Each of the metrics above is computed in the sketch below.
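A minimal sketch of these metrics using NumPy and SciPy; the vectors `a` and `b`, and the random sample used to estimate the covariance for Mahalanobis, are illustrative assumptions rather than values from the text:

```python
import numpy as np
from scipy.spatial import distance

# Two illustrative vectors (made-up data).
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.0, 3.0])

print(distance.euclidean(a, b))       # L2 norm: straight-line distance
print(distance.cityblock(a, b))       # L1 norm: Manhattan distance
print(distance.chebyshev(a, b))       # L∞ norm: largest coordinate difference
print(distance.minkowski(a, b, p=3))  # generalized form; p = 1, 2, ∞ recover the above
print(distance.cosine(a, b))          # 1 - cosine similarity: ignores magnitude

# Mahalanobis distance needs the inverse covariance of the distribution D;
# here it is estimated from a small random sample purely for illustration.
sample = np.random.default_rng(0).normal(size=(100, 3))
VI = np.linalg.inv(np.cov(sample, rowvar=False))
print(distance.mahalanobis(a, b, VI))
```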
- Cross-validation helps prevent overfitting by testing the model on subsets of the data it was not trained on.
- K-fold cross-validation: Splits the data into k parts, trains on k-1 parts, tests on the remaining part, and repeats the process k times (see the sketch after this list).
- Improves reliability by averaging performance across multiple splits.
- Reduces the bias and variance of the performance estimate compared to a single train-test split.
- Essential for model selection and hyperparameter tuning.
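A minimal sketch of k-fold cross-validation with scikit-learn; the Iris dataset, the KNN model, and the choice of 5 folds are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=5)

# Each of the 5 folds serves once as the test set while the other 4 train.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)

print(scores)         # one accuracy score per fold
print(scores.mean())  # performance averaged across the splits
```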
- Bias: The error due to overly simplistic assumptions in the model.
- Small k: Low bias (follows training data closely).
- Large k: High bias (overly smooth decision boundary).
- Variance: The error due to excessive sensitivity to the training data (the sketch after this list shows how k drives both effects).
- Small k: High variance (sensitive to noise).
- Large k: Low variance (more stable but may underfit).
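A minimal sketch of this trade-off, assuming the Iris dataset and a simple train-test split; comparing training and test accuracy as k grows illustrates the shift from high variance toward high bias:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

for k in (1, 5, 15, 45):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # Small k: near-perfect training accuracy (low bias, high variance).
    # Large k: training accuracy falls toward test accuracy as the
    # decision boundary smooths out (higher bias, lower variance).
    print(k, model.score(X_train, y_train), model.score(X_test, y_test))
```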