Near-duplicate image detection 🪞

This is our attempt at solving the CSC Hackathon 2023 problem provided by LUN.UA.

Results 🚀

Training was done on 32000 samples; the report below is for the 22660 test samples.

              precision    recall  f1-score   support
           0    0.99825   0.99935   0.99880     15431
           1    0.99861   0.99627   0.99744      7229
    accuracy                        0.99837     22660
   macro avg    0.99843   0.99781   0.99812     22660
weighted avg    0.99837   0.99837   0.99837     22660
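
The per-class figures above are consistent with one specific confusion matrix. As a sanity check, the sketch below recomputes them from counts that were back-calculated from the report. These counts are an inference, not numbers published in the repository, and the sketch assumes class 1 is the positive class.

```python
# Confusion counts inferred from the report above (an assumption, not published numbers).
# Class 1 is treated as the positive class.
tn, fp, fn, tp = 15421, 10, 27, 7202


def prf(true_pos, false_pos, false_neg):
    """Return (precision, recall, f1) for one class."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


class1 = prf(tp, fp, fn)  # positive class
class0 = prf(tn, fn, fp)  # FP and FN swap roles for the negative class
accuracy = (tp + tn) / (tp + tn + fp + fn)

for name, (p, r, f) in (("0", class0), ("1", class1)):
    print(f"class {name}: precision={p:.5f} recall={r:.5f} f1={f:.5f}")
print(f"accuracy={accuracy:.5f}")
```

Under these counts the model makes 37 mistakes in total on the test set, 10 false positives and 27 false negatives.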

Brief model description 🔍

We take ResNet-152 and retrain the last fully connected layer. The network then produces an embedding for each image, and we measure the distance between the embeddings of the two images in a pair. If you are interested, read more about contrastive loss and Siamese networks, or check the implementation.
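
To make the idea concrete, here is a minimal pure-Python sketch of one common form of the contrastive loss and of a distance-based decision rule. It is not the repository's implementation: the function names, the margin of 1.0, the threshold of 0.5, the use of Euclidean distance, and the convention that a true label means "duplicate" are all assumptions made for illustration.

```python
import math


def euclidean_distance(a, b):
    """L2 distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, is_duplicate, margin=1.0):
    """Contrastive loss for a single pair.

    Duplicate pairs are penalised by their squared distance; non-duplicate
    pairs are penalised only while they sit closer than `margin`.
    """
    if is_duplicate:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def predict_duplicate(emb_a, emb_b, threshold=0.5):
    """Hypothetical decision rule: a pair is a near-duplicate if it is close enough."""
    return euclidean_distance(emb_a, emb_b) < threshold


# Toy 3-dimensional "embeddings"; real ones would come from the network.
anchor = [0.10, 0.90, 0.30]
near = [0.12, 0.88, 0.31]
far = [0.90, 0.10, 0.70]

print(predict_duplicate(anchor, near), predict_duplicate(anchor, far))
```

Training with this loss pulls embeddings of duplicate pairs together and pushes other pairs at least `margin` apart, which is what makes a single distance threshold usable at prediction time.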

The model was trained on 32000 pairs of images on a MacBook Pro 2021 (M1, 32 GB), which took about 25 minutes. Evaluating the test dataset of 22660 image pairs took 1063 seconds, or about 0.047 seconds per pair.

Testing

Feel free to experiment with the model yourself!

Check out ResNet.ipynb. Don't forget to put the unzipped photos into the data/images folder first.
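
Before running the notebook, a quick check that the images are where they are expected can save a failed run. The helper below is a hypothetical sketch, not part of the repository; the list of file extensions is an assumption and may need adjusting to the actual dataset.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def count_images(folder="data/images"):
    """Count image files directly inside `folder`; return 0 if the folder is missing."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(
        1
        for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


print(f"found {count_images()} image files in data/images")
```

A count of 0 means the folder is missing, empty, or holds files with other extensions.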

Directory


image-similarity
  |
  | - data (metadata files for the test and training data)
  |
  | - ResNet.ipynb (train and experiment with the model yourself)
  |
  | - hypertuner.ipynb (optuna hyperparameter tuner)
  |
  | - hypertuner-augment.ipynb (modified optuna hyperparameter tuner that uses data augmentations)
  |
  | - utils.py (miscellaneous methods and classes: custom Dataset, Dataloader, model train, etc)
  |
    . . .  (you most likely won't need the rest: mostly experiments and scripts for data loading)
