Team: Bartosz Brzoza, Magdalena Buszka, Martyna Firgolska, Michał Kulibiński
Description: This project contains a simple workflow for classifying images of marine animals, built on the Kedro framework. It was made as a student project for the "Project: Deep Learning" course at the University of Wrocław.
The dataset contains images of marine animals in 23 different classes (Seahorse, Nudibranchs, Sea Urchins, Octopus, Puffers, Rays, Whales, Eels, Crabs, Squid, Corals, Dolphins, Seal, Penguin, Starfish, Lobster, Jelly Fish, Sea Otter, Fish, Shrimp and Clams). Each image has size (k, 300px) or (300px, k), where k is less than or equal to 300. Example images:
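The size constraint above can be checked programmatically. The snippet below is a minimal sketch, not code from this project: the function names `matches_dataset_shape` and `padding_to_square` are hypothetical, and centering each image on a 300x300 canvas is only one possible way to get uniform inputs; the preprocessing pipeline may do something different.

```python
MAX_SIDE = 300  # longest side in pixels, as stated in the dataset description


def matches_dataset_shape(width: int, height: int) -> bool:
    """Return True if (width, height) is (k, 300) or (300, k) with 1 <= k <= 300."""
    longer, shorter = max(width, height), min(width, height)
    return longer == MAX_SIDE and 1 <= shorter <= MAX_SIDE


def padding_to_square(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) padding that centers the image
    on a 300x300 canvas. Assumption: padding, not resizing, is wanted."""
    if not matches_dataset_shape(width, height):
        raise ValueError(f"unexpected image size: {width}x{height}")
    pad_w = MAX_SIDE - width
    pad_h = MAX_SIDE - height
    left, top = pad_w // 2, pad_h // 2
    # Any odd leftover pixel goes to the right/bottom edge.
    return left, top, pad_w - left, pad_h - top
```

A check like this can help spot files that do not follow the described format before training starts.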
Begin by cloning the repository, or downloading it as a zip, into a directory of your choice and opening the repo's folder.
To install the environment, run the commands below:
conda env create --file conda.yml
conda activate dlmarines
poetry install
You can download the data manually from https://www.kaggle.com/datasets/vencerlanz09/sea-animals-image-dataste into data/01_raw, or use the data_downloading pipeline by running:
kedro run --pipeline=data_downloading
Note that the pipeline uses the Kaggle API, so before running it, follow the steps below to download your Kaggle key.
Read more about the data downloading pipeline.
Download your Kaggle API key:
- Sign in to Kaggle
- Go to Account
- Go to the API section and click Create New API Token. It will download kaggle.json with your username and key:
{ "username":"your_kaggle_username","key":"123456789"}
- In conf/local/credentials.yml add your username and key as shown below:
kaggle:
  username: "your_kaggle_username"
  key: "123456789"
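Copying the two values by hand is easy to get wrong, so the step can also be scripted. The following is a minimal sketch under stated assumptions: the function name `kaggle_json_to_credentials_yaml` is hypothetical, it relies only on the standard library, and it assumes the `kaggle` entry is the only content wanted in the generated text.

```python
import json


def kaggle_json_to_credentials_yaml(kaggle_json_text: str) -> str:
    """Convert the contents of kaggle.json into the YAML layout
    expected in conf/local/credentials.yml (nested under `kaggle`)."""
    data = json.loads(kaggle_json_text)
    username = data["username"]
    key = data["key"]
    # json.dumps yields a double-quoted string, which is also a valid YAML scalar.
    return (
        "kaggle:\n"
        f"  username: {json.dumps(username)}\n"
        f"  key: {json.dumps(key)}\n"
    )


if __name__ == "__main__":
    example = '{"username":"your_kaggle_username","key":"123456789"}'
    print(kaggle_json_to_credentials_yaml(example), end="")
```

If conf/local/credentials.yml already holds other credentials, append the generated text instead of overwriting the file.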
To preprocess the data from /data/01_raw/sea-animals-image-dataste.zip, use the data_processing pipeline:
kedro run --pipeline=data_processing
Read more about the data preprocessing pipeline.
To train the model, use the model_training pipeline:
kedro run --pipeline=model_training
Read more about the model training pipeline.
To evaluate the model, use the model_evaluation pipeline:
kedro run --pipeline=model_evaluation
Read more about the model evaluation pipeline.
To run all pipelines, use the command:
kedro run
Remember that to download the dataset automatically you need to add your Kaggle key first.
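If you would rather run the stages one at a time, for example to skip the download when the data is already in data/01_raw, the individual commands above can be chained from a small script. This is a sketch only: the helper names `build_commands` and `run_all` are hypothetical, and the order simply follows the order in which this README introduces the pipelines; it is an assumption that this matches what a plain `kedro run` executes.

```python
import subprocess

# Pipeline names as used in the commands above, in README order.
PIPELINES = [
    "data_downloading",
    "data_processing",
    "model_training",
    "model_evaluation",
]


def build_commands(skip_download: bool = False) -> list[list[str]]:
    """Return one `kedro run --pipeline=<name>` command per pipeline."""
    names = [
        name
        for name in PIPELINES
        if not (skip_download and name == "data_downloading")
    ]
    return [["kedro", "run", f"--pipeline={name}"] for name in names]


def run_all(skip_download: bool = False) -> None:
    """Run the pipelines in order, stopping at the first failure."""
    for command in build_commands(skip_download):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

Calling `run_all(skip_download=True)` from the repository root, inside the activated environment, would then start at the preprocessing step and need no Kaggle key.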
Here you can view the Weights & Biases report.
Detailed documentation can be found here
The technologies and main libraries used in the project: