A Lightweight CLIP-Based Ensemble Model for State-of-the-art Industrial Smoke Detection from Video Clips

This work was submitted as a thesis project for the Master Information Studies (Data Science track) at the University of Amsterdam.

Supervisor of the project: Dr. Yen-Chia Hsu ([email protected])

Date of submission: 30.06.2024

Abstract of the thesis report: Advanced computer vision techniques have led to the development of various vision-based approaches to tackle the complex challenge of smoke detection. A lightweight AI tool that can recognize potentially harmful industrial emissions would aid local communities in their fight for clean air. This thesis addresses the research gap in the potential benefits of utilizing OpenAI's Contrastive Language-Image Pre-training (CLIP) model for industrial smoke detection. The proposed Vision-Language model exploits the advantages of comparing text and image semantics. Prior research has primarily utilized Convolutional Neural Networks, Vision Transformers, and segmentation methods. This study evaluates CLIP for industrial smoke detection in terms of performance and computational efficiency. Experiments show that fully-supervised CLIP training, particularly when utilizing linear-probe representation learning, performs better than zero-shot and few-shot setups for smoke detection. Additionally, an ensemble approach, combining a fully-supervised CLIP model with a small MobileNetV3 using a GRU or LSTM layer, outperforms the baseline Two-Stream Inflated 3D ConvNet model with a single Timeception layer in terms of lightweight design, as the model can be run using only CPU cores. The ensemble model achieved an average F1 score of 0.72, indicating a promising approach for smoke detection.
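
As an illustration of the linear-probe setup described in the abstract, the sketch below extracts frozen CLIP image features for video frames and trains a logistic-regression classifier on top of them. This is a minimal example using OpenAI's clip package and scikit-learn; the frame paths, labels, and hyperparameters are placeholders, and it is not the exact training code in this repository.

    import clip                      # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git
    import torch
    from PIL import Image
    from sklearn.linear_model import LogisticRegression

    device = "cpu"                   # CPU-only, in line with the lightweight design goal
    model, preprocess = clip.load("ViT-B/32", device=device)

    def encode_frames(paths):
        # Extract frozen CLIP image embeddings for a list of frame paths (hypothetical helper).
        with torch.no_grad():
            batch = torch.stack([preprocess(Image.open(p)) for p in paths]).to(device)
            return model.encode_image(batch).cpu().numpy()

    # Placeholder frame paths and binary smoke / no-smoke labels; replace with the dataset splits.
    train_paths, train_labels = ["frames/smoke_0001.jpg", "frames/clear_0001.jpg"], [1, 0]
    test_paths, test_labels = ["frames/smoke_0002.jpg", "frames/clear_0002.jpg"], [1, 0]

    # Linear probe: a logistic-regression classifier on top of the frozen CLIP features.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(encode_frames(train_paths), train_labels)
    print("test accuracy:", probe.score(encode_frames(test_paths), test_labels))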

Used datasets

Used splits on the combined dataset

| View | S0 | S1 | S2 | S4 | S5 |
| ---- | -- | -- | -- | -- | -- |
| 0-0 | Train | Train | Test | Train | Train |
| 0-1 | Test | Train | Train | Train | Train |
| 0-2 | Train | Test | Train | Train | Train |
| 0-3 | Train | Train | Validate | Train | Test |
| 0-4 | Validate | Train | Train | Test | Validate |
| 0-5 | Train | Validate | Train | Train | Test |
| 0-6 | Train | Train | Test | Train | Validate |
| 0-7 | Test | Train | Train | Validate | Train |
| 0-8 | Train | Train | Validate | Test | Train |
| 0-9 | Train | Test | Train | Validate | Train |
| 0-10 | Validate | Train | Train | Test | Train |
| 0-11 | Train | Validate | Train | Train | Test |
| 0-12 | Train | Train | Test | Train | Train |
| 0-13 | Test | Train | Train | Train | Train |
| 0-14 | Train | Test | Train | Train | Train |
| 1-0 | Test | Test | Test | Test | Test |
| 2-0 | Test | Test | Test | Test | Test |
| 2-1 | Test | Test | Test | Test | Test |
| 2-2 | Test | Test | Test | Test | Test |
| 3-0 | Test | Test | Test | Test | Test |
| 3-1 | Test | Test | Test | Test | Test |
| 3-2 | Validate | Train | Test | Train | Train |
| 3-3 | Train | Train | Test | Train | Validate |
| 3-4 | Train | Train | Validate | Test | Train |
| 4-0 | Test | Train | Train | Train | Validate |
| 4-1 | Test | Train | Validate | Train | Train |
| 4-2 | Validate | Train | Test | Train | Train |
| 4-3 | Train | Train | Train | Test | Train |
| 5-0 | Train | Validate | Train | Train | Test |
| 5-1 | Train | Test | Train | Validate | Train |
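
For illustration, the split assignment above can be represented programmatically as a mapping from camera view to the role it plays in each split. The snippet below is a minimal sketch under the assumption that each clip carries its view identifier (e.g. "0-3") in its metadata; the dictionary shows only a few views and the select_clips helper is hypothetical, not a function from this repository.

    # Role of each camera view in splits S0, S1, S2, S4 and S5 (excerpt of the table above).
    SPLITS = {
        "0-0": {"S0": "Train", "S1": "Train", "S2": "Test", "S4": "Train", "S5": "Train"},
        "0-3": {"S0": "Train", "S1": "Train", "S2": "Validate", "S4": "Train", "S5": "Test"},
        "1-0": {"S0": "Test", "S1": "Test", "S2": "Test", "S4": "Test", "S5": "Test"},
    }

    def select_clips(clips, split, role):
        # Return clips whose camera view is assigned the given role (Train/Validate/Test)
        # in the given split. Each clip is assumed to be a dict with a 'view' key.
        return [c for c in clips if SPLITS.get(c["view"], {}).get(split) == role]

    # Example: training clips for split S2.
    clips = [{"view": "0-0", "path": "clip_a.mp4"}, {"view": "1-0", "path": "clip_b.mp4"}]
    train_clips = select_clips(clips, "S2", "Train")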

How to use the code?

  1. Fork the repository

  2. Set up a virtual environment:

    conda create -n deep-smoke-machine
    conda activate deep-smoke-machine
    conda install python=3.9.19
    conda install pip
    pip install -r requirements.txt

  3. Run the code

Note that the baseline model used in the thesis report is available at: https://github.com/szabreka/thesis-deep-smoke-machine-rsz
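
For orientation, the sketch below shows one way the ensemble described in the abstract (a fully-supervised CLIP classifier combined with a small MobileNetV3 followed by a GRU layer) could be wired in PyTorch. It is an illustration only: the feature dimensions, GRU hidden size, and the simple late-fusion averaging of the two streams are assumptions, not the exact architecture implemented in this repository.

    import torch
    import torch.nn as nn
    from torchvision.models import mobilenet_v3_small

    class SmokeEnsemble(nn.Module):
        # Illustrative ensemble: a frame-level CLIP classifier head averaged with a
        # MobileNetV3 + GRU temporal stream. Dimensions are assumptions, not thesis values.
        def __init__(self, clip_dim=512, hidden=128):
            super().__init__()
            self.clip_head = nn.Linear(clip_dim, 2)                   # linear probe on frozen CLIP features
            self.backbone = mobilenet_v3_small(weights="DEFAULT").features
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.gru = nn.GRU(input_size=576, hidden_size=hidden, batch_first=True)
            self.cnn_head = nn.Linear(hidden, 2)

        def forward(self, clip_feats, frames):
            # clip_feats: (B, T, clip_dim) frozen CLIP features; frames: (B, T, 3, H, W) video frames.
            b, t = frames.shape[:2]
            clip_logits = self.clip_head(clip_feats).mean(dim=1)      # average frame-level CLIP logits
            x = self.backbone(frames.flatten(0, 1))                   # (B*T, 576, h, w)
            x = self.pool(x).flatten(1).view(b, t, -1)                # (B, T, 576)
            _, h = self.gru(x)                                        # last hidden state: (1, B, hidden)
            cnn_logits = self.cnn_head(h.squeeze(0))
            return (clip_logits + cnn_logits) / 2                     # simple late fusion (assumed)

    # Example forward pass on random tensors (2 clips, 8 frames each).
    model = SmokeEnsemble()
    out = model(torch.randn(2, 8, 512), torch.randn(2, 8, 3, 224, 224))
    print(out.shape)  # torch.Size([2, 2])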
