lookwei / COMP4423 Public

Course materials for COMP 4423 - Computer Vision for Beginners at the Hong Kong Polytechnic University

slides
Pytorch.ipynb
README.md
T1 - Get Environment Ready.pdf
T1-coronvirus-mask.png
T10-Images.zip
T10_Detection_and_Segmentation.ipynb
T11 - RNN & Network Debug.pptx
T11_RNN.ipynb
T2-lenna.png
T2_Play_with_images_answers.ipynb
T2_Play_with_images_tasks.ipynb
T3-Challenges.ipynb
T3-Play.with.content-answers.ipynb
T3-Play.with.content-tasks.zip
T4_Feature_Extraction_Task.zip
T4_Feature_extraction_answer.ipynb
T5-play.cam-dynamic-tone-display.py
T5-play.cam-dynamic-tone-train2.py
T5_Image_retrieval_tasks.ipynb
T6-Task.zip
T6-challenge.zip
T7-data.zip
T7_Machine_learning_Deep_learning_tasks.ipynb
T8-Data.zip
T8-Task.ipynb
T9-Task.zip

Computer Vision for Beginners - COMP4423 @ PolyU HK

The Lectures and Tutorials

L1 Introduction to Computer Vision

What is Computer Vision?
Applications (object detection, semantic segmentation, style transfer, etc.)
A brief history of Computer Vision
Play with FPV Recognition

Lecture Slides: L1-Introduction.pdf

Video Link: https://youtu.be/sWwWroRpqkM?si=V3FSwlet643YTDSU

Tutorial Environment Setup: T1-Get Environment Ready

L2 Image Processing I: Let's play with the images

How humans and computers see images
Display the images
Play with the images (colors, sizes, rotations)
Examples from IMHere

Lecture Slides: L2-Image.Processing.I.pdf

Video Link: https://youtu.be/scrAoh-L7KU?si=w2AmQ0Pl4AAgBoJd

Tutorial Tasks (Google CoLab): T2-Play.with.images-tasks.ipynb

Tutorial Answers (Google CoLab): T2-Play.with.images-answers.ipynb

Image Lenna: T2-lenna.png

L3 Image Processing II: Let's play with the content

Filters and convolutions
Edge Filters
Noise Reduction
Morphological Operations
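
As a rough illustration of the filtering topics above, the following sketch applies a Sobel edge filter with a plain 2D convolution. It is not taken from the course notebooks: the function name `convolve2d`, the tiny test image, and the choice of "valid" borders (no padding) are assumptions made for this example.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution of a grayscale image (list of rows) with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    # Flipping the kernel in both axes is what separates convolution from correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[y + i][x + j] * flipped[i][j]
            row.append(acc)
        output.append(row)
    return output


# Horizontal-gradient Sobel kernel: responds to vertical edges.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# Hypothetical 4x6 image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

edges = convolve2d(image, SOBEL_X)
for row in edges:
    print(row)
```

The response is zero in the flat regions and non-zero only in the columns that straddle the dark-to-bright boundary. In practice a library routine (for example from OpenCV or SciPy) would replace the nested loops.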

Lecture Slides: L3-Image.Processing.II.pdf

Video Link: https://youtu.be/UVGG4ZFQWrw?si=DkQj4y8ppGYacYxO

Tutorial Tasks (Google CoLab): T3-Play.with.content-tasks.ipynb

Tutorial Answers (Google CoLab): T3-Play.with.content-answers.ipynb

Challenge Tasks (Google CoLab): T3-Play.with.content-challenge.ipynb

Virus Image: T1-coronvirus-mask.png

Image Lenna: T2-lenna.png

L4 Feature Extraction

Feature vectors
Feature Space
Quantization
Metrics (Distance and Similarity)
Global and Local Features (Color Histograms, LBP, SIFT)
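
The topics above (feature vectors, quantization, distance and similarity) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the tutorial's code: the helper names, the 4-bin quantization, and the sample pixel values are assumptions.

```python
import math


def gray_histogram(pixels, bins=4, max_value=256):
    """Quantize gray levels into `bins` buckets and return a normalized histogram."""
    width = max_value // bins
    hist = [0] * bins
    for p in pixels:
        hist[min(p // width, bins - 1)] += 1
    total = len(pixels)
    return [count / total for count in hist]


def euclidean_distance(a, b):
    """Distance metric: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Similarity metric: 1.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two hypothetical images, flattened to lists of gray levels.
dark_image = [0, 10, 20, 30, 70, 200]
bright_image = [130, 180, 200, 220, 240, 255]

f_dark = gray_histogram(dark_image)
f_bright = gray_histogram(bright_image)

print("distance:", euclidean_distance(f_dark, f_bright))
print("similarity:", cosine_similarity(f_dark, f_bright))
```

Each image becomes a point in a 4-dimensional feature space. The two metrics then compare those points rather than the raw pixels.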

Lecture Slides: L4-Feature.Extraction.pdf

Video Link: https://youtu.be/7UUWyQiCtfU?si=mbCBjrJLwoi6kXhO

Demo: Keypoint extraction and tracking

Demo 2: Keypoint extraction and tracking

Tutorial Tasks (Google CoLab): T4-Feature_extraction_task

Tutorial Answers (Google CoLab): T4-Feature_extraction_answers

L5 Image Retrieval Fundamentals

Clustering
K-Means
Content-based image retrieval (CBIR)
Bag of Visual Words (BoVW)
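
To make the clustering step concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. It is an illustration only: the function name `kmeans`, the seed-based random initialization, and the toy 2D points are assumptions, and the course tutorials may use a library implementation instead.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Cluster tuples of numbers into k groups; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2D descriptors forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))
```

In a Bag of Visual Words pipeline the points would be local descriptors (such as SIFT vectors), the centroids would form the visual vocabulary, and each image would be described by a histogram of its descriptors' nearest centroids.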

Lecture Slides: L5-Image.Retrieval.pdf

Video Link: https://youtu.be/VtCf9HCqAEw?si=a-7A9YHesKOWu49g

Tutorial Tasks (Google CoLab): T5-Image.retrieval-tasks.ipynb

Sample code for the tone modifier challenge:

For vocabulary learning: T5-Challenge-train
Tone modification and display: T5-Challenge-display

L6 Image Classification Fundamentals

Classification
Supervised learning
K nearest neighbors (k-NN)
Bayesian classifiers
Support vector machines (SVM)
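
Of the classifiers listed above, k-NN is the simplest to write out. The sketch below is a hedged illustration in plain Python, not the tutorial's solution: `knn_predict`, the 2D feature points, and the "rock"/"paper" labels are made up for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors with their class labels (supervised data).
train_points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_labels = ["rock", "rock", "rock", "paper", "paper", "paper"]

print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (7, 7)))
```

There is no training phase: the labeled examples are stored as they are, and all the work happens at query time.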

Lecture Slides: L6-Image.Classification.pdf

Video Link: https://youtu.be/bUwGY5sqZHU?si=GSxOPDWWQaSr0dw9

Paper Rock Scissors Game Demo: https://youtu.be/dGwou6Khvqo?si=zoMzRBObLU9FUXZr

Tutorial Tasks: T6-Image-Classification

Challenges: T6-Challenges

L7 Traditional Machine Learning to Deep Learning

Traditional machine learning vs. deep learning
Gradient descent
Neural networks
Deep neural networks
Convolutional neural networks (CNN)
Layers, pooling, and activations
AlexNet, VGG, and ResNet
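
Gradient descent, the optimization idea behind the networks listed above, fits in a few lines. The following is a minimal sketch on a one-dimensional function; the function `gradient_descent`, the learning rate, and the example objective f(x) = (x - 3)^2 are assumptions chosen for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to approach a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Minimize f(x) = (x - 3)^2, whose gradient is f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)
```

Training a neural network follows the same loop, with x replaced by all the weights and the gradient computed by backpropagation.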

Lecture Slides: L7-Machine.learning.Deep.learning.pdf

Video Link: https://youtu.be/xc5MKb8LNBo?si=MlCAFszzgy001A3e

Tutorial Tasks (Google CoLab): T7-Machine.learning.Deep.learning-tasks.ipynb

Tutorial Data: T7-data.zip

L8 Deep Image Retrieval

Deep image retrieval
Feature aggregation/embedding/fusion
Fine tuning (Siamese/Triplet networks)
R-MAC, VLAD, BoVW

Lecture Slides: L8-Deep.image.retrieval.pdf

Video Link: https://youtu.be/klu6SHHoC2E?si=5vCc6-mbt-VzCOlN

Tutorial Answers (Google CoLab): T8-Deep.image.retrieval-answers.ipynb

Tutorial Data: T8-data.zip

PyTorch - Quick Start: T8-Pytorch-Quick-Start.ipynb

L9 CAM, Attentions and Transformers

Class Activation Mapping (CAM)
Attention
Self-attention and Transformers
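
As a rough sketch of the attention mechanism named above, the code below computes scaled dot-product attention, softmax(QK^T / sqrt(d)) V, in plain Python. The function names and the tiny example matrices are assumptions for illustration. Real models use tensor libraries and add learned projections and multiple heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; every argument is a list of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        dim_v = len(values[0])
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)])
    return outputs


# Hypothetical inputs: one query attending over two key/value pairs.
queries = [[1.0, 0.0]]
keys = [[1.0, 1.0], [1.0, 1.0]]
values = [[0.0, 2.0], [4.0, 6.0]]
print(attention(queries, keys, values))
```

Because the two keys are identical, the query weights both values equally and the output is their average. In self-attention, the queries, keys, and values are all derived from the same sequence.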

Lecture Slides: L9-CAM.Attention.Transformer.pdf

Video Link: https://youtu.be/Ypi4F7nt2u4?si=9FDTkpZw3UIjwdvz

Tutorial Answers: T9-CAM and ViT

L10 Detection & Segmentation

Object detection and Image Segmentation
YOLO
UNet
R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN

Lecture Slides: L10-Detection.Segmentation.pdf

Video Link: https://youtu.be/gdDDQtcttZA?si=LgCJqo5hs1vuT7Bg

Tutorial Answers (Google CoLab): T10-Detection.Segmentation-answers.ipynb

Tutorial Data: T10-Images

L11 Learning Paradigms

Multi-task learning
N-shot learning (Few-shot, Zero-shot)
Transfer learning, Metric learning, Meta-learning
Generative networks (VAE, GAN)
Reinforcement learning

Lecture Slides: L11-Learning.Paradigms.pdf

Video Link: https://youtu.be/_jyfvaiB4g4

Tutorial RNN: T11-RNN.ipynb

Tutorial Slides: T11-RNN-and-Network-Debug

L12 Large Models

RNN and Image Captioning
Transformers
Large Language Models

Lecture Slides: L12-Large.Models.pdf

Appendix: Image-Synthesis
