- conda 4.12.0 (later versions may also work) - Installation
- VS Code - Ubuntu Installation
- (Optional) CUDA Version: 11.4; Driver Version: 470.129.06 - Installation
- Install Poetry by following the Poetry full guide.
  - Important: check that it works:
    poetry --version
- Keep the .venv folder inside your project:
  poetry config virtualenvs.in-project true
- Start the Poetry shell:
  poetry shell
  - Important: if you have conda and two environments end up activated at once, run:
    conda deactivate
- Install the dependencies:
  poetry install --no-root
To activate the environment on later use (important: you must be inside your project directory):
poetry shell
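As a quick sanity check after `poetry shell`, the following minimal sketch reports whether the running interpreter lives in a `.venv` folder inside the project. The helper name `in_project_venv` is hypothetical and not part of Poetry; it only compares paths, and it assumes the script is run from the project root.

```python
import sys
from pathlib import Path


def in_project_venv(project_root: Path, prefix: str = sys.prefix) -> bool:
    """Return True if `prefix` (the environment root) is exactly <project_root>/.venv."""
    expected = (project_root / ".venv").resolve()
    return Path(prefix).resolve() == expected


if __name__ == "__main__":
    root = Path.cwd()  # assumption: the current directory is the project root
    print("in-project .venv active:", in_project_venv(root))
```

If this prints `False` inside the Poetry shell, the `virtualenvs.in-project` setting was probably applied after the environment had already been created elsewhere.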
If you have CUDA:
conda env create -f environment_gpu.yaml
Otherwise:
conda env create -f environment.yaml
To activate the environment:
conda activate iasa_nlp_env
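The choice between the two environment files could be sketched as below. This is only an illustration: the function names are hypothetical, and treating the presence of `nvidia-smi` on `PATH` as a sign of a usable GPU is an assumption. It does not verify the CUDA or driver version.

```python
import shutil


def pick_environment_file(has_gpu: bool) -> str:
    """Map GPU availability to the environment file named in the steps above."""
    return "environment_gpu.yaml" if has_gpu else "environment.yaml"


def nvidia_driver_visible() -> bool:
    """Rough heuristic (an assumption): nvidia-smi on PATH suggests an NVIDIA driver is installed."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    env_file = pick_environment_file(nvidia_driver_visible())
    # Print the command instead of running it, so nothing is created by accident.
    print(f"conda env create -f {env_file}")
```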
You may use any free port:
jupyter lab --port 7766
- Structure and structural elements of an ML problem statement. Formalization of business problems. Main tasks and methods in Natural Language Processing
- Author: Sydorskyi Volodymyr
- Recording: https://drive.google.com/drive/folders/166r0s2p8Exc7Fucs3XFkHPuQaHB3LDAQ?usp=drive_link
- Machine representation of natural languages. Classical and neural vectorization algorithms. Classical ML approaches in NLP
- Author: Yelisieiev Vladyslav
- Recording: https://drive.google.com/drive/folders/1ZShCNBmnlQrvsReLdMfAy8k6oLkDK9jT?usp=drive_link
- Main metrics in NLP (natural language processing). Evaluating approaches and models in NLP: validation
- Author: Bazdyrev Anton
- Recording: https://drive.google.com/drive/folders/1Ee-RTDhgxWCa8MPpyI29j7CIJF0t6x6s?usp=drive_link
- Approaches using RNN/GRU/LSTM architectures
- Author: Sydorskyi Volodymyr
- Recording: https://drive.google.com/drive/folders/1SMGWOdwuBeN69DcGv_jbdV0zzftYAIx4?usp=drive_link
- Approaches using the Transformer architecture
- Author: Bazdyrev Anton
- Recording: https://drive.google.com/drive/folders/1hpdbO4ElfSt44c5MHuT-b0fhgKDSVCou?usp=drive_link
- Generative tasks: machine translation, text summarization, conditional and unconditional text generation, an overview of the GPT architecture
- Author: Yelisieiev Vladyslav
- Recording: https://drive.google.com/drive/folders/1uWAWtQzOaGvpkdzWWPs5YXM-T7rjLayG?usp=drive_link
- The clustering task. The topic modeling task
- Author: Sydorskyi Volodymyr
- Recording: https://drive.google.com/drive/folders/1lLx7zBQ1GnoJHP02vGaoGehaRJfG6Lwa?usp=drive_link
- MLOps - model deployment
- Author: Bazdyrev Anton
- Recording: https://drive.google.com/drive/folders/1TUsONxXg-RaCVuSPd5CPSsiPnqz6Nbbg?usp=drive_link
- Create Kaggle account
- Create Notebook
- Explore the docs and find out how to:
  - Add a Kaggle dataset to a notebook
  - Turn on the GPU
- Create Notebook in Colab
- Enable GPU
- Add Kaggle dataset to Colab - https://www.geeksforgeeks.org/how-to-import-kaggle-datasets-directly-into-google-colab/
- For most lectures you will need datasets from Kaggle. Prepare them in advance:
- CommonLit - Evaluate Student Summaries dataset API command:
kaggle competitions download -c commonlit-evaluate-student-summaries
- Natural Language Processing with Disaster Tweets dataset API command:
kaggle competitions download -c nlp-getting-started
- Mantis Analytics Location Detection dataset:
kaggle datasets download -d vladimirsydor/mantis-analytics-location-detection
- Dataset for Topic Modelling:
https://drive.google.com/drive/folders/1jwh225T0DIEN4A1wMZ8-dVJX-2Tsovqf?usp=sharing
- We recommend creating a data folder in the course root directory and putting all datasets there. You might then have the following structure:
data/
    nlp_getting_started/
        train.csv
        test.csv
        ...
    ...
Lecture_1/
...
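One way to get a downloaded archive into this layout is sketched below. It assumes the download left a zip file such as `nlp-getting-started.zip` in the current directory (the file name is an assumption, so adjust it to whatever was actually produced). The helper `unpack_dataset` is hypothetical and uses only the standard library.

```python
import zipfile
from pathlib import Path


def unpack_dataset(archive: Path, data_root: Path) -> Path:
    """Extract `archive` into data_root/<archive name with '-' replaced by '_'> and return that folder."""
    target = data_root / archive.stem.replace("-", "_")
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target


if __name__ == "__main__":
    # Assumed archive name; change it to match the file you downloaded.
    archive = Path("nlp-getting-started.zip")
    if archive.exists():
        print("extracted to", unpack_dataset(archive, Path("data")))
    else:
        print(f"{archive} not found; download the dataset first")
```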
- Create Kaggle account
- Proceed with Installation & Authentication
- Don't forget to join the competition and accept its rules on the Kaggle website.
- Download dataset with API command
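Before running a download command, it can help to confirm that the Kaggle API will find your credentials. The sketch below checks the locations the Kaggle API is commonly documented to read: the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables, or a `kaggle.json` file in `~/.kaggle` (or in the directory named by `KAGGLE_CONFIG_DIR`). Treat these locations as an assumption that may differ between client versions; the function name is hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def kaggle_credentials_present(
    env: Mapping[str, str] = os.environ, home: Optional[Path] = None
) -> bool:
    """Return True if credentials are visible via env vars or a kaggle.json file."""
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True
    # Assumed default location; KAGGLE_CONFIG_DIR overrides it when set.
    if env.get("KAGGLE_CONFIG_DIR"):
        config_dir = Path(env["KAGGLE_CONFIG_DIR"])
    else:
        config_dir = (home or Path.home()) / ".kaggle"
    token = config_dir / "kaggle.json"
    if not token.is_file():
        return False
    try:
        data = json.loads(token.read_text())
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and bool(data.get("username")) and bool(data.get("key"))


if __name__ == "__main__":
    print("Kaggle credentials found:", kaggle_credentials_present())
```

This only checks that credentials exist. It does not contact Kaggle, so a stale or revoked token will still pass.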
Raw table: https://docs.google.com/spreadsheets/d/1P38uhwkMQo0cd1avywbnVJ-dwxiswHpAcIpEHrlv1PY/edit?usp=sharing
- Process recordings and upload them to YouTube
- Process 2023 Feedback
@misc{iasa_nlp_course_2023,
author = {Sydorskyi, Volodymyr and Bazdyrev, Anton and Yelisieiev, Vladyslav},
title = {IASA NLP course 2023},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/VSydorskyy/iasa_nlp_course}},
}