nlp-predicting-tags-stackoverflow

How to predict tags for posts from StackOverflow? To solve this task you will use a multilabel classification approach.

Week 1 of Natural Language Processing Course

Problem

Consider the task of predicting tags for posts from StackOverflow.
To solve it, you will use a multilabel classification approach based on a bag-of-words model with tf-idf features.
Because each post can carry several tags, we use a One-vs-Rest classifier with logistic regression, which trains one binary classifier per tag.
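
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy titles, the tag sets, and the `max_iter` setting are assumptions made up for the example, and the real notebook may preprocess text and tune parameters differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Toy titles and tag sets, invented for illustration only.
titles = [
    "how to sort a list in python",
    "python pandas dataframe merge",
    "java string to int conversion",
    "java list sort with comparator",
    "python read csv with pandas",
    "convert int to string in java",
]
tags = [
    ["python"],
    ["python", "pandas"],
    ["java"],
    ["java"],
    ["python", "pandas"],
    ["java"],
]

# Turn the tag lists into a binary indicator matrix, one column per tag.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(tags)

# Bag-of-words representation with tf-idf weighting.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(titles)

# One binary logistic regression per tag (assumed settings).
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, y)

# Predict tags for an unseen title and map the indicator row back to tag names.
new_titles = ["sort a pandas dataframe in python"]
pred = clf.predict(vectorizer.transform(new_titles))
predicted_tags = mlb.inverse_transform(pred)
print(predicted_tags)
```

With so little data the predicted tags are not meaningful; the point is only how the pieces fit together.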

Libraries

In this task you will need the following libraries:

NumPy: a package for scientific computing.
Pandas: a library providing high-performance, easy-to-use data structures and data analysis tools for Python.
scikit-learn: a tool for data mining and data analysis.
NLTK: a platform to work with natural language.

Datasets

In this task you will work with a dataset of post titles from StackOverflow. The data is split into three sets: train, validation, and test. All sets except test contain post titles and their corresponding tags (100 tags are available). The test set is provided for Coursera's grading and does not contain answers.
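
As a rough sketch of how such a split might be loaded: the file layout is not stated here, so the example assumes a tab-separated format with `title` and `tags` columns, where the tags are stored as a Python list literal. The `read_data` helper and the two-row in-memory sample are hypothetical stand-ins for the real files.

```python
import io
from ast import literal_eval

import pandas as pd

# Hypothetical two-row sample standing in for a train or validation file.
sample_tsv = (
    "title\ttags\n"
    "How to sort a list in Python?\t['python', 'list']\n"
    "Merge two DataFrames\t['python', 'pandas']\n"
)


def read_data(source):
    """Read a tab-separated source and parse the tags column into lists."""
    data = pd.read_csv(source, sep="\t")
    data["tags"] = data["tags"].apply(literal_eval)
    return data


train = read_data(io.StringIO(sample_tsv))
X_train, y_train = train["title"].values, train["tags"].values
print(X_train[0], y_train[0])
```

For the test set, which has no answers, only the title column would be read under the same assumptions.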

Results and Conclusions

We evaluated each model with the weighted F1 score and submitted the results on the Coursera platform.
The best model reached an F1 score of 0.65 on the validation set and passed the Coursera evaluation.
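
To make the metric concrete, here is a small pure-Python sketch of a weighted F1 score for multilabel predictions: F1 is computed per tag and averaged with weights equal to each tag's number of true occurrences. The `weighted_f1` function and the toy labels are illustrative assumptions; in practice scikit-learn's `f1_score` with `average="weighted"` is the usual choice.

```python
def weighted_f1(y_true, y_pred, labels):
    """Average per-label F1, weighted by each label's support in y_true."""
    weighted_sum = 0.0
    total_support = 0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        support = tp + fn
        denom = 2 * tp + fp + fn
        # A label that is never true and never predicted gets F1 = 0 here.
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy example, invented for illustration.
y_true = [{"python"}, {"python", "pandas"}, {"java"}, {"java"}]
y_pred = [{"python"}, {"python"}, {"java"}, {"python"}]
score = weighted_f1(y_true, y_pred, ["python", "pandas", "java"])
print(round(score, 3))  # 0.587
```

In this toy case the frequent tags dominate the average, so missing the single `pandas` occurrence costs less than it would under an unweighted (macro) average.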

Folders and files

.ipynb_checkpoints
README.md
grader.py
lemmatization_demo.ipynb
metrics.py
tfidf_demo.ipynb
week1_MultilabelClassification.ipynb

vgp314/nlp-predicting-tags-stackoverflow
