Text Classification With Transfer Learning

As part of a 2021 summer internship with Algomo, I fine-tuned DistilBERT on the banking77 dataset, reaching a maximum accuracy of 99.7% in question intent classification. I used pandas to load and manipulate the dataset, Keras (TensorFlow) to build the models, and matplotlib and seaborn to visualise metadata and key processes.

The task was a way to learn about deep learning, and specifically transfer learning in NLP, and to gain familiarity with BERT-based models. I became interested in how the distribution of the training dataset could affect accuracy, so I compared three models trained on differently sampled versions of the banking77 training set (Hugging Face already splits the dataset into train and test sets). I then experimented with a final model trained on a sample of the banking77 dataset with the same distribution as the test set, motivated by the fact that in the real world training and test datasets often have similar distributions.
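As a rough illustration of sampling a training set to follow a target distribution, here is a minimal sketch. It assumes "distribution" means the proportion of examples per intent label. The function name `sample_to_match`, the toy data, and the integer labels are all hypothetical; the notebook itself uses pandas and may do this differently.

```python
import random
from collections import Counter, defaultdict


def sample_to_match(train, target_labels, n, seed=0):
    """Draw roughly n (text, label) pairs from `train` so that the label
    proportions follow those observed in `target_labels`."""
    rng = random.Random(seed)

    # Group the training examples by label.
    by_label = defaultdict(list)
    for text, label in train:
        by_label[label].append((text, label))

    # Count how often each label occurs in the target (e.g. test) set.
    target_counts = Counter(target_labels)
    total = sum(target_counts.values())

    sample = []
    for label, count in sorted(target_counts.items()):
        pool = by_label.get(label, [])
        # Desired number of examples for this label, capped by availability.
        k = min(len(pool), round(n * count / total))
        sample.extend(rng.sample(pool, k))  # sampling without replacement

    rng.shuffle(sample)
    return sample


# Toy data (hypothetical): three intents with a 50/30/20 split in training,
# and a target set with a 25/25/50 split.
train = (
    [(f"query {i}", 0) for i in range(50)]
    + [(f"query {i}", 1) for i in range(30)]
    + [(f"query {i}", 2) for i in range(20)]
)
target_labels = [0] * 10 + [1] * 10 + [2] * 20

sample = sample_to_match(train, target_labels, n=40)
label_counts = dict(Counter(label for _, label in sample))
print(label_counts)
```

Because sampling is done without replacement, a label that is rare in the training set caps how many examples can be drawn for it, so the returned sample can be smaller than `n`.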
