This code complements my diploma thesis, "Text Classification Using Unsupervised Learning Techniques", for the Electrical and Computer Engineering Department of the Aristotle University of Thessaloniki, Greece.
Author: Kitsios Konstantinos
Correspondence address: [email protected]
The code uses subroutines and pretrained neural networks from the following research papers:
[1]: D. Cer, Y. Yang, S.-y. Kong, N. Hua, N. Limtiaco, R. S. John, N. Constant, M. Guajardo-Cespedes, S. Yuan, C. Tar, Y.-H. Sung, B. Strope, and R. Kurzweil, “Universal sentence encoder,” arXiv preprint arXiv:1803.11175, 2018.
[2]: A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language models are unsupervised multitask learners,” 2019.
[3]: N. Pitsianis, A. Iliopoulos, D. Floros, and X. Sun, “Spaceland embedding of sparse stochastic graphs,” 2019 IEEE High Performance Extreme Computing Conference (HPEC), pp. 1-8, 2019.
The packages needed to run the code are listed in the requirements.txt file. Install them with pip by running:
pip install -r requirements.txt
The code for each dataset can be run from its associated notebook (.ipynb).