SGNS-PFNE is an open-source implementation of PFNE trained with skip-gram with negative sampling (SGNS).
- Linux (tested on Ubuntu 16.04)
- Python3
- gcc (>= 5)
- NumPy
- argparse
- pickle
- data/: Pre-processed corpus.
- ngram2~ngram6/: Computing PATI scores for 2-grams through 6-grams.
- embedding/: Training distributed word vectors based on SGNS-PFNE.
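
For readers unfamiliar with skip-gram with negative sampling, the following is a minimal NumPy sketch of a single SGNS update. It is not the code in `embedding/`; the function name `sgns_step`, the matrices `W_in` and `W_out`, and the learning rate are hypothetical and chosen only for illustration. It assumes vocabulary entries have already been mapped to integer IDs and that negative samples are drawn elsewhere.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sgns_step(W_in, W_out, target, context, negatives, lr=0.025):
    """Apply one SGD update for a (target, context) pair plus negatives.

    W_in, W_out: (vocab_size, dim) input and output embedding matrices,
                 updated in place (hypothetical names).
    Returns the SGNS loss computed before the update.
    """
    v = W_in[target].copy()                          # target vector, (dim,)
    ids = np.concatenate(([context], negatives)).astype(np.int64)
    labels = np.zeros(len(ids))
    labels[0] = 1.0                                  # only the true context is positive
    u = W_out[ids]                                   # (k + 1, dim), a copy
    scores = sigmoid(u @ v)

    # Loss: -log sigma(u_c . v) - sum over negatives of log sigma(-u_n . v)
    loss = -np.log(scores[0]) - np.sum(np.log(1.0 - scores[1:]))

    grad = scores - labels                           # d(loss) / d(u_i . v)
    # ufunc.at accumulates correctly even if an ID appears more than once.
    np.subtract.at(W_out, ids, lr * np.outer(grad, v))
    W_in[target] -= lr * (grad @ u)
    return loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_in = rng.normal(scale=0.1, size=(10, 8))
    W_out = rng.normal(scale=0.1, size=(10, 8))
    print(sgns_step(W_in, W_out, target=1, context=2, negatives=[3, 4, 5]))
```

Repeating the step on the same pair should drive the loss down, which is a quick sanity check before training on a real corpus.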
- Oshikiri, T. (2017). Segmentation-Free Word Embedding for Unsegmented Languages. In Proceedings of EMNLP 2017.
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. In Proceedings of ICLR 2013.