Assessing Emoji Use in Modern Text Processing Tools
We have developed a test suite that assesses how well text processing tools handle different kinds of emoji, including skin-tone-modified and composite emoji, in tokenization and other natural language processing tasks.
The suite covers the following tasks:
- Tokenization
- Part of speech tagging
- Dependency parsing
- Sentiment analysis
The following tools are evaluated:
- Gensim
- NLTK
- NLTK Tweet Tokenizer
- PyNLPl
- spaCy
- spacymoji
- Stanford CoreNLP
- Stanza
- TextBlob
- Email: [email protected]
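Skin-tone and composite emoji are hard for tokenizers because each one is a sequence of several Unicode code points that should stay together as a single token. The sketch below illustrates the kind of check such a test suite might perform. It is not taken from the suite itself: the two tokenizers and the `keeps_intact` helper are hypothetical names introduced here, and only the Python standard library is used.

```python
import re

# Emoji built from several code points (written as escapes for clarity).
WAVE_MEDIUM_SKIN = "\U0001F44B\U0001F3FD"  # waving hand + skin tone modifier
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man, ZWJ, woman, ZWJ, girl


def whitespace_tokenize(text):
    """Hypothetical baseline tokenizer: split on whitespace only."""
    return text.split()


def codepoint_tokenize(text):
    """Hypothetical naive tokenizer: words stay whole, and every other
    non-space code point becomes its own token."""
    return re.findall(r"\w+|[^\w\s]", text)


def keeps_intact(tokenize, text, emoji):
    """Return True if the full emoji sequence survives as a single token."""
    return emoji in tokenize(text)


if __name__ == "__main__":
    for name, emoji in [("skin tone", WAVE_MEDIUM_SKIN), ("composite", FAMILY)]:
        text = "hello " + emoji
        for tokenize in (whitespace_tokenize, codepoint_tokenize):
            ok = keeps_intact(tokenize, text, emoji)
            print(f"{name:10} {tokenize.__name__:20} intact={ok}")
```

To run the same check against one of the listed tools, its tokenizer could be wrapped in a function that takes a string and returns a list of token strings, then passed to `keeps_intact` in place of the baselines.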