This repo contains my virtual poster presentation for SciPy 2023 on Rozha, a package that simplifies and streamlines natural language processing methods for a wide variety of languages, so that users can apply them to both non-English and English texts. The notebook included in this repo can also be accessed via Google Colab at this link.
Much of the work done with natural language processing (NLP) has followed an Anglocentric model, pairing English texts with tools and computer models designed primarily for the English language. Rozha was created to make it easier for people to begin working with non-English materials in their NLP and digital humanities work. It is a Python package that simplifies multilingual NLP processes and pipelines, with a focus on supporting academic users in computational research on non-English languages.