Added ability to limit the languages to check for #62

Open
wants to merge 1 commit into
base: master


whnr commented Feb 20, 2019

Added a list parameter to load_profiles that limits which languages are loaded. The same limitation is available through detect(text, languages=[]) and detect_langs(text, languages=[]). The _factory is reloaded automatically when the language selection changes.
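To illustrate the reload-on-change behaviour described above, here is a minimal, self-contained sketch. It is not the code from this PR and does not call langdetect: `ALL_PROFILES`, the `load_profiles` stand-in, and `FactoryCache` are all hypothetical names made up for the example. It also uses `None` rather than `[]` as the default, which is a choice of this sketch to avoid a mutable default argument; an empty list is still treated as "no limitation".

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional

# Hypothetical stand-in for the bundled profiles: language code -> profile data.
ALL_PROFILES: Dict[str, str] = {
    "en": "english-ngrams",
    "de": "german-ngrams",
    "fr": "french-ngrams",
}


def load_profiles(languages: Optional[FrozenSet[str]] = None) -> Dict[str, str]:
    """Hypothetical loader: return every profile, or only the requested ones."""
    if languages is None:
        return dict(ALL_PROFILES)
    unknown = languages - set(ALL_PROFILES)
    if unknown:
        raise ValueError(f"no profile for: {sorted(unknown)}")
    return {code: ALL_PROFILES[code] for code in sorted(languages)}


class FactoryCache:
    """Rebuilds the cached factory only when the language selection changes."""

    def __init__(
        self,
        loader: Callable[[Optional[FrozenSet[str]]], Dict[str, str]] = load_profiles,
    ) -> None:
        self._loader = loader
        self._selection: Optional[FrozenSet[str]] = None
        self._factory: Optional[Dict[str, str]] = None
        self.reloads = 0  # counts how often the loader actually ran

    def get(self, languages: Optional[Iterable[str]] = None) -> Dict[str, str]:
        # None and an empty list both mean "no limitation".
        selection = frozenset(languages) if languages else None
        if self._factory is None or selection != self._selection:
            self._factory = self._loader(selection)
            self._selection = selection
            self.reloads += 1
        return self._factory
```

Because the selection is compared as a frozenset, calling `get(["en", "de"])` and then `get(["de", "en"])` reuses the cached factory; only a genuinely different selection triggers a reload.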

Added a list as language limitation for load_profiles
@Vangelys

Hello, I also want to limit the languages used in detection for a project. Is there any news about this functionality and whether this branch will be accepted?


ManuelMartinG commented Jul 3, 2020

I'm also interested in this feature. Is there any expectation that it will be merged in a new release?

My concern is that if you use langdetect within a PySpark UDF, it's not efficient to load every available language. Doing so adds considerable overhead to the serialized size of the UDF passed to Spark. Usually you expect only a limited number of languages to appear in your application, so there is no need to load such a big list by default.


3 participants