Add hf tokenizer class definition header #57

Open: cptspacemanspiff wants to merge 3 commits into main

Issue:

The current HFTokenizer implementation has specialized methods that expose the add_special_tokens flag. However, these live inside the .cc file and are not exposed as part of the API/CMake includes.

Fix:

This splits the HFTokenizer class definition from its implementation, which lets me place the definition in the header file (I put it in tokenizers_cpp.h).

I also added/moved the factory methods for constructing the derived HFTokenizer class into HFTokenizer, and then had the base class's factory methods call those (relying on the implicit conversion of the unique pointer from HFTokenizer to Tokenizer).
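For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the factory arrangement described above. It is not the code from this PR: the class bodies are toy stand-ins, the name `kFakeBosId` and the byte-per-token encoding are assumptions made purely for illustration, and the method signatures are only assumed to resemble the library's. In the PR the class definition would sit in the header and the member definitions in the .cc file; the sketch keeps everything in one block for brevity.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical marker id standing in for a real special token.
constexpr int32_t kFakeBosId = -1;

// Toy stand-in for the abstract base class.
class Tokenizer {
 public:
  virtual ~Tokenizer() = default;
  virtual std::vector<int32_t> Encode(const std::string& text) = 0;
  // Base factory; defined below, once HFTokenizer is a complete type.
  static std::unique_ptr<Tokenizer> FromBlobJSON(const std::string& json_blob);
};

// Toy stand-in for the derived class whose definition becomes visible to users.
class HFTokenizer : public Tokenizer {
 public:
  explicit HFTokenizer(std::string json_blob) : blob_(std::move(json_blob)) {}

  // Specialized overload exposing the add_special_tokens flag.
  // Assumption: one "token" per byte, just so the sketch runs.
  std::vector<int32_t> Encode(const std::string& text, bool add_special_tokens) {
    std::vector<int32_t> ids;
    if (add_special_tokens) ids.push_back(kFakeBosId);
    for (unsigned char c : text) ids.push_back(static_cast<int32_t>(c));
    return ids;
  }

  // Base-class interface forwards to the specialized overload.
  std::vector<int32_t> Encode(const std::string& text) override {
    return Encode(text, false);
  }

  // Derived factory returns the derived type, so callers keep the extra methods.
  static std::unique_ptr<HFTokenizer> FromBlobJSON(const std::string& json_blob) {
    return std::make_unique<HFTokenizer>(json_blob);
  }

 private:
  std::string blob_;
};

// The base factory delegates to the derived one; the returned
// unique_ptr<HFTokenizer> converts implicitly to unique_ptr<Tokenizer>.
std::unique_ptr<Tokenizer> Tokenizer::FromBlobJSON(const std::string& json_blob) {
  return HFTokenizer::FromBlobJSON(json_blob);
}
```

Callers that only need the generic interface can keep using `Tokenizer::FromBlobJSON`, while callers that need the flag can call `HFTokenizer::FromBlobJSON` and use the two-argument `Encode` directly.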
