- The training samples are those shared by @akshala, with a slight change to the tags:
- Original example: "German" - "BIIIII"
- Changed to: "German" - "BBBBBB"
- The samples are saved in CoNLL format for ease of loading. The fields are [id, form, tag].
- test.conll: the name does not mean that this is a test file. It is a temporary name for the file.
- test.conll is referenced in loadDataset.py.
- loadDataset.py: loads the CoNLL-format file into the format used by HuggingFace's Datasets library, for ease of training.
- full_pipeline.ipynb: performs hashtag segmentation.
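
To make the [id, form, tag] layout concrete, here is a minimal sketch of writing and reading such a file. It is not the code in loadDataset.py. It assumes one character per row, tab-separated columns, ids starting at 1, and a blank line between samples; none of these details are confirmed by the repository. The helper names `to_conll` and `parse_conll` are hypothetical.

```python
from typing import Dict, List, Tuple


def to_conll(samples: List[Tuple[str, str]]) -> str:
    """Serialize (text, tags) samples as tab-separated id/form/tag rows.

    Assumption: each character of the text is one row, and samples are
    separated by a blank line.
    """
    blocks = []
    for text, tags in samples:
        if len(text) != len(tags):
            raise ValueError(f"length mismatch for {text!r}: {tags!r}")
        rows = [
            f"{i}\t{ch}\t{tag}"
            for i, (ch, tag) in enumerate(zip(text, tags), start=1)
        ]
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks) + "\n"


def parse_conll(content: str) -> Dict[str, List[List[str]]]:
    """Parse id/form/tag rows into one list per column, grouped by sample."""
    data: Dict[str, List[List[str]]] = {"id": [], "form": [], "tag": []}
    ids: List[str] = []
    forms: List[str] = []
    tags: List[str] = []
    # The extra empty string flushes the last sample.
    for line in content.splitlines() + [""]:
        if not line.strip():
            if forms:
                data["id"].append(ids)
                data["form"].append(forms)
                data["tag"].append(tags)
                ids, forms, tags = [], [], []
            continue
        idx, form, tag = line.split("\t")
        ids.append(idx)
        forms.append(form)
        tags.append(tag)
    return data


if __name__ == "__main__":
    # The example sample from the list above.
    conll_text = to_conll([("German", "BBBBBB")])
    print(conll_text)
    print(parse_conll(conll_text))
```

The parsed result is a plain dictionary of columns, which is a shape that `datasets.Dataset.from_dict` accepts if the Datasets library is installed. Whether loadDataset.py builds the dataset this way is an assumption.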
prashantkodali/HashSet