An embedding method derived from syllables and morphemes to address the out-of-vocabulary (OOV) problem in agglutinative languages.

Meinwerk/SyllableLevelLanguageModel

Syllable Level Language Model

Description

We propose a word embedding derived from syllables and morphemes to improve language model performance. Our method achieves state-of-the-art performance in terms of Key Stroke Saving (KSS) compared with existing device input prediction methods, and it has been commercialized.
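For readers unfamiliar with the metric, the sketch below shows one common way to compute a keystroke-saving percentage. The formula used here, KSS = (total characters - keys pressed) / total characters x 100, is an assumption based on the usual definition of keystroke savings; this README does not state the exact formula used in the evaluation, and the function name `keystroke_saving` is hypothetical.

```python
def keystroke_saving(total_chars: int, keys_pressed: int) -> float:
    """Return keystroke saving as a percentage.

    Assumed definition (not taken from this repository):
        KSS = (total_chars - keys_pressed) / total_chars * 100
    where total_chars is the length of the target text and keys_pressed
    is the number of keys the user actually had to press with prediction.
    """
    if total_chars <= 0:
        raise ValueError("total_chars must be positive")
    if keys_pressed < 0:
        raise ValueError("keys_pressed must be non-negative")
    return 100.0 * (total_chars - keys_pressed) / total_chars


# Example: a 10-character text entered with only 4 key presses.
print(keystroke_saving(10, 4))
```

Under this assumed definition, a higher value means the predictor saved the user more typing.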

Data

This data set is used to evaluate the model proposed in "".

The evaluation data was manually curated to compare our performance with existing word prediction methods. The data set consists of 67 sentences (825 words, 7,531 characters) collected from formal and informal utterances from various sources, covering general keyboard scenarios.

Download

You can download the data set directly from the command line:

git clone https://github.com/Meinwerk/SyllableLevelLanguageModel.git

You can also download the data set as a zip file using the following URL:

https://github.com/Meinwerk/SyllableLevelLanguageModel/master.zip 

Evaluation Data

./eval_kss_ko.txt
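As a quick sanity check after downloading, the sketch below counts sentences, words, and characters in a collection of lines. It assumes the evaluation file is UTF-8 text with one sentence per line and whitespace-separated words; the README does not document the file format, so these are assumptions, and the helper name `corpus_stats` is hypothetical.

```python
from typing import Dict, Iterable


def corpus_stats(lines: Iterable[str]) -> Dict[str, int]:
    """Count non-empty lines, whitespace-separated words, and characters.

    Assumes one sentence per line. Characters are counted after stripping
    leading and trailing whitespace, so inner spaces are included.
    """
    sentences = [line.strip() for line in lines if line.strip()]
    return {
        "sentences": len(sentences),
        "words": sum(len(s.split()) for s in sentences),
        "characters": sum(len(s) for s in sentences),
    }


# Small in-memory example; blank lines are ignored.
print(corpus_stats(["hello world\n", "\n", "  a b c \n"]))
```

To apply it to the downloaded file, pass an open file object, for example `corpus_stats(open("eval_kss_ko.txt", encoding="utf-8"))`. The totals may differ from the figures reported above if the original counts treated spaces or punctuation differently.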

Contact

Contact: Seunghak Yu, Nilesh Satish Kulkarni
Email: <full_name>@gmail.com
