r-tae/nyingarn-phonetic-search

Elasticsearch phonetic search plugin for Nyingarn

Algorithm description (not comprehensive):

  1. replace unicode diacritics with ASCII versions
  2. remove macron from vowels
  3. replace diphthong combinations with -y-/-w- and initial i/u -> yi/wu
  4. -ow/-aw -> AWU
  5. AH/ER/UH/AR -> A
  6. EN/EL -> IN/IL
  7. long vowel -> short vowel
  8. UA/UI/OA -> UWA/UWI
  9. remove double letters (except for rr)
  10. enye -> NG
  11. YNY/YLY/YN/YL -> NY/LY
  12. double digraphs -> single digraph (e.g., NGNG -> NG)
  13. final Y -> AYI
  14. initial G -> K
  15. single letter replacements:
      i. B -> P
      ii. D -> T
      iii. O -> U
      iv. E -> I
  16. G -> K (except for NG)
  17. di- and tri-graphs -> J
  18. S -> J
  19. C -> K, WH -> W
  20. WU -> U (except initially or after AIU)
  21. WA/WI/WU -> AWA/AWI/AWU (except initially or after AIU)
  22. RA/RI/RU -> URA/URI/URU (except initially or after AIUR)
  23. Y -> AYI (except next to AIU, or after N or L)
  24. AY -> AYI (except before AIU)
  25. Y -> I (except after AIUNLT)
  26. remove special characters and digits
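
A few of the steps above can be sketched in Python to show how ordered rewrite rules of this kind compose. This is only an illustration under stated assumptions, not the plugin's code: the function names `strip_diacritics` and `phonetic_key_sketch` are hypothetical, only steps 1, 2, 9, 12, 14, 15, 16, 19 and 26 are covered, the set of digraphs in step 12 (NG, NY, LY) is a guess based on the one example given, and enye (step 10) is not handled specially.

```python
import re
import unicodedata


def strip_diacritics(text: str) -> str:
    """Steps 1 and 2 (assumed approach): decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def phonetic_key_sketch(word: str) -> str:
    """Apply a small, illustrative subset of the listed rules in list order."""
    s = strip_diacritics(word).upper()

    # Step 9: collapse doubled letters; R is left out of the class so RR survives.
    s = re.sub(r"([A-QS-Z])\1+", r"\1", s)

    # Step 12: repeated digraph -> single digraph.
    # Assumption: only NG, NY and LY are treated as digraphs here.
    s = re.sub(r"(NG|NY|LY)\1+", r"\1", s)

    # Step 15: single letter replacements B->P, D->T, O->U, E->I.
    s = s.translate(str.maketrans("BDOE", "PTUI"))

    # Steps 14 and 16: G -> K unless the G is part of NG.
    # An initial G is never preceded by N, so one pattern covers both steps.
    s = re.sub(r"(?<!N)G", "K", s)

    # Step 19: WH -> W, then C -> K.
    s = s.replace("WH", "W").replace("C", "K")

    # Step 26: remove anything that is not a letter (digits, punctuation, spaces).
    return re.sub(r"[^A-Z]", "", s)


if __name__ == "__main__":
    for example in ["gabba", "warra", "ngngurra", "código-12"]:
        print(example, "->", phonetic_key_sketch(example))
```

Because each rule rewrites the output of the previous one, the order matters: for example, doubled letters are collapsed before the G -> K rule runs, so `GG` after `N` is first reduced to a single `G` and then protected as part of `NG`.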
