A Soundex-like phonetic algorithm for the Greek language.
word --> SGRSimp | SGRExtra Θάλασσα --> θ740 | θ969 θάλλασα --> θ740 | θ969 θάλασα --> θ740 | θ969 θαλασών --> θ748 | θ969 μήνυμα --> μ880 | μ@7@ μύνημα --> μ880 | μ@7@ μίνιμα --> μ880 | μ@7@ μοίνειμα --> μ880 | μ@7@ έτοιμος --> έ384 | ε3@7 έτιμος --> έ384 | ε3@7 έτημος --> έ384 | ε3@7 έτυμος --> έ384 | ε3@7 έτιμως --> έ384 | ε3@7 αίτημος --> α384 | ε3@7 αυγό --> α200 | α12$ αβγό --> α120 | α12$ αυγολάκια --> α276 | α12$ αβγά --> α120 | α129 αυγά --> α200 | α129 τζατζίκι --> τ434 | c94@ τσατζίκι --> τ434 | c94@ τσατσίκι --> τ434 | c94@ κορονοιός --> κ!84 | κ$8$ κοροναιός --> κ!84 | κ$8$ οβελίας --> ο174 | ο1*6 ωβελύας --> ω174 | ο1*6 οβελίσκος --> ο174 | ο1*6 Βαγγέλης --> β274 | β95* Βαγκέλης --> β267 | β95* Βαγκαίλης --> β267 | β95* Γιάννης --> γ840 | γ@97 Γιάνης --> γ840 | γ@97 Γιάνννης --> γ840 | γ@97 αναδιατάσσω --> α833 | α793 αναδιέταξα --> α833 | α793 θαύμα --> θ800 | θ917 θάβμα --> θ180 | θ917 θαυμαστικό --> θ843 | θ917
This work has started in the context of the Diploma thesis of Antrei Kavros under the supervision of Yannis Tzitzikas, in the Computer Science Department of the University of Crete, GREEECE.
- client/Examples: prints indicative examples
- client/Prompt: for computing the codes for words inputed through the command line
- evaluation/BulkCheck: for running experiments over an evaluation collection that is included (in the folder Resources)
More information is available in the following paper:
- Antrei Kavros and Yannis Tzitzikas, SoundexGR: An Algorithm for Phonetic Matching for the Greek Language, Natural Language Engineering, (accepted for publication), 2022 (preprint: http://users.ics.forth.gr/~tzitzik/publications/Tzitzikas_2022_SoundexGR.pdf)
- Antrei Kavros
- Yannis Tzitzikas