Morpheme Based Myanmar Word Segmenter


Article type :

Original Article

Author :

Sin Thi Yar Myint | Hanni Htun | Myat Myo Nwe Wai

Volume :

3

Issue :

5

Abstract :

Myanmar script has no fixed delimiters between words or syllables, so segmenting text into correct, meaningful words is a challenging task. This paper proposes a morpheme-based Myanmar word tokenizer that combines rule-based syllable breaking with dictionary-lookup syllable merging, using a longest string matching approach. The proposed approach is tested on a monolingual dictionary that contains useful information for word segmentation. The dictionary holds more than 32,581 words, including headwords, stop words and essential words, in the Myanmar3 font. These words are collected from Myanmar and Essential Words dictionaries. According to the experimental results, the approach provides promising segmentation accuracy on Myanmar text.

Sin Thi Yar Myint | Hanni Htun | Myat Myo Nwe Wai, "Morpheme Based Myanmar Word Segmenter", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd26520.pdf Paper URL: https://www.ijtsrd.com/computer-science/other/26520/morpheme-based-myanmar-word-segmenter/sin-thi-yar-myint
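
The dictionary-lookup merging step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the text has already been broken into syllables by some rule-based step, and the function name `merge_syllables`, the `max_len` parameter and the three-entry toy dictionary are hypothetical, chosen only for illustration. At each position it tries the longest run of syllables that forms a dictionary entry and falls back to a single syllable when nothing matches.

```python
def merge_syllables(syllables, dictionary, max_len=None):
    """Greedy longest-match merging of syllables into dictionary words.

    syllables  -- list of syllable strings (assumed already segmented)
    dictionary -- set of known words (hypothetical toy data in this sketch)
    max_len    -- optional cap on syllables per word; defaults to no cap
    """
    if max_len is None:
        max_len = len(syllables)
    words = []
    i = 0
    while i < len(syllables):
        longest = min(max_len, len(syllables) - i)
        # Try the longest candidate first, then shrink one syllable at a time.
        for n in range(longest, 0, -1):
            candidate = "".join(syllables[i:i + n])
            # A single syllable is always accepted so the loop makes progress.
            if n == 1 or candidate in dictionary:
                words.append(candidate)
                i += n
                break
    return words


# Hypothetical toy dictionary; the paper's dictionary is far larger.
toy_dictionary = {"မြန်မာ", "မြန်မာစာ", "စာ"}
toy_syllables = ["မြန်", "မာ", "စာ"]
toy_words = merge_syllables(toy_syllables, toy_dictionary)
print(toy_words)
```

Because the longest candidate is tried first, the three syllables merge into the single three-syllable entry rather than into two shorter entries. Greedy longest matching is simple and fast, but it can pick a wrong boundary when a long entry overlaps the start of the next word.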

Keyword :

Syllable breaking, Morpheme, style, styling