Semantic-based Malay-English Translation using N-Gram Model
Keywords:
Machine Translation, Malay-English Translation, N-Gram, Semantic, Ambiguous
Abstract
Most existing machine translation systems rely on word-for-word translation. The major obstacle in developing such systems is that natural language is not free from ambiguity: one word may have more than one meaning, and vice versa. Herein, we propose a semantic-based Malay-English translation method using an n-gram model. The translation is not performed on a word-for-word basis but depends on the semantic meaning of the Malay phrase. In particular, a bigram model is used to approximate the probability of a word by its conditional probability given the preceding word. In this study, whenever semantic ambiguity occurs, the English word with the highest probability is chosen to translate the Malay word (or two-word Malay sequence). The proposed technique was tested on three categories of sentences: easy, moderate and complex. The performance of the proposed Malay-English translation is evaluated by human judgement, which yields a positive average validity ratio. A positive value indicates that at least half of the respondents agreed that the translation outputs at least “still make sense semantically”. The contribution of the proposed method lies in enhancing word-for-word translation to resolve the ambiguity issue in Malay-English translation.
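The bigram selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English corpus, the `lexicon` mapping, and the function names `bigram_prob` and `choose_translation` are all hypothetical, and the probability is a plain maximum-likelihood estimate without smoothing.

```python
from collections import Counter

# Toy English corpus (hypothetical; stands in for real training text).
corpus = [
    "i can swim",
    "i can read",
    "she may go",
    "open the can",
]

# Count unigrams and bigrams, with a sentence-start marker.
unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))


def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev) = count(prev, word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


# Hypothetical lexicon: an ambiguous Malay word and its candidate English translations.
lexicon = {"boleh": ["can", "may"]}


def choose_translation(prev_english, malay_word):
    """Pick the candidate with the highest bigram probability given the preceding English word."""
    candidates = lexicon[malay_word]
    return max(candidates, key=lambda w: bigram_prob(prev_english, w))


print(choose_translation("i", "boleh"))
print(choose_translation("she", "boleh"))
```

In this toy setting the same Malay word receives a different English translation depending on the preceding word, which is the behaviour the abstract attributes to the bigram model. A realistic system would need a much larger corpus and some form of smoothing for unseen word pairs.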
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)