Semantic-based Malay-English Translation using N-Gram Model

Authors

  • Nooraini Yusoff, School of Computing, UUM College of Arts and Sciences, Universiti Utara Malaysia, 06010 UUM Sintok, Kedah, Malaysia.
  • Zulikha Jamaludin, School of Computing, UUM College of Arts and Sciences, Universiti Utara Malaysia, 06010 UUM Sintok, Kedah, Malaysia.
  • Muhammad Hilmi Yusoff, School of Computing, UUM College of Arts and Sciences, Universiti Utara Malaysia, 06010 UUM Sintok, Kedah, Malaysia.

Keywords:

Machine Translation, Malay-English Translation, N-Gram, Semantic, Ambiguous

Abstract

Most existing machine translation systems are based on word-for-word translation. The major obstacle in developing such a system is that natural language is not free from ambiguity. One word may have more than one meaning, and one meaning may be expressed by more than one word. Herein, we propose a semantic-based Malay-English translation using an n-gram model. The translation is not performed on a word-for-word basis but depends on the semantic meaning of the Malay phrase. In particular, a bigram model is used to approximate the probability of a word by its conditional probability given the preceding word. In this study, whenever semantic ambiguity occurs, the English word with the highest probability value is chosen to translate the Malay word (or two-word Malay sequence). The proposed technique has been tested with three categories of sentences, namely easy, moderate and complex. The performance of the proposed Malay-English translation is evaluated by human judgement, which yields a positive average validity ratio. A positive value indicates that at least half of the respondents agreed that the translation outputs at least “still make sense semantically”. The contribution of the proposed method is an enhancement of word-for-word translation that addresses the ambiguity issue in Malay-English translation.
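
The bigram selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy English corpus, the candidate lexicon, and the names bigram_prob, choose and translate are all hypothetical, and the sketch assumes that each candidate is scored by its conditional probability given the previously chosen English word.

```python
from collections import Counter

# Toy English corpus (hypothetical; not the data used in the paper).
corpus = [
    "i walk to market",
    "they walk to school",
    "the road to market is long",
]

# Count bigrams and their left-hand contexts, with "<s>" marking sentence start.
bigrams = Counter()
contexts = Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split()
    for prev, word in zip(tokens, tokens[1:]):
        bigrams[(prev, word)] += 1
        contexts[prev] += 1


def bigram_prob(prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for an unseen context."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


def choose(prev, candidates):
    """Pick the candidate with the highest bigram probability.

    On a tie, max() keeps the first candidate in the list.
    """
    return max(candidates, key=lambda w: bigram_prob(prev, w))


def translate(malay_tokens, lexicon):
    """Translate left to right, resolving ambiguous words with the bigram model."""
    prev = "<s>"
    output = []
    for token in malay_tokens:
        # Words missing from the lexicon are passed through unchanged.
        candidates = lexicon.get(token, [token])
        best = choose(prev, candidates)
        output.append(best)
        prev = best
    return output


# Illustrative lexicon (assumption): "jalan" is treated as ambiguous.
lexicon = {
    "saya": ["i"],
    "jalan": ["road", "walk"],
    "ke": ["to"],
    "pasar": ["market"],
}

print(translate(["saya", "jalan", "ke", "pasar"], lexicon))
```

In this toy setting the ambiguous word is resolved by its left context alone: "walk" is preferred after "i", while "road" would be preferred after "the". A practical system would also need smoothing for unseen bigrams, which the sketch omits.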

Published

2016-12-01

How to Cite

Yusoff, N., Jamaludin, Z., & Yusoff, M. H. (2016). Semantic-based Malay-English Translation using N-Gram Model. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 8(10), 117–123. Retrieved from https://jtec.utem.edu.my/jtec/article/view/1382