Evaluating LSTM Networks, HMM and WFST in Malay Part-of-Speech Tagging


  • Tien-Ping Tan School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.
  • Bali Ranaivo-Malançon Faculty of Computer Science & Information Technology, Universiti Malaysia Sarawak, Sarawak, Malaysia.
  • Laurent Besacier LIG, Université Grenoble Alpes, CNRS, Grenoble, France.
  • Yin-Lai Yeong School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.
  • Keng Hoon Gan School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.
  • Enya Kong Tang School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.


Malay Part-Of-Speech Tagging, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM) Networks, Sequence-To-Sequence Learning


Long short-term memory (LSTM) networks have been gaining popularity for modeling sequential data in tasks such as phoneme recognition, speech translation, language modeling, speech synthesis, and chatbot-like dialogue systems. This paper investigates attention-based encoder-decoder LSTM networks for Malay part-of-speech (POS) tagging, comparing them against a weighted finite state transducer (WFST) and a hidden Markov model (HMM). The attractiveness of LSTM networks lies in their strength in modeling long-distance dependencies. Malay POS tagging is examined under two conditions: with and without morphological information. The experimental results show that LSTM networks trained without any explicit morphological knowledge perform nearly as well as the WFST and better than the HMM approach trained with morphological information.
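The long-distance modeling strength mentioned above comes from the LSTM cell's gated update of its internal state. The sketch below is a minimal NumPy illustration of the standard LSTM recurrence (input, forget, and output gates plus a candidate cell state); it is not the paper's actual tagger, and all sizes and weights here are arbitrary assumptions for demonstration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step: x is the input vector, (h_prev, c_prev)
    the previous hidden and cell states. W, U, b stack the parameters
    of the four gate pre-activations (i, f, o, g) row-wise."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b       # all four gate pre-activations, shape (4*H,)
    i = sigmoid(z[0:H])              # input gate: how much new content to write
    f = sigmoid(z[H:2*H])            # forget gate: how much old state to keep
    o = sigmoid(z[2*H:3*H])          # output gate: how much state to expose
    g = np.tanh(z[3*H:4*H])          # candidate cell content
    c = f * c_prev + i * g           # additive state update enables long memory
    h = o * np.tanh(c)               # new hidden state
    return h, c

# Tiny demo: run a random 5-step sequence through one cell.
rng = np.random.default_rng(0)
D, H = 3, 4                          # input and hidden sizes (arbitrary)
W = rng.normal(size=(4 * H, D)) * 0.1
U = rng.normal(size=(4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
print(h.shape)                       # prints (4,)
```

Because the cell state is updated additively (gated by `f` and `i`) rather than squashed through a nonlinearity at every step, gradients can flow across many time steps, which is what makes this architecture attractive for sequence labeling such as POS tagging.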






How to Cite

Tan, T.-P., Ranaivo-Malançon, B., Besacier, L., Yeong, Y.-L., Hoon Gan, K., & Tang, E. K. (2017). Evaluating LSTM Networks, HMM and WFST in Malay Part-of-Speech Tagging. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 9(2-9), 79–83. Retrieved from https://jtec.utem.edu.my/jtec/article/view/2679