Evaluating LSTM Networks, HMM and WFST in Malay Part-of-Speech Tagging
Keywords:
Malay Part-of-Speech Tagging, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM) Networks, Sequence-to-Sequence Learning
Abstract
Long short-term memory (LSTM) networks have been gaining popularity for modeling sequential data in tasks such as phoneme recognition, speech translation, language modeling, speech synthesis and chatbot-like dialog systems. This paper investigates attention-based encoder-decoder LSTM networks for Malay part-of-speech (POS) tagging and compares them with a weighted finite state transducer (WFST) and a hidden Markov model (HMM). The appeal of LSTM networks lies in their strength in modeling long-distance dependencies. Malay POS tagging is examined under two conditions: with and without morphological information. The experimental results show that LSTM networks trained without any explicit morphological knowledge perform nearly on par with the WFST but better than the HMM approach that is trained with morphological information.
License
TRANSFER OF COPYRIGHT AGREEMENT
The manuscript is herewith submitted for publication in the Journal of Telecommunication, Electronic and Computer Engineering (JTEC). It has not been published before, and it is not under consideration for publication in any other journal. It contains no material that is scandalous, obscene, libelous or otherwise contrary to law. When the manuscript is accepted for publication, I, as the author, hereby agree to transfer to JTEC all rights, including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author(s) specifically retain(s):
- All proprietary rights other than copyright, such as patent rights;
- The right to make further copies of all or part of the published article for my use in classroom teaching;
- The right to reuse all or part of this manuscript in a compilation of my own works or in a textbook of which I am the author; and
- The right to make copies of the published work for internal distribution within the institution that employs me.
I agree that copies made under these circumstances will continue to carry the copyright notice that appears in the original published work. I agree to inform my co-authors, if any, of the above terms. I certify that I have obtained written permission for the use of text, tables, and/or illustrations from any copyrighted source(s), and I agree to supply such written permission(s) to JTEC upon request.