Enhanced Affixation Word Stemmer with Stemming Error Reducer to Solve Affixation Stemming Errors

Authors

  • Mohamad Nizam Kassim, Strategic Research, CyberSecurity Malaysia, The Mines Resort City, 43300 Seri Kembangan, Malaysia
  • Mohd Aizaini Maarof, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai Johore, Malaysia
  • Anazida Zainal, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai Johore, Malaysia
  • Amirudin Abdul Wahab, Strategic Research, CyberSecurity Malaysia, The Mines Resort City, 43300 Seri Kembangan, Malaysia

Abstract

A word stemming algorithm (or word stemmer) is an important preprocessing component in information retrieval and text categorization that aims to reduce derived words to their respective root words. Most existing Malay word stemmers adopt a rule-based affix removal method and dictionary lookup to stem affixation words. Although many stemming approaches have been proposed in past research, existing Malay word stemmers still suffer from affixation stemming errors due to the complexity of Malay morphology. These stemming errors can be classified into over-stemming, under-stemming, unstemmed words, and special variations and exceptions. Hence, this paper presents an enhanced affixation word stemmer that aims to solve these stemming errors. This paper also examines the root causes of these stemming errors in existing Malay stemmers. The experimental results indicate that the enhanced word stemmer is able to stem prefixation, suffixation, confixation and infixation words with better stemming accuracy by using an enhanced Rule Application Order and a Stemming Errors Reducer.
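
To make the general idea of rule-based affix removal combined with dictionary lookup concrete, the following is a minimal sketch in Python. It is not the authors' stemmer: the root-word dictionary `ROOT_WORDS`, the affix lists `PREFIXES` and `SUFFIXES`, the rule order, and the function names are all assumptions chosen for illustration, and infixes and spelling variations are not handled. It shows how checking the dictionary before and after each rule can avoid one kind of over-stemming, for example stripping "-an" from the root word "makan".

```python
# Hypothetical toy root-word dictionary; a real stemmer would load a full Malay lexicon.
ROOT_WORDS = {"makan", "jalan", "main", "minum", "ajar"}

# Illustrative subsets of Malay affixes, assumed for this sketch (not the paper's rule set).
PREFIXES = ("ber", "per", "me", "di")
SUFFIXES = ("kan", "an", "i")


def candidates(word):
    """Yield candidate stems in a fixed rule order: prefix, suffix, then confix."""
    for p in PREFIXES:
        if word.startswith(p):
            yield word[len(p):]
    for s in SUFFIXES:
        if word.endswith(s):
            yield word[:-len(s)]
    for p in PREFIXES:
        for s in SUFFIXES:
            long_enough = len(word) > len(p) + len(s)
            if long_enough and word.startswith(p) and word.endswith(s):
                yield word[len(p):-len(s)]


def stem(word):
    """Return the first candidate found in the dictionary, else the word unchanged."""
    word = word.lower()
    if word in ROOT_WORDS:
        # Dictionary check first: root words such as "makan" are never stripped.
        return word
    for candidate in candidates(word):
        if candidate in ROOT_WORDS:
            return candidate
    return word  # left unstemmed when no rule yields a known root


if __name__ == "__main__":
    for w in ["berjalan", "makanan", "permainan", "makan"]:
        print(w, "->", stem(w))
```

In this sketch the order in which `candidates` tries the rules decides which stem is returned when several rules match, which is why rule application order matters for stemming accuracy.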

Published

2016-06-01

How to Cite

Kassim, M. N., Maarof, M. A., Zainal, A., & Abdul Wahab, A. (2016). Enhanced Affixation Word Stemmer with Stemming Error Reducer to Solve Affixation Stemming Errors. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 8(3), 37–41. Retrieved from https://jtec.utem.edu.my/jtec/article/view/999