Evaluation of Transformer-Based Models for Sentiment Analysis in Bahasa Malaysia
DOI: https://doi.org/10.54554/jtec.2025.17.01.004
Keywords: Transformer-based models, Sentiment analysis, Bahasa Malaysia, Natural Language Processing
Abstract
This study investigates the application of advanced Transformer-based models, namely BERT, DistilBERT, BERT-multilingual, ALBERT, and BERT-CNN, for sentiment analysis in Bahasa Malaysia, addressing challenges unique to social media text such as mixed-language usage and abbreviated expressions. Using the Malaya dataset to ensure linguistic diversity and domain coverage, the research incorporates robust preprocessing techniques, including synonym mapping and sentiment-aware tokenization, to enhance feature extraction. Under rigorous evaluation, BERT-CNN achieved the highest accuracy (96.3%), followed by BERT-multilingual (89.84%) and BERT (89.5%). DistilBERT and ALBERT delivered competitive accuracy (88.96% and 88.76%, respectively) with reduced computational requirements, highlighting the trade-off between performance and efficiency. The study emphasizes optimized strategies for handling the challenges of positive sentiment classification and demonstrates the efficacy of Transformer architectures for nuanced sentiment detection in low-resource languages. These findings contribute to advancing Natural Language Processing (NLP) toward scalable sentiment analysis across domains.
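The abstract's best-performing model is a BERT-CNN hybrid. Below is a minimal sketch of that general design, assuming the common formulation in which contextual token embeddings from a pretrained BERT encoder are passed through parallel 1D convolutions and global max-pooling before a sentiment head; the encoder checkpoint, kernel sizes, filter count, and two-class label set are illustrative assumptions, not the authors' exact configuration.

```python
# Hypothetical BERT-CNN hybrid for sentiment classification.
# Assumptions (not from the paper): bert-base-multilingual-cased encoder,
# kernel sizes (3, 4, 5), 100 filters each, binary positive/negative labels.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertCNNClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-multilingual-cased",
                 num_labels=2, kernel_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # One Conv1d per kernel size, sliding over the token dimension.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes
        )
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_labels)

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, hidden) contextual embeddings from BERT.
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Conv1d expects (batch, channels, seq_len).
        x = hidden_states.transpose(1, 2)
        # Convolve, apply ReLU, then global max-pool each feature map.
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = self.dropout(torch.cat(pooled, dim=1))
        return self.classifier(features)

# Minimal usage on a Bahasa Malaysia example sentence.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertCNNClassifier()
batch = tokenizer(["Filem ini sangat bagus!"], return_tensors="pt",
                  padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # torch.Size([1, 2]) -> positive/negative scores
```

In this formulation the CNN layer acts as an n-gram feature detector over BERT's contextual embeddings, which is one plausible reason such hybrids can outperform a plain classification head on noisy, code-mixed social media text.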
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License.