Twitter Data Classification using Multinomial Naive Bayes for Tropical Diseases Mapping in Indonesia

Authors

  • Romy Ranovan, Department of Informatics, Universitas Sebelas Maret (UNS), Surakarta, Indonesia.
  • Afrizal Doewes, Department of Informatics, Universitas Sebelas Maret (UNS), Surakarta, Indonesia.
  • Ristu Saptono, Department of Informatics, Universitas Sebelas Maret (UNS), Surakarta, Indonesia.

Keywords:

Classification, Mapping, Multinomial Naive Bayes, Tropical Diseases

Abstract

Tropical diseases are diseases commonly found in tropical and subtropical regions. The goal of this research is to map tropical diseases using data from Twitter, to help policymakers take essential steps regarding health conditions in Indonesia. Tweet classification was conducted in two phases, both using Multinomial Naive Bayes. The first phase filters out non-Indonesian tweets, and the second phase classifies the tweets containing disease information. The results identify the disease type and location with high accuracy, supported by a map visualization.
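The two-phase scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny training sets, the labels ("id", "other", "dengue", "malaria"), the whitespace tokenizer, and the names `language_filter`, `disease_classifier`, and `classify_tweet` are all hypothetical, chosen only to show how one Multinomial Naive Bayes model can filter tweets by language before a second one assigns a disease class.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial Naive Bayes over word counts (Laplace smoothing)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 is an assumed default

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [w for w in text.lower().split() if w in self.vocab]
        n_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_docs.items():
            total = sum(self.word_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(n / n_docs)  # log prior
            for w in tokens:              # add smoothed log likelihoods
                score += math.log((self.word_counts[label][w] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Phase 1: hypothetical toy data separating Indonesian from other tweets.
language_filter = MultinomialNB().fit(
    ["saya demam tinggi sejak kemarin",
     "banyak kasus malaria di papua",
     "adik kena demam berdarah di jakarta",
     "i have a high fever today",
     "malaria cases are rising in africa",
     "the weather is nice in london"],
    ["id", "id", "id", "other", "other", "other"],
)

# Phase 2: hypothetical toy data labelling Indonesian tweets by disease.
disease_classifier = MultinomialNB().fit(
    ["adik kena demam berdarah di jakarta",
     "demam berdarah mewabah di surabaya",
     "banyak kasus malaria di papua",
     "nyamuk malaria menyerang warga papua"],
    ["dengue", "dengue", "malaria", "malaria"],
)


def classify_tweet(tweet):
    """Return a disease label, or None if phase 1 rejects the tweet."""
    if language_filter.predict(tweet) != "id":
        return None
    return disease_classifier.predict(tweet)


for tweet in ["kasus demam berdarah di jakarta",
              "malaria cases are rising in africa",
              "warga papua kena malaria"]:
    print(tweet, "->", classify_tweet(tweet))
```

In this sketch, a tweet rejected by the language filter never reaches the disease classifier, so the second model only needs to be trained on Indonesian text. Mapping would additionally require a location for each tweet, which is not modelled here.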

Published

2018-07-03

How to Cite

Ranovan, R., Doewes, A., & Saptono, R. (2018). Twitter Data Classification using Multinomial Naive Bayes for Tropical Diseases Mapping in Indonesia. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10(2-4), 155–159. Retrieved from https://jtec.utem.edu.my/jtec/article/view/4335