Hosting Customer Clustering Based On Log Web Server Using K-Means Algorithm

Authors

  • Mutiara Auliya Khadija, Universitas Sebelas Maret, Surakarta, Central Java, Indonesia.
  • Wiranto Wiranto, Universitas Sebelas Maret, Surakarta, Central Java, Indonesia.
  • Abdul Aziz, Universitas Sebelas Maret, Surakarta, Central Java, Indonesia.

Keywords:

Data Mining, Clustering, Hosting Customer, K-Means Algorithm, Log Web Server

Abstract

To compete in a global industry, a company needs a sound business strategy, and this is especially true for a domain and hosting company that faces many competitors. Such a strategy can be informed by analyzing hosting customer behavior as recorded in web server logs. The log data most relevant to customer access is recorded in the access.log file, and potential customers were identified from requests for /pesan in that file. One popular method for mining server log data is clustering with the K-Means algorithm. It was chosen because it executes quickly, is easy to implement, and works well on large numeric data sets. The optimal value of K was determined with the Elbow Method and the Calinski-Harabasz Index. The results indicate that clustering web server log data with the K-Means algorithm can reveal the behavior pattern of hosting customers. Five clusters were formed for both the data by week and the data by access time. Ranked by ordering activity, the clusters fall in the sequence 1, 2, 3, 4, 0. The areas with the most orders are Jakarta (cluster 1); Bandung, Semarang, and Surabaya (cluster 2); and Medan, Tangerang, Malang, and Yogyakarta (cluster 3). Orders are most frequent at the beginning of the month, between 12.00 and 23.59. This customer behavior can serve as a reference for choosing a business strategy that expands marketing into clusters 4 and 0, and it can help other stakeholders set policies to develop the company.
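As an illustration of how the Elbow Method and the Calinski-Harabasz Index can be used together to choose K, the following is a minimal sketch, not the authors' implementation. The data points are synthetic, the two features (week of month, hour of access) are only assumed to resemble the features described above, and a deterministic farthest-point initialization is swapped in for the usual random initialization so that the run is reproducible. All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def farthest_point_init(points, k):
    """Deterministic init (an assumption of this sketch): start at the first
    point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's iteration: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, until labels stabilize."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean(members)
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares, the quantity plotted in the Elbow Method."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def calinski_harabasz(points, labels, centroids):
    """Ratio of between-cluster to within-cluster dispersion, each divided
    by its degrees of freedom (k - 1 and n - k). Higher is better."""
    n, k = len(points), len(centroids)
    overall = mean(points)
    between = sum(
        labels.count(c) * sq_dist(centroids[c], overall) for c in range(k)
    )
    within = wcss(points, labels, centroids)
    return (between / (k - 1)) / (within / (n - k))


# Synthetic (week of month, hour of access) pairs; NOT data from the study.
points = [
    (1, 2), (1, 3), (2, 2), (2, 3),
    (1, 20), (1, 21), (2, 20), (2, 21),
    (4, 11), (4, 12), (5, 11), (5, 12),
]

wcss_by_k = {}
ch_by_k = {}
for k in range(1, 7):
    labels, centroids = kmeans(points, k)
    wcss_by_k[k] = wcss(points, labels, centroids)
    if k >= 2:  # the index is undefined for a single cluster
        ch_by_k[k] = calinski_harabasz(points, labels, centroids)

best_k = max(ch_by_k, key=ch_by_k.get)
for k in sorted(wcss_by_k):
    print(k, round(wcss_by_k[k], 2), round(ch_by_k.get(k, float("nan")), 2))
print("K with the highest Calinski-Harabasz index:", best_k)
```

On this toy data the within-cluster sum of squares falls sharply up to K = 3 and flattens afterwards, and the Calinski-Harabasz Index peaks at the same value, so the two criteria agree. That outcome reflects only the synthetic points used here, not the five clusters reported in the study.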

Published

2018-07-03

How to Cite

Khadija, M. A., Wiranto, W., & Aziz, A. (2018). Hosting Customer Clustering Based On Log Web Server Using K-Means Algorithm. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10(2-4), 75–79. Retrieved from https://jtec.utem.edu.my/jtec/article/view/4320