Big Data Analytics: Feature Selection and Machine Learning for Intrusion Detection on Microsoft Azure Platform


  • Nachirat Rachburee, Department of Computer Engineering, Faculty of Engineering, Rajamangala University of Technology Thanyaburi, Pathumthani, Thailand
  • Wattana Punlumjeak, Department of Computer Engineering, Faculty of Engineering, Rajamangala University of Technology Thanyaburi, Pathumthani, Thailand


Big Data, Feature Selection, Intrusion Detection


In recent years, the volume of network data has grown at an exponential rate. An intrusion detection system that works on such a massive dataset needs a platform that provides both storage and computing capacity. This research used cloud analytics on Microsoft Azure, which provides an appropriate environment, to store the big dataset, preprocess the data, classify it, and evaluate the results. Because of the growth in data volume, intrusion detection models that adopt data mining techniques have been used to detect intrusion patterns. Our research used mutual information and chi-square as feature selection techniques to reduce the feature set and thereby the computation time. Decision forest and neural network classifiers were then used to classify the attack type of each intrusion on the full (100%) KDD CUP 1999 dataset. The performance of intrusion detection was measured by the accuracy of attack-type detection, obtained from the evaluation process in Microsoft Azure.
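The two feature selection scores named in the abstract can be illustrated with a minimal sketch. It is written in plain Python, not with the Microsoft Azure modules the study used, and it assumes discrete feature values. The helper names (`mutual_information`, `chi_square`, `select_top_k`) and the six toy records are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information I(X;Y) in bits between a discrete feature and the labels."""
    n = len(feature)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature/label contingency table."""
    n = len(feature)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def select_top_k(columns, labels, score_fn, k):
    """Rank features by score (highest first) and keep the first k names."""
    ranked = sorted(columns, key=lambda name: score_fn(columns[name], labels),
                    reverse=True)
    return ranked[:k]


# Toy, made-up records for illustration only (not taken from KDD CUP 1999).
labels = ["normal", "normal", "dos", "dos", "normal", "dos"]
columns = {
    "protocol_type": ["tcp", "tcp", "icmp", "icmp", "tcp", "icmp"],  # separates the classes
    "flag": ["SF", "S0", "SF", "S0", "SF", "S0"],                    # weakly informative
    "land": [0, 0, 0, 0, 0, 0],                                      # constant, uninformative
}

if __name__ == "__main__":
    for name, values in columns.items():
        print(name, mutual_information(values, labels), chi_square(values, labels))
    print(select_top_k(columns, labels, mutual_information, 2))
    print(select_top_k(columns, labels, chi_square, 2))
```

In this sketch both scores rank the class-separating feature first and give the constant feature a score of zero, which is the property that lets a reduced feature set be passed on to a classifier such as a decision forest or a neural network.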






How to Cite

Rachburee, N., & Punlumjeak, W. (2017). Big Data Analytics: Feature Selection and Machine Learning for Intrusion Detection on Microsoft Azure Platform. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 9(1-4), 107–111. Retrieved from