Significant Features Determination for ATS Drug Identification

Authors

  • Y.C. Saw Computational Intelligence and Technologies Lab (CIT Lab), Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal, 76100 Melaka, Malaysia
  • A.K. Muda Computational Intelligence and Technologies Lab (CIT Lab), Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal, 76100 Melaka, Malaysia
  • Z.I.M. Yusoh Computational Intelligence and Technologies Lab (CIT Lab), Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal, 76100 Melaka, Malaysia

Keywords:

ATS Drug, 3D Molecule Structure, Feature Selection, Filter-Embedded

Abstract

Laboratory testing for ATS drug identification is a costly and lengthy process. In this paper, we propose a computational analysis approach as an alternative solution for identifying ATS drugs. High-dimensional data is one of the key challenges for computational analysis. This paper investigates the effectiveness of several feature selection algorithms in identifying the significant features and filtering out the irrelevant features in the dataset. Specifically, four filter feature selection techniques (Information Gain (IG), Gain Ratio (GR), Symmetrical Uncertainty (SU), and ReliefF) and two embedded feature selection techniques (Support Vector Machine based Recursive Feature Elimination (SVM-RFE) and Variable Importance based Random Forest (VIRF)) are explored. The main criterion in the performance analysis is which feature selection technique returns the fewest features while achieving higher identification performance. The experimental evaluation on the ATS drugs 3D molecular structure representation dataset is performed using five classifiers: Random Forest (RF), Naïve Bayes (NB), IBK, SMO, and J48 decision trees. The findings show that ReliefF and VIRF select a smaller feature subset while achieving higher identification accuracy than the other feature selection techniques.
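
As an illustration of the three entropy-based filter measures named in the abstract (IG, GR, and SU), the following is a minimal sketch using their standard textbook definitions: IG = H(Y) - H(Y|X), GR = IG / H(X), and SU = 2 * IG / (H(X) + H(Y)). It is not the authors' implementation. The function names, the toy feature columns, and the assumption that feature values are already discretized are all hypothetical and chosen only for illustration; real 3D molecular descriptors would typically be continuous and need discretization first.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(labels, feature):
    """H(labels | feature) for two equally long discrete sequences."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def filter_scores(feature, labels):
    """Return (IG, GR, SU) of one discrete feature with respect to the labels."""
    h_y = entropy(labels)
    h_x = entropy(feature)
    ig = h_y - conditional_entropy(labels, feature)
    gr = ig / h_x if h_x > 0 else 0.0  # a constant feature carries no information
    su = 2 * ig / (h_x + h_y) if (h_x + h_y) > 0 else 0.0
    return ig, gr, su


def rank_features(columns, labels, score_index=0):
    """Rank feature names by the chosen score (0=IG, 1=GR, 2=SU), best first."""
    scored = {
        name: filter_scores(col, labels)[score_index]
        for name, col in columns.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical toy data: 1 = ATS, 0 = non-ATS; values are assumed pre-discretized.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "f_informative": ["a", "a", "a", "a", "b", "b", "b", "b"],  # separates the classes
    "f_partial": ["a", "a", "a", "b", "a", "b", "b", "b"],      # partly informative
    "f_noise": ["a", "b", "a", "b", "a", "b", "a", "b"],        # independent of the label
}
ranking = rank_features(columns, labels)
```

A filter approach of this kind would keep only the top-ranked features from `ranking` and pass them to a classifier. Whether a smaller subset preserves identification accuracy still has to be checked empirically, as the abstract describes.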

Published

2018-07-04

How to Cite

Saw, Y., Muda, A., & Yusoh, Z. (2018). Significant Features Determination for ATS Drug Identification. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10(2-5), 87–92. Retrieved from https://jtec.utem.edu.my/jtec/article/view/4356