Significant Features Determination for ATS Drug Identification
Keywords: ATS Drug, 3D Molecule Structure, Feature Selection, Filter-Embedded
Abstract: Laboratory testing for amphetamine-type stimulant (ATS) drug identification is a costly and lengthy process. In this paper, we propose a computational analysis approach as an alternative for identifying ATS drugs. A high-dimensional dataset is one of the key challenges for computational analysis. This paper investigates the effectiveness of several feature selection algorithms in identifying the significant features and filtering out the irrelevant features in the dataset. Specifically, four filter feature selection techniques (Information Gain (IG), Gain Ratio (GR), Symmetrical Uncertainty (SU), and ReliefF) and two embedded feature selection techniques (Support Vector Machine based Recursive Feature Elimination (SVM-RFE) and Variable Importance based Random Forest (VIRF)) are explored. The main criterion in the performance analysis is which feature selection technique returns the fewest features while achieving the highest identification performance. The experimental evaluation on the ATS drug 3D molecular structure representation dataset is performed using five classifiers: Random Forest (RF), Naïve Bayes (NB), IBK, SMO, and J48 decision trees. The findings show that ReliefF and VIRF select smaller feature subsets with higher identification accuracy than the other feature selection techniques.
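To make the three entropy-based filter scores named in the abstract concrete, the following is a minimal sketch using their standard textbook definitions. It is not the paper's implementation: the function names, the toy labels, and the two toy features are hypothetical, and it assumes the features are already discrete (continuous molecular descriptors would first need to be discretized).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(labels, feature):
    """H(labels | feature): entropy of the labels within each feature value,
    weighted by how often that feature value occurs."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def filter_scores(feature, labels):
    """Return IG, GR and SU of one discrete feature with respect to the labels."""
    h_y = entropy(labels)
    h_x = entropy(feature)
    ig = h_y - conditional_entropy(labels, feature)          # Information Gain
    gr = ig / h_x if h_x > 0 else 0.0                        # Gain Ratio
    su = 2 * ig / (h_x + h_y) if (h_x + h_y) > 0 else 0.0    # Symmetrical Uncertainty
    return {"IG": ig, "GR": gr, "SU": su}


# Hypothetical toy data: four molecules, two binary features.
labels = ["ATS", "ATS", "non-ATS", "non-ATS"]
informative = [1, 1, 0, 0]   # perfectly separates the two classes
noise = [0, 1, 0, 1]         # independent of the class

scores_informative = filter_scores(informative, labels)
scores_noise = filter_scores(noise, labels)
print(scores_informative)
print(scores_noise)
```

A filter method ranks every feature by such a score and keeps the top-ranked ones, independently of any classifier; in this toy case the informative feature scores 1 on all three measures and the noise feature scores 0.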
TRANSFER OF COPYRIGHT AGREEMENT
The manuscript is herewith submitted for publication in the Journal of Telecommunication, Electronic and Computer Engineering (JTEC). It has not been published before, and it is not under consideration for publication in any other journals. It contains no material that is scandalous, obscene, libelous or otherwise contrary to law. When the manuscript is accepted for publication, I, as the author, hereby agree to transfer to JTEC, all rights including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author(s) specifically retain(s):
- All proprietary right other than copyright, such as patent rights
- The right to make further copies of all or part of the published article for my use in classroom teaching
- The right to reuse all or part of this manuscript in a compilation of my own works or in a textbook of which I am the author; and
- The right to make copies of the published work for internal distribution within the institution that employs me
I agree that copies made under these circumstances will continue to carry the copyright notice that appears in the original published work. I agree to inform my co-authors, if any, of the above terms. I certify that I have obtained written permission for the use of text, tables, and/or illustrations from any copyrighted source(s), and I agree to supply such written permission(s) to JTEC upon request.