Female Voice Recognition Using Artificial Neural Networks and MATLAB Voicebox Toolbox

Stanley Glenn E. Brucal; Aaron Don M. Africa; Elmer P. Dadios

Authors

Stanley Glenn E. Brucal, Department of Electronics and Communications Engineering, Gokongwei College of Engineering, De La Salle University Manila, 2401 Taft Avenue Manila, Philippines 1004.
Aaron Don M. Africa, Department of Electronics and Communications Engineering, Gokongwei College of Engineering, De La Salle University Manila, 2401 Taft Avenue Manila, Philippines 1004.
Elmer P. Dadios, Department of Manufacturing Engineering and Management, Gokongwei College of Engineering, De La Salle University Manila, 2401 Taft Avenue Manila, Philippines 1004.

Keywords:

Voice Feature, Mel Frequency Cepstral Coefficient, Artificial Neural Network, Voice Recognition, Speech Recognition

Abstract

Voice and speaker recognition performance is measured in terms of accuracy, speed, and robustness. These three key performance indicators depend primarily on the voice feature extraction method and the voice recognition algorithm used. This paper discusses prior speech recognition research that has yielded accuracy rates of 95% and above. Mel Frequency Cepstral Coefficients (MFCCs) extracted with the MATLAB Voicebox toolbox were used as inputs to multilayer Artificial Neural Networks (ANN) in a female voice recognition algorithm. This study explored the recognition performance of the neural networks with varying numbers of hidden neurons and hidden layers, and determined the architecture that provides the highest recognition rate. MATLAB simulation yielded training and testing recognition rates of 100.00% with a 3-hidden-layer neural network trained on speech samples from a single speaker, and a highest training recognition rate of 98.11% and testing recognition rate of 87.20% with a 4-hidden-layer neural network trained on speech samples from several speakers. When tested with homonyms, the best recognition rate was 75.00% for a 3-hidden-layer neural network trained on a single speaker, and 81.91% for a 4-hidden-layer neural network trained on multiple speakers. The variation in recognition rates was primarily attributed to the changes made in the number of input neurons, hidden layers, and hidden neurons of the speech recognition neural network.
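
The comparison of network architectures by recognition rate can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names `recognition_rate` and `best_architecture`, the hidden-layer layouts, and the toy labels are all hypothetical, and it assumes that recognition rate means the percentage of test utterances classified as the correct word.

```python
from typing import Dict, List, Sequence, Tuple


def recognition_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of utterances whose predicted word matches the true word."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def best_architecture(
    results: Dict[Tuple[int, ...], Tuple[List[str], List[str]]]
) -> Tuple[Tuple[int, ...], float]:
    """Return the hidden-layer layout with the highest recognition rate.

    Keys are tuples of hidden-layer sizes, e.g. (10, 10, 10) for three
    hidden layers; values are (predicted labels, true labels) pairs.
    """
    rates = {
        arch: recognition_rate(pred, true)
        for arch, (pred, true) in results.items()
    }
    arch = max(rates, key=rates.get)
    return arch, rates[arch]


# Hypothetical placeholder outputs for illustration, not data from the paper.
truth = ["one", "two", "three", "four", "five"]
toy_results = {
    (10,): (["one", "two", "two", "one", "five"], truth),            # 3 of 5 correct
    (10, 10, 10): (["one", "two", "three", "one", "five"], truth),   # 4 of 5 correct
}

print(best_architecture(toy_results))
```

In an actual experiment, the predicted labels would come from networks trained on MFCC feature vectors, and the same rate would be computed separately for the training set, the testing set, and the homonym set.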
