Viseme Recognition using lip curvature and Neural Networks to detect Bangla Vowels
Keywords:
Active Appearance Model, Artificial Neural Networks, Bangla Viseme Recognition, Lip Reading, Speech Recognition
Abstract
Automatic Speech Recognition plays an important role in human-computer interaction and can be applied in vital areas such as crime-fighting and assisting the hearing-impaired. This paper presents a new method for recognizing Bengali visemes that combines image-based lip segmentation techniques, the curvature of both the inner and outer lips, and neural networks. The method is divided into three steps. The first step is lip segmentation, which uses a combination of the red exclusion method, HSV space and CIE spaces to produce illumination-invariant images; the inner and outer lips are then extracted separately using a new curve-fitting technique. The second step is feature extraction, which uses the quadratic curve coefficients of the inner and outer lip contours. Finally, viseme recognition is performed using a Neural Network. A dataset was created with 171 lip images of Bangla visemes spoken by different speakers under different lighting conditions. The proposed method achieved a viseme recognition rate of 87.3%. Because it uses a non-iterative method, unlike conventional approaches, the algorithm was also found to be faster at detecting lip contours.
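The feature extraction step can be illustrated with a minimal sketch. It assumes each lip contour is available as a set of (x, y) points and is fitted with a least-squares quadratic y = a*x^2 + b*x + c; the function names, the choice of NumPy's `polyfit`, and the synthetic contour points are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quadratic_coeffs(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    a, b, c = np.polyfit(np.asarray(xs, dtype=float),
                         np.asarray(ys, dtype=float), deg=2)
    return a, b, c

def lip_feature_vector(contours):
    """Concatenate the quadratic coefficients of each contour.

    `contours` is a list of (xs, ys) pairs, one pair per lip curve
    (hypothetical layout; e.g. upper and lower halves of a contour).
    """
    return np.concatenate([quadratic_coeffs(xs, ys) for xs, ys in contours])

# Synthetic contour points, for illustration only (not data from the paper).
x = np.linspace(-1.0, 1.0, 21)
upper_curve = 0.5 * x**2 - 0.5    # opens upward
lower_curve = -0.8 * x**2 + 0.8   # opens downward

features = lip_feature_vector([(x, upper_curve), (x, lower_curve)])
```

A vector such as `features` would then serve as the input to a classifier; the abstract does not specify the network architecture, so that part is left out here.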
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)