A Review of Audio-Visual Speech Recognition

Authors

  • Thum Wei Seong, Applied Electronic and Computer Engineering Cluster, Faculty of Electrical & Electronic Engineering, University Malaysia Pahang, 26600 Pekan, Pahang, Malaysia.
  • M. Z. Ibrahim, Applied Electronic and Computer Engineering Cluster, Faculty of Electrical & Electronic Engineering, University Malaysia Pahang, 26600 Pekan, Pahang, Malaysia.

Keywords:

Audio-Visual Speech Recognition, Audio-Visual Data Corpus, Feature Extraction, Model Validation Techniques, Performance Evaluation

Abstract

Speech is the most important means of interaction among human beings. This has inspired researchers to study speech recognition further and to develop computer systems that can integrate and understand human speech. However, acoustically noisy environments can heavily contaminate the audio signal and degrade overall recognition performance. Audio-Visual Speech Recognition (AVSR) is designed to overcome this problem by utilising visual images, which are unaffected by acoustic noise. The aim of this paper is to discuss the structure of AVSR systems, including the front-end processes, the audio-visual data corpora used, recent works, and accuracy estimation methods.

Published

2018-01-29

How to Cite

Seong, T. W., & Ibrahim, M. Z. (2018). A Review of Audio-Visual Speech Recognition. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 10(1-4), 35–40. Retrieved from https://jtec.utem.edu.my/jtec/article/view/3573