Automatic Phoneme Identification for Malay Dialects


  • Yen-Min Jasmina Khaw, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.
  • Tien-Ping Tan, School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia.
  • Bali Ranaivo-Malançon, Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, Sarawak, Malaysia.


Phoneme Identification, Malay Dialect, Multilingual, Text Transcript


In many languages, such as English, French, German, and Mandarin, the pronunciation of words is well documented. The pronunciation of a word is determined by its sequence of phonemes, or speech sounds, and each language or dialect may have a different phoneme set. For dialects, however, phonological studies are often lacking, and for some dialects or languages without a written form, even the number of phonemes is unknown. In this work, we propose an approach that identifies the phonemes of a dialect from a text transcript and speech corpus of that dialect, leveraging existing resources from the standard language as well as multilingual resources. Our study was carried out on Malay dialects. The results show that the phoneme identification approach achieves high accuracy when compared against previous work in the area.


