Mobile Application for Improving Speech and Text Data Collection Approach
Keywords: Mobile Application, Data Collection Tools, Corpus Development
Abstract: This paper describes our work in developing a mobile application for collecting speech and text data for languages. The application is built to help linguists and researchers simplify the task of collecting data from native speakers living in remote interior areas. Researchers typically rely on several pieces of equipment to capture audio or text in hard-to-reach places; with this mobile application, they need to carry only one device, which eases their logistical difficulties. The mobile app, named Kalaka, is designed so that users can store details of native speakers, record speech, and enter speech transcripts on a single platform. Kalaka is built on the Android platform and allows data stored on the mobile device to be transferred to cloud storage over WiFi networks. Usability tests show that all participants in the evaluation were able to use the application to record their voices and save text. We also received positive feedback on the application in our survey: more than half of the respondents said they felt confident using Kalaka and would use the system frequently.
TRANSFER OF COPYRIGHT AGREEMENT
The manuscript is herewith submitted for publication in the Journal of Telecommunication, Electronic and Computer Engineering (JTEC). It has not been published before, and it is not under consideration for publication in any other journals. It contains no material that is scandalous, obscene, libelous or otherwise contrary to law. When the manuscript is accepted for publication, I, as the author, hereby agree to transfer to JTEC, all rights including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author(s) specifically retain(s):
- All proprietary rights other than copyright, such as patent rights
- The right to make further copies of all or part of the published article for my use in classroom teaching
- The right to reuse all or part of this manuscript in a compilation of my own works or in a textbook of which I am the author; and
- The right to make copies of the published work for internal distribution within the institution that employs me
I agree that copies made under these circumstances will continue to carry the copyright notice that appears in the original published work. I agree to inform my co-authors, if any, of the above terms. I certify that I have obtained written permission for the use of text, tables, and/or illustrations from any copyrighted source(s), and I agree to supply such written permission(s) to JTEC upon request.