A Learning-Based Approach for Word Segmentation in Text Document Images
Keywords:
Word Segmentation, Deep Learning, Space Classification, Locally Likely Arrangement Hashing, Document Image Retrieval
Abstract
In conventional document image retrieval (DIR) systems based on locally likely arrangement hashing (LLAH), word detection is sensitive to the distance between the camera and the text document; for example, a single word may be detected as several words when the camera is too close. As a result, these systems work well only when the camera distance at registration is similar to the distance at retrieval. Moreover, they were implemented in desktop setups, where the distance problem may not arise because the camera is rigidly attached to the computer. In this paper, a new word segmentation approach is proposed to increase the robustness of LLAH-based DIR systems so that they can be implemented on a mobile platform, where the distance between the camera and the text document changes easily. The proposed method uses a deep neural network to classify each space between connected components as either a between-word space or an intra-word space. In experiments, the proposed method detected the same words at different camera distances and orientations, with the neural network reaching a classification accuracy as high as 92.5%. Moreover, it showed higher robustness than state-of-the-art methods when implemented on a mobile platform.
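The idea of classifying the spaces between connected components can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the input features or the network, so the feature used here (horizontal gap divided by the median component height), the threshold value, and all function names are assumptions, and a simple threshold stands in for the deep neural network. Any trained classifier with the same interface could be passed in instead.

```python
from statistics import median
from typing import Callable, List, Tuple

# (x, y, width, height) of one connected component on a single text line
Box = Tuple[int, int, int, int]


def gap_features(boxes: List[Box]) -> List[float]:
    """Return one feature per space between horizontally adjacent components.

    Assumed feature: gap width divided by the median component height, so the
    value does not change when the whole line is scaled by camera distance.
    """
    ordered = sorted(boxes, key=lambda b: b[0])
    scale = median(b[3] for b in ordered)
    feats = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right[0] - (left[0] + left[2])
        feats.append(max(gap, 0) / scale)
    return feats


def stand_in_classifier(feature: float) -> bool:
    """Hypothetical stand-in for the neural network: True means between-word space."""
    return feature > 0.35  # illustrative threshold, not taken from the paper


def segment_words(
    boxes: List[Box],
    is_word_gap: Callable[[float], bool] = stand_in_classifier,
) -> List[List[Box]]:
    """Group components into words by classifying every space between them."""
    ordered = sorted(boxes, key=lambda b: b[0])
    if not ordered:
        return []
    words = [[ordered[0]]]
    for box, feat in zip(ordered[1:], gap_features(ordered)):
        if is_word_gap(feat):
            words.append([box])    # between-word space: start a new word
        else:
            words[-1].append(box)  # intra-word space: extend the current word
    return words


# The same line captured at two camera distances (second is scaled by 2).
far = [(0, 0, 10, 20), (12, 0, 10, 20), (40, 0, 10, 20), (52, 0, 10, 20)]
near = [(0, 0, 20, 40), (24, 0, 20, 40), (80, 0, 20, 40), (104, 0, 20, 40)]
print([len(w) for w in segment_words(far)])
print([len(w) for w in segment_words(near)])
```

Because the feature is normalized by component height, both captures produce the same grouping in this sketch, which is the kind of distance robustness the abstract describes.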
License
TRANSFER OF COPYRIGHT AGREEMENT
The manuscript is herewith submitted for publication in the Journal of Telecommunication, Electronic and Computer Engineering (JTEC). It has not been published before, and it is not under consideration for publication in any other journals. It contains no material that is scandalous, obscene, libelous or otherwise contrary to law. When the manuscript is accepted for publication, I, as the author, hereby agree to transfer to JTEC, all rights including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author(s) specifically retain(s):
- All proprietary rights other than copyright, such as patent rights
- The right to make further copies of all or part of the published article for my use in classroom teaching
- The right to reuse all or part of this manuscript in a compilation of my own works or in a textbook of which I am the author; and
- The right to make copies of the published work for internal distribution within the institution that employs me
I agree that copies made under these circumstances will continue to carry the copyright notice that appears in the original published work. I agree to inform my co-authors, if any, of the above terms. I certify that I have obtained written permission for the use of text, tables, and/or illustrations from any copyrighted source(s), and I agree to supply such written permission(s) to JTEC upon request.