A Method for Arabic Handwritten Diacritics Characters
Faiz Alotaibi1, Muhamad Taufik Abdullah2, Rusli Abdullah3, Rahmita Wirza4, Masrah Azrifah Azmi Murad5
1Faiz Alotaibi, Department of Computer Science and Information Technology, University Putra Malaysia.
2Muhamad Taufik Abdullah, Department of Computer Science and Information Technology, University Putra Malaysia.
3Rusli Abdullah, Department of Computer Science and Information Technology, University Putra Malaysia.
4Rahmita Wirza, Department of Computer Science and Information Technology, University Putra Malaysia.
5Masrah Azrifah Azmi Murad, Department of Computer Science and Information Technology, University Putra Malaysia.
Manuscript received on 27 September 2019 | Revised Manuscript received on 09 November 2019 | Manuscript Published on 22 November 2019 | PP: 209-212 | Volume-8 Issue-6S3 September 2019 | Retrieval Number: F10340986S319/19©BEIESP | DOI: 10.35940/ijeat.F1034.0986S319
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Optical Character Recognition (OCR) is the process of converting an image representation of a document into an editable format. People recognize characters without difficulty when reading papers or books. However, developing an OCR system that can read and recognize Arabic diacritic characters as well as a human does remains an open problem. More specifically, the poor recognition rate of most optical diacritic character recognition systems is mainly attributed to failure to segment handwritten text correctly. To overcome this problem, we develop a method based on seven operations. It starts by finding the text-line height, followed by reading words from the line and then identifying the diacritic regions. Segmentation is also applied during this operation by converting the text-line into grayscale and binary images. Moreover, we introduce a new model based on the k-nearest neighbors (KNN) algorithm to identify diacritics and segment characters. The KNN model is trained to directly predict the diacritic from the text-line. Finally, we offer an evaluation and discussion of optical diacritic character recognition.
Keywords: Diacritics Characters, Handwritten, KNN, Image Recognition.
Scope of the Article: Probabilistic Models and Methods
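The abstract names a few concrete steps: converting the text-line to grayscale and binary images, finding the text-line height, and classifying diacritics with KNN. The following is a minimal sketch of those three steps, not the authors' implementation. The fixed threshold of 128, the horizontal-projection approach to locating lines, the two-element feature vectors, and the labels "fatha" and "kasra" are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to grayscale with the common luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def to_binary(gray_image, threshold=128):
    """Mark ink pixels (darker than the assumed threshold) as 1, background as 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray_image]


def text_line_bounds(binary_image):
    """Return (top, bottom) row indices of each run of rows that contain ink.

    This is a simple horizontal projection: a text line is assumed to be a
    maximal block of consecutive rows with at least one ink pixel.
    """
    lines, start = [], None
    for y, row in enumerate(binary_image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            lines.append((start, y - 1))
            start = None
    if start is not None:
        lines.append((start, len(binary_image) - 1))
    return lines


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 5x3 image: W = white background, B = black ink (hypothetical data).
W, B = (255, 255, 255), (0, 0, 0)
image = [
    [W, W, W],
    [W, B, W],
    [B, B, W],
    [W, W, W],
    [W, B, W],
]
binary = to_binary(to_grayscale(image))
bounds = text_line_bounds(binary)
heights = [bottom - top + 1 for top, bottom in bounds]
print("line bounds:", bounds, "heights:", heights)

# Hypothetical 2-D features for diacritic regions, with made-up labels.
training_set = [
    ((0.1, 0.9), "fatha"),
    ((0.2, 0.8), "fatha"),
    ((0.9, 0.1), "kasra"),
    ((0.8, 0.2), "kasra"),
]
print("predicted:", knn_predict(training_set, (0.15, 0.85), k=3))
```

A real system would likely use an adaptive threshold and richer features extracted from each diacritic region; the plain-Python version here only shows how the steps fit together.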