Segmenting Arabic handwriting has been a subject of research in the field of Arabic character recognition for more than 25 years. The majority of reported segmentation techniques share a critical shortcoming, which is over-segmentation. The aim of segmentation is to produce the letters (segments) of a handwritten word. When a resulting letter (segment) is made of more than one piece (stroke) instead of one, this is called over-segmentation. Our objective is to overcome this problem by using an artificial neural network (ANN) to verify the resulting segments. We propose a set of heuristic-based rules to assemble strokes in order to report the precise segmented letters. Preprocessing phases that include normalization and feature extraction are a prerequisite for the ANN system used for recognition and verification. In our previous work [1], we achieved a segmentation success rate of 86% but without recognition. In this work, our experimental results confirm a segmentation success rate of no less than 95%.
Automatic recognition of handwriting is developing well as a result of research contributions in this field. There are two main types of character recognition systems: on-line and off-line. On-line systems recognize handwriting entered on a tablet or similar device with a digital pen or stylus. Off-line systems deal with stored images of handwriting. The temporal information available in on-line systems contributes positively to the recognition process. Such information is absent in off-line systems, which makes the off-line problem more challenging.
In this paper, we study the problem of Arabic handwritten characters recognition, [
Our recognition system is embedded in the segmentation system that was proposed in [
The rest of the paper is organized as follows. Section 2 presents related work. The segmentation phase is described in Section 3, while the recognition phase is discussed in Section 4. Section 5 presents the experimental results, followed by the conclusion and future directions.
Most of the work on handwriting recognition has been done on Latin text. The scarcity of Arabic handwriting recognition systems is closely related to the difficulty of segmenting words into characters, which stems from the cursive nature of Arabic handwriting. Accordingly, Arabic recognition methods can be divided into those that first segment the word to be recognized and those that recognize the whole word. In this work we focus on the segmentation of Arabic text rather than digit strings such as in [
| Letter | Class # | Letter | Class # | Letter | Class # |
|---|---|---|---|---|---|
| ﻓ ﻗ | 31 | ﺴ ﺸ | 17 | ﺎ | 1 |
| ف ق | 32 | س ش | 18 | أ | 2 |
| ﮏ | 33 | | 19 | | 3 |
| ﻛ | 34 | | 20 | | 4 |
| ﻬ | 35 | ﺼ ﺿ | 21 | ﺑ ﺗ ﺛ | 5 |
| | 36 | ص ض | 22 | ب ت ث | 6 |
| ﻠ | 37 | | 23 | ﺦ ﺢ ﺞ | 7 |
| ﻪ | 38 | ط ظ | 24 | | 8 |
| | 39 | ﻊ ﻎ | 25 | ﺧ ﺣ ﺠ | 9 |
| ه | 40 | | 26 | خ ح ج | 10 |
| ﻣ | 41 | ﻋ ﻏ | 27 | | 11 |
| م | 42 | ع غ | 28 | ل د ر | 12 |
| | 43 | | 29 | ي | 13 |
| ن | 44 | ﻫ | 30 | ﻲ | 14 |
| | 45 | | | | 15 |
| و | 46 | | | | 16 |
Earlier surveys discussed both Arabic printed and handwritten texts, [
In 1986, Amin and Masini proposed a system for segmentation and recognition that used horizontal and vertical projections and shape-based primitives [
Gillies et al. constructed a recognition system for Arabic text, [
In 2002, Hamami and Berkani developed a structural approach to handle many fonts and it included rules to prevent over-segmentation [
Hidden Markov Models (HMMs) are also used to recognize words by using word features. In 2001, Dehghan et al. split words into overlapping vertical segments [
Al-Qahtani and Khorsheed presented a system based on Hidden Markov Model Toolkit in 2004, [
Two segmentation free recognition methods appeared in 1995 by Al-Badr and Haralick. In the first system, [
Khorsheed and Clocksin proposed in 1999 another holistic system where features were extracted from a word’s skeleton for recognition without prior segmentation [
In 2000, Amin introduced another holistic approach where global features such as loops and peaks were extracted from the input word [
Another method was presented by Pechwitz and Maergner [
In this work, a recognition-based segmentation method for Arabic handwriting is developed. The method used a multi-agent approach to segment words, [
Our segmentation system, which we proposed in [
Initially, the image of Arabic handwritten text was binarized and cleaned of noise. Then, the text was segmented into lines and each line was segmented into words. The resulting words were thinned, and the main connected components in each word were determined and passed to agents, which extracted three types of feature points before starting their work.
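As an illustration of the line-splitting step, the following minimal Python sketch binarizes a grayscale page and splits it into text lines. It assumes a fixed global threshold and a horizontal projection profile; the text above does not state which binarization or line-segmentation method was actually used, and the names `binarize`, `split_runs`, and `segment_lines` are hypothetical.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Return a boolean ink mask: True where a pixel is darker than threshold.

    The fixed threshold is an assumption made for this sketch.
    """
    return np.asarray(gray) < threshold

def split_runs(profile):
    """Return (start, end) pairs for each run of non-zero values in a 1-D profile.

    `end` is exclusive, so profile[start:end] is the run itself.
    """
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ended at the previous index
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile)))
    return runs

def segment_lines(ink):
    """Group rows that contain ink into text lines (row ranges)."""
    return split_runs(ink.sum(axis=1))
```

Under the same assumptions, applying `split_runs` to the column sums of a single line, with a tolerance for small gaps added, would be one way to separate words.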
The identification of initial cutting points depends strongly on seven agents. Six are major agents: the loop agent, the letter Seen agent, the under-baseline-cavities agent, the above-baseline-right-cavity agent, the above-baseline-left-cavity agent, and the above-baseline-narrow-left-cavity agent. The seventh, the baseline agent, is a minor one, since it was used by the major agents to facilitate their tasks. First, the agents detected regions that look like certain Arabic characters; these regions were subtracted from the whole word and the remaining parts were left for further processing. Next, all end-point features were extracted from the remaining regions and an initial cutting point was inserted between every two successive end points. Finally, a set of filtering rules was applied to remove the extra segmentation points. The experiments reported a segmentation success rate of 86%.
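The end-point step can be sketched as follows. This is a simplification under stated assumptions: end points are reduced to their column coordinates, and the cutting point is placed at the midpoint of each successive pair. The text only says that a cutting point is inserted between successive end points, so the midpoint choice and the function name `initial_cutting_points` are hypothetical.

```python
def initial_cutting_points(end_point_columns):
    """Place one candidate cut between every two successive end points.

    end_point_columns: column indices of the detected end points.
    The midpoint placement is an assumption of this sketch.
    """
    cols = sorted(end_point_columns)
    return [(left + right) // 2 for left, right in zip(cols, cols[1:])]
```

The later filtering rules would then remove candidates from this list that do not correspond to real letter boundaries.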
This phase is very important in our segmentation system, [
Generally, artificial neural networks are very common in the pattern recognition field. Our decision to use an ANN as the recognition model was based on its advantages over other recognition tools. A well-trained neural network can perform complex functions and solve challenging problems that are difficult for conventional computers or human beings, since it learns from the examples it sees. In addition, neural networks can be modified easily and retrained when the requirements of the problem change. Finally, neural networks integrate well with other recognition tools, which can cooperate with them, and this may increase the efficiency of the solution. The following sections describe the main steps in our approach.
In this step, each segment image is converted into numerical features that describe the segment. The feature extraction methods used in character segmentation systems are probably the most important factor in achieving a good segmentation/recognition rate. After the word is segmented, its output segments are normalized into 250 × 250 images. Then, twenty structural features are extracted from them. Fifteen Fourier descriptors are extracted from each segment's contour and normalized to remove character variations in shift, size, and rotation, [
Different numbers of Fourier descriptors were tested, and the final set includes 15 descriptors. These features were selected for their ability to describe the general shape of any closed curve, such as a character, by a set of Fourier coefficients. Suppose that a character consists of a sequence of points
The first 15 coefficients (descriptors) are selected as our features, because the general properties of the character shape are retained in the first (low-order) coefficients. Because characters vary in size, location, and possibly rotation angle, the Fourier descriptors are manipulated to be invariant to rotation, scale, and shift. To make the Fourier descriptors rotation and shift invariant, only their absolute values are used, and to make them scale invariant, the coefficients are normalized by dividing them by the first coefficient a(1), [
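A minimal sketch of computing normalized Fourier descriptors from a closed contour is given below, using NumPy's FFT. It follows a common textbook variant rather than the exact procedure of this work: the contour points are treated as complex numbers, the zero-frequency coefficient is discarded because it is the only one affected by translation, magnitudes are taken so that rotation and starting point do not matter, and the result is divided by the magnitude of the first retained coefficient for scale invariance. The function name, the indexing convention, and the choice of normalizing coefficient are assumptions.

```python
import numpy as np

def fourier_descriptors(contour, n_desc=15):
    """Return n_desc translation-, rotation-, and scale-invariant descriptors.

    contour: sequence of (x, y) boundary points of a closed curve, in order.
    Assumes the contour has more than n_desc + 1 points and that its first
    harmonic is non-zero (true for ordinary closed shapes).
    """
    pts = np.asarray(contour, dtype=float)
    s = pts[:, 0] + 1j * pts[:, 1]        # boundary as a complex signal
    coeffs = np.fft.fft(s)
    # Skip coeffs[0] (depends on position) and keep magnitudes only
    # (phase carries rotation and starting-point information).
    mags = np.abs(coeffs[1:n_desc + 2])
    # Divide by the first retained magnitude to remove the effect of size.
    return mags[1:] / mags[0]
```

With this convention, a scaled, rotated, and shifted copy of the same contour yields the same descriptor vector up to floating-point error.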
Artificial neural networks are computational models which take their inspiration from the models and theories of the human brain. The most popular neural network is the multilayer feed-forward network where neurons are grouped as layers and connections between neurons in consecutive layers are permitted. The inputs are fed from the input layer and outputs are at the output layer.
In this work, after images normalization, a vector of 20 features is extracted from each segment image and classified using a feed forward neural network trained by back-propagation learning algorithm [
using tan-sigmoid and linear transfer functions, respectively, and the network is trained using the traincgf function (conjugate gradient back-propagation with Fletcher-Reeves updates). The final ANN structure and algorithms were selected after trying many other structures and testing several algorithms. The ANNs trained using traincgf gave better results than those trained with the other algorithms. Moreover, traincgf has smaller storage requirements and faster convergence in some recognition problems.
The 46 outputs represent the classes that each segment may belong to, and each class includes letters that have a similar shape (body) in a specific location in the word: at the beginning of the word, in the middle, at the end, or isolated. The list of output classes appears in
The proposed ANN was trained using 2000 characters, more than 40 from each class, written by different people. Testing was then carried out by selecting examples from each class and passing them to the ANN. A total of 250 characters were used as test examples. The obtained recognition rate exceeds 87%.
This step is required when the word is over-segmented, that is, when additional segmentation points were determined. As a result, the pseudo characters that are passed to the neural network are not correctly recognized. To remedy this situation, the extra segmentation points are removed and the adjacent segments are combined and passed again to the neural network, [
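The forward pass of a network of this kind can be sketched in a few lines of NumPy: 20 input features, one hidden layer with a tan-sigmoid (hyperbolic tangent) transfer function, and 46 linear outputs. The hidden-layer size `N_HIDDEN`, the random initial weights, and all function names are placeholders chosen for illustration; training with a conjugate gradient method is not shown.

```python
import numpy as np

N_FEATURES = 20   # length of the feature vector per segment
N_HIDDEN = 30     # placeholder value, not taken from the paper
N_CLASSES = 46    # one output per letter class

def init_params(rng):
    """Create small random weights and zero biases (illustrative only)."""
    return {
        "W1": rng.normal(0.0, 0.1, (N_HIDDEN, N_FEATURES)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.normal(0.0, 0.1, (N_CLASSES, N_HIDDEN)),
        "b2": np.zeros(N_CLASSES),
    }

def forward(params, features):
    """Hidden layer with tanh (tan-sigmoid), then a linear output layer."""
    hidden = np.tanh(params["W1"] @ features + params["b1"])
    return params["W2"] @ hidden + params["b2"]

def predict_class(params, features):
    """Return the 1-based class number with the largest output."""
    return int(np.argmax(forward(params, features))) + 1
```

The largest output can also serve as a recognition score when single segments are compared with merged ones, which is how such a network could support the grouping step.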
A preprocessing step was applied first to remove segmentation points that yield segments with a width less than a threshold. This process eliminates most of the strokes that are wrongly found in letters such as “ainﻋ” and “haaﺣ”. The following examples depict this case in
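The width-based preprocessing can be sketched as a single pass over the cut points. In this sketch the cut points are column indices within the word image, and `min_width` stands for the threshold, whose actual value is not given in the text; the function name and the handling of the trailing segment are assumptions.

```python
def drop_narrow_segments(cut_points, word_width, min_width):
    """Remove cut points that would produce a segment narrower than min_width.

    cut_points: column indices of candidate segmentation points.
    word_width: width of the word image in pixels.
    """
    kept = []
    left = 0                               # left edge of the current segment
    for cut in sorted(cut_points):
        if cut - left >= min_width:        # segment [left, cut) is wide enough
            kept.append(cut)
            left = cut
    # The segment after the last cut must also be wide enough.
    if kept and word_width - kept[-1] < min_width:
        kept.pop()
    return kept
```

Dropping a cut point implicitly merges the narrow stroke with its neighbour, which matches the intent of removing strokes wrongly split off from a letter.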
Because word segmentation is a precise process, a set of rules is used to cooperate with the embedded recognition system in order to keep the correct segments (characters) and combine the wrong ones in a correct way.
The grouping rules are based on the recognition results of the segments. As observed in the segmentation stage, a letter is split in the worst case into three segments; this happens in letters that belong to classes 15, 16, 17, and 18, and in some handwritten forms of letters in classes 7, 10, 19, 20, 21, and 22. In addition, characters of classes 2, 3, 6, 11, 12, 13, 23, 24, 29, 32, 43, and 44, shown in
The main objective of the cooperation between the recognition model and the grouping rules is to handle the over segmented letters introduced in
First, the number of resulting segments is determined. The grouping rules are applicable only when the number of segments is equal to or greater than two. Initially, the first two segments, or three if available, are passed to the neural network and recognized separately. If the first segment is recognized as a letter that can be a part of another letter that may be over-segmented into three parts, it is combined with the second and third segments and the combination is passed again to the neural network. If this grouping is well recognized, the final character is the grouped form of the three segments. Otherwise, only the first two segments are combined and passed to the neural network. If the first segment can be a part of another letter that may be over-segmented into two parts, the final character is whichever has the higher recognition result: the first segment alone or its grouping with the second segment. Finally, this process of grouping and recognition is repeated starting from the next unrecognized segment until all resulting segments are recognized as letters or combined into recognized letters.
More than 600 over-segmented words were tested using the proposed recognition system aided by the above grouping rules. The obtained results are very encouraging, as illustrated in
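The grouping procedure described above can be sketched as a loop over the segments. This is one possible reading of the rules, not a definitive implementation: `recognize` stands for the neural network and returns a class number with a score, `merge` joins adjacent segment images, `triple_parts` and `double_parts` are the sets of classes that may be fragments of letters split into three or two parts, and `accept` is a hypothetical score threshold for "well recognized". All of these names and the threshold value are assumptions.

```python
from typing import Callable, List, Sequence, Set, Tuple, TypeVar

Seg = TypeVar("Seg")

def group_segments(
    segments: Sequence[Seg],
    recognize: Callable[[Seg], Tuple[int, float]],
    merge: Callable[[Sequence[Seg]], Seg],
    triple_parts: Set[int],
    double_parts: Set[int],
    accept: float = 0.8,
) -> List[int]:
    """Return the recognized class of each final letter, merging fragments."""
    letters: List[int] = []
    i = 0
    while i < len(segments):
        label, score = recognize(segments[i])
        first_label = label                # class of the first segment alone
        remaining = len(segments) - i
        take = 1                           # number of segments consumed
        # Rule 1: try a three-way merge if the first segment may be a fragment
        # of a letter that is split into three parts.
        if first_label in triple_parts and remaining >= 3:
            l3, s3 = recognize(merge(segments[i:i + 3]))
            if s3 >= accept:
                label, score, take = l3, s3, 3
        # Rule 2: otherwise compare the first segment alone with the merge of
        # the first two segments and keep the better-scoring candidate.
        if take == 1 and remaining >= 2 and (
            first_label in triple_parts or first_label in double_parts
        ):
            l2, s2 = recognize(merge(segments[i:i + 2]))
            if s2 > score:
                label, score, take = l2, s2, 2
        letters.append(label)
        i += take                          # continue from the next unused segment
    return letters
```

Keeping the recognizer and the merge operation as parameters lets the same loop be exercised with a toy lookup table before it is connected to a trained network.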
The majority of the over-segmented characters are combined and recognized correctly.
As demonstrated above, both double- and triple-segmented letters are combined correctly. The letter “ش”, which appears in the words “الشوامخ” and “بوعطوش”, was segmented into three parts, each of which is a candidate letter. However, the grouping and recognition processes yield one strong candidate letter instead of the three parts. A different example of double-segmented letters can be observed in the letters “ث” and “ي”, which belong to the words “مارث” and “صحراوي”, respectively. Similarly, the combined segments had a higher recognition rate than each segment separately. The same holds for the rest of the examples.
| | No recognition | After recognition & grouping |
|---|---|---|
| % of correct segmentation | 86% | 95% |
| % of over-segmentation | ≈14% | ≈5% |
| % of missing/wrong segmentation | 0.3% | 0.3% |
Because the grouping rules depend strongly on the recognition result of each segment, segments that are misrecognized may not help in handling the over-segmented letters. This case appears in the letters of classes 3, 6, 29, 32, 43, and 44, where the left segment of the over-segmented letter is wrongly recognized as the letter alif “ﺎ”, which belongs to class 1. One example is shown in
The under-segmentation problem may also occur because of letter misrecognition. Adjacent letters that form shapes similar to those of classes 15, 16, 17, 18, 19, 20, 21 and 22 may wrongly be combined although they are correctly classified before grouping. This scenario is very clear in
However, these results are still fair because the combined segments form a body shape similar to that of an existing letter of the alphabet. Moreover, the recognizer is not an interpreter that searches for the meaning of the word based on the recognition results of its letters. Therefore, the obtained outcomes are acceptable, since, to our knowledge, no recent work has solved these cases.
In this paper, an effective segmentation method for Arabic handwriting was developed. The method used a multi-agent approach to segment words and relied on recognition to verify the validity of the candidate segmentation points. The use of an artificial neural network along with grouping rules led to a good treatment of the over-segmentation problem in Arabic handwriting. Furthermore, it achieved better results than similar works by reducing the effect of under-segmentation. This is attributed to the decision agent, which makes the proper decisions to identify the candidate segmentation points. The resulting segments are passed to the recognizer, which invokes and applies the grouping-rules agent to the unrecognized segments before passing them to the recognizer again. The experimental results (~95%) were very satisfactory and promising. Our future work will focus on improving this approach and including other styles of Arabic handwriting. On the improvement front, we are currently studying the use of SVM and HMM, which are recent and relevant techniques.
Ashraf Elnagar, Rahima Bentrcia (2015) A Recognition-Based Approach to Segmenting Arabic Handwritten Text. Journal of Intelligent Learning Systems and Applications, 07, 93-103. doi: 10.4236/jilsa.2015.74009