Handwritten signature and character recognition has become a challenging research topic because of its numerous applications. In this paper, we propose a system with three subsystems, which focus on offline recognition of handwritten English alphabetic characters (uppercase and lowercase), numeric characters (0-9), and individual signatures, respectively. The system comprises several stages: image preprocessing, post-processing, segmentation, detection of the required characters and signature, feature extraction, and finally neural network recognition. First, the scanned image is converted to a gray image and filtered. An image cropping method is then applied to detect the signature, and the cropped images are post-processed to ensure accurate recognition. The system was designed in MATLAB. The subsystems were tested on several samples, and the results were satisfactory, with a success rate of about 97%. Image quality plays a vital role, since images of poor or mediocre quality may lead to unsuccessful recognition and verification.
Neural networks have made prominent advances in handwritten character recognition possible by learning distinctive features from large amounts of labeled data [
There are two types of authentication methods: Online and Offline [
Signature verification is mostly used for bank checks, which contain both a signature and handwritten characters. Computer systems are slower than humans and yield less accurate results when processing handwritten fields [
The rest of this paper is organized as follows. Section 2 describes the methods and the problem to be solved. Section 3 explains the proposed system. Section 4 discusses the implementation, results, and performance of the proposed system. Section 5 concludes the paper.
This section reviews the work done by various researchers in the field of handwritten character recognition and signature verification. V. Patil et al. [
This research focuses on increasing the accuracy of handwritten numeral recognition, alphabetic character recognition, and signature recognition and verification. The accuracy of recognition and verification is improved markedly by using distinctive feature extraction methods, with training and testing performed by a neural network.
Character and signature recognition and verification systems are used in several fields of technology. A bank check contains many fields, such as the courtesy amount and the signature of the person who wrote the check, as well as symbols and graphics. Character recognition, on the other hand, is used in OCR and digital dictionaries. Here, a digital dictionary is a concept device that optically detects handwritten words and shows their definitions.
Handwritten signature recognition can be of two kinds:
1) Online verification: it requires a device connected to a remote computer to carry out the signature verification in real time. A stylus is needed to sign on an electronic tablet to acquire the dynamic signature information [
2) Offline verification: the user does not need to be present for verification, which gives a fixed (static) signature some flexibility. Fixed signatures are most commonly applied to document verification in the banking system. A fixed signature has fewer features, so more care is needed; achieving 100% accuracy is the goal of signature verification.
The proposed system is a model for detecting characters (digits and letters) and identifying the signatures of the correct persons. The system consists of four modules:
1) Image preprocessing
2) Extraction of characters and signature
3) Segmentation of digits and letters
4) Recognition using a neural network
Image acquisition is the process of obtaining a digitized image from a real-world source. It can be done using several devices, such as a scanner, digital camera, PDA, web camera, or camcorder [
The purpose of the designed system is to recognize any signature. The following steps are carried out in
We converted the RGB images to grayscale using the NTSC grayscale conversion. The equation is shown below:
Grayscale value = 0.3 × Red + 0.59 × Green + 0.11 × Blue
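The weighted sum above can be illustrated with a minimal Python sketch. The function name `rgb_to_gray` and the nested-list image layout are assumptions made for illustration; the paper's own implementation is in MATLAB and is not reproduced here.

```python
def rgb_to_gray(image):
    """Convert rows of (R, G, B) tuples to grayscale intensities.

    Uses the weights quoted in the text: 0.3, 0.59 and 0.11.
    """
    return [
        [0.3 * r + 0.59 * g + 0.11 * b for (r, g, b) in row]
        for row in image
    ]


# Hypothetical 2 x 2 image: white, black, pure red, pure blue.
rgb = [[(255, 255, 255), (0, 0, 0)],
       [(255, 0, 0), (0, 0, 255)]]
gray = rgb_to_gray(rgb)
```

Because the three weights sum to 1, a white pixel stays at full intensity and a black pixel stays at 0.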
The background of the scanned image of signatures and digits is blurred by using a 3-by-3 median filter.
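A 3-by-3 median filter replaces each pixel with the median of its 3 × 3 neighborhood. The sketch below is one possible pure-Python version; the function name `median_filter_3x3` and the edge-replication border handling are assumptions, since the text does not say how border pixels are treated.

```python
from statistics import median


def median_filter_3x3(gray):
    """Apply a 3-by-3 median filter to a 2-D list of intensities.

    Border pixels reuse the nearest edge value (an assumption).
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp neighbour coordinates to the image bounds.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    window.append(gray[yy][xx])
            out[y][x] = median(window)
    return out
```

An isolated bright or dark speck is removed because it never reaches the middle of the sorted nine-value window.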
The signatures and characters are selected manually from the filtered image. This can also be done automatically by setting the rectangular area in a predefined function.
The signature and characters are converted from grayscale to a binary image, so the image contains only two types of pixels: 0s (white) and 1s (black). The binary image is shown below in
After converting the image into a binary image, we removed the unwanted pixels (0) and resized the image.
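The binarization, removal of surrounding background, and resizing described above could be sketched as follows. The threshold of 128, the bounding-box crop, and the nearest-neighbour resize are all assumptions chosen for illustration; the text does not specify which threshold or interpolation was used. The function names are hypothetical.

```python
def binarize(gray, threshold=128):
    """Map dark pixels to 1 (black ink) and light pixels to 0 (white)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def crop_to_ink(binary):
    """Keep only the bounding box of the ink pixels (the 1s)."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not rows:
        return []  # blank image: nothing to keep
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


def resize_nearest(binary, out_h, out_w):
    """Resize to out_h x out_w by nearest-neighbour sampling."""
    h, w = len(binary), len(binary[0])
    return [
        [binary[y * h // out_h][x * w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

Cropping before resizing means every sample fills the same frame, which makes the later pixel-distribution features comparable across samples.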
Segmentation depends on determining the best cut path to isolate each character correctly. We mainly worked with the segmentation-based recognition technique. We used an algorithm that computes the total number of white pixels in each vertical column of the binary image so that the text regions can be separated easily. The segmented digits and letters are shown in
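Column-wise pixel counting of this kind is often called a vertical projection profile. The sketch below is a hedged illustration, not the paper's algorithm: it assumes ink pixels are stored as 1 and treats any column with no ink as a gap between characters. If the image is stored with the opposite polarity, the other pixel value would be counted instead. The name `segment_columns` is hypothetical.

```python
def segment_columns(binary):
    """Split a binary text line into (start, end) column ranges, end exclusive.

    A column belongs to a character if it holds at least one ink pixel (1).
    """
    width = len(binary[0])
    profile = [sum(row[x] for row in binary) for x in range(width)]
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a character begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # a character ends at a gap
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments
```

This simple rule only separates characters that do not touch; connected or overlapping strokes would need a more careful cut path.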
Rotation- and size-independent feature extraction methods are used to extract the features of each segmented digit and signature, yielding 44 features for each digitized signature.
Center of the image
The center of the image is obtained from the following two equations:
Center_x = width/2 (1)
Center_y = height/2 (2)
Feature 1-38
These features capture how the black pixels are distributed in the image. First, the total number of pixels in the image, Total_pixels, is calculated:
Total_pixels = height × width (3)
The proportions of black pixels in the upper and lower areas of the image are defined as Feature 1 and Feature 2, respectively. These are the black pixels located above and below the central point.
feature 1 = up_pixels/total_pixels (4)
feature 2 = down_pixels/total_pixels (5)
As in the equations above, Feature 3 and Feature 4 represent the proportions of black pixels located in the left and right areas of the image, in other words, the black pixels located to the left and right of the central point.
feature 3 = left_pixels/total_pixels (6)
feature 4 = right_pixels/total_pixels (7)
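Equations (1) to (7) can be sketched in a few lines of Python. The sketch assumes a binary image stored as a list of rows with 1 for black pixels, uses integer division for the center, and assigns the center row and column to the lower and right halves; these conventions and the name `half_plane_features` are assumptions, as the text does not state them.

```python
def half_plane_features(binary):
    """Return [feature 1, feature 2, feature 3, feature 4] for a binary image."""
    height, width = len(binary), len(binary[0])
    center_y, center_x = height // 2, width // 2   # Equations (1) and (2)
    total_pixels = height * width                  # Equation (3)
    up = sum(sum(row) for row in binary[:center_y])
    down = sum(sum(row) for row in binary[center_y:])
    left = sum(sum(row[:center_x]) for row in binary)
    right = sum(sum(row[center_x:]) for row in binary)
    return [up / total_pixels, down / total_pixels,
            left / total_pixels, right / total_pixels]
```

Since each value is divided by the total pixel count, features 1 and 2 (and likewise 3 and 4) sum to the overall fraction of black pixels in the image.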
Now partition the image into four sub-regions and calculate the proportion of black pixels located in every region. Then subdivide every region into four again and calculate the proportion of black pixels in those regions.
feature_n = sub_area_pixels_n/total_pixels (8)
where n = 5 to 24.
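The two-level partition gives 4 + 16 = 20 values, which matches n = 5 to 24. A minimal sketch follows; the ordering of the regions (the four quadrants first, then their sub-quadrants, each in top-left, top-right, bottom-left, bottom-right order) is an assumption, and the function names are hypothetical.

```python
def count_ink(binary, y0, y1, x0, x1):
    """Count black pixels (1s) in rows y0..y1-1 and columns x0..x1-1."""
    return sum(sum(row[x0:x1]) for row in binary[y0:y1])


def quadrants(y0, y1, x0, x1):
    """Split a rectangle into four sub-rectangles around its midpoint."""
    ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
    return [(y0, ym, x0, xm), (y0, ym, xm, x1),
            (ym, y1, x0, xm), (ym, y1, xm, x1)]


def quadtree_features(binary):
    """Return features 5 to 24: 4 quadrant ratios, then 16 sub-quadrant ratios."""
    height, width = len(binary), len(binary[0])
    total_pixels = height * width
    level1 = quadrants(0, height, 0, width)
    level2 = [sub for region in level1 for sub in quadrants(*region)]
    return [count_ink(binary, *r) / total_pixels for r in level1 + level2]
```

Every value is normalized by the pixel count of the whole image, as in Equation (8), not by the area of the sub-region.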
To extract features 25 to 38, we consider the 16 (4 × 4) subregions, or blocks.
feature 25 = total number of black pixels from block (0,0) to (3,3)/Total_pixels (9)
feature 26 = total number of black pixels from block (1,0) to (4,3)/Total_pixels (10)
feature 27 = total number of black pixels from block (0,1) to (3,4)/Total_pixels (11)
feature 28 = total number of black pixels from block (1,1) to (4,4)/Total_pixels (12)
feature 29 = total black pixels of 2nd and 3rd rows of blocks/Total_pixels (13)
feature 30 = total black pixels of 2nd and 3rd columns of blocks/Total_pixels (14)
feature 31 = total black pixels of 1st row of blocks/Total_pixels (15)
feature 32 = total black pixels of 2nd row of blocks/Total_pixels (16)
feature 33 = total black pixels of 3rd row of blocks/Total_pixels (17)
feature 34 = total black pixels of 4th row of blocks/Total_pixels (18)
feature 35 = total black pixels of 1st column of blocks /Total_pixels (19)
feature 36 = total black pixels of 2nd column of blocks/Total_pixels (20)
feature 37 = total black pixels of 3rd column of blocks/Total_pixels (21)
feature 38 = total black pixels of 4th column of blocks/Total_pixels (22)
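Features 29 to 38 reduce to sums over rows and columns of blocks, as the sketch below illustrates. It assumes block boundaries at integer quarters of the height and width and black pixels stored as 1; the name `block_band_features` is hypothetical. Features 25 to 28 are left out because the block indexing convention in Equations (9) to (12) is not spelled out in the text.

```python
def block_band_features(binary):
    """Return features 29 to 38 as a list, in that order."""
    height, width = len(binary), len(binary[0])
    total_pixels = height * width
    row_ink = [sum(row) for row in binary]
    col_ink = [sum(row[x] for row in binary) for x in range(width)]
    # Black-pixel totals for the four rows and four columns of blocks.
    rows = [sum(row_ink[i * height // 4:(i + 1) * height // 4]) for i in range(4)]
    cols = [sum(col_ink[i * width // 4:(i + 1) * width // 4]) for i in range(4)]
    f29 = (rows[1] + rows[2]) / total_pixels   # 2nd and 3rd rows of blocks
    f30 = (cols[1] + cols[2]) / total_pixels   # 2nd and 3rd columns of blocks
    return ([f29, f30]
            + [r / total_pixels for r in rows]    # features 31 to 34
            + [c / total_pixels for c in cols])   # features 35 to 38
```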
Feature 39
Feature 39 is the average distance between all the black pixels and the central point.
$$\text{feature 39} = \frac{1}{\text{Total\_pixels}} \times \sum_{y} \sum_{x} (x - i)^{2} \times (y - j)^{2} \qquad (23)$$
where (i, j) are the coordinates of a point and (x, y) are the coordinates of the central point.
Feature 40-46
These features are the seven moments of the image, well known as the Hu moment invariants. We calculated the central moments of the segmented signature. For a 2-D function f(x, y) of an M × N binary image, the moment of order (p + q) is defined by:
$$m_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{N} x^{p}\, y^{q}\, f(x, y) \qquad (24)$$
where $p, q = 0, 1, 2, 3, \cdots$.
Central moments are obtained by the following equation:
$$\mu_{pq} = \sum_{x} \sum_{y} (x - \bar{x})^{p} (y - \bar{y})^{q} f(x, y) \qquad (25)$$
where $\bar{x} = \frac{m_{10}}{m_{00}}$ and $\bar{y} = \frac{m_{01}}{m_{00}}$.
For scale normalization, the central moments become:
$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}} \qquad (26)$$
where $\gamma = \left[\frac{p + q}{2}\right] + 1$.
The seven values, computed from the normalized central moments through order three, are invariant to object scale, position, and orientation. In terms of the normalized central moments, the seven moments are given as:
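$$M_1 = \eta_{20} + \eta_{02} \qquad (27)$$
$$M_2 = (\eta_{20} - \eta_{02})^{2} + 4\eta_{11}^{2} \qquad (28)$$
$$M_3 = (\eta_{30} - 3\eta_{12})^{2} + (3\eta_{21} - \eta_{03})^{2} \qquad (29)$$
$$M_4 = (\eta_{30} + \eta_{12})^{2} + (\eta_{21} + \eta_{03})^{2} \qquad (30)$$
$$M_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^{2} - 3(\eta_{21} + \eta_{03})^{2}\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^{2} - (\eta_{21} + \eta_{03})^{2}\right] \qquad (31)$$
$$M_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^{2} - (\eta_{21} + \eta_{03})^{2}\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \qquad (32)$$
$$M_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^{2} - 3(\eta_{21} + \eta_{03})^{2}\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^{2} - (\eta_{21} + \eta_{03})^{2}\right] \qquad (33)$$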
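As a hedged illustration of Equations (24) to (33), the sketch below computes the seven invariants for a binary image in pure Python, using the standard form of the Hu moments. It assumes f(x, y) = 1 for black pixels and 0 otherwise, that the image contains at least one black pixel, and 1-based pixel coordinates; the function name `hu_moments` is hypothetical.

```python
def hu_moments(binary):
    """Return the seven Hu moment invariants of a binary image (1 = black)."""
    # Coordinates of the black pixels, 1-based as in Equation (24).
    pts = [(x, y) for y, row in enumerate(binary, start=1)
           for x, v in enumerate(row, start=1) if v]
    m00 = len(pts)                       # equals mu_00 for a binary image
    x_bar = sum(x for x, _ in pts) / m00
    y_bar = sum(y for _, y in pts) / m00

    def eta(p, q):
        """Normalized central moment, Equations (25) and (26)."""
        mu = sum((x - x_bar) ** p * (y - y_bar) ** q for x, y in pts)
        return mu / m00 ** ((p + q) / 2 + 1)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return [
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ]
```

A quick sanity check is to rotate a small shape by 90 degrees, for example with `[list(r) for r in zip(*shape[::-1])]`, and confirm that the seven values agree up to floating-point rounding.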
Feature 47-52
Feature 47 to feature 52 show the mean of Major Axis Length, Minor Axis Length, Contrast, Homogeneity, Correlation, and Energy respectively.
A neural network comprises a number of nodes, called neurons, joined by links. Every link has a numeric weight. Neurons are the fundamental building blocks of a neural network.
A neural network is used for signature recognition. For this purpose, a multilayer feed-forward neural network with a supervised learning method is feasible and productive. The network implements the back-propagation learning algorithm, which is a systematic technique for training multilayer ANNs. The network design for the system is shown in
A neural network has several parameters, including the learning rate (η), the weights (w), and the momentum (α). The learning rate determines how much the link weights and biases can be modified based on the direction and rate of change. Its value must be in the range 0 to 1. For a given neuron, the learning rate can be chosen inversely proportional to the square root of the number of synaptic connections to the neuron.
The weights of the network, which are adjusted by the back-propagation algorithm, must be initialized with non-zero values. The initial weights are chosen randomly in [−0.5, +0.5] or [−0.1, +0.1]. The weight update is performed using the following equation:
$$\Delta w(n) = \eta \cdot \delta(n) \cdot y(n) + \alpha\, \Delta w(n - 1) \qquad (34)$$
The momentum (α) ranges from 0 to 1, and a value of 0.9 is found to be suitable for most applications.
Parameter | Value |
---|---|
Hidden Layer | 85 |
Output Layer | 30 |
Learning Rate | 0.0001 |
Total number of epochs | 1000 |
Performance Goal | 0.00000001 |
Momentum | 0.9 |
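The weight update of Equation (34) can be written as a one-line function. The sketch below is illustrative only: the name `momentum_step`, the argument names, and the numeric inputs in the loop are assumptions, with δ taken as the local error gradient and y as the input signal on the link.

```python
def momentum_step(eta, alpha, delta, y, prev_dw):
    """Weight change of Equation (34): eta * delta(n) * y(n) + alpha * dw(n - 1)."""
    return eta * delta * y + alpha * prev_dw


# Two consecutive updates of one weight with made-up gradient values.
w, dw = 0.05, 0.0
for delta, y in [(0.2, 1.0), (0.2, 1.0)]:
    dw = momentum_step(eta=0.5, alpha=0.9, delta=delta, y=y, prev_dw=dw)
    w += dw
```

With a constant gradient, the second step is larger than the first because the momentum term carries over 90% of the previous change, which is what lets the update speed up along a consistent direction.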
The performance analysis was carried out after testing the results of this experiment. MATLAB was used to implement the proposed system. MATLAB can be used for algorithm development; math and computation; data analysis, exploration, and visualization; modeling, simulation, and prototyping; scientific and engineering graphics; and application development. After the extraction and segmentation of the scanned image, each digit is converted into a 20 × 15 binary image to be tested by the neural network. We split the samples into two groups: the training set and the test set. The training set contains 60% of the total genuine samples, and the remaining samples are used for testing the system.
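A 60/40 split of this kind might look like the sketch below. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the text does not say how the training samples were chosen. The name `split_samples` is hypothetical.

```python
import random


def split_samples(samples, train_fraction=0.6, seed=0):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical sample identifiers for one class.
train, test = split_samples(list(range(10)))
```

With ten samples per class, this yields six training samples and four test samples.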
The method was tested with 10 types of handwritten digits and 52 types of letters, with ten samples of every digit and letter. We collected signatures from 30 different persons and used ten samples per person. Signature recognition is 100% when the training and test sets are the same. The accuracy rate is 98.1% for digit recognition and 97.31% for English letters. The accuracy rate for signatures is 97.6%, and the error rate is only 2.4%. The recognition rates for the test data are shown below in Tables 2-4.
The final training performance is shown below in
| Type of digit | No. of samples | Correct recognition rate (%) | Error rate (%) |
|---|---|---|---|
| | 10 | 99.00 | 1.00 |
| | 10 | 98.50 | 1.50 |
| | 10 | 99.75 | 0.25 |
| | 10 | 98.50 | 1.50 |
| | 10 | 97.25 | 2.75 |
| | 10 | 98.00 | 2.00 |
| | 10 | 99.00 | 1.00 |
| | 10 | 98.50 | 1.50 |
| | 10 | 95.00 | 5.00 |
| | 10 | 97.00 | 3.00 |
| Total | 100 | 98.1 | 1.9 |
| Alphabet | No. of samples | Correct recognition rate (%) | Error rate (%) |
|---|---|---|---|
| | 10 | 99.00 | 1.00 |
| | 10 | 98.50 | 1.50 |
| | 10 | 98.50 | 1.50 |
| | 10 | 99.75 | 0.25 |
| | 10 | 99.00 | 1.00 |
| | 10 | 99.50 | 0.50 |
| | 10 | 98.50 | 1.50 |
| | 10 | 98.75 | 1.25 |
| | 10 | 99.25 | 0.75 |
| | 10 | 97.75 | 2.25 |
| | 10 | 97.00 | 3.00 |
| | 10 | 92.50 | 7.50 |
| | 10 | 97.75 | 2.25 |
| | 10 | 99.50 | 0.50 |
| | 10 | 99.25 | 0.75 |
| | 10 | 95.50 | 4.50 |
| | 10 | 98.50 | 1.50 |
| | 10 | 97.50 | 2.50 |
| | 10 | 93.75 | 6.25 |
| | 10 | 99.25 | 0.75 |
| | 10 | 98.25 | 1.75 |
| | 10 | 95.00 | 5.00 |
| | 10 | 98.50 | 1.50 |
| | 10 | 98.75 | 1.25 |
| | 10 | 97.25 | 2.75 |
| | 10 | 94.25 | 5.75 |
| | 10 | 95.50 | 4.50 |
| | 10 | 98.75 | 1.25 |
| | 10 | 99.00 | 1.00 |
| | 10 | 98.75 | 1.25 |
| | 10 | 95.00 | 5.00 |
| | 10 | 98.25 | 1.75 |
| | 10 | 96.50 | 3.50 |
| | 10 | 98.75 | 1.25 |
| | 10 | 99.00 | 1.00 |
| | 10 | 98.50 | 1.50 |
| | 10 | 98.00 | 2.00 |
| | 10 | 97.50 | 2.50 |
| | 10 | 95.25 | 4.75 |
| | 10 | 98.75 | 1.25 |
| | 10 | 99.25 | 0.75 |
| | 10 | 98.50 | 1.50 |
| | 10 | 98.75 | 1.25 |
| | 10 | 95.75 | 4.25 |
| | 10 | 92.50 | 7.50 |
| | 10 | 98.25 | 1.75 |
| | 10 | 90.25 | 9.75 |
| | 10 | 98.25 | 1.75 |
| | 10 | 98.50 | 1.50 |
| | 10 | 93.75 | 6.25 |
| | 10 | 93.50 | 6.50 |
| | 10 | 94.50 | 5.50 |
| Total | 520 | 97.31 | 2.69 |
| Type of signature | No. of samples | Correct recognition rate (%) | Error rate (%) |
|---|---|---|---|
| | 10 | 99.50 | 0.50 |
| | 10 | 99.00 | 1.00 |
| | 10 | 91.00 | 9.00 |
| | 10 | 96.25 | 3.75 |
| | 10 | 99.75 | 0.25 |
| | 10 | 99.25 | 0.75 |
| | 10 | 99.25 | 0.75 |
| | 10 | 99.75 | 0.25 |
| | 10 | 98.00 | 2.00 |
| | 10 | 95.75 | 4.25 |
| | 10 | 99.50 | 0.50 |
| | 10 | 99.25 | 0.75 |
| | 10 | 94.50 | 5.50 |
| | 10 | 97.50 | 2.50 |
| | 10 | 99.25 | 0.75 |
| | 10 | 98.50 | 1.50 |
| | 10 | 99.75 | 0.25 |
| | 10 | 99.75 | 0.25 |
| | 10 | 90.00 | 10.00 |
| | 10 | 99.25 | 0.75 |
| | 10 | 96.75 | 3.25 |
| | 10 | 99.75 | 0.25 |
| | 10 | 91.25 | 8.75 |
| | 10 | 98.75 | 1.25 |
| | 10 | 99.00 | 1.00 |
| | 10 | 98.25 | 1.75 |
| | 10 | 96.00 | 4.00 |
| | 10 | 99.75 | 0.25 |
| | 10 | 96.00 | 4.00 |
| | 10 | 96.50 | 3.50 |
| Total | 300 | 97.6 | 2.4 |
The proposed system provides a method for recognizing handwritten characters (both numerical and alphabetic) and signatures. A neural network is designed and tested with 10 samples of each type of character and 10 samples of each signature. These samples are of different types, and each shows a percentage matching/acceptance rate. The acceptance rate depends on appropriate training samples. On average, the success rate of the Numerical Character Recognition and Verification System (NCRVS) is 98.1%, that of the Alphabetic Character Recognition and Verification System (ACRVS) is 97.31%, and that of the Signature Recognition and Verification System (SRVS) is 97.6%, which meets the expectations of the research. This work mainly aims at reducing cases of fraud in commercial transactions.
Nashif, Md.H.H., Miah, Md.B.A., Habib, A., Moulik, A.C., Islam, Md.S., Zakareya, M., Ullah, A., Rahman, Md.A. and Al Hasan, Md. (2018) Handwritten Numeric and Alphabetic Character Recognition and Signature Verification Using Neural Network. Journal of Information Security, 9, 209-224. https://doi.org/10.4236/jis.2018.93015