Mexican Sign Language Recognition Using Jacobi-Fourier Moments

doi:10.4236/eng.2015.710061

Engineering
Vol. 07 No. 10 (2015), Article ID: 60835, 5 pages

Francisco Solís¹, Carina Toxqui², David Martínez¹

¹University Center UAEM Teotihuacan Valley, Autonomous University of Mexico State, México

²Polytechnic University of Tulancingo, Tulancingo, México

Email: jfsolisv@uamex.mx, ctoxqui@upt.edu.mx

This work is licensed under the Creative Commons Attribution International License (CC BY).

http://creativecommons.org/licenses/by/4.0/

Received 13 July 2015; accepted 27 October 2015; published 30 October 2015

ABSTRACT

This work introduces a system for recognizing static signs in Mexican Sign Language (MSL) using Jacobi-Fourier Moments (JFMs) and Artificial Neural Networks (ANNs). The original color images of static signs are cropped, segmented and converted to grayscale. To reduce computational cost, 64 JFMs are then calculated to represent each image. The JFMs are sorted according to a proposed metric, based on a ratio between dispersion measures, in order to select a subset that improves recognition. A Multilayer Perceptron tested in WEKA with this subset of JFMs reached a recognition rate of 95%.

Keywords:

Mexican Sign Language, Jacobi-Fourier Moments, Digital Image Processing

1. Introduction

Sign Language Recognition (SLR) is a research field that has grown in recent years, attracting increasing interest from researchers around the world. Sign Language (SL) is the main and most natural form of communication for the deaf community; however, most people communicate verbally, so the use of SL is largely limited to deaf people and the hearing people closest to them [1].

SL is not universal; each country or region has its own, such as Mexican Sign Language (MSL), American Sign Language (ASL), Chinese Sign Language (ChSL), Japanese Sign Language (JSL) and Persian Sign Language (PSL), to name a few. They differ according to the uses and customs of the places where they are used [2].

Expressing SL usually involves the hands, arms, body movements and facial expressions. Developing a system that recognizes all of these expressions is therefore a complex task; in fact, most existing systems are very limited.

SLR systems can be classified into two main classes according to how they acquire information. The first type uses Digital Image Processing, which lets users interact with the system in a more natural way; however, accurate data are harder to acquire because most of these proposals work with 2D images, which makes it difficult to track the position or movement of the fingers or the shape of the hand. The second type uses electronic devices physically attached to the user's body, which provide accurate data on the position, movement or velocity of the fingers and other points of interest, at the cost of restricting the user's freedom of movement.

This work extends a previous publication in this journal [3]. Its aim is to recognize static signs by digital image processing (without movement sensors, special wires or any electronic device attached to the signer), avoiding the use of special color markers or clothes.

2. Database

The MSL alphabet considered in this work consists of 26 signs, two of which (“j” and “z”) are expressed by movement; for this reason, the database contains the 24 static signs (see Figure 1). A solid white background was used for segmentation purposes, and images were captured with a Canon EOS Rebel T3 digital camera (EF-S 18-55 lens) using flash in order to reduce shadows. Five versions of each sign were captured from a single signer.

Figure 1. Static signs of the Mexican Sign Language (MSL) alphabet, captured against a white background without gloves or special color markers.

3. Jacobi-Fourier Moments

Jacobi-Fourier Moments (JFMs) [4] are a powerful tool extensively used in image analysis. JFMs extract relevant information from a function (here, the segmented grayscale image of a sign) and, thanks to their orthogonality, can represent that function with few values and minimal redundancy.

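The general expression of the JFMs is

$$\phi_{nm} = \int_{0}^{2\pi}\int_{0}^{1} f(r,\theta)\, P_{nm}(\alpha,\beta,r,\theta)\, r\, dr\, d\theta, \qquad (1)$$

where $n$ denotes the order and $m$ the repetition, $f(r,\theta)$ is the image function in polar coordinates and $P_{nm}(\alpha,\beta,r,\theta)$ is the kernel function given by

$$P_{nm}(\alpha,\beta,r,\theta) = J_{n}(\alpha,\beta,r)\, \exp(-jm\theta), \qquad (2)$$

where $\exp(-jm\theta)$ is the Fourier term and $J_{n}(\alpha,\beta,r)$ is the radial orthogonal Jacobi polynomial expressed as

$$J_{n}(\alpha,\beta,r) = \sqrt{\frac{w(\alpha,\beta,r)}{b_{n}(\alpha,\beta)\, r}}\; G_{n}(\alpha,\beta,r). \qquad (3)$$

$G_{n}(\alpha,\beta,r)$, $w(\alpha,\beta,r)$ and $b_{n}(\alpha,\beta)$ are the Jacobi polynomials, weight function and normalization constant, respectively, and can be described using the gamma function (Γ) as [5]:

$$G_{n}(\alpha,\beta,r) = \frac{n!\,\Gamma(\beta)}{\Gamma(\alpha+n)} \sum_{s=0}^{n} (-1)^{s}\, \frac{\Gamma(\alpha+n+s)}{(n-s)!\; s!\; \Gamma(\beta+s)}\, r^{s}, \qquad (4)$$

$$w(\alpha,\beta,r) = (1-r)^{\alpha-\beta}\, r^{\beta-1} \qquad (5)$$

and

$$b_{n}(\alpha,\beta) = \frac{n!\,\Gamma^{2}(\beta)\,\Gamma(\alpha-\beta+n+1)}{\Gamma(\beta+n)\,\Gamma(\alpha+n)\,(\alpha+2n)}. \qquad (6)$$

The restrictions on $\alpha$ and $\beta$ are $\alpha - \beta > -1$ and $\beta > 0$.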
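The definitions in this section can be sketched numerically. The following pure-Python example is a minimal illustration, not the authors' implementation. It assumes the gamma-function form of the Jacobi polynomial from [5], the weight (1 - r)^(α - β) r^(β - 1), a radial function normalized to be orthonormal on [0, 1] with respect to r dr, and a simple pixel-sum approximation of the integral over the unit disk. The function names `jacobi_radial` and `jfm` and the mapping of the image onto the disk are hypothetical choices made for this sketch.

```python
import cmath
import math


def jacobi_radial(n, alpha, beta, r):
    """Radial function J_n(alpha, beta, r) for 0 < r < 1.

    Assumes alpha - beta > -1 and beta > 0.
    """
    # Jacobi polynomial G_n written as a finite power series in r.
    series = 0.0
    for s in range(n + 1):
        coeff = math.gamma(alpha + n + s) / (
            math.factorial(n - s) * math.factorial(s) * math.gamma(beta + s)
        )
        series += (-1) ** s * coeff * r ** s
    g = math.factorial(n) * math.gamma(beta) / math.gamma(alpha + n) * series

    # Weight function and normalization constant.
    w = (1.0 - r) ** (alpha - beta) * r ** (beta - 1.0)
    b = (
        math.factorial(n) * math.gamma(beta) ** 2 * math.gamma(alpha - beta + n + 1)
        / (math.gamma(beta + n) * math.gamma(alpha + n) * (alpha + 2 * n))
    )

    # Dividing by r makes the functions orthonormal with respect to r dr
    # (an assumption of this sketch).
    return math.sqrt(w / (b * r)) * g


def jfm(image, n, m, alpha, beta):
    """Approximate the moment of order n and repetition m.

    `image` is a square list of rows of gray levels. Pixel centres are mapped
    onto [-1, 1] x [-1, 1] and only pixels inside the unit disk contribute.
    """
    size = len(image)
    step = 2.0 / size  # pixel width after the mapping
    total = 0j
    for row in range(size):
        for col in range(size):
            x = -1.0 + (col + 0.5) * step
            y = -1.0 + (row + 0.5) * step
            r = math.hypot(x, y)
            if r == 0.0 or r >= 1.0:
                continue  # skip the singular centre and pixels outside the disk
            theta = math.atan2(y, x)
            kernel = jacobi_radial(n, alpha, beta, r) * cmath.exp(-1j * m * theta)
            total += image[row][col] * kernel
    # In Cartesian coordinates the area element r dr dtheta equals dx dy.
    return total * step * step
```

For example, `abs(jfm(image, 0, 2, alpha, beta))` gives the magnitude of the moment with order 0 and repetition 2. Whether magnitudes or complex values were passed to the classifier is not stated in the text.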

4. Proposed System

Figure 2 shows the block diagram of the proposed system. The original image can be seen in Figure 2(a). To reduce computational cost, a Region Of Interest (ROI) was selected by cropping the original image (see Figure 2(b)). Figure 2(c) illustrates the segmented sign “A” in grayscale, which is used to calculate 64 JFMs (Figure 2(d)). JFMs are used as descriptors of the signs and depend on four parameters (order p and repetition q, written n and m in Section 3, plus α and β). The values of α and β that gave the best recognition rate for this database were found experimentally. The 64 JFMs were computed from combinations of p and q, whose ranges were also adjusted experimentally. The 64 JFMs were then sorted according to a proposed metric that measures the performance of each JFM (Figure 2(e)). Finally, 64 tests were run using a Multilayer Perceptron in WEKA [6]: the first test uses only the first (best) JFM, the second test uses the first two sorted JFMs, the third uses the first three, and so on, until test number 64 uses all 64 JFMs (Figure 2(f)).
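The cropping, segmentation and grayscale steps of Figure 2(a)-(c) can be sketched as follows. The text does not state which segmentation or conversion method was used, so this is only an illustration under assumptions: background pixels are detected with a hypothetical brightness threshold (`white_threshold`), and the grayscale conversion uses the common ITU-R BT.601 luma weights. The helper names are also hypothetical.

```python
def crop(image, top, left, height, width):
    """Return the region of interest as a new list of rows."""
    return [row[left:left + width] for row in image[top:top + height]]


def to_gray(pixel):
    """Convert one (R, G, B) pixel to a gray level (BT.601 weights, assumed)."""
    red, green, blue = pixel
    return 0.299 * red + 0.587 * green + 0.114 * blue


def segment_to_gray(image, white_threshold=200):
    """Set near-white background pixels to 0 and convert the rest to gray.

    A pixel counts as background when all three channels reach the
    threshold; the value 200 is an assumption, not taken from the paper.
    """
    result = []
    for row in image:
        out_row = []
        for pixel in row:
            is_background = min(pixel) >= white_threshold
            out_row.append(0.0 if is_background else to_gray(pixel))
        result.append(out_row)
    return result


# Tiny synthetic example: a white 4 x 4 image with one skin-toned pixel.
white = (255, 255, 255)
skin = (200, 150, 120)
picture = [[white] * 4 for _ in range(4)]
picture[2][1] = skin
roi = crop(picture, top=1, left=0, height=2, width=3)
gray = segment_to_gray(roi)
```

A fixed threshold only works here because the database uses a solid white background and flash to suppress shadows; other capture conditions would need a different segmentation step.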

The metric proposed to sort the JFMs by performance, and thus to perform feature selection, is described as follows. First, a descriptor matrix of size M × n is defined to hold the values of one JFM calculated over the database, where M and n are the numbers of signs and versions, respectively. A desirable JFM should be numerically similar when calculated on different versions of the same sign and should change when calculated on different signs. In other words, a JFM (with particular α, β, p and q) computed over the whole database and stored in this matrix should be nearly constant along each row (low dispersion) and, at the same time, vary along each column (high dispersion).

Figure 2. Block diagram of the proposed system. (a) Original captured image of the sign “A”; (b) cropped image; (c) segmented image converted from RGB to grayscale; (d) 64 JFMs computed from (c); (e) JFM subset selected according to the proposed metric; and (f) database classification using (e) and a Multilayer Perceptron.

To obtain a metric that takes both requirements into account, several quantities are calculated. First, a vector is computed holding the average over versions for each sign:

(7)

where each entry stores the mean of the versions of one sign. This vector is used to calculate the variance of versions as

(8)

which is expected to be close to zero (minimum dispersion across versions), because descriptors should not change between versions of the same sign. Then the overall average is calculated as

(9)

in order to compute the variance of signs as

. (10)

These two dispersion measures (SDN_i, the variance of versions, and the variance of signs) are combined into a weighted ratio that summarizes in a single value whether a JFM is a good descriptor. This metric is expressed as

, (11)

where mo is the objective metric that determines whether a JFM is a good descriptor. A JFM with minimum variance across versions and large variance across signs yields a desirable value of mo; conversely, a large variance across versions, a small variance across signs, or both yield a value of mo that marks the JFM as a poor descriptor.
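The idea behind the metric can be illustrated with a small sketch. The code below is not the authors' formula: it uses a plain, unweighted ratio of the mean within-sign variance to the variance of the per-sign means, whereas the metric described above is a weighted ratio. It also assumes real-valued descriptors (for instance JFM magnitudes) stored as one row per sign and one column per version. The names `dispersion_ratio` and `rank_descriptors` are hypothetical.

```python
from statistics import mean, pvariance


def dispersion_ratio(descriptor):
    """Ratio of within-sign to between-sign dispersion (smaller is better).

    `descriptor` holds one row per sign and one column per version.
    """
    sign_means = [mean(versions) for versions in descriptor]
    # Average variance across versions of the same sign (should be small).
    within = mean(pvariance(versions) for versions in descriptor)
    # Variance of the per-sign means (should be large).
    between = pvariance(sign_means)
    if between == 0.0:
        return float("inf")  # the descriptor cannot separate any signs
    return within / between


def rank_descriptors(descriptors):
    """Return descriptor indices ordered from best to worst (ascending ratio)."""
    return sorted(range(len(descriptors)),
                  key=lambda k: dispersion_ratio(descriptors[k]))


# Two toy descriptors for 2 signs x 3 versions.
stable = [[1.0, 1.1, 0.9], [5.0, 5.1, 4.9]]    # tight per sign, far apart
unstable = [[1.0, 5.0, 3.0], [3.1, 1.0, 5.0]]  # scattered, overlapping signs
order = rank_descriptors([unstable, stable])
```

Sorting in ascending order places the descriptor that is most stable across versions and most distinct across signs first, which mirrors the ordering used for the classification tests.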

The 64 JFMs were calculated and sorted by the mo metric in ascending order, and 64 tests were then run in WEKA [6] using a Multilayer Perceptron (first introduced by Rosenblatt [7]). The first test uses a single descriptor to represent each image in the database, the second test uses two descriptors, and so on; the last test uses all 64 descriptors. Every test was run using cross-validation.
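The incremental testing procedure can be sketched as a loop over growing feature subsets. This example deliberately swaps in a different classifier and validation scheme: a 1-nearest-neighbour rule with leave-one-out cross-validation stands in for the WEKA Multilayer Perceptron, because the text does not give the network settings or the number of folds. It assumes each sample is a list of real-valued descriptors already sorted from best to worst. The names `loo_accuracy` and `sweep_subsets` are hypothetical.

```python
def loo_accuracy(samples, labels, k):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on the first k features."""
    correct = 0
    for i, query in enumerate(samples):
        best_dist, best_label = float("inf"), None
        for j, other in enumerate(samples):
            if i == j:
                continue  # the held-out sample cannot vote for itself
            dist = sum((a - b) ** 2 for a, b in zip(query[:k], other[:k]))
            if dist < best_dist:
                best_dist, best_label = dist, labels[j]
        if best_label == labels[i]:
            correct += 1
    return correct / len(samples)


def sweep_subsets(samples, labels):
    """Accuracy with the first 1, 2, ..., all features, in that order."""
    n_features = len(samples[0])
    return [loo_accuracy(samples, labels, k) for k in range(1, n_features + 1)]


# Toy data: feature 0 separates the classes, feature 1 is large-amplitude noise.
toy_samples = [[0.0, 10.0], [0.1, -10.0], [1.0, -10.0], [1.1, 10.0]]
toy_labels = ["A", "A", "B", "B"]
accuracies = sweep_subsets(toy_samples, toy_labels)
best_size = accuracies.index(max(accuracies)) + 1
```

In the toy data, adding the noisy second feature lowers the accuracy, which is the effect the sorted-subset tests are designed to expose.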

Table 1 shows the results of each classification test. The first test, which uses the JFM with p = 0 and q = 0, achieves a recognition rate of 8.3333%; the second test uses two JFMs (p = 0, q = 0 and p = 0, q = 2) and also achieves 8.3333%. The best subset consists of the first 27 JFMs, which achieves a recognition rate of 95.0%.

Table 1. 64 classification tests using a Multilayer Perceptron.
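The reported rates can be related to image counts with a short calculation. It assumes that each of the 24 × 5 = 120 database images is classified exactly once under cross-validation, which the text does not state explicitly. The rates are the three values reported in this paper (one JFM, all 64 JFMs, and the best 27 JFMs), and `implied_correct` is a hypothetical helper.

```python
def implied_correct(rate_percent, total_images):
    """Number of correctly classified images implied by a recognition rate."""
    return round(rate_percent / 100.0 * total_images)


TOTAL_IMAGES = 24 * 5  # 24 static signs, five versions each

for rate in (8.3333, 89.1667, 95.0):
    print(rate, implied_correct(rate, TOTAL_IMAGES))

# Difference between the best subset and the full set, in percentage points.
gain = 95.0 - 89.1667
print(round(gain, 4))
```

Under that assumption the three rates correspond to 10, 107 and 114 correctly classified images out of 120, so selecting the first 27 JFMs recovers seven additional images compared with using all 64.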

5. Conclusion

JFMs can be used to extract descriptors of static signs. They reduce the computational cost of MSL recognition, since an image can be represented by only 27 values. MSL recognition can be achieved without gloves or special markers (using a plain white background). A Multilayer Perceptron can classify the signs from the JFMs and achieves a recognition rate of 95% in a cross-validation scheme. The proposed metric can improve the overall recognition rate: Table 1 shows that all 64 JFMs give a recognition rate of 89.1667%, whereas the first 27 JFMs improve it, for this database, by almost 6 percentage points.

Acknowledgements

The authors thank the research department (Secretaría de Investigación) of the Autonomous University of Mexico State (Universidad Autónoma del Estado de México) for the financial support to carry out this work at the University Center UAEM Teotihuacan Valley (Centro Universitario UAEM Valle de Teotihuacán), and the Polytechnic University of Tulancingo (Universidad Politécnica de Tulancingo) for all their support.

Cite this paper

Francisco Solís, Carina Toxqui, David Martínez (2015) Mexican Sign Language Recognition Using Jacobi-Fourier Moments. Engineering, 07, 700-705. doi: 10.4236/eng.2015.710061

References

1. Shohieb, S.M., Elminir, H.K. and Riad, A.M. (2015) SignsWorld Atlas; a Benchmark Arabic Sign Language Database. Journal of King Saud University—Computer and Information Sciences, 27, 68-76.

2. López, V., Barra, R., Lutfi, S., Montero, J.M. and San, R. (2013) LSESpeak: A Spoken Language Generator for Deaf People. Expert Systems with Applications, 40, 1283-1295.
http://dx.doi.org/10.1016/j.eswa.2012.08.062

3. Solís, F., Hernández, M., Pérez, A. and Toxqui, C. (2014) Static Digits Recognition Using Rotational Signatures and Hu Moments with a Multilayer Perceptron. Engineering, 6, 692-698.
http://dx.doi.org/10.4236/eng.2014.611068

4. Ping, Z., Ren, H., Zou, J., Sheng, Y. and Bo, W. (2007) Generic Orthogonal Moments: Jacobi-Fourier Moments for Invariant Image Description. Pattern Recognition, 40, 1245-1254.
http://dx.doi.org/10.1016/j.patcog.2006.07.016

5. Hoang, T. and Tabbone, S. (2013) Errata and Comments on “Generic Orthogonal Moments: Jacobi-Fourier Moments for Invariant Image Description”. Pattern Recognition, 46, 3148-3155.
http://dx.doi.org/10.1016/j.patcog.2013.04.011

6. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P. and Witten, I. (2009) The Weka Data Mining Software: An Update. ACM SIGKDD Explorations Newsletter, 11, 10-18.
http://www.cs.waikato.ac.nz/ml/weka/
http://dx.doi.org/10.1145/1656274.1656278

7. Rosenblatt, F. (1962) Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. 1st Edition, Spartan Books, Michigan University, Ann Arbor.
