Paper Menu >>
Journal Menu >>
Journ al of Software Engineering and Applications, 2012, 5, 200-203 doi:10.4236 /j sea.2 012.512b038 Published Online December 2012 (http://www.SciRP.org/journal/jsea) Copyright © 2012 SciR es. JSEA Image Classification using Statistical Learning Methods Jassem Mtimet, Hamid Amiri Signal, Image and Technology of Information Laboratory, National Engineering School of Tunis, Tunis El Manar University, BP 37, Le Belvdre 1002, Tunis, Tunisia. Email: mtimat. jasse m@y a hoo .fr, hamidlamiri@gmail.com Received 2012 ABSTRACT In gene ral, d igital i mages can b e classi fied into photo grap hs, text ual and mixed d ocu me nts. T his taxo nomy is very use- ful in many applications, such as archiving task. However, there are no effective methods to perfor m this classification automatically. In this paper, we present a method for classifying and archiving document into the following semantic classes: photographs, textual and mixed documents. Our method is based on combining low-level i mage fea tures, s uch as mean, Standard deviation, Skewness. Both the Decisio n Tree and Neuronal Network Classifiers are used for classifi- cation task. Keywords: Image Classification; Decision Tree; Neuronal Network; Statistica l Ana l ysi s 1. Introduction Nowadays, a huge number of documents are available in electronic format, whether as photos, p lans, letter s or pres s releases. With the continuous increase of the amount of such information, many applications for organizing this flood of documents are emerging. Amongst them, auto- matic image archiving systems are necessary to classify and to store a large collection of documents autono- mously, to simplify searching and retrieving individual documents. Recently automatic semantic classification and arc- hiving of images has become an important field of re- search, aiming to automatically classify images, i.e. clas- sification of images into significant categories, such as outdoor/indoor, city/landscape and people/non-people scene s [1,2]. In order to classify images into two classes (in- door/outdoor, city/landscape, etc.) Vailaya et al. use a Bayesian framework and obtain an average accuracy of 94.1% [3]. In [4] Gorkani et al. suggest an image classification method based on the most dominant orientation in the image’s texture. In fact, this feature a llows dif fere ntiating two final classes of images: city and landscape. Thus, they achieve a classification accuracy of 92.8%. Another approach was proposed by Prabhakar et al. in [5]. They used three low-level image descriptors (color, texture and edge information) to separate pictures and graphic images. Their algorithm reaches an accuracy rate of 96.6%. In [6] Schettini et al. aim to classify images into four classes (photographs, graphics, text and mixed docu- ments). Therefore, from every image, they extract six features which represent color descriptor, edge represen- tation, texture features, wavelet coefficients and skin color pixels percentage. This paper presents a system able to automatically classify a nd ar chivin g d oc u me nts i nto the fo ll o win g t hre e categories: photos, textual documents and mixed docu- ments. In Section 2, theoretic background of our approach is explained. Then in section 3, the experience plan is de- scribed, including data sets, experimental results and evaluation criteria, while in Section 4, results are dis- cussed and new perspectives are suggested. 2. Proposed System The system we propose allows discriminating documents into photographs, textual and mixed documents. It is based on two main stages (F ig ure 1): i) The features extraction: These features are extracted automatically from images using specific programs. For every single image, the values of these features will be used as coeffi- cients of a representative vector. ii) The classification and archiving mo dule : This i s o btaine d after trai ning a nd validating a model used to discriminate and store docu- ments. 2.1. Features Extraction Features selection is the key step leading to the success or failure of the classification phase. Therefore, several Image Classification using Statistical Learn ing Methods Copyright © 2012 SciR es. JSEA 201 features are tested, looking to their relevance. In fact, features selection is an empiric process, though many approaches are suggested to weight their importance. In our system, images are classified based on six low-level featured, these features are considered as the coefficients of the image representative vector. They are calculated as follows: ● Mean: is the average color value in the image. 1 1N i ij j µP N= = ×∑ (1) Were i represent the color channel and Pij is the pr ob- ability of occurrence of pixel wit h intensity j. ● Standard deviation: is the square root of the va- riance of the distr ibution ( ) 2 1 1 N iij i j P N σµ = = − ∑ (2) ● Skewn e ss: represents the measure of the degree of asymmetr y in t he distribution. ( ) 3 3 1 1N iij i j sP N µ = = − ∑ (3) ● Entropy: represent the disorder or the complexity of the image. A high value of entropy indicates a complex textur es. 22 1 log log N i ijij j EP P = = −∑ (4) ● Image dimension: represents the length and width of the image. 2.2. Classification Stage After the extraction of the representative vector for each image, every document is classified as a photo, text or a mixed one. Photo family included indoor, outdoor, Training Doc uments Testing Doc uments Extraction of image features Training the classifier Classification model validation Features extraction Classified Images Figure 1. Impleme ntation strategy. scenes, landscape, people, logos, and maps. Text family includes scanned and computer-generated text in various fonts. Mixed documents are documents that contain text and photo region. Thus, two well known classifiers are used to classify our doc ument s na mel y the De cisio n tre e a nd the Ne uron- al Network [7,8]. The Decision Trees The Decision Tree Classifier is a set of hierarchical rules which are successively applied to the input data [9]. Those rules are thresholds used to split the data into two binary nodes. Each node is such that the descendant nodes contain more homogeneous data samples. Many features can be input into the Decision Tree to refine class description. A split is chosen because of its ability to render the nodes purer based on a purity measure and can be determined by any single feature [10]. In our paper we fitted the DT to the training data using the cross validation technique in order to select the best tree. Thus, we obtained two tree-based models (original, pruned) that we re used in the classification task. The Artificial Neuronal Network A neural network is a set of connected units (nodes, neurons). Each node has an input and output then it can be connects with other nodes. Each connection has a weight associated to it. The topology of the neural net- work, the training methodology and the connections be- tween the different nodes define the type of the corres- ponding Ne uronal Network [11-13]. In our case we used an RBF network. In which the input layer had 6 nodes that are equal to the number of features organized as vectors in the database. For the hidden layer, we chose 6 node s while the outp ut l ayer co ntain s thre e node s. B y the end of this process, an input image is classified either as a photo, a pure text or a compound documen t. 3. Experimental Results A data base of 291 documents was considered for both classification systems. From this set of documents 75% were used for training and 25% for testing the system performance. Thus, the training data set consists of 136 photo including indoor, outdoor, scenes, landscape im- ages documents, 39 textual documents include scanned and computer-generated text in various font and 51 compound documents. Figure 2 shows some of the class images from the training data set. In order to evaluate the accuracy of our approach, the following statistical coef ficients are c omputed [14 ][15]: ● The recall rate= CCI/TI ● The precision rate= CCI/(TI+MI) ● F-measure= ( ) ( ) 2 2 1Precision Recall Precision Recall b b +⋅ ⋅ ⋅+ . Here, b Image Classification us in g Statistical Learning Met hod s Copyright © 2012 SciR es. JSEA 202 equals 1. CCI represents the number of Correctly Classified Images. MI is the number of Misclassified Images and TI is the number of Test Images for each class. Figure 3 presents the results obtained by using the Decision Tree. We can see that only for textual docu- ments the full Decision Tree achieve high F-measure value than the p rune d one . The results obtained using the neural network as clas- sifier are presented in Fig ure 4. These results show that both classifiers achieve notable results in the classifica- tion of documents. The DT classifier outperforms the NN classifier in execution speed and Recall value (by 12%). There are some cases of misclassification produced by the both classifiers. Figure 5 shows examples of these images. The main causes of misclassification on text are due to bad lighting conditions and to excessively noisy back- grounds that cause the final unifor mity te st to fail. Figure 2 . Examples of training data set images. Figure 3. Classification results using DT. Figure 4. Classification results using NN. Figure 5 . Samples of mis classifi ed images. 4. Conclusions Automatic classification and archiving of images is an emerging research field in image processing. In this pa- per an algor ithm for cla ssifyi ng phot o, text ual a nd mixe d documents based on low-level image features was pre- sented. Firstly, features are extracted from images to be assigned to a characteristic vector. Then, the Decision Tree and the neuronal Network classifiers are used to train and to validate a classification model using the ex- tracted feature vectors. The obtained models allowed reaching an accuracy rate of 96% for discriminating a photo, a text and a mixed document. Nevertheless, features relevance is weighted to select the most contributory ones, in order to increase classifi- cation and archiving performance. Moreover, we are curr ently stud ying othe r usefu l high-level feature to raise the accuracy and to build a new intelligent classifier. REFERENCES [1] Chih-Fong Tsai, On Classifying Digital Accounting Documents, The International Journal of Digital Ac- counting Research, Vol. 7, N. 13, pp. 53-71, 2007 [2] S.J. Simske, Low-resolution photo/drawing classification: metrics, method and archiving optimization, Proceedings IEEE ICIP, IEEE, Genoa, Italy, pp. 534-537, 2005. [3] V ai l aya, A., Figueiredo, M., A. Jain, and H. J. Zhang, Bayesian framework for hierarchical semantic classifica- Image Classification using Statistical Learn ing Methods Copyright © 2012 SciR es. JSEA 203 tion o f vacation i mages, Proceed ings of th e IEEE Intern a- tional Conference on Multimedia Computing and Sys- tems (ICMSC), pp. 518- 523, Floren ce, Italy, 1999. [4] M. M. Gorkani and R. W. Picard, Texture orientation for sorting photos ’at a Glance’, Proc. ICPR, pp. 459-464 Oct. 1994 [5] S. Prabhakar, H. Cheng, J.C. Handley, Z. Fan Y.W. Lin, P icture-graphics Color Image Classification, Proc. of ICIP, pp. 785-788, 2002. [6] R. Schettini, C. Brambilla, G. Ciocca, Valsasna,M. De Ponti, A hierarchical classification strategy for digital documents, Pattern Recognition, vol 35, pp. 1759-1769, 2002. [7] Olivier Bousquet, Stéphane Boucheron, and Gabor Lugosi, Introduction to Statistical Learning Theory, Advanced Lectures on Machine Learning, pp.169-207, 20 03 [8] S. B. Kotsiant is, Supervised M achine Learni ng: A Review of Classification Techniques, Informatica journal, Vo- lume 31, Number 3, pp. 249-268, 200 7. [9] Jay Gao, Decision Tree Image Analysis, Digital Analysis of Remotely Sensed Imagery book, The McGraw-Hill Companies, Inc. pp.351-388, 200 9. [10] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, New York: Chap- man & Hall, 1984. [11] G.P. Zhang, Neural Network for classification: A Survey, IEEE Transaction on Systems, Man and Cybernetics-Part C: applications and reviews, Vol.30, no. 4, pp. 451-462, 2000. [12] Ajith Abraham, Artificial Neural Networks, Handbook of Measuring System Design, Peter Sydenham and Richard Thorn (Eds.), John Wiley and Sons Ltd., London, pp. 901-908, 2005. [13] Hyontai Sug, Performance Comparison of RBF networks and MLPs for Classification, Proceedings of the 9th WSEAS International Conference on applied Informatics and Communications (AIC ’09), pp.450-454, 2009. [14] Lamiroy, Bart and Sun, Tao, Precision and Recall Without Ground Truth, In Ninth IAPR International Workshop on Graphics RECognition – GREC 2011, Seoul, Core, sep. 2011. [15] John Makhoul and Francis Kubala and Richard Schwartz and Ralph Weischedel,Performance Measures For Infor- mation Extraction, In Proceedings of DARPA Broadcast News Workshop, pp. 249-252,1999. |