Vol.2, No.1, 49-53 (2010) Natural Science
http://dx.doi.org/10.4236/ns.2010.21007
Copyright © 2010 SciRes. OPEN ACCESS
Face recognition based on manifold learning and
Rényi entropy
Wen-Ming Cao, Ning Li
Intelligent Information Processing Key Laboratory, Shenzhen University, Shenzhen, China; wmcao@szu.edu.cn
Received 10 September 2009; revised 29 October 2009; accepted 9 November 2009.
ABSTRACT
Although manifold learning has been successfully applied in many areas, such as data visualization, dimensionality reduction and speech recognition, little research has combined information theory with geometrical learning. In this paper we explore this field and propose a new approach to face recognition, in which the intrinsic α-Rényi entropy of a face image, obtained through manifold learning, is used as the characteristic measure during recognition. The new algorithm is tested on the ORL face database, and the experiments yield satisfying results.
Keywords: Manifold Learning; Rényi Entropy; Face
Recognition
1. INTRODUCTION
Face recognition has become a research hotspot in the fields of image processing, pattern recognition and artificial intelligence in recent years. Numerous research papers appear in major international publications, and a great deal of capital and manpower is invested in this research and in the development of related application systems. However, the performance of face recognition can be influenced by many factors, such as illumination, pose, age, facial expression, image resolution, and noise; these complicate the computational processing of face recognition and make it a challenging task.
The existing face recognition methods can be roughly classified into two categories [1]: local feature based and global feature based. A local feature based method represents a human face by extracted feature vectors (eyes, nose, mouth, hair, and face contours) and designs classifiers over them to perform recognition. A global feature based method, on the other hand, employs whole images as the input feature vectors, from which low-dimensional features are extracted by learning algorithms. The main difference between the two categories lies in how the features are obtained: in a local feature based method, the features are designed entirely by the algorithm designers, while in a global feature based method, the features are extracted or learned automatically by self-learning algorithms.
Global feature based (learning based) methods can be further divided into two classes: 1) statistical learning, such as artificial neural networks (ANN) [2-4], support vector machines (SVM) [5,6], and Boosting [7,8]; 2) manifold learning (or dimensionality reduction), including linear methods such as PCA [9,10] and LDA [11,12], and nonlinear methods such as Isomap [13], LLE [14], and Laplacian Eigenmaps [15,16].
In recent years, nonlinear dimensionality reduction (NLDR) methods have attracted great attention due to their capability to deal with nonlinear and curved data [1]. All these algorithms rely on the assumption that the image data lie on or close to a smooth low-dimensional manifold embedded in a high-dimensional input image space. A major limitation of NLDR algorithms is the difficulty of estimating the intrinsic dimension of the manifold. LLE and Laplacian Eigenmaps give no method for estimating the intrinsic dimension; Isomap offers a simple way, searching for the "elbow point" where the residual error decreases significantly. However, for some real data it is difficult to find an obvious "elbow point" that indicates the intrinsic dimension.
The intrinsic dimensionality estimation of a data set is a classical problem in pattern recognition. From a mathematical point of view, the intrinsic dimension of a manifold is the dimension of the vector space that is homeomorphic to local neighborhoods of the manifold; in other words, the intrinsic dimension describes how many "degrees of freedom" are necessary to generate the observed data. When the samples are drawn from a large population of signals, one can interpret them as realizations of a multivariate distribution supported on the manifold. The intrinsic entropy of random samples obtained from a manifold is an information theoretic measure of the complexity of this distribution. The entropy is finite when the distribution is Lebesgue integrable on the lower dimensional manifold.
In 2003, Costa and Hero proposed an algorithm that can jointly estimate both the intrinsic dimension and the intrinsic entropy of a dataset randomly distributed on a smooth manifold [17,18]. The algorithm constructs the Euclidean k-NN graph over all the sample points and uses its growth rate to estimate the intrinsic dimension and entropy through simple linear least squares and a method-of-moments procedure. This approach allows the desired quantities to be estimated by algorithms of low computational complexity that avoid reconstructing the manifold or estimating multivariate distributions.
Since one object can form different face images under varying conditions of distance, direction, illumination or pose, the corresponding high-dimensional dataset of those images can be viewed as a nonlinear low-dimensional embedded manifold determined by factors such as illumination, location, scale, and pose. Based on Costa's algorithm, we take the Rényi entropy as the characteristic feature and present a novel face recognition method: Face Recognition Based on Manifold Learning and Rényi Entropy.
2. MANIFOLD LEARNING USING
K-NEAREST NEIGHBOR GRAPHS
Costa's algorithm is based on minimal Euclidean graph methods. Let $X_n = \{X_1, \ldots, X_n\}$ be $n$ independent identically distributed (i.i.d.) random sample points in a compact subset of $\mathbb{R}^d$, with multivariate Lebesgue density $f$. $X_n$ is also called the set of random vertices of a minimal Euclidean graph. First, a Euclidean k-nearest neighbors (k-NN) graph is constructed over all the sample points. Let $N_{k,i}$ be the set of k nearest neighbors of $X_i$ in $X_n$. Then the total edge length functional of the Euclidean k-NN graph is:

$$L_\gamma(X_n) = \sum_{i=1}^{n} \sum_{X \in N_{k,i}} d(X, X_i)^\gamma \qquad (1)$$
where $\gamma$ is a power weighting constant and $d(X, X_i)$ is the Euclidean distance between $X$ and $X_i$. The k-NN graph exhibits an asymptotic behavior of its total edge length functional:

$$\lim_{n\to\infty} \frac{L_\gamma(X_n)}{n^{(d-\gamma)/d}} = \beta_{d,\gamma}\, \exp\big((1-\alpha)H_\alpha(f)\big) \qquad (2)$$
where $\beta_{d,\gamma}$ is a constant independent of $f$, $\alpha = (d-\gamma)/d$, and

$$H_\alpha(f) = \frac{1}{1-\alpha} \log \int_{\mathbb{R}^d} f^\alpha(x)\, dx \qquad (3)$$

is the extrinsic Rényi α-entropy of the multivariate Lebesgue density $f$. The asymptotically unbiased and strongly consistent estimator of the α-entropy is therefore:

$$\hat{H}_\alpha(X_n) = \frac{1}{1-\alpha} \left[ \log \frac{L_\gamma(X_n)}{n^\alpha} - \log \beta_{d,\gamma} \right] \qquad (4)$$
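As a concrete illustration, Eq.1 and Eq.4 can be sketched in Python (assuming SciPy is available; the constant $\beta_{d,\gamma}$ is treated as a known input, e.g. obtained from a Monte Carlo calibration, and the function names are our own):

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_graph_length(X, k=3, gamma=1.0):
    """Total edge length L_gamma of the Euclidean k-NN graph over rows of X."""
    tree = cKDTree(X)
    # Query k+1 neighbours because the nearest "neighbour" of each point
    # is the point itself at distance 0; drop that first column.
    dists, _ = tree.query(X, k=k + 1)
    return float(np.sum(dists[:, 1:] ** gamma))

def renyi_entropy_estimate(X, beta, k=3, gamma=1.0):
    """Plug-in estimator of the Renyi alpha-entropy, alpha = (d - gamma)/d,
    for data of ambient dimension d; beta stands in for beta_{d,gamma}."""
    n, d = X.shape
    alpha = (d - gamma) / d
    L = knn_graph_length(X, k=k, gamma=gamma)
    return (np.log(L / n ** alpha) - np.log(beta)) / (1.0 - alpha)
```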
The convergence results for k-NN graphs extend from Euclidean spaces to general Riemannian manifolds. Let $(M, g)$ be a compact smooth Riemannian $m$-dimensional manifold. Suppose $Y_n = \{Y_1, \ldots, Y_n\}$ are i.i.d. random elements of $M$ with bounded density $f$ relative to the volume element $\mu_g$. Let $L_\gamma(Y_n)$ be the total edge length of the k-NN graph over $Y_n$. Assume $m \ge 2$, $1 \le \gamma < m$, and let $\alpha = (m-\gamma)/m$. Then

$$\lim_{n\to\infty} \frac{L_\gamma(Y_n)}{n^\alpha} = \beta_{m,\gamma} \int_M f^\alpha(y)\, \mu_g(dy) \qquad (5)$$

where $\beta_{m,\gamma}$ is a constant independent of $f$ and $M$. Accordingly, the asymptotically unbiased and strongly consistent estimator of the intrinsic α-entropy $H_\alpha^{(M,g)}(f)$ is:

$$\hat{H}_\alpha(Y_n) = \frac{1}{1-\alpha} \left[ \log \frac{L_\gamma(Y_n)}{n^\alpha} - \log \beta_{m,\gamma} \right] \qquad (6)$$
Define $l_n = \log L_\gamma(Y_n)$; the convergence theorem (5) suggests the log-linear model below:

$$l_n = a \log n + b + \varepsilon_n \qquad (7)$$

where

$$a = \frac{m-\gamma}{m}, \qquad b = \log \beta_{m,\gamma} + \frac{\gamma}{m} H_\alpha^{(M,g)}(f) \qquad (8)$$

and $\varepsilon_n$ is an error residual that goes to zero as $n \to \infty$.
Bootstrap resampling is used to estimate the mean graph length $E[L_\gamma(Y_n)]$, and linear least squares (LLS) is applied to jointly estimate the slope $\hat{a}$ and intercept $\hat{b}$ from the sequence $\{(\log n, \log E[L_\gamma(Y_n)])\}$. The estimates of dimension $\hat{m}$ and entropy $\hat{H}$ are then obtained by inverting the relations (8):

$$\hat{m} = \mathrm{round}\big(\gamma/(1-\hat{a})\big), \qquad \hat{H}_\alpha^{(M,g)} = \frac{\hat{m}}{\gamma}\big(\hat{b} - \log \beta_{\hat{m},\gamma}\big) \qquad (9)$$
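The LLS fit and the inversion in Eq.9 amount to only a few lines; the following sketch (our own naming; $\log \beta$ is passed in as a known calibration constant) shows the step:

```python
import numpy as np

def invert_log_linear(ns, mean_lengths, gamma=1.0, log_beta=0.0):
    """Fit l_n = a*log n + b by least squares over the (n, mean length)
    sequence, then invert Eq.8: m = gamma/(1-a), H = (m/gamma)*(b - log beta)."""
    a, b = np.polyfit(np.log(ns), np.log(mean_lengths), 1)
    m_hat = int(round(gamma / (1.0 - a)))
    H_hat = (m_hat / gamma) * (b - log_beta)
    return m_hat, H_hat
```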
The specific steps of the algorithm are as follows:

1) Use the entire database of signals $Y_n = \{Y_1, \ldots, Y_n\}$ to construct the k-NN graph.

2) Estimate the dimension and α-Rényi entropy of the manifold on which the sample sets lie:

a) Set the parameters $M$, $N$ and the subsample sizes $p_1, \ldots, p_Q$ ($1 \le p_1 < \cdots < p_Q \le n$);

b) Initialize $m \leftarrow 0$, $H \leftarrow 0$, and the outer counter $j \leftarrow 1$;

c) Choose $p$ from $p_1, \ldots, p_Q$ in turn:

i. Set $L \leftarrow 0$ and the inner counter $t \leftarrow 1$;

ii. Randomly select a subset of $p$ signals $y_p$ from $y_n$; compute the graph total edge length $L_\gamma(y_p)$; set $L \leftarrow L + L_\gamma(y_p)$;

iii. Set $t \leftarrow t + 1$; if $t \le N$, go to step ii, else go to step iv;

iv. Compute the sample average graph length $\hat{E}[\hat{L}(y_p)] = L/N$;

d) Use Eq.9 to estimate the dimension $\hat{m}$ and entropy $\hat{H}$ from $\{(p, \hat{E}[\hat{L}(y_p)])\}_{p=p_1}^{p_Q}$; set $m \leftarrow m + \hat{m}$, $H \leftarrow H + \hat{H}$;

e) Set $j \leftarrow j + 1$; if $j \le M$, go to step (c), else go to (f);

f) The estimated dimension is $\hat{m} = m/M$ and the estimated α-Rényi entropy is $\hat{H} = H/M$;

g) End.
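Steps 1)-2) above can be sketched end-to-end as follows (a minimal illustration assuming SciPy; the subsample sizes, counters, and the calibration constant $\log \beta$ are hypothetical inputs chosen by the caller):

```python
import numpy as np
from scipy.spatial import cKDTree

def costa_hero_estimate(Y, ps, k=3, gamma=1.0, N=5, M=1, log_beta=0.0, seed=0):
    """Resampling estimate of intrinsic dimension and Renyi entropy:
    average k-NN graph lengths over N random p-subsets for each p in ps,
    fit the log-linear model, invert via Eq.9, average over M trials."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    m_sum = H_sum = 0.0
    for _ in range(M):                       # outer loop, steps c)-e)
        mean_lengths = []
        for p in ps:                         # inner loop, steps i.-iv.
            total = 0.0
            for _ in range(N):
                idx = rng.choice(n, size=p, replace=False)
                dists, _ = cKDTree(Y[idx]).query(Y[idx], k=k + 1)
                total += np.sum(dists[:, 1:] ** gamma)
            mean_lengths.append(total / N)
        # step d): least-squares fit of the log-linear model, then inversion
        a, b = np.polyfit(np.log(ps), np.log(mean_lengths), 1)
        m_hat = max(1, round(gamma / (1.0 - a)))
        m_sum += m_hat
        H_sum += (m_hat / gamma) * (b - log_beta)
    return m_sum / M, H_sum / M              # step f): average over M trials
```

For i.i.d. samples drawn from a genuinely 2-dimensional set, the recovered dimension should come out close to 2.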
The parameters $M$ and $N$ provide a trade-off between the bias and variance of the estimators for finite $n$. $\beta_{m,\gamma}$ is the limit of the normalized length functional of the corresponding Euclidean entropic graph for a uniform distribution on the unit cube $[0,1]^m$. It can be determined by performing Monte Carlo simulations of the entropic graph length on the unit cube $[0,1]^m$ for uniform random samples.
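The Monte Carlo calibration just described might look like the following sketch (the function name, sample size and trial count are our own illustrative choices; SciPy is assumed):

```python
import numpy as np
from scipy.spatial import cKDTree

def estimate_beta(m, gamma=1.0, k=3, n=2000, trials=10, seed=0):
    """Monte Carlo estimate of beta_{m,gamma}: the limit of
    L_gamma(X_n) / n^((m - gamma)/m) for uniform samples on [0,1]^m."""
    rng = np.random.default_rng(seed)
    alpha = (m - gamma) / m
    vals = []
    for _ in range(trials):
        X = rng.random((n, m))               # uniform on the unit cube
        dists, _ = cKDTree(X).query(X, k=k + 1)
        vals.append(np.sum(dists[:, 1:] ** gamma) / n ** alpha)
    return float(np.mean(vals))
```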
Unlike previous solutions, Costa's algorithm comes with a proof of statistical consistency of the obtained estimators under weak assumptions: compactness of the manifold and boundedness of the (Lebesgue) sampling density supported on the manifold. The approach estimates the desired quantities with algorithms of low computational complexity ($O(n \log n)$) that avoid reconstructing the manifold or estimating multivariate distributions.
3. FACE RECOGNITION ALGORITHM
BASED ON MANIFOLD’S RÉNYI
ENTROPY
We applied the method to a real high-dimensional dataset with unknown manifold structure, intrinsic dimension and entropy: face images. We chose the "ORL Face Database" [19] of AT&T Laboratories Cambridge. It contains 10 different images of each of 40 distinct people (40 sample sets). The images were taken at different times, varying the lighting, facial expressions (open/closed eyes, smiling/not smiling) and facial details (glasses/no glasses).
The original images are in PGM format, 92×112 pixels, with 256 grey levels per pixel. We first converted each image to BMP format at 64×64 pixels and normalized the pixel values to [0, 1]. We then arranged each image into a 4096×1 vector using the common lexicographic order.
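The pre-processing step can be sketched as below (a simple nearest-neighbour downsampling is assumed here, since the paper does not specify the exact resizing method):

```python
import numpy as np

def preprocess(img):
    """Downsample a 112x92 (rows x cols) grey-level image to 64x64,
    scale pixel values to [0, 1], and flatten in lexicographic
    (row-major) order into a 4096-dimensional vector."""
    h, w = img.shape
    rows = (np.arange(64) * h // 64).astype(int)   # sampled row indices
    cols = (np.arange(64) * w // 64).astype(int)   # sampled column indices
    small = img[np.ix_(rows, cols)].astype(np.float64) / 255.0
    return small.reshape(4096)
```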
From each of the 40 sample sets $s_i$ ($i = 1, \ldots, 40$), we randomly chose 3, 4, 5, 6, 7 or 8 images as training samples; hence there were 40×3=120, 40×4=160, 40×5=200, 40×6=240, 40×7=280 and 40×8=320 samples in the training set respectively. All 400 images were treated as test samples. The $n$ training images of each sample set (a 4096×$n$ matrix, $n = 3, \ldots, 8$) were trained at one time to get an estimated Rényi entropy $\hat{H}_{s_i}$ according to the algorithm introduced in Section 2. This estimated entropy served as the characteristic measure for recognition: each test image $X$ was combined with the $n$ training images of each sample set to form a new sample matrix and get an estimated Rényi entropy $\hat{H}_{X,s_i}$; the classification criterion is:

$$X \in s_k, \qquad k = \arg\min_{i=1,\ldots,40} \big| \hat{H}_{X,s_i} - \hat{H}_{s_i} \big|$$

that is, $X$ is classified to the set where its combined entropy is nearest to the original entropy of the training set.
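The nearest-entropy classification rule itself is very compact; the sketch below (our own naming) assumes the per-set entropies have already been computed by the procedure of Section 2:

```python
import numpy as np

def classify(H_combined, H_train):
    """H_combined[i]: entropy estimated after appending the test image X to
    training set i; H_train[i]: entropy of training set i alone.
    Returns the index of the set whose entropy changes least."""
    gaps = np.abs(np.asarray(H_combined) - np.asarray(H_train))
    return int(np.argmin(gaps))
```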
4. EXPERIMENT RESULTS
The parameters used in the experiment were set as follows: $k = 3$, $M = 1$, $N = 1$, $n = 5, 6$. Figure 1 shows the 5 training images of face 1, and Figure 2 displays the results of running 20 simulations of Costa's algorithm
Figure 1. The 5 training images of face 1.
Figure 2. The estimated manifold dimension of face 1 lies between 5 and 6 (estimated dimension plotted against simulation run, 20 runs; reference lines at m = 5 and m = 6).
using the 5 training images shown in Figure 1. Under the same conditions, as shown in Figure 3, our Rényi entropy method gives slightly higher recognition accuracy than the PCA and 2DPCA methods on the ORL face database.
Table 1 shows part of the estimation results for the training set from the ORL face database. Using the face recognition algorithm of Section 3, the results of our method are shown in Table 2.

The results indicate that the face recognition algorithm based on the manifold's Rényi entropy presented in this paper works effectively. The k-NN graph is employed so that we avoid the complex estimation of geodesic distances on the manifold.

An important factor limiting the recognition accuracy is that the number of samples per set is rather small (only 10), which leads to a lack of information for training. Besides, some information loss during image pre-processing is inevitable. Further work includes adding bootstrap confidence intervals for the estimators and applying other classification methods to minimize the error.
Table 1. Dimension estimate $\hat{m}$ and entropy $\hat{H}$ for some training face images.

          $\hat{m}$   $\hat{H}$ (bits)
Face 1        5          20.856
Face 4        6          19.890
Face 7        6          21.052
Face 14       5          20.175
Face 17       5          18.706
Face 39       5          17.804
Table 2. The results of the face recognition algorithm based on the manifold's Rényi entropy.

Training samples   Computational complexity   Test samples   Error recognitions   Correct recognition rate
120                7                          400            118                  70.5%
160                8                          400            51                   87.2%
200                11                         400            19                   95.2%
240                13                         400            12                   97%
280                18                         400            10                   97.5%
320                21                         400            7                    98.2%
Figure 3. Recognition accuracy of PCA, 2DPCA, and the Rényi entropy method.
5. CONCLUSIONS
We have presented a new face recognition algorithm based on the manifold's Rényi entropy. The algorithm was applied to the ORL face database, and the experiments yielded satisfying results. At present we are studying the theoretical proof of this method and trying to apply it further in other research fields of pattern recognition and artificial intelligence.
6. ACKNOWLEDGMENTS
This work is supported by the National Science Foundation of China (No. 60871093) and the Pre-Research and Defense Fund of China (No. 9140C80002080C80).
REFERENCES
[1] Lin, T. (2005) Machine perception of human faces: a grand challenge. The Korea Foundation for Advanced Studies (KFAS).
[2] Rowley, H.A., Baluja, S. and Kanade, T. (1998) Neural network-based face detection. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(1), 23-38.
[3] Feraud, R., Bernier, O.J., Villet, J.E. and Collobert, M. (2001) A fast and accurate face detector based on neural networks. IEEE Trans. Pattern Analysis and Machine Intelligence, 22(1), 42-53.
[4] Garcia, C. and Delakis, M. (2004) Convolutional face finder: a neural architecture for fast and robust face detection. IEEE Trans. Pattern Analysis and Machine Intelligence, 26(11), 1408-1423.
[5] Osuna, E., Freund, R. and Girosi, F. (1997) Training support vector machines: an application to face detection. Proc. IEEE Conf. Computer Vision and Pattern Recognition, 130-136.
[6] Phillips, P.J. (1998) Support vector machines applied to face recognition. Adv. Neural Inform. Process. Syst., 11, 803-809.
[7] Viola, P. and Jones, M.J. (2004) Robust real-time face detection. Int. J. Computer Vision, 57(2).
[8] Li, S.Z. and Zhang, Z. (2004) FloatBoost learning and statistical face detection. IEEE Trans. Pattern Anal. Mach. Intell., 26(9), 1112-1123.
[9] Kirby, M. and Sirovich, L. (1990) Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Trans. Pattern Anal. Mach. Intell., 12(1), 103-108.
[10] Turk, M. and Pentland, A. (1991) Eigenfaces for recognition. J. Cogn. Neurosci., 3, 72-86.
[11] Belhumeur, P.N., Hespanha, J.P. and Kriegman, D.J. (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711-720.
[12] Kim, T.K. and Kittler, J. (2005) Locally linear discriminant analysis for multimodally distributed classes for face recognition with a single model image. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(3), 318-327.
[13] Tenenbaum, J.B., de Silva, V. and Langford, J.C. (2000) A global geometric framework for nonlinear dimensionality reduction. Science, 290, 2319-2323.
[14] Roweis, S.T. and Saul, L.K. (2000) Nonlinear dimensionality reduction by locally linear embedding. Science, 290, 2323-2326.
[15] Belkin, M. and Niyogi, P. (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6), 1373-1396.
[16] He, X., Yan, S., Hu, Y., Niyogi, P. and Zhang, H. (2005) Face recognition using Laplacianfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(3), 328-340.
[17] Costa, J.A. and Hero, A.O. (2003) Entropic graphs for manifold learning. The Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 316-320.
[18] Costa, J.A. and Hero, A.O. (2004) Manifold learning using k-nearest neighbor graphs. Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Montreal, Canada.
[19] ORL Face Database, AT&T Laboratories Cambridge. http://www.cl.cam.ac.uk/Research/DTG/attarchive/pub/data/att_faces.zip