Applied Mathematics, 2011, 2, 1258-1262
doi:10.4236/am.2011.210175 Published Online October 2011 (http://www.SciRP.org/journal/am)
Copyright © 2011 SciRes. AM
Convergence Rates of Density Estimation in Besov Spaces
Huiying Wang
Department of Ap pl i e d M athematics, Beijing University of Technology, Beijing, Ch ina
E-mail: b200806005@emails.bjut.edu.cn
Received July 29, 201 1; revised August 23, 2011; accepted August 30, 2011
Abstract
The optimality of a density estimation on Besov spaces
,
s
rq
BR for the risk was established by
Donoho, Johnstone, Kerkyacharian and Picard (“Density estimation by wavelet thresholding,” The Annals of
Statistics, Vol. 24, No. 2, 1996, pp. 508-539). To show the lower bound of optimal rates of conver-
gence
p
L
,,
s
nrq
R
Bp, they use Korostelev and Assouad lemmas. However, the conditions of those two lemmas
are difficult to be verified. This paper aims to give another proof for that bound by using Fano’s Lemma,
which looks a little simpler. In addition, our method can be used in many other statistical models for lower
bounds of estimations.
Keywords: Optimal Rate of Convergence, Density Estimation, Besov Spaces, Wavelets
1. Introduction
Wavelet analysis has many applications, one of which is
to estimate an unknown density function based on inde-
pendent and identically distributed (i.i.d.) random sam-
ples. Let be a probability measurable space
and 1
(,,)P
,,n
X
X be i.i.d. random variables with an un-
known density function f. We use to denote the
expectation of a random variable X. The sequence

EX

,:inf sup
n
n
n
p
ffV
RVpE ff
 is called optimal rate
of convergence on the functional class V for the
p
L risk.
Here, n
f
is an arbitrary estimator of f with n i.i.d. ran-
dom samples. Kerkyacharian and Picard [1] study
n when V is a Besov space with matched case.
Donoho, Johnstone, Kerkyacharian and Picard [2] con-
sider unmatched cases. In fact, they show the optimal
convergence rates for Besov class
,p
RV
r
L,
s
rq
B
and
p
L
risk



1/ 1/
21/1
,
21
ln, ,
21
,
,.
21
sr p
sr
s
nrq s
s
p
nnr s
RB pp
nr
s


(1.1)
To show the lower bou nd of (1.1), author s of [2,3] use
Korostelev and Assouad lemmas. However, the condi-
tions of those two lemmas are difficult to be verified. In
this small paper, we give another proof for the lower
bound of (1.1) by using Fano’s lemma [4]. It should be
pointed out that Fano’s lemma can be used to a variety of
statistical models, see [5-7].
As usual,
1
p
LRp denotes the classical Lebes-
gue space on the real line R. In particular,
2
LR
R
stands
for the Hilbert space, which consists of all square inte-
grable functions. As a subspace of p, the Sobolev
space with an integer exponent k means
L




:,,0,1,,1
m
k
pp
WRffLRmkp .

The corresponding norm

:.
k
p
k
Wp
p
fff
Moreover, the Besov space
, [3] ,
s
pq
BR (1 pq
,
s
n
and (0,1])
can be defined by
 


2
,,2 ,2
n
snjj
p
qpp
jZ
BR fWRfl
 
q
with the associated norm


,
2()
:2,2
sn
pq pq
jnj
p
BW lZ
ff f
 ,
where

2,:sup2 2.
pht
p
ftfxhfxh fx

In general, it can be shown that compactly supported and
n times differentiable functions belong to
,
s
pq
BR for
H. Y. WANG
1259
Where p and q are density functions of P Q respec-
tively.
Lemma 1.2. (Fano’s Lemma, [4]) Let be
pr
,
0
s
n and 1, .
pq
The Besov space can be discretized by the sequence
norm of wavelet coefficients. Many useful wavelets are
generated by scaling functions. More precisely, if
is
a scaling function with
 
22 ,
k
k
x
hxk


then

122
k
:1
k
k
x
hx
k


defines a wave-
let [3]. Clearly, when
is compactly supported and
continuous, the corresponding wavelet
has the same
properties. An orthonormal wavelet basis of
2
LR is
generated from dilation and translation of a scaling func-
tion and its corresponding wavelet, i.e.




0
0
0
2
2
,
:2 2,
:2 2.
jj
jj
0
jk
jk
j
jkZ
xxk
xxk





Although wavelet basis are constructed for
2
LR,
most of them constitute unconditional bases for
p
LR.
A scaling function
is called t regular, if
has con-
tinuous derivatives of order t and its corresponding
wavelet
has vanishing moments of order t, i.e.

k
xx
d0, 0,1,,1x kt
.
The following lemma [3] plays important roles in this
paper.
Lemma 1.1. Let
be a compactly supported, t regu-
lar orthonormal scaling function with the corresponding
wavelet
and0
s
t. If

p
f
LR, 00kk
:,,sf
:,
j
kjk
and 1, df
pq
, then the following
two conditions are equivalent:
1)

,
s
pq
f
BR;
2)
11
2
0
0
2js pj
pp
jq
sd











.
Furthermore,
,
11
2
0
0
2
s
pq
js pj
Bpp
jq
fsd










.
Before introducing Fano’s Lemma, we need the nota-
tion of Kullback-Leilber distance [4]. Let P and Q with P
being absolutely continuous with respect to Q (denoted
by ). Then the Kullback-Leilber distance is de-
fined by
PQ
 

0
,:ln d
pq
px
(,, )
k
P
obability measurable spaces and k
A, 0,1, ,km
.
If k
A
,
K
PQp xx
qx

v
A
kv
, then with c
A
standing for for
the complement of A and

,
kv
v
0
1
:inf
m
vm k
K
PP
,
m

 
1
1,exp 3.
2m
m e
0sup min
c
kk
km
PA


By Lemma 1.1 and 1.2, we can show the following re-
sult:
Theorem 1.1. Let

,,
s
rq
f
BRL with, 1rq
,
1p
 and 1sr . If n
f
is an estimator of f with
n i.i.d. random samples, then



,
1/ 1/
21/1 21
,
ln
supmax, ,
s
rq
sr ps
sr s
np
fB RL
n
Ef fn






n



where
 
,
,,
,:, s
rq
ss
rqrq B
BRL fBRfL 
f
and
x
y means has compact support}; The notation
with a constant C.
rk 1.1. Note that
x
Cy
Rema
 
1/ 1/
21/1
ln 1/ 1/
21/1
21 ln
max ,
s
rp p
s
sr
n
n
 sr
sr
sn
n

n

 




for


21
p
r
s
and for 21
p
r
s

1/ 1/
1/1 21 21
max ,
srp
2
ln
s
s
sr
s
s
nn
n







.
Then theorem 1.1 is a reformulate of the loweund in
(1.1). By using the idea of reference [5], we show this
theorem in the next two sections.
irstly, we prove
n




r bo
2. Proof of Theorem 1.1
F



1/ 1/
21/1
ln .
s
,,
sup
s
rq
rp
sr
np
n
f fn





One need construct
fB RL
E
such that
,
s
kr
k
g
,q
g
BRL
and


1/ 1/
/1
.
21
ln
sup
s
rp
r
nk
p
kn


Let
s
n
Ef g


ono be a compactly supported, regular and
orthrmal scaling function,

tt s
be the corresponding
wavlet with suppe
0,l
, lN
. Here and after,
Copyright © 2011 SciRes. AM
1260
dgers. Th
H. Y. WANG
N enotes the set of positive inteen there ex-
ists a compactly supported density function g (i.e.

0gx and

g
) satisfying
 
,
s
rq
d1x x
g
xBR and


0
0, 0.
l
gx c

,2
j
lll l. ThenLet
elem

,2,,21
j
:0,
j

ents in the number of
j
enote
ed by [ is 2
j
5], one defi nes
1, dd by1. #2
j
j
 

2 1/
Motivat 1/
:2
j
sr and
j
a
 

2j
kjjk
kl
gx gxxIk
:,
j
a
:1
if 2
with 2
I

j
kl
j
kl, else

2:0
j
kl
I. Obvi-
ously, 2jl
g
g,

gx x

d1 and

1/s r
k
gx for large implies
that k
020c
 
jj, which
g
his a density function
e assumptions of for each k.
By t
, the wavelet
is com-
pported and pactly su
s
rq
Bt times differentiable. Therefore,
 
Rts
 and

,.
s
krq
,
g
BRBec ause

1/2 1/
21
js r
j
a , ,
s
rq
jjk
B
aC
and so is ,
s
rq
kB
g
Hence, due to Lemma 1.1.
,
s
krq

,
g
xBRL. Clearly,

2
1/ 1/
2:
j
kk kjjk
plp
p
jspr
j
p
gggga

 

(2.1)
For

2j
j
kk l
 due to
:2
1/2 1/
j
s
j
a
r. Fur-
thermore, :2
j
knk
p
Afg




satisfies kk
AA
. Recall that 1. By Lemmfor kk #2
ja 1.2,
j
 

2
13
supmin,2 expj
k
c j
gk





. Here and
,
2
j
ke
 
after
n
PA
n
f
P
thstands for the peasure correspond-robability m
oe density function ing t

2n1
:
n
f
xfxf
nn xfx.
It is easy to see that 0k
g
g
P
P from the constructions
of k
g
. Since n
f
is an estimator of density with n i.i.d.
ra ndom samp le s,


2
ik
jjj
nnc
nkg nkgk
pp
EfgPfgP A


 

.
22

Then,


2
supsup 2
13
min,2 exp.
22
k
j
k j
j
jnc
nk gk
pk
jj
EfgP A
e







(2.2)
Next, one shows12
0
2j
j
cna

 

12
1
12 1
02
,:ln d
nn
n
nn n
n
ff
fx
K
PPf xx
fx

,
11
nn
j
fx

1
j
f
x and

212
nn
j
j
f
xf


x. Then

 

111
12 112
12
,lnd,
i
nn ii
ii
fx
: Recall that
.
n
K
PPfxx nKPP
fx

Note that

 

1
11
12 12
,:ln d
fx
PP fxx
fx
ln 1uu
K
and
for en
0 Thu.



 

 
1
121 2
1
12
12
212
,lnd
1d
d.
nn fx
KP Pnfxx
fx
fx
nfx x
fx
nfxfx fxx






Hence,
2
2:inf 2,2,.
jkvk j
l
jj
jnn jnn
gg gg
vkv k
KP PKPP

 


Moreover,
 
12
22d
j
j
jk
k
ngxgxgx


.x (2.3)
ng to the definition of Accordik
g
,

supp 0,
k
g
gl
and
0
g
xc
on
0,l. Thus,
 
12
k
g
xgxg
x
 
22
11212
00
2
d
0
j
jkj jkj
xxc axca


by the
orthonormality of
dx ca

. Then (2.3) reduces to
j
k
12
0
2.
j
j
cna
(2.4)
ke Ta

1
21
2ln
/1
11
2
22
2l
js r
j
na nn




.
s
r
n
. Then
j
n



n
Now, one can choose such that
lnC n
0C2
j
na
and
0
41Csr c2



. Therefore,

41/ 2
1
sr c
n

1
11
00
22
22 ln
jcC
jj
jn
eena
n






an ) reduces to d (2.2
jEf
supknk j
p
g C
 n . The
the desired follows from
1/
2
1/
j
sp
p
 )
r
j
by (2.1
and

1
21/1
2ln
sr
jn
n



.


,
21
,
sup
s
rq
s
s
np
RL
Ef fn
Now, we prove
fB
. Our
proof depends on another lemma [4].
Copyright © 2011 SciRes. AM
H. Y. WANG
1261
ov-Gilbert) Let Lemma 2.1. (Varsham


1
:,,
m

 ,

0,1
i
. Then there exists a sub-
set
0,,
M
of
/8m
with

00, ,0 such that
2
M
and

10.
8
mij
m
kk
k
ij
M

 
construct It is sufficient to

0,1, ,
i
g
i
M such
that

,,
is
rq
g
BRL
and

21
.
is
npn
(2.5) sup
iEf g
As proove, let
s
ved ab
be a compactly supported,
tt s regular on,and orthormal scaling functi on
be
uppthe corresponding wavelet with sN
0,l
, l
.
s

s
Assume,,
rq
g
BR

1/2 , :
j


:
L

and ne
:
j
a
0,,2,ll [0, ]0
|0
l
gc

1l and

. Defi
2
js ,2
j
j
jkjk
k
g
xgxa x


with


0,1 2
j
j
kk


 (note that 0
g
g). Since

0,1
k
, one knows that 2
j
r
j
k
k

nd a

1/
1/2 1/
21
j
j
r
r
rk
k
a






By Lemma 1.1,
.
js
,
s
jrq
kjk
B

j
k
aC

, and so is
,
s
rq
B
g
. Hence

,
s,rq
g
BRL.
at the supports ofNote th
j
k
for
j
k are mutu-
ally disjoint. Then

00
20
js
jjk
gxc ac

 
for big j. This with1
 
dd
g
xx gxx

 implies
that
g
is a density

2
0,1function for each
j
1, there exists . Ac-
cording to Lemma 2.
01
,,,
M

such
that 3
2
2
j
M
and
3
2.
li j
k

(2.6)
j
kk

Because suppjk
supp jk

for
j
kk

, one
knows that

1
2
li j
pp
pl i
jk k
k
p
j
p
jk
p
p
p
sp jli
kk
k
p
gg a




This with (2.6) and



,
l
k

0,1
i
k
leads to
3
2
li pj
p
g

 and
2
pps
p
g
1/
82:
li psj
j
p
p
gg


 . (2.7)
Clearly, the sets

0,1,,
2
ii
j
n
A
fgi M


 


satisfy li
AA

for . Then Fano’s Lemma
yi
il
elds


1
0
1
supmin,expM3.
2
i
i
nc
gM
iM
PA e



(2.8)
On the other hand, it follows 12
2
0
j
M
c
f of (2.4). Take
j
na from
the similar arguments to the proo
1
21
2j
s
n
. Then
choose a c0 suhat
2(21)
21
sj
j
naa 
. Hence, one can
onstant ch t
C
12 1
44
00
22
22
2e2e1
jj
jj
j
Mcna cC
Me

 .
Therefore, (2.8) reduces to

0sup i
i
nc
g
iM
PA C

0 and
0supi
np
iM
Efg

0sup .
22
j
iM C
i
i
jj
n
gn p
Pf
g




This with (2.7) and 1
21
2j
s
n
yield (2.5).
3.
The author Huiying Wang is grateful to the referees for
their valuable comments and thanks her advisoProfes-
sor Youming Liu, for his helpful guidance. Th work is
supported by the National Natural Scien ce Foundation of
China (No. 10871012) and Natural Science Foundation
of Beijing (No. 1082003) .
] G. Kerkyacharian and D. Picard, “Density Estimation in
Acknowledgements
r,
is
4. References
[1 Besov Spaces,” Statistics & Probability Letters, Vol. 13,
No. 1, 1992, pp. 15-24.
doi:10.1016/0167-7152(92)90231-S
[2] D. L. Donoho, I. M. Johnstone, G. Kerkyacharian and D.
Picard, “Density Estimation by Wavelet Thresholding,”
The Annals of Statistics, Vol. 24, No. 2, 1996, pp. 508-
539. doi:10.1214/aos/1032894451
Kerkyacharian, D. Picard and A. B. Tsy-
lets, Approximation and Statistical Appli-
cations,” Springer-Verlag, New York, 1997.
mir Zaiats, Springer
rk, 2009.
[3] W. Härdle, G.
bakov, “Wave
[4] A. B. Tsybakov, “Introduction to Nonparametric Estima-
tion,” (English) Revised and Extended from the 2004
French Original, Translated by Vladi
Series in Statistics, Springer, New Yo
[5] P. Baldi, G. Kerkyacharian, D. Marinucci and D. Picard,
“Adaptive Density Estimation for Directional Data Using
Needlets,” The Annals of Statistics, Vol. 37, No. 6A,
Copyright © 2011 SciRes. AM
H. Y. WANG
Copyright © 2011 SciRes. AM
1262
9-AOS6822009, pp. 3362-3395. doi:10.1214/0
05.010
[6] C. Christophe, “Regression with Random Design: A
Minimax Study,” Statistics & Probability Letters, Vol. 77,
No. 1, 2007, pp. 40-53.
doi:10.1016/j.spl.2006.
[7] A. B. Tsybakov, “Optimal Rates of Aggregation,” COLT/
Kernel 2003 Lecture Notes in Artificial Intelligence 2777,
Springer, Heidelberg, 2003, pp. 303-313.