Advances in Pure Mathematics
Vol. 06 No. 07 (2016), Article ID: 67409, 9 pages
DOI: 10.4236/apm.2016.67036
An Efficient Adaptive Iteratively Reweighted l1 Algorithm for Elastic lq Regularization
Yong Zhang, Wanzhou Ye
Department of Mathematics, College of Science, Shanghai University, Shanghai, China
Copyright © 2016 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/
Received 12 May 2016; accepted 13 June 2016; published 16 June 2016
ABSTRACT
In this paper, we propose an efficient adaptive iteratively reweighted $\ell_1$ algorithm (A-IRL1 algorithm) for solving the elastic $\ell_q$ regularization problem. We prove that the sequence generated by the A-IRL1 algorithm is convergent for any rational $q \in (0,1)$ and that the limit is a critical point of the elastic $\ell_q$ regularization problem. Under certain conditions, we present an error bound for the limit point of the convergent sequence.
Keywords:
Compressed Sensing, Elastic $\ell_q$ Minimization, Nonconvex Optimization, Convergence, Critical Point
1. Introduction
Compressed sensing (CS) has emerged as a very active research field in recent years and has brought about great changes in signal processing [1] [2]. The main task of CS is the recovery of a sparse signal from a small number of linear measurements. It can be mathematically modeled as the following optimization problem:

$$\min_{x\in\mathbb{R}^n} \|x\|_0 \quad \text{subject to} \quad Ax=b, \qquad (1)$$

where $A\in\mathbb{R}^{m\times n}$ (commonly $m\ll n$) is a measurement matrix and $\|x\|_0$, formally called the $\ell_0$ quasi-norm, denotes the number of nonzero components of $x$. In general, it is difficult to tackle problem (1) due to its nonsmooth and nonconvex nature.
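Problem (1) is combinatorial: short of structural assumptions, exact recovery requires searching over supports. The following brute-force sketch (our illustration, not part of the paper; the name `l0_min_bruteforce` is ours) makes the exponential cost concrete on a tiny instance.

```python
import itertools
import numpy as np

def l0_min_bruteforce(A, b, tol=1e-10):
    """Solve min ||x||_0 s.t. Ax = b by enumerating supports (exponential in n)."""
    m, n = A.shape
    for s in range(n + 1):                      # try sparsity levels 0, 1, 2, ...
        for S in itertools.combinations(range(n), s):
            S = list(S)
            x = np.zeros(n)
            if s > 0:
                xS, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
                x[S] = xS
            if np.linalg.norm(A @ x - b) <= tol:
                return x                        # sparsest consistent solution
    return None

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 8))
x_true = np.zeros(8)
x_true[[1, 6]] = [1.5, -2.0]
print(l0_min_bruteforce(A, A @ x_true))         # recovers the 2-sparse solution
```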
In recent years, some researchers have proposed the $\ell_q$ norm regularization problem [3] - [5] with $0<q\le 1$, that is, to consider the following $\ell_q$ regularization problem

$$\min_{x\in\mathbb{R}^n} \|x\|_q^q \quad \text{subject to} \quad Ax=b, \qquad (2)$$

or the unconstrained regularization problem

$$\min_{x\in\mathbb{R}^n} \frac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_q^q, \qquad (3)$$

where $\|x\|_q^q=\sum_{i=1}^n |x_i|^q$ for $0<q\le 1$ and $\lambda>0$ is a regularization parameter.
When $q=1$, it is well known that problems (2) and (3) are both convex optimization problems and can therefore be solved efficiently [6] [7]. On the other hand, when $0<q<1$, problems (2) and (3) become nonconvex, nonsmooth and even non-Lipschitz optimization problems, which are difficult to solve quickly and efficiently. Iteratively reweighted algorithms, which include the iteratively reweighted $\ell_1$ algorithm [8] and iteratively reweighted least squares [9], are very effective for solving the nonconvex $\ell_q$ regularization problem.
In this paper, we consider the following elastic $\ell_q$ regularization problem:

$$\min_{x\in\mathbb{R}^n} \frac{1}{2}\|Ax-b\|_2^2 + \lambda_1\|x\|_q^q + \lambda_2\|x\|_2^2, \qquad (4)$$

where $\lambda_1,\lambda_2>0$ are two parameters. When $q=1$, the above problem (4) reduces to the well-known elastic-net regularization proposed by Zou and Hastie [10], which is an effective method for variable selection. In [10], Zou et al. showed that this method outperformed the Lasso [11] in terms of prediction accuracy in both simulation studies and real-data applications of variable selection. For further statistical properties of elastic-net regularization, we refer to [12] [13]. When $0<q<1$, problem (4) is an extension of elastic-net regularization from the $\ell_1$ penalty to an $\ell_q$ penalty. In statistics, elastic $\ell_q$ regularization is usually very effective for group variable selection.
Obviously, for $0<q<1$, the $\ell_q$ norm term in (4) is not differentiable at zero. Therefore, in this paper, we study the following relaxed elastic $\ell_q$ minimization problem with $\varepsilon>0$:

$$\min_{x\in\mathbb{R}^n} F_{\varepsilon}(x) := \frac{1}{2}\|Ax-b\|_2^2 + \lambda_1\sum_{i=1}^n \left(|x_i|+\varepsilon\right)^q + \lambda_2\|x\|_2^2. \qquad (5)$$

The model (5) can be considered as an approximation to the model (4) as $\varepsilon\to 0$. In order to solve problem (5), we propose the following adaptive iteratively reweighted $\ell_1$ minimization algorithm (A-IRL1 algorithm):

$$x^{k+1} = \arg\min_{x\in\mathbb{R}^n} \frac{1}{2}\|Ax-b\|_2^2 + \lambda_1\sum_{i=1}^n w_i^k|x_i| + \lambda_2\|x\|_2^2, \qquad (6)$$

where the weight $w^k$ is defined by the previous iterate and updated in each iteration as

$$w_i^k = q\left(|x_i^k|+\varepsilon_k\right)^{q-1}, \quad i=1,\dots,n.$$

The adaptive iterative update of $\varepsilon_k$ in the proposed algorithm is the same as the one in [9], which is also adopted in [14]. The A-IRL1 algorithm (6) solves a convex weighted $\ell_1$ minimization problem in each iteration, which can be handled by many efficient algorithms [6] [7] [15].
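Subproblem (6) is a weighted Lasso with an extra ridge term, so any standard $\ell_1$ solver applies. As a minimal sketch (ours, under the reconstruction of (6) above; the step size from the spectral norm of A is our choice), proximal gradient descent looks as follows:

```python
import numpy as np

def solve_weighted_subproblem(A, b, w, lam1, lam2, x0, iters=500):
    """Approximately solve (6):
       min_x 0.5*||Ax - b||^2 + lam1 * sum_i w_i*|x_i| + lam2 * ||x||^2
    by proximal gradient (ISTA with a weighted soft-threshold)."""
    L = np.linalg.norm(A, 2) ** 2 + 2.0 * lam2      # Lipschitz constant of smooth part
    t = 1.0 / L
    x = x0.copy()
    for _ in range(iters):
        grad = A.T @ (A @ x - b) + 2.0 * lam2 * x   # gradient of the smooth terms
        z = x - t * grad
        x = np.sign(z) * np.maximum(np.abs(z) - t * lam1 * w, 0.0)  # weighted prox
    return x
```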
The relaxed elastic $\ell_q$ regularization problem (5) can thus be solved by the A-IRL1 algorithm (6). In this paper, we prove that the sequence generated by the A-IRL1 algorithm (6) is convergent for any rational $q\in(0,1)$ in the case $\varepsilon_k\to\varepsilon^*>0$. Moreover, we present an error bound between the limit point and the sparse solution of problem (1).

The rest of this paper is organized as follows. In Section 2, we summarize the A-IRL1 algorithm for solving the elastic $\ell_q$ regularization problem (5). In Section 3, we present a detailed convergence analysis of the A-IRL1 algorithm (6): based on an algebraic method, we prove that the A-IRL1 algorithm is convergent for any rational $q$ with $0<q<1$, and, under certain conditions, we present an error bound between the limit point and the sparse solution of problem (1). Finally, a conclusion is given in Section 4.
2. A-IRL1 Algorithm for Solving Elastic $\ell_q$ Regularization

We give a detailed implementation of the A-IRL1 algorithm (6) for solving the elastic $\ell_q$ regularization problem (5). The algorithm is summarized as Algorithm 1.

In Algorithm 1, $r(x)$ denotes the rearrangement of the absolute values of $x$ in decreasing order. If $\varepsilon_k=0$, we choose $x^k$ to be the approximate sparse solution and stop the iteration. Otherwise, we stop the algorithm within a reasonable time and return the last iterate $x^k$.
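Algorithm 1 appears as an image in the published version and is not reproduced here. A minimal sketch of the loop (ours), assuming the weight update $w_i^k=q(|x_i^k|+\varepsilon_k)^{q-1}$ from Section 1 and an $\varepsilon$-update in the style of [9] (shrink $\varepsilon$ toward the $(s+1)$-st largest magnitude of the new iterate; the $1/n$ scaling is our assumption), is:

```python
import numpy as np
# reuses solve_weighted_subproblem from the previous sketch

def a_irl1(A, b, q, lam1, lam2, s, eps0=1.0, max_iter=100, tol=1e-8):
    """A-IRL1 sketch for the relaxed elastic lq problem (5)."""
    n = A.shape[1]
    x = np.zeros(n)
    eps = eps0
    for _ in range(max_iter):
        w = q * (np.abs(x) + eps) ** (q - 1.0)       # reweighting from current iterate
        x_new = solve_weighted_subproblem(A, b, w, lam1, lam2, x)
        r = np.sort(np.abs(x_new))[::-1]             # decreasing rearrangement r(x)
        eps = min(eps, r[s] / n)                     # adaptive update in the style of [9]
        if eps == 0.0 or np.linalg.norm(x_new - x) < tol:
            return x_new, eps                        # eps == 0: x_new is s-sparse
        x = x_new
    return x, eps
```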
It is clear from Algorithm 1 that $\{\varepsilon_k\}$ is a nonincreasing sequence which converges to some nonnegative number $\varepsilon^*$. In the next section, we prove that the sequence $\{x^k\}$ is convergent when $\varepsilon^*>0$, and that the limit is a critical point of problem (5) with $\varepsilon=\varepsilon^*$. Furthermore, we also present an error bound for the limit point.
3. Convergence of Algorithm 1
In this section, we first prove that the sequence $\{x^k\}$ generated by Algorithm 1 is bounded and asymptotically regular. Then, based on an algebraic method, we prove that Algorithm 1 is convergent for any rational $q$ with $0<q<1$. We begin with the following inequality.
Lemma 1. Given $\varepsilon>0$ and $0<q\le 1$, the inequality

$$(|u|+\varepsilon)^q \le (|v|+\varepsilon)^q + q(|v|+\varepsilon)^{q-1}\left(|u|-|v|\right) \qquad (7)$$

holds for any $u,v\in\mathbb{R}$.
Proof. We first define $f(t)=(t+\varepsilon)^q$ for $t\ge 0$. For any $t_1,t_2\ge 0$, by the mean value theorem, we have

$$f(t_1)-f(t_2) = f'(\xi)(t_1-t_2) = q(\xi+\varepsilon)^{q-1}(t_1-t_2) \qquad (8)$$

for some $\xi$ between $t_1$ and $t_2$. Since $f'$ is nonincreasing on $[0,\infty)$, the following inequality always holds, whether $\xi\ge t_2$ or $\xi\le t_2$:

$$f'(\xi)(t_1-t_2) \le f'(t_2)(t_1-t_2).$$

Letting $t_1=|u|$ and $t_2=|v|$, we thus have

$$(|u|+\varepsilon)^q - (|v|+\varepsilon)^q \le q(|v|+\varepsilon)^{q-1}\left(|u|-|v|\right). \qquad (9)$$

Rearranging the terms of (9), we get the desired inequality (7).
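A quick randomized sanity check of inequality (7) as stated above (the check itself is ours):

```python
import numpy as np

# Randomized check of inequality (7):
#   (|u| + eps)^q <= (|v| + eps)^q + q*(|v| + eps)^(q-1) * (|u| - |v|)
rng = np.random.default_rng(1)
for _ in range(100000):
    u, v = 10.0 * rng.standard_normal(2)
    eps = rng.uniform(0.01, 5.0)
    q = rng.uniform(0.01, 1.0)
    lhs = (abs(u) + eps) ** q
    rhs = (abs(v) + eps) ** q + q * (abs(v) + eps) ** (q - 1.0) * (abs(u) - abs(v))
    assert lhs <= rhs + 1e-9, (u, v, eps, q)
print("inequality (7) held on all samples")
```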
Our next result shows the monotonicity of the objective $F_{\varepsilon_k}$ along the sequence $\{x^k\}$, and that this sequence is also asymptotically regular.
Lemma 2. Let $\{x^k\}$ be the sequence generated by Algorithm 1. Then we have

$$F_{\varepsilon_{k+1}}(x^{k+1}) \le F_{\varepsilon_k}(x^k) - \lambda_2\left\|x^{k+1}-x^k\right\|_2^2. \qquad (10)$$

Furthermore,

$$\lim_{k\to\infty}\left\|x^{k+1}-x^k\right\|_2 = 0. \qquad (11)$$
Proof. Since $x^{k+1}$ is a solution of problem (6), the first-order optimality condition of (6) holds at $x^{k+1}$. Besides, we can write the subgradient of the objective of (6) as follows:

(12)

Hence, we find

(13)

which means that there exists $\eta^{k+1}\in\partial\|x^{k+1}\|_1$ such that

(14)

where $\eta_i^{k+1}=\operatorname{sign}(x_i^{k+1})$ if $x_i^{k+1}\neq 0$ and $\eta_i^{k+1}\in[-1,1]$ otherwise. We compute

(15)

Using (14), we have

(16)

Substituting (16) into (15) and using Lemma 1 yields

(17)

where the first inequality uses the optimality condition (14) and the monotonicity $\varepsilon_{k+1}\le\varepsilon_k$, and the last inequality uses Lemma 1. Therefore, from (17) we get the desired results (10) and (11).
From (10) in Lemma 2, we know that $\{F_{\varepsilon_k}(x^k)\}$ is monotonically decreasing and bounded below. Moreover, $F_{\varepsilon_k}(x^k)\to\infty$ whenever $\|x^k\|_2\to\infty$, owing to the term $\lambda_2\|x\|_2^2$. Therefore, the sequence $\{x^k\}$ is also bounded. On the other hand, from (11) we obtain that the sequence $\{x^k\}$ is asymptotically regular.
In order to prove that the whole sequence generated by Algorithm 1 is convergent, we need the following lemma, which plays an important role in the proof of convergence. The lemma states that for almost every system of $n$ polynomial equations in $n$ complex variables, if its corresponding highest-ordered system of equations has only the trivial solution, then the $n$ polynomial equations have a finite number of solutions. For a detailed proof, we refer to Theorem 3.1 in [16].

Lemma 3. ( [16] ) Let $P(z)=(p_1(z),\dots,p_n(z))=0$ be a system of $n$ polynomial equations in $n$ complex variables $z=(z_1,\dots,z_n)$, and let $Q(z)=0$ be its corresponding highest-ordered system of equations. If $Q(z)=0$ has only the trivial solution $z=0$, then $P(z)=0$ has exactly $d=d_1d_2\cdots d_n$ solutions (counting multiplicities), where $d_i$ is the degree of $p_i$.
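To see Lemma 3 in action on a small case (our example, not from [16]): for the system $x^2+y^2=1$, $xy=1/4$, the highest-order parts $x^2+y^2$ and $xy$ have only the trivial common zero, so the lemma predicts exactly $2\cdot 2=4$ solutions, which sympy confirms:

```python
import sympy as sp

x, y = sp.symbols('x y')
# Degrees are 2 and 2; the highest-order system {x^2 + y^2 = 0, x*y = 0}
# has only the trivial solution over C, so Lemma 3 predicts 2*2 = 4 solutions.
sols = sp.solve([x**2 + y**2 - 1, x*y - sp.Rational(1, 4)], [x, y])
print(len(sols))   # -> 4
```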
With the above lemmas, we are now in a position to present the convergence of Algorithm 1 for any rational number $q$ with $0<q<1$.

Theorem 1. For any rational $q\in(0,1)$, if the limit of $\{\varepsilon_k\}$ is $\varepsilon^*>0$, then the sequence $\{x^k\}$ generated by Algorithm 1 is convergent. Denote the limit by $x^*$, i.e., $\lim_{k\to\infty}x^k=x^*$. Moreover, the limit $x^*$ is a critical point of problem (5) with $\varepsilon=\varepsilon^*$.
Proof. From (10), we know that the sequence $\{F_{\varepsilon_k}(x^k)\}$ is monotonically decreasing and bounded below. Thus, we can infer that the sequence $\{x^k\}$ is also bounded. The boundedness of $\{x^k\}$ implies that there exists at least one convergent subsequence. We assume that $\{x^{k_j}\}$ is any one of the convergent subsequences of $\{x^k\}$, with limit $\hat{x}$. By (11), we know that the sequence $\{x^{k_j+1}\}$ also converges to $\hat{x}$. Now, replacing $x^{k+1}$, $x^k$, $w^k$ and $\varepsilon_k$ with $x^{k_j+1}$, $x^{k_j}$, $w^{k_j}$ and $\varepsilon_{k_j}$ in (14), respectively, and letting $j\to\infty$ yields

(18)

where $\eta^*\in\partial\|\hat{x}\|_1$ denotes the corresponding limit of the subgradients $\eta^{k_j+1}$.

The above Equation (18) demonstrates that the limit of any convergent subsequence of $\{x^k\}$ is a stationary point of problem (5) with $\varepsilon=\varepsilon^*$. In order to prove the convergence of the whole sequence $\{x^k\}$, one first needs to prove that the limit point set, denoted by $\mathcal{L}$, which contains all the limit points of convergent subsequences of $\{x^k\}$, is a finite set.
We classify the limit points by their sparsity: for each sparsity level $s$, consider the subset of $\mathcal{L}$ consisting of the limit points with exactly $s$ nonzero components. If every such subset is finite, then the limit point set $\mathcal{L}$ is also finite. Without loss of generality, we define a set

(19)

Furthermore, for any given support set $S$ with $|S|=s$, we define another set

(20)

From (19) and (20), the set in (19) is covered by finitely many sets of the form (20). Hence, if we prove that the set in (20) is finite, then the set in (19) is also finite, and we can further conclude that the limit point set $\mathcal{L}$ is a finite set.
For any limit point $\hat{x}\in\mathcal{L}$ with sparsity $s$, let $S$ denote the support set of $\hat{x}$, with $|S|=s$. By (18), we know that $\hat{x}$ satisfies the following equation:

(21)

where $\hat{x}_S$ denotes the subvector of $\hat{x}$ with components restricted to $S$ and $A_S$ denotes the submatrix of $A$ with columns restricted to $S$. Next, if we prove that Equation (21) has finitely many solutions, then we can conclude that the set in (20) is a finite set.
It is clear that (21) can be rewritten as follows:

(22)

where $A_S^TA_S+2\lambda_2 I$ is an $s\times s$ positive-definite matrix and $I$ is the $s\times s$ identity matrix. We observe that (22) can further be rewritten as follows:
(23)

where $D$ is an $s\times s$ diagonal matrix with diagonal entries $(|\hat{x}_i|+\varepsilon^*)^{q-1}$ for $i\in S$. Without loss of generality, we let $q=l_1/l_2$, where $l_1$ and $l_2$ are two positive integers. By a simple calculation on Equation (23), we get the following polynomial system:

(24)
Since all the solutions of Equation (21) satisfy (24), Equation (21) has finitely many solutions as long as (24) does. To this end, we show that the following system has finitely many solutions:

(25)

where the variables are obtained from (24) by the substitution above. Now, we extract the highest-order terms from system (25) to get the following system:

(26)

To prove that system (26) has only the trivial solution, we argue by contradiction. Without loss of generality, we assume that $z$ is a nonzero solution of (26) with $z_i\neq 0$ for $i=1,\dots,t$ and $z_i=0$ for $i=t+1,\dots,s$. By the assumption $z_i\neq 0$ for $i=1,\dots,t$, from (26) we can get the following equation:

(27)

where $B$ is the $t\times t$ leading principal submatrix of the matrix $A_S^TA_S+2\lambda_2 I$. Because the matrix $A_S^TA_S+2\lambda_2 I$ is positive definite, its leading principal submatrix $B$ is also positive definite, and thus we have $z_i=0$ for $i=1,\dots,t$. This contradicts the assumption that $z_i\neq 0$ for $i=1,\dots,t$. Therefore, the system (26) has only the trivial solution. According to Lemma 3, we deduce that the system (25) has finitely many solutions, which further implies that Equation (21) also has finitely many solutions; that is, the set in (20) is a finite set. Therefore, the limit point set $\mathcal{L}$ is a finite set.
Combining the finiteness of $\mathcal{L}$ with $\|x^{k+1}-x^k\|_2\to 0$ as $k\to\infty$, we thus obtain that the sequence $\{x^k\}$ is convergent. By the convergence of the sequence $\{x^k\}$ and (18), we obtain that the limit $x^*$ is a critical point of problem (5) with $\varepsilon=\varepsilon^*$.
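The algebraic mechanism used in the proof can be seen in a one-dimensional toy version (our illustration, with made-up coefficients): the stationarity condition of (5) contains the fractional power $(t+\varepsilon^*)^{q-1}$, and for rational $q=l_1/l_2$ a substitution turns it into a polynomial equation, which can have only finitely many roots.

```python
import sympy as sp

y = sp.symbols('y')
lam1, lam2 = sp.Rational(1, 10), sp.Rational(1, 2)
eps, c, q = sp.Rational(1, 100), 1, sp.Rational(1, 2)

# 1-D stationarity condition of (5) at t > 0 (illustrative coefficients):
#   (1 + 2*lam2)*t - c + lam1*q*(t + eps)**(q - 1) = 0
# For the rational exponent q = 1/2, substitute t = y**2 - eps
# (so that (t + eps)**(-1/2) = 1/y for y > 0) and multiply through by y:
poly = sp.expand(((1 + 2*lam2) * (y**2 - eps) - c) * y + lam1 * q)
print(sp.Poly(poly, y).degree())                  # 3: finitely many stationary points
print([r.evalf(5) for r in sp.real_roots(poly)])  # roots y; then t = y**2 - eps
```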
Theorem 1 gives a detailed convergence proof for Algorithm 1 based on an algebraic approach. Next, under the Restricted Isometry Property (RIP) on the matrix $A$, we present an error bound between the convergent limit and the sparse solution of problem (1). First of all, we recall the definition of the RIP.

Definition. For every integer $s\ge 1$, the $s$-restricted isometry constant $\delta_s$ of $A$ is the smallest positive quantity such that

$$(1-\delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta_s)\|x\|_2^2 \qquad (28)$$

for all subsets $T$ of cardinality at most $s$ and all vectors $x$ supported on $T$.
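For small sizes, $\delta_s$ can be brute-forced directly from the definition by scanning all supports $T$ and taking the extreme eigenvalues of the Gram matrices $A_T^TA_T$; a sketch (ours):

```python
import itertools
import numpy as np

def rip_constant(A, s):
    """Brute-force s-restricted isometry constant of A (exponential in s):
    max deviation from 1 of the eigenvalues of A_T^T A_T over supports |T| <= s."""
    n = A.shape[1]
    delta = 0.0
    for size in range(1, s + 1):
        for T in itertools.combinations(range(n), size):
            G = A[:, list(T)].T @ A[:, list(T)]
            eigs = np.linalg.eigvalsh(G)
            delta = max(delta, abs(eigs[0] - 1.0), abs(eigs[-1] - 1.0))
    return delta

rng = np.random.default_rng(2)
A = rng.standard_normal((20, 12)) / np.sqrt(20)   # columns with unit norm on average
print(rip_constant(A, 2))
```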
Under the RIP assumption, we can ensure that the limit is a reasonable approximation of the sparse solution if it has a very small tail, in the sense that $\sum_{i>s}r(x)_i$ is small, which is the error of the best $s$-term approximation of $x$ in the $\ell_1$-norm.
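The tail just described is simple to compute: sort the magnitudes, keep the $s$ largest, and sum the rest. In code (ours):

```python
import numpy as np

def best_s_term_tail(x, s):
    """l1 error of the best s-term approximation of x:
    the sum of all but the s largest magnitudes."""
    r = np.sort(np.abs(x))[::-1]   # decreasing rearrangement r(x)
    return r[s:].sum()

print(best_s_term_tail(np.array([3.0, -0.1, 0.02, 2.5, 0.0]), 2))   # 0.12
```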
With the concept of the RIP, we are able to prove the following theorem.

Theorem 2. Suppose that $x$ is an $s$-sparse solution of (1), so that $Ax=b$ and $\|x\|_0\le s$. Assume that $A$ satisfies the RIP of order $2s$ with a sufficiently small constant $\delta_{2s}$. For any fixed $q\in(0,1)$:

(1) When $\varepsilon^*>0$, the limit $x^*$ of the convergent sequence $\{x^k\}$ is a critical point of problem (5) with $\varepsilon=\varepsilon^*$, and it satisfies

(29)

(2) When $\varepsilon^*=0$, there must exist a subsequence of $\{x^k\}$ converging to an $s$-sparse point $\hat{x}$ which satisfies

(30)

Here, the constants in (29) and (30), including $C$, are positive and depend on the problem parameters and the initial point $x^0$.
Proof. (1) In Theorem 1, we have proved that the limit $x^*$ of the convergent sequence $\{x^k\}$ is a critical point of problem (5) with $\varepsilon=\varepsilon^*$. We use Lemma 2 to get

(31)

where we use the assumption on the initial value $x^0$ and on $\varepsilon_0$ made in Algorithm 1. By (31), we have a uniform bound on the objective values along the sequence. Let $S$ be the index set of the $s$-sparse solution $x$, and let $S^*$ be the index set of the $s$ largest entries in absolute value of $x^*$. Since $|S\cup S^*|\le 2s$, the RIP of order $2s$ applies on $S\cup S^*$; collecting the resulting estimates and absorbing the fixed quantities into the constants yields (29).
(2) If $\varepsilon^*=0$, then either $\varepsilon_k=0$ for some $k$, or $\varepsilon_k$ is determined by $r(x^k)_{s+1}$ for all $k$ larger than some integer $k_0$. In the former case, $r(x^k)_{s+1}=0$, so $x^k$ is an $s$-sparse vector, and we denote $\hat{x}=x^k$. In the latter case, by the boundedness of $\{x^k\}$, there exists a subsequence converging to some point $\hat{x}$; since $\varepsilon_k\to 0$, we have $r(\hat{x})_{s+1}=0$. That is, $\hat{x}$ is an $s$-sparse vector. Therefore, in both cases, we have an $s$-sparse limit point $\hat{x}$. Without loss of generality, we assume that $x^{k_j}\to\hat{x}$. Using the RIP of $A$, we get the chain of estimates leading to (30), where the third inequality uses (31) and the last equality uses the assumptions on $A$ and on the initial point. Denoting the resulting constant by $C$ completes the proof.
Under the RIP condition on the matrix $A$, when $\varepsilon^*>0$, Theorem 2 provides an error bound between the convergent limit and the sparse solution of problem (1). When $\varepsilon^*=0$, it provides an error bound for the limit point of any convergent subsequence; in this case, the limit point of any convergent subsequence is an $s$-sparse vector.
4. Conclusion
The iteratively reweighted $\ell_1$ algorithm has been widely used for solving nonconvex optimization problems. In this paper, we propose an efficient adaptive iteratively reweighted $\ell_1$ algorithm (6) for solving the elastic $\ell_q$ regularization problem (5), and we prove the convergence of the algorithm. In particular, we first prove that the sequence generated by Algorithm 1 is bounded and asymptotically regular. When $\varepsilon^*>0$, based on an algebraic method, we prove that the sequence generated by Algorithm 1 is convergent for any rational $q\in(0,1)$ and that the limit is a critical point of problem (5) with $\varepsilon=\varepsilon^*$. Furthermore, under the RIP condition on the matrix $A$, when $\varepsilon^*>0$, we present an error bound between the convergent limit and the sparse solution of problem (1); when $\varepsilon^*=0$, we present an error bound for the limit point of any convergent subsequence. Our convergence results provide a theoretical guarantee for a wide range of applications of the adaptive iteratively reweighted $\ell_1$ algorithm.
Cite this paper
Yong Zhang and Wanzhou Ye (2016) An Efficient Adaptive Iteratively Reweighted l1 Algorithm for Elastic lq Regularization. Advances in Pure Mathematics, 06, 498-506. doi: 10.4236/apm.2016.67036
References
- 1. Donoho, D.L. (2006) Compressed Sensing. IEEE Transactions on Information Theory, 52, 1289-1306. http://dx.doi.org/10.1109/TIT.2006.871582
- 2. Candès, E., Romberg, J. and Tao, T. (2006) Robust Uncertainty Principles: Exact Signal Reconstruction from Highly Incomplete Frequency Information. IEEE Transactions on Information Theory, 52, 489-509. http://dx.doi.org/10.1109/TIT.2005.862083
- 3. Chartrand, R. (2007) Exact Reconstruction of Sparse Signals via Nonconvex Minimization. IEEE Signal Processing Letters, 14, 707-710. http://dx.doi.org/10.1109/LSP.2007.898300
- 4. Xu, Z.B., Chang, X.Y., Xu, F.M. and Zhang, H. (2012) L1/2 Regularization: A Thresholding Representation Theory and a Fast Solver. IEEE Transactions on Neural Networks and Learning Systems, 23, 1013-1027. http://dx.doi.org/10.1109/TNNLS.2012.2197412
- 5. Foucart, S. and Lai, M. (2009) Sparsest Solutions of Underdetermined Linear Systems via lq Minimization for 0<q≤1. Applied and Computational Harmonic Analysis, 26, 395-407. http://dx.doi.org/10.1016/j.acha.2008.09.001
- 6. Chen, S.S., Donoho, D.L. and Saunders, M.A. (1998) Atomic Decomposition by Basis Pursuit. SIAM Journal on Scientific Computing, 20, 33-61. http://dx.doi.org/10.1137/S1064827596304010
- 7. Daubechies, I., Defrise, M. and De Mol, C. (2004) An Iterative Thresholding Algorithm for Linear Inverse Problems with a Sparsity Constraint. Communications on Pure and Applied Mathematics, 57, 1413-1457. http://dx.doi.org/10.1002/cpa.20042
- 8. Candès, E.J., Wakin, M.B. and Boyd, S.P. (2008) Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14, 877-905. http://dx.doi.org/10.1007/s00041-008-9045-x
- 9. Daubechies, I., DeVore, R., Fornasier, M. and Güntürk, C.S. (2010) Iteratively Reweighted Least Squares Minimization for Sparse Recovery. Communications on Pure and Applied Mathematics, 63, 1-38. http://dx.doi.org/10.1002/cpa.20303
- 10. Zou, H. and Hastie, T. (2005) Regularization and Variable Selection via the Elastic Net. Journal of the Royal Statistical Society: Series B, 67, 301-320. http://dx.doi.org/10.1111/j.1467-9868.2005.00503.x
- 11. Tibshirani, R. (1996) Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58, 267-288.
- 12. Jin, B.T., Lorenz, D.A. and Schiffler, S. (2009) Elastic-Net Regularization: Error Estimates and Active Set Methods. Inverse Problems, 25, Article ID: 115022. http://dx.doi.org/10.1088/0266-5611/25/11/115022
- 13. De Mol, C., De Vito, E. and Rosasco, L. (2009) Elastic-Net Regularization in Learning Theory. Journal of Complexity, 25, 201-230. http://dx.doi.org/10.1016/j.jco.2009.01.002
- 14. Lai, M.J., Xu, Y.Y. and Yin, W.T. (2013) Improved Iteratively Reweighted Least Squares for Unconstrained Smoothed lq Minimization. SIAM Journal on Numerical Analysis, 51, 927-957. http://dx.doi.org/10.1137/110840364
- 15. Donoho, D.L. and Tsaig, Y. (2008) Fast Solution of l1-Norm Minimization Problems When the Solution May Be Sparse. IEEE Transactions on Information Theory, 54, 4789-4812. http://dx.doi.org/10.1109/TIT.2008.929958
- 16. Garcia, C.B. and Li, T.Y. (1980) On the Number of Solutions to Polynomial Systems of Equations. SIAM Journal on Numerical Analysis, 17, 540-546. http://dx.doi.org/10.1137/0717046