Optimality condition and iterative thresholding algorithm for $l_p$-regularization problems

This paper investigates $l_p$-regularization problems, which have broad applications in compressive sensing, variable selection and sparse least squares fitting for high-dimensional data. We derive exact lower bounds for the absolute value of the nonzero entries in each global optimal solution of the model, which clearly demonstrate the relation between the sparsity of the optimum solution and the choice of the regularization parameter and norm. We also establish a necessary condition for global optimum solutions of $l_p$-regularization problems, namely that the global optimum solutions are fixed points of a vector thresholding operator. In addition, by selecting the parameters carefully, a global minimizer with a desired sparsity can be obtained. Finally, an iterative thresholding algorithm is designed for solving $l_p$-regularization problems, and any accumulation point of the sequence generated by the algorithm is a fixed point of the vector thresholding operator.

Tian and Jiao (2015), Xu et al. (2010, 2012), Shehu et al. (2013, 2015), Bredies et al. (2015), Fan et al. (2016). Chen et al. (2010) derived lower bounds for the absolute value of the nonzero entries in each local optimum solution of the model. Xu et al. (2012) presented an analytical expression, in thresholding form, for the resolvent of the gradient of $\|s\|_{1/2}^{1/2}$, developed an alternative feature theorem on optimum solutions of the $L_{1/2}$ regularization problem, and proposed an iterative half thresholding algorithm for solving the problem quickly. However, there is no result on the characteristics of the global optimum solutions of problem (1).
In this article, we focus on deriving the characteristics of the global optimum solutions of problem (1), inspired by Xu et al. (2012). The remaining sections of the paper are organized as follows. In "Technical preliminaries" section, we present some important technical results. "Lower bound and optimality conditions" section first develops the proximal operator associated with the non-convex $l_p$ quasi-norm, which can be viewed as an extension of the well-known proximal operator associated with convex functions. Next, an exact lower bound for the absolute value of the nonzero entries in every global optimum solution of (1) is derived, which clearly demonstrates the relation between the sparsity of the optimum solution and the choice of the regularization parameter and norm. We also establish a necessary condition for global optimum solutions of the $l_p$-regularization problems, namely that the global optimum solutions are fixed points of a vector thresholding operator. In "Choosing the parameter λ for sparsity" section, we propose a sufficient condition on the selection of λ to meet the sparsity requirement of global minimizers of the $l_p$-regularization problems. "Iterative thresholding algorithm and its convergence" section proposes an iterative thresholding algorithm for the $l_p$-regularization problems and shows that any accumulation point of the sequence produced by the algorithm is a fixed point of the vector thresholding operator. Numerical results are reported in "Numerical experiments" section, and some conclusions are drawn in "Conclusion" section.

Technical preliminaries
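By utilizing the separability of the objective function and the operator splitting technique, the $l_p$-regularization problem (1) can be converted into n corresponding single-variable minimization problems defined on $(-\infty, +\infty)$. Therefore, we first investigate the single-variable minimization problem

$$\min_{s \in \mathbb{R}} g_r(s) := s^2 - 2rs + \lambda |s|^p, \qquad (2)$$

where $\lambda > 0$ and $p \in (0, 1)$ are arbitrary real numbers, $s \in \mathbb{R}$ is the variable and $r \in \mathbb{R}$ is a parameter. Moreover, we only need to consider the following two sub-problems:

$$\min_{s \ge 0} g_r(s) = s^2 - 2rs + \lambda s^p, \qquad (3)$$

$$\min_{s \le 0} g_r(s) = s^2 - 2rs + \lambda (-s)^p. \qquad (4)$$

Subproblem (3) has been investigated previously, and the results obtained there can be used to derive our conclusions.

Let $\bar{s}$, $\bar{r}$ and $G(s, r)$ be defined by (5) and (6). Then there is a unique implicit function $s = h_{\lambda,p}(r)$ defined on $(\bar{r}, +\infty)$, which satisfies $s_0 = h_{\lambda,p}(r_0)$, $h_{\lambda,p}(r) > \bar{s}$ and $G(h_{\lambda,p}(r), r) \equiv 0$ for all $r \in (\bar{r}, +\infty)$. Furthermore, for the function $s = h_{\lambda,p}(r)$, the following conclusions hold:

3. $s = h_{\lambda,p}(r)$ is a strictly increasing function over $(\bar{r}, +\infty)$.

Moreover, if $r > \bar{r}$, then $s = h_{\lambda,p}(r)$ is the sole local minimizer of $g_r(s)$ over $(0, +\infty)$.

Proof If $s \ge 0$, then $g_r(s) = s^2 - 2rs + \lambda s^p$. Let $s_1^*$ be a global optimum solution of problem (3). Then, from Lemma 2, we have

$$s_1^* = \begin{cases} h_{\lambda,p}(r), & r > r^*, \\ (\lambda(1-p))^{1/(2-p)} \text{ or } 0, & r = r^*, \\ 0, & r < r^*. \end{cases} \qquad (8)$$

If $s \le 0$, then $g_r(s) = s^2 - 2rs + \lambda(-s)^p = (-s)^2 + 2r(-s) + \lambda(-s)^p$. Let $y = -s$; then $y \ge 0$ and $g_{(-r)}(y) = y^2 + 2ry + \lambda y^p$, so the first case applies. If $y^*$ is a global optimum solution of the problem $\min g_{(-r)}(y)$ over $[0, +\infty)$, then from Lemma 2 we have

$$y^* = \begin{cases} h_{\lambda,p}(-r), & -r > r^*, \\ (\lambda(1-p))^{1/(2-p)} \text{ or } 0, & -r = r^*, \\ 0, & -r < r^*. \end{cases}$$

Therefore, if $s_2^*$ is a global optimum solution of the problem $\min_{s \le 0} g_r(s) = s^2 - 2rs + \lambda(-s)^p$, then we have

$$s_2^* = \begin{cases} -h_{\lambda,p}(-r), & r < -r^*, \\ -(\lambda(1-p))^{1/(2-p)} \text{ or } 0, & r = -r^*, \\ 0, & r > -r^*. \end{cases} \qquad (9)$$

Combining (8) and (9), we obtain (7). Therefore, the proof is complete.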
Proof The proposition follows from Proposition 1 and Lemma 1.
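The scalar thresholding rule discussed in this section can be evaluated numerically. The following is a minimal sketch, not the authors' implementation. It assumes the scalar objective $g_r(s) = s^2 - 2rs + \lambda|s|^p$, the jump point $r^* = \frac{2-p}{2(1-p)}(\lambda(1-p))^{1/(2-p)}$ obtained from stationarity together with $g_r(s) \le g_r(0)$, and the tie-breaking choice of 0 at $|r| = r^*$. The function names `threshold_params` and `p_threshold` are hypothetical.

```python
import math


def threshold_params(lam, p):
    """Return (s_low, r_star) for g_r(s) = s^2 - 2*r*s + lam*|s|^p.

    s_low  : smallest magnitude a nonzero global minimizer can have.
    r_star : value of |r| at which the global minimizer jumps away from 0.
    Both formulas are derived in the lead-in text (an assumption of this sketch).
    """
    s_low = (lam * (1.0 - p)) ** (1.0 / (2.0 - p))
    r_star = (2.0 - p) / (2.0 * (1.0 - p)) * s_low
    return s_low, r_star


def p_threshold(r, lam, p, tol=1e-13):
    """Global minimizer of g_r over the real line (ties at |r| = r_star give 0)."""
    s_low, r_star = threshold_params(lam, p)
    a = abs(r)
    if a <= r_star:
        return 0.0
    # For a > r_star the stationarity equation
    #   G(s) = s - a + (lam * p / 2) * s**(p - 1) = 0
    # has exactly one root in (s_low, a), and G is increasing there,
    # so plain bisection is enough.
    lo, hi = s_low, a
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - a + 0.5 * lam * p * mid ** (p - 1.0) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    # The problem is symmetric in the sign of r.
    return math.copysign(0.5 * (lo + hi), r)
```

Bisection is used here only because it needs no derivative information and cannot leave the bracket; for p = 1/2 the closed-form half thresholding expression of Xu et al. (2012) could be used instead.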

Lower bound and optimality conditions
In this section, using the separability of the objective function and the operator splitting technique, we derive the proximal operator associated with the $l_p$ quasi-norm. Next, we present properties of the global optimum solutions of the $l_p$-regularization problem (1). For convenience, we first define the following thresholding function and thresholding operator.
Definition 1 (p thresholding function) Let $r \in \mathbb{R}$. For any $\lambda > 0$, the function $h_\lambda(r)$ defined in (7) is called a p thresholding function.
Definition 2 (Vector p thresholding operator) Let $s \in \mathbb{R}^n$. For any $\lambda > 0$, the vector p thresholding operator $H_\lambda(s)$ is defined as

$$H_\lambda(s) := (h_\lambda(s_1), h_\lambda(s_2), \ldots, h_\lambda(s_n))^T. \qquad (10)$$

One of the main results of this section is a proximal operator associated with the nonconvex $l_p$ ($0 < p < 1$) quasi-norm, which can also be viewed as an extension of the well-known proximal operator associated with convex functions.
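Theorem 1 Let a vector $y \in \mathbb{R}^n$ and constants $\lambda > 0$, $0 < p < 1$ be given, and let $s^*$ be a global optimum solution of the problem

$$\min_{s \in \mathbb{R}^n} f(s) := \|s - y\|_2^2 + \lambda \|s\|_p^p. \qquad (11)$$

Then $s^*$ can be expressed as

$$s^* = H_\lambda(y).$$

Furthermore, the exact number of global optimum solutions of the problem can be determined.

Proof The objective function of (11) is separable, $f(s) = \sum_{i=1}^{n} \left[(s_i - y_i)^2 + \lambda |s_i|^p\right]$. Therefore, solving problem (11) is equivalent to solving the following n problems, for each i = 1, 2, ..., n,

$$\min_{s_i \in \mathbb{R}} \; (s_i - y_i)^2 + \lambda |s_i|^p. \qquad (12)$$

By Proposition 1, for each i = 1, 2, ..., n, we have $s_i^* = h_\lambda(y_i)$; if $|y_i| = r^*$, then (12) has two solutions; otherwise, it has a unique solution. Hence the exact number of global optimum solutions of (11) is known. The proof is thus complete.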
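For any $\lambda, \mu > 0$, $0 < p < 1$ and $z \in \mathbb{R}^n$, let

$$f_\mu(s, z) := \mu\left(f(s) - \|As - Az\|_2^2\right) + \|s - z\|_2^2,$$

where $f(s) = \|As - b\|_2^2 + \lambda\|s\|_p^p$ is the objective function of problem (1). For simplicity, let

$$B_\mu(z) := z + \mu A^T(b - Az).$$

Theorem 2 Let $s^* \in \mathbb{R}^n$ be a global minimizer of $f_\mu(s, z)$ for fixed $\lambda > 0$, $\mu > 0$ and $z \in \mathbb{R}^n$. Then we have

$$s^* = H_{\lambda\mu}(B_\mu(z)).$$

Proof The function $f_\mu(s, z)$ can be rewritten as

$$f_\mu(s, z) = \|s - B_\mu(z)\|_2^2 + \lambda\mu\|s\|_p^p + \|z\|_2^2 + \mu\|b\|_2^2 - \mu\|Az\|_2^2 - \|B_\mu(z)\|_2^2.$$

Therefore, solving $\min_{s \in \mathbb{R}^n} f_\mu(s, z)$ for fixed $\lambda$, $\mu$ and $z$ is equivalent to solving

$$\min_{s \in \mathbb{R}^n} \; \|s - B_\mu(z)\|_2^2 + \lambda\mu\|s\|_p^p.$$

By Theorem 1, the proof is complete.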
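The rewriting step in the proof of Theorem 2 is a completion of squares, and it can be checked numerically. The sketch below is illustrative only. It assumes $f(s) = \|As - b\|_2^2 + \lambda\|s\|_p^p$ and $f_\mu(s, z) = \mu(f(s) - \|As - Az\|_2^2) + \|s - z\|_2^2$, and it verifies that $f_\mu(s, z)$ equals $\|s - (z + \mu A^T(b - Az))\|_2^2 + \lambda\mu\|s\|_p^p$ plus a term that does not depend on $s$. All function names are hypothetical.

```python
import random


def matvec(A, x):
    """Compute A @ x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Compute A^T @ y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def sqnorm(x):
    return sum(v * v for v in x)


def lp_term(s, p):
    return sum(abs(v) ** p for v in s)


def f(A, b, s, lam, p):
    """Objective of problem (1) as assumed in the lead-in."""
    res = [u - v for u, v in zip(matvec(A, s), b)]
    return sqnorm(res) + lam * lp_term(s, p)


def surrogate(A, b, s, z, lam, p, mu):
    """f_mu(s, z) = mu * (f(s) - ||As - Az||^2) + ||s - z||^2."""
    d = [u - v for u, v in zip(matvec(A, s), matvec(A, z))]
    return mu * (f(A, b, s, lam, p) - sqnorm(d)) + sqnorm([u - v for u, v in zip(s, z)])


def gradient_step(A, b, z, mu):
    """z + mu * A^T (b - A z), the point that is thresholded."""
    res = [bi - v for bi, v in zip(b, matvec(A, z))]
    return [zi + mu * gi for zi, gi in zip(z, rmatvec(A, res))]


def separable_form(A, b, s, z, lam, p, mu):
    """Completed-square form of f_mu(s, z); `const` does not depend on s."""
    B = gradient_step(A, b, z, mu)
    const = sqnorm(z) + mu * sqnorm(b) - mu * sqnorm(matvec(A, z)) - sqnorm(B)
    return sqnorm([u - v for u, v in zip(s, B)]) + lam * mu * lp_term(s, p) + const


rng = random.Random(0)
A = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(4)]
b = [rng.gauss(0, 1) for _ in range(4)]
s = [rng.gauss(0, 1) for _ in range(5)]
z = [rng.gauss(0, 1) for _ in range(5)]
gap = abs(surrogate(A, b, s, z, 0.3, 0.5, 0.05) - separable_form(A, b, s, z, 0.3, 0.5, 0.05))
```

Because the two forms differ only by a constant in $s$, minimizing $f_\mu(\cdot, z)$ reduces to the separable problem of Theorem 1 with $\lambda$ replaced by $\lambda\mu$.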
Hence, the proof is complete.

Proof Since $s^*$ is a global minimizer of $f_\mu(s, z)$ for the given $z = s^*$, (16) and (17) follow directly from Theorem 2 and Lemma 3. The lower bound on the nonzero entries of $s^*$ then follows from Proposition 2 combined with the strict monotonicity of $h_{\lambda\mu}(\cdot)$ on $(\bar{r}, +\infty)$. Therefore, the proof is complete.
Remark 1 Theorem 3 establishes a necessary condition for global optimum solutions of the $l_p$-regularization problems in the form of a thresholding expression. In particular, the global optimum solutions of problem (1) are fixed points of a vector-valued thresholding operator. The converse does not hold in general, i.e., a point satisfying (16) is not necessarily a global optimum solution of the $l_p$-regularization problem (1). Whether it holds depends on the matrix A; for instance, when $A \equiv I$ and $\mu = 1$, a fixed point of (16) is a global optimum solution of problem (1) (i.e., Theorem 1).
$$f_\mu(s, s^*) = \mu\left(f(s) - \|As - As^*\|_2^2\right) + \|s - s^*\|_2^2 = \mu\left(\|As - b\|_2^2 + \lambda\|s\|_p^p - \|As - As^*\|_2^2\right) + \|s - s^*\|_2^2$$

Remark 2 Theorem 3 also provides the exact lower bound for the absolute value of the nonzero entries in every global optimum solution of the model, which can be used to identify the zero entries of any global optimum solution precisely. These lower bounds clearly demonstrate the relationship between the sparsity of the global optimum solution and the choice of the regularization parameter and norm. Therefore, our theorem can be used to select the desired model parameters and norms.
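The form of the lower bound can be motivated by a short calculation on the scalar problem. The sketch below is not the proof given in the paper. It assumes the scalar objective $g_r(s) = s^2 - 2rs + \lambda|s|^p$ and considers a nonzero global minimizer $s > 0$ (the case $s < 0$ is symmetric), which must be a stationary point and must be at least as good as $s = 0$.

```latex
% Sketch: nonzero global minimizers of g_r(s) = s^2 - 2rs + \lambda |s|^p, case s > 0.
\begin{align*}
g_r'(s) = 0 &\;\Longrightarrow\; r = s + \tfrac{\lambda p}{2}\, s^{p-1},\\
g_r(s) &= s^2 - 2s\left(s + \tfrac{\lambda p}{2}\, s^{p-1}\right) + \lambda s^p
        = -s^2 + \lambda(1-p)\, s^p,\\
g_r(s) \le g_r(0) = 0 &\;\Longleftrightarrow\; s^{2-p} \ge \lambda(1-p)
        \;\Longleftrightarrow\; s \ge (\lambda(1-p))^{1/(2-p)}.
\end{align*}
% Replacing \lambda by \lambda\mu gives the bound (\lambda\mu(1-p))^{1/(2-p)} that appears in (21).
```

Any entry of a global optimum solution whose magnitude would fall below this value must therefore be exactly zero, which is why a larger $\lambda$ forces a sparser solution.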

Choosing the parameter λ for sparsity
In many applications, such as sparse solution reconstruction and variable selection, one needs to find least squares estimators with no more than k nonzero entries. A sufficient condition on λ for global minimizers of the $l_p$-regularization problems to have the desired sparsity has been presented previously, based on the lower bound theory for local optimum solutions. In this paper, we also present a sufficient condition on λ for global minimizers of the $l_p$-regularization problems to have the desired sparsity, but based on the lower bound theory for global optimum solutions.

Theorem 4 Set
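The following conclusions hold.
1. If $\lambda \ge \beta(k)$, then every global minimizer $s^*$ of (1) satisfies $\|s^*\|_0 < k$.
2. If $\lambda \ge \beta(1)$, then $s^* = 0$ is the unique global minimizer of (1).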
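Proof Assume that $s^* \neq 0$ is a global minimizer of the $l_p$-regularization problem (1). Let $B = A_T \in \mathbb{R}^{m \times |T|}$, where $T = \mathrm{support}(s^*)$ and $|T| = \|s^*\|_0$ is the cardinality of the set T. According to the first-order necessary condition, $s^*$ must satisfy

$$2B^T(Bs_T^* - b) + \lambda p \left(|s_T^*|^{p-1} \circ \mathrm{sign}(s_T^*)\right) = 0,$$

where the power and the sign are taken componentwise. This shows $As^* - b = Bs_T^* - b \neq 0$. Hence, we have

$$f(s^*) = \|As^* - b\|_2^2 + \lambda\|s^*\|_p^p > \lambda\|s^*\|_p^p.$$

By Theorem 3, it follows that

$$|s_i^*| \ge (\lambda\mu(1-p))^{1/(2-p)}, \quad i \in T.$$

Therefore, we have

$$f(s^*) > |T|\,\lambda\,(\lambda\mu(1-p))^{p/(2-p)}. \qquad (21)$$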
In the following, we discuss the two cases.
1. Assume that $\lambda \ge \beta(k)$; we argue by contradiction. If $\|s^*\|_0 \ge k \ge 1$, then (21) and the definition of $\beta(k)$ in Theorem 4 yield an inequality that contradicts the fact that $s^*$ is a global minimizer of (1). Therefore, we have $\|s^*\|_0 < k$.
2. Assume that $\lambda \ge \beta(1)$; we again argue by contradiction. If $s^* \neq 0$, then there exists $i_0$ satisfying $s^*_{i_0} \neq 0$, and the resulting inequality contradicts the fact that $s^*$ is a global minimizer of (1). Therefore, $s^* = 0$ must be the unique global minimizer of (1).
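The contradiction argument can be made concrete under one extra assumption that is not taken from the text: the trivial point $s = 0$, with $f(0) = \|b\|_2^2$, is used as the comparison point. The threshold $\tilde{\beta}(k)$ below is a hypothetical stand-in derived only from (21); it is not claimed to coincide with the $\beta(k)$ of Theorem 4.

```latex
% Assumption: compare f(s^*) with f(0) = \|b\|_2^2; \tilde\beta(k) is illustrative only.
\begin{align*}
\|s^*\|_0 \ge k
  &\;\Longrightarrow\; f(s^*) > k\,\lambda\,(\lambda\mu(1-p))^{p/(2-p)}
   = k\,\lambda^{2/(2-p)}\,(\mu(1-p))^{p/(2-p)} \quad \text{by (21)},\\
k\,\lambda^{2/(2-p)}\,(\mu(1-p))^{p/(2-p)} \ge \|b\|_2^2
  &\;\Longleftrightarrow\;
  \lambda \ge \tilde\beta(k) := \left(\frac{\|b\|_2^2}{k}\right)^{(2-p)/2} (\mu(1-p))^{-p/2}.
\end{align*}
```

Under this assumption, $\lambda \ge \tilde{\beta}(k)$ would give $f(s^*) > f(0)$ whenever $\|s^*\|_0 \ge k$, which is impossible for a global minimizer. Note that $\tilde{\beta}(k)$ decreases as k grows, so a smaller $\lambda$ suffices when more nonzero entries are tolerated.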

Iterative thresholding algorithm and its convergence
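By the thresholding representation formula (16), an iterative thresholding scheme for problem (1) can be stated as follows: initialize $s^0 \in \mathbb{R}^n$ and iterate

$$s^{k+1} = H_{\lambda\mu}\left(s^k + \mu A^T(b - As^k)\right), \qquad (22)$$

where $H_{\lambda\mu}$ is the vector p thresholding operator of Definition 2 with parameter $\lambda\mu$. When $|r| = r^*$, we adopt the convention $h_{\lambda\mu}(r) = 0$. We first give some important lemmas.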
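Proof For $0 < \mu < \|A\|^{-2}$, we have

$$\|s^{k+1} - s^k\|_2^2 - \mu\|As^{k+1} - As^k\|_2^2 \ge 0.$$

Hence,

$$f(s^{k+1}) \le \frac{1}{\mu}\left(\mu f(s^{k+1}) - \mu\|As^{k+1} - As^k\|_2^2 + \|s^{k+1} - s^k\|_2^2\right) = \frac{1}{\mu} f_\mu(s^{k+1}, s^k) \le \frac{1}{\mu} f_\mu(s^k, s^k) = f(s^k).$$

The first equality follows from the definition of $f_\mu(s, z)$. The second inequality holds because $s^{k+1}$ is a minimizer of $f_\mu(s, s^k)$.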
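The iteration and the monotone decrease of the objective can be illustrated with a small pure-Python sketch. It is not the authors' Matlab implementation, and it rests on several assumptions: the update is $s^{k+1} = H_{\lambda\mu}(s^k + \mu A^T(b - As^k))$, the scalar thresholding is evaluated by bisection with ties resolved to 0, the starting point is $s^0 = 0$, and the step size is $\mu = 0.99/\|A\|_F^2$, which satisfies $0 < \mu < \|A\|^{-2}$ because the Frobenius norm bounds the spectral norm from above. The names `p_threshold`, `objective` and `ita` are hypothetical.

```python
import math
import random


def p_threshold(r, lam, p):
    """Global minimizer of s^2 - 2*r*s + lam*|s|^p, with ties resolved to 0."""
    s_low = (lam * (1.0 - p)) ** (1.0 / (2.0 - p))    # smallest nonzero magnitude
    r_star = (2.0 - p) / (2.0 * (1.0 - p)) * s_low    # jump point of the threshold
    a = abs(r)
    if a <= r_star:
        return 0.0
    lo, hi = s_low, a   # bracket for the root of s - a + (lam*p/2) * s**(p-1)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid - a + 0.5 * lam * p * mid ** (p - 1.0) > 0.0:
            hi = mid
        else:
            lo = mid
    return math.copysign(0.5 * (lo + hi), r)


def objective(A, b, s, lam, p):
    """f(s) = ||A s - b||_2^2 + lam * ||s||_p^p."""
    res = [sum(a * x for a, x in zip(row, s)) - bi for row, bi in zip(A, b)]
    return sum(v * v for v in res) + lam * sum(abs(x) ** p for x in s)


def ita(A, b, lam, p, iters=200):
    """Run the thresholding iteration from s = 0; return (s, objective history)."""
    m, n = len(A), len(A[0])
    # Frobenius norm >= spectral norm, hence 0 < mu < ||A||^{-2}.
    mu = 0.99 / sum(a * a for row in A for a in row)
    s = [0.0] * n
    history = [objective(A, b, s, lam, p)]
    for _ in range(iters):
        res = [bi - sum(a * x for a, x in zip(row, s)) for row, bi in zip(A, b)]
        step = [s[j] + mu * sum(A[i][j] * res[i] for i in range(m)) for j in range(n)]
        s = [p_threshold(r, lam * mu, p) for r in step]
        history.append(objective(A, b, s, lam, p))
    return s, history


# Small synthetic instance (all sizes and values are arbitrary choices).
rng = random.Random(0)
A = [[rng.gauss(0, 1) for _ in range(12)] for _ in range(8)]
x_true = [0.0] * 12
x_true[2], x_true[7] = 1.5, -2.0
b = [sum(a * x for a, x in zip(row, x_true)) for row in A]
s, history = ita(A, b, lam=0.1, p=0.5)
```

The Frobenius-norm step size is conservative and was chosen only to avoid computing the spectral norm; a step closer to $\|A\|^{-2}$ would typically need fewer iterations.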
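This lemma demonstrates that the objective function f(s) does not increase from iteration to iteration; in particular, the iterates produced by the algorithm are never worse than the initial point in terms of the objective value. The iteration (22) does not have a unique fixed point, so it is important to analyze its fixed points in detail.

Lemma 5 Let $\Gamma_0$ and $\Gamma_1$ denote the index sets of the zero and nonzero entries of $s^*$, respectively. A fixed point of (22) is any $s^*$ satisfying $s^* = H_{\lambda\mu}(s^* + \mu A^T(b - As^*))$, i.e., $s_i^* = h_{\lambda\mu}(s_i^* + \mu A_i^T(b - As^*))$ for each i. If $i \in \Gamma_0$, the equality holds if and only if $|\mu A_i^T(b - As^*)| \le r^*$, where $r^*$ is the threshold of $h_{\lambda\mu}$. Similarly, $i \in \Gamma_1$ if and only if $A_i^T(b - As^*) = \frac{\lambda p}{2}\,\mathrm{sign}(s_i^*)\,|s_i^*|^{p-1}$ and $|s_i^*| > (\lambda\mu(1-p))^{1/(2-p)}$.

The following lemma demonstrates that the sequence $\{s^k\}$ produced by the algorithm (22) is asymptotically regular, i.e., $\lim_{k \to \infty} \|s^{k+1} - s^k\|_2 = 0$.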
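Proof We prove the convergence of $\sum_{k=0}^{K} \|s^{k+1} - s^k\|_2^2$ as $K \to \infty$, which implies the lemma. First, the partial sums are monotonically increasing in K, since

$$\sum_{k=0}^{K+1} \|s^{k+1} - s^k\|_2^2 = \sum_{k=0}^{K} \|s^{k+1} - s^k\|_2^2 + \|s^{K+2} - s^{K+1}\|_2^2 \ge \sum_{k=0}^{K} \|s^{k+1} - s^k\|_2^2.$$

Next, we show that the partial sums are bounded. For $0 < \mu < \|A\|^{-2}$, we have $0 < \delta := 1 - \mu\|A\|^2 < 1$ and

$$\|s^{k+1} - s^k\|_2^2 \le \frac{1}{\delta}\left(\|s^{k+1} - s^k\|_2^2 - \mu\|As^{k+1} - As^k\|_2^2\right).$$

Therefore,

$$\sum_{k=0}^{K} \|s^{k+1} - s^k\|_2^2 \le \frac{1}{\delta}\sum_{k=0}^{K}\left(\|s^{k+1} - s^k\|_2^2 - \mu\|As^{k+1} - As^k\|_2^2\right) \le \frac{\mu}{\delta}\sum_{k=0}^{K}\left(f(s^k) - f(s^{k+1})\right) = \frac{\mu}{\delta}\left(f(s^0) - f(s^{K+1})\right) \le \frac{\mu}{\delta} f(s^0) < \infty.$$

The second inequality follows from the proof of Lemma 4, and the last inequality follows from $f(s^0) < \infty$.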
In the following, we present a very important property of the algorithm: any accumulation point of the sequence $\{s^k\}$ is a fixed point of the algorithm (22). We have the following theorem.
Theorem 5 If $f(s^0) < \infty$ and $0 < \mu < \|A\|^{-2}$, then any accumulation point of the sequence $\{s^k\}$ produced by the algorithm (22) is a fixed point of (22).

Numerical experiments
We now report numerical results comparing the performance of the iterative thresholding algorithm (ITA) with p = 0.5 for solving (1) (signal reconstruction) against LASSO for finding sparse solutions. The computational tests were conducted on a Dell desktop computer with an Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00 GHz and 2.0 GB of memory, using Matlab R2010a.
$$\|s^{k_j+1} - s^*\|_2 \le \|s^{k_j+1} - s^{k_j}\|_2 + \|s^{k_j} - s^*\|_2 \to 0, \quad \text{as } k_j \to +\infty,$$

Consider a real-valued, finite-length signal $x \in \mathbb{R}^n$. Suppose x is T-sparse, that is, only T of the signal coefficients are nonzero and the others are zero. We use the following Matlab code to generate the original signal, a matrix A and a vector b.
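A test problem of this kind can be sketched in Python as follows. This is not the authors' Matlab code. The Gaussian measurement matrix, its $1/\sqrt{m}$ scaling, the magnitudes of the nonzero coefficients, the optional noise term and the function name `make_sparse_problem` are all assumptions made for illustration.

```python
import math
import random


def make_sparse_problem(m, n, T, noise_std=0.0, seed=0):
    """Return (x, A, b): a T-sparse signal x in R^n, an m-by-n matrix A, and b = A x + noise."""
    rng = random.Random(seed)
    support = rng.sample(range(n), T)          # positions of the nonzero coefficients
    x = [0.0] * n
    for i in support:
        # Magnitude in [1, 2) with a random sign, so the entry is never zero.
        x[i] = rng.choice((-1.0, 1.0)) * (1.0 + rng.random())
    # Gaussian measurement matrix (an assumed choice, scaled by 1/sqrt(m)).
    A = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(n)] for _ in range(m)]
    b = [sum(a * xi for a, xi in zip(row, x)) + noise_std * rng.gauss(0.0, 1.0)
         for row in A]
    return x, A, b


x, A, b = make_sparse_problem(m=20, n=50, T=5)
```

Fixing the seed makes the instance reproducible, so that different solvers can be compared on exactly the same data.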
The computational results for this experiment are displayed in Table 1.
From Table 1 we find that ITA attains a smaller prediction error than LASSO in less time.

Conclusion
In this paper, an exact lower bound for the absolute value of the nonzero entries in each global optimum solution of problem (1) is established. A necessary condition for global optimum solutions of the $l_p$-regularization problems is also derived, namely that the global optimum solutions are fixed points of a vector thresholding operator. In addition, we have derived a sufficient condition on the selection of λ for the desired sparsity of global minimizers of problem (1) with given (A, b, p). Finally, an iterative thresholding algorithm is designed for solving the $l_p$-regularization problems, and the convergence of the algorithm is proved.