# \(\ell _1\)-regularized recursive total least squares based sparse system identification for the error-in-variables

*SpringerPlus* **volume 5**, Article number: 1460 (2016)

## Abstract

In this paper, an \(\ell _1\)-regularized recursive total least squares (RTLS) algorithm is considered for sparse system identification. Although recursive least squares (RLS) has been successfully applied to sparse system identification, the estimation performance of RLS-based algorithms degrades when both the input and the output are contaminated by noise (the error-in-variables problem). We propose an algorithm to handle the error-in-variables problem. The proposed \(\ell _1\)-RTLS algorithm is an RLS-like iteration that uses \(\ell _1\) regularization. It not only improves estimation performance but also reduces the required complexity through efficient handling of the matrix inversion. Simulations demonstrate the superiority of the proposed \(\ell _1\)-regularized RTLS in the sparse system identification setting.

## Background

There has been recent interest in adaptive algorithms that handle sparsity in various signals and systems (Gu et al. 2009; Chen et al. 2009; Babadi et al. 2010; Angelosante et al. 2010; Eksioglu 2011; Eksioglu and Tanc 2011; Kalouptsidis et al. 2011). The idea is to exploit a priori knowledge about the sparsity of the signal or system to be identified. Several algorithms based on the least-mean-square (LMS) (Gu et al. 2009; Chen et al. 2009) and the recursive least squares (RLS) (Babadi et al. 2010; Angelosante et al. 2010; Eksioglu 2011; Eksioglu and Tanc 2011) techniques have been reported with different penalty or shrinkage functions. In a broad range of signal processing applications, not only is the system output corrupted by measurement noise, but the measured input signal is often also corrupted by additive noise caused by, for example, sampling error, quantization error and wide-band channel noise. However, the existing sparsity-aware algorithms handle only the case of a corrupted output. It is therefore necessary to derive an algorithm that handles both a noisy input and a noisy output (i.e. the error-in-variables problem).

One candidate for handling the error-in-variables problem is the total least squares (TLS) estimator, which seeks to minimize the sum of squared residuals on all of the variables in the equation instead of minimizing the sum of squared residuals on the response variable only. Golub and Van Loan introduced the TLS problem to the field of numerical analysis (Golub and Van Loan 1980); subsequently, other researchers have developed and analyzed adaptive algorithms that employ the TLS formulation and its extensions (Dunne and Williamson 2000, 2004; Feng et al. 1998; Arablouei et al. 2014; Davila 1994; Arablouei et al. 2015). Recursive algorithms were also studied along with adaptive TLS estimators (Soijer 2004; Choi et al. 2005). These algorithms are denoted as recursive total least squares (RTLS) or sequential total least squares (STLS). They recursively calculate and track the eigenvector corresponding to the minimum eigenvalue (the TLS solution) from the inverse covariance matrix of the augmented sample matrix. Some TLS-based algorithms have been proposed for sparse signal processing (Tanc 2015; Arablouei 2016; Zhu et al. 2011; Dumitrescu 2013; Lim and Pang 2016). The algorithms in Tanc (2015) and Arablouei (2016) use gradient-based methods. The algorithms in Zhu et al. (2011) and Dumitrescu (2013) are based on the block coordinate descent method. In Lim and Pang (2016), the TLS method was applied to the group sparsity problem.

In this paper, we consider \(\ell _1\) regularization of the RTLS cost function. The recursive procedure is derived from the generalized eigendecomposition method in Davila (1994) and Choi et al. (2005), and the regularization approach outlined in Eksioglu and Tanc (2011) is used to handle the sparsity. We develop the update algorithm for the \(\ell _1\)-regularized RTLS using results from subgradient calculus. The resulting algorithm outperforms the algorithm of Eksioglu and Tanc (2011) in the error-in-variables setting. We also reduce the total complexity by reusing the inverse matrix update. The proposed algorithm improves the sparse system estimation performance in the error-in-variables setting with only a little additional complexity. We provide simulation results to compare the performance of the proposed algorithm with that of the algorithm of Eksioglu and Tanc (2011).

## Sparse system identification problem

In the sparse system identification problem of interest, the system observes a signal represented by an \(M\times 1\) vector \({\mathbf{x}}(k) = [x_1 (k), \ldots ,x_M (k)]^T\) at time instant *k*, performs filtering and obtains the output \(y(k) = {\mathbf{x}}^T(k){\mathbf{w}}_o (k)\), where \({\mathbf{w}}_o (k) = [w_1 (k), \ldots ,w_M (k)]^T\) is a length-*M* finite-impulse-response (FIR) filter that represents the actual system. For system identification, an adaptive filter with *M* coefficients \({\hat{\mathbf{w}}}(k)\) observes \({\mathbf{x}}(k)\) and produces an estimate \(\hat{y}(k) = {\mathbf{x}}^T(k){{\hat{\mathbf{w}}}}(k)\). The system identification scheme then compares the output of the actual system *y*(*k*) with that of the adaptive filter \(\hat{y}(k)\), resulting in an error signal \(e(k) = y(k) + n(k) - \hat{y}(k) = \tilde{y}(k) - \hat{y}(k)\), where *n*(*k*) is the measurement noise. In this context, the goal of an adaptive algorithm is to identify the system by minimizing the cost function defined by

\[ J(k) = \sum \nolimits _{m = 0}^k {\lambda ^{k - m}\left( {\tilde{y}(m) - {\mathbf{x}}^T(m){{\hat{\mathbf{w}}}}(k)} \right) ^2}, \qquad (1) \]

where \(0 < \lambda \le 1\) is the forgetting factor.

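Gradient-based minimization of this cost function yields the following equation:

\[ {{{\varvec{\Phi }} }}(k){{\hat{\mathbf{w}}}}(k) = {\mathbf{r}}(k), \qquad (2) \]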

where \({{{\varvec{\Phi }} }}\left( k \right) = \sum \nolimits _{m = 0}^k {\lambda ^{k - m}{\mathbf{x}}(m){\mathbf{x}}^T(m)}\) and \({\mathbf{r}}(k) = \sum \nolimits _{m = 0}^k {\lambda ^{k - m}\tilde{y}(m){\mathbf{x}}(m)}\). This equation is the matrix form of the normal equations for the least-squares solution.
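The normal equations above are usually solved recursively rather than by inverting \({{{\varvec{\Phi }} }}(k)\) at every step. The following is a minimal sketch of a standard exponentially weighted RLS recursion, not code from the paper. The function name `rls_identify`, the initialization constant `delta`, the test system `w_o` and the use of white Gaussian regressors are assumptions made for illustration.

```python
import numpy as np

def rls_identify(x_rows, y_noisy, lam=0.999, delta=1e2):
    """Exponentially weighted RLS; each row of x_rows is one regressor x(k)."""
    M = x_rows.shape[1]
    P = delta * np.eye(M)            # running estimate of Phi(k)^{-1} (assumed init)
    w_hat = np.zeros(M)
    for x, y in zip(x_rows, y_noisy):
        Px = P @ x
        gain = Px / (lam + x @ Px)               # gain vector
        w_hat = w_hat + gain * (y - x @ w_hat)   # correct with the a priori error
        P = (P - np.outer(gain, Px)) / lam       # matrix inversion lemma update
    return w_hat

rng = np.random.default_rng(0)
M, n = 8, 2000
w_o = np.zeros(M)
w_o[[1, 5]] = [0.8, -0.5]                        # hypothetical sparse system
X = rng.standard_normal((n, M))
y = X @ w_o + 0.01 * rng.standard_normal(n)      # output noise only
w_hat = rls_identify(X, y)
print(np.round(w_hat, 3))
```

In this sketch only the output is noisy, which is the setting in which the least-squares solution is appropriate.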

In particular, we call the *M*-th order system \({\mathbf{w}}_o (k)\) sparse when the number of nonzero coefficients satisfies \(K \ll M\). To estimate an *M*-th order sparse system, most estimation algorithms exploit the small number of nonzero coefficients to obtain performance benefits and/or a reduction in computational complexity (Gu et al. 2009; Chen et al. 2009; Babadi et al. 2010; Angelosante et al. 2010; Eksioglu 2011; Eksioglu and Tanc 2011).

## \(\ell _1\)-regularized RTLS (recursive total least squares)

In TLS, we assume an unknown system whose input and output are both corrupted by noise. The system should be estimated from the noisy observations of the input and the output (Fig. 1). The output of this system is given by:

\[ \tilde{y}(k) = {\mathbf{x}}^T(k){\mathbf{w}}_o (k) + n_o (k), \qquad (3) \]

where the output noise \(n_{o}(k)\) is Gaussian white noise with variance \(\sigma _o^2\) and is independent of the input signal. The noisy input of the system is given by:

\[ {{\tilde{\mathbf{x}}}}(k) = {\mathbf{x}}(k) + {\mathbf{n}}_i (k), \qquad (4) \]

where the input noise \(n_{i}(k)\) is the Gaussian white noise with variance \(\sigma _i^2\).
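To illustrate why this noise model matters, the sketch below generates data with noise on both input and output and applies ordinary least squares to it. It is an illustration under assumed values (`sigma_i`, `sigma_o`, `w_o`, white unit-variance input), not an experiment from the paper. For a white input with \({\mathbf{R}} = {\mathbf{I}}\), the least-squares estimate tends to \(({\mathbf{R}} + \sigma _i^2 {\mathbf{I}})^{-1}{\mathbf{R}}{\mathbf{w}}_o = {\mathbf{w}}_o / (1 + \sigma _i^2)\), so it is biased towards zero.

```python
import numpy as np

rng = np.random.default_rng(1)
M, n = 4, 200_000
sigma_i, sigma_o = 0.5, 0.5                    # assumed noise levels
w_o = np.array([1.0, 0.0, -0.5, 0.0])          # hypothetical true system

X = rng.standard_normal((n, M))                        # clean input, R = I
y_tilde = X @ w_o + sigma_o * rng.standard_normal(n)   # noisy output
X_tilde = X + sigma_i * rng.standard_normal((n, M))    # noisy input

# Ordinary least squares on the noisy pair (X_tilde, y_tilde)
w_ls, *_ = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)

shrink = 1.0 / (1.0 + sigma_i**2)              # predicted attenuation for R = I
print(w_ls)
print(shrink * w_o)
```

The attenuation does not vanish as more samples are collected, which is the motivation for the TLS formulation in this section.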

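For the TLS solution, the augmented data vector is considered as:

\[ {{\bar{\mathbf{x}}}}(k) = \left[ {{{\tilde{\mathbf{x}}}}^T(k),\tilde{y}(k)} \right] ^T. \qquad (5) \]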

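Its covariance matrix has the following structure:

\[ {{\bar{\mathbf{R}}}} = \left[ {{\begin{array}{*{20}c} {{\tilde{\mathbf{R}}}} & {\mathbf{p}} \\ {{\mathbf{p}}^T} & c \\ \end{array} }} \right], \qquad (6) \]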

where \({\mathbf{p}} = E\left\{ {{{\tilde{\mathbf{x}}}}(k)y(k)} \right\}\), \(c = E\left\{ {y(k)y(k)} \right\}\), \({{\tilde{\mathbf{R}}}} = E\left\{ {{{\tilde{\mathbf{x}}}}(k){{\tilde{\mathbf{x}}}}^T(k)} \right\} = {\mathbf{R}} + \sigma _i^2 {\mathbf{I}}\) and \({\mathbf{R}} = E\left\{ {{\mathbf{x}}(k){\mathbf{x}}^T(k)} \right\}\). In Davila (1994) and Choi et al. (2005), the TLS problem is thus reduced to finding the eigenvector associated with the smallest eigenvalue of \({{\bar{\mathbf{R}}}}\). The following equation is the simplified cost function used to find this eigenvector:

\[ J({{\tilde{\mathbf{w}}}}) = \frac{{{\tilde{\mathbf{w}}}}^T{{\bar{\mathbf{R}}}}{{\tilde{\mathbf{w}}}}}{{{\tilde{\mathbf{w}}}}^T\overline{\mathbf{D}} {{\tilde{\mathbf{w}}}}}, \qquad (7) \]

where \(\overline{\mathbf{D}} = \left[ {{\begin{array}{*{20}c} {\mathbf{I}} & {\mathbf{0}} \\ {\mathbf{0}} & \gamma \\ \end{array} }} \right]\) with \(\gamma = \frac{\sigma _o^2 }{\sigma _i^2 }\). The minimum value of (7) is the smallest generalized eigenvalue of \(\overline{\mathbf{R}}\) (Dunne and Williamson 2004). Therefore, we can find the eigenvector associated with the smallest eigenvalue. This eigenvector can also be obtained from the maximization of (8):

\[ J({{\tilde{\mathbf{w}}}}) = \frac{{{\tilde{\mathbf{w}}}}^T\overline{\mathbf{D}} {{\tilde{\mathbf{w}}}}}{{{\tilde{\mathbf{w}}}}^T{{\bar{\mathbf{R}}}}{{\tilde{\mathbf{w}}}}}. \qquad (8) \]
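The eigenvector characterization above can be checked numerically in batch form. The following sketch estimates \({{\bar{\mathbf{R}}}}\) from samples, solves the generalized eigenproblem for the pair \(({{\bar{\mathbf{R}}}}, \overline{\mathbf{D}})\) and rescales the eigenvector so that its last entry is \(-1\). It is a batch illustration rather than the recursive algorithm of the paper; the helper name `tls_from_cov`, the noise levels and the test system are assumptions. Because \(\overline{\mathbf{D}}\) is diagonal and positive, the generalized problem is converted to a standard symmetric one by scaling with \(\overline{\mathbf{D}}^{-1/2}\).

```python
import numpy as np

def tls_from_cov(R_bar, gamma):
    """Smallest generalized eigenvector of (R_bar, D_bar), D_bar = diag(I, gamma)."""
    M = R_bar.shape[0] - 1
    d = np.ones(M + 1)
    d[M] = gamma
    s = 1.0 / np.sqrt(d)                                   # diagonal of D_bar^{-1/2}
    vals, vecs = np.linalg.eigh(R_bar * np.outer(s, s))    # symmetric standard problem
    v = s * vecs[:, 0]             # eigh sorts ascending; map back to original coordinates
    return -v[:M] / v[M]           # rescale so the last entry equals -1

rng = np.random.default_rng(2)
M, n = 4, 200_000
sigma_i, sigma_o = 0.5, 0.25                   # assumed noise levels
w_o = np.array([1.0, 0.0, -0.5, 0.0])          # hypothetical true system

X = rng.standard_normal((n, M))
y_t = X @ w_o + sigma_o * rng.standard_normal(n)       # noisy output
X_t = X + sigma_i * rng.standard_normal((n, M))        # noisy input

X_bar = np.column_stack([X_t, y_t])            # rows are augmented data vectors
R_bar = X_bar.T @ X_bar / n                    # sample covariance
w_tls = tls_from_cov(R_bar, gamma=sigma_o**2 / sigma_i**2)
w_ls = np.linalg.solve(R_bar[:M, :M], R_bar[:M, M])    # least squares, for comparison
print(np.round(w_tls, 3), np.round(w_ls, 3))
```

The sketch assumes the noise variance ratio \(\gamma\) is known, as the definition of \(\overline{\mathbf{D}}\) requires.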

We modify the cost function by adding a penalty function. This penalty function can be chosen to reflect a priori knowledge about the sparsity of the true system.

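where \(\gamma\) is the regularization parameter in Eksioglu and Tanc (2011). We adopt the \(\ell _1\) penalty function as follows:

\[ f({{\tilde{\mathbf{w}}}}) = \left\| {{\tilde{\mathbf{w}}}} \right\| _1 = \sum \nolimits _{i = 1}^{M + 1} {\left| {\tilde{w}_i } \right| }. \qquad (10) \]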

We solve the equations by \(\nabla _{{\tilde{\mathbf{w}}}} J(k) = 0\),

where \({{\bar{\mathbf{R}}}}(k) = \lambda {{\bar{\mathbf{R}}}}(k - 1) + {{\bar{\mathbf{x}}}}(k){{\bar{\mathbf{x}}}}^H(k)\), \({{\tilde{\mathbf{w}}}}(k) = \left[ {{{\hat{\mathbf{w}}}}^T(k), - 1} \right] ^T\), and \({{\hat{\mathbf{w}}}}(k)\) is the estimate of the unknown system at the *k*-th time step. The subgradient of \(f({{\tilde{\mathbf{w}}}})\) is \(\nabla ^s\left\| {{\tilde{\mathbf{w}}}} \right\| _1 = \mathrm{sgn}({{\tilde{\mathbf{w}}}})\) (Babadi et al. 2010; Kalouptsidis et al. 2011), where \(\mathrm{sgn}(\cdot )\) is the component-wise sign function. In (11), the regularization parameter \(\gamma _k\) is time-varying and governs the tradeoff between the approximation error and the penalty function. From (11), we obtain

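We then obtain the estimated parameter of the unknown system as

\[ {{\hat{\mathbf{w}}}}(k) = - \frac{{{\tilde{\mathbf{w}}}}_{1:M} (k)}{\tilde{w}_{M + 1} (k)}. \qquad (13) \]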

Because the estimated unknown system is derived from the ratio between \(\tilde{w}_{M + 1} (k)\) and the elements of \({{\tilde{\mathbf{w}}}}_{1:M} (k)\), normalizing \({{\tilde{\mathbf{w}}}}(k)\) as \({{\tilde{\mathbf{w}}}}(k) = {{\tilde{\mathbf{w}}}}(k) / \left\| {{{\tilde{\mathbf{w}}}}(k)} \right\|\) leaves the solution in (13) unchanged and improves numerical stability in the iteration. Applying this normalization, (12) becomes,

In addition, we can approximate (14) as follows.

where \({{\bar{\mathbf{R}}}}(k) = \lambda {{\bar{\mathbf{R}}}}(k - 1) + {{\bar{\mathbf{x}}}}(k){{\bar{\mathbf{x}}}}^H(k)\). In (15),

In (11), we apply the same \(\gamma _k\) as in Eksioglu and Tanc (2011).

where \({{\hat{\mathbf{w}}}}_{aug} (k) = \left[ {{{\hat{\mathbf{w}}}}^T(k), - 1} \right] ^T\), \({{\hat{\mathbf{w}}}}_{aug,RLS} (k) = \left[ {{{\hat{\mathbf{w}}}}_{RLS}^T (k), - 1} \right] ^T\), \({{\varepsilon }}(k) = {{\hat{\mathbf{w}}}}_{aug} (k) - {{\hat{\mathbf{w}}}}_{aug,RLS} (k)\), and \({{\hat{\mathbf{w}}}}_{RLS} (k)\) is the parameter estimated by recursive least squares (RLS).
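The recursion \({{\bar{\mathbf{R}}}}(k) = \lambda {{\bar{\mathbf{R}}}}(k - 1) + {{\bar{\mathbf{x}}}}(k){{\bar{\mathbf{x}}}}^H(k)\), the normalization step and the extraction in (13) can be combined into a simple recursive TLS tracker. The sketch below is a generic inverse power iteration with a rank-one update of \({{\bar{\mathbf{R}}}}^{-1}(k)\); it deliberately omits the \(\ell _1\) term and \(\gamma _k\), so it is not the update (15) to (17) of this paper. The function name, the initialization `delta`, real-valued data and white regressors are assumptions for illustration.

```python
import numpy as np

def rtls_inverse_iteration(X_tilde, y_tilde, gamma=1.0, lam=0.9999, delta=1e2):
    """Track the smallest generalized eigenvector of (R_bar(k), D_bar) recursively."""
    n, M = X_tilde.shape
    d = np.ones(M + 1)
    d[M] = gamma                               # diagonal of D_bar
    P = delta * np.eye(M + 1)                  # estimate of R_bar(k)^{-1} (assumed init)
    w_t = np.zeros(M + 1)
    w_t[M] = -1.0                              # augmented vector [w_hat; -1]
    for x, y in zip(X_tilde, y_tilde):
        x_bar = np.append(x, y)                # augmented data vector
        Px = P @ x_bar
        # Inverse of lam * R_bar(k-1) + x_bar x_bar^T via the matrix inversion lemma
        P = (P - np.outer(Px, Px) / (lam + x_bar @ Px)) / lam
        w_t = P @ (d * w_t)                    # one inverse power step per sample
        w_t /= np.linalg.norm(w_t)             # normalization for numerical stability
    return -w_t[:M] / w_t[M]                   # ratio form of the estimate

rng = np.random.default_rng(4)
M, n = 8, 5000
sigma = 0.1                                    # same noise level on input and output
w_o = np.zeros(M)
w_o[[1, 5]] = [0.8, -0.5]                      # hypothetical sparse system

X = rng.standard_normal((n, M))
y_t = X @ w_o + sigma * rng.standard_normal(n)
X_t = X + sigma * rng.standard_normal((n, M))

w_hat = rtls_inverse_iteration(X_t, y_t, gamma=1.0)
msd = float(np.sum((w_hat - w_o) ** 2))
print(msd)
```

Because the final estimate is a ratio of entries of the tracked vector, rescaling that vector at every step does not change the result, which is the property the normalization argument in this section relies on.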

## A simplified way to solve \(\ell _1\)-regularized RTLS

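The proposed algorithm requires the solution of (15) as well as the RLS solution \({{\hat{\mathbf{w}}}}_{RLS} (k)\) used in \(\gamma _k\). Computing both separately makes the algorithm complex, so we derive a less complex approach from the block matrix inversion lemma in Moon and Stirling (2000). The required computation can be simplified by the following matrix manipulation. Let \({{\bar{\mathbf{X}}}}(k) = \left[ {{{\tilde{\mathbf{X}}}}(k)^T \vdots {{\tilde{\mathbf{y}}}}(k)} \right] ^T\) and \({{\bar{\mathbf{X}}}}(k){{\bar{\mathbf{X}}}}^T(k) = \left[ {{\begin{array}{*{20}c} {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} & {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{y}}}}(k)} \\ {{{\tilde{\mathbf{y}}}}^T(k){{\tilde{\mathbf{X}}}}^T(k)} & {{{\tilde{\mathbf{y}}}}^T(k){{\tilde{\mathbf{y}}}}(k)} \\ \end{array} }} \right] = \left[ {{\begin{array}{*{20}c} {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} & {\mathbf{a}} \\ {{\mathbf{a}}^T} & c \\ \end{array} }} \right]\), then:

\[ \left( {{{\bar{\mathbf{X}}}}(k){{\bar{\mathbf{X}}}}^T(k)} \right) ^{ - 1} = \left[ {{\begin{array}{*{20}c} {\left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1} + \beta \left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1}{\mathbf{a}}{\mathbf{a}}^H\left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1}} & {{\mathbf{A}}_{12}} \\ {{\mathbf{A}}_{12}^H} & \beta \\ \end{array} }} \right], \qquad (18) \]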

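where \({{\tilde{\mathbf{X}}}}(k) = \left[ {{{\tilde{\mathbf{x}}}}(k),\sqrt{\lambda }{{\tilde{\mathbf{x}}}}(k - 1), \cdots ,\left( {\sqrt{\lambda }} \right) ^k{{\tilde{\mathbf{x}}}}(0)} \right]\) and \(\beta = \left( {c - {\mathbf{a}}^H\left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1}{\mathbf{a}}} \right) ^{ - 1}\). The block \({\mathbf{A}}_{12} = - \beta \left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1}{\mathbf{a}}\) in (18) includes

\[ \left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1}{\mathbf{a}} = \left( {{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{X}}}}^T(k)} \right) ^{ - 1}{{\tilde{\mathbf{X}}}}(k){{\tilde{\mathbf{y}}}}(k), \qquad (19) \]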

from which the RLS solution \({{\hat{\mathbf{w}}}}_{RLS} (k)\) can be derived by dividing \({\mathbf{A}}_{12}\) by the constant \(- \beta\). When we solve the proposed \(\ell _1\)-RTLS in (15–17), computing \(\gamma _k\) in (17) requires the RLS solution for \({{\varepsilon }}(k)\). Computed separately, the procedures in (15–17) therefore need \(2M^2 + 2M\) additional multiplications, whereas the simplification in this section needs only an additional division.
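The relation between the upper-right block of the inverse and the least-squares solution can be verified numerically. The sketch below builds the block matrix from random data, inverts it directly and compares \({\mathbf{A}}_{12} / ( - \beta )\) with the batch least-squares solution. The dimensions and the random data are assumptions for illustration, and the direct inversion is used only for checking, not as the recursive update of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
M, n = 5, 400
X_t = rng.standard_normal((M, n))      # columns play the role of the weighted inputs
y_t = rng.standard_normal(n)           # stacked noisy outputs

G = X_t @ X_t.T                        # upper-left block
a = X_t @ y_t                          # cross term
c = y_t @ y_t                          # lower-right scalar

# Inverse of the full augmented matrix [[G, a], [a^T, c]]
A = np.linalg.inv(np.block([[G, a[:, None]], [a[None, :], np.array([[c]])]]))
A12 = A[:M, M]                         # upper-right block of the inverse
beta = A[M, M]                         # lower-right entry of the inverse

w_rls = np.linalg.solve(G, a)          # batch least-squares solution
w_from_block = A12 / (-beta)           # recovered from the block inverse
print(np.max(np.abs(w_from_block - w_rls)))
```

Once the inverse of the augmented matrix is available for the TLS update, the least-squares solution therefore comes at the cost of dividing \({\mathbf{A}}_{12}\) by a scalar.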

## Simulation results

In this experiment, we follow the experimental scenario in Eksioglu and Tanc (2011). The true system function \({\mathbf{w}}\) has a total of N = 64 taps, of which only S are nonzero. The nonzero coefficients are positioned randomly and take their values from an \(N(0,1/\mathrm{S})\) distribution. The input signal is \(x_k \sim N(0,1)\). Noise is added to both the input and the output; the additive input and output noises are \(n_{in,k} \sim N(0,\sigma _{in}^2 )\) and \(n_{out,k} \sim N(0,\sigma _{out}^2 )\), respectively. These additional noises are necessary to test the errors-in-variables problem. The proposed \(\ell _1\)-RTLS algorithm is realized with the automatic \(\gamma _k\) using (17). The \(\rho\) value in (17) is taken to be the true value of \(f({\mathbf{w}})\) as in Eksioglu and Tanc (2011), that is, \(\rho = \left\| {{\mathbf{w}}_o } \right\| _1\) for the \(\ell _1\)-RTLS. We also compare the ordinary RLS and the \(\ell _1\)-RLS of Eksioglu and Tanc (2011) with the proposed \(\ell _1\)-RTLS.
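The data for such an experiment can be generated along the following lines. This is a sketch of the setup as described above, not the authors' simulation code: the helper names, the seed, the number of samples `n` and the construction of the regressor as a tapped delay line of the input signal are assumptions.

```python
import numpy as np

def make_sparse_system(N, S, rng):
    """N-tap system with S nonzero taps at random positions, values ~ N(0, 1/S)."""
    w = np.zeros(N)
    support = rng.choice(N, size=S, replace=False)
    w[support] = rng.normal(0.0, np.sqrt(1.0 / S), size=S)
    return w

def delay_line(signal, N):
    """Row k holds [s[k+N-1], ..., s[k]], i.e. the N most recent samples."""
    n = signal.size - N + 1
    idx = np.arange(n)[:, None] + np.arange(N)[::-1][None, :]
    return signal[idx]

def make_eiv_data(w, n, sigma_in, sigma_out, rng):
    """Noisy regressors and noisy outputs for the errors-in-variables setting."""
    N = w.size
    x = rng.standard_normal(n + N - 1)                        # x_k ~ N(0, 1)
    x_noisy = x + sigma_in * rng.standard_normal(x.size)      # input noise
    y_noisy = delay_line(x, N) @ w + sigma_out * rng.standard_normal(n)
    return delay_line(x_noisy, N), y_noisy

def msd(w_true, w_hat):
    """Squared deviation for one trial; average over trials to obtain the MSD."""
    return float(np.sum((w_true - w_hat) ** 2))

rng = np.random.default_rng(5)
N, S, n = 64, 4, 1000
w = make_sparse_system(N, S, rng)
X_tilde, y_tilde = make_eiv_data(w, n, sigma_in=0.1, sigma_out=0.1, rng=rng)
rho = np.linalg.norm(w, 1)             # rho = ||w_o||_1, as used for gamma_k
print(X_tilde.shape, y_tilde.shape, rho)
```

Any of the estimators under comparison can then be run on `X_tilde` and `y_tilde`, with `msd` evaluated against `w` at each time step.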

The proposed algorithm needs the inverse of the covariance matrix, as in (15), in order to derive the eigenvector. A more accurate covariance estimate yields a more accurate eigenvector, so the forgetting factor should be close to 1. Figure 2 compares the estimation performance of the \(\ell _1\)-RLS and the \(\ell _1\)-RTLS in terms of mean square deviation (MSD) with S = 4 and different forgetting factors. We add noise to both input and output with \(\sigma _{in} = \sigma _{out} = 0.1\). For this comparison, we set the forgetting factor to 0.999, 0.9995, 0.9999 and 1, respectively. Figure 2 shows that the performance improves as the forgetting factor approaches 1, although the proposed \(\ell _1\)-RTLS gains its advantage only when the forgetting factor is greater than 0.999.

In Fig. 3 we simulate the algorithms with \(\sigma _{in} = \sigma _{out} = 0.1\) and for S = 4, 8, 16 and 64, where S = 64 corresponds to a completely non-sparse system. In this simulation we set the forgetting factor to 0.9999. Figure 3a–d plots the MSD curves of the proposed \(\ell _1\)-RTLS, the \(\ell _1\)-RLS and the ordinary RLS for the different S values. Figure 3 also includes the MSD curves of the \(\ell _1\)-RLS and the ordinary RLS when only the output is contaminated, and shows that the estimation performance of the \(\ell _1\)-RLS and the ordinary RLS deteriorates significantly when both input and output are contaminated with noise. The proposed \(\ell _1\)-RTLS, however, outperforms the \(\ell _1\)-RLS and the ordinary RLS when both input and output are contaminated with noise.

Table 1 summarizes the steady-state MSD values at the end of 500 independent trials for the algorithms. In this simulation, we set the forgetting factor to 0.999, 0.9995, 0.9999 and 1 and vary the sparsity S over 4, 8, 16 and 64. Table 1 shows that the performance of the \(\ell _1\)-RLS is almost the same as that of the ordinary RLS. This means the \(\ell _1\)-RLS cannot improve the estimation performance when both input and output are contaminated by noise. However, the proposed \(\ell _1\)-RTLS outperforms the other algorithms, and the improvement grows as the forgetting factor gets closer to 1.

## Conclusion

In this paper, we propose an \(\ell _1\)-regularized RTLS algorithm for sparse system identification. The proposed algorithm maintains good performance when both the input and the output are noisy. We develop the recursive procedure for the total least squares solution with an \(\ell _1\)-regularized cost function. We also present a simplified solution that requires only a little additional complexity to integrate the regularization factor. Simulations show that the proposed \(\ell _1\)-regularized RTLS algorithm performs better than RLS and \(\ell _1\)-regularized RLS for sparse systems with noisy input and noisy output.

## References

Angelosante D, Bazerque J, Giannakis G (2010) Online adaptive estimation of sparse signals: where RLS meets the l1-norm. IEEE Trans Signal Process 58:3436–3447

Arablouei R, Werner S, Dogancay K (2014) Analysis of the gradient-descent total least-squares algorithm. IEEE Trans Signal Process 62:1256–1264

Arablouei R, Dogancay K, Werner S (2015) Recursive total least-squares algorithm based on inverse power method and dichotomous coordinate-descent iterations. IEEE Trans Signal Process 63:1941–1949

Arablouei R (2016) Fast reconstruction algorithm for perturbed compressive sensing based on total least-squares and proximal splitting. Signal Process. doi:10.1016/j.sigpro.2016.06.009

Babadi B, Kalouptsidis N, Tarokh V (2010) SPARLS: the sparse RLS algorithm. IEEE Trans Signal Process 58:4013–4025

Chen Y, Gu Y, Hero A (2009) Sparse LMS for system identification. In: Paper presented at IEEE international conference on acoustics, speech and signal processing, Taiwan, 19–24 April, 2009

Choi N, Lim J, Sung K (2005) An efficient recursive total least squares algorithm for training multilayer feedforward neural networks. LNCS 3496:558–565

Davila C (1994) An efficient recursive total least squares algorithm for FIR adaptive filtering. IEEE Trans Signal Process 42:268–280

Dumitrescu B (2013) Sparse total least squares: analysis and greedy algorithms. Linear Algebra Appl 438:2661–2674

Dunne B, Williamson G (2000) Stable simplified gradient algorithms for total least squares filtering. In: Paper presented at the 32nd annual Asilomar conference on signals, systems, and computers, Pacific Grove, Oct 29–Nov 1 2000

Dunne B, Williamson G (2004) Analysis of gradient algorithms for TLS-based adaptive IIR filters. IEEE Trans Signal Process 52:3345–3356

Eksioglu E (2011) Sparsity regularized RLS adaptive filtering. IET Signal Process 5:480–487

Eksioglu E, Tanc A (2011) RLS algorithm with convex regularization. IEEE Signal Process Lett 18:470–473

Feng D, Bao Z, Jiao L (1998) Total least mean squares algorithm. IEEE Trans Signal Process 46:2122–2130

Golub G, Van Loan C (1980) An analysis of the total least squares problem. SIAM J Numer Anal 17:883–893

Gu Y, Jin J, Mei S (2009) Norm constraint LMS algorithm for sparse system identification. IEEE Signal Process Lett 16:774–777

Kalouptsidis N, Mileounis G, Babadi B, Tarokh V (2011) Adaptive algorithms for sparse system identification. Signal Process 91:1910–1919

Lim J, Pang H (2016) Mixed norm regularized recursive total least squares for group sparse system identification. Int J Adapt Control Signal Process 30:664–673

Moon T, Stirling W (2000) Mathematical methods and algorithms for signal processing. Prentice Hall, New Jersey

Soijer MW (2004) Sequential computation of total least-squares parameter estimates. J Guidance 27:501–503

Tanc A (2015) Sparsity regularized recursive total least-squares. Digital Signal Process 40:176–180

Zhu H, Leus G, Giannakis G (2011) Sparsity-cognizant total least-squares for perturbed compressive sampling. IEEE Trans Signal Process 59:2002–2016

## Authors' contributions

This paper considers sparse system identification. In particular, the authors propose a new recursive sparse system identification algorithm for the case in which both input and output are contaminated by noise. The algorithm not only gives excellent performance but also reduces the required complexity. Both authors read and approved the final manuscript.

### Acknowledgements

This paper was supported by the Agency for Defense Development (ADD) in Korea (15-106-104-027).

### Competing interests

The authors declare that they have no competing interests.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## About this article

### Keywords

- Adaptive filter
- TLS
- RLS
- Convex regularization
- Sparsity
- \(\ell _1\)-norm