# Difference-based ridge-type estimator of parameters in restricted partial linear model with correlated errors

- Jibo Wu

**Received:** 7 December 2015

**Accepted:** 12 February 2016

**Published:** 25 February 2016

## Abstract

In this article, a generalized difference-based ridge estimator is proposed for the vector parameter in a partial linear model when the errors are dependent. It is supposed that some additional linear constraints may hold to the whole parameter space. Its mean-squared error matrix is compared with the generalized restricted difference-based estimator. Finally, the performance of the new estimator is explained by a simulation study and a numerical example.

## Background

*i.i.d.* \(N(0,\sigma ^2)\) distributed.

Since the partially linear model has both parametric and nonparametric components and is more flexible than the linear model, it has been studied by many authors, such as Ahn and Powell (1993) and Wang et al. (2007).

In model (2), Yatchew (1997) mainly studied the estimation of the linear component and used differencing to eliminate the bias induced by the presence of the nonparametric component. Wang et al. (2007) presented higher-order differences for optimal efficiency in estimating the linear part by using a special class of difference sequences.

In this article we use the ridge regression concept presented by Hoerl and Kennard (1970) to overcome multicollinearity in the regression problem. Multicollinearity is defined as the existence of a nearly linear dependency among the column vectors of the design matrix *X* in the linear model \(y=X\beta +\epsilon\), where *y* is an \(n\times 1\) vector of observed responses, *X* is the observed matrix of independent variables of dimension \(n\times p\), assumed to have full rank *p*, \(\beta\) is an unknown parameter vector, and \(\epsilon\) is an error vector with \(E(\epsilon )=0, E(\epsilon \epsilon ')=\sigma ^{2}I_{n}\). Multicollinearity may lead to wide confidence intervals for individual parameters, may produce estimates with wrong signs, etc.

The condition number is a measure of the presence of multicollinearity. The condition number of the matrix *X* gives some information about the existence of multicollinearity; however, it does not reveal the structure of the linear dependency among the column vectors \(X_{1}, X_{2}, \ldots , X_{p}\). The best way of revealing the existence and structure of multicollinearity is to inspect the eigenvalues of \(X'X\). If \(X'X\) is ill-conditioned with a large condition number, a ridge regression estimator can be used to estimate \(\beta\) [see e.g., Swamy et al. (1978); Sarkar (1992); Shi (2001); Zhong and Yang (2007); Zhang and Yang (2007); Tabakan and Akdeniz (2010); Akdeniz and Tabakan (2009); Roozbeh et al. (2010); Duran and Akdeniz (2012); Duran et al. (2012); Hu (2005) and Hu et al. (2015)]. In this paper, we examine a biased estimation technique to be followed when the matrix \(X'X\) appears to be ill-conditioned in the partial linear model. We suppose that the condition number of the parametric component is large, which indicates that a biased estimation procedure is desirable.
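As an illustration of the diagnostic described above, the following is a minimal sketch in Python (NumPy), not taken from the paper. It assumes the condition number is measured as the ratio of the largest to the smallest eigenvalue of \(X'X\) (one common convention), and it uses the ordinary ridge estimator \((X'X+kI)^{-1}X'y\) of Hoerl and Kennard (1970). The function names and the toy data are hypothetical.

```python
import numpy as np

def condition_number(X):
    """Ratio of the largest to the smallest eigenvalue of X'X (assumed convention)."""
    eigvals = np.linalg.eigvalsh(X.T @ X)  # eigenvalues in ascending order
    return eigvals[-1] / eigvals[0]

def ridge(X, y, k):
    """Ordinary ridge estimator (X'X + kI)^{-1} X'y; k = 0 gives least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Toy design (hypothetical): the first two columns are nearly collinear.
rng = np.random.default_rng(0)
n = 50
z = rng.normal(size=n)
X = np.column_stack([z, z + 1e-3 * rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 1.0, 1.0]) + 0.1 * rng.normal(size=n)

cond = condition_number(X)    # large value signals ill-conditioning
b_ols = ridge(X, y, 0.0)      # least squares
b_ridge = ridge(X, y, 0.1)    # shrunken estimate
```

For any \(k>0\) the ridge estimate has a smaller Euclidean norm than the least squares estimate, which is the shrinkage that stabilizes the coefficients of the nearly collinear columns.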

The rest of the paper is organized as follows. Section “The model and differencing-based estimator” presents the model and the differencing methodology. Section “Generalized difference-based ridge estimator” defines the generalized difference-based ridge estimator, and some comparison results are given in section “MSEM-superiority of the generalized difference-based ridge estimator \(\hat{\beta }_{GRD}(k)\) over the generalized restricted difference-based estimator \(\hat{\beta }_{GRD}\)”. These results are applied to a simulation study in section “Exemplary simulation”, and a numerical example illustrating the theoretical result is given in section “A numerical example”. Some concluding remarks are given in section “Conclusions”.

## The model and differencing-based estimator

*f* is an unknown smooth function with a bounded first derivative.

*m* is the order of differencing and \(d_{0},\ldots ,d_{m}\) are differencing weights satisfying the conditions

*D* whose elements satisfy Eq. (4) as follows:

*D* in model (6) can remove the nonparametric effect in large samples (Yatchew 2003). This ignores the presence of *Df*(*t*). Thus, we may write Eq. (6) as

So, we can see that \(\widetilde{\epsilon }\) is an \((n-m)\)-vector of disturbances with \(E(\widetilde{\epsilon })=0\) and \(E(\widetilde{\epsilon }\widetilde{\epsilon }')=\sigma ^{2}DD'\).
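The differencing step can be sketched as follows. This is only an illustration, assuming the usual normalisation of the weights, \(\sum _{j}d_{j}=0\) and \(\sum _{j}d_{j}^{2}=1\), and an \((n-m)\times n\) banded matrix *D* whose rows each contain \(d_{0},\ldots ,d_{m}\) shifted one column to the right. The helper name `difference_matrix` and the test function \(t^{2}\) are hypothetical.

```python
import numpy as np

def difference_matrix(n, d):
    """(n - m) x n banded matrix with the weights d_0, ..., d_m on each row."""
    d = np.asarray(d, dtype=float)
    m = d.size - 1
    D = np.zeros((n - m, n))
    for i in range(n - m):
        D[i, i:i + m + 1] = d
    return D

# First-order differencing (m = 1) with weights (1, -1) / sqrt(2).
d = np.array([1.0, -1.0]) / np.sqrt(2.0)
D = difference_matrix(6, d)
DDt = D @ D.T  # covariance of the differenced errors, up to sigma^2

# Differencing a smooth function on an equally spaced grid leaves only small terms.
n_big = 200
t = (np.arange(1, n_big + 1) - 0.5) / n_big
max_remainder = np.max(np.abs(difference_matrix(n_big, d) @ t**2))
```

With these weights \(DD'\) has ones on the diagonal and \(-1/2\) on the first off-diagonals, so the differenced errors are correlated even when the original errors are not.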

*f*() in the model (3) (Yatchew 2003). Once \(\beta\) is estimated, a variety of nonparametric techniques could be applied to estimate *f*() as if \(\beta\) were known.

*P* is the projection matrix, defined as

## Generalized difference-based ridge estimator

*P* is the projection matrix, defined as

*R* with rank \(q< p\). Subject to the linear restriction (17), the generalized restricted difference-based estimator is given by

Then, it is easy to see that \(\hat{\beta }_{GRD}\) and \(\hat{\beta }_{GRD}(k)\) are restricted with respect to \(R\beta =0\). It is also clear that for \(k=0\), we obtain \(\hat{\beta }_{GRD}(0)=\hat{\beta }_{GRD}\).
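The two properties just stated can be checked numerically. The displayed definitions of the estimators are not reproduced in this text, so the sketch below uses an assumed form and is not claimed to be the author's exact formula: a restricted generalized least squares estimator \(b_{R}\) computed from differenced data, shrunk as \((I+kA)^{-1}b_{R}\) with \(A=C^{-1}-C^{-1}R'(RC^{-1}R')^{-1}RC^{-1}\) and \(C=\widetilde{X}'V^{-1}\widetilde{X}\). All names (`restricted_gls`, `restricted_ridge`, `X_tilde`, `y_tilde`) and the toy data are hypothetical.

```python
import numpy as np

def restricted_gls(X_tilde, y_tilde, V, R):
    """Restricted GLS estimator under R beta = 0, plus the matrix A (assumed form)."""
    VX = np.linalg.solve(V, X_tilde)             # V^{-1} X
    C = X_tilde.T @ VX                           # X' V^{-1} X
    b = np.linalg.solve(C, VX.T @ y_tilde)       # unrestricted GLS (V symmetric)
    C_inv = np.linalg.inv(C)
    M = C_inv @ R.T @ np.linalg.solve(R @ C_inv @ R.T, R)
    b_r = b - M @ b                              # projection onto {R beta = 0}
    A = C_inv - M @ C_inv
    return b_r, A

def restricted_ridge(b_r, A, k):
    """Assumed ridge-type shrinkage (I + kA)^{-1} b_r; k = 0 returns b_r."""
    return np.linalg.solve(np.eye(b_r.size) + k * A, b_r)

# Toy data (hypothetical); the true coefficient vector satisfies R beta = 0.
rng = np.random.default_rng(2)
n, p = 40, 3
X_tilde = rng.normal(size=(n, p))
y_tilde = X_tilde @ np.array([1.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)
V = 0.1 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
R = np.array([[1.0, 1.0, 0.0]])

b_r, A = restricted_gls(X_tilde, y_tilde, V, R)
b_k = restricted_ridge(b_r, A, k=0.5)
```

Under this assumed form \(RA=0\), so \(R(I+kA)^{-1}=R\) and the shrunken estimator satisfies the restriction for every \(k\ge 0\).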

## MSEM-superiority of the generalized difference-based ridge estimator \(\hat{\beta }_{GRD}(k)\) over the generalized restricted difference-based estimator \(\hat{\beta }_{GRD}\)

*W* is a nonnegative definite matrix [see Shi (2001)], we can conclude that \(\text{ Var}(\hat{\beta }_{GRD})-\text{ Var}(\hat{\beta }_{GRD}(k))\) is a nonnegative definite matrix.

Then, using the theorem of Farebrother (1976), we can conclude that if \(k>0,\; \beta '\left( W+\frac{2}{k}I\right) ^{-1}\beta \le \sigma ^{2}\), then \(\hat{\beta }_{GRD}(k)\) is preferred to \(\hat{\beta }_{GRD}\).

### Theorem 4.1

*Consider the two estimators* \(\hat{\beta }_{GRD}\) *and* \(\hat{\beta }_{GRD}(k)\) *of* \(\beta\). *Then the biased estimator* \(\hat{\beta }_{GRD}(k)\) *is MSEM-superior over* \(\hat{\beta }_{GRD}\) *if* \(k>0,\; \beta '\left( W+\frac{2}{k}I\right) ^{-1}\beta \le \sigma ^{2}\) *is satisfied.*
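The condition \(k>0,\; \beta '\left( W+\frac{2}{k}I\right) ^{-1}\beta \le \sigma ^{2}\) obtained before the theorem is easy to evaluate once \(\beta\), \(W\) and \(\sigma ^{2}\) are given. The sketch below is a hypothetical helper, with toy values for `W`, `beta` and `sigma2` chosen only for illustration; in practice \(\beta\) and \(\sigma ^{2}\) are unknown and would have to be replaced by estimates.

```python
import numpy as np

def msem_condition_holds(beta, W, k, sigma2):
    """Check k > 0 and beta' (W + (2/k) I)^{-1} beta <= sigma^2."""
    if k <= 0:
        return False
    p = beta.size
    quad = beta @ np.linalg.solve(W + (2.0 / k) * np.eye(p), beta)
    return bool(quad <= sigma2)

# Toy inputs (hypothetical): W = I, beta = (0.1, 0.1)', sigma^2 = 0.01.
W = np.eye(2)
beta = np.array([0.1, 0.1])
sigma2 = 0.01

small_k = msem_condition_holds(beta, W, k=0.5, sigma2=sigma2)   # 0.02 / 5   = 0.004
large_k = msem_condition_holds(beta, W, k=10.0, sigma2=sigma2)  # 0.02 / 1.2 ≈ 0.0167
```

In this toy case the condition holds for the small ridge parameter and fails for the large one, since the quadratic form increases with *k*.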

## Exemplary simulation

*k* and *n*. In this paper, we simulate the response from the following model:

*V* is \(v_{ij}=(0.1)^{|i-j|}\) and \(\sigma =0.1\); \(f(t_{i})=\sqrt{t_{i}(1-t_{i})}\sin \frac{2.1\pi }{t_{i}+0.05}\) is the so-called Doppler function for \(t_{i}=(i-0.5)/n\), \(i=1,\ldots ,n\); and the explanatory variables are generated by the following equation (Liu 2003):

*R* is given as follows:

From Figs. 1 and 3, we see that when *k* is smaller, the new estimator is better than the generalized difference-based estimator in the mean squared error sense. Moreover, as the multicollinearity increases, the new estimator performs well.
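The ingredients of the simulation that are stated explicitly above (the matrix *V*, the Doppler function and the design points \(t_{i}\)) can be sketched as follows. This is a partial illustration only: the sample size, the seed and the assumption that the errors have covariance \(\sigma ^{2}V\) are choices made here, and the explanatory variables and the matrix *R* are not generated because their defining equations are not reproduced in this text.

```python
import numpy as np

def doppler(t):
    """Doppler function sqrt(t (1 - t)) * sin(2.1 pi / (t + 0.05))."""
    return np.sqrt(t * (1.0 - t)) * np.sin(2.1 * np.pi / (t + 0.05))

def error_covariance(n, rho=0.1):
    """Matrix V with elements v_ij = rho^{|i - j|}."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

n = 100                                  # sample size (assumed for illustration)
sigma = 0.1
t = (np.arange(1, n + 1) - 0.5) / n      # design points t_i = (i - 0.5) / n
f = doppler(t)
V = error_covariance(n)

# Correlated errors with covariance sigma^2 V (assumed), via a Cholesky factor of V.
rng = np.random.default_rng(1)
L = np.linalg.cholesky(V)
eps = sigma * (L @ rng.normal(size=n))
```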

## A numerical example

In this section, we consider a numerical example to illustrate the theoretical result presented in the “MSEM-superiority of the generalized difference-based ridge estimator \(\hat{\beta }_{GRD}(k)\) over the generalized restricted difference-based estimator \(\hat{\beta }_{GRD}\)” section. The data were analyzed by Yatchew (2003), later discussed by Tabakan and Akdeniz (2010), and come from a survey of 81 municipal electricity distributors in Ontario, Canada, in 1993.

*V* is seldom known, so an estimate of *V* can be used. Trenkler (1984) gave some estimates of *V* as

*R* is given as follows:

*V* is estimated by (35). The condition number is easily computed to be 2365.158, suggesting the presence of severe collinearity.

In this section we use the method proposed by Hoerl and Kennard (1970) to estimate *k*. We then get \(MSE (\hat{\beta }_{GRD}(k),\beta )=0.323\) and \(MSE (\hat{\beta }_{GRD},\beta )=0.597\); that is, the new estimator is better than the generalized restricted difference-based estimator.
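The text does not spell out which Hoerl and Kennard rule was used. As a rough sketch under that caveat, one commonly cited choice is \(\hat{k}=\hat{\sigma }^{2}/\hat{\alpha }_{\max }^{2}\), where \(\hat{\alpha }=Q'\hat{\beta }\) are the least squares coefficients in the canonical (eigenvector) basis of \(X'X\). The function name and the toy data below are hypothetical; in the partial linear setting `X` and `y` would be the differenced data, which is also an assumption here.

```python
import numpy as np

def hoerl_kennard_k(X, y):
    """Assumed rule: k = sigma_hat^2 / max_i(alpha_hat_i^2)."""
    n, p = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]   # least squares fit
    resid = y - X @ b
    sigma2_hat = resid @ resid / (n - p)       # residual variance estimate
    _, Q = np.linalg.eigh(X.T @ X)             # columns of Q: eigenvectors of X'X
    alpha = Q.T @ b                            # canonical coefficients
    return sigma2_hat / np.max(alpha ** 2)

# Toy data (hypothetical).
rng = np.random.default_rng(3)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -0.5, 0.25]) + 0.2 * rng.normal(size=60)
k_hat = hoerl_kennard_k(X, y)
```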

## Conclusions

In this article, we present a new generalized difference-based ridge estimator that can be applied in the presence of multicollinearity in a partial linear model. Its MSE is compared analytically with the generalized restricted difference-based estimator. It is shown that for small values of the ridge parameter *k*, the new estimator is MSEM-superior to the generalized restricted difference-based estimator over an interval depending on the design points and the unknown parameter.

## Declarations

### Acknowledgements

The author would like to thank the anonymous referees and the Associate Editor for their constructive suggestions, which significantly improved the presentation of the article. This work was supported by the Scientific and Technological Research Program of Chongqing Municipal Education Commission (no. KJ1501114), the Natural Science Foundation Project of CQ CSTC (cstc2015jcyjA00001), and the Scientific Research Foundation of Chongqing University of Arts and Sciences (no: R2013SC12, Y2015SC47).

### Competing interests

The author declares that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## References

- Ahn H, Powell J (1993) Semiparametric estimation of censored selection models with a nonparametric selection mechanism. J Econom 58:3–29
- Akdeniz F, Tabakan G (2009) Restricted ridge estimators of the parameters in semiparametric regression model. Commun Stat Theory Methods 38(11):1852–1869
- Duran EA, Akdeniz F (2012) Efficiency of the modified jackknifed Liu-type estimator. Stat Pap 53(2):265–280
- Duran EA, Härdle WK, Osipenko M (2012) Difference based ridge and Liu type estimators in semiparametric regression models. J Multivar Anal 105(1):164–175
- Farebrother RW (1976) Further results on the mean square error of ridge regression. J R Stat Soc B 38:248–250
- Hu HC, Yang Y, Pan X (2015) Asymptotic normality of DHD estimators in a partially linear model. Stat Pap. doi:10.1007/s00362-015-0666-2
- Hu HC (2005) Ridge estimation of a semiparametric regression model. J Comput Appl Math 176:215–222
- Hoerl AE, Kennard RW (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12:55–67
- Liu KJ (2003) Using Liu type estimator to combat multicollinearity. Commun Stat Theory Methods 32(5):1009–1020
- Ruppert D, Wand MP, Holst U, Hossjer O (1997) Local polynomial variance-function estimation. Technometrics 39:262–272
- Roozbeh M, Arashi M, Niroumand HA (2010) Semiparametric ridge regression approach in partially linear models. Commun Stat Simul Comput 39:449–460
- Swamy PAVB, Mehta JS, Rapport PN (1978) Two methods of evaluating Hoerl and Kennard’s ridge regression. Commun Stat 12:1133–1155
- Shi JH (2001) The conditional ridge-type estimation of regression coefficient in restricted linear regression model. J Shanxi Teach Univ Natural Sci Ed 15:10–16
- Sarkar N (1992) A new estimator combining the ridge regression and the restricted least squares methods of estimation. Commun Stat Theory Methods 21:1987–2000
- Tabakan G, Akdeniz F (2010) Difference-based ridge estimator of parameters in partial linear model. Stat Pap 51:357–368
- Trenkler G (1984) On the performance of biased estimators in the linear regression model with correlated or heteroscedastic errors. J Econom 25:179–190
- Wang L, Brown LD, Cai TT (2007) A difference based approach to semiparametric partial linear model. Department of Statistics, The Wharton School University of Pennsylvania, Pennsylvania, Technical report
- Yatchew A (1997) An elementary estimator of the partial linear model. Econ Lett 57:135–143. Additional examples contained in Econ Lett (1998) 59, 403–405
- Yatchew A (2000) Scale economies in electricity distribution: a semiparametric analysis. J Appl Econom 15(2):187–210
- Yatchew A (2003) Semiparametric regression for the applied econometrician. Cambridge University Press, Cambridge 123
- Zhong Z, Yang H (2007) Ridge estimation to the restricted linear model. Commun Stat Theory Methods 36:2099–2115
- Zhang CM, Yang H (2007) The conditional ridge-type estimation in singular linear model with linear equality restriction. Statistics 41(6):485–494