Perturbative method for maximum likelihood estimation of the Weibull distribution parameters
- V. H. Coria^{1},
- S. Maximov^{1},
- F. Rivas-Dávalos^{1} (corresponding author) and
- C. L. Melchor-Hernández^{1}
- Received: 1 June 2016
- Accepted: 10 October 2016
- Published: 18 October 2016
Abstract
The two-parameter Weibull distribution is the predominant distribution in reliability and lifetime data analysis. The classical approach for estimating the scale \((\alpha )\) and shape \((\beta )\) parameters employs the maximum likelihood estimation (MLE) method. However, most MLE-based methods resort to numerical or graphical techniques because no closed-form expression exists for the Weibull \(\beta\) parameter. A Weibull \(\beta\) parameter estimator based on perturbation theory is proposed in this work. An explicit expression for \(\beta\) is obtained, making the estimation of both parameters straightforward. Several right-censored lifetime data sets with different sample sizes and censoring percentages were analyzed in order to assess the performance of the proposed estimator. Case study results show that our parameter estimator provides solutions of high accuracy, overcoming limitations of other parameter estimators.
Keywords
- Weibull distribution
- Maximum likelihood estimation
- Parameter estimation
- Censored data
- Perturbation theory
Background
The two-parameter Weibull distribution is widely used in reliability engineering and lifetime data analysis because of its flexibility to properly model increasing and decreasing failure rates. It has gained the interest of researchers who have worked on its various aspects, such as inference, application and parameter estimation (see Nelson 1982; Cohen 1991; Johnson et al. 1994; Meeker and Escobar 1998). Traditional parameter estimation methods call on probability plotting, least squares and maximum likelihood estimation (Lawless 1982).
A probability plotting approach is straightforward and it is best used for small size data samples. However, this estimation method has not been sufficiently accurate as reported in Mao and Li (2007). The least squares (or rank regression) method is essentially a probability plotting method that applies least squares to determine lines through points. The main disadvantage of this method is that it assigns a large weight for extreme observations, producing a large variance (Genschel and Meeker 2010).
Maximum likelihood estimation (MLE) is considered one of the most robust parameter estimation techniques. It constructs a likelihood function for a set of statistical data, which is optimized to find its extremum with respect to the distribution parameters. The MLE method can handle survival and interval data better than rank regression approaches, particularly when dealing with heavily censored data sets that contain few exactly observed data points. Teimouri et al. (2013) compare the MLE method with four other methods [the Method of Logarithm Moment (MLM), the Percentile Method (PM), the L-Moments Method (LM), and the Method of Moments (MM)] to determine Weibull parameters. One of the main findings of that work is that parameters are best estimated with the MLE and LM estimators. However, MLE leads to likelihood equations that need to be solved numerically. Therefore, slow convergence and the choice of an efficient iterative method must be properly addressed, which can be particularly difficult with censored data (Balakrishnan and Kateri 2008).
Recent research has been focused on obtaining new efficient numerical and statistical inference methods in order to deal with this problem. Joarder et al. (2011) consider statistical inferences of the unknown parameters of the Weibull distribution with right-censored data samples, stating that the MLE cannot lead to explicit forms of the Weibull distribution. Therefore, they propose approximate maximum likelihood estimators (AMLE), which are obtained by expanding the MLE equations in Taylor series. Also, the authors propose a fixed-point algorithm to compute the maximum likelihood estimators.
Balakrishnan and Mitra (2012) use an expectation-maximization (EM) algorithm to estimate the model parameters of the Weibull distribution for left-truncated and right-censored data. The algorithm consists of two steps: an expectation step (E-step) and a maximization step (M-step). The conditional expectation of the complete data likelihood is obtained in the E-step, using the incomplete observed data and the current estimated value of the parameter. This expected likelihood is essentially a function of the involved parameter and of the current value under which the expectation has been calculated. The expected likelihood is then maximized with respect to the parameter using the EM gradient algorithm. The E- and M-steps are then iterated until convergence. MLE and Bayes estimators are applied to calculate the survival function and the failure rate of the Weibull distribution for censored data in Guure and Ibrahim (2012). In order to estimate the survival and the failure rate functions under the MLE, the authors applied the Newton–Raphson method. Bayes estimators are obtained using linear exponential, general entropy and squared error loss functions, while a noninformative prior is employed to estimate the survival function and failure rate. However, the a posteriori distribution function cannot be reduced to a closed form because it involves a ratio of complicated integrals. More work concerning Weibull parameter estimation can be found in Jabeen et al. (2013), Yang and Scott (2013), Guure and Ibrahim (2014), Mohammed Ahmed (2014) and Wang and Ye (2015).
Most parameter estimation methods presented in the literature are useful tools for solving practical problems, showing that the Weibull parameter estimation problem continues to be important in the research field of data analysis. Hence, it is clear that the development of general and new methods for a wider range of applications is desirable.
In this paper, an approximate analytical method to estimate the \(\beta\) Weibull parameter for complete and right-censored data is proposed using perturbation theory. The method involves a systematic construction of an analytical solution to the likelihood equation for \(\beta\), taking advantage of the presence of a small parameter. The solution is developed as a power series with respect to this parameter. As a result, the likelihood equation for \(\beta\) is replaced by a set of simple solvable algebraic equations. These equations are explicitly solved one by one in order to obtain an increasingly accurate approximation to the true solution.
Problem statement
Existence and uniqueness of the likelihood estimate
The existence and uniqueness of the solution of the MLE equation have already been proved in Balakrishnan and Kateri (2008) and Farnum and Booth (1997) using the Cauchy–Schwarz inequality. A different proof is presented here, leading to our proposed analytical solution for the \(\beta\) parameter.
The solution \(\beta ^{*}\) is found from Eq. (20), which can be substituted into \(\beta ^{*}=\beta _{1}\,\zeta ^{*}=\beta _{1} (1+z)\) according to Eqs. (8) and (19). This way, \(\alpha ^{*}\) can be calculated by substituting the estimated \(\beta ^{*}\) value into Eq. (5).
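For orientation, since the numbered equations cited above are not reproduced in this text, the standard log-likelihood for right-censored Weibull data can be written in the notation of the case studies (\(t_k\) is the recorded time of unit k, \(\delta _k=1\) for a failure and \(\delta _k=0\) for a suspension, M is the number of units). The symbol \(r=\sum _k \delta _k\) for the number of failures is introduced here for convenience and is not taken from the paper:

\[
\ln L(\alpha ,\beta ) = \sum _{k=1}^{M} \delta _k \left[ \ln \beta - \beta \ln \alpha + (\beta -1)\ln t_k \right] - \sum _{k=1}^{M} \left( \frac{t_k}{\alpha }\right) ^{\beta }
\]

Setting the partial derivatives with respect to \(\alpha\) and \(\beta\) to zero gives

\[
\alpha ^{\beta } = \frac{1}{r}\sum _{k=1}^{M} t_k^{\beta }, \qquad \frac{\sum _{k=1}^{M} t_k^{\beta }\ln t_k}{\sum _{k=1}^{M} t_k^{\beta }} - \frac{1}{\beta } - \frac{1}{r}\sum _{k=1}^{M} \delta _k \ln t_k = 0 .
\]

The first relation yields \(\alpha\) explicitly once \(\beta\) is known, while the second has no closed-form solution in \(\beta\). These are presumably the relations behind the equations cited above for \(\alpha ^{*}\) and \(\beta ^{*}\), but that correspondence is an assumption.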
A perturbative approach to estimate the shape parameter
Perturbation theory is employed in this section to solve Eq. (20). It allows \(\zeta ^{*}\) to be represented as an asymptotic expansion, which can then be conveniently truncated to obtain an analytical solution to Eq. (20).
Cases of study
Three study cases are shown in this section to illustrate the application of our proposed analytical method for the estimation of the Weibull \(\beta\) parameter. The first study considers a right-censored data set found in Balakrishnan and Kateri (2008), where a graphical solution for the determination of the MLE shape parameter is employed. In the second study, the proposed method is applied to the right-censored data used in Balakrishnan and Mitra (2012). Finally, sets of lifetime data are randomly generated combining different censoring rates and sample sizes, in order to cover a wider range of data sampling scenarios that might be encountered in practical applications. The corresponding Weibull parameters are estimated for each data set.
For the first two cases, the \(\beta\) parameter was also estimated using a Newton–Raphson (NR) algorithm with the purpose of illustrating the advantage of our proposed analytical method, where \(\beta\) is obtained from a single equation.
Case 1
Table 1 Lifetime data for Case 1
k | \(t_k\) | \(\delta _k\) | k | \(t_k\) | \(\delta _k\) | k | \(t_k\) | \(\delta _k\) | k | \(t_k\) | \(\delta _k\) |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 12.5 | 1 | 6 | 95.5 | 1 | 11 | 125.6 | 1 | 16 | 152.7 | 0 |
2 | 24.4 | 1 | 7 | 96.6 | 1 | 12 | 152.7 | 1 | 17 | 152.7 | 0 |
3 | 58.2 | 1 | 8 | 97.0 | 1 | 13 | 152.7 | 0 | 18 | 152.7 | 0 |
4 | 68.0 | 1 | 9 | 114.2 | 1 | 14 | 152.7 | 0 | 19 | 152.7 | 0 |
5 | 69.1 | 1 | 10 | 123.2 | 1 | 15 | 152.7 | 0 | 20 | 152.7 | 0 |
Table 2 Parameter estimates for Case 1
Parameters | Approximate analytical method | Approach in Balakrishnan and Kateri (2008) | NR method \(\beta _{0}=2\) |
---|---|---|---|
\(\beta\) | 1.6466 | 1.647 | 1.6466 (6 iterations) |
\(\alpha\) | 162.22306 | 162.223 | 162.2330 |
The MLE was combined with the Newton–Raphson method using a convergence tolerance of 0.001 and an initial guess \(\beta _{0}=2\), which corresponds to our \(\beta\) solution rounded up to the next integer. This criterion is adopted because an initial value close to the desired solution is well known to help the NR method converge within a few iterations.
It can be observed from Table 2 that the parameters obtained from our analytical method match closely the estimates of the NR method and graphical method proposed in Balakrishnan and Kateri (2008). Therefore, it can be stated that our proposed parameter estimation methodology not only works well for this case, but it also directly provides \(\beta\) from an explicit expression [Eq. (32)].
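The Newton–Raphson comparison for Case 1 can be reproduced with a short script. The following is a minimal sketch, not the authors' program: it assumes the standard right-censored Weibull score function written in the docstring, uses the Case 1 data listed above, and the function names `score` and `weibull_mle_nr` are hypothetical. The stopping rule (absolute step size below 0.001) is also an assumption, since the paper only states the tolerance value.

```python
import math

# Case 1 data: 12 observed failures and 8 units censored at 152.7.
failures = [12.5, 24.4, 58.2, 68.0, 69.1, 95.5, 96.6, 97.0,
            114.2, 123.2, 125.6, 152.7]
times = failures + [152.7] * 8
delta = [1] * 12 + [0] * 8  # 1 = failure, 0 = suspension


def score(beta, times, delta):
    """Return g(beta) and g'(beta) for the standard profile equation

    g(beta) = sum(t^b ln t) / sum(t^b) - 1/b - (1/r) sum(delta ln t),
    where r is the number of failures.
    """
    r = sum(delta)
    logs = [math.log(t) for t in times]
    w = [t ** beta for t in times]
    s0 = sum(w)
    s1 = sum(wi * li for wi, li in zip(w, logs))
    s2 = sum(wi * li * li for wi, li in zip(w, logs))
    mean_log_fail = sum(d * li for d, li in zip(delta, logs)) / r
    g = s1 / s0 - 1.0 / beta - mean_log_fail
    dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (beta * beta)
    return g, dg


def weibull_mle_nr(times, delta, beta0=2.0, tol=1e-3, max_iter=50):
    """Newton-Raphson on g(beta); alpha follows in closed form."""
    beta = beta0
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        g, dg = score(beta, times, delta)
        step = g / dg
        beta -= step
        if abs(step) < tol:  # assumed stopping rule
            break
    r = sum(delta)
    alpha = (sum(t ** beta for t in times) / r) ** (1.0 / beta)
    return alpha, beta, n_iter


if __name__ == "__main__":
    alpha, beta, n_iter = weibull_mle_nr(times, delta)
    print(f"alpha = {alpha:.3f}, beta = {beta:.4f}, iterations = {n_iter}")
```

The number of iterations depends on the stopping rule, so it need not match the count reported in the table.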
Case 2
Table 3 Simulated data set provided by Balakrishnan and Mitra (2012)
No. | Installation year | Failure year | No. | Installation year | Failure year | No. | Installation year | Failure year | No. | Installation year | Failure year |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1984 | – | 26 | 1986 | – | 51 | 1982 | – | 76 | 1974 | 2006 |
2 | 1990 | 2001 | 27 | 1987 | – | 52 | 1981 | – | 77 | 1978 | 1995 |
3 | 1983 | 2002 | 28 | 1990 | 1997 | 53 | 1986 | – | 78 | 1962 | 1993 |
4 | 1981 | 2000 | 29 | 1980 | 1996 | 54 | 1980 | 1990 | 79 | 1963 | – |
5 | 1985 | – | 30 | 1980 | – | 55 | 1980 | 1994 | 80 | 1960 | 1998 |
6 | 1991 | – | 31 | 1981 | – | 56 | 1982 | – | 81 | 1962 | 2007 |
7 | 1982 | – | 32 | 1983 | 1997 | 57 | 1990 | 2008 | 82 | 1960 | 1990 |
8 | 1990 | – | 33 | 1980 | – | 58 | 1985 | – | 83 | 1962 | 1980 |
9 | 1983 | 1999 | 34 | 1984 | – | 59 | 1983 | – | 84 | 1961 | 1981 |
10 | 1992 | – | 35 | 1982 | – | 60 | 1982 | – | 85 | 1964 | 1989 |
11 | 1983 | – | 36 | 1980 | – | 61 | 1963 | 1996 | 86 | 1964 | 1987 |
12 | 1989 | – | 37 | 1985 | 2007 | 62 | 1963 | 2001 | 87 | 1960 | 2006 |
13 | 1985 | – | 38 | 1993 | – | 63 | 1961 | 1998 | 88 | 1961 | 1992 |
14 | 1982 | – | 39 | 1983 | – | 64 | 1961 | 1992 | 89 | 1964 | – |
15 | 1983 | – | 40 | 1980 | – | 65 | 1960 | 1984 | 90 | 1963 | 1991 |
16 | 1981 | – | 41 | 1981 | 2001 | 66 | 1964 | 2004 | 91 | 1973 | – |
17 | 1985 | – | 42 | 1989 | – | 67 | 1961 | 1994 | 92 | 1964 | – |
18 | 1981 | – | 43 | 1993 | – | 68 | 1977 | 1998 | 93 | 1972 | 1984 |
19 | 1988 | 2002 | 44 | 1983 | – | 69 | 1963 | 1987 | 94 | 1962 | 2007 |
20 | 1983 | – | 45 | 1993 | – | 70 | 1960 | 1991 | 95 | 1963 | 1997 |
21 | 1984 | – | 46 | 1987 | – | 71 | 1961 | 1983 | 96 | 1964 | 1987 |
22 | 1989 | – | 47 | 1994 | – | 72 | 1964 | 1995 | 97 | 1964 | 2002 |
23 | 1988 | – | 48 | 1985 | 2007 | 73 | 1963 | 1998 | 98 | 1971 | – |
24 | 1982 | – | 49 | 1981 | – | 74 | 1961 | 2001 | 99 | 1965 | 1990 |
25 | 1981 | – | 50 | 1983 | 2004 | 75 | 1960 | 1988 | 100 | 1962 | 1994 |
Table 4 Parameter estimates for Case 2
Parameters | Approximate analytical method | NR method \(\beta _{0}=4\) |
---|---|---|
\(\beta\) | 3.205 | 3.2506 (8 iterations) |
\(\alpha\) | 35.245 | 35.2084 |
It can be observed again that the parameters obtained with our analytical method match the estimates of the NR method quite well. This case is another example that shows the efficacy of our proposed analytical method for the determination of Weibull parameters.
Case 3
The simulation approach of Zhou et al. (2013) was adopted to generate different sets of lifetime data at prespecified points of time. The generated data mimic a time-censored sampling scenario for a hypothetical number of transformer units, which operate at the same time. In addition, it is assumed that the population of units is homogeneous with a fixed censoring time C, and each individual unit has a lifetime \(t_k\), \(k=\overline{1,M}\), where M denotes the total number of units. The \(t_k\) are considered independent and identically distributed random variables that follow a specific probability distribution. Finally, the lifetime data are characterized by the censoring rate CR, defined as the proportion of censored data and calculated as the number of suspensions divided by the sample size.
The sample sizes employed in this study were 10, 20, 50, 100, 500 and 1000. Censoring rates were fixed at 0, 20 and \(80\,\%\). Each group of simulated lifetime data required M values of \(t_k\) that were randomly generated from a two-parameter Weibull distribution with prespecified values of \(\alpha _{\mathrm{true}}=3.0\) and \(\beta _{\mathrm{true}}=1.5\). Censoring times were chosen to have a common value, calculated as \(C=F_x^{-1} (p;\alpha ,\beta )\), where p is the probability that a unit, starting at time 0, fails before reaching the censoring time C. For the three censoring rates, p was fixed at 1.0, 0.8 and 0.2, respectively. Then, a lifetime data set is generated by comparing each unit's lifetime with the selected censoring time: if \(t_{k}\) is less than or equal to C, the unit has failed. Otherwise, the unit is a suspension with its lifetime censored at time C.
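The data generation just described can be sketched as follows. This is an illustrative reading of the procedure rather than the code of Zhou et al. (2013) or of the authors: the function names `weibull_quantile` and `simulate_time_censored` and the `seed` argument are hypothetical, and the censoring time is taken from the closed-form Weibull quantile \(C=\alpha \,[-\ln (1-p)]^{1/\beta }\), with \(p=1\) treated as no censoring.

```python
import math
import random


def weibull_quantile(p, alpha, beta):
    """Inverse Weibull CDF; p >= 1 is treated as 'no censoring' (C = inf)."""
    if p >= 1.0:
        return math.inf
    return alpha * (-math.log(1.0 - p)) ** (1.0 / beta)


def simulate_time_censored(m, p, alpha=3.0, beta=1.5, seed=None):
    """Draw m Weibull lifetimes and censor them at the common time C."""
    rng = random.Random(seed)
    c = weibull_quantile(p, alpha, beta)
    times, delta = [], []
    for _ in range(m):
        t = rng.weibullvariate(alpha, beta)  # alpha = scale, beta = shape
        if t <= c:
            times.append(t)   # unit failed before C
            delta.append(1)
        else:
            times.append(c)   # suspension, lifetime censored at C
            delta.append(0)
    return times, delta, c


if __name__ == "__main__":
    times, delta, c = simulate_time_censored(m=100, p=0.8, seed=1)
    cr = 1.0 - sum(delta) / len(delta)
    print(f"C = {c:.3f}, realized censoring rate = {cr:.2f}")
```

Because each unit is censored independently with probability \(1-p\), the realized censoring rate of a finite sample only approximates the target value, most visibly for small M.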
This study was designed to demonstrate the effectiveness of our proposed analytical MLE method for Weibull parameters. Our proposed method was also compared in this work with the L-Moments estimation method presented in Teimouri et al. (2013), which is based on linear combinations of order statistics and provides closed-form expressions for the Weibull parameters.
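To make the closed-form nature of the L-Moments estimator concrete, the sketch below uses the standard Weibull L-moment relations \(\lambda _1=\alpha \,\Gamma (1+1/\beta )\) and \(\lambda _2/\lambda _1 = 1-2^{-1/\beta }\), with the usual unbiased sample L-moments. It applies to complete (uncensored) samples only; how the comparison in this study handled censored observations is not stated here, so this is not claimed to reproduce the tabulated values. The function name `weibull_lmoments` is hypothetical.

```python
import math
import random


def weibull_lmoments(sample):
    """Closed-form L-moments estimates (alpha, beta) for a complete sample."""
    x = sorted(sample)
    n = len(x)
    l1 = sum(x) / n                      # first sample L-moment (mean)
    # b1 = (1/n) * sum over ranks j of ((j - 1)/(n - 1)) * x_(j)
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    l2 = 2.0 * b1 - l1                   # second sample L-moment
    beta = -math.log(2.0) / math.log(1.0 - l2 / l1)
    alpha = l1 / math.gamma(1.0 + 1.0 / beta)
    return alpha, beta


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.weibullvariate(3.0, 1.5) for _ in range(1000)]
    alpha, beta = weibull_lmoments(sample)
    print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```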
Table 5 Case 3. Parameter estimates for different simulated data sets
CR (%) | M | L-Moments method \(\alpha\) | L-Moments method \(\beta\) | Approximate analytical method \(\alpha\) | Approximate analytical method \(\beta\) |
---|---|---|---|---|---|
0 | 10 | 3.0522 | 2.1521 | 3.0563 | 2.5286 |
0 | 20 | 3.4026 | 1.3887 | 3.4289 | 1.4851 |
0 | 50 | 2.7332 | 1.3915 | 2.7112 | 1.3497 |
0 | 100 | 2.9970 | 1.5129 | 2.9860 | 1.4877 |
0 | 500 | 3.1178 | 1.5677 | 3.0670 | 1.4772 |
0 | 1000 | 3.0152 | 1.5289 | 2.9624 | 1.4342 |
20 | 10 | 2.6830 | 1.9116 | 2.9353 | 1.7227 |
20 | 20 | 2.9035 | 1.8576 | 3.3553 | 1.4175 |
20 | 50 | 2.5856 | 1.8822 | 2.8145 | 1.5906 |
20 | 100 | 2.8119 | 2.1616 | 2.9557 | 1.9031 |
20 | 500 | 2.7591 | 1.8930 | 3.0847 | 1.4296 |
20 | 1000 | 2.7483 | 1.9360 | 3.0350 | 1.5349 |
80 | 10 | 1.0279 | 4.5619 | 1.4874 | 2.1709 |
80 | 20 | 1.0871 | 12.9564 | 2.4157 | 2.2982 |
80 | 50 | 1.0709 | 7.8137 | 3.0147 | 1.4884 |
80 | 100 | 1.0553 | 6.1198 | 3.4455 | 1.2138 |
80 | 500 | 1.0757 | 9.3024 | 2.8698 | 1.6711 |
80 | 1000 | 1.0709 | 8.2974 | 2.9130 | 1.5363 |
Conclusions
An analytical approach is developed in this work to estimate the Weibull parameter \(\beta\) for complete and right-censored data using perturbation theory. The idea behind this method is to formally expand the solution of the likelihood equation for \(\beta\) around the point 1.0 as a power series in \(\varepsilon\), which turns out to be a small parameter. In fact, if \(\varepsilon\) is zero, the equation is exactly solvable. Therefore, the problem reduces to finding the asymptotic behavior of the best approximation to the true solution to orders \(\varepsilon ,\varepsilon ^{2},\ldots\) Thus, perturbation theory leads to an expression for the desired solution in terms of a formal power series in a “small” parameter that quantifies the deviation from the exactly solvable problem. Hence, an approximate analytical solution for the \(\beta\) parameter is obtained by truncating the series at a prespecified order.
Our analytical method for estimation of the Weibull parameter was tested on several lifetime data sets. The performance of the proposed method was satisfactory for all of these data sets, which combine different sample sizes with light and heavy censoring and thus cover a wide range of practical scenarios.
The main conclusion that can be drawn from this work is that the formulations described in the “Existence and uniqueness of the likelihood estimate” and “A perturbative approach to estimate the shape parameter” sections allow \(\beta\) to be obtained analytically. This method was not only tested numerically using common and practical data sets, but was also proved mathematically. Our approach estimates \(\beta\) from a single equation, with no need for graphical or iterative procedures.
Finally, it is worth mentioning that although only the estimation of Weibull parameters under a right-censoring scheme was considered, the proposed method can be extended to other schemes such as left truncation and hybrid censoring. Additional work in this direction is required and is currently being considered by the authors.
Authors' contributions
VHC is one of the authors of the original idea, mathematical background and all numerical simulations. SM is the leader of the group, one of the authors of the original idea and mathematical background of the work. He also made substantial critical revisions to the manuscript. FR-D made substantial contributions to the design and execution of this study and made critical revisions to the manuscript. CLM-H was involved in the study design, data acquisition and analysis, and provided critical revisions to the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
