Selecting statistical model and optimum maintenance policy: a case study of hydraulic pump
 S. Ruhi^{1} and
 M. R. Karim^{2}
Received: 1 December 2015
Accepted: 17 June 2016
Published: 4 July 2016
Abstract
Introduction
A proper maintenance policy can play a vital role in the effective investigation of product reliability. Every engineered object, such as a product, plant or infrastructure, needs preventive and corrective maintenance.
Case description
In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We obtain the data that the owner had collected, carry out an analysis and build models for pump failures. The data consist of both failure and censored lifetimes of the hydraulic pump.
Discussion and evaluation
Different competing mixture models are applied to analyze a set of maintenance data of a hydraulic pump. Various characteristics of the mixture models, such as the cumulative distribution function, reliability function and mean time to failure, are estimated to assess the reliability of the pump. The Akaike Information Criterion, the adjusted Anderson–Darling test statistic, the Kolmogorov–Smirnov test statistic and the root mean square error are used to select suitable models from the set of competing models. Maximum likelihood estimation via the EM algorithm is the main method used to estimate the parameters of the models and the reliability-related quantities.
Conclusions
In this study, it is found that a threefold mixture model (Weibull–Normal–Exponential) fits the hydraulic pump failure data set well. This paper also illustrates how a suitable statistical model can be applied to estimate the optimum maintenance period of a hydraulic pump at minimum cost.
Keywords
Case study, EM algorithm, Hydraulic pumps, Maintenance data, Mixture models, Reliability
Introduction
Every engineered object (product, plant or infrastructure) needs preventive and corrective maintenance. The cost of maintenance can vary from 5 to 30 % (Campbell 1995) of the operating budget depending on the industry sector. This implies that businesses need to manage maintenance effectively to ensure minimum costs. This requires proper data management to assist in building models for effective decision making.
In this paper we look at a real case study. It deals with the maintenance of hydraulic pumps used in excavators by a mining company. We look at the data that the owner (mining company) had collected, carry out an analysis and build models for pump failures. The data given in Murthy et al. (2015) and Karim et al. (2015) consist of both failure and censored lifetimes of the pump. Murthy et al. (2015) and Karim et al. (2015) showed that the threefold Weibull mixture distribution is the best distribution for the data among three competing distributions (single Weibull, twofold Weibull mixture and threefold Weibull mixture). In this paper we search for a suitable distribution for the data from a set of competing mixture models (based on Weibull, Exponential, Normal and Lognormal distributions). Finally, the selected distribution is used to find the optimum time at which the expected maintenance cost of the pump is minimum.
The remainder of the article is organized as follows: “Hydraulic pump failure data” section describes the set of hydraulic pump failure data analyzed in this paper. “Mixture models for modeling failure data” section presents the mixture models for modeling failure data. “Parameter estimation” section presents the MLEs of the parameters of the mixture models obtained by applying the Expectation–Maximization (EM) algorithm. “Model selection” section describes model selection for the data through graphical and statistical approaches. “Optimum maintenance cost” section presents a procedure for finding the optimum time at which the expected maintenance cost of the pump is minimum. Finally, “Conclusion” section concludes the article with a discussion of the key findings.
Hydraulic pump failure data
Table 1 Hydraulic pump failure data
Age (h)  Type  Age (h)  Type  Age (h)  Type  Age (h)  Type 

81  0  3333  1  9334  1  12,198  0 
149  1  3569  1  9368  1  12,198  0 
245  1  3837  0  9729  1  12,198  0 
340  1  3837  0  9751  0  12,198  0 
407  1  4150  0  10,299  1  12,236  0 
461  1  5123  1  10,389  0  12,236  0 
629  1  5258  1  10,413  0  12,236  0 
856  0  5662  0  10,557  1  12,236  0 
947  0  5923  1  10,944  1  12,236  0 
1460  1  6333  1  10,970  1  12,236  0 
1513  1  6717  1  11,647  0  12,394  0 
1670  1  7207  1  11,678  1  12,459  0 
1688  0  7265  1  11,686  1  13,097  0 
2093  0  7624  1  11,798  0  13,497  0 
2242  0  7625  0  11,869  0  13,497  0 
2242  0  7973  1  11,869  0  13,497  0 
2242  0  8183  1  11,923  0  13,497  0 
2242  0  8217  1  12,005  0  13,497  0 
2242  0  8390  1  12,082  0  13,497  0 
2607  1  8462  1  12,090  0  13,497  0 
2668  1  8728  1  12,136  0  14,407  1 
2806  1  8817  1  12,141  0  15,536  1 
3132  0  8870  1  12,143  0  16,289  1 
3132  0  8884  0  12,163  0  17,517  1 
3132  0  9055  1  12,198  0  
3132  0  9182  1  12,198  0 
Mixture models for modeling failure data
A variety of statistical models have been developed and studied extensively in the analysis of product failure data (Kalbfleisch and Prentice 1980; Meeker and Escobar 1998; Blischke and Murthy 2000; Lawless 2003; Murthy et al. 2004). A set of mixture models that have been used to analyze the pump failure data, given in Table 1, are discussed below.
The cumulative distribution functions, probability density functions and reliability functions for the various twofold and threefold mixture models can be obtained from Eqs. (1)–(3) by putting n = 2 and n = 3, respectively. Ruhi et al. (2015) applied a twofold Weibull mixture model for analyzing failure data. Further literature on the applications of mixture models can be found in Titterington et al. (1985), Mendenhall and Hader (1958), Ahmad and Abdelrahman (1994), and Murthy et al. (2004).
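As an illustration, the sketch below evaluates a threefold mixture cdf and reliability function in Python. It assumes the standard finite-mixture form, in which the mixture cdf is the weighted sum of the component cdfs with weights that sum to one; Eqs. (1)–(3) are not reproduced in this text, so this form is an assumption. The function names are hypothetical, and the parameter values are the Weibull–Normal–Exponential estimates reported in Table 2.

```python
import math

def weibull_cdf(t, shape, scale):
    """Weibull cdf with shape beta and scale eta."""
    return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0

def normal_cdf(t, mu, sigma):
    """Normal cdf written with the error function."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(t, rate):
    """Exponential cdf with rate delta (mean 1/delta)."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0

def mixture_cdf(t, weights, component_cdfs):
    """Weighted sum of component cdfs (assumed finite-mixture form)."""
    return sum(p * cdf(t) for p, cdf in zip(weights, component_cdfs))

def mixture_reliability(t, weights, component_cdfs):
    """Reliability (survival) function of the mixture."""
    return 1.0 - mixture_cdf(t, weights, component_cdfs)

# Weibull-Normal-Exponential estimates from Table 2.
# The rounded weights sum to 0.9999, so the cdf tops out just below 1.
weights = [0.3249, 0.5076, 0.1674]
components = [
    lambda t: weibull_cdf(t, 5.5391, 9527.83),
    lambda t: normal_cdf(t, 15991.11, 1073.821),
    lambda t: exponential_cdf(t, 0.0004),
]

for age in (2000, 8000, 14000):
    print(age,
          round(mixture_cdf(age, weights, components), 4),
          round(mixture_reliability(age, weights, components), 4))
```

Swapping the entries of `components` gives the other threefold models without changing the mixture functions.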
Parameter estimation
We estimate the parameters of different mixture models by applying the maximum likelihood estimation method. We apply the Expectation–Maximization (EM) algorithm to find the maximum likelihood estimates (MLEs) of the parameters. Details on the application of the EM algorithm for mixture models with censored data can be found in Ateya (2012), Bordes and Chauveau (2012) and Ruhi et al. (2015). Karim et al. (2015) applied single Weibull, twofold Weibull mixture and threefold Weibull mixture models to this data set and suggested the threefold Weibull mixture model as the best-fitting model on the basis of various graphical and statistical approaches. In addition to the threefold Weibull mixture model, here we assume two other threefold mixture models (Weibull–Normal–Exponential and Normal–Lognormal–Weibull) for the data. Our aim is to find out whether any other threefold mixture model fits this data set better than the threefold Weibull mixture model and, if the distribution changes, what effect this has on the optimal maintenance policy.
The parameters of these three mixture models are estimated by applying the maximum likelihood method via the Expectation–Maximization (EM) algorithm. R code was written for all computations in the paper. The code for analyzing the data with the Weibull–Normal–Exponential mixture model is given in the “Appendix”. The given code can be used for the other two models after simple modifications, mainly related to the functions dweibull(), pweibull(), dnorm(), pnorm(), dexp() and pexp() and the parameter vector theta.
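To make the role of censoring concrete, the following Python sketch shows one E-step for a threefold mixture with right-censored data. It uses the usual textbook form and is not the authors' R code: a failure time contributes through the component densities, while a censored time contributes through the component reliability functions. The function names are hypothetical, and the parameter values are the Weibull–Normal–Exponential estimates from Table 2. Only the E-step is shown; a full EM run would alternate it with an M-step that updates the weights and component parameters.

```python
import math

# Weibull-Normal-Exponential estimates from Table 2 (used here as fixed inputs).
WEIGHTS = (0.3249, 0.5076, 0.1674)
BETA, ETA = 5.5391, 9527.83
MU, SIGMA = 15991.11, 1073.821
DELTA = 0.0004

def component_densities(t):
    """Density of each component (Weibull, Normal, Exponential) at age t."""
    z = t / ETA
    weibull = (BETA / ETA) * z ** (BETA - 1.0) * math.exp(-(z ** BETA))
    u = (t - MU) / SIGMA
    normal = math.exp(-0.5 * u * u) / (SIGMA * math.sqrt(2.0 * math.pi))
    exponential = DELTA * math.exp(-DELTA * t)
    return [weibull, normal, exponential]

def component_reliabilities(t):
    """Reliability (survival) function of each component at age t."""
    weibull = math.exp(-((t / ETA) ** BETA))
    normal = 0.5 * math.erfc((t - MU) / (SIGMA * math.sqrt(2.0)))
    exponential = math.exp(-DELTA * t)
    return [weibull, normal, exponential]

def responsibilities(t, is_failure):
    """E-step: posterior probability that the observation belongs to each component."""
    terms = component_densities(t) if is_failure else component_reliabilities(t)
    weighted = [p * x for p, x in zip(WEIGHTS, terms)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Illustrative ages: an early failure, and a unit still running late in life.
print([round(r, 4) for r in responsibilities(245.0, is_failure=True)])
print([round(r, 4) for r in responsibilities(13497.0, is_failure=False)])
```

With these estimates an early failure is attributed almost entirely to the Exponential component, whereas a unit still running at a high age is attributed mainly to the Normal component.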
Table 2 MLEs of the parameters of the assumed models
Threefold mixture models  MLEs of parameters 

Weibull (β_{1}, η_{1})–Weibull (β_{2}, η_{2})–Weibull (β_{3}, η_{3})  \(\begin{aligned} & \left\{ {\beta_{1} ,\eta_{1} ,\beta_{2} ,\eta_{2} ,\beta_{3} ,\eta_{3} ,p_{1} ,p_{2} ,p_{3} } \right\} = \\ & \{ 1.0191,\;2364.0191,\;5.5758,\;9481.8351,\; \\ & 16.6426,\;16535.5039, \, 0.1659,\;0.3220,\;0.5120\} \\ \end{aligned}\) 
Weibull (β, η)–Normal (μ, σ)–Exponential (δ)  \(\begin{aligned} & \left\{ {\beta ,\eta ,\mu ,\sigma ,\delta ,p_{1} ,p_{2} ,p_{3} } \right\} = \\ & \{ 5.5391,\;9527.83,\;15991.11,\;1073.821,\; \\ & 0.0004,\;0.3249,\;0.5076,\;0.1674\} \\ \end{aligned}\) 
Normal (μ_{1}, σ_{1})–Lognormal (μ_{2}, σ_{2})–Weibull (β, η)  \(\begin{aligned} & \left\{ {\mu_{1} ,\sigma_{1} ,\mu_{2} ,\sigma_{2} ,\beta ,\eta ,p_{1} ,p_{2} ,p_{3} } \right\} = \\ & \{ 15992.0308,\;1072.7513,\;7.5063,\;1.3759,\; \\ & 5.4782,\;9497.0899,\;0.4947,\;0.1872,\;0.3180\} \\ \end{aligned}\) 
Comment

For Weibull (β_{1}, η_{1})–Weibull (β_{2}, η_{2})–Weibull (β_{3}, η_{3}) mixture model, the mean for \(F_{3} \left( {t;\beta_{3} ,\eta_{3} } \right)\) = 16,018.005 > mean for \(F_{2} \left( {t;\beta_{2} ,\eta_{2} } \right)\) = 8760.457 > mean for \(F_{1} \left( {t;\beta_{1} ,\eta_{1} } \right)\) = 2345.628.

For Weibull (β, η)–Normal (μ, σ)–Exponential (δ) mixture model, the mean for \(F_{2} \left( {t;\mu ,\sigma } \right)\) = 15,991.110 > mean for \(F_{1} \left( {t;\beta ,\eta } \right)\) = 8799.642 > mean for \(F_{3} \left( {t;\delta } \right)\) = 2500.000.

For Normal (μ_{1}, σ_{1})–Lognormal (μ_{2}, σ_{2})–Weibull (β, η) mixture model, the mean for \(F_{1} \left( {t;\mu_{1} ,\sigma_{1} } \right)\) = 15,992.031 > mean for \(F_{3} \left( {t;\beta ,\eta } \right)\) = 8765.749 > mean for \(F_{2} \left( {t;\mu_{2} ,\sigma_{2} } \right)\) = 4688.418.
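The component means quoted in the comment can be checked from the Table 2 estimates with the standard closed-form expressions: η Γ(1 + 1/β) for a Weibull component, exp(μ + σ²/2) for a Lognormal component and 1/δ for an Exponential component. The Python sketch below does this for the Weibull–Normal–Exponential model and for the Lognormal component of the third model. The helper names are hypothetical, and the mixture-level mean time to failure is computed as the weighted sum of component means, which is an illustrative addition rather than a value reported in the paper.

```python
import math

def weibull_mean(shape, scale):
    """Mean of a Weibull(beta, eta) distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def lognormal_mean(mu, sigma):
    """Mean of a Lognormal(mu, sigma) distribution."""
    return math.exp(mu + 0.5 * sigma ** 2)

def exponential_mean(rate):
    """Mean of an Exponential distribution with rate delta."""
    return 1.0 / rate

# Weibull-Normal-Exponential model, Table 2 estimates.
wne_means = {
    "Weibull": weibull_mean(5.5391, 9527.83),
    "Normal": 15991.11,  # the mean of a Normal component is mu itself
    "Exponential": exponential_mean(0.0004),
}
wne_weights = {"Weibull": 0.3249, "Normal": 0.5076, "Exponential": 0.1674}

# Mixture mean time to failure as the weighted sum of component means.
wne_mttf = sum(wne_weights[name] * wne_means[name] for name in wne_means)

# Lognormal component of the Normal-Lognormal-Weibull model.
nlw_lognormal_mean = lognormal_mean(7.5063, 1.3759)

for name, value in wne_means.items():
    print(name, round(value, 1))
print("Lognormal", round(nlw_lognormal_mean, 1))
print("Mixture MTTF (illustrative)", round(wne_mttf, 1))
```

Small differences from the reported means are expected because Table 2 lists rounded estimates.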
Model selection
Figure 1 indicates that the cdfs obtained from the three mixture models give approximately the same result, except at the right tail, where the cdfs of the Weibull–Normal–Exponential and Normal–Lognormal–Weibull mixture models lie slightly closer to the nonparametric estimate of the cdf than the cdf of the threefold Weibull mixture model does. Hence we may consider both the Weibull–Normal–Exponential and Normal–Lognormal–Weibull mixture models for the data set.
Table 3 Estimates of AIC, AD*, KS test statistic and RMSE for the models
Threefold mixture models  AIC  AD*  KS test  RMSE 

Threefold Weibull  965.5942  0.6272  0.1068  0.0247 
Weibull–Normal–Exponential  963.2532  0.5278  0.0876  0.0209 
Normal–Lognormal–Weibull  964.6492  0.4781  0.0877  0.0217 
From Table 3, we find that the Weibull–Normal–Exponential mixture model has the smallest values of AIC and RMSE, while the Normal–Lognormal–Weibull mixture model has the smallest value of the AD* test statistic. Hence, on the basis of AIC and RMSE, the Weibull–Normal–Exponential mixture model can be selected as the best of these mixture models for the hydraulic pump failure data.
We have also applied the Kolmogorov–Smirnov (KS) test statistic as a goodness-of-fit test for these threefold mixture models. At the 5 % level of significance, with n = 102, the critical value of the Kolmogorov–Smirnov one-sample test is \(1.36/\sqrt {102} = 0.135\) (Siegel and Castellan 1988). Since the observed values of the KS test statistic for all the threefold mixture models (given in Table 3) are less than the critical value, we cannot reject the null hypothesis, H_{0}, that the observed data are from a population specified by these threefold mixture distributions. Among the three mixture models, the Weibull–Normal–Exponential mixture model gives the smallest value of the KS test statistic.
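The comparison can be reproduced mechanically from the Table 3 values. The short Python sketch below picks, for each criterion, the model with the smallest value; the dictionary layout and variable names are illustrative only.

```python
# Values copied from Table 3; smaller is better for every criterion.
table3 = {
    "Threefold Weibull": {"AIC": 965.5942, "AD*": 0.6272, "KS": 0.1068, "RMSE": 0.0247},
    "Weibull-Normal-Exponential": {"AIC": 963.2532, "AD*": 0.5278, "KS": 0.0876, "RMSE": 0.0209},
    "Normal-Lognormal-Weibull": {"AIC": 964.6492, "AD*": 0.4781, "KS": 0.0877, "RMSE": 0.0217},
}

criteria = ["AIC", "AD*", "KS", "RMSE"]

# For each criterion, find the model with the minimum value.
best_by_criterion = {
    crit: min(table3, key=lambda model: table3[model][crit]) for crit in criteria
}

for crit in criteria:
    print(crit, "->", best_by_criterion[crit])
```

Three of the four criteria point to the Weibull–Normal–Exponential model, with AD* preferring the Normal–Lognormal–Weibull model.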
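The critical-value check is simple arithmetic and can be sketched in Python as follows. The constant 1.36 and n = 102 come from the text above, the KS statistics are those listed in Table 3, and the variable names are illustrative.

```python
import math

n = 102                               # number of observations in the data set
critical_value = 1.36 / math.sqrt(n)  # large-sample 5 % critical value

# Observed KS statistics from Table 3.
ks_statistics = {
    "Threefold Weibull": 0.1068,
    "Weibull-Normal-Exponential": 0.0876,
    "Normal-Lognormal-Weibull": 0.0877,
}

print("critical value:", round(critical_value, 3))
for model, statistic in ks_statistics.items():
    # H0 is not rejected when the statistic is below the critical value.
    print(model, statistic, "not rejected" if statistic < critical_value else "rejected")
```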
\(q\): Probability that the pump is scrapped and replaced by a new one under service exchange.
\(1 - q\): Probability that the pump is not scrapped and is reconditioned under service exchange.
\(p\): Probability that the item used in service exchange is installed correctly.
\(1 - p\): Probability that the item used in service exchange is not installed correctly.
\(F_{N} (t)\): Failure distribution of a new item installed correctly.
\(F_{R} (t)\): Failure distribution of a reconditioned item installed correctly.
\(F_{I} (t)\): Failure distribution of an incorrectly installed item (new or reconditioned).
Table 4 Optimal \(T^{*}\) and \(J\left( {T^{*} } \right)\) for different values of the additional cost \(\xi\)
Model  Optimal values  \(\xi\) = 70,000  \(\xi\) = 90,000  \(\xi\) = 110,000  \(\xi\) = 130,000 

Threefold Weibull  \(T^{*}\)  14,631  14,484  14,377  14,295 
Threefold Weibull  \(J\left( {T^{*} } \right)\)  10.40593  11.43373  12.45314  13.46734 
Weibull–Normal–Exponential  \(T^{*}\)  14,468  14,361  14,286  14,230 
Weibull–Normal–Exponential  \(J\left( {T^{*} } \right)\)  10.33359  11.33318  12.3265  13.31607 
Normal–Lognormal–Weibull  \(T^{*}\)  14,476  14,368  14,291  14,234 
Normal–Lognormal–Weibull  \(J\left( {T^{*} } \right)\)  10.32712  11.32669  12.31991  13.30929 
Using the estimates of p_{1}, p_{2} and p_{3} from Table 2 in Eq. (5), we get the estimates of p = 0.8326 and q = 0.6096.
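One way to see where these values come from is sketched below in Python. Eq. (5) is not reproduced in this text, so the mapping used here is an assumption inferred from the notation and the ordering of the component means: the Normal component (largest mean) is taken to represent new items installed correctly, the Weibull component reconditioned items installed correctly, and the Exponential component incorrectly installed items. Under that assumption the mixing weights are pq, p(1 − q) and 1 − p, respectively.

```python
# Weibull-Normal-Exponential mixing weights from Table 2.
p_weibull = 0.3249      # assumed: reconditioned item, installed correctly, p(1 - q)
p_normal = 0.5076       # assumed: new item, installed correctly, p * q
p_exponential = 0.1674  # assumed: incorrectly installed item, 1 - p

# Probability of correct installation.
p = 1.0 - p_exponential

# Probability that a correctly installed item is new rather than reconditioned.
q = p_normal / p

print("p =", round(p, 4))
print("q =", round(q, 4))
```

Because the weights in Table 2 are rounded to four decimals, the value of q obtained this way can differ from the reported estimate in the last digit.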
Optimum maintenance cost
Solving the problem involves building a model, and deciding on the optimal age for a preventive maintenance (PM) action requires an objective function. The objective function is the asymptotic expected cost per unit time. Note that every time instant at which an exchanged pump is put into operation can be viewed as a renewal point of a renewal process characterizing the replacements of pumps over time. The time between two successive renewal points defines a cycle. The asymptotic expected cost per unit time can be obtained as the ratio of the expected cycle cost (ECC) to the expected cycle length (ECL).
The optimal \(T\) depends on the average cost of each corrective maintenance (CM) and PM action. Like Karim et al. (2015), we use the following additional notation and assumptions.
\(C_{n}\): Sale price for new pump ($80,000).
\(C_{r}\): Cost (charged by the service agent) for reconditioning a pump under CM or PM action ($60,000).
\(\xi\): Additional cost (due to downtime, loss in revenue, etc.) resulting from CM action. We look at values of \(\xi\) = $70,000, $90,000, $110,000 and $130,000.
A maintenance action involves replacement by a new item or a reconditioned item with probabilities \(q\) and \(\left( {1 - q} \right)\) respectively. As a result, the average cost of a PM action is \(C_{p} = qC_{n} + \left( {1 - q} \right)C_{r}\) and that of a CM action is \(C_{f} = C_{p} + \xi\). The optimal \(T^{*}\) is obtained using (9) with the threefold mixture cdf \(F(t) = G_{3} (t)\), and the optimal expected cost per unit time is given by \(J\left( {T^{*} ;F\left( . \right)} \right)\), i.e., \(J(T^{ * } ;G_{3} ( \cdot ))\).
Here we can see that the optimal \(T^{*}\) depends on the additional cost \(\xi\). The optimal \(T^{*}\) and the optimal expected cost per unit time \(J\left( {T^{*} } \right)\) have been estimated for various values of \(\xi\) and for the three threefold mixture models. These results are given in Table 4, from which it can be seen that, for every model, the optimal \(T^{*}\) decreases and the optimal \(J\left( {T^{*} } \right)\) increases as \(\xi\) increases, as is to be expected.
Table 4 indicates that the threefold Weibull mixture model gives a slightly larger optimal maintenance period \(T^{*}\) than the other two models; however, the Weibull–Normal–Exponential model gives a lower maintenance cost than the threefold Weibull mixture model for all \(\xi\).
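As a worked example of these two cost formulas, the Python sketch below evaluates \(C_{p}\) and \(C_{f}\) using the prices stated above and the estimate q = 0.6096 reported in the text. The function and variable names are illustrative.

```python
def pm_cost(q, cost_new, cost_reconditioned):
    """Average cost of a PM action: q * C_n + (1 - q) * C_r."""
    return q * cost_new + (1.0 - q) * cost_reconditioned

C_N = 80000.0  # sale price of a new pump ($)
C_R = 60000.0  # cost of reconditioning a pump ($)
Q = 0.6096     # estimated probability of replacement by a new pump

c_p = pm_cost(Q, C_N, C_R)

# Average cost of a CM action for each additional cost xi considered in the paper.
cm_costs = {xi: c_p + xi for xi in (70000, 90000, 110000, 130000)}

print("C_p =", round(c_p, 2))
for xi, c_f in cm_costs.items():
    print("xi =", xi, "C_f =", round(c_f, 2))
```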
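The dependence of the optimum on \(\xi\) can be explored numerically. The Python sketch below assumes the standard age-replacement form of the cost rate, in which the expected cycle cost is \(C_{p} + \xi F(T)\) and the expected cycle length is the integral of the reliability function from 0 to \(T\). Eq. (9) is not reproduced in this text and may differ in detail, so this is an illustration rather than the authors' procedure. It uses the rounded Weibull–Normal–Exponential estimates from Table 2, \(C_{p}\) = 72,192 (from q = 0.6096), a trapezoidal rule and a simple grid search; the function names, step size and search range are arbitrary choices.

```python
import math

WEIGHTS = (0.3249, 0.5076, 0.1674)  # Weibull, Normal, Exponential (Table 2)

def mixture_cdf(t):
    """Weibull-Normal-Exponential mixture cdf with the Table 2 estimates."""
    weibull = 1.0 - math.exp(-((t / 9527.83) ** 5.5391))
    normal = 0.5 * (1.0 + math.erf((t - 15991.11) / (1073.821 * math.sqrt(2.0))))
    exponential = 1.0 - math.exp(-0.0004 * t)
    w1, w2, w3 = WEIGHTS
    return w1 * weibull + w2 * normal + w3 * exponential

def optimal_age(xi, c_p=72192.0, step=10.0, t_max=20000.0):
    """Grid search for the T that minimises ECC(T) / ECL(T)."""
    best_t, best_j = None, float("inf")
    t, cycle_length = 0.0, 0.0
    prev_rel = 1.0 - mixture_cdf(0.0)
    while t < t_max:
        t += step
        f_t = mixture_cdf(t)
        rel = 1.0 - f_t
        cycle_length += 0.5 * (prev_rel + rel) * step  # trapezoidal rule for ECL
        prev_rel = rel
        cost_rate = (c_p + xi * f_t) / cycle_length    # assumed ECC / ECL
        if cost_rate < best_j:
            best_t, best_j = t, cost_rate
    return best_t, best_j

results = {xi: optimal_age(xi) for xi in (70000, 90000, 110000, 130000)}
for xi, (t_star, j_star) in results.items():
    print("xi =", xi, "T* ~", t_star, "J(T*) ~", round(j_star, 3))
```

The output should show the same qualitative pattern as Table 4 (a shorter optimal age and a higher cost rate as \(\xi\) grows), although the figures need not match exactly because the inputs are rounded and the objective is an assumed form.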
Conclusion
Proper data management (data collection and analysis) is very important for effective maintenance of any engineered object. Data are critical for building and selecting suitable statistical models, and the models provide new insights for improving maintenance operations.
This paper has dealt with a real case study to illustrate how statistical models can be selected and applied for estimating the optimum maintenance period and cost of a hydraulic pump. Among the three competing models, the Weibull–Normal–Exponential mixture model is recommended as the best model for the hydraulic pump failure data. This model suggests the optimum maintenance period for the pump that reduces the maintenance cost. Annotated R code is provided for analyzing the hydraulic pump failure data with the Weibull–Normal–Exponential mixture model. The code can be modified easily to apply other threefold mixture models.
Declarations
Authors’ contributions
The authors carried out this work in consultation with each other and drafted the manuscript together. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 Ahmad KE, Abdelrahman AM (1994) Updating a nonlinear discriminant function estimated from a mixture of 2Weibull distributions. Math Comput Model 18:41–51
 Ateya SF (2012) Maximum likelihood estimation under a finite mixture of generalized exponential distributions based on censored data. In: Statistical papers (5 November 2012), pp 1–15
 Blischke WR, Murthy DNP (2000) Reliability. Wiley, New York
 Blischke WR, Karim MR, Murthy DNP (2011) Warranty data collection and analysis. Springer, Berlin
 Bordes L, Chauveau D (2012) EM and stochastic EM algorithms for reliability mixture models under random censoring. hal00685823, v1. https://hal.archivesouvertes.fr/hal00685823v1
 Campbell JD (1995) Outsourcing in maintenance management—a valid alternative to self-provision. J Qual Maint Eng 1(3):18–24
 Kalbfleisch JD, Prentice RL (1980) The statistical analysis of failure time data. Wiley, New York
 Karim MR, Ahmadi A, Murthy DNP (2015) Modeling of maintenance data. In: Presented at ICRESHARMS conference, 2015, Lulea, Sweden
 Lawless JF (2003) Statistical methods for lifetime data. Wiley, New York
 Meeker WQ, Escobar LA (1998) Statistical methods for reliability data. Wiley, New York
 Mendenhall W, Hader RJ (1958) Estimation of parameters of mixed exponentially distributed failure time distributions from censored life test data. Biometrika 45:504–520
 Murthy DNP, Xie M, Jiang R (2004) Weibull models. Wiley, New York
 Murthy DNP, Karim MR, Ahmadi A (2015) Data management in maintenance outsourcing. Reliab Eng Syst Saf 142:100–110
 Ruhi S, Sarker S, Karim MR (2015) Mixture models for analyzing product reliability data: a case study. SpringerPlus 4:634
 Siegel S, Castellan NJ (1988) Nonparametric statistics for the behavioral sciences. McGraw-Hill, New York
 Titterington M, Smith AFM, Makov UE (1985) Statistical analysis of finite mixture distribution. Wiley, New York