The McDonald exponentiated gamma distribution and its statistical properties

Abstract In this paper, we propose a five-parameter lifetime model called the McDonald exponentiated gamma distribution to extend beta exponentiated gamma, Kumaraswamy exponentiated gamma and exponentiated gamma, among several other models. We provide a comprehensive mathematical treatment of this distribution. We derive the moment generating function and the rth moment. We discuss estimation of the parameters by maximum likelihood and provide the information matrix. AMS Subject Classification Primary 62N05; secondary 90B25


Introduction
The gamma distribution is one of the most popular models for analyzing skewed data and hydrological processes. One of the important families of distributions in lifetime tests is the exponentiated gamma (EG) distribution, introduced by Gupta et al. 1998, which has cumulative distribution function (cdf)

$$G(x, \lambda, \theta) = \left[1 - e^{-\lambda x}(1 + \lambda x)\right]^{\theta}, \quad \lambda > 0,\ \theta > 0,\ x \ge 0, \tag{1}$$

where λ and θ are the scale and shape parameters, respectively. The corresponding probability density function (pdf) is given by

$$g(x, \lambda, \theta) = \theta \lambda^{2} x e^{-\lambda x}\left[1 - e^{-\lambda x}(1 + \lambda x)\right]^{\theta - 1}. \tag{2}$$

Shawky and Bakoban 2008 discussed the exponentiated gamma distribution as an important lifetime model and derived Bayesian and non-Bayesian estimators of the shape parameter, reliability and failure rate functions in the case of complete and type-II censored samples. Order statistics from the exponentiated gamma distribution and the associated inference were discussed by Bakoban 2009. Ghanizadeh et al. 2011 dealt with the estimation of the parameters of the EG distribution in the presence of k outliers; the maximum likelihood and moment estimators were derived and compared empirically using Monte Carlo simulation. Singh et al. 2011a, 2011b proposed Bayes estimators of the parameter of the exponentiated gamma distribution and the associated reliability function under a general entropy loss function for a censored sample. The proposed estimators were compared with the corresponding Bayes estimators obtained under a squared error loss function and with the maximum likelihood estimators through their simulated risks. Khan and Kumar 2011 established explicit expressions and some recurrence relations for single and product moments of lower generalized order statistics from the exponentiated gamma distribution. Feroze and Aslam 2012 introduced a Bayesian analysis of the exponentiated gamma distribution under type II censored samples. Recently, Nasiri et al. 2013 discussed classical and Bayesian estimation of the parameters of the generalized exponentiated gamma distribution.

http://www.springerplus.com/content/4/1/2

McDonald generalized distribution
Consider an arbitrary parent cdf G(x). The probability density function (pdf) f(x) of the new class of distributions called the McDonald generalized distributions (denoted with the prefix "Mc" for short) is defined by

$$f(x) = \frac{c}{B(a, b)}\, g(x)\, G^{ac-1}(x)\left[1 - G^{c}(x)\right]^{b-1}, \tag{3}$$

where a > 0, b > 0 and c > 0 are additional shape parameters (see Cordeiro et al. (2012) for additional details) and B(a, b) = Γ(a)Γ(b)/Γ(a + b) is the beta function. Note that g(x) = dG(x)/dx is the pdf of the parent distribution. The additional shape parameters serve to introduce skewness and to vary the tail weight. For c = 1 we obtain a submodel of this generalization, the beta generalized family (see Eugene et al. 2002), and for a = 1 we have the Kumaraswamy (Kw) generalized distributions (see Cordeiro and Castro 2011). For a random variable X with density function (3), we write X ∼ Mc-G. The density (3) is most tractable when G(x) and g(x) have simple analytic expressions. The corresponding cumulative function for this generalization is given by

$$F(x) = I_{G^{c}(x)}(a, b) = \frac{1}{B(a, b)} \int_0^{G^{c}(x)} w^{a-1}(1 - w)^{b-1}\, dw, \tag{4}$$

where $I_y(a, b) = \frac{1}{B(a, b)} \int_0^{y} w^{a-1}(1 - w)^{b-1}\, dw$ denotes the incomplete beta function ratio (Gradshteyn and Ryzhik 2000). Equation (4) can also be rewritten as

$$F(x) = \frac{G^{ac}(x)}{a\, B(a, b)}\; {}_2F_1\!\left(a, 1 - b; a + 1; G^{c}(x)\right), \tag{5}$$

where ${}_2F_1$ is the well-known hypergeometric function (see Gradshteyn and Ryzhik 2000). Some mathematical properties of the cdf F(x) for any Mc-G distribution defined from a parent G(x) in Equation (5) could, in principle, follow from the properties of the hypergeometric function, which are well established in the literature (Gradshteyn and Ryzhik 2000, Sec. 9.1). One important benefit of this class is its ability to fit skewed data that cannot be properly fitted by many other existing distributions. The Mc-G family of densities allows for greater flexibility of its tails and has applications in various fields including economics, finance, reliability, engineering, biology and medicine.

The hazard function (hf) and reverse hazard function (rhf) of the Mc-G distribution are given by

$$h(x) = \frac{c\, g(x)\, G^{ac-1}(x)\left[1 - G^{c}(x)\right]^{b-1}}{B(a, b)\left[1 - I_{G^{c}(x)}(a, b)\right]}$$

and

$$\tau(x) = \frac{c\, g(x)\, G^{ac-1}(x)\left[1 - G^{c}(x)\right]^{b-1}}{B(a, b)\, I_{G^{c}(x)}(a, b)},$$

respectively.

In this paper we study the McDonald exponentiated gamma (McEG) distribution, which extends the exponentiated gamma model and has several other models as special cases. It has more shape parameters, yielding a large variety of forms. It can also be useful for testing the goodness of fit of its sub-models. The outline of this paper is as follows. In Section 2, the McDonald exponentiated gamma (McEG) and related family distributions are introduced. The series expansion for the density, hazard and reverse hazard functions, and other properties are presented in Section 3. Section 4 provides expansions for the cumulative and density functions. In Section 5, we present the statistical properties, in particular the moments and the moment generating function. The distribution of the order statistics is expressed in Section 6. Section 7 provides least squares and weighted least squares estimators. Maximum likelihood estimates of the parameters indexing the distribution are discussed in Section 8. Section 9 provides applications to real data sets. Section 10 ends with some conclusions.

McDonald exponentiated gamma distribution
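In this section we study the five-parameter McDonald exponentiated gamma (McEG) distribution. Taking G(x) and g(x) in (3) to be the cdf (1) and pdf (2) of the EG distribution, the pdf of the McEG distribution is given by

$$f(x, \phi) = \frac{c\, \theta\, \lambda^{2}}{B(a, b)}\, x e^{-\lambda x}\left[1 - e^{-\lambda x}(1 + \lambda x)\right]^{\theta a c - 1}\left\{1 - \left[1 - e^{-\lambda x}(1 + \lambda x)\right]^{\theta c}\right\}^{b-1}, \quad x > 0, \tag{7}$$

where ϕ = (λ, θ, a, b, c) and B(a, b) is the beta function. Writing G(x) = G(x, λ, θ) for the EG cdf (1), the corresponding cdf of the McEG distribution is given by

$$F(x, \phi) = I_{G^{c}(x)}(a, b) = \frac{1}{B(a, b)} \int_0^{G^{c}(x)} w^{a-1}(1 - w)^{b-1}\, dw, \tag{8}$$

where $I_y(a, b)$ is the incomplete beta function ratio. The cdf can also be written as

$$F(x, \phi) = \frac{G^{ac}(x)}{a\, B(a, b)}\; {}_2F_1\!\left(a, 1 - b; a + 1; G^{c}(x)\right). \tag{9}$$

The hazard rate function and reversed hazard rate function of the new distribution are given by

$$h(x, \phi) = \frac{f(x, \phi)}{1 - I_{G^{c}(x)}(a, b)} \tag{10}$$

and

$$\tau(x, \phi) = \frac{f(x, \phi)}{I_{G^{c}(x)}(a, b)}, \tag{11}$$

respectively.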
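As a numerical sketch of the definitions in this section, the following Python code evaluates the McEG pdf, cdf and hazard rate. It assumes the parameterization f(x) = c g(x) G^(ac-1)(x) [1 - G^c(x)]^(b-1) / B(a, b) with F(x) = I_{G^c(x)}(a, b), where G and g are the EG cdf and pdf. All function names are illustrative, and the continued-fraction routine for the incomplete beta ratio is a standard numerical technique chosen here for convenience, not something taken from the paper.

```python
import math

def eg_cdf(x, lam, theta):
    """EG cdf G(x) = [1 - exp(-lam*x) * (1 + lam*x)]**theta for x > 0."""
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-lam * x) * (1.0 + lam * x)) ** theta

def eg_pdf(x, lam, theta):
    """EG pdf, the derivative of eg_cdf with respect to x."""
    if x <= 0.0:
        return 0.0
    base = 1.0 - math.exp(-lam * x) * (1.0 + lam * x)
    return theta * lam ** 2 * x * math.exp(-lam * x) * base ** (theta - 1.0)

def _clamp(v, tiny=1e-300):
    """Keep a denominator away from zero (modified Lentz method)."""
    return v if abs(v) > tiny else tiny

def _beta_cf(a, b, y, max_iter=300, eps=1e-15):
    """Continued fraction used to evaluate the incomplete beta function."""
    c_val = 1.0
    d_val = 1.0 / _clamp(1.0 - (a + b) * y / (a + 1.0))
    h = d_val
    for m in range(1, max_iter + 1):
        # Even step of the continued fraction.
        num = m * (b - m) * y / ((a + 2 * m - 1.0) * (a + 2 * m))
        d_val = 1.0 / _clamp(1.0 + num * d_val)
        c_val = _clamp(1.0 + num / c_val)
        h *= d_val * c_val
        # Odd step of the continued fraction.
        num = -(a + m) * (a + b + m) * y / ((a + 2 * m) * (a + 2 * m + 1.0))
        d_val = 1.0 / _clamp(1.0 + num * d_val)
        c_val = _clamp(1.0 + num / c_val)
        delta = d_val * c_val
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(y, a, b):
    """Incomplete beta function ratio I_y(a, b) for 0 <= y <= 1."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(y) + b * math.log1p(-y))
    front = math.exp(log_front)
    # Use the symmetry I_y(a, b) = 1 - I_{1-y}(b, a) where it converges faster.
    if y < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, y) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - y) / b

def mceg_cdf(x, lam, theta, a, b, c):
    """McEG cdf: incomplete beta ratio evaluated at G(x)**c."""
    return reg_inc_beta(eg_cdf(x, lam, theta) ** c, a, b)

def mceg_pdf(x, lam, theta, a, b, c):
    """McEG pdf under the assumed parameterization."""
    G = eg_cdf(x, lam, theta)
    if G <= 0.0 or G >= 1.0:
        return 0.0
    beta_ab = math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))
    return (c / beta_ab) * eg_pdf(x, lam, theta) * G ** (a * c - 1.0) * (1.0 - G ** c) ** (b - 1.0)

def mceg_hazard(x, lam, theta, a, b, c):
    """Hazard rate h(x) = f(x) / (1 - F(x))."""
    return mceg_pdf(x, lam, theta, a, b, c) / (1.0 - mceg_cdf(x, lam, theta, a, b, c))

# Consistency check: the pdf should match a central difference of the cdf.
params = (0.8, 2.5, 0.7, 1.8, 1.5)  # arbitrary (lam, theta, a, b, c)
x0, step = 1.3, 1e-5
slope = (mceg_cdf(x0 + step, *params) - mceg_cdf(x0 - step, *params)) / (2 * step)
print(abs(slope - mceg_pdf(x0, *params)) < 1e-6)
```

Setting a = b = c = 1 in these functions recovers the EG distribution, which gives a second quick sanity check.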

Expansions for the cumulative and density functions
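In this section, we present series expansions of the McEG cdf and pdf, depending on whether the parameter b > 0 is a real non-integer or an integer. Equations (7) and (8) are straightforward to compute using any software with algebraic facilities. The mathematical relation given below will be useful in this section. If b is a positive real non-integer and |z| < 1, then

$$(1 - z)^{b-1} = \sum_{j=0}^{\infty} \frac{(-1)^{j}\, \Gamma(b)}{\Gamma(b - j)\, j!}\, z^{j}. \tag{12}$$

Using the expansion (12) in (8), we obtain

$$F(x, \phi) = \sum_{j=0}^{\infty} w_j\, G(x, \lambda, \theta c(a + j)), \qquad w_j = \frac{(-1)^{j}\, \Gamma(b)}{B(a, b)\, \Gamma(b - j)\, j!\, (a + j)}.$$

If b > 0 is an integer, then the sum is finite:

$$F(x, \phi) = \sum_{j=0}^{b-1} w_j\, G(x, \lambda, \theta c(a + j)), \qquad w_j = \frac{(-1)^{j}}{B(a, b)\,(a + j)} \binom{b-1}{j}.$$

Similarly, if b > 0 is a real non-integer, the pdf is given by

$$f(x, \phi) = \sum_{j=0}^{\infty} w_j\, g(x, \lambda, \theta c(a + j)),$$

where g(x, λ, θc(a + j)) and G(x, λ, θc(a + j)) are the pdf and cdf of the exponentiated gamma distribution with scale parameter λ and shape parameter θc(a + j). The McEG distribution can therefore be expressed as a linear combination of exponentiated gamma distributions, which is a finite sum when b is an integer. The graphs below are the pdf, cdf, survival function, h(x), and τ(x) of the McEG distribution for different values of the parameters λ, θ, a, b and c.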
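The integer-b expansion can be checked numerically. The sketch below assumes the weights w_j = (-1)^j C(b-1, j) / [B(a, b)(a + j)] and the EG cdf G(x) = [1 - exp(-λx)(1 + λx)]^θ; the function names are illustrative. For a = 1 the McEG cdf has the closed Kumaraswamy-type form 1 - [1 - G^c(x)]^b, which serves as the reference value, and the weights must sum to one because F(x) tends to 1.

```python
import math

def eg_cdf(x, lam, theta):
    """EG cdf G(x) = [1 - exp(-lam*x) * (1 + lam*x)]**theta for x > 0."""
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-lam * x) * (1.0 + lam * x)) ** theta

def mixture_weights(a, b):
    """Weights w_j, j = 0, ..., b-1, of the expansion for integer b."""
    beta_ab = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return [(-1) ** j * math.comb(b - 1, j) / (beta_ab * (a + j)) for j in range(b)]

def mceg_cdf_series(x, lam, theta, a, b, c):
    """McEG cdf as a finite linear combination of EG cdfs (integer b only)."""
    return sum(w * eg_cdf(x, lam, theta * c * (a + j))
               for j, w in enumerate(mixture_weights(a, b)))

# Reference value: for a = 1 the cdf reduces to 1 - (1 - G**c)**b.
lam, theta, a, b, c = 0.5, 1.5, 1.0, 3, 2.0
x = 2.0
closed_form = 1.0 - (1.0 - eg_cdf(x, lam, theta) ** c) ** b
print(abs(mceg_cdf_series(x, lam, theta, a, b, c) - closed_form) < 1e-12)

# The weights sum to one for any a > 0 and integer b.
print(abs(sum(mixture_weights(0.3, 4)) - 1.0) < 1e-12)
```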

Statistical properties
This section is devoted to the statistical properties of the McEG distribution, specifically the quantile function, moments and moment generating function.

Quantile function and simulation
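The quantile function $x_u = Q(u)$, 0 < u < 1, corresponding to (8) is obtained by solving $F(x_u, \phi) = u$, which gives the relation

$$(1 + \lambda x_u)\, e^{-\lambda x_u} = 1 - \left[I_u^{-1}(a, b)\right]^{1/(\theta c)},$$

where $I_u^{-1}(a, b)$ denotes the inverse of the incomplete beta function ratio. Simulating the McEG random variable is straightforward. Let U be a uniform variate on the unit interval (0, 1). Thus, by means of the inverse transformation method, we consider the random variable X given by the relation

$$(1 + \lambda X)\, e^{-\lambda X} = 1 - \left[I_U^{-1}(a, b)\right]^{1/(\theta c)}. \tag{17}$$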
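A minimal sampling sketch follows, under two stated assumptions. First, since F(x) = I_{G^c(x)}(a, b), drawing V from a Beta(a, b) distribution and solving G(X)^c = V is equivalent to inverting the incomplete beta ratio at a uniform variate; this swaps in `random.betavariate` from the Python standard library for the explicit inverse. Second, the remaining equation 1 - exp(-λx)(1 + λx) = p has no elementary closed-form solution, so it is solved here by bisection. The function names are illustrative.

```python
import math
import random

def eg_quantile(p, lam, theta, tol=1e-12):
    """Solve [1 - exp(-lam*x) * (1 + lam*x)]**theta = p for x by bisection."""
    target = p ** (1.0 / theta)

    def base(x):
        return 1.0 - math.exp(-lam * x) * (1.0 + lam * x)

    lo, hi = 0.0, 1.0 / lam
    while base(hi) < target:      # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if base(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mceg_sample(n, lam, theta, a, b, c, rng):
    """Draw n McEG variates: V ~ Beta(a, b), then solve G(X)**c = V."""
    return [eg_quantile(rng.betavariate(a, b) ** (1.0 / c), lam, theta)
            for _ in range(n)]

rng = random.Random(2024)         # fixed seed so the run is reproducible
sample = mceg_sample(5, 0.1, 0.5, 0.3, 4.0, 5.0, rng)
print([round(x, 4) for x in sample])
```

Bisection is slower than Newton's method but cannot leave the bracket, which matters here because the derivative of the inner function vanishes at x = 0.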

Moments
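In this subsection we discuss the rth moment of the McEG distribution. Moments are necessary and important in any statistical analysis, especially in applications. They can be used to study the most important features and characteristics of a distribution (e.g., tendency, dispersion, skewness and kurtosis). We use the series expansion of the pdf obtained earlier.

Theorem 3.1. If X has the McEG(ϕ) distribution, ϕ = (λ, θ, a, b, c), then the rth moment of X is given by

$$\mu_r' = E(X^{r}) = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{m=0}^{k} (-1)^{k}\, w_j\, s_j \binom{s_j - 1}{k} \binom{k}{m} \frac{\Gamma(r + m + 2)}{\lambda^{r} (k + 1)^{r + m + 2}},$$

where $s_j = \theta c(a + j)$ and $w_j = \dfrac{(-1)^{j}\, \Gamma(b)}{B(a, b)\, \Gamma(b - j)\, j!\, (a + j)}$ are the weights of the series expansion of the pdf.

Proof. Let X be a random variable with density function (7). The rth ordinary moment of the McEG distribution is given by

$$\mu_r' = \int_0^{\infty} x^{r} f(x, \phi)\, dx = \sum_{j=0}^{\infty} w_j\, s_j\, \lambda^{2} \int_0^{\infty} x^{r+1} e^{-\lambda x}\left[1 - e^{-\lambda x}(1 + \lambda x)\right]^{s_j - 1} dx.$$

Using the fact that

$$\left[1 - e^{-\lambda x}(1 + \lambda x)\right]^{s_j - 1} = \sum_{k=0}^{\infty} (-1)^{k} \binom{s_j - 1}{k} e^{-\lambda k x}(1 + \lambda x)^{k},$$

we obtain

$$\mu_r' = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} (-1)^{k}\, w_j\, s_j\, \lambda^{2} \binom{s_j - 1}{k} \int_0^{\infty} x^{r+1} e^{-\lambda (k + 1) x}(1 + \lambda x)^{k}\, dx.$$

Again using the binomial series expansion

$$(1 + \lambda x)^{k} = \sum_{m=0}^{k} \binom{k}{m} \lambda^{m} x^{m},$$

but

$$\int_0^{\infty} x^{r + m + 1} e^{-\lambda (k + 1) x}\, dx = \frac{\Gamma(r + m + 2)}{\left[\lambda (k + 1)\right]^{r + m + 2}},$$

which completes the proof.

Based on the first four moments of the McEG distribution, the measures of skewness A(ϕ) and kurtosis k(ϕ) of the McEG distribution can be obtained as

$$A(\phi) = \frac{\mu_3' - 3\mu_1'\mu_2' + 2\mu_1'^{3}}{\left(\mu_2' - \mu_1'^{2}\right)^{3/2}}$$

and

$$k(\phi) = \frac{\mu_4' - 4\mu_1'\mu_3' + 6\mu_1'^{2}\mu_2' - 3\mu_1'^{4}}{\left(\mu_2' - \mu_1'^{2}\right)^{2}}.$$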
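The series for the rth moment can be illustrated in the special case where every sum is finite, namely when b and θc(a + j) are positive integers. The sketch below assumes the expansion weights w_j = (-1)^j C(b-1, j) / [B(a, b)(a + j)] and evaluates the triple sum directly; `mceg_moment` is an illustrative name. For λ = 2, θ = 1, a = 1, b = 2, c = 1 the McEG variable is the minimum of two independent gamma variables with shape 2 and rate 2, whose mean is 1.25/λ = 0.625.

```python
import math

def mceg_moment(r, lam, theta, a, b, c):
    """r-th raw moment via the triple series.

    Restricted to the finite case: b must be a positive integer and
    theta*c*(a+j) must be a positive integer for j = 0, ..., b-1.
    """
    beta_ab = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    total = 0.0
    for j in range(b):
        w_j = (-1) ** j * math.comb(b - 1, j) / (beta_ab * (a + j))
        s = theta * c * (a + j)            # EG shape of the j-th component
        s_int = round(s)
        if s_int < 1 or abs(s - s_int) > 1e-12:
            raise ValueError("theta*c*(a+j) must be a positive integer")
        for k in range(s_int):             # binomial expansion has s_int terms
            for m in range(k + 1):
                coef = w_j * s * (-1) ** k * math.comb(s_int - 1, k) * math.comb(k, m)
                total += coef * math.gamma(r + m + 2) / (lam ** r * (k + 1) ** (r + m + 2))
    return total

mean = mceg_moment(1, 2.0, 1.0, 1.0, 2, 1.0)
print(abs(mean - 0.625) < 1e-12)
print(abs(mceg_moment(0, 2.0, 1.0, 1.0, 2, 1.0) - 1.0) < 1e-12)  # total mass
```

For non-integer exponents the inner sums become infinite and would need truncation with a convergence check, which this sketch does not attempt.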

Moment generating function
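In this subsection we derive the moment generating function of the McEG distribution.
Theorem 3.2. If X has the McEG distribution, then the moment generating function $M_X(t)$ has the following form for t < λ:

$$M_X(t) = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{m=0}^{k} (-1)^{k}\, w_j\, s_j \binom{s_j - 1}{k} \binom{k}{m} \frac{\lambda^{m+2}\, \Gamma(m + 2)}{\left[\lambda (k + 1) - t\right]^{m + 2}},$$

where $s_j = \theta c(a + j)$ and $w_j = \dfrac{(-1)^{j}\, \Gamma(b)}{B(a, b)\, \Gamma(b - j)\, j!\, (a + j)}$ are the weights of the series expansion of the pdf.

Proof. We start with the well-known definition of the moment generating function,

$$M_X(t) = E\left(e^{tX}\right) = \int_0^{\infty} e^{tx} f(x, \phi)\, dx.$$

Expanding f(x, ϕ) as in the derivation of the moments and using

$$\int_0^{\infty} x^{m+1} e^{-\left[\lambda (k + 1) - t\right] x}\, dx = \frac{\Gamma(m + 2)}{\left[\lambda (k + 1) - t\right]^{m+2}}, \quad t < \lambda (k + 1),$$

completes the proof.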

Conditional moments, residual life and reversed failure rate function
For lifetime models, it is also of interest to find the conditional moments and the mean residual lifetime function. The conditional moments of the McEG distribution are given by

$$E(X^{r} \mid X > t) = \frac{1}{1 - F(t, \phi)} \int_t^{\infty} x^{r} f(x, \phi)\, dx. \tag{29}$$

Using (20), (22) and (23), Equation (29) can be evaluated term by term as a series in the upper incomplete gamma function $\Gamma(s, t) = \int_t^{\infty} x^{s-1} e^{-x}\, dx$.

Given that a component survives up to time t ≥ 0, the residual life is the period beyond t until the time of failure, defined by the conditional random variable X − t | X > t. In reliability, it is well known that the mean residual life function and the ratio of two consecutive moments of residual life determine the distribution uniquely (Gupta and Gupta, 1983). Therefore, we obtain the rth-order moment of the residual life via the general formula

$$m_r(t) = E\left[(X - t)^{r} \mid X > t\right] = \frac{1}{1 - F(t, \phi)} \int_t^{\infty} (x - t)^{r} f(x, \phi)\, dx, \quad r \ge 1.$$

Applying the binomial expansion of $(x - t)^{r}$ and substituting f(x, ϕ) given by (7) into the above formula gives

$$m_r(t) = \frac{1}{1 - F(t, \phi)} \sum_{d=0}^{r} (-t)^{r-d} \binom{r}{d} \int_t^{\infty} x^{d} f(x, \phi)\, dx.$$

On the other hand, we analogously discuss the reversed residual life and some of its properties. The reversed residual life can be defined as the conditional random variable t − X | X ≤ t, which denotes the time elapsed from the failure of a component given that its life is less than or equal to t. This random variable may also be called the inactivity time (or time since failure); for more details see Kundu and Nanda (2010) and Nanda, Singh, Misra, and Paul (2003). Also, in reliability, the mean reversed residual life and the ratio of two consecutive moments of reversed residual life characterize the distribution uniquely. The reversed failure (or reversed hazard) rate function is given by Equation (11). The rth-order moment of the reversed residual life can be obtained by the well-known formula

$$M_r(t) = E\left[(t - X)^{r} \mid X \le t\right] = \frac{1}{F(t, \phi)} \int_0^{t} (t - x)^{r} f(x, \phi)\, dx, \quad r \ge 1.$$

Applying the binomial expansion of $(t - x)^{r}$ and substituting f(x, ϕ) given by (7) into the above formula gives

$$M_r(t) = \frac{1}{F(t, \phi)} \sum_{d=0}^{r} (-1)^{d} \binom{r}{d} t^{r-d} \int_0^{t} x^{d} f(x, \phi)\, dx,$$

where the integrals can be evaluated term by term in terms of the lower incomplete gamma function $\gamma(s, t) = \int_0^{t} x^{s-1} e^{-x}\, dx$. Using $M_1(t)$ and $M_2(t)$ we obtain the variance of the reversed residual life of the McEG distribution, and hence the coefficient of variation of the reversed residual life of the McEG distribution can be easily obtained.

Distribution of the order statistics
In this section, we derive closed-form expressions for the pdf of the rth order statistic of the McEG distribution. The measures of skewness and kurtosis of the distribution of the rth order statistic in a sample of size n can then be obtained for different choices of n and r. Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the McEG distribution with pdf and cdf given by (7) and (8), respectively, and let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ denote the order statistics obtained from this sample. We now give the probability density function of $X_{r:n}$, say $f_{r:n}(x, \phi)$, and the moments of $X_{r:n}$, r = 1, 2, ..., n. The probability density function of $X_{r:n}$ is given by

$$f_{r:n}(x, \phi) = \frac{1}{B(r, n - r + 1)} \left[F(x, \phi)\right]^{r-1}\left[1 - F(x, \phi)\right]^{n-r} f(x, \phi),$$

where F(x, ϕ) and f(x, ϕ) are the cdf and pdf of the McEG distribution given by (8) and (7), respectively. Since 0 < F(x, ϕ) < 1 for x > 0, by using the binomial series expansion of $[1 - F(x, \phi)]^{n-r}$, given by

$$\left[1 - F(x, \phi)\right]^{n-r} = \sum_{i=0}^{n-r} (-1)^{i} \binom{n-r}{i} \left[F(x, \phi)\right]^{i},$$

we have

$$f_{r:n}(x, \phi) = \frac{1}{B(r, n - r + 1)} \sum_{i=0}^{n-r} (-1)^{i} \binom{n-r}{i} \left[F(x, \phi)\right]^{r+i-1} f(x, \phi). \tag{37}$$

Substituting from (7) and (8) into (37), we can express the kth ordinary moment of the rth order statistic $X_{r:n}$, say $E(X_{r:n}^{k})$, as a linear combination of the kth moments of the McEG distribution with different shape parameters. Therefore, the measures of skewness and kurtosis of the distribution of $X_{r:n}$ can be calculated.

Estimation and inference
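In this section, we determine the maximum likelihood estimates (MLEs) of the parameters of the McEG distribution from complete samples only. Let $X_1, X_2, \ldots, X_n$ be a random sample of size n from McEG(λ, θ, a, b, c). The likelihood function for the vector of parameters ϕ = (λ, θ, a, b, c) can be written as

$$L(\phi) = \prod_{i=1}^{n} f(x_i, \phi).$$

Taking the logarithm and writing $z_i = 1 - e^{-\lambda x_i}(1 + \lambda x_i)$, the log-likelihood function for ϕ is

$$\log L = n \log \theta + 2n \log \lambda + n \log c + n \log \Gamma(a + b) - n \log \Gamma(a) - n \log \Gamma(b) + \sum_{i=1}^{n} \log x_i - \lambda \sum_{i=1}^{n} x_i + (\theta a c - 1) \sum_{i=1}^{n} \log z_i + (b - 1) \sum_{i=1}^{n} \log\left(1 - z_i^{\theta c}\right). \tag{39}$$

The log-likelihood can be maximized either directly or by solving the nonlinear likelihood equations obtained by differentiating (39). With ψ(·) denoting the digamma function, the components of the score vector are given by

$$\frac{\partial \log L}{\partial \lambda} = \frac{2n}{\lambda} - \sum_{i=1}^{n} x_i + (\theta a c - 1) \sum_{i=1}^{n} \frac{\lambda x_i^{2} e^{-\lambda x_i}}{z_i} - (b - 1)\, \theta c \sum_{i=1}^{n} \frac{\lambda x_i^{2} e^{-\lambda x_i} z_i^{\theta c - 1}}{1 - z_i^{\theta c}}, \tag{40}$$

$$\frac{\partial \log L}{\partial \theta} = \frac{n}{\theta} + a c \sum_{i=1}^{n} \log z_i - (b - 1)\, c \sum_{i=1}^{n} \frac{z_i^{\theta c} \log z_i}{1 - z_i^{\theta c}}, \tag{41}$$

$$\frac{\partial \log L}{\partial a} = n\left[\psi(a + b) - \psi(a)\right] + \theta c \sum_{i=1}^{n} \log z_i, \tag{42}$$

$$\frac{\partial \log L}{\partial b} = n\left[\psi(a + b) - \psi(b)\right] + \sum_{i=1}^{n} \log\left(1 - z_i^{\theta c}\right), \tag{43}$$

and

$$\frac{\partial \log L}{\partial c} = \frac{n}{c} + \theta a \sum_{i=1}^{n} \log z_i - (b - 1)\, \theta \sum_{i=1}^{n} \frac{z_i^{\theta c} \log z_i}{1 - z_i^{\theta c}}. \tag{44}$$

We can find the estimates of the unknown parameters by the maximum likelihood method by setting the nonlinear Equations (40)-(44) to zero and solving them simultaneously. This has to be done numerically, using a mathematical package. Also, all the second-order derivatives exist. Thus the MLEs are asymptotically normal,

$$\begin{pmatrix} \hat{\lambda} \\ \hat{\theta} \\ \hat{a} \\ \hat{b} \\ \hat{c} \end{pmatrix} \sim N\left[\begin{pmatrix} \lambda \\ \theta \\ a \\ b \\ c \end{pmatrix}, \hat{V}\right], \qquad \hat{V}^{-1} = -E\left[\frac{\partial^{2} \log L}{\partial \phi_i\, \partial \phi_j}\right]_{i, j = 1, \ldots, 5},$$

where $\hat{V}^{-1}$ is the inverse dispersion matrix. The elements of the Hessian matrix are given in the Appendix. Inverting this matrix yields the asymptotic variances and covariances of the ML estimators of λ, θ, a, b and c. Using this asymptotic distribution, approximate 100(1 − γ)% confidence intervals for λ, θ, a, b and c are determined respectively as

$$\hat{\lambda} \pm z_{\gamma/2} \sqrt{\hat{V}_{\lambda\lambda}}, \quad \hat{\theta} \pm z_{\gamma/2} \sqrt{\hat{V}_{\theta\theta}}, \quad \hat{a} \pm z_{\gamma/2} \sqrt{\hat{V}_{aa}}, \quad \hat{b} \pm z_{\gamma/2} \sqrt{\hat{V}_{bb}}, \quad \hat{c} \pm z_{\gamma/2} \sqrt{\hat{V}_{cc}},$$

where $z_{\gamma}$ is the upper 100γth percentile of the standard normal distribution.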
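The log-likelihood is simple to evaluate numerically. The sketch below assumes the form log L = n log c + n log θ + 2n log λ − n log B(a, b) + Σ log x_i − λ Σ x_i + (θac − 1) Σ log z_i + (b − 1) Σ log(1 − z_i^(θc)) with z_i = 1 − exp(−λx_i)(1 + λx_i). The data are the first 24 glass fibre values that appear in the Application section (the full data set has 63 observations), and the parameter values are arbitrary; `mceg_loglik` is an illustrative name.

```python
import math

# First 24 of the 63 glass fibre breaking strengths (partial data set).
FIBRE_SUBSET = [0.55, 0.74, 0.77, 0.81, 0.84, 0.93, 1.04, 1.11, 1.13, 1.24,
                1.25, 1.27, 1.28, 1.29, 1.30, 1.36, 1.39, 1.42, 1.48, 1.48,
                1.49, 1.49, 1.50, 1.50]

def mceg_loglik(data, lam, theta, a, b, c):
    """McEG log-likelihood for a complete sample of positive observations."""
    n = len(data)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    ll = n * (math.log(c) + math.log(theta) + 2.0 * math.log(lam) - log_beta)
    for x in data:
        z = 1.0 - math.exp(-lam * x) * (1.0 + lam * x)
        ll += math.log(x) - lam * x
        ll += (theta * a * c - 1.0) * math.log(z)
        ll += (b - 1.0) * math.log1p(-z ** (theta * c))
    return ll

# With a = b = c = 1 the McEG log-likelihood reduces to the EG log-likelihood.
lam, theta = 2.0, 5.0
eg_ll = sum(math.log(theta * lam ** 2 * x * math.exp(-lam * x)
                     * (1.0 - math.exp(-lam * x) * (1.0 + lam * x)) ** (theta - 1.0))
            for x in FIBRE_SUBSET)
print(abs(mceg_loglik(FIBRE_SUBSET, lam, theta, 1.0, 1.0, 1.0) - eg_ll) < 1e-9)
```

A general-purpose numerical optimizer would then maximize this function over (λ, θ, a, b, c), typically after a log transformation to keep all five parameters positive.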
We can compute the maximized unrestricted and restricted log-likelihood functions to construct the likelihood ratio (LR) test statistic for testing some of the McEG sub-models. For example, we can use the LR test statistic to check whether the McEG distribution for a given data set is statistically superior to the EG distribution. In any case, hypothesis tests of the type $H_0: \phi = \phi_0$ versus $H_1: \phi \ne \phi_0$ can be performed using an LR test. In this case, the LR test statistic for testing $H_0$ versus $H_1$ is $\omega = 2\left(\ell(\hat{\phi}; x) - \ell(\hat{\phi}_0; x)\right)$, where $\hat{\phi}$ and $\hat{\phi}_0$ are the MLEs under $H_1$ and $H_0$, respectively. The statistic ω is asymptotically (as n → ∞) distributed as $\chi^2_k$, where k is the length of the parameter vector ϕ of interest. The LR test rejects $H_0$ if $\omega > \chi^2_{k;\gamma}$, where $\chi^2_{k;\gamma}$ denotes the upper 100γ% quantile of the $\chi^2_k$ distribution.
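For the comparison of the McEG model against the EG sub-model (a = b = c = 1), the null hypothesis fixes k = 3 parameters. The sketch below computes ω and its asymptotic p-value using the closed-form upper tail of the chi-square distribution with 3 degrees of freedom, which is valid for k = 3 only. The two log-likelihood values are hypothetical placeholders, not results from the paper, and the function names are illustrative.

```python
import math

def chi2_sf_3df(w):
    """Upper-tail probability P(chi2_3 > w); closed form for 3 degrees of freedom."""
    if w <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(w / 2.0)) + math.sqrt(2.0 * w / math.pi) * math.exp(-w / 2.0)

def lr_test_3df(loglik_full, loglik_null, gamma=0.05):
    """LR statistic, asymptotic p-value and rejection decision for k = 3."""
    omega = 2.0 * (loglik_full - loglik_null)
    p_value = chi2_sf_3df(omega)
    return omega, p_value, p_value < gamma

# Hypothetical maximized log-likelihoods (McEG as full model, EG as null model).
omega, p_value, reject = lr_test_3df(-15.0, -24.0)
print(omega, p_value, reject)
```

Rejecting when the p-value is below γ is equivalent to rejecting when ω exceeds the upper 100γ% quantile of the chi-square distribution with 3 degrees of freedom.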

Application
In this section, we compare the results of fitting the McEG and EG distributions to a real data set. Sixty-three breaking strengths of glass fibres of length 1.5 cm were reported by Smith and Naylor (1987). No units for the breaking strengths were given. The data are as follows: 0.55, 0.74, 0.77, 0.81, 0.84, 0.93, 1.04, 1.11, 1.13, 1.24, 1.25, 1.27, 1.28, 1.29, 1.30, 1.36, 1.39, 1.42, 1.48, 1.48, 1.49, 1.49, 1.50, 1.50, ... The LR statistic for testing the EG model against the McEG model exceeds the critical value $\chi^2_{3;0.05}$, so we reject the null hypothesis. In order to compare the two distribution models, we consider criteria like −2ℓ, AIC (Akaike information criterion) and CAIC (corrected Akaike information criterion) for the data set. The better distribution corresponds to smaller −2ℓ, AIC and CAIC values:

$$\mathrm{AIC} = -2\ell + 2k, \qquad \mathrm{CAIC} = -2\ell + \frac{2kn}{n - k - 1},$$

where k is the number of parameters in the statistical model, n the sample size and ℓ is the maximized value of the log-likelihood function under the considered model. Also, for calculating the values of KS we use the sample estimates of λ, θ, a, b and c. Table 1 shows the MLEs under both distributions, and Table 2 shows the values of −2ℓ, AIC and CAIC. The values in Table 2 indicate that the McEG distribution leads to a better fit than the EG distribution. A density plot compares the fitted densities of the models with the empirical histogram of the observed data (Figure 5). The fitted density for the McEG model is closer to the empirical histogram than the fit of the EG model.
The empirical cdf and the fitted McEG and EG cdfs of the data set are given in Figure 6. PP plots of the McEG, EG and KEG distributions are given, respectively, in Figures 6, 7, 8 and 9.
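The information criteria used in this comparison are straightforward to compute once a maximized log-likelihood is available. The sketch below assumes AIC = −2ℓ + 2k and CAIC = −2ℓ + 2kn/(n − k − 1); the log-likelihood values are hypothetical placeholders rather than the entries of Table 2, and the function names are illustrative.

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return -2.0 * loglik + 2.0 * k

def caic(loglik, k, n):
    """Corrected AIC for a model with k parameters fitted to n observations."""
    return -2.0 * loglik + 2.0 * k * n / (n - k - 1.0)

n = 63  # number of glass fibre observations
# Hypothetical maximized log-likelihoods: McEG has k = 5, EG has k = 2.
for name, loglik, k in [("McEG", -15.0, 5), ("EG", -24.0, 2)]:
    print(name, aic(loglik, k), round(caic(loglik, k, n), 4))
```

The correction term matters most when n is small relative to k; for n = 63 and k = 5 it adds about one unit to the AIC.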

Simulated data
In this subsection, we provide an algorithm to generate a random sample from the McEG distribution for given values of its parameters and sample size n. The simulation process consists of the following steps:
1. Set n and ϕ = (λ, θ, a, b, c).
2. Set an initial value $x_0$ for the random starting point.
5. Update $x_0$ by using Newton's formula, $x_0 \leftarrow x_0 - \left[F(x_0, \phi) - U\right] / f(x_0, \phi)$, where U is a uniform variate on (0, 1).
Then x will be the desired sample from F(x). Using the above algorithm, we generated a sample of size 100 from the McEG distribution for arbitrary values λ = 0.1, θ = 0.5, a = 0.3, b = 4 and c = 5. The maximum likelihood estimates with the corresponding asymptotic confidence intervals are calculated based on the simulated sample. The MLEs of (λ, θ, a, b, c) are (0.1381430, 1.4472316, 0.1033288, 3.0747396, 5.1030106), respectively.

Conclusion
Here we propose a new model, the so-called McEG distribution, which extends the EG distribution in the analysis of data with real support. An obvious reason for generalizing a standard distribution is that the generalized form provides greater flexibility in modeling real data. We derive expansions for the moments and for the moment generating function. The estimation of parameters is approached by the method of maximum likelihood, and the information matrix is derived. We consider the likelihood ratio statistic to compare the model with its baseline model. An application of the McEG distribution to real data shows that the new distribution can be used quite effectively to provide better fits than the EG distribution.