Statistics
A new class of gamma distribution
Acta Scientiarum. Technology, vol. 39, no. 1, pp. 79-87, 2017
Universidade Estadual de Maringá

Received: 20 November 2015
Accepted: 04 May 2016
Abstract: This paper presents a new class of probability distributions generated from the gamma distribution. For the proposed class, we present several statistical properties, such as the hazard (risk) function, expansions of the density and cumulative distribution functions, the moment generating function, the characteristic function, the moments of order m, the central moments of order m, the log-likelihood and its partial derivatives, and also the Rényi entropy, kurtosis, skewness, and variance. Some of these properties are given for a particular distribution within the new class, which is used to illustrate the capability of the proposed class through an application to a real data set, the one presented in Choulakian and Stephens (2001). Six models are compared; the Akaike Information Criterion (AIC) was used for model selection, and the Cramér-von Mises and Anderson-Darling tests were used to assess model fit. Lastly, we present the conclusions drawn from the analysis and comparison of the results, as well as directions for future research.
Keywords: generalized distribution, statistical properties, quantile function, maximum likelihood estimation, model fit.
Resumo: Este artigo apresenta uma nova classe de distribuição de probabilidades gerada a partir da distribuição gama. Para a classe proposta apresentamos algumas propriedades estatísticas tais como função de risco, expansões para densidade e acumulada, função geratriz de momentos, função característica, momentos de ordem m, momentos centrais de ordem m, função de log-verossimilhança e suas respectivas derivadas parciais, entropia de Rényi e medidas de curtose, assimetria e variância. Algumas dessas propriedades são indicadas para uma distribuição-base particular dentro dessa nova classe a fim de ilustrar a potencialidade da classe proposta por meio de uma aplicação a um conjunto de dados reais. O conjunto de dados apresentado em Choulakian e Stephens (2001) foi usado. Seis modelos são comparados, e para a seleção destes foi utilizado o critério de informação de Akaike, e testes de Cramér-von Mises e Anderson-Darling foram usados para avaliar o ajuste aos modelos. Finalmente, apresentamos as conclusões com dados da análise e comparação dos resultados obtidos e sugerimos trabalhos futuros.
Palavras-chave: distribuição generalizada, propriedades estatísticas, função quantílica, estimação por máxima verossimilhança, ajuste de modelos.
Introduction
The gamma distribution is used in a variety of applications, including queueing, financial, and weather models. It arises naturally as the distribution of waiting times for events that occur according to a Poisson process. It is a two-parameter distribution whose density is given by

$$f(x) = \frac{b^{a}}{\Gamma(a)}\, x^{a-1} e^{-bx}, \quad x > 0,$$

where a > 0 is a shape parameter and b > 0 is the reciprocal of a scale parameter.
Owing to the importance of this distribution, several new distributions and families of probability distributions based on generalizations of the gamma distribution have recently been proposed. Given a distribution with continuous distribution function G(x), its generalized or exponentiated form is obtained as F(x) = G(x)^a, with a > 0 (the power parameter). Gupta, Gupta, and Gupta (1998) proposed the exponentiated gamma distribution and studied some of its properties.
Cordeiro, Ortega, and Silva (2011) extended the exponentiated gamma distribution by defining a new four-parameter distribution, called the exponentiated generalized gamma distribution, which is capable of modeling bathtub-shaped failure rate phenomena.
Zografos and Balakrishnan (2009) defined a family of probability distributions based on the integration of a gamma density as follows:

$$F(x) = \frac{1}{\Gamma(a)} \int_{0}^{-\log[1-G(x)]} t^{a-1} e^{-t}\, dt,$$

where G(x) is an arbitrary distribution function. When a = n+1, this distribution coincides with the distribution of the nth upper record value (Alzaatreh, Famoye, & Lee, 2014).
Alternatively, Ristic and Balakrishnan (2012) proposed a new family of probability distributions, which is also based on the integration of a gamma density. They defined this family as follows:

$$F(x) = 1 - \frac{1}{\Gamma(a)} \int_{0}^{-\log G(x)} t^{a-1} e^{-t}\, dt,$$

where G(x) is an arbitrary distribution function. Similarly, when a = n+1, this distribution coincides with the distribution of the nth lower record value (Alzaatreh et al., 2014).
Following the line of work of Zografos and Balakrishnan (2009) and Ristic and Balakrishnan (2012), our goal in this work is to propose a new family of distributions based on the gamma distribution. The family of distributions proposed here is the following:

$$H_G(x) = \int_{\frac{1-G(x)}{G(x)}}^{\infty} \frac{b^{a}}{\Gamma(a)}\, t^{a-1} e^{-bt}\, dt, \tag{1}$$

where G(x) is an arbitrary distribution function and H_G(x) has the same support as G(x). This new family is called the gamma-[(1-G)/G] class. Statistical properties of this new class, such as the mean, variance, standard deviation, mean deviation, kurtosis, skewness, moment generating function, characteristic function, and graphical analysis, are derived.
Then, to illustrate the applicability of the proposed family, we consider the particular case in which G(x) is the distribution function of an exponential random variable. Using the mathematical structure of the gamma-[(1-G)/G] class, we derive statistical properties of this new distribution and, to illustrate its potential, apply it to a set of real data. For this purpose, the data set presented in Choulakian and Stephens (2001) was used to verify whether the models fit these data well. As criteria for comparing the fit of the models, we considered the Akaike Information Criterion (AIC) and the Cramér-von Mises and Anderson-Darling tests. Both hypothesis tests, Anderson-Darling and Cramér-von Mises, are discussed in detail by Chen and Balakrishnan (1995). They belong to the class of quadratic statistics based on the empirical distribution function, since they work with the squared differences between the empirical and the hypothesized distributions.
Material and methods
Obtaining a class of probability distributions
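The gamma-[(1-G)/G] class is defined by the cumulative distribution function (cdf) (1) (for x > 0), which is equivalent to

$$H_G(x) = Q\!\left(a, \frac{b\,[1-G(x)]}{G(x)}\right), \tag{2}$$

where Q(a,z) = Γ(a,z)/Γ(a) is the regularized incomplete gamma function, $\Gamma(a,z) = \int_{z}^{\infty} t^{a-1} e^{-t}\, dt$ is the incomplete gamma function, and Γ(a) is Euler's gamma function. If the distribution G(x) has density g(x), the class has a probability density function (pdf) given by

$$h_G(x) = \frac{b^{a}}{\Gamma(a)}\, g(x)\, \frac{[1-G(x)]^{a-1}}{G(x)^{a+1}} \exp\!\left(-\frac{b\,[1-G(x)]}{G(x)}\right). \tag{3}$$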

Equations (2) and (3) can be rewritten as sums of exponentiated distributions. These distributions have been studied by several authors in recent years, for example Mudholkar and Srivastava (1993) for the exponentiated Weibull and Gupta and Kundu (1999) for the exponentiated exponential, among others.
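As a numerical illustration, Equations (2) and (3) can be evaluated directly for any baseline G. The Python sketch below assumes that the cdf of the class is Q(a, b[1 - G(x)]/G(x)) and that the pdf is its derivative. The function names, the hand-written routine for Q(a, z) (a power series for small z and a continued fraction otherwise), and the exponential baseline are illustrative choices, not part of the original derivation.

```python
import math


def reg_upper_gamma(a, z, tol=1e-12, max_iter=1000):
    """Regularized upper incomplete gamma function Q(a, z) = Gamma(a, z) / Gamma(a)."""
    if z <= 0.0:
        return 1.0
    if math.isinf(z):
        return 0.0
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    if z < a + 1.0:
        # Power series for the lower function P(a, z); then Q = 1 - P.
        term = 1.0 / a
        total = term
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= z / denom
            total += term
            if abs(term) < abs(total) * tol:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, z), evaluated with the modified Lentz method.
    tiny = 1e-300
    b_n = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b_n
    h = d
    for i in range(1, max_iter + 1):
        a_n = -i * (i - a)
        b_n += 2.0
        d = a_n * d + b_n
        if abs(d) < tiny:
            d = tiny
        c = b_n + a_n / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return math.exp(log_prefactor) * h


def class_cdf(x, a, b, G):
    """cdf of the gamma-[(1-G)/G] class for a baseline cdf G (assumed form of Eq. (2))."""
    G_x = G(x)
    if G_x <= 0.0:
        return 0.0
    return reg_upper_gamma(a, b * (1.0 - G_x) / G_x)


def class_pdf(x, a, b, G, g):
    """pdf of the class for baseline cdf G and density g (assumed form of Eq. (3))."""
    G_x, g_x = G(x), g(x)
    if G_x <= 0.0 or G_x >= 1.0 or g_x <= 0.0:
        return 0.0  # crude handling of the boundary of the support
    log_pdf = (a * math.log(b) - math.lgamma(a) + math.log(g_x)
               + (a - 1.0) * math.log1p(-G_x) - (a + 1.0) * math.log(G_x)
               - b * (1.0 - G_x) / G_x)
    return math.exp(log_pdf)


def exp_baseline(lam):
    """Exponential baseline with rate lam: returns the pair (cdf, pdf)."""
    def cdf(x):
        return -math.expm1(-lam * x) if x > 0.0 else 0.0

    def pdf(x):
        return lam * math.exp(-lam * x) if x > 0.0 else 0.0

    return cdf, pdf


# Hypothetical parameter values, chosen only for illustration.
G, g = exp_baseline(0.5)
print(class_cdf(1.3, 2.0, 1.0, G), class_pdf(1.3, 2.0, 1.0, G, g))
```

Working on the log scale in `class_pdf` avoids overflow when G(x) is close to 0, where the factor G(x)^(-(a+1)) is very large and the exponential factor is very small.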
Using the power series expansion of the exponential function, we rewrite (3) as
it follows that

Since
we can rewrite the distribution function as

Next, we present an expansion for the gamma-[(1-G)/G] class when G(x) is discrete. If the distribution G(x) is discrete, H_G(x) is also discrete and we have P(X = x_l) = F(x_l) - F(x_{l-1}). Therefore,

In addition, we can obtain the hazard (risk) function of the new gamma-[(1-G)/G] class as follows:

Inverting H_G(x) = u (with 0 < u < 1) gives an explicit expression for the quantile function:

$$x = G^{-1}\!\left(\frac{b}{b + Q^{-1}(a,u)}\right),$$

where Q^{-1}(a,u) is the inverse of the regularized incomplete gamma function.
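The inversion above also suggests a simple way to simulate from the class. If U is uniform on (0, 1), then T = Q^{-1}(a, U)/b follows a gamma distribution with shape a and rate b, so inverting H_G(x) = U is equivalent to setting X = G^{-1}(1/(1 + T)). The Python sketch below relies on that equivalence, which holds under the reading H_G(x) = Q(a, b[1 - G(x)]/G(x)) of Equation (2). The function names and the exponential baseline are illustrative assumptions.

```python
import math
import random


def sample_class(a, b, G_inv, size, rng):
    """Draw `size` values from the gamma-[(1-G)/G] class.

    Uses X = G_inv(1 / (1 + T)) with T ~ Gamma(shape=a, rate=b).
    """
    draws = []
    for _ in range(size):
        # random.gammavariate takes a scale as second argument, hence 1 / b.
        t = rng.gammavariate(a, 1.0 / b)
        draws.append(G_inv(1.0 / (1.0 + t)))
    return draws


def exp_quantile(p, lam):
    """Quantile function of the exponential baseline with rate lam."""
    return -math.log1p(-p) / lam


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(2017)
sample = sample_class(2.0, 1.0, lambda p: exp_quantile(p, 0.5), 5, rng)
print(sample)
```

For a = 1 the cdf reduces to exp(-b[1 - G(x)]/G(x)), which gives a convenient closed form for checking the empirical distribution of simulated values.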
Using the density and distribution function expansions, it is possible to get the statistical properties of the new class, as discussed below. Equations (4) and (5) are the main results of this subsection.
Moments and moment generating function
Several important characteristics of a probability model, such as central tendency, dispersion, skewness, and kurtosis, can be obtained from its moments. The following equations develop the expansion of the moments of order m for the gamma-[(1-G)/G] class. The mth moment of a random variable having cdf (2) can be easily obtained from Equation (4). Hence, we have


where:

The expression (6) is important since it generalizes the well-established probability weighted moments.
In particular, we have the following expansion of the mean for the gamma-[(1-G)/G] class

Next, we develop the expansion of the moment generating function for the gamma-[(1-G)/G] class. From Equation (4), we have

Using the fact that
we can rewrite
Therefore, using (6), the last equation can be expressed as

Similarly, one can establish the following expansion for the characteristic function for the gamma-[(1-G)/G] class

Central moments and general coefficient
We now develop the expansion of the central moments of order m for the gamma-[(1-G)/G] class. This measure can be calculated as

or equivalently


it follows that

In particular, the expansion of the variance for the gamma-[(1-G)/G] class is:


A generalization called the general coefficient, which extends the skewness and kurtosis measures, is given by

Substituting (7) and (8) in Equation (9), we obtain

Note that, in particular, taking m=3 and m=4 in Cg(m) gives expansions for the skewness and kurtosis measures, respectively.
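The moment-based quantities of this section can also be checked numerically, without the series expansions. The Python sketch below integrates the density with a composite Simpson rule for the exponential baseline G(x) = 1 - exp(-lam x). It assumes the pdf form b^a g (1-G)^(a-1) G^(-(a+1)) exp(-b(1-G)/G)/Γ(a) for Equation (3), and it takes the general coefficient to be the standardized central moment μ_m/σ^m, which is an assumption here since the displayed formula is not reproduced above. All names, the truncation point `upper`, and the parameter values are illustrative.

```python
import math


def gexp_pdf(x, a, b, lam):
    """Assumed pdf of the class with exponential baseline of rate lam."""
    if x <= 0.0:
        return 0.0
    growth = math.expm1(lam * x)  # e^{lam x} - 1, so (1 - G)/G = 1 / growth
    log_pdf = (a * math.log(b) - math.lgamma(a) + math.log(lam) - a * lam * x
               - (a + 1.0) * math.log(-math.expm1(-lam * x)) - b / growth)
    return math.exp(log_pdf)


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def raw_moment(m, a, b, lam, upper):
    """m-th ordinary moment, integrating over (0, upper)."""
    return simpson(lambda x: x ** m * gexp_pdf(x, a, b, lam), 0.0, upper)


def central_moment(m, a, b, lam, upper):
    """m-th central moment, integrating over (0, upper)."""
    mean = raw_moment(1, a, b, lam, upper)
    return simpson(lambda x: (x - mean) ** m * gexp_pdf(x, a, b, lam), 0.0, upper)


def general_coefficient(m, a, b, lam, upper):
    """Standardized central moment mu_m / sigma^m (skewness: m=3, kurtosis: m=4)."""
    variance = central_moment(2, a, b, lam, upper)
    return central_moment(m, a, b, lam, upper) / variance ** (m / 2.0)


# Hypothetical parameter values; upper = 40 leaves a negligible right tail here.
a, b, lam, upper = 2.0, 1.0, 1.0, 40.0
print("mean     :", raw_moment(1, a, b, lam, upper))
print("variance :", central_moment(2, a, b, lam, upper))
print("skewness :", general_coefficient(3, a, b, lam, upper))
print("kurtosis :", general_coefficient(4, a, b, lam, upper))
```

The truncation point must keep lam * upper below about 700 to avoid overflow in `math.expm1`, and it should be large enough that the right tail, which decays like exp(-a lam x), is negligible.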
Maximum likelihood estimation and Rényi entropy
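Under standard regularity conditions, the maximum likelihood estimates (MLEs) can be obtained by equating the derivatives of the log-likelihood function with respect to each parameter to zero. We determine the MLEs of the parameters of the gamma-[(1-G)/G] class from complete samples only. Let x_1, ..., x_n be a random sample of size n from the new class, where θ is the vector of unknown parameters of the parent distribution G(x; θ). In earlier sections we wrote G(x) and g(x); here we write G(x; θ) and g(x; θ) to emphasize the parameter vector. The log-likelihood function for the vector of parameters (a, b, θ) can be obtained as

$$\ell(a,b,\theta) = n a \log b - n \log \Gamma(a) + \sum_{i=1}^{n} \log g(x_i;\theta) + (a-1) \sum_{i=1}^{n} \log[1-G(x_i;\theta)] - (a+1) \sum_{i=1}^{n} \log G(x_i;\theta) - b \sum_{i=1}^{n} \frac{1-G(x_i;\theta)}{G(x_i;\theta)}.$$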

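The log-likelihood can be maximized either directly, for example using SAS (PROC NLMIXED), or by solving the nonlinear likelihood equations obtained by differentiating it. With θ_j denoting the jth parameter of the parent distribution G(x; θ), the components of the score vector are given by

$$\frac{\partial \ell}{\partial a} = n \log b - n\,\psi(a) + \sum_{i=1}^{n} \log \frac{1-G(x_i;\theta)}{G(x_i;\theta)},$$

$$\frac{\partial \ell}{\partial b} = \frac{n a}{b} - \sum_{i=1}^{n} \frac{1-G(x_i;\theta)}{G(x_i;\theta)},$$

$$\frac{\partial \ell}{\partial \theta_j} = \sum_{i=1}^{n} \frac{\dot{g}_j(x_i;\theta)}{g(x_i;\theta)} - (a-1) \sum_{i=1}^{n} \frac{\dot{G}_j(x_i;\theta)}{1-G(x_i;\theta)} - (a+1) \sum_{i=1}^{n} \frac{\dot{G}_j(x_i;\theta)}{G(x_i;\theta)} + b \sum_{i=1}^{n} \frac{\dot{G}_j(x_i;\theta)}{G(x_i;\theta)^{2}},$$

where $\dot{g}_j = \partial g/\partial \theta_j$, $\dot{G}_j = \partial G/\partial \theta_j$, and ψ(a) = Γ'(a)/Γ(a) is the digamma function.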
Entropy is a measure of uncertainty: the higher the entropy, the less the information and the greater the uncertainty, randomness, or disorder. Next, we develop the expansion of the entropy for the gamma-[(1-G)/G] class using the Rényi entropy, which is given by

Substituting the expressions for the density and the cumulative distribution function given by Equations (3) and (2), respectively, we have

By expanding the exponential function in Taylor series as


Now, using the following binomial expansion


Thus, an explicit expression for Rényi entropy can be written

which, in turn, implies that (using Equation (6))

Results and discussion
Special model
This section examines a particular distribution of the gamma-[(1-G)/G] class proposed here: the case in which G(x) is the cdf of the exponential distribution, which is called the gamma-[(1-Exp)/Exp] distribution.
The gamma-[(1-Exp)/Exp] distribution
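Taking G(x) to be the cdf of the exponential distribution with parameter λ, that is, G(x) = 1 - e^{-λx}, in Equation (2), we have the gamma-[(1-Exp)/Exp] distribution:

$$H(x) = Q\!\left(a, \frac{b}{e^{\lambda x}-1}\right), \quad x > 0.$$

Differentiating H(x), we get the density function of the gamma-[(1-Exp)/Exp] distribution:

$$h(x) = \frac{b^{a} \lambda\, e^{-a \lambda x}}{\Gamma(a)\,(1-e^{-\lambda x})^{a+1}} \exp\!\left(-\frac{b}{e^{\lambda x}-1}\right), \quad x > 0.$$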

Figure 1 shows the probability density function and the cumulative distribution function of the gamma-[(1-Exp)/Exp] distribution for some values of the parameters.
![Pdf (right) and cdf (left) of the gamma-[(1-Exp)/Exp] distribution for some values of lambda.](../303249921011_gf2.png)
We can also obtain the hazard (risk) function of the gamma-[(1-Exp)/Exp] distribution as follows:

The figure shows the hazard (risk) function of the gamma-[(1-Exp)/Exp] distribution for some values of the parameters.
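Curves of this kind can be reproduced numerically from the definition of the hazard function, h(x)/[1 - H(x)]. The Python sketch below assumes the cdf H(x) = Q(a, b/(e^{lam x} - 1)) and the corresponding pdf for the special model, and computes the survival function 1 - H(x) = P(a, z) from the power series of the lower regularized gamma function. The function names, the cutoff z > 100, and the parameter values are illustrative assumptions, and the sketch is intended for moderate values of a and lam * x only.

```python
import math


def reg_lower_gamma(a, z, tol=1e-14, max_iter=5000):
    """P(a, z) = 1 - Q(a, z) by its power series; adequate for moderate a and z."""
    if z <= 0.0:
        return 0.0
    if z > 100.0:
        return 1.0  # Q(a, z) is negligible here for moderate a
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(max_iter):
        denom += 1.0
        term *= z / denom
        total += term
        if term < total * tol:
            break
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def gexp_pdf(x, a, b, lam):
    """Assumed pdf of the gamma-[(1-Exp)/Exp] model, for x > 0."""
    z = b / math.expm1(lam * x)
    log_pdf = (a * math.log(b) - math.lgamma(a) + math.log(lam) - a * lam * x
               - (a + 1.0) * math.log(-math.expm1(-lam * x)) - z)
    return math.exp(log_pdf)


def gexp_hazard(x, a, b, lam):
    """Hazard function pdf / survival, where survival = 1 - H(x) = P(a, z)."""
    z = b / math.expm1(lam * x)
    return gexp_pdf(x, a, b, lam) / reg_lower_gamma(a, z)


# Hypothetical parameter values, chosen only for illustration.
for x in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(x, gexp_hazard(x, 2.0, 1.0, 1.0))
```

Under these assumptions the hazard tends to a * lam as x grows, because the survival function behaves like b^a e^{-a lam x}/Γ(a + 1) in the right tail.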

Using a procedure similar to that used for the pdf and cdf expansions, the pdf and cdf of the gamma-[(1-Exp)/Exp] distribution can be rewritten as sums of exponentiated exponentials, as follows:


Various properties of the exponentiated exponential distribution can be obtained from Gupta and Kundu (1999). Using expansions (11) and (12), it is possible to obtain mathematical quantities of the special model, such as ordinary and central moments, moment generating and characteristic functions, the general coefficient, and the Rényi entropy, from the corresponding quantities of the exponentiated exponential distribution. For reasons of space, we consider only the moments. The mth ordinary moment of the special model can be expressed as

In particular, we have that the mean of the gamma-[(1-Exp)/Exp] distribution is given by

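Let x_1, ..., x_n be a sample of size n from X ~ gamma-[(1-Exp)/Exp](a, b, λ). The log-likelihood function for the vector of parameters (a, b, λ) can be obtained as

$$\ell(a,b,\lambda) = n a \log b - n \log \Gamma(a) + n \log \lambda - a \lambda \sum_{i=1}^{n} x_i - (a+1) \sum_{i=1}^{n} \log\!\left(1-e^{-\lambda x_i}\right) - b \sum_{i=1}^{n} \frac{1}{e^{\lambda x_i}-1}.$$

The components of the score vector are given by

$$\frac{\partial \ell}{\partial a} = n \log b - n\,\psi(a) - \lambda \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} \log\!\left(1-e^{-\lambda x_i}\right),$$

$$\frac{\partial \ell}{\partial b} = \frac{n a}{b} - \sum_{i=1}^{n} \frac{1}{e^{\lambda x_i}-1},$$

$$\frac{\partial \ell}{\partial \lambda} = \frac{n}{\lambda} - a \sum_{i=1}^{n} x_i - (a+1) \sum_{i=1}^{n} \frac{x_i}{e^{\lambda x_i}-1} + b \sum_{i=1}^{n} \frac{x_i\, e^{\lambda x_i}}{\left(e^{\lambda x_i}-1\right)^{2}},$$

where ψ(a) is the digamma function.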
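A quick way to check score expressions of this kind is to compare them with finite differences of the log-likelihood. The Python sketch below does this for the special model, assuming the density b^a lam e^{-a lam x} (1 - e^{-lam x})^{-(a+1)} exp(-b/(e^{lam x} - 1))/Γ(a). The data values are made up for illustration and are not the Wheaton River data; the function names and the series-based digamma routine are likewise illustrative.

```python
import math


def digamma(x):
    """Digamma function via psi(x) = psi(x + 1) - 1/x and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))


def loglik(params, xs):
    """Log-likelihood of the gamma-[(1-Exp)/Exp] model (assumed density form)."""
    a, b, lam = params
    total = len(xs) * (a * math.log(b) - math.lgamma(a) + math.log(lam))
    for x in xs:
        total += (-a * lam * x
                  - (a + 1.0) * math.log(-math.expm1(-lam * x))
                  - b / math.expm1(lam * x))
    return total


def score(params, xs):
    """Analytic partial derivatives of loglik with respect to (a, b, lam)."""
    a, b, lam = params
    n = len(xs)
    d_a = n * math.log(b) - n * digamma(a)
    d_b = n * a / b
    d_lam = n / lam
    for x in xs:
        em = math.expm1(lam * x)  # e^{lam x} - 1
        d_a += -lam * x - math.log(-math.expm1(-lam * x))
        d_b += -1.0 / em
        d_lam += -a * x - (a + 1.0) * x / em + b * x * (em + 1.0) / em ** 2
    return [d_a, d_b, d_lam]


def numeric_gradient(params, xs, h=1e-6):
    """Central finite differences of loglik, one parameter at a time."""
    grad = []
    for j in range(len(params)):
        up = list(params)
        down = list(params)
        up[j] += h
        down[j] -= h
        grad.append((loglik(up, xs) - loglik(down, xs)) / (2.0 * h))
    return grad


# Made-up observations and hypothetical parameter values, for illustration only.
xs = [0.4, 1.1, 2.5, 3.7, 0.9, 5.2]
params = [2.0, 1.5, 0.3]
print(score(params, xs))
print(numeric_gradient(params, xs))
```

Setting the three score components to zero gives the likelihood equations. In practice they are solved numerically, for instance by a Newton-type method started from several initial values.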



Application
In this section, we present an application of the proposed distribution to real data. The data are the exceedances of flood peaks (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. Seventy-two exceedances from the years 1958 to 1984 were recorded, rounded to one decimal place. These data were analyzed by Choulakian and Stephens (2001) and are presented in Table 1.

It is worth mentioning that this data set has also been analyzed using the Pareto, three-parameter Weibull, generalized Pareto, and beta-Pareto distributions (Akinsete, Famoye, & Lee, 2008).
Table 2 lists the maximum likelihood estimates of the parameters (obtained with the Newton-Raphson method implemented in SAS 9.1), their standard errors, the Akaike information criterion, and the Anderson-Darling (A*) and Cramér-von Mises (W*) statistics for the gamma-[(1-Exp)/Exp] distribution (M1), the gamma-[(1-Exp)/Exp] distribution (proposed model, M2), the exponentiated Weibull (M3), the modified Weibull (M4), the beta-Pareto (M5), and the Weibull (M6) distributions. Their densities are given by


where:
B(a,b) denotes the beta function and the parameters above are all positive real numbers.
For the six distributions in Table 2 fitted to the Wheaton River flood data, the beta-Pareto model (M5), which Akinsete et al. (2008) described as the best-fitting model, performed worse in our study (AIC = 524.398, A* = 2.0412, and W* = 0.3516) than the proposed gamma-[(1-Exp)/Exp] model (M2), which obtained AIC = 505.030, A* = 0.4516, and W* = 0.0757. Table 2 also shows that the proposed model M2 is the best of the models tested, since it has the lowest values of AIC, A*, and W*.
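The comparison criteria can be computed from any fitted cdf. The Python sketch below assumes the usual definition AIC = 2k - 2ℓ, where k is the number of parameters and ℓ the maximized log-likelihood, and the form of the modified statistics A* and W* commonly attributed to Chen and Balakrishnan (1995): transform the fitted cdf values to normal scores, standardize them, and apply the small-sample factors (1 + 0.75/n + 2.25/n²) and (1 + 0.5/n). These formulas are stated here as assumptions rather than taken from the text, and the sample is simulated, not the Wheaton River data.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def aic(max_loglik, n_params):
    """Akaike information criterion, AIC = 2k - 2 * loglik."""
    return 2.0 * n_params - 2.0 * max_loglik


def modified_gof_statistics(data, fitted_cdf):
    """Return (A*, W*) for a fitted cdf, following the assumed procedure."""
    n = len(data)
    std_normal = NormalDist()
    v = [fitted_cdf(x) for x in sorted(data)]       # fitted cdf at the order statistics
    y = [std_normal.inv_cdf(p) for p in v]          # normal scores
    y_bar, s_y = mean(y), stdev(y)
    u = [std_normal.cdf((y_i - y_bar) / s_y) for y_i in y]
    w2 = sum((u[i] - (2 * i + 1) / (2.0 * n)) ** 2 for i in range(n)) + 1.0 / (12.0 * n)
    a2 = -n - sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
                  for i in range(n)) / n
    a_star = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    w_star = w2 * (1.0 + 0.5 / n)
    return a_star, w_star


# Simulated exponential sample of the same size as the flood data (72 values).
rng = random.Random(2017)
sample = [rng.expovariate(0.1) for _ in range(72)]
rate_hat = 1.0 / mean(sample)  # MLE of the exponential rate
print(modified_gof_statistics(sample, lambda x: -math.expm1(-rate_hat * x)))

# With k = 3 parameters, a log-likelihood of -249.515 corresponds to the AIC
# value reported for model M2 under this definition of AIC.
print(aic(-249.515, 3))
```

Smaller values of AIC, A*, and W* indicate a better fit, which is the sense in which the models in Table 2 are ranked.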
The fitted gamma-[(1-Exp)/Exp] pdf and the two other best-fitting pdfs are displayed in Figure 3. The graph shows that the gamma-[(1-Exp)/Exp] model behaves similarly to the other distributions and is very competitive in the analysis of these data.


Conclusion
As concluding remarks, we note that the gamma-[(1-G)/G] class of probability distributions developed in this work is a novel way of generalizing the gamma distribution and can be applied in different areas depending on the choice of the distribution G. In future research, we intend to carry out more detailed comparisons between the family proposed in this paper and the family of distributions investigated by Zografos and Balakrishnan (2009), which is also based on the integration of the gamma distribution.
In this paper, we studied in detail only one distribution of the gamma-[(1-G)/G] class, namely the gamma-[(1-Exp)/Exp] distribution. Some properties of this distribution were derived, and it was applied to a set of real data, obtaining a better fit than that obtained in a previous study by Akinsete et al. (2008). As future work, we intend to study new distributions within this class.
We note that adding parameters to a model can improve its fit to a particular phenomenon because of the greater flexibility. On the other hand, one should not forget that parameter estimation may then suffer from both computational and identifiability problems. Thus, the ideal is to choose a model that reflects the phenomenon or experiment well with the minimum number of parameters. For the class proposed in this research, only two additional parameters are added to the set of parameters of the G distribution.
References
Akinsete, A., Famoye, F., & Lee, C. (2008). The beta-Pareto distribution. Statistics, 42(6), 547-563.
Alzaatreh, A., Famoye, F., & Lee, C. (2014). The gamma-normal distribution: properties and applications. Computational Statistics and Data Analysis, 69(1), 67-80.
Chen, G., & Balakrishnan, N. (1995). The general purpose approximate goodness-of-fit test. Journal of Quality Technology, 27(2), 154-161.
Choulakian, V., & Stephens, M. A. (2001). Goodness-of-fit for the generalized Pareto distribution. Technometrics, 43(4), 478-484.
Cordeiro, G. M., Ortega, E. M. M., & Silva, G. O. (2011). The exponentiated generalized gamma distribution with application to lifetime data. Journal of Statistical Computation and Simulation, 81(7), 827-842.
Gupta, R. C., Gupta, P. L., & Gupta, R. D. (1998). Modeling failure time data by Lehman alternatives. Communications in Statistics - Theory and Methods, 27(4), 877-904.
Gupta, R. D., & Kundu, D. (1999). Theory & Methods: Generalized exponential distributions. Australian and New Zealand Journal of Statistics, 41(2), 173-188.
Mudholkar, G. S., & Srivastava, D. K. (1993). Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Transactions on Reliability, 42(2), 299-302.
Ristic, M. M., & Balakrishnan, N. (2012). The gamma exponentiated exponential distribution. Journal of Statistical Computation and Simulation, 82(8), 1191-1206.
Zografos, K., & Balakrishnan, N. (2009). On families of beta- and generalized gamma-generated distributions and associated inference. Statistical Methodology, 6(4), 344-362.
Author notes
franksinatrags@gmail.com