The Distribution of the Concentration Ratio for Samples from a Uniform Population

doi:10.4236/am.2015.61007

Applied Mathematics
Vol.06 No.01(2015), Article ID:53054,13 pages

Giovanni Girone, Antonella Nannavecchia


Faculty of Economics, University of Bari, Bari, Italy

Email: giovanni.girone@uniba.it, nannavecchia@lum.it

This work is licensed under the Creative Commons Attribution International License (CC BY).

http://creativecommons.org/licenses/by/4.0/

Received 24 October 2014; revised 20 November 2014; accepted 16 December 2014

ABSTRACT

In the present paper we derive, by a direct method, the exact expressions for the sampling probability density function of the Gini concentration ratio for samples of size n = 6, 7, 8, 9 and 10 drawn from a uniform population. Moreover, we identify some regularities of these distributions that are valid for any sample size.

Keywords:

Gini Concentration Ratio, Uniform Distribution, Order Statistics, Probability Density Function

1. Introduction

In 1914 Corrado Gini [1] introduced the concentration ratio R for the measure of inequality among values of a frequency distribution. The Gini index is widely used in fields as diverse as sociology, health science, engineering, and in particular, economics to measure the inequality of income distribution.

Various aspects of the Gini index have been studied. One of the most interesting topics is the estimation of the concentration ratio (Hoeffding, 1948 [2]; Glasser, 1962 [3]; Cucconi, 1965 [4]; Dall’Aglio, 1965 [5]). More recently, Deltas (2003) [6] discussed the sources of bias of the Gini coefficient for small samples. This has implications for the comparison of inequality among subsamples, some of which may be small, and for the use of the Gini index in measuring firm size inequality in markets with a small number of firms. Barrett and Donald (2009) [7] considered statistical inference for consistent estimators of generalized Gini indices. The empirical indices are shown to be asymptotically normally distributed using functional limit theory, and asymptotic variance expressions are obtained using influence functions. Davidson (2009) [8] derived an approximation for the estimator of the Gini index in which it is expressed as a sum of IID random variables. This approximation yields a reliable standard error that is simple to compute. Fakoor, Ghalibaf and Azarnoosh (2011) [9] considered nonparametric estimators of the Gini index based on a sample from length-biased distributions. They showed that these estimators are strongly consistent for the Gini index and obtained asymptotic normality for the corresponding Gini index.

Girone (1968) [10] studied the sampling distribution of the Gini index and in 1971 [11] derived its exact expression for samples drawn from an exponential population. Also in 1971, Girone [12] obtained, by a direct method, the sampling distribution function of the Gini ratio for samples of size n ≤ 5 drawn from a uniform population.

In the present note (Section 2), we calculate the joint probability density function (p.d.f.) of the random sample of size n and then the joint p.d.f. of the n order statistics. Next, we transform one of the order statistics into their average and divide the remaining n ‒ 1 order statistics by that average. We calculate the joint p.d.f. of the new n variables and, integrating with respect to the average, obtain the joint p.d.f. of the other n ‒ 1 variables. One of these variables is then transformed into the concentration ratio. We calculate the joint p.d.f. of the concentration ratio and the other n ‒ 2 variables and finally integrate this p.d.f. with respect to the n ‒ 2 variables, obtaining the marginal p.d.f. of the concentration ratio. The main difficulty of this procedure lies in identifying the region of integration of the n ‒ 2 variables, for two reasons: first, the region must be decomposed into subregions for which the limits of integration can be identified directly; second, the growing number of such subregions makes the derivation laborious.

In Sections 3-7, using the software Mathematica, we derive the exact distributions of the concentration ratio for samples from a uniform distribution of size n = 6, 7, 8, 9 and 10. Moreover (Section 8), we find some regularities of such distributions valid for any sample size.

2. The Procedure to Derive the Distribution of the Concentration Ratio

Let the n random variables of a sample drawn from a uniform population have p.d.f.

(1)

The joint p.d.f. of the variables is

(2)

The joint p.d.f. of the order statistics is

(3)

By transforming the variables

whose Jacobian is

we obtain the joint p.d.f. of S and the n ‒ 1 rescaled order statistics, which can be written as

(4)

We integrate expression (4) with respect to the variable S and obtain the joint p.d.f. of the remaining n ‒ 1 variables, which can be written as

(5)

By transforming one of these variables into the variable R, i.e. the concentration ratio,

from which we get

the Jacobian of the transformation is

and the joint p.d.f. of the variable R and the other n ‒ 2 variables is

(6)

for

(7)

By integrating expression (6) with respect to the n ‒ 2 variables over the regions determined by inequalities (7), we get the marginal p.d.f. of the concentration ratio R.
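The quantity whose density is derived here can also be explored numerically. The Python sketch below computes a sample concentration ratio from the order statistics and simulates its values for uniform samples. It rests on assumptions not stated in the text above: R is taken in the form R = Σ (2i ‒ n ‒ 1) x_(i) / ((n ‒ 1) Σ x_i), the normalization under which the maximum is 1, and the population is taken as uniform on (0, 1]. The function names and the seed are illustrative. This is a Monte Carlo check, not the exact integration procedure of this section.

```python
import random


def concentration_ratio(sample):
    """Sample concentration ratio R.

    ASSUMPTION: R = sum((2i - n - 1) * x_(i)) / ((n - 1) * sum(x)),
    where x_(1) <= ... <= x_(n) are the order statistics.
    """
    xs = sorted(sample)              # order statistics
    n = len(xs)
    total = sum(xs)
    if n < 2 or total <= 0:
        raise ValueError("need at least two values with a positive sum")
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / ((n - 1) * total)


def simulate_ratios(n, replications, seed=0):
    """Draw `replications` uniform samples of size n and return their R values."""
    rng = random.Random(seed)
    # 1.0 - rng.random() lies in (0, 1], so the sample sum is never zero.
    return [
        concentration_ratio([1.0 - rng.random() for _ in range(n)])
        for _ in range(replications)
    ]
```

Because R is a ratio of linear functions of the observations, it is unchanged when the whole sample is rescaled, which is why the average can be integrated out in the procedure. Under the assumed normalization, `concentration_ratio([1, 2, 3])` and `concentration_ratio([2, 4, 6])` both give 1/3.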

3. The Distribution of the Concentration Ratio for n = 6

The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 1) of the concentration ratio for random samples of size n = 6:

Figure 1. Probability density function of the concentration ratio R for random samples of size n = 6 from a uniform population.

Characteristic values of the distribution are:

mean

second moment

third moment

fourth moment

standard deviation

index of skewness

index of kurtosis

The distribution of the concentration ratio R for samples of size n = 6 from a uniform population shows a slight positive skewness and platykurtosis.
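The characteristic values listed in this and the following sections can be approximated by simulation. The sketch below is illustrative only and makes several assumptions: R uses the normalization Σ (2i ‒ n ‒ 1) x_(i) / ((n ‒ 1) Σ x_i), the index of skewness is the standardized third central moment, and the index of kurtosis is the standardized fourth central moment (value 3 for a normal distribution). The definitions used in the paper may differ, and the function names are hypothetical.

```python
import math
import random


def concentration_ratio(sample):
    # ASSUMED form of R: sum((2i - n - 1) * x_(i)) / ((n - 1) * sum(x)).
    xs = sorted(sample)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / ((n - 1) * sum(xs))


def characteristic_values(values):
    """Raw moments, standard deviation, skewness and kurtosis of a list."""
    m = len(values)
    mean = sum(values) / m
    raw = {k: sum(v ** k for v in values) / m for k in (2, 3, 4)}
    central = {k: sum((v - mean) ** k for v in values) / m for k in (2, 3, 4)}
    std_dev = math.sqrt(central[2])
    return {
        "mean": mean,
        "second_moment": raw[2],
        "third_moment": raw[3],
        "fourth_moment": raw[4],
        "std_dev": std_dev,
        "skewness": central[3] / std_dev ** 3,    # ASSUMED definition
        "kurtosis": central[4] / central[2] ** 2,  # ASSUMED definition
    }


def estimate_for_sample_size(n, replications=20000, seed=0):
    """Monte Carlo estimate of the characteristic values of R for size n."""
    rng = random.Random(seed)
    ratios = [
        concentration_ratio([1.0 - rng.random() for _ in range(n)])
        for _ in range(replications)
    ]
    return characteristic_values(ratios)
```

Calling `estimate_for_sample_size(6)` returns a dictionary of estimates that can be compared with the exact values. Agreement is expected only up to simulation error and only if the assumed definitions match those of the paper.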

4. The Distribution of the Concentration Ratio for n = 7

The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 2) of the concentration ratio R for random samples of size n = 7:

Figure 2. Probability density function of the concentration ratio R for random samples of size n = 7 from a uniform population.

Characteristic values of the distribution are:

mean

second moment

third moment

fourth moment

standard deviation

index of skewness

index of kurtosis

The distribution of the concentration ratio R for samples of size n = 7 from a uniform population shows slight positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6.

5. The Distribution of the Concentration Ratio for n = 8

The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 3) of the concentration ratio R for random samples of size n = 8:

Figure 3. Probability density function of the concentration ratio R for random samples of size n = 8 from a uniform population.

Characteristic values of the distribution are:

mean

second moment

third moment

fourth moment

standard deviation

index of skewness

index of kurtosis

The distribution of the concentration ratio R for samples of size n = 8 from a uniform population shows slight positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6 and 7.

6. The Distribution of the Concentration Ratio for n = 9

The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 4) of the concentration ratio R for random samples of size n = 9:

Figure 4. Probability density function of the concentration ratio R for random samples of size n = 9 from a uniform population.

Characteristic values of the distribution are:

mean

second moment

third moment

fourth moment

standard deviation

index of skewness

index of kurtosis

The distribution of the concentration ratio R for samples of size n = 9 from a uniform population shows slight positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6, 7 and 8.

7. The Distribution of the Concentration Ratio for n = 10

The procedure indicated in Section 2 is used to obtain the following p.d.f. (Figure 5) of the concentration ratio R for random samples of size n = 10:

Figure 5. Probability density function of the concentration ratio R for random samples of size n = 10 from a uniform population.

Characteristic values of the distribution are:

mean

second moment

third moment

fourth moment

standard deviation

index of skewness

index of kurtosis

The distribution of the concentration ratio R for samples of size n = 10 from a uniform population shows slight positive skewness and platykurtosis, both lower than those obtained for samples of size n = 6, 7, 8 and 9.

8. Some Regularities of the Distributions

The analysis of the p.d.f. for shows some regularities:

● The p.d.f. of the concentration ratio R, for and for samples of size n, can be expressed by

● Furthermore, the p.d.f. of the concentration ratio R, for and for samples of size n, can be expressed by

● The density of the concentration ratio R, for and for samples of size n, is given by

● The jth term of the density of the concentration ratio R, denoted as verifies the following symmetry

The coefficients of the terms of the p.d.f. of the concentration ratio R for samples of size multiplied by become the coefficients of the terms of the same p.d.f. for sample of size n.

These results are valid for every sample size and may help reduce the heavy calculations needed to determine the p.d.f. of the concentration ratio R.

9. Concluding Remarks

In the present paper we obtain the distributions of the Gini concentration ratio R for samples of size n = 6, 7, 8, 9 and 10 drawn from a uniform population. We use the same method used by Girone [12] to derive the same distributions for samples of size n ≤ 5. We obtain the p.d.f. of the concentration ratio R by calculating a multiple integral in n ‒ 2 dimensions for each region. The limits of integration are defined by solving the inequalities of the order statistics divided by the sample mean and expressed in terms of the concentration ratio R for the values assumed in each of these regions. The calculation of the limits of integration is particularly heavy and requires a very long processing time.

The obtained results show that the p.d.f. of the concentration ratio R is given by hyperbolic splines with degree 2 and with nodes in for. Such distributions are unimodal with mean tending to 1/3, which is the value of the concentration ratio R for the population, and have decreasing standard deviation. Moreover, the distributions show a slight positive skewness and platykurtosis that tend to decrease as n increases.
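The population value of the concentration ratio referred to here can be obtained with a short calculation. The derivation below is a sketch under the assumption that the population is uniform on (0, θ); the symbols θ, μ, F and L(p) are introduced here for illustration and are not taken from the text.

```latex
% Assumption: X is uniform on (0, \theta), so F^{-1}(t) = \theta t and \mu = \theta / 2.
\begin{align*}
L(p) &= \frac{1}{\mu} \int_0^p F^{-1}(t)\,dt
      = \frac{2}{\theta} \int_0^p \theta t \,dt
      = p^2 , \\
R    &= 1 - 2 \int_0^1 L(p)\,dp
      = 1 - 2 \int_0^1 p^2 \,dp
      = 1 - \frac{2}{3}
      = \frac{1}{3} .
\end{align*}
```

The result does not depend on θ, in line with the scale invariance of the concentration ratio.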

Beyond the possibility of obtaining similar results for samples of larger size, open problems include the derivation of exact expressions for the mean and the other characteristics of the distribution of the concentration ratio R for random samples of size n drawn from a uniform population.

References

[1] Gini, C. (1914) L’ammontare e la composizione della ricchezza delle nazioni. Bocca, Torino.
[2] Hoeffding, W. (1948) A Class of Statistics with Asymptotically Normal Distribution. Annals of Mathematical Statistics, 19, 293-325.
[3] Glasser, G.J. (1962) Variance Formulas for the Mean Difference and the Coefficient of Concentration. Journal of the American Statistical Association, 57, 648-654. http://dx.doi.org/10.1080/01621459.1962.10500553
[4] Cucconi, O. (1965) Sulla distribuzione campionaria del rapporto R di concentrazione. Statistica, 25, 119.
[5] Dall’Aglio, G. (1965) Comportamento asintotico delle stime della differenza media e del rapporto di concentrazione. Metron, 24, 379-414.
[6] Deltas, G. (2003) The Small-Sample Bias of the Gini Coefficient: Results and Implications for Empirical Research. Review of Economics and Statistics, 85, 226-234. http://dx.doi.org/10.1162/rest.2003.85.1.226
[7] Barrett, G.F. and Donald, S.G. (2009) Statistical Inference with Generalized Gini Indices of Inequality, Poverty, and Welfare. Journal of Business & Economic Statistics, 27, 1-17. http://dx.doi.org/10.1198/jbes.2009.0001
[8] Davidson, R. (2009) Reliable Inference for the Gini Index. Journal of Econometrics, 150, 30-40. http://dx.doi.org/10.1016/j.jeconom.2008.11.004
[9] Fakoor, V., Ghalibaf, M.B. and Azarnoosh, H.A. (2011) Asymptotic Behaviors of the Lorenz Curve and Gini Index in Sampling from a Length-Biased Distribution. Statistics and Probability Letters, 81, 1425-1435. http://dx.doi.org/10.1016/j.spl.2011.04.013
[10] Girone, G. (1968) Sul comportamento campionario simulato del rapporto di concentrazione. Annali della Facoltà di Economia e Commercio dell’Università degli Studi di Bari, 23, 5-11.
[11] Girone, G. (1971) La distribuzione del rapporto di concentrazione per campioni casuali di variabili esponenziali. Studi di Probabilità, Statistica e Ricerca operativa in onore di Giuseppe Pompilj, Oderisi, Gubbio.
[12] Girone, G. (1971) La distribuzione del rapporto di concentrazione per piccoli campioni estratti da una popolazione uniforme. Annali dell’Istituto di Statistica dell’Università degli Studi di Bari, 36, 31-52.
