
Moments and cumulants are commonly used to characterize a probability distribution or an observed data set. The method of moments is likewise a common way to fit a parametric distribution to a given data set, but it does not always produce satisfactory results. It is difficult to determine exactly what information about the shape of a distribution is expressed by its moments of the third and higher orders. In small samples in particular, the numerical values of sample moments can differ considerably from the corresponding theoretical moments of the probability distribution from which the random sample comes. Parameter estimates obtained by the method of moments are often considerably less accurate than those obtained by other methods, particularly for small samples. The present paper deals with an alternative approach to constructing an appropriate parametric distribution for a given data set, based on order statistics.

L-moments form the basis for a general theory that covers the summarization and description of theoretical probability distributions and of observed data sets, parameter estimation for theoretical probability distributions, and hypothesis testing of their parameter values. The theory of L-moments includes established methods such as the use of order statistics and the Gini mean difference. It leads to some promising innovations in measuring the skewness and kurtosis of a distribution and provides relatively new methods of parameter estimation for an individual distribution. L-moments can be defined for any random variable whose expected value exists. The main advantage of L-moments over conventional moments is that they can be estimated by linear functions of the sample values and are more resistant to the influence of sample variability. L-moments are more robust than conventional moments to outliers in the data, and they allow better conclusions to be drawn from small samples about the underlying probability distribution. L-moments sometimes yield even more efficient parameter estimates than the maximum likelihood method, in particular for small samples, see [

L-moments also have certain theoretical advantages over conventional moments: they can characterize a wider range of distributions, they are more robust and less prone to estimation bias, and their approximation by the asymptotic normal distribution is more accurate in finite samples, see [

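Let $X$ be a random variable with distribution function $F(x)$ and quantile function $x(F)$, and let $X_{1:n} \le X_{2:n} \le \dots \le X_{n:n}$ be the order statistics of a random sample of size $n$ drawn from the distribution of $X$. The $r$-th L-moment of $X$ is defined as

$$\lambda_r = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\,E(X_{r-j:r}), \qquad r = 1, 2, \dots \tag{1}$$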

L-moments are analogous to conventional moments. They can be estimated on the basis of linear combinations of sample order statistics, i.e. L-statistics. L-moments are an alternative system describing the shape of the probability distribution.

The issue of L-moments is discussed, for example, in [

The expected value of the $r$-th order statistic of a random sample of size $n$ has the form

$$E(X_{r:n}) = \frac{n!}{(r-1)!\,(n-r)!}\int_0^1 x(F)\,F^{r-1}(1-F)^{n-r}\,dF. \tag{2}$$

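If we substitute Equation (2) into Equation (1), after rearrangement we obtain

$$\lambda_r = \int_0^1 x(F)\,P^{*}_{r-1}(F)\,dF, \qquad r = 1, 2, \dots,$$

where

$$P^{*}_{r}(F) = \sum_{j=0}^{r} p^{*}_{r,j}\,F^{j}, \qquad p^{*}_{r,j} = (-1)^{r-j}\binom{r}{j}\binom{r+j}{j},$$

so that $P^{*}_{r}(F)$ is the $r$-th shifted Legendre polynomial.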

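The letter “L” in “L-moments” indicates that the r-th L-moment λ_{r} is a linear function of the expected order statistics. The estimate of the r-th L-moment λ_{r} based on a sample is thus a linear combination of the ordered data values, i.e. an L-statistic. The first four L-moments of the probability distribution are defined as

$$\lambda_1 = E(X_{1:1}) = \int_0^1 x(F)\,dF, \tag{6}$$

$$\lambda_2 = \tfrac{1}{2}\,E(X_{2:2} - X_{1:2}) = \int_0^1 x(F)\,(2F - 1)\,dF, \tag{7}$$

$$\lambda_3 = \tfrac{1}{3}\,E(X_{3:3} - 2X_{2:3} + X_{1:3}) = \int_0^1 x(F)\,(6F^2 - 6F + 1)\,dF, \tag{8}$$

$$\lambda_4 = \tfrac{1}{4}\,E(X_{4:4} - 3X_{3:4} + 3X_{2:4} - X_{1:4}) = \int_0^1 x(F)\,(20F^3 - 30F^2 + 12F - 1)\,dF. \tag{9}$$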

A probability distribution can be specified by its L-moments even if some of its conventional moments do not exist; the converse, however, is not true. It can be proved that the first L-moment λ_{1} is a location characteristic and the second L-moment λ_{2} a variability characteristic. It is often desirable to standardize the higher L-moments λ_{r}, r ≥ 3, so that they are independent of the units of the random variable X. The ratio of L-moments of the r-th order of the random variable X is defined as

$$\tau_r = \frac{\lambda_r}{\lambda_2}, \qquad r = 3, 4, \dots \tag{10}$$

We can also define a function of L-moments that is analogous to the classical coefficient of variation, the so-called L-coefficient of variation

$$\tau = \frac{\lambda_2}{\lambda_1}.$$

The ratio of L-moments τ_{3} is a skewness characteristic and the ratio of L-moments τ_{4} a kurtosis characteristic of the corresponding probability distribution. The main properties of a probability distribution are very well summarized by the following four characteristics: L-location λ_{1}, L-variability λ_{2}, L-skewness τ_{3} and L-kurtosis τ_{4}. The L-moments λ_{1} and λ_{2}, the L-coefficient of variation τ and the ratios of L-moments τ_{3} and τ_{4} are the most useful characteristics for summarizing a probability distribution. Their main properties are existence (if the expected value of the distribution is finite, then all its L-moments exist) and uniqueness (if the expected value of the distribution is finite, then the L-moments determine the distribution uniquely, i.e. no two distinct distributions have the same L-moments).
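As an illustration of these four characteristics, the following minimal sketch approximates λ_{1}, λ_{2}, τ_{3} and τ_{4} numerically. It assumes the usual integral representation of the r-th L-moment as the integral of x(F) times the shifted Legendre polynomial of degree r − 1 over (0, 1); the function names, the midpoint rule and the choice of the standard exponential distribution as a test case are illustrative assumptions, not part of the paper.

```python
import math


def shifted_legendre(r, F):
    """Shifted Legendre polynomial P*_r(F) = sum_j (-1)^(r-j) C(r,j) C(r+j,j) F^j."""
    return sum(
        (-1) ** (r - j) * math.comb(r, j) * math.comb(r + j, j) * F ** j
        for j in range(r + 1)
    )


def population_l_moments(quantile, n_nodes=20_000):
    """Approximate lambda_1..lambda_4 by the midpoint rule on (0, 1).

    `quantile` is the quantile function x(F); the midpoint rule never
    evaluates it at F = 0 or F = 1, where it may be infinite.
    """
    lam = [0.0, 0.0, 0.0, 0.0]
    h = 1.0 / n_nodes
    for k in range(n_nodes):
        F = (k + 0.5) * h
        x = quantile(F)
        for r in range(4):
            lam[r] += x * shifted_legendre(r, F) * h
    return lam


def exponential_quantile(F):
    """Quantile function of the standard exponential distribution (test case)."""
    return -math.log(1.0 - F)


lam1, lam2, lam3, lam4 = population_l_moments(exponential_quantile)
tau3, tau4 = lam3 / lam2, lam4 / lam2
print(f"lambda_1={lam1:.4f} lambda_2={lam2:.4f} tau_3={tau3:.4f} tau_4={tau4:.4f}")
```

For the standard exponential distribution the known values are λ_{1} = 1, λ_{2} = 1/2, τ_{3} = 1/3 and τ_{4} = 1/6, so the printed numbers should be close to these.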

L-moments are usually estimated from a random sample obtained from an unknown distribution. Since the r-th L-moment λ_{r} is a function of the expected values of the order statistics of a random sample of size r, it is natural to estimate it using the so-called U-statistic, i.e. the corresponding function of the sample order statistics, averaged over all subsets of size r that can be formed from the obtained random sample of size n.
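The U-statistic idea can be sketched directly: for every subset of size r, form the signed combination of its ordered values and average over all subsets. The function name and the small data set below are hypothetical, and the brute-force enumeration is meant only to make the definition concrete, since its cost grows like the number of subsets.

```python
from itertools import combinations
from math import comb


def sample_l_moment_ustat(data, r):
    """r-th sample L-moment as a U-statistic over all subsets of size r."""
    x = sorted(data)
    n = len(x)
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= len(data)")
    total = 0.0
    # combinations() of a sorted list yields each subset in ascending order,
    # so subset[k] is the (k+1)-th smallest of the r chosen values.
    for subset in combinations(x, r):
        total += sum(
            (-1) ** j * comb(r - 1, j) * subset[r - 1 - j] for j in range(r)
        ) / r
    return total / comb(n, r)


sample = [1, 2, 3, 4, 100]  # hypothetical data with one outlier
print([sample_l_moment_ustat(sample, r) for r in (1, 2, 3)])
```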

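Let $x_1, x_2, \dots, x_n$ be the sample and $x_{1:n} \le x_{2:n} \le \dots \le x_{n:n}$ the ordered sample. The $r$-th sample L-moment can then be written as

$$l_r = \binom{n}{r}^{-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\,x_{i_{r-j}:n}, \qquad r = 1, 2, \dots, n.$$

Hence the first four sample L-moments have the form

$$l_1 = \frac{1}{n}\sum_{i} x_i,$$

$$l_2 = \frac{1}{2}\binom{n}{2}^{-1}\sum_{i>j}\,(x_{i:n} - x_{j:n}),$$

$$l_3 = \frac{1}{3}\binom{n}{3}^{-1}\sum_{i>j>k}\,(x_{i:n} - 2x_{j:n} + x_{k:n}),$$

$$l_4 = \frac{1}{4}\binom{n}{4}^{-1}\sum_{i>j>k>l}\,(x_{i:n} - 3x_{j:n} + 3x_{k:n} - x_{l:n}).$$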

U-statistics are widely used, especially in nonparametric statistics. Their favourable properties are unbiasedness, asymptotic normality and some resistance to the influence of outliers, see [

When calculating the r-th sample L-moment, it is not necessary to iterate over all subsets of size r, since this statistic can be expressed directly as a linear combination of the order statistics of a random sample of size n.

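If we consider the estimate of $E(X_{r:r})$ obtained with the use of U-statistics, it can be written as $r \cdot b_{r-1}$, where

$$b_r = \frac{1}{n}\binom{n-1}{r}^{-1}\sum_{j=r+1}^{n}\binom{j-1}{r}\,x_{j:n}.$$

Namely,

$$b_0 = \frac{1}{n}\sum_{j=1}^{n} x_{j:n}, \qquad b_1 = \frac{1}{n}\sum_{j=2}^{n}\frac{j-1}{n-1}\,x_{j:n}, \qquad b_2 = \frac{1}{n}\sum_{j=3}^{n}\frac{(j-1)(j-2)}{(n-1)(n-2)}\,x_{j:n},$$

and so generally

$$b_r = \frac{1}{n}\sum_{j=r+1}^{n}\frac{(j-1)(j-2)\cdots(j-r)}{(n-1)(n-2)\cdots(n-r)}\,x_{j:n}.$$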

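Thus the first four sample L-moments can be written as

$$l_1 = b_0, \qquad l_2 = 2b_1 - b_0, \qquad l_3 = 6b_2 - 6b_1 + b_0, \qquad l_4 = 20b_3 - 30b_2 + 12b_1 - b_0.$$

We can therefore write generally

$$l_{r+1} = \sum_{k=0}^{r} p^{*}_{r,k}\,b_k, \qquad r = 0, 1, \dots, n-1,$$

where

$$p^{*}_{r,k} = (-1)^{r-k}\binom{r}{k}\binom{r+k}{k} = \frac{(-1)^{r-k}\,(r+k)!}{(k!)^2\,(r-k)!}.$$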

Sample L-moments are used in a similar way to conventional sample moments, summarizing the basic properties of the sample distribution, namely its location (level), variability, skewness and kurtosis. Sample L-moments thus allow the corresponding properties of the probability distribution from which the sample originates to be estimated, and they can be used to estimate the parameters of that distribution. In such applications L-moments are often preferred to conventional moments because sample L-moments, being linear functions of the sample values, are less sensitive than conventional moments to sample variability and to measurement errors in extreme observations. L-moments therefore lead to more accurate and robust estimates of the characteristics or parameters of the underlying probability distribution.

Sample L-moments have been used previously in statistics, but not as part of a unified theory. The first sample L-moment l_{1} is the sample L-location (the sample mean), and the second sample L-moment l_{2} is the sample L-variability. The natural estimate of the L-moment ratio (10) is the sample L-moment ratio

$$t_r = \frac{l_r}{l_2}, \qquad r = 3, 4, \dots$$

Hence t_{3} is a sample L-skewness and t_{4} is a sample L-kurtosis. Sample ratios of L-moments t_{3} and t_{4} may be used as the characteristics of skewness and kurtosis of a sample data set.
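The linear-combination route to the sample L-moments and their ratios can be sketched as follows. The sketch assumes the standard probability-weighted-moment estimator b_{r} (a weighted mean of the ordered sample with binomial weights) and the shifted Legendre coefficients; the helper names and the data are hypothetical.

```python
from math import comb


def pwm_b(x_sorted, r):
    """b_r = (1/n) * C(n-1, r)^(-1) * sum_{j=r+1}^{n} C(j-1, r) * x_{j:n}."""
    n = len(x_sorted)
    weighted = sum(comb(j - 1, r) * x_sorted[j - 1] for j in range(r + 1, n + 1))
    return weighted / (n * comb(n - 1, r))


def sample_l_moments(data, n_moments=4):
    """Return [l_1, ..., l_{n_moments}] via l_{r+1} = sum_k p*_{r,k} b_k."""
    x = sorted(data)
    if len(x) < n_moments:
        raise ValueError("need at least n_moments observations")
    b = [pwm_b(x, k) for k in range(n_moments)]
    return [
        sum((-1) ** (r - k) * comb(r, k) * comb(r + k, k) * b[k] for k in range(r + 1))
        for r in range(n_moments)
    ]


def sample_l_ratios(data):
    """Return (l_1, l_2, t_3, t_4) for a data set with at least four values."""
    l1, l2, l3, l4 = sample_l_moments(data, 4)
    return l1, l2, l3 / l2, l4 / l2


print(sample_l_ratios([1, 2, 3, 4, 100]))  # hypothetical data with one outlier
```

This costs one sort plus a few passes over the data, instead of an enumeration of all subsets of size r.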

The Gini mean difference, which has the form

$$G = \binom{n}{2}^{-1}\sum_{i>j}\,(x_{i:n} - x_{j:n}),$$

is related both to the sample L-moments, since $G = 2\,l_2$, and to the Gini coefficient. The Gini coefficient depends only on the single parameter σ in the case of the two-parameter lognormal distribution, but on the values of all three parameters in the case of the three-parameter lognormal distribution. For more details see, for example, [
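The link between the Gini mean difference and the second sample L-moment can be checked numerically. The sketch below assumes the usual definition of the Gini mean difference as the average absolute difference over all pairs of observations, and the standard form of the second sample L-moment as a weighted mean of the ordered sample; function names and data are hypothetical.

```python
from itertools import combinations


def gini_mean_difference(data):
    """Average of x_{i:n} - x_{j:n} over all pairs i > j of the ordered sample."""
    x = sorted(data)
    pairs = list(combinations(x, 2))  # each pair is (smaller, larger)
    return sum(larger - smaller for smaller, larger in pairs) / len(pairs)


def second_sample_l_moment(data):
    """l_2 = 2*b_1 - b_0 with b_1 = (1/n) * sum_{j>=2} (j-1)/(n-1) * x_{j:n}."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum((j - 1) * x[j - 1] for j in range(2, n + 1)) / (n * (n - 1))
    return 2.0 * b1 - b0


sample = [1, 2, 3, 4, 100]  # hypothetical data
print(gini_mean_difference(sample), 2.0 * second_sample_l_moment(sample))
```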

An alternative robust version of L-moments is introduced in this subchapter. The modification is called “trimmed L-moments”, abbreviated TL-moments. The expected values of the order statistics of a random sample in the definition of L-moments of probability distributions are replaced with those of a larger random sample, whose size grows with the extent of the trimming, as shown below.

TL-moments have certain advantages over conventional L-moments and central moments. A TL-moment of a probability distribution may exist even when the corresponding L-moment or central moment does not, as is the case for the Cauchy distribution. Sample TL-moments are more resistant to outliers in the data. The method of TL-moments is not intended to replace the existing robust methods but rather to supplement them, particularly in situations where there are outliers in the data.

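In this alternative robust modification of L-moments, the expected value $E(X_{r-j:r})$ is replaced with the expected value $E(X_{r+t_1-j:r+t_1+t_2})$. Thus, for each $r$, we increase the size of the random sample from the original $r$ to $r + t_1 + t_2$, working only with the expected values of the $r$ modified order statistics $X_{t_1+1:r+t_1+t_2}, X_{t_1+2:r+t_1+t_2}, \dots, X_{t_1+r:r+t_1+t_2}$, obtained by trimming the smallest $t_1$ and the largest $t_2$ observations from the conceptual random sample. This modification is called the $r$-th trimmed L-moment (TL-moment) and is denoted $\lambda_r^{(t_1,t_2)}$. The TL-moment of the $r$-th order of the random variable $X$ is thus defined as

$$\lambda_r^{(t_1,t_2)} = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\,E(X_{r+t_1-j:r+t_1+t_2}), \qquad r = 1, 2, \dots \tag{30}$$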

It is evident from Expressions (30) and (1) that TL-moments reduce to L-moments when $t_1 = t_2 = 0$. Although applications in which the trimming values are not equal, i.e. $t_1 \ne t_2$, can also be considered, we focus here only on the symmetric case $t_1 = t_2 = t$. Expression (30) can then be rewritten as

$$\lambda_r^{(t)} = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\,E(X_{r+t-j:r+2t}), \qquad r = 1, 2, \dots \tag{31}$$

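Thus, for example, $\lambda_1^{(t)} = E(X_{1+t:1+2t})$ is the expected value of the median of a conceptual random sample of size $1 + 2t$.

For $t = 1$, the first four TL-moments have the form

$$\lambda_1^{(1)} = E(X_{2:3}), \tag{32}$$

$$\lambda_2^{(1)} = \tfrac{1}{2}\,E(X_{3:4} - X_{2:4}), \tag{33}$$

$$\lambda_3^{(1)} = \tfrac{1}{3}\,E(X_{4:5} - 2X_{3:5} + X_{2:5}), \tag{34}$$

$$\lambda_4^{(1)} = \tfrac{1}{4}\,E(X_{5:6} - 3X_{4:6} + 3X_{3:6} - X_{2:6}). \tag{35}$$

The measures of location, variability, skewness and kurtosis of the probability distribution analogous to the conventional L-moments (6)-(9) are based on $\lambda_1^{(1)}$, $\lambda_2^{(1)}$, $\lambda_3^{(1)}$ and $\lambda_4^{(1)}$.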

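The expected value $E(X_{r:n})$ can be written using Formula (2). With the use of Equation (2), the right-hand side of Equation (31) can be expressed as

$$\lambda_r^{(t)} = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\frac{(r+2t)!}{(r+t-j-1)!\,(t+j)!}\int_0^1 x(F)\,F^{r+t-j-1}(1-F)^{t+j}\,dF, \qquad r = 1, 2, \dots$$

It should be pointed out that $\lambda_r^{(0)} = \lambda_r$ is the ordinary $r$-th L-moment, without any trimming.

Expressions (32)-(35) for the first four TL-moments ($t = 1$) may be written in an alternative way as

$$\lambda_1^{(1)} = 6\int_0^1 x(F)\,F(1-F)\,dF,$$

$$\lambda_2^{(1)} = 6\int_0^1 x(F)\,F(1-F)(2F-1)\,dF,$$

$$\lambda_3^{(1)} = \frac{20}{3}\int_0^1 x(F)\,F(1-F)(5F^2-5F+1)\,dF,$$

$$\lambda_4^{(1)} = \frac{15}{2}\int_0^1 x(F)\,F(1-F)(14F^3-21F^2+9F-1)\,dF.$$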
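As a worked check of how such integral forms arise, the second TL-moment for t = 1 can be expanded step by step. The sketch assumes the standard order-statistic integral, in which the expected value of the k-th order statistic in a sample of size m has weight m!/((k − 1)!(m − k)!) F^{k−1}(1 − F)^{m−k}, and the symmetric-trimming definition of the second TL-moment as half the expected difference of the third and second order statistics in a conceptual sample of size four.

```latex
\begin{aligned}
\lambda_2^{(1)}
  &= \tfrac{1}{2}\,\bigl[E(X_{3:4}) - E(X_{2:4})\bigr] \\
  &= \tfrac{1}{2}\int_0^1 x(F)\,\Bigl[\tfrac{4!}{2!\,1!}F^{2}(1-F)
       - \tfrac{4!}{1!\,2!}F(1-F)^{2}\Bigr]\,dF \\
  &= 6\int_0^1 x(F)\,F(1-F)\,\bigl[F-(1-F)\bigr]\,dF \\
  &= 6\int_0^1 x(F)\,F(1-F)(2F-1)\,dF .
\end{aligned}
```

The factor F(1 − F) vanishes at both ends of the unit interval, which is what damps the contribution of the extreme quantiles.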

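A distribution can be determined by its TL-moments even if some of its L-moments or conventional moments do not exist. For example, $\lambda_1^{(1)}$ (the expected value of the median of a conceptual random sample of size three) exists for the Cauchy distribution, despite the non-existence of the first L-moment $\lambda_1$.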
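The existence of TL-moments for a heavy-tailed distribution can be illustrated numerically. The sketch below assumes symmetric trimming with t = 1, the standard integral for the expected value of an order statistic, and the standard Cauchy quantile function tan(π(F − 1/2)); the function names and the midpoint rule are illustrative choices.

```python
import math


def expected_order_statistic(quantile, k, m, n_nodes=20_000):
    """E(X_{k:m}) = m!/((k-1)!(m-k)!) * int_0^1 x(F) F^(k-1) (1-F)^(m-k) dF.

    Evaluated with the midpoint rule, which avoids F = 0 and F = 1.
    """
    const = math.factorial(m) / (math.factorial(k - 1) * math.factorial(m - k))
    h = 1.0 / n_nodes
    total = 0.0
    for i in range(n_nodes):
        F = (i + 0.5) * h
        total += quantile(F) * F ** (k - 1) * (1.0 - F) ** (m - k)
    return const * total * h


def cauchy_quantile(F):
    """Quantile function of the standard Cauchy distribution."""
    return math.tan(math.pi * (F - 0.5))


# First two TL-moments for t = 1: E(X_{2:3}) and (1/2) E(X_{3:4} - X_{2:4}).
tl1 = expected_order_statistic(cauchy_quantile, 2, 3)
tl2 = 0.5 * (
    expected_order_statistic(cauchy_quantile, 3, 4)
    - expected_order_statistic(cauchy_quantile, 2, 4)
)
print(f"TL location = {tl1:.6f}, TL scale = {tl2:.4f}")
```

The location comes out as zero by symmetry and the scale as roughly 0.698, whereas the untrimmed integral for the mean of the Cauchy distribution diverges.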

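TL-skewness $\tau_3^{(t)}$ and TL-kurtosis $\tau_4^{(t)}$ can be defined analogously to L-skewness and L-kurtosis:

$$\tau_3^{(t)} = \frac{\lambda_3^{(t)}}{\lambda_2^{(t)}}, \qquad \tau_4^{(t)} = \frac{\lambda_4^{(t)}}{\lambda_2^{(t)}}.$$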

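Let $x_1, x_2, \dots, x_n$ be the sample and $x_{1:n} \le x_{2:n} \le \dots \le x_{n:n}$ the ordered sample. The expression

$$\hat{E}(X_{j+1:j+l+1}) = \frac{1}{\binom{n}{j+l+1}}\sum_{i=1}^{n}\binom{i-1}{j}\binom{n-i}{l}\,x_{i:n} \tag{43}$$

is an unbiased estimate of the expected value of the $(j+1)$-th order statistic $X_{j+1:j+l+1}$ in a conceptual random sample of size $j + l + 1$. Now assume that, in the definition of the TL-moment $\lambda_r^{(t)}$, the expression $E(X_{r+t-j:r+2t})$ is replaced by its unbiased estimate

$$\hat{E}(X_{r+t-j:r+2t}) = \frac{1}{\binom{n}{r+2t}}\sum_{i=1}^{n}\binom{i-1}{r+t-j-1}\binom{n-i}{t+j}\,x_{i:n},$$

which is obtained by substituting $j \to r + t - j - 1$ and $l \to t + j$ in (43). We then get the $r$-th sample TL-moment

$$l_r^{(t)} = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\,\hat{E}(X_{r+t-j:r+2t}), \qquad r = 1, 2, \dots, n - 2t,$$

i.e.

$$l_r^{(t)} = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\frac{1}{\binom{n}{r+2t}}\sum_{i=1}^{n}\binom{i-1}{r+t-j-1}\binom{n-i}{t+j}\,x_{i:n}, \tag{46}$$

which is an unbiased estimate of the $r$-th TL-moment $\lambda_r^{(t)}$. Note that, because of the binomial coefficients, the coefficients of $x_{i:n}$ in (46) are non-zero only for $r + t - j \le i \le n - t - j$. A simple rearrangement of Equation (46) provides the alternative linear form

$$l_r^{(t)} = \frac{1}{r}\sum_{i=t+1}^{n-t}\left[\frac{\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}\binom{i-1}{r+t-j-1}\binom{n-i}{t+j}}{\binom{n}{r+2t}}\right]x_{i:n}.$$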

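For $r = 1$, for example, we obtain for the first sample TL-moment

$$l_1^{(t)} = \sum_{i=t+1}^{n-t} w_{i:n}^{(t)}\,x_{i:n},$$

where the weights are given by

$$w_{i:n}^{(t)} = \frac{\binom{i-1}{t}\binom{n-i}{t}}{\binom{n}{2t+1}}.$$

The above results can be used to estimate TL-skewness and TL-kurtosis by the simple ratios

$$t_3^{(t)} = \frac{l_3^{(t)}}{l_2^{(t)}}, \qquad t_4^{(t)} = \frac{l_4^{(t)}}{l_2^{(t)}}.$$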

We can choose t = nα, representing the amount of trimming from each end of the sample, where α is a fixed proportion with 0 ≤ α < 0.5. For more on TL-moments, see [
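A minimal sketch of the sample TL-moments with symmetric trimming follows. It assumes the standard linear form in which the ordered observation with rank i receives a weight built from binomial coefficients, so that the t smallest and t largest observations get zero weight. Rounding nα down to an integer is an assumption of this sketch, and the function names and data are hypothetical.

```python
from math import comb


def trim_count(n, alpha):
    """Number of observations trimmed from each end; nα rounded down (assumption)."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    return int(n * alpha)


def sample_tl_moment(data, r, t):
    """r-th sample TL-moment with t observations trimmed from each end."""
    x = sorted(data)
    n = len(x)
    if r < 1 or t < 0 or r + 2 * t > n:
        raise ValueError("need r >= 1, t >= 0 and r + 2t <= n")
    denom = comb(n, r + 2 * t)
    total = 0.0
    for i in range(t + 1, n - t + 1):  # ranks outside this range have zero weight
        w = sum(
            (-1) ** j * comb(r - 1, j) * comb(i - 1, r + t - j - 1) * comb(n - i, t + j)
            for j in range(r)
        )
        total += (w / denom) * x[i - 1]
    return total / r


sample = [1, 2, 3, 4, 100]  # hypothetical data with one outlier
t = trim_count(len(sample), 0.25)
print(t, sample_tl_moment(sample, 1, t), sample_tl_moment(sample, 2, t))
```

With t = 0 the function returns the ordinary sample L-moments (22 and 20 for this data set), while with t = 1 the outlier 100 receives zero weight and the location estimate drops to 3.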

The L-moments method was originally employed in hydrology, climatology and meteorology, in research on extreme precipitation, see, e.g. [

Year | 2004 | 2005 | 2006 | 2007 |
---|---|---|---|---|
Sample size | 4351 | 7483 | 9675 | 11,294 |

Source: Own research.

region (Bohemia and Moravia), social group, municipality size, age and the highest educational attainment. The households are divided into subsets according to their heads, who are mostly men. In two-parent families (a husband-and-wife or cohabitee type), the head of household is always a man, regardless of economic activity. In lone-parent families (a one-parent-with-children type) and in non-family households, whose members are related neither by marriage (partnership) nor by a parent-child relationship, the crucial criterion for determining the head of household is economic activity, a further criterion being the money income of individual household members. The former criterion also applies to more complex household types, for instance joint households of several two-parent families.

The value α = 0.25, from the middle of the interval 0 ≤ α < 0.5, was used in this research. With only minor exceptions, the TL-moments method produced the most accurate results. The L-moments method was the second most effective in more than half of the cases, although it did not outperform the maximum likelihood method often enough for the difference between the two to be regarded as significant. The values of the χ^{2} criterion indicate that the L-moments method produced more accurate results than the maximum likelihood method in two out of four cases, the most accurate outcomes in all four cases being produced by the TL-moments method.
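A criterion of this kind can be sketched for grouped data as follows. The sketch assumes the usual Pearson form, the sum over intervals of (observed − expected)² / expected, and a three-parameter lognormal model in which ln(X − θ) is normal with mean μ and variance σ². The interval boundaries, counts and parameter values are hypothetical and are not taken from the paper.

```python
import math


def lognormal3_cdf(x, mu, sigma2, theta):
    """CDF of the three-parameter lognormal distribution with threshold theta."""
    if x <= theta:
        return 0.0
    z = (math.log(x - theta) - mu) / math.sqrt(sigma2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi_square_criterion(observed, edges, mu, sigma2, theta):
    """Pearson chi-square for counts observed[i] in (edges[i], edges[i+1]].

    The last edge may be math.inf. Every interval must have a positive
    model probability, otherwise the expected count would be zero.
    """
    n = sum(observed)
    chi2 = 0.0
    for count, low, high in zip(observed, edges[:-1], edges[1:]):
        p_high = 1.0 if math.isinf(high) else lognormal3_cdf(high, mu, sigma2, theta)
        p = p_high - lognormal3_cdf(low, mu, sigma2, theta)
        expected = n * p
        chi2 += (count - expected) ** 2 / expected
    return chi2


# Hypothetical example: two intervals split at the model median theta + exp(mu).
mu, sigma2, theta = 11.0, 0.5, 0.0
edges = [theta, theta + math.exp(mu), math.inf]
print(chi_square_criterion([48, 52], edges, mu, sigma2, theta))
```

A smaller value indicates closer agreement between the fitted model and the observed interval frequencies, which is how the methods are compared.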

For the years 2005, 2006 and 2007, the estimate of the parameter θ (the beginning of the distribution, i.e. its theoretical minimum) obtained by the maximum likelihood method is negative. This, however, need not impair the agreement between the model and the real distribution, since the curve initially stays in close contact with the horizontal axis.

A comparison of the accuracy of the three methods of point parameter estimation is also provided by

A relatively new class of moment characteristics of probability distributions has been introduced in the present paper. These are characteristics of the location (level), variability, skewness and kurtosis of probability distributions constructed with the use of L-moments and of TL-moments, which represent a robust extension of L-moments. L-moments themselves were introduced as a more robust alternative to the classical moments of probability distributions. L-moments and their estimates, however, lack some of the robustness properties that are associated with TL-moments.

Year | μ (TL-moments) | σ^{2} (TL-moments) | θ (TL-moments) | μ (L-moments) | σ^{2} (L-moments) | θ (L-moments) | μ (maximum likelihood) | σ^{2} (maximum likelihood) | θ (maximum likelihood) |
---|---|---|---|---|---|---|---|---|---|
2004 | 10.961 | 0.552 | 39,899 | 11.028 | 0.675 | 33,738 | 11.503 | 0.665 | 7.675 |
2005 | 11.006 | 0.521 | 40,956 | 11.040 | 0.677 | 36,606 | 11.542 | 0.446 | −8.826 |
2006 | 11.074 | 0.508 | 44,941 | 11.112 | 0.440 | 40,327 | 11.623 | 0.435 | −42.331 |
2007 | 11.156 | 0.472 | 48,529 | 11.163 | 0.654 | 45,634 | 11.703 | 0.421 | −171.292 |

Year | Criterion χ^{2} (TL-moments) | Criterion χ^{2} (L-moments) | Criterion χ^{2} (maximum likelihood) |
---|---|---|---|
2004 | 494.441 | 866.279 | 524.478 |
2005 | 731.225 | 899.245 | 995.855 |
2006 | 831.667 | 959.902 | 1067.789 |
2007 | 1050.105 | 1220.478 | 1199.035 |

Source: Own research.
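The ranking of the three methods can be read off the tabulated χ^{2} values with a short check. The numbers below are copied from the table; the assignment of the three χ^{2} columns to the methods in the order TL-moments, L-moments, maximum likelihood is an assumption based on the order of the column groups, and the variable names are hypothetical.

```python
# Chi-square criterion per year: (TL-moments, L-moments, maximum likelihood).
chi2 = {
    2004: (494.441, 866.279, 524.478),
    2005: (731.225, 899.245, 995.855),
    2006: (831.667, 959.902, 1067.789),
    2007: (1050.105, 1220.478, 1199.035),
}
methods = ("TL-moments", "L-moments", "maximum likelihood")

# Method with the smallest criterion value in each year.
best = {year: methods[values.index(min(values))] for year, values in chi2.items()}

# Years in which the L-moments method beats maximum likelihood.
l_beats_ml = [year for year, (_, l_mom, ml) in chi2.items() if l_mom < ml]

print(best)
print(l_beats_ml)
```

Under that column assignment, the TL-moments method has the smallest criterion value in every year, and the L-moments method beats maximum likelihood in 2005 and 2006.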

Sample TL-moments are the linear combinations of sample order statistics assigning zero weight to a predetermined number of sample outliers. They are unbiased estimates of the corresponding TL-moments of probability distributions. Some theoretical and practical aspects of TL-moments are still the subject of both current and future research. The efficiency of TL-statistics depends on the choice of α, for example,

The above methods as well as other approaches, e.g. [

This paper was subsidized by the funds of institutional support of a long-term conceptual advancement of science and research number IP400040 at the Faculty of Informatics and Statistics, University of Economics, Prague, Czech Republic.