The Laplace distribution is one of the oldest defined and studied distributions. In the one-parameter model (location parameter only), the sample median is the maximum likelihood estimator and is asymptotically efficient. Approximations for the variance of the sample median for small to moderate sample sizes have been studied, but no exact formula has been published. In this article, we provide an exact formula for the probability density function of the median and an exact formula for the variance of the median.

Laplace is credited as first discovering the Laplace distribution in 1774 [

Let denote the probability density function of a one-parameter Laplace distribution with variance 1. It is well known that from a sample of odd size 2n + 1 where n is any positive integer, the sample median, , is the maximum likelihood estimate of μ. Moreover, although the derivative does not exist at μ, the Cramer-Rao lower bound for the mean squared error of estimation of μ exists and is.

Also, we have which implies the median is an efficient estimator in the asymptotic sense. However, the median is not a sufficient statistic, and there are more efficient estimators for small samples; for example, the average of the middle three observations can be more efficient. Finally, the sample mean is also asymptotically normal, and asymptotically the ratio of the variance of the mean to the variance of the median is 2.

Previously, [

where. For example, if

and the sample size were 999, the upper bound would be approximately 1.513. This suffices for their purpose, which was mainly to show that this ratio was less than 2, which in turn implies the median is a more efficient estimator than the sample mean. However, it seems like an inadequate upper bound since we know the ratio converges to 1. [

This paper makes two contributions. First, we find an explicit formula for the distribution of the median of a random sample from the Laplace distribution. Next, we find an exact formula for the variance of the sample median. These contributions are important because the median is the essential estimate of location for the Laplace distribution in the same way that the mean is the essential estimate of location for the normal distribution.

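Without loss of generality, we will consider the case where μ = 0. The probability that the median is less than x is the probability that at least n + 1 values in the sample are less than x. The distribution of the median is symmetric about 0, so it suffices to consider values of x < 0. Then, writing F for the distribution function of a single observation, we have

$$P(\text{median} < x) = \sum_{k=n+1}^{2n+1} \binom{2n+1}{k} F(x)^{k}\,[1 - F(x)]^{2n+1-k}.$$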

In the Appendix, we show by mathematical induction that the density for x < 0 is

where the constants are

.

Hence, has a Beta distribution truncated on the interval.
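The counting argument for the distribution of the median can be checked numerically. The following is a minimal Python sketch, not the article's own derivation: it assumes the unit-scale Laplace density exp(-|x|)/2 centred at 0 (the scale is an assumption chosen for illustration), uses the standard order-statistic density rather than the closed form proved in the Appendix, and all function names are hypothetical.

```python
import math

def laplace_cdf(x):
    """CDF of the unit-scale Laplace distribution centred at 0 (assumed scale)."""
    return 0.5 * math.exp(x) if x < 0 else 1.0 - 0.5 * math.exp(-x)

def laplace_pdf(x):
    """Density of the same unit-scale Laplace distribution."""
    return 0.5 * math.exp(-abs(x))

def median_cdf(x, n):
    """P(median < x) for sample size 2n + 1: at least n + 1 values fall below x."""
    p = laplace_cdf(x)
    size = 2 * n + 1
    return sum(math.comb(size, k) * p**k * (1.0 - p)**(size - k)
               for k in range(n + 1, size + 1))

def median_pdf(x, n):
    """Standard density of the (n + 1)th order statistic out of 2n + 1."""
    p = laplace_cdf(x)
    c = math.factorial(2 * n + 1) / math.factorial(n) ** 2
    return c * p**n * (1.0 - p)**n * laplace_pdf(x)
```

A central finite difference of `median_cdf` should agree with `median_pdf` at any x, which gives a quick consistency check between the distribution function and the density.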

The characteristic function is

and the moment generating function is

.

In both cases, the formulas involve the incomplete beta function

$$B(z; a, b) = \int_0^z t^{a-1}(1-t)^{b-1}\,dt.$$

The variance can be calculated exactly using the Binomial expansion as

This variance also evaluates to

where is the generalized Hypergeometric function.

For n = 1, 2, 3, 4, and 8, the ratios of the variance of the median to the variance of the mean (rounded to 6 digits) are 0.958333, 0.877951, 0.824767, 0.787808, and 0.708761. These are close in all cases to the approximations listed in the bottom row of Siddiqui. In the case of n = 3 (corresponding to a sample size of N = 7 in his notation), he gives the approximate value of 0.81, but it should be 0.82 rounded to 2 digits. Also, in the case of n = 4, he gives the approximation 0.78, while it should be 0.79.
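The ratios quoted above can be reproduced with exact rational arithmetic. The sketch below is an illustration derived under stated assumptions, and its arrangement of terms may differ from the article's formula. It assumes the unit-scale Laplace density exp(-|x|)/2, expands (1 - exp(x)/2)^n with the binomial theorem inside the order-statistic density, and integrates term by term using the fact that the integral of x^2 exp(ax) over x < 0 equals 2/a^3. The function names are hypothetical. The ratio to the variance of the mean does not depend on the assumed scale.

```python
from fractions import Fraction
from math import comb, factorial

def median_variance(n):
    """Exact variance of the median of 2n + 1 unit-scale Laplace observations (n >= 1)."""
    c = Fraction(factorial(2 * n + 1), factorial(n) ** 2)
    total = sum(
        Fraction((-1) ** k * comb(n, k), 2 ** (n + k - 1) * (n + k + 1) ** 3)
        for k in range(n + 1)
    )
    return c * total

def ratio_to_mean(n):
    """Variance of the median divided by the variance of the mean, 2 / (2n + 1)."""
    return median_variance(n) / Fraction(2, 2 * n + 1)

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, float(ratio_to_mean(n)))
```

Using `Fraction` avoids the cancellation that an alternating sum would suffer in floating point; for n = 1 the ratio is exactly 23/24.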

As a further example, consider the case n = 499, corresponding to a sample size of 999. This would be a rather large dataset by most standards and one might think the asymptotic variance would be accurate. However, the ratio of the actual variance to the asymptotic value is 1.05122, or about 5% larger than the Cramer-Rao lower bound. Although not very close to 1, it is substantially closer than the upper bound of 1.513 provided by the Chu and Hotelling approximation (see Section 1).

For very large values of n, it will still not be possible to calculate the variance exactly using the formula at the beginning of this Section. One way to estimate the ratio of the true variance compared to the asymptotic variance is

where Z is standard normal with distribution function and density function. The numerator can be estimated by Monte Carlo simulation or the integral in the second to the last line can be estimated by numerical integration such as the Riemann sum

where denotes the integrand and N is a large integer.
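As a rough illustration of the Riemann-sum idea, the sketch below applies a midpoint rule directly to x^2 times the density of the median. This is a simpler integrand than the one described in the article and is not a reproduction of it. It assumes the unit-scale Laplace density exp(-|x|)/2, for which the asymptotic variance is 1/(2n + 1). The density is evaluated on the log scale so that large n does not overflow. The function names, the truncation point, and the default number of grid points are all choices made here for illustration.

```python
import math

def log_median_pdf(x, n):
    """Log density of the median of 2n + 1 unit-scale Laplace values, for x <= 0."""
    log_c = math.lgamma(2 * n + 2) - 2.0 * math.lgamma(n + 1)
    log_F = x - math.log(2.0)                     # F(x) = exp(x) / 2 for x <= 0
    log_1mF = math.log1p(-0.5 * math.exp(x))      # log(1 - F(x))
    return log_c + (n + 1) * log_F + n * log_1mF  # f(x) equals F(x) for x <= 0

def variance_ratio(n, num_points=100_000):
    """Midpoint Riemann sum for (2n + 1) * Var(median), the ratio to the asymptotic variance."""
    lower = -40.0 / math.sqrt(2 * n + 1)          # the density is negligible below this point
    h = -lower / num_points
    total = 0.0
    for i in range(num_points):
        x = lower + (i + 0.5) * h
        total += x * x * math.exp(log_median_pdf(x, n))
    return 2.0 * total * h * (2 * n + 1)          # factor 2 uses symmetry about 0
```

For n = 499 this sketch returns a value close to the ratio of 1.05122 quoted in the previous Section.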

It is also possible to sample values having the density by using acceptance-rejection sampling. To do this, one needs a candidate distribution with density

such that is bounded by a fixed upper bound M. In addition, it is best if M is close to 1 since the proportion of candidate values that will be accepted is. The Laplace distribution with variance 2 is a very good candidate distribution to sample from. Note that the ratio

attains its maximum at

and that maximum is

.
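The acceptance-rejection scheme can be sketched as follows. This is a hedged illustration under assumptions that are not taken from the article: the target is taken to be the density of sqrt(2n + 1) times the median of 2n + 1 unit-scale Laplace observations, and the candidate is the unit-scale Laplace distribution (variance 2). The maximiser in `log_bound` is derived here by setting the derivative of the log-ratio to zero under these assumptions, and it may be expressed differently from the article's formula. All function names are hypothetical.

```python
import math
import random

def log_target(t, n):
    """Log density of T = sqrt(2n + 1) * median (unit-scale Laplace, assumed)."""
    s = math.sqrt(2 * n + 1)
    x = -abs(t) / s                               # the density is symmetric about 0
    log_c = math.lgamma(2 * n + 2) - 2.0 * math.lgamma(n + 1)
    return (log_c + (n + 1) * (x - math.log(2.0))
            + n * math.log1p(-0.5 * math.exp(x)) - math.log(s))

def log_candidate(t):
    """Log density of the unit-scale Laplace candidate (variance 2)."""
    return -math.log(2.0) - abs(t)

def log_bound(n):
    """Log of M = max(target / candidate) for n >= 1, from the stationary point."""
    s = math.sqrt(2 * n + 1)
    u = (n + 1 - s) / (2 * n + 1 - s)             # value of F(t / s) at the maximiser
    t_star = s * math.log(2.0 * u)
    return log_target(t_star, n) - log_candidate(t_star)

def sample_scaled_median(n, size, rng=random):
    """Draw `size` values from the target by acceptance-rejection."""
    log_m = log_bound(n)
    draws = []
    while len(draws) < size:
        t = rng.expovariate(1.0) * rng.choice((-1.0, 1.0))   # Laplace candidate draw
        log_u = math.log(1.0 - rng.random())                 # log of a Uniform(0, 1] value
        if log_u <= log_target(t, n) - log_candidate(t) - log_m:
            draws.append(t)
    return draws
```

The expected acceptance proportion is 1/M, so `math.exp(-log_bound(n))` gives a quick estimate of how efficient the candidate is for a given n.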

Start with values of n that were the closest integers to.

For each of those values of n, calculate the ratio given in the previous Section via the Riemann sum described in Section 4 with million. In

.

Hence, at least in this range of n, the ratio is approximately. Note, the largest n here is approximately 500 million, which corresponds to a sample size of about 1 billion.

This leads to a conjecture about the rate of convergence of the variance. In addition, from the formula near the beginning of Section 3, this is also a conjecture about the asymptotic behavior of the Hypergeometric function for large n. Specifically,

using Stirling’s approximation.

We found an exact formula for the density and the variance of the sample median for the one-parameter Laplace distribution. The moment generating function and characteristic function are also derived. We also describe several methods for approximating moments or other features of the distribution, including an algorithm to sample directly from the distribution. This article only deals with the case of odd sample sizes. The median is the (n + 1)th order statistic, and the same approach could be used to find the distribution of any other order statistic. For even sample sizes, the median is the average of the two middle observations, which makes it slightly more complicated to analyze because the joint distribution of two order statistics is needed. However, an approach similar to the one used here may also handle the even sample size case. Furthermore, this approach could be useful in analyzing the variance of other estimators, such as the average of the middle three or middle five observations, and it could be helpful in finding the optimal estimator for small sample sizes. The exact variance may be useful in constructing approximate confidence intervals or hypothesis tests, but caution is still needed when using the normal approximation. Exact tests and confidence intervals can be constructed from the exact distribution. Lastly, in the two-parameter Laplace model, not considered here, a further adjustment may be needed in such procedures due to the estimation of the scale parameter.

First, note that for any positive integers m and i with i < m,

when n = 1,

and

We suppose that the formula for the density is true for some positive integer n.

Consider a random sample of size 2n + 3 and recalculate the distribution function for the median of this random sample by conditioning on the number of values among the first 2n + 1 elements in the sample that are less than x. If the median of the entire sample is less than x, then there are three possible cases for this number: exactly n, exactly n + 1, or at least n + 2. We have

The derivative is

and by the induction hypothesis, this equals

Finally, notice that