Development of New LSD Formula when Numbers of Observations Are Unequal

doi:10.4236/ojs.2018.82016

Open Journal of Statistics
Vol.08 No.02(2018), Article ID:83570,6 pages
Ali A. Al-Fahham

Basic Science Department, Faculty of Nursing, University of Kufa, Najaf, Iraq

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: January 6, 2018; Accepted: April 1, 2018; Published: April 4, 2018

ABSTRACT

When samples with unequal replications [n₁ ≠ n₂ ≠ n₃ ≠ ∙∙∙ ≠ n_k] are compared, the statistician must calculate the LSD (Least Significant Difference) separately for each pair of means, which consumes more time and effort. The purpose of this research is to find a new method for easy and reliable comparison between population means when the experiment or data involve different numbers of replications. To this end, a new formula was designed that reduces the multiple comparisons to one step instead of many. In this work, the researcher used the mean number of replications in the LSD formula instead of repeating the calculation for each pair of means. The results of this study show that the new formula gives similar results for multiple comparisons with minimal effort. It was concluded that the new formula (named Al-fahham’s Formula) achieves the same results with less time and effort. It is recommended that Al-fahham’s Formula be adopted in statistical textbooks and procedures.

Keywords:

ANOVA, Statistics, LSD Formula, Unequal Observations, F Test

1. Introduction

The analysis of variance (ANOVA) is the parametric method for identifying whether significant differences occur in an experiment that includes more than two conditions. It is an example of simultaneous inference, in which multiple comparisons are made and only two-tailed hypotheses are investigated. The null hypothesis states that there are no differences between the populations represented by these conditions [1] [2] .

In general, for any ANOVA with k levels, the null hypothesis is H₀: µ₁ = µ₂ = ∙∙∙ = µ_k, and the alternative hypothesis H₁ is that not all of the means µ₁, µ₂, ∙∙∙, µ_k are equal.

When the F test is significant, it reveals only that at least two of the sample means differ significantly. It does not specify which means differ. Thus, if the F test for a multiple-treatment study is significant, we conclude that there are significant differences among the means of the samples or populations, but we cannot identify where they lie or how large they are [2] .

Therefore, when the F test is significant, we carry out a second procedure, called post hoc comparisons. Post hoc comparisons are similar to t-tests: we compare all possible pairs of means for the studied factor, one pair at a time, to determine which means differ significantly. To analyze the differences between means, the statistician often makes specific comparisons, the most common of which is comparing two means, i.e., “pairwise comparisons” [3] . Seven post hoc comparison methods are used: 1) the least significant difference (LSD); 2) the protected LSD, in which pairwise LSD comparisons are made after a preliminary one-way ANOVA F test; 3) the Bonferroni inequality method; 4) the Tukey Studentized range; 5) the Duncan multiple range; 6) the Newman-Keuls multiple range; and 7) the Scheffé F-projection method [4] .

The first pairwise comparison method was invented by Ronald Aylmer Fisher (1890-1962), who sought to identify which treatments had significantly different effects in the ANOVA test. He based his work on the least squares of the standard deviations of the treatment means with equal replications [5] .

The LSD is a well-known method of simultaneous inference. Its main idea is to compare the populations in pairs. It is then used as a continuation of a two-way analysis of variance, given that the null hypothesis has already been rejected [6] .

The main advantage of the LSD is that it gives the smallest significant difference between two means, which helps to determine which populations differ statistically in their means. If the overall F test is significant, the LSD, a procedure analogous to the ordinary Student’s t test, is used to test any pair of means. If the difference between two population means is greater than the LSD, those means differ significantly at the specified level of confidence. Only one LSD value is required to test all possible comparisons between the population means when replications are equal [4] .

When the treatments or samples have different numbers of observations [ $n_{1} \neq n_{2} \neq n_{3} \neq \dots \neq n_{k}$ ], a separate LSD must be calculated for each comparison involving different numbers of replications, which gives a different LSD value for each pair of means. The statistician must therefore apply the LSD formula many times to complete the pairwise comparisons, as follows [7] :

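$LSD = t_{\alpha, df} \sqrt{\frac{MSE}{n_{1}} + \frac{MSE}{n_{2}}}$ (1)

where MSE is the mean square of error, $t_{\alpha, df}$ is the critical value of t at significance level α with the error degrees of freedom df, and $n_{1}$ and $n_{2}$ are the numbers of observations in the two groups being compared.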
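As a minimal sketch of Equation (1), the function below evaluates the classical LSD for one pair of groups. The function name `lsd_pair` and the numeric inputs are hypothetical and chosen only for illustration; the critical t value is passed in as an argument and is assumed to have been read from a t table.

```python
import math


def lsd_pair(t_crit, mse, n1, n2):
    """Classical LSD for one pair of groups with sizes n1 and n2 (Equation 1)."""
    return t_crit * math.sqrt(mse / n1 + mse / n2)


# Hypothetical inputs, not taken from the paper.
t_crit = 2.0  # assumed critical t for the chosen alpha and error df
mse = 4.0     # assumed mean square of error from an ANOVA table

# Equal sizes: the expression reduces to t * sqrt(2 * MSE / n).
print(lsd_pair(t_crit, mse, 4, 4))

# Unequal sizes: each pair of group sizes gives its own threshold.
print(lsd_pair(t_crit, mse, 4, 16))
```

With k groups of unequal size, this function would have to be called once for each of the k(k − 1)/2 pairs, which is the repetition the paper sets out to avoid.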

2. The Aim of the Study

This study aims to reduce the time and effort required of statisticians and researchers when making multiple comparisons between treatments with different replications or different numbers of observations. This is done by using one LSD value for all comparisons instead of calculating the LSD many times.

3. The Suggested Development

If the differences between the numbers of observations are not very large, the statistician can use the mean number of observations (i.e., the mean of the replications of the treatments) to calculate the LSD value, so that the formula is used only once, as follows:

$H_{0} : μ_{1} = μ_{2} = \dots = μ_{k}$

$H_{1}$ : not all of $μ_{1}, μ_{2}, \dots, μ_{k}$ are equal

${LSD}_{d} = t_{\alpha, df} \sqrt{\frac{2 MSE}{\bar{n}}}$ (2)

LSD_d = LSD for different numbers of replications

$\bar{n} = \frac{n_{1} + n_{2} + \dots + n_{k}}{k}$ (3)

$\bar{n}$ = mean number of observations per sample, where $n_{1} \neq n_{2} \neq n_{3} \neq \dots \neq n_{k}$

$t_{\alpha}$ = critical value of t for a specified level of confidence

df = degrees of freedom

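$MSE \text{ (mean square of error)} = \frac{1}{N - k} \sum_{j = 1}^{k} \sum_{i = 1}^{n_{j}} {(x_{ij} - \bar{x}_{j})}^{2}$ (4)

where:

k = number of groups, $n_{j}$ = number of observations in group j, $N = n_{1} + n_{2} + \dots + n_{k}$ = total number of observations, i = row (observation), j = column (group), $x_{ij}$ = observation i in group j, $\bar{x}_{j}$ = mean of group j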
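The mean square of error can be computed directly from the raw observations. The sketch below assumes the standard one-way ANOVA definition, in which the within-group sum of squared deviations is divided by the error degrees of freedom N − k. The function name `mean_square_error` and the data are hypothetical; the paper's own tables are not reproduced here.

```python
def mean_square_error(groups):
    """Within-group (error) mean square for a one-way layout.

    groups: list of lists, one inner list of observations per group.
    """
    n_total = sum(len(g) for g in groups)  # N, total number of observations
    k = len(groups)                        # number of groups
    sse = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Squared deviations of each observation from its own group mean.
        sse += sum((x - group_mean) ** 2 for x in g)
    return sse / (n_total - k)             # divide by error df = N - k


# Made-up data with unequal group sizes (3, 2, and 4 observations).
groups = [[1.0, 2.0, 3.0], [2.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
mse = mean_square_error(groups)
print(mse)
```

The error degrees of freedom N − k used here are also the df needed to look up the critical t value.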

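The decision rule is then:

If the absolute difference between two sample means satisfies $|\bar{x}_{1} - \bar{x}_{2}| \geq {LSD}_{d}$, the null hypothesis for that pair is rejected and the alternative is accepted; the conclusion is that the means of the two populations are different.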
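The proposed formula and its decision rule can be sketched as follows. The names `lsd_d` and `differs` and all numeric inputs are hypothetical; the critical t value is again assumed to come from a t table for the chosen α and the error degrees of freedom.

```python
import math


def lsd_d(t_crit, mse, sizes):
    """Single LSD threshold based on the mean group size (Equations 2 and 3)."""
    n_bar = sum(sizes) / len(sizes)  # mean number of observations per group
    return t_crit * math.sqrt(2 * mse / n_bar)


def differs(mean_a, mean_b, threshold):
    """Decision rule: the pair differs if |mean_a - mean_b| >= threshold."""
    return abs(mean_a - mean_b) >= threshold


# Hypothetical inputs, not taken from the paper.
threshold = lsd_d(t_crit=2.0, mse=4.0, sizes=[4, 8, 12])
print(threshold)                       # one value serves every pair
print(differs(10.0, 7.0, threshold))   # difference of 3.0
print(differs(10.0, 9.0, threshold))   # difference of 1.0
```

Because the threshold depends only on the mean group size, it is computed once and reused for every pair of means.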

3.1. The Proposed Name

Al-fahham’s Formula

3.2. The Proposed Abbreviation

LSD_d: stands for: Least Significant Difference for different replications

4. Applications of the New LSD_d Formula

The following examples show that the new LSD_d formula noticeably reduces the number of steps in the pairwise comparisons of samples with unequal replications while achieving similar results.

4.1. Example (1)

Table 1 shows the hemoglobin (Hb) values of three different samples. The example shows that the suggested formula completes the comparisons with minimal steps: the classical method performs the comparison in three steps, while the new method does so in one. According to Table 2, there is a highly significant difference between groups.

The Classical Method

1) Male vs Children: test for difference at α = 0.01

$LSD = t_{\alpha, df} \sqrt{\frac{MSE}{n_{1}} + \frac{MSE}{n_{2}}} = 2.291 \sqrt{\frac{1.754}{5} + \frac{1.754}{8}}$

$LSD = 1.72$

*mean difference = 16.5 − 13.4 = 3.1 > LSD

2) Male vs Female

$LSD = t_{\alpha, df} \sqrt{\frac{MSE}{n_{1}} + \frac{MSE}{n_{2}}} = 2.291 \sqrt{\frac{1.754}{5} + \frac{1.754}{6}}$

$LSD = 1.83$

*mean difference = 13.4 − 9.75 = 3.65 > LSD

3) Female vs Children

$LSD = t_{\alpha, df} \sqrt{\frac{MSE}{n_{1}} + \frac{MSE}{n_{2}}} = 2.291 \sqrt{\frac{1.754}{6} + \frac{1.754}{8}}$

$LSD = 1.64$

*mean difference = 16.5 − 9.75 = 6.75 > LSD

In all three comparisons, the mean difference (6.75, 3.65, or 3.1) exceeds the corresponding LSD.

Table 1. Hb values of three different samples (children, female, and male).

Table 2. ANOVA table for Hb values of the three different samples.

The New Method

Al-fahham’s Formula: α = 0.01

$\bar{n} = \frac{5 + 6 + 8}{3} = 6.33$

${LSD}_{d} = t_{\alpha, df} \sqrt{\frac{2 MSE}{\bar{n}}} = 2.291 \sqrt{\frac{2 \times 1.754}{6.33}}$

${LSD}_{d} = 1.7$

*Every mean difference (6.75, 3.65, and 3.1) exceeds LSD_d, so all three comparisons are significant, as in the classical method.
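The two procedures in this example can be placed side by side with a short script. This is a sketch rather than the author's own computation: the group sizes and means are read from the comparisons above (they are not taken from Table 1 directly), and the critical t value and MSE are used exactly as printed in the expressions.

```python
import math
from itertools import combinations

T_CRIT = 2.291  # critical t as printed in the text
MSE = 1.754     # mean square of error as printed in the text

# Group sizes and means as they appear in the three comparisons above.
sizes = {"male": 5, "female": 6, "children": 8}
means = {"male": 13.4, "female": 9.75, "children": 16.5}

# New method: one threshold from the mean group size.
n_bar = sum(sizes.values()) / len(sizes)
lsd_d = T_CRIT * math.sqrt(2 * MSE / n_bar)
print(f"LSD_d = {lsd_d:.2f}")

# Classical method: one threshold per pair.
results = {}
for a, b in combinations(sizes, 2):
    lsd = T_CRIT * math.sqrt(MSE / sizes[a] + MSE / sizes[b])
    diff = abs(means[a] - means[b])
    # Store (difference, classical LSD, classical verdict, LSD_d verdict).
    results[(a, b)] = (diff, lsd, diff > lsd, diff > lsd_d)
    print(f"{a} vs {b}: diff = {diff:.2f}, LSD = {lsd:.2f}, "
          f"classical: {diff > lsd}, LSD_d: {diff > lsd_d}")
```

Since the critical value is an input, the script can be rerun with a value taken from a t table for the chosen α and error degrees of freedom to check whether the two procedures still agree.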

4.2. Example (2)

Table 3 shows the Hb values of three different samples. This example shows that the newly suggested formula achieves results similar to those of other methods (i.e., the Tukey method):

According to Table 4 and the Tukey method, there is a highly significant difference between the three samples, which have different replications.

The New Method

Al-fahham’s Formula: α = 0.01

$\bar{n} = \frac{3 + 5 + 8}{3} = 5.33$

${LSD}_{d} = t_{\alpha, df} \sqrt{\frac{2 MSE}{\bar{n}}} = 2.291 \sqrt{\frac{2 \times 1.193}{5.33}} = 1.53$

Table 3. Hb values of three different samples (children, female and male).

Table 4. ANOVA table for Hb values of the three different samples.

*comparisons:

male vs female, male vs children, female vs children:

[2.15, 4.17, 6.32] > LSD_d (1.53)

According to our suggested method, there is a significant difference between the three samples, which have different replications.
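The same check can be sketched for this example using the inputs printed in the expression above. The text does not state which sample has which number of replications, so the script makes a conservative comparison: it evaluates the classical LSD for every possible pairing of sample sizes and tests whether even the smallest mean difference exceeds the largest of them. The variable names are hypothetical.

```python
import math
from itertools import combinations

T_CRIT = 2.291                    # critical t as printed in the text
MSE = 1.193                       # mean square of error as printed in the text
sizes = [3, 5, 8]                 # replications of the three samples
differences = [2.15, 4.17, 6.32]  # mean differences as printed in the text

# New method: one threshold from the mean number of replications.
n_bar = sum(sizes) / len(sizes)
lsd_d = T_CRIT * math.sqrt(2 * MSE / n_bar)
print(f"LSD_d = {lsd_d:.2f}")

# Classical LSD for each possible pairing of sample sizes.
classical = {
    (n1, n2): T_CRIT * math.sqrt(MSE / n1 + MSE / n2)
    for n1, n2 in combinations(sizes, 2)
}
for pair, lsd in classical.items():
    print(f"classical LSD for sizes {pair}: {lsd:.2f}")

# Verdicts: every difference against LSD_d, and the smallest difference
# against the largest classical threshold.
all_new = all(d > lsd_d for d in differences)
all_classical = min(differences) > max(classical.values())
print(all_new, all_classical)
```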

5. Conclusion

This research confirmed that the new LSD formula, named Al-fahham’s Formula (LSD_d), achieves the same results in pairwise comparisons as the other methods, with less time and effort.

Recommendations

It is recommended that Al-fahham’s Formula (LSD_d) be adopted in statistics textbooks and used in statistical procedures.

Cite this paper

Al-Fahham, A.A. (2018) Development of New LSD Formula when Numbers of Observations Are Unequal. Open Journal of Statistics, 8, 258-263. https://doi.org/10.4236/ojs.2018.82016

References

1. Hothorn, T., Bretz, F. and Westfall, P. (2008) A Problem in Statistical Analysis: Simultaneous Inference. Biometrical Journal, 50, 346-363. https://doi.org/10.1002/bimj.200810425

2. Heiman, G.W. (2011) Basic Statistics for the Behavioral Sciences. 6th Edition, Wadsworth/Cengage, Belmont, 293.

3. Seaman, M.A., Levin, J.R. and Serlin, R.C. (1991) New Developments in Pairwise Multiple Comparisons: Some Powerful and Practicable Procedures. Psychological Bulletin, 110, 577-586. https://doi.org/10.1037/0033-2909.110.3.577

4. Dodge, Y. and Thomas, D.R. (1980) On the Performance of Non-Parametric and Normal Theory Multiple Comparison Procedures. Sankhya, B42, 11-27.

5. Yates, F. and Mather, K. (1963) Ronald Aylmer Fisher: 1890-1962. Biographical Memoirs of Fellows of the Royal Society, 9, 91-129. https://doi.org/10.1098/rsbm.1963.0006

6. Miller Jr., R.G. (1981) Simultaneous Statistical Inference. 2nd Edition, Springer, Berlin, Heidelberg, New York. https://doi.org/10.1007/978-1-4613-8122-8

7. Dodge, Y. (2008) Least Significant Difference Test. In: Dodge, Y., Ed., The Concise Encyclopedia of Statistics, Springer, New York, 302-304.
