Open Journal of Statistics
Vol.4 No.5(2014), Article ID:48772,10 pages DOI:10.4236/ojs.2014.45036

Mixture Regression-Cum-Ratio Estimator Using Multi-Auxiliary Variables and Attributes in Single-Phase Sampling

Teresio Mutembei, John Kung’u, Christopher Ouma

Department of Statistics and Actuarial Science, Kenyatta University, Nairobi, Kenya

Email: mutembeiteresio@gmail.com, johnkungu08@yahoo.com

Received 25 May 2014; revised 30 June 2014; accepted 13 July 2014

ABSTRACT

In this paper, we propose a class of mixture regression-cum-ratio estimators for estimating the population mean by using information on multiple auxiliary variables and attributes simultaneously in single-phase sampling, and we analyze the properties of the estimator. An empirical study was carried out to compare the performance of the proposed estimator with existing estimators of the finite population mean using a simulated population. The mixture regression-cum-ratio estimator was found to be more efficient than the ratio and regression estimators using one auxiliary variable and attribute, the ratio and regression estimators using multiple auxiliary variables and attributes, and the regression-cum-ratio estimators using multiple auxiliary variables and attributes in single-phase sampling for a finite population.

Keywords: Regression-Cum-Ratio Estimator, Multiple Auxiliary Variables and Attributes, Single-Phase Sampling

1. Introduction

The work of Neyman [1] may be regarded as the earliest in which auxiliary information was used. Watson [2] used the regression of leaf area on leaf weight to estimate the average area of the leaves on a plant. Cochran [3] used auxiliary information in single-phase sampling to develop the ratio estimator for estimating the population mean. The ratio estimator is appropriate when the study variable and the auxiliary variable have a high positive correlation and the regression line passes through the origin. Hansen and Hurwitz [4] also suggested the use of auxiliary information in selecting the sample with varying probabilities.

Olkin [5] was the first to use information on more than one supplementary character positively correlated with the variable under study, using a linear combination of the ratio estimators based on each auxiliary variable. Shukla [6] showed that the regression estimator using multiple auxiliary variables was more efficient than the regression estimator using a single auxiliary variable. Raj [7] suggested a method of using multi-auxiliary information in sample surveys. Singh [8] proposed a ratio-cum-product estimator and its multi-variable expression, which were more efficient than the ratio, product and mean per unit estimators.

Jhajj, Sharma and Grover [9] proposed a family of estimators using information on an auxiliary attribute. They used known information on the population proportion possessing an attribute that is highly correlated with the study variable Y. An attribute is normally used when an auxiliary variable is not available, e.g. the amount of milk produced and a particular breed of cow, or the yield of wheat and a particular variety of wheat. Rajesh, Pankaj, Nirmala and Florentins [10] used information on an auxiliary attribute in the ratio estimator for estimating the population mean of the variable of interest, using known quantities such as the coefficient of variation, the coefficient of kurtosis and the point bi-serial correlation coefficient. The estimator performed better than the usual sample mean and the Naik and Gupta [11] estimator. Rajesh, Pankaj, Nirmala and Florentins [10] also used the auxiliary attribute in regression, product and ratio type exponential estimators, following the work of Bahl and Tuteja [12].

Hanif, Haq and Shahbaz [13] [14] proposed a general family of estimators using multiple auxiliary attributes in single and double phase sampling. The estimator had a smaller MSE than that of Jhajj, Sharma and Grover [9]. They also extended their work to a ratio estimator, which was a generalization of the Naik and Gupta [11] estimator, in single and double phase sampling with full information, partial information and no information.

The concept of double sampling was first proposed by Neyman [1] for sampling human populations when the mean of the auxiliary variable was unknown. It was later extended to multiphase sampling by Robson [15]. In most surveys auxiliary information is available, and every form of auxiliary information should be used in developing sampling strategies. Samiuddin and Hanif [16] introduced the following classification of the use of auxiliary variables.

1) Full information case: information for all auxiliary variables is available.

2) No information case: information is not available for any auxiliary variable.

3) Partial information case: information for some auxiliary variables is available for all population units.

Ahmad [17] generalized multivariate ratio and regression estimators for multi-phase sampling. Zahoor, Muhhamad and Munir [18] suggested a generalized regression-cum-ratio estimator for two-phase sampling using multiple auxiliary variables in the full, partial and no information cases. Kung’u and Odongo [19] [20] proposed ratio-cum-product estimators using multiple auxiliary attributes in single-phase sampling, and in two-phase sampling in the full, partial and no information cases. Moeen, Shahbaz and Hanif [21] proposed a class of mixture ratio and regression estimators for single-phase sampling for estimating the population mean by using information on auxiliary variables and attributes simultaneously.

In this paper, we incorporate both multiple auxiliary variables and attributes in the regression-cum-ratio estimator to form a mixture regression-cum-ratio estimator in single-phase sampling, and we use the approach of Arora and Bansi [22] in writing down the mean squared error.

2. Preliminaries

2.1. Notation and Assumption

The following notation will be used in this paper. Consider a population of $N$ units. Let $Y$ be the study variable for which we want to estimate the population mean $\bar{Y}$, let $X_i$ be the auxiliary variables and let $\tau_j$ be the auxiliary attributes. For the single-phase sampling design, let $n$ be the sample size, let $\bar{x}_i$ and $p_j$ denote the sample means of the auxiliary variables and the sample proportions of the auxiliary attributes, and let $\bar{y}$ denote the sample mean of the variable of interest. Let

and and (1.0)

where, and are sampling error and are very small. We assume that

(1.1)

In defining the attributes we assume complete dichotomy so that;

(1.2)

Let $A_j$ and $a_j$ be the total number of units in the population and sample respectively possessing attribute $\tau_j$. Let $P_j = A_j/N$ and $p_j = a_j/n$ be the corresponding proportions of units possessing the specific attribute, and let $\bar{y}$ be the sample mean of the main variable.

The coefficient of variations are given by while is the correlation coefficient between study variable and auxiliary variables and is the bi-serial correlation coefficient between study variable and auxiliary variables. Then for simple random sampling without replacement for both first and second phases we write by using phase wise operation of expectations as:

(1.3)

(1.4)

[22] (1.5)

The following notations will be used in deriving the mean square errors of proposed estimators

Determinant of population correlation matrix of variables

Determinant of minor of corresponding to the element of

Denotes the multiple coefficient of determination of on.

Denotes the multiple coefficient of determination of on.

Determinant of population correlation matrix of variables.

Determinant of population correlation matrix of variables

Determinant of the correlation matrix of.

Determinant of the correlation matrix of.

Determinant of the minor corresponding to of the correlation matrix of and.

Determinant of the minor corresponding to of the correlation matrix of and.
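The determinant notation above connects to the multiple coefficient of determination through a standard identity: for the correlation matrix $R$ of the study variable together with the auxiliary variables, and the sub-matrix $R_{xx}$ of the auxiliary variables alone, $1 - |R|/|R_{xx}|$ equals the squared multiple correlation. The following is a minimal sketch of that identity only; the function name, the simulated data and the coefficients are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def multiple_r2_from_corr(corr):
    """Squared multiple correlation of variable 0 on variables 1..k,
    computed as 1 - |R| / |R_xx| from the full correlation matrix R."""
    corr = np.asarray(corr, dtype=float)
    det_full = np.linalg.det(corr)         # |R|: study + auxiliary variables
    det_aux = np.linalg.det(corr[1:, 1:])  # |R_xx|: auxiliary variables only
    return 1.0 - det_full / det_aux

# Illustrative data (assumed): three auxiliary variables, one study variable.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3))
y = 2.0 + x @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=500)

corr = np.corrcoef(np.column_stack([y, x]), rowvar=False)
r2_det = multiple_r2_from_corr(corr)

# Cross-check against the R^2 of an ordinary least squares fit with intercept.
design = np.column_stack([np.ones(len(y)), x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ coef
r2_ols = 1.0 - resid.var() / y.var()
```

The two quantities `r2_det` and `r2_ols` agree up to floating-point error, which is why mean squared errors of regression-type estimators can be written compactly with determinants of correlation matrices.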

2.2. Mean per Unit in Single-Phase Sampling

The sample mean using simple random sampling without replacement is given by,

(1.6)

while its variance is given by,

(1.7)
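As a hedged illustration of the mean per unit estimator, the sketch below draws a simple random sample without replacement and evaluates the textbook variance $(1/n - 1/N) S_y^2$. The function name and return layout are assumptions made for this example; the paper's own displays (1.6) and (1.7) may use different notation.

```python
import numpy as np

def srswor_mean_and_variance(population, n, rng):
    """Sample mean under SRSWOR with its estimated and exact design variance."""
    population = np.asarray(population, dtype=float)
    N = population.size
    sample = rng.choice(population, size=n, replace=False)
    y_bar = sample.mean()
    theta = 1.0 / n - 1.0 / N                  # finite population factor
    var_hat = theta * sample.var(ddof=1)       # estimate from the sample
    var_true = theta * population.var(ddof=1)  # exact variance of y_bar
    return y_bar, var_hat, var_true

# Usage with a small assumed population.
rng = np.random.default_rng(42)
y_bar, var_hat, var_true = srswor_mean_and_variance(np.arange(1, 11), 4, rng)
```

When the sample is the whole population (`n == N`), the factor `theta` is zero and both variances vanish, as expected for sampling without replacement.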

2.3. Ratio and Regression Estimator Using Auxiliary Variable

Let $\bar{y}$ and $\bar{x}$ be unbiased estimators of the population means $\bar{Y}$ and $\bar{X}$ respectively.

Then the classical ratio estimator by Cochran [3] and regression estimator by Watson [2] are defined respectively by,

(1.8)

(1.9)

where, the population mean of the auxiliary variable is known where and are optimum values of ratio and regression estimator respectively.

The minimum MSE of and up to the first order of approximation are,

(1.9)

(1.10)
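For orientation, the classical single-variable ratio and regression estimators and their textbook first-order mean squared errors can be sketched as follows. This is an assumption-laden illustration: the paper's own forms may carry optimum exponents or coefficients that are not reproduced in this text, and all function and argument names here are hypothetical.

```python
def ratio_estimate(y_bar, x_bar, X_bar):
    """Classical ratio estimate of the population mean of y."""
    return y_bar * X_bar / x_bar

def regression_estimate(y_bar, x_bar, X_bar, b):
    """Classical regression estimate with slope b."""
    return y_bar + b * (X_bar - x_bar)

def first_order_mse(Y_bar, C_y, C_x, rho, n, N):
    """Textbook first-order MSEs under SRSWOR.

    C_y, C_x are coefficients of variation and rho is the correlation
    between the study and auxiliary variables.
    """
    theta = 1.0 / n - 1.0 / N
    mse_ratio = theta * Y_bar**2 * (C_y**2 + C_x**2 - 2.0 * rho * C_y * C_x)
    mse_regression = theta * Y_bar**2 * C_y**2 * (1.0 - rho**2)
    return mse_ratio, mse_regression

# Usage with assumed summary values.
mse_ratio, mse_regression = first_order_mse(
    Y_bar=10.0, C_y=0.2, C_x=0.2, rho=0.5, n=10, N=100
)
```

The difference between the two expressions is $\theta \bar{Y}^2 (C_x - \rho C_y)^2 \ge 0$, so to first order the regression estimator is never less efficient than the classical ratio estimator.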

2.4. Ratio and Regression Estimator Using Multiple Auxiliary Variables

In case of multiple auxiliary variables, the ratio and regression estimators Ahmad [17] are given by,

(1.11)

(1.12)

where are the optimum values. The minimum mean squared error of and up to the first order of approximation are,

(1.13)

(1.14)
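A minimal sketch of a regression estimator with several auxiliary variables is given below, assuming the familiar form $\bar{y} + \sum_i \beta_i (\bar{X}_i - \bar{x}_i)$ with the coefficients estimated by least squares from the sample. The function name and the use of sample least squares coefficients are assumptions for illustration and are not claimed to match (1.12) exactly.

```python
import numpy as np

def multi_regression_estimate(y, X, X_bar_pop):
    """Regression estimate of the mean of y using several auxiliary variables.

    y         : sample values of the study variable, shape (n,)
    X         : sample values of the auxiliary variables, shape (n, k)
    X_bar_pop : known population means of the auxiliary variables, shape (k,)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    betas = coef[1:]  # drop the intercept
    return y.mean() + betas @ (np.asarray(X_bar_pop, dtype=float) - X.mean(axis=0))

# Usage: when y is an exact linear function of the auxiliary variables,
# the estimate reproduces the population mean from any sample.
rng = np.random.default_rng(3)
X_pop = rng.normal(10.0, 2.0, size=(200, 2))
y_pop = 3.0 + X_pop @ np.array([2.0, -1.0])
idx = rng.choice(200, size=20, replace=False)
estimate = multi_regression_estimate(y_pop[idx], X_pop[idx], X_pop.mean(axis=0))
```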

2.5. Regression-Cum-Ratio Estimator Using Multiple Auxiliary Variables

The regression-cum-ratio estimator of Zahoor, Muhhamad and Munir [18] using multiple auxiliary variables is given by,

(1.15)

The and are the optimum values. The minimum MSE of up to the first order of approximation are,

(1.16)

2.6. Ratio and Regression Estimator Using Auxiliary Attribute

To estimate the population mean of the study variable y, assuming knowledge of the population proportion P, Naik and Gupta [11] defined ratio and regression estimators of the population mean for the case where prior information on the population proportion of units possessing the attribute is available. Following (1.8) and (1.9), Naik and Gupta [11] proposed the following estimators:

(1.17)

(1.18)

The minimum MSE of and up to the first order of approximation are

(1.19)

(1.20)

where and are optimum values of ratio and regression estimator respectively.
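The attribute-based estimators can be illustrated by replacing the auxiliary mean with a proportion. The sketch below assumes the simplest forms, $\bar{y} P / p$ for the ratio type and $\bar{y} + b (P - p)$ for the regression type, with the attribute coded 0/1 under complete dichotomy. The function names are hypothetical, and the exact forms of (1.17) and (1.18) are not reproduced in this text.

```python
import numpy as np

def attribute_ratio_estimate(y, tau, P):
    """Ratio-type estimate using a known population proportion P."""
    y, tau = np.asarray(y, float), np.asarray(tau, float)
    return y.mean() * P / tau.mean()

def attribute_regression_estimate(y, tau, P):
    """Regression-type estimate; the slope is estimated from the sample."""
    y, tau = np.asarray(y, float), np.asarray(tau, float)
    b = np.cov(y, tau, ddof=1)[0, 1] / tau.var(ddof=1)
    return y.mean() + b * (P - tau.mean())

def point_biserial(y, tau):
    """Point bi-serial correlation between y and a 0/1 attribute."""
    return np.corrcoef(np.asarray(y, float), np.asarray(tau, float))[0, 1]

# Usage with a tiny assumed sample and an assumed known proportion P = 0.25.
y = [2.0, 4.0, 6.0, 8.0]
tau = [0, 1, 0, 1]
t_ratio = attribute_ratio_estimate(y, tau, 0.25)
t_reg = attribute_regression_estimate(y, tau, 0.25)
r_pb = point_biserial(y, tau)
```

The point bi-serial correlation is simply the Pearson correlation computed with a dichotomous variable, so the usual routines apply without modification.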

2.7. Ratio and Regression Estimator Using Multiple Auxiliary Attributes

The ratio and regression estimators by Hanif, Haq and Shahbaz [14] for single-phase sampling using information on multiple auxiliary attributes are given by,

(1.21)

(1.22)

The MSE of the and up to the first order of approximation are,

(1.23)

(1.24)

2.8. Regression-Cum-Ratio Estimator Using Multiple Auxiliary Attributes

The regression-cum-ratio estimator using multiple auxiliary attributes is given by,

(1.25)

The and are the optimum values to the first order of approximation. The minimum MSE of up to the first order of approximation are,

(1.26)

2.9. Mixture Ratio and Regression Using Multiple Auxiliary Variables and Attributes

The mixture ratio and regression estimators based on multiple auxiliary variables and attributes by Moeen, Shahbaz and Hanif [21] are given by:

(1.27)

(1.28)

The minimum MSE of and up to the first order of approximation are

(1.29)

(1.30)

In general these estimators have a bias of order $1/n$. Since the standard error of the estimates is of order $1/\sqrt{n}$, the quantity bias/s.e. is of order $1/\sqrt{n}$ and becomes negligible as $n$ becomes large. In practice, this quantity is usually unimportant in samples of moderate and large sizes.

In this paper, we combine the mixture ratio and mixture regression estimators to form a mixture regression-cum-ratio estimator under single-phase sampling and study the properties of the proposed estimator.

3. Methodology

3.1. Mixture Regression-Cum-Ratio Estimator Using Multi-Auxiliary Variables and Attributes in Single-Phase Sampling

When information on all auxiliary variables and attributes is available for the population, it is utilized in the form of their population means and proportions. Taking advantage of the regression-cum-ratio technique for single-phase sampling, a generalized estimator of the population mean of the study variable Y using multiple auxiliary variables and attributes is suggested as:

(2.0)

Using (1.0) in (2.0), we get,

(2.1)

Ignoring second and higher order terms in each expansion of the product and simplifying, we write,

(2.2)

The mean squared error of is given by,

(2.3)

We differentiate Equation (2.3) partially with respect to , and and equate to zero. The optimum values are given by,

(2.4)

(2.5)

(2.6)

(2.7)

Using the normal equations that are used to find the optimum values given in (3.8), we can write,

(2.8)

Or

(2.9)

Taking expectation of (3.49), we get,

(2.10)

Substituting the optimum (2.4) to (2.7) in (2.10) and after simplification we get,

(2.11)

Or

(2.12)

Or

(2.13)

(2.14)

Using (1.8) in (2.14), we get,

(2.15)

Using (1.5) in (2.15), we get,

(2.16)
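Equation (2.0) is not reproduced in this text, so the sketch below rests on an assumed form modelled on the regression-cum-ratio structure of Sections 2.5 and 2.8: a regression-type adjustment of the sample mean multiplied by ratio-type factors, with attribute proportions treated like means of 0/1 indicators. The function name, the argument layout and the coefficients passed in are hypothetical; in particular, the optimum values (2.4) to (2.7) are not computed here and must be supplied by the user.

```python
import numpy as np

def mixture_regression_cum_ratio(y_bar, reg_sample, reg_pop, reg_coef,
                                 ratio_sample, ratio_pop, ratio_exp):
    """Assumed form: [y_bar + sum b_i (M_i - m_i)] * prod (M_k / m_k) ** a_k.

    reg_*   : sample values, population values and coefficients of the
              variables/attributes entering the regression part
    ratio_* : sample values, population values and exponents of the
              variables/attributes entering the ratio part
    """
    reg_part = y_bar + np.dot(reg_coef, np.subtract(reg_pop, reg_sample))
    ratio_part = np.prod(np.power(np.divide(ratio_pop, ratio_sample), ratio_exp))
    return float(reg_part * ratio_part)

# Usage with assumed numbers: one variable and one attribute in each part.
t = mixture_regression_cum_ratio(
    y_bar=10.0,
    reg_sample=[4.0, 0.5], reg_pop=[5.0, 0.4], reg_coef=[2.0, 10.0],
    ratio_sample=[2.0, 0.5], ratio_pop=[4.0, 0.25], ratio_exp=[1.0, 1.0],
)
```

Under this assumed form, the estimate collapses to the sample mean whenever every sample mean and proportion equals its population counterpart, which is the behaviour one would expect of any estimator in this family.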

3.2. Bias and Consistency of Mixture Regression-Cum-Ratio Estimator

The mixture regression-cum-ratio estimator using multiple auxiliary variables and attributes in single-phase sampling is biased. However, the bias is negligible for moderate and large samples.

The mixture regression-cum-ratio estimator using multiple auxiliary variables and attributes is also consistent: it is a combination of consistent estimators, and it follows that it is itself consistent.

4. Simulation, Result and Discussion

In this section, we carry out a data analysis to compare the performance of the mixture regression-cum-ratio estimator using multiple auxiliary variables and attributes with existing estimators, namely the mean per unit, the ratio and regression estimators using one auxiliary variable and attribute, the ratio and regression estimators using two auxiliary variables and attributes, and the regression-cum-ratio estimators using four auxiliary variables and attributes, in single-phase sampling for a finite population. In the simulated population, the study variable is normally distributed, while the auxiliary variables and attributes are also normally distributed and strongly positively correlated with the study variable.

Study variable

For the ratio estimator, the auxiliary variable and attributes are positively correlated with the study variable and the regression line passes through the origin.

For the regression estimator, the auxiliary variable and attributes are positively correlated with the study variable and the regression line does not pass through the neighborhood of the origin.

All results were obtained by drawing several random samples and taking the average.
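The averaging over repeated samples can be sketched with a small Monte Carlo experiment. The population below (size, means, slope, noise level), the sample size, the number of replications and the seed are all assumptions chosen for illustration, and only the mean per unit and the classical single-variable ratio and regression estimators are compared; the paper's actual simulated population is not reproduced in this text.

```python
import numpy as np

def simulate(N=2000, n=50, reps=2000, seed=1):
    """Empirical MSE of three estimators of the population mean under SRSWOR."""
    rng = np.random.default_rng(seed)
    # Assumed population: y roughly proportional to x (line through the origin).
    x = rng.normal(50.0, 10.0, size=N)
    y = 2.0 * x + rng.normal(0.0, 5.0, size=N)
    Y_bar, X_bar = y.mean(), x.mean()

    estimates = np.empty((reps, 3))
    for r in range(reps):
        idx = rng.choice(N, size=n, replace=False)
        ys, xs = y[idx], x[idx]
        b = np.cov(ys, xs, ddof=1)[0, 1] / xs.var(ddof=1)
        estimates[r] = (
            ys.mean(),                            # mean per unit
            ys.mean() * X_bar / xs.mean(),        # ratio estimator
            ys.mean() + b * (X_bar - xs.mean()),  # regression estimator
        )
    mse = ((estimates - Y_bar) ** 2).mean(axis=0)
    return dict(zip(("mean_per_unit", "ratio", "regression"), mse))

results = simulate()
```

With a strongly correlated auxiliary variable, both auxiliary-based estimators show a much smaller empirical mean squared error than the mean per unit, which is the pattern that the percent relative efficiency comparison summarizes.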

To evaluate the efficiency gain achieved by the proposed estimator, we calculated the variance of the mean per unit and the mean squared error of every estimator considered. We then calculated the percent relative efficiency of each estimator relative to the variance of the mean per unit and compared these values; the estimator with the highest percent relative efficiency is considered the most efficient. The percent relative efficiency is calculated using the following formula.

(3.0)
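A minimal sketch of the percent relative efficiency computation follows, assuming the conventional definition in which the variance of the mean per unit is divided by the mean squared error of the estimator and multiplied by 100. The function name and the example values are hypothetical.

```python
def percent_relative_efficiency(var_mean_per_unit, mse_estimator):
    """PRE of an estimator relative to the mean per unit, in percent."""
    if mse_estimator <= 0:
        raise ValueError("mse_estimator must be positive")
    return 100.0 * var_mean_per_unit / mse_estimator

# Usage with assumed values: an estimator whose MSE is a quarter of the
# variance of the mean per unit has a PRE of 400.
pre = percent_relative_efficiency(8.0, 2.0)
```

By construction the mean per unit itself has a percent relative efficiency of 100, so values above 100 indicate a gain from using the auxiliary information.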

Table 1 shows the percent relative efficiency, with respect to the mean per unit estimator for single-phase sampling, of the mean per unit, the ratio and regression estimators using one auxiliary variable and attribute, the ratio and regression estimators using two auxiliary variables and attributes, the regression-cum-ratio estimators using four auxiliary variables and attributes, and the mixture regression-cum-ratio estimator using multiple auxiliary variables and attributes. The proposed mixture regression-cum-ratio estimator using multiple auxiliary variables and attributes is the most efficient of the twelve estimators, since it has the highest percent relative efficiency.

5. Conclusion

According to Table 1, the proposed mixture regression-cum-ratio estimator using multiple auxiliary variables and attributes has the highest percent relative efficiency compared to the mean per unit, the ratio and regression estimators using one auxiliary variable and attribute, the ratio and regression estimators using two auxiliary variables and attributes, and the regression-cum-ratio estimators using four auxiliary variables and attributes in single-phase sampling for a finite population. This means that the mixture regression-cum-ratio estimator is the most efficient among the estimators that utilize auxiliary variables and attributes. The proposed estimator is therefore recommended for estimating the finite population mean in single-phase sampling, as it outperforms all the other estimators considered.

Table 1. Relative efficiency of existing and proposed estimators with respect to mean per unit estimator for single-phase sampling.

References

1. Neyman, J. (1938) Contribution to the Theory of Sampling Human Populations. Journal of the American Statistical Association, 33, 101-116. http://dx.doi.org/10.1080/01621459.1938.10503378
2. Watson, D.J. (1937) The Estimation of Leaf Areas. Journal of the Agricultural Science, 27, 474-504. http://dx.doi.org/10.1017/S002185960005173X
3. Cochran, W.G. (1940) The Estimation of the Yields of the Cereal Experiments by Sampling for the Ratio of Grain to Total Produce. Journal of the Agricultural Science, 30, 262-275. http://dx.doi.org/10.1017/S0021859600048012
4. Hansen, M.H. and Hurwitz, W.N. (1943) On the Theory of Sampling from Finite Populations. Annals of Mathematical Statistics, 14, 333-362. http://dx.doi.org/10.1214/aoms/1177731356
5. Olkin, I. (1958) Multivariate Ratio Estimation for Finite Population. Biometrika, 45, 154-165. http://dx.doi.org/10.1093/biomet/45.1-2.154
6. Shukla, G.K. (1965) Multivariate Regression Estimate. Journal of the Indian Statistical Association, 3, 202-211.
7. Raj, D. (1965) On a Method of Using Multi-Auxiliary Information in Sample Surveys. Journal of the American Statistical Association, 60, 154-165. http://dx.doi.org/10.1080/01621459.1965.10480789
8. Singh, M.P. (1967) Ratio-Cum-Product Method of Estimation. Metrika, 12, 34-42. http://dx.doi.org/10.1007/BF02613481
9. Jhajj, H.S., Sharma, M.K. and Grover, L.K. (2006) A Family of Estimator of Population Mean Using Information on Auxiliary Attributes. Pakistan Journal of Statistics, 22, 43-50.
10. Rajesh, S., Pankaj, C., Nirmala, S. and Florentins, S. (2007) Ratio-Product Type Exponential Estimator for Estimating Finite Population Mean Using Information on Auxiliary Attributes. Renaissance High Press, USA.
11. Naik, V.D. and Gupta, P.C. (1996) A Note on Estimation of Mean with Known Population of Auxiliary Character. Journal of the Indian Society of Agricultural Statistics, 48, 151-158.
12. Bahl, S. and Tuteja, R.K. (1991) Ratio and Product Type Estimator. Information and Optimization Science, 12, 159-163. http://dx.doi.org/10.1080/02522667.1991.10699058
13. Hanif, M., Haq, I.U. and Shahbaz, M.Q. (2009) On a New Family of Estimator Using Multiple Auxiliary Attributes. World Applied Sciences Journal, 11, 1419-1422.
14. Hanif, M., Haq, I. and Shahbaz, M.Q. (2010) Ratio Estimators Using Multiple Auxiliary Attributes. World Applied Sciences Journal, 8, 133-136.
15. Robson, D.S. (1952) Multiple Sampling of Attributes. Journal of the American Statistical Association, 47, 203-215. http://dx.doi.org/10.1080/01621459.1952.10501164
16. Samiuddin, M. and Hanif, M. (2007) Estimation of Population Mean in Single and Two-Phase Sampling with or without Additional Information. Pakistan Journal of Statistics, 23, 99-118.
17. Ahmad, Z. (2008) Generalized Multivariate Ratio and Regression Estimators for Multi-Phase Sampling. Ph.D. Thesis, National College of Business Administration and Economics, Lahore.
18. Zahoor, A., Muhhamad, H. and Munir, A. (2009) Generalized Regression-Cum-Ratio Estimators for Two-Phase Sampling Using Multiple Auxiliary Variables. Pakistan Journal of Statistics, 25, 93.
19. Kung’u, J. and Odongo, L. (2014) Ratio-Cum-Product Estimator Using Multiple Auxiliary Attributes in Single Phase Sampling. Open Journal of Statistics, 4, 239-245. http://dx.doi.org/10.4236/ojs.2014.44023
20. Kung’u, J. and Odongo, L. (2014) Ratio-Cum-Product Estimator Using Multiple Auxiliary Attributes in Two-Phase Sampling. Open Journal of Statistics, 4, 246-257. http://dx.doi.org/10.4236/ojs.2014.44024
21. Moeen, M., Shahbaz, Q. and Hanif, M. (2012) Mixture Ratio and Regression Estimators Using Multi-Auxiliary Variable and Attributes in Single Phase Sampling. World Applied Sciences Journal, 18, 1518-1526.
22. Arora, S. and Bansi, L. (1989) New Mathematical Statistics. Satya Prakashan, New Delhi.