This paper extends the estimator of Hanif, Hamad and Shahbaz [1] for two-phase sampling. The aim is to develop a regression-type estimator with two auxiliary variables for two-phase sampling when no information about the auxiliary variables is available at the population level. To avoid multicollinearity, it is assumed that the two auxiliary variables are only minimally correlated with each other. The mean square error and bias of the proposed estimator in two-phase sampling are derived. The mean square error of the proposed estimator shows an improvement over other well-known estimators under the same conditions.

It is well known that the precision of estimators of the mean of a study variable “y” is increased by the proper use of highly correlated auxiliary variables. When auxiliary information is available at the population level and the cost per unit of collecting the study variable “y” is affordable, single-phase sampling is more appropriate. When prior information on the auxiliary variables is lacking, however, it is neither practical nor economical to conduct a census for this purpose. The appropriate technique for estimating those auxiliary variables from samples is two-phase sampling. In such cases we take a large preliminary sample, from which the auxiliary variables are estimated. The main sample is then sub-sampled from that large sample.
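The sampling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: simple random sampling without replacement at both phases, and a hypothetical helper name `two_phase_sample` with arbitrary sizes (`N=1000`, `n1=200`, `n2=50`) that do not come from the paper.

```python
import numpy as np

def two_phase_sample(N, n1, n2, rng):
    """Draw unit indices for a two-phase sample (hypothetical helper).

    Phase 1: simple random sample without replacement of n1 units from N.
    Phase 2: simple random sub-sample without replacement of n2 units
             from the phase-1 units.
    """
    if not (0 < n2 <= n1 <= N):
        raise ValueError("need 0 < n2 <= n1 <= N")
    first = rng.choice(N, size=n1, replace=False)
    second = rng.choice(first, size=n2, replace=False)
    return first, second

# Illustrative sizes only; they are not taken from the paper.
rng = np.random.default_rng(0)
first, second = two_phase_sample(N=1000, n1=200, n2=50, rng=rng)
```

In practice the auxiliary variables would be recorded for the units in `first`, and the study variable only for the cheaper-to-afford subset `second`.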

Two-phase sampling is a powerful technique which was first introduced by Neyman [

In two-phase sampling, regression and ratio estimation techniques are used to estimate the finite population mean. The ratio estimator incorporates prior information closely related to the study variable, and the regression technique is used when the relation between the study variable and the auxiliary variable(s) is linear. The regression estimator is generally considered more useful than the ratio estimator, except when the regression line passes through the origin; in that case the two estimators are almost equally efficient and the analyst has to choose between them by judgment.
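To make the contrast between the two techniques concrete, the following sketch implements the textbook single-auxiliary ratio and regression estimators for two-phase sampling. The function names and the toy data are assumptions made for illustration; they are not the estimators proposed in this paper. Here `x1` holds the auxiliary values from the first-phase sample, while `x2` and `y2` hold the values observed on the second-phase sample.

```python
import numpy as np

def ratio_two_phase(y2, x2, x1):
    """Textbook two-phase ratio estimator: ybar2 * xbar1 / xbar2."""
    return np.mean(y2) * np.mean(x1) / np.mean(x2)

def regression_two_phase(y2, x2, x1):
    """Textbook two-phase regression estimator: ybar2 + b * (xbar1 - xbar2),
    with b the least-squares slope from the second-phase sample."""
    b = np.cov(x2, y2, ddof=1)[0, 1] / np.var(x2, ddof=1)
    return np.mean(y2) + b * (np.mean(x1) - np.mean(x2))

# Toy data, invented for illustration.
x1 = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])  # first-phase auxiliary values
x2 = x1[:3]                                       # second-phase auxiliary values

y_through_origin = 2.0 * x2        # line through the origin
y_with_intercept = 3.0 + 2.0 * x2  # line with a non-zero intercept

est_ratio_origin = ratio_two_phase(y_through_origin, x2, x1)
est_reg_origin = regression_two_phase(y_through_origin, x2, x1)
est_ratio_intercept = ratio_two_phase(y_with_intercept, x2, x1)
est_reg_intercept = regression_two_phase(y_with_intercept, x2, x1)
```

With the line through the origin, the two estimators coincide. With a non-zero intercept, only the regression estimator reproduces the value of the line at the first-phase mean of the auxiliary variable.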

Let the population consist of N units, and let the values of the study variable y and of the two auxiliary variables be recorded for the i-th unit. Here y is our variable of interest, and the two auxiliary variables are referred to as the main and the second auxiliary variable. Both auxiliary variables are highly correlated with the variable of interest. Let the first-phase sample of size n_{1} be drawn from the population of size N by simple random sampling without replacement, and let the sample means of the two auxiliary variables be observed on it. Let the second-phase sample of size n_{2} be drawn from the first-phase sample, and let the study variable and both auxiliary variables be observed on it. The notations used in this paper are:

Cochran [

Sukhatme [

Raj [

where and “w” is a suitably chosen constant.

Mohanty [

Srivastava [

Mukerjee et al. [

Samiuddin and Hanif [

where

and is the partial correlation coefficient of given and.

Singh and Espejo [

where and.

Hanif et al. [

We propose the following estimator using two auxiliary variables for two-phase sampling when no information on the auxiliary variables is available, i.e. the population means of both auxiliary variables are unknown.

Putting the notations of (1.1) in (2.1), squaring and taking expectations, we obtain the mean square error as:

(2.2)

To obtain the optimum values of K_{1} and K_{2}, we differentiate (2.2) with respect to K_{1} and equate the result to zero, which gives:

Putting the value of (2.3) in (2.2) and differentiating with respect to K_{1}, we get:

where is the partial regression coefficient of y on keeping constant.

Putting the value of (2.4) in (2.3) we get:

Putting the values of (2.4) and (2.5) in (2.2) and on simplification we have:

Expressing the proposed estimator in terms of (1.1) and taking the assumption that is very small and expanding and up to second degree, we obtain bias of above estimator as follows

(2.7)

Putting (2.4) and (2.5) in (2.7) and after simplification, the optimized bias is

(2.8)
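The optimisation step of this section, in which the constants are chosen to minimise the mean square error and turn out to be partial regression coefficients, can be illustrated numerically. The sketch below does not use the estimator in (2.1). It assumes a generic difference-type estimator t = ybar2 + K1 (xbar1 - xbar2) + K2 (zbar1 - zbar2), where the symbols x and z for the two auxiliary variables, the function names and the synthetic data are all assumptions made for illustration. For this assumed form, the first-order mean square error is quadratic in (K1, K2), and setting its derivatives to zero gives a 2 x 2 linear system.

```python
import numpy as np

def optimal_coefficients(y, x, z):
    """Solve the normal equations for (K1, K2) of the ASSUMED estimator
    t = ybar2 + K1*(xbar1 - xbar2) + K2*(zbar1 - zbar2).
    The solution is the pair of partial regression coefficients of y on x, z."""
    S = np.cov(np.vstack([y, x, z]), ddof=1)  # covariance matrix, order (y, x, z)
    return np.linalg.solve(S[1:, 1:], S[1:, 0])

def approx_mse(K, y, x, z, N, n1, n2):
    """First-order MSE of the assumed estimator for given constants K,
    treating the supplied data as the whole finite population."""
    S = np.cov(np.vstack([y, x, z]), ddof=1)
    theta1, theta2 = 1.0 / n1 - 1.0 / N, 1.0 / n2 - 1.0 / N
    K = np.asarray(K, dtype=float)
    quad = K @ S[1:, 1:] @ K - 2.0 * K @ S[1:, 0]
    return theta2 * S[0, 0] + (theta2 - theta1) * quad

# Synthetic population with nearly uncorrelated auxiliary variables (illustrative).
rng = np.random.default_rng(1)
x = rng.normal(10.0, 2.0, size=500)
z = rng.normal(5.0, 1.0, size=500)
y = 1.0 + 2.0 * x - 1.0 * z + rng.normal(0.0, 1.0, size=500)

K_opt = optimal_coefficients(y, x, z)
mse_opt = approx_mse(K_opt, y, x, z, N=500, n1=100, n2=25)
mse_other = approx_mse(K_opt + np.array([0.5, -0.5]), y, x, z, N=500, n1=100, n2=25)
```

Any perturbation of `K_opt` increases the approximate mean square error, which is the property the differentiation argument relies on.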

In this section, the improvement of our proposed estimator over well-known estimators for two-phase sampling is shown. In each case, no information about the population characteristics of the auxiliary variables is available. Mathematical comparison shows that our proposed estimator outperforms the other estimators. We have compared our estimator with the estimators of [3,6-11,13,14]. The mathematical efficiency of our proposed estimator is given as:

a) Comparison with Robson [

b) Comparison with Sukhatme [

c) Comparison with Raj [

d) Comparison with Mohanty [

(3.4)

e) Comparison with Srivastava [

f) Comparison with Mukerjee et al. [

g) Comparison with Samiuddin and Hanif [

But our estimator is preferable to [

1) becomes classical ratio estimator for and;

2) converts into Robson [

3) emerges into Mohanty [

4) reduces to estimator given by Singh and Espejo [

5) turns into Hanif et al. [

h) Comparison with Singh and Espejo [

i) Comparison with Hanif et al. [
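The kind of comparison made in this section can be mimicked with standard first-order variance expressions for three textbook two-phase estimators: the second-phase sample mean, the regression estimator with one auxiliary variable, and a difference-type estimator with two auxiliary variables. These are not the specific competing estimators listed above. The function name, the symbols x and z, and the covariance matrix `S` are assumptions chosen for illustration, with the two auxiliary variables only weakly correlated.

```python
import numpy as np

def approx_variances(S, N, n1, n2):
    """First-order variances of three textbook two-phase estimators.

    S is a 3x3 covariance matrix ordered (y, x, z); all values are assumed.
    """
    theta1, theta2 = 1.0 / n1 - 1.0 / N, 1.0 / n2 - 1.0 / N
    Syy = S[0, 0]
    rho2_yx = S[0, 1] ** 2 / (S[0, 0] * S[1, 1])               # squared simple correlation
    R2 = S[0, 1:] @ np.linalg.solve(S[1:, 1:], S[0, 1:]) / Syy  # squared multiple correlation
    return {
        "mean": theta2 * Syy,
        "regression_x": Syy * (theta2 - (theta2 - theta1) * rho2_yx),
        "regression_xz": Syy * (theta2 - (theta2 - theta1) * R2),
    }

# Assumed covariance structure: corr(y,x)=0.6, corr(y,z)=0.4, corr(x,z)=0.1.
S = np.array([[4.0, 2.4, 1.6],
              [2.4, 4.0, 0.4],
              [1.6, 0.4, 4.0]])
v = approx_variances(S, N=1000, n1=200, n2=50)
```

Because the squared multiple correlation is never smaller than the squared simple correlation, the two-auxiliary expression cannot exceed the one-auxiliary expression. The gain is largest when the second auxiliary variable carries information about y that the first does not.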

In this paper we have proposed a regression-type estimator for two-phase sampling when no advance knowledge of the auxiliary variables is available. The estimators of [6,8,13,14] are special cases of our estimator. From Equations (3.1) to (3.9) one can readily see that our proposed estimator is more precise than all the competing estimators discussed in Section 1, so it provides a more accurate estimate of the population parameters.