Open Journal of Statistics
Vol.07 No.05(2017), Article ID:79925,15 pages
10.4236/ojs.2017.75059

Estimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement

Nelson Kiprono Bii1, Christopher Ouma Onyango2, John Odhiambo1

1Institute of Mathematical Sciences, Strathmore University, Nairobi, Kenya

2Department of Statistics, Kenyatta University, Nairobi, Kenya

Copyright © 2017 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: September 1, 2017; Accepted: October 24, 2017; Published: October 27, 2017

ABSTRACT

Non-response is a regular occurrence in sample surveys. Developing estimators when non-response exists may result in large biases when estimating population parameters. In this paper, a finite population mean is estimated when non-response exists randomly under two stage cluster sampling with replacement. It is assumed that non-response arises in the survey variable in the second stage of cluster sampling. A weighting method of compensating for non-response is applied. Asymptotic properties of the proposed estimator of the population mean are derived. Under mild assumptions, the estimator is shown to be asymptotically consistent.

Keywords:

Non-Response, Nadaraya-Watson Estimation, Two Stage Cluster Sampling

1. Introduction

In survey sampling, non-response is one of the sources of error in data analysis. Non-response introduces bias into the estimation of population characteristics. It also causes samples to fail to follow the distributions determined by the original sampling design. This paper seeks to reduce the non-response bias in the estimation of a finite population mean in two stage cluster sampling.

The use of regression models is recognized as one of the procedures for reducing bias due to non-response using auxiliary information. In practice, information on the variables of interest is not available for non-respondents, but information on auxiliary variables may be available for them. It is therefore desirable to model the response behavior and incorporate the auxiliary data into the estimation so that the bias arising from non-response can be reduced. If the auxiliary variables are correlated with the response behavior, then regression estimators would be more precise in the estimation of population parameters, provided the auxiliary information is known.

Many authors have developed estimators of the population mean where non-response exists in both the study and the auxiliary variables. However, there are cases that do not exhibit non-response in the auxiliary variables, such as the number of people in a family or the duration of one's education. Imputation techniques have been used to account for non-response in the study variable. For instance, [1] applied a compromised method of imputation to estimate a finite population mean under two stage cluster sampling; this method, however, produced a large bias. In this study, the Nadaraya-Watson regression technique is applied in deriving the estimator of the finite population mean, and kernel weights are used to compensate for non-response.

Reweighting Method

Non-response causes a loss of observations; reweighting therefore means that the weights are increased for all or almost all of the responding elements to compensate for the elements that fail to respond in a survey. The population mean, $\bar{Y}$, is estimated by selecting a sample of size $n$ at random with replacement. If the units responding to item $y$ are independent, so that the probability of unit $j$ responding in cluster $i$ is $p_{ij}$ $(i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m)$, then an imputed estimator, $\bar{y}_I$, of $\bar{Y}$ is given by

$$\bar{y}_I = \frac{1}{\sum_{i,j \in s} w_{ij}}\left[\sum_{i,j \in s_r} w_{ij} y_{ij} + \sum_{i,j \in s_m} w_{ij} y_{ij}^{*}\right] \quad (1.0)$$

where $w_{ij} = \frac{1}{\pi_{ij}}$ is the sample survey weight tied to unit $j$ in cluster $i$,

$\pi_{ij} = P[i, j \in s]$ is its second-order inclusion probability, $s_r$ is the set of $r$ units responding to item $y$, $s_m$ is the set of $m$ units that failed to respond to item $y$ so that $r + m = n$, and $y_{ij}^{*}$ is the imputed value generated to compensate for the missing value $y_{ij}$, [2].
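As a minimal computational sketch (not part of the original paper), Equation (1.0) can be evaluated directly once the survey weights, the observed responses and the imputed values are available; the function and array names below are purely illustrative.

```python
import numpy as np

def imputed_mean(w, y, y_star, respond):
    """Imputed estimator of Eq. (1.0).

    w       : survey weights w_ij = 1/pi_ij for all sampled units
    y       : observed item values (used only where respond is True, the set s_r)
    y_star  : imputed values (used only where respond is False, the set s_m)
    respond : boolean response indicators for item y
    """
    w, y, y_star = (np.asarray(a, dtype=float) for a in (w, y, y_star))
    respond = np.asarray(respond, dtype=bool)
    numerator = np.sum(w[respond] * y[respond]) + np.sum(w[~respond] * y_star[~respond])
    return numerator / np.sum(w)
```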

2. The Proposed Estimator of Finite Population Mean

Consider a finite population of size $M$ consisting of $N$ clusters with $N_i$ elements in the $i$th cluster. A sample of $n$ clusters is selected so that $n_1$ units respond and $n_2$ units fail to respond. Let $y_{ij}$ denote the value of the survey variable $Y$ for unit $j$ in cluster $i$, for $i = 1, 2, \ldots, N$, $j = 1, 2, \ldots, N_i$, and let the population mean be given by

$$\bar{\bar{Y}} = \frac{1}{MN}\sum_{i=1}^{N}\sum_{j=1}^{M_i} Y_{ij} \quad (2.1)$$

Let an estimator of the finite population mean be defined by $\hat{\bar{\bar{Y}}}$ as follows:

$$\hat{\bar{\bar{Y}}} = \frac{1}{M}\left\{\frac{1}{n_1}\sum_{i \in s}\sum_{j \in s}\frac{Y_{ij}}{\pi_{ij}}\delta_{ij} + \frac{1}{n_2}\sum_{i \in s}\sum_{j \in s}\left(1 - \frac{1}{\pi_{ij}}\right)\hat{Y}_{ij}\,\delta_{ij}\right\} \quad (2.2)$$

where $\delta_{ij}$ is an indicator variable defined by

$$\delta_{ij} = \begin{cases}1, & \text{if the } j\text{th unit in the } i\text{th cluster responds}\\ 0, & \text{elsewhere}\end{cases}$$

and $n_1$ and $n_2$ are the number of units that respond and those that fail to respond, respectively.

$\pi_{ij}$ is the probability of selecting the $j$th unit in the $i$th cluster into the sample.

Let $w(x_{ij}) = \frac{1}{\pi_{ij}}$ be the inverse of the second-order inclusion probability, where $x_{ij}$ is the auxiliary random variable for unit $j$ in cluster $i$. It follows that Equation (2.2) becomes

$$\hat{\bar{\bar{Y}}} = \frac{1}{M}\left\{\frac{1}{n_1}\sum_{i \in s}\sum_{j \in s} w(x_{ij})\, Y_{ij}\,\delta_{ij} + \frac{1}{n_2}\sum_{i \in s}\sum_{j \in s}\left(1 - w(x_{ij})\right)\hat{Y}_{ij}\,\delta_{ij}\right\} \quad (2.3)$$
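For illustration only, Equation (2.3) can be transcribed literally into code; here delta stands for the response indicators $\delta_{ij}$, w for the inverse inclusion probabilities $w(x_{ij})$, y for the observed $Y_{ij}$ and y_hat for the imputed $\hat{Y}_{ij}$, all flattened over the sampled units. This is a sketch of the formula as written, not an implementation supplied by the authors.

```python
import numpy as np

def proposed_mean(M, n1, n2, w, y, y_hat, delta):
    """Literal transcription of Eq. (2.3): the first term weights the observed
    Y_ij by w(x_ij), the second weights the imputed Y_hat_ij by (1 - w(x_ij))."""
    w, y, y_hat, delta = (np.asarray(a, dtype=float) for a in (w, y, y_hat, delta))
    term1 = np.sum(w * y * delta) / n1
    term2 = np.sum((1.0 - w) * y_hat * delta) / n2
    return (term1 + term2) / M
```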

Suppose the $\delta_{ij}$ are known to be Bernoulli random variables with probability of success $\delta_{ij}$, so that $E(\delta_{ij}) = pr(\delta_{ij} = 1) = \delta_{ij}$ and $\text{Var}(\delta_{ij}) = \delta_{ij}(1 - \delta_{ij})$, [3]. Thus, the expected value of the estimator of the population mean is given by

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{M}\left\{\frac{1}{n_1}\sum_{i \in s}\sum_{j \in s} E\left(w(x_{ij})\, Y_{ij}\right)\delta_{ij} + \frac{1}{n_2}\sum_{i \in s}\sum_{j \in s} E\left(\left(1 - w(x_{ij})\right)\hat{Y}_{ij}\right)\delta_{ij}\right\} \quad (2.4)$$

Assuming non-response in the second stage of sampling, the problem is therefore to estimate the values of $\hat{Y}_{ij}$. To do this, the linear regression model applied by [4] and [5], given below, is used:

$$\hat{Y}_{ij} = \hat{m}(x_{ij}) + \hat{e}_{ij} \quad (2.5)$$

where $m(\cdot)$ is a smooth function of the auxiliary variables and $\hat{e}_{ij}$ is the residual term with mean zero and strictly positive variance. Substituting Equation (2.5) into Equation (2.4), the following result is obtained:

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{M}\left\{\frac{1}{n_1}\sum_{i \in s}\sum_{j \in s} E\left(\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right) w(x_{ij})\right)\delta_{ij} + \frac{1}{n_2}\sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right)\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} \quad (2.6)$$

Assuming that $n_1 = n_2 = n$ and simplifying Equation (2.6), we obtain the following:

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right) w(x_{ij})\right)\delta_{ij} + \sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right)\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} \quad (2.7)$$

Detailed work done by [5] proved that $E(\hat{e}_{ij}) = 0$. Therefore, Equation (2.7) reduces to

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(\hat{m}(x_{ij})\right) E\left(w(x_{ij})\right)\delta_{ij} + \sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right) E\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} \quad (2.8)$$

The second term in Equation (2.8) is simplified as follows:

1 M n { i s j s E ( 1 w ( x i j ) ) E ( m ( x ^ i j ) + e ^ i j ) δ i j * } = 1 M n { i s j s E ( 1 w ( x i j ) ) m ( x ^ i j ) δ i j } + 1 M n { i s j s E ( 1 w ( x i j ) ) e i j δ i j } (2.9)

But $E(\hat{m}(x_{ij})) = m(x_{ij})$, [6]. Thus we get the following:

$$\frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right) E\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} = \frac{1}{Mn}\left\{\sum_{i=m+1}^{M}\sum_{j=n+1}^{N}\delta_{ij}\, m(x_{ij}) - w(x_{ij})\,\delta_{ij}\, m(x_{ij})\right\} + \frac{1}{Mn}\left\{\sum_{i=m+1}^{M}\sum_{j=n+1}^{N} E\left(e_{ij}\,\delta_{ij}\right) - E\left(w(x_{ij})\left(e_{ij}\,\delta_{ij}\right)\right)\right\} \quad (2.10)$$

$$\frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right) E\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} = \frac{1}{Mn}\left\{\left(M - (m+1)\right)\left(N - (n+1)\right)\left[\delta_{ij}\, m(x_{ij}) - w(x_{ij})\,\delta_{ij}\, m(x_{ij})\right]\right\} + \frac{1}{Mn}\left\{\left(M - (m+1)\right)\left(N - (n+1)\right)\left[\delta_{ij}\, E(e_{ij}) - E(e_{ij})\,\delta_{ij}\, w(x_{ij})\right]\right\} \quad (2.11)$$

But $E(e_{ij}) = 0$; for details see [5].

On simplification, Equation (2.11) reduces to

$$\frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right) E\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} = \frac{\left(M - (m+1)\right)\left(N - (n+1)\right)}{Mn}\left\{\delta_{ij}\, m(x_{ij})\left(1 - w(x_{ij})\right)\right\} \quad (2.12)$$

Recall that $w(x_{ij}) = \frac{1}{\pi_{ij}}$,

so that Equation (2.12) may be re-written as follows:

$$\frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right) E\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} = \frac{\left(M - (m+1)\right)\left(N - (n+1)\right)}{Mn}\left\{\delta_{ij}\, m(x_{ij})\left(\frac{\pi_{ij} - 1}{\pi_{ij}}\right)\right\} \quad (2.13)$$

Assume the sample sizes are large, i.e. $n \to N$ and $m \to M$; Equation (2.13) then simplifies to

$$\frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(1 - w(x_{ij})\right) E\left(\hat{m}(x_{ij}) + \hat{e}_{ij}\right)\delta_{ij}\right\} = \frac{1}{Mn}\left\{\delta_{ij}\, m(x_{ij})\left(\frac{\pi_{ij} - 1}{\pi_{ij}}\right)\right\} \quad (2.14)$$

Combining Equation (2.14) with the first term in Equation (2.8) gives:

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} E\left(m(x_{ij})\right) E\left(\frac{\delta_{ij}}{\pi_{ij}}\right) + \sum_{i \in s}\sum_{j \in s}\delta_{ij}\,\hat{m}(x_{ij})\left(\frac{\pi_{ij} - 1}{\pi_{ij}}\right)\right\} \quad (2.15)$$

Since the first term represents the responding units, their values are all known; the problem is to estimate the non-response units in the second term. Setting the indicator variable $\delta_{ij} = 1$, the problem reduces to that of estimating the function $\hat{m}(x_{ij})$, which is a function of the auxiliary variables $x_{ij}$. Hence the expected value of the estimator of the finite population mean under non-response is given as:

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{Mn}\left\{\sum_{i \in s}\sum_{j \in s} Y_{ij} + \sum_{i \in s}\sum_{j \in s}\delta_{ij}\,\hat{m}(x_{ij})\left(\frac{\pi_{ij} - 1}{\pi_{ij}}\right)\right\} \quad (2.16)$$

In order to derive the asymptotic properties of the expected value of the proposed estimator in Equation (2.16), a review of the Nadaraya-Watson estimator is first given below.

3. Review of Nadaraya-Watson Estimator

Given a random sample of bivariate data $(x_1, y_1), \ldots, (x_n, y_n)$ having a joint pdf $g(x, y)$, with the regression model given by

$Y_{ij} = m(x_{ij}) + e_{ij}$, as in Equation (2.5), where $m(\cdot)$ is unknown. Let the error term satisfy the following conditions:

$$E(e_{ij}) = 0, \quad \text{Var}(e_{ij}) = \sigma_{ij}^{2}, \quad \text{cov}(e_i, e_j) = 0 \ \text{for} \ i \neq j \quad (3.0)$$

Furthermore, let $K(\cdot)$ denote a symmetric kernel density function which is twice continuously differentiable, with:

$$\left.\begin{aligned}&\int k(w)\,dw = 1\\ &\int w\,k(w)\,dw = 0\\ &\int k^{2}(w)\,dw < \infty\\ &\int w^{2}k(w)\,dw = d_k\\ &k(-w) = k(w)\end{aligned}\right\} \quad (3.1)$$
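As an illustration (not part of the original paper), the standard Gaussian kernel satisfies the conditions in Equation (3.1); a quick numerical check, assuming SciPy is available:

```python
import numpy as np
from scipy.integrate import quad

K = lambda w: np.exp(-w**2 / 2) / np.sqrt(2 * np.pi)   # standard Gaussian kernel, symmetric

mass, _ = quad(lambda w: K(w), -np.inf, np.inf)         # integral of k(w) dw     -> 1
mean, _ = quad(lambda w: w * K(w), -np.inf, np.inf)     # integral of w k(w) dw   -> 0
sq, _   = quad(lambda w: K(w) ** 2, -np.inf, np.inf)    # integral of k^2(w) dw   -> finite (about 0.2821)
d_k, _  = quad(lambda w: w**2 * K(w), -np.inf, np.inf)  # integral of w^2 k(w) dw -> d_k = 1

print(round(mass, 4), round(mean, 4), round(sq, 4), round(d_k, 4))
```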

In addition, let the smoothing weights be defined by

$$w(x_{ij}) = \frac{K\left(\frac{x - X_{ij}}{b}\right)}{\sum_{i \in s}\sum_{j \in s} K\left(\frac{x - X_{ij}}{b}\right)}, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m \quad (3.2)$$

where $b$ is a smoothing parameter, normally referred to as the bandwidth, such that $\sum_{i}\sum_{j} w(x_{ij}) = 1$.

Using Equation (3.2), the Nadaraya-Watson estimator of $m(x_{ij})$ is given by:

$$\hat{m}(x_{ij}) = \sum_{i \in s}\sum_{j \in s} w(x_{ij})\, Y_{ij} = \frac{\sum_{i \in s}\sum_{j \in s} K\left(\frac{x - X_{ij}}{b}\right) Y_{ij}}{\sum_{i \in s}\sum_{j \in s} K\left(\frac{x - X_{ij}}{b}\right)}, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m \quad (3.3)$$
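A minimal sketch of the estimator in Equation (3.3), assuming a Gaussian kernel; the function name nw_estimate and its arguments are illustrative and not taken from the paper.

```python
import numpy as np

def nw_estimate(x0, x_obs, y_obs, b):
    """Nadaraya-Watson estimate of m(x0) from observed pairs (x_obs, y_obs),
    with bandwidth b and a Gaussian kernel."""
    x_obs, y_obs = np.asarray(x_obs, dtype=float), np.asarray(y_obs, dtype=float)
    k = np.exp(-0.5 * ((x0 - x_obs) / b) ** 2)   # kernel values K((x0 - X_ij)/b)
    w = k / k.sum()                              # smoothing weights of Eq. (3.2); they sum to 1
    return float(np.sum(w * y_obs))              # weighted average of the observed responses
```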

Given the model $\hat{Y}_{ij} = \hat{m}(x_{ij}) + \hat{e}_{ij}$ and the conditions on the error term stated in Equation (3.0) above, the survey variable $Y_{ij}$ can be expressed relative to the auxiliary variable $X_{ij}$ through the joint pdf $g(x_{ij}, y_{ij})$ as follows:

$$m(x_{ij}) = E\left(Y_{ij} \mid X_{ij} = x_{ij}\right) = \int y\, g(y \mid x)\,dy = \frac{\int y\, g(x, y)\,dy}{\int g(x, y)\,dy} \quad (3.4)$$

where $\int g(x, y)\,dy$ is the marginal density of $X_{ij}$. The numerator and the denominator of Equation (3.4) can be estimated separately using kernel functions as follows:

$g(x, y)$ is estimated by

$$\hat{g}(x, y) = \frac{1}{mn}\sum_{i}\sum_{j}\left(\frac{1}{b} K\left(\frac{x - X_{ij}}{b}\right)\frac{1}{b} K\left(\frac{y - Y_{ij}}{b}\right)\right) \quad (3.5)$$

and

$$\int y\,\hat{g}(x, y)\,dy = \frac{1}{mn}\sum_{i}\sum_{j}\int\left(\frac{1}{b} K\left(\frac{x - X_{ij}}{b}\right)\frac{1}{b} K\left(\frac{y - Y_{ij}}{b}\right)\right) y\,dy \quad (3.6)$$

Using the change of variables technique, let

$$\left.\begin{aligned} w &= \frac{y - Y_{ij}}{b}\\ y &= wb + Y_{ij}\\ dy &= b\,dw\end{aligned}\right\} \quad (3.7)$$

So that

$$\int y\,\hat{g}(x, y)\,dy = \frac{1}{mn}\sum_{i}\sum_{j}\frac{1}{b} K\left(\frac{x - X_{ij}}{b}\right)\int\frac{1}{b}\left(bw + Y_{ij}\right) K(w)\, b\,dw \quad (3.8)$$

$$= \frac{1}{mnb}\sum_{i}\sum_{j} K\left(\frac{x - X_{ij}}{b}\right)\left[\int w\, K(w)\, b\,dw + \frac{1}{b}\int Y_{ij}\, K(w)\, b\,dw\right] \quad (3.9)$$

From the conditions specified in Equation (3.1), Equation (3.9) simplifies to

$$\int y\,\hat{g}(x, y)\,dy = \frac{1}{mnb}\sum_{i}\sum_{j} K\left(\frac{x - X_{ij}}{b}\right)\left[0 + Y_{ij}\right] \quad (3.10)$$

which reduces to:

$$\int y\,\hat{g}(x, y)\,dy = \frac{1}{mnb}\sum_{i}\sum_{j} K\left(\frac{x - X_{ij}}{b}\right) Y_{ij} \quad (3.11)$$

Following the same procedure, the denominator can be obtained as follows:

$$\int\hat{g}(x, y)\,dy = \frac{1}{mn}\sum_{i}\sum_{j}\int\left(\frac{1}{b} K\left(\frac{x - X_{ij}}{b}\right)\frac{1}{b} K\left(\frac{y - Y_{ij}}{b}\right)\right) dy = \frac{1}{mnb}\sum_{i=1}^{n}\sum_{j=1}^{m} K\left(\frac{x - X_{ij}}{b}\right)\int\frac{1}{b} K\left(\frac{y - Y_{ij}}{b}\right) dy \quad (3.12)$$

Using the change of variable technique as in Equation (3.7), Equation (3.12) can be re-written as follows:

$$\int\hat{g}(x, y)\,dy = \frac{1}{mnb}\sum_{i=1}^{n}\sum_{j=1}^{m} K\left(\frac{x - X_{ij}}{b}\right)\int\frac{1}{b} K(w)\, b\,dw \quad (3.13)$$

which yields

$$\int\hat{g}(x, y)\,dy = \frac{1}{mnb}\sum_{i=1}^{n}\sum_{j=1}^{m} K\left(\frac{x - X_{ij}}{b}\right) \quad (3.14)$$

since $\int\frac{1}{b} K(w)\, b\,dw = \int K(w)\,dw = 1$, $K(\cdot)$ being a pdf.

It follows from Equations (3.11) and (3.14) that the estimator $\hat{m}(x_{ij})$ is as given in Equation (3.3). Thus the estimator of $m(x_{ij})$ is a linear smoother, since it is a linear function of the observations $Y_{ij}$. Given a sample and a specified kernel function, then for a given auxiliary value $x_{ij}$, the corresponding $y$-estimate is obtained from the estimator outlined in Equation (3.3), which can be written as:

$$\hat{y}_{ij} = \hat{m}_{NW}(x_{ij}) = \sum_{i}\sum_{j} W_{ij}(x_{ij})\, Y_{ij} \quad (3.15)$$

where $\hat{m}_{NW}(x_{ij})$ is the Nadaraya-Watson estimator of the unknown function $m(\cdot)$; for details see [7] [8].

This provides a way of estimating, for instance, the non-response values of the survey variable $Y_{ij}$ from the auxiliary values $x_{ij}$, for a specified kernel function, as sketched below.
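Continuing the nw_estimate sketch above with simulated data (purely illustrative; none of the numbers come from the paper), the missing y-values of non-respondents are filled in from their auxiliary x-values:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)                            # auxiliary values x_ij, always observed
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 200)   # survey variable Y_ij
respond = rng.random(200) < 0.8                       # response indicators delta_ij (~80% response)
b = 0.1                                               # illustrative bandwidth

y_filled = y.copy()
y_filled[~respond] = [nw_estimate(x0, x[respond], y[respond], b) for x0 in x[~respond]]
```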

4. Asymptotic Bias of the Mean Estimator $\hat{\bar{\bar{Y}}}$

Equation (2.16) may be written as

$$E(\hat{\bar{\bar{Y}}}) = \frac{1}{MN}\left\{\sum_{i=1}^{n}\sum_{j=1}^{m} Y_{ij} + \sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\hat{m}_{NW}(x_{ij})\right\} \quad (4.1)$$

Replacing $x$ by $x_{ij}$ and re-writing Equation (3.15) using the symmetry property of the Nadaraya-Watson estimator, then

$$\hat{m}_{NW}(x_{ij}) = \frac{\sum_{i \in s}\sum_{j \in s} K\left(\frac{X_{ij} - x_{ij}}{b}\right) Y_{ij}}{\sum_{i \in s}\sum_{j \in s} K\left(\frac{X_{ij} - x_{ij}}{b}\right)}, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m \quad (4.2)$$

$$= \frac{1}{\hat{g}(x_{ij})}\left[\frac{1}{mnb}\sum_{i}\sum_{j} K\left(\frac{X_{ij} - x_{ij}}{b}\right) Y_{ij}\right] \quad (4.3)$$

where $\hat{g}(x_{ij})$ is the estimated marginal density of the auxiliary variables $X_{ij}$.

But for a finite population mean, the expected value of the estimator is given in Equation (4.1). The bias is given by

$$\text{Bias}(\hat{\bar{\bar{Y}}}) = E\left(\hat{\bar{\bar{Y}}} - \bar{\bar{Y}}\right) \quad (4.4)$$

$$\text{Bias}(\hat{\bar{\bar{Y}}}) = E\left\{\frac{1}{MN}\left[\sum_{i=1}^{n}\sum_{j=1}^{m} Y_{ij} + \sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\hat{m}(x_{ij})\right] - \frac{1}{MN}\left[\sum_{i=1}^{n}\sum_{j=1}^{m} Y_{ij} + \sum_{i=n+1}^{N}\sum_{j=m+1}^{M} Y_{ij}\right]\right\} \quad (4.5)$$

which reduces to

$$\text{Bias}(\hat{\bar{\bar{Y}}}) = \frac{1}{MN}\left\{\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\hat{m}(x_{ij}) - \sum_{i=n+1}^{N}\sum_{j=m+1}^{M} Y_{ij}\right\} \quad (4.6)$$

$$= \frac{1}{MN}\left\{\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\hat{m}(x_{ij}) - \sum_{i=n+1}^{N}\sum_{j=m+1}^{M} m(x_{ij})\right\} \quad (4.7)$$

Re-writing the regression model given by $Y_{ij} = m(X_{ij}) + e_{ij}$ as

$$Y_{ij} = m(x_{ij}) + \left[m(X_{ij}) - m(x_{ij})\right] + e_{ij} \quad (4.8)$$

So that from Equation (4.3) the first term in Equation (4.7) before taking the expectation is given as:

$$\frac{1}{MN}\left\{\frac{1}{mnb}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\frac{K\left(\frac{X_{ij} - x_{ij}}{b}\right) Y_{ij}}{\hat{g}(x_{ij})}\right\} = \frac{1}{MN}\left\{\frac{1}{\hat{g}(x_{ij})}\left\{\sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right) m(x_{ij}) + \frac{1}{mnb}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right)\left[m(X_{ij}) - m(x_{ij})\right] + \frac{1}{mnb}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right) e_{ij}\right\}\right\} \quad (4.9)$$

Simplifying Equation (4.9) the following is thus obtained:

$$\frac{1}{MN}\left\{\frac{1}{mnb\,\hat{g}(x_{ij})}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right) Y_{ij}\right\} = \frac{1}{MN}\left\{\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\frac{\hat{g}(x_{ij})\, m(x_{ij}) + \hat{m}_1(x_{ij}) + \hat{m}_2(x_{ij})}{mnb\,\hat{g}(x_{ij})}\right\} \quad (4.10)$$

where

$$\hat{m}_1(x_{ij}) = \sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right)\left[m(X_{ij}) - m(x_{ij})\right]$$

$$\hat{m}_2(x_{ij}) = \sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right) e_{ij}$$

Taking conditional expectation of Equation (4.10) we get

$$E\left[\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\hat{m}(x_{ij})\,\middle|\, x_{ij}\right] = \frac{1}{MN} E\left[\frac{1}{mnb}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[m(x_{ij}) + \frac{\hat{m}_1(x_{ij})}{\hat{g}(x_{ij})} + \frac{\hat{m}_2(x_{ij})}{\hat{g}(x_{ij})}\right]\right] \quad (4.11)$$

To obtain the relationship between the conditional mean and the selected bandwidth, the following theorem due to [6] is applied:

Theorem: (Dorfman, 1992)

Let $k(w)$ be a symmetric density function with $\int w\,k(w)\,dw = 0$ and $\int w^{2}k(w)\,dw = k_2$. Assume $n$ and $N$ increase together such that $\frac{n}{N} \to \pi$ with $0 < \pi < 1$. Besides, assume the sampled and non-sampled values of $x$ are in the interval $[c, d]$ and are generated by densities $d_s$ and $d_{p-s}$ respectively, both bounded away from zero on $[c, d]$ and assumed to have continuous second derivatives. If for any variable $Z$, $E(Z \mid U = u) = A(u) + O(B)$ and $\text{Var}(Z \mid U = u) = O(C)$, then $Z = A(u) + O_p\left(B + C^{1/2}\right)$.

Applying this theorem, we have

$$MSE(\hat{\bar{\bar{Y}}} \mid x_{ij}) = \frac{1}{(MN)^{2}}\left\{\frac{(MN - mn)^{2}\int k^{2}(w)\,dw}{mnb\, g(x_{ij})} + \frac{(MN - mn)^{2}}{4 m^{2} n^{2}} b^{4} k_2^{2}\left[m''(x_{ij}) + \frac{2 g'(x_{ij})\, m'(x_{ij})}{g(x_{ij})}\right]^{2} + O(b^{4}) + O\left[\frac{(MN - mn)^{2}}{mnb} + \frac{1}{mnb}\right]\right\} \quad (4.12)$$

The theorem is stated without proof. To establish the result in Equation (4.12), it is partitioned into the bias and variance terms, which are derived separately as follows:

From Equation (3.0) it follows that $E(e_{ij} \mid X_{ij}) = 0$. Therefore, $E[\hat{m}_2(x_{ij})] = 0$. Thus $E[\hat{m}_1(x_{ij})]$ can be obtained as follows:

$$E\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{MN}\left\{\frac{1}{mnb} E\left\{\sum_{i=n+1}^{N}\sum_{j=m+1}^{M} K\left(\frac{X_{ij} - x_{ij}}{b}\right)\left[m(X_{ij}) - m(x_{ij})\right]\right\}\right\} \quad (4.13)$$

Using substitution and change of variable technique below

$$w = \frac{V - x_{ij}}{b} \quad \text{so that} \quad V = x_{ij} + bw \quad \text{and} \quad dV = b\,dw \quad (4.14)$$

Equation (4.13) simplifies to:

$$E\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{MN}\left\{\frac{MN - mn}{mnb}\int k(w)\left[m(x_{ij} + bw) - m(x_{ij})\right] g(x_{ij} + bw)\, b\,dw\right\} \quad (4.15)$$

$$= \frac{1}{MN}\left\{\frac{MN - mn}{mn}\int k(w)\left[m(x_{ij} + bw) - m(x_{ij})\right] g(x_{ij} + bw)\,dw\right\} \quad (4.16)$$

Using Taylor's series expansion about the point $x_{ij}$, the expansion to $k$th order is derived as follows:

$$g(x_{ij} + bw) = g(x_{ij}) + g'(x_{ij})bw + \frac{1}{2} g''(x_{ij}) b^{2} w^{2} + \cdots + \frac{1}{k!} g^{(k)}(x_{ij}) b^{k} w^{k} + O(b^{2}) \quad (4.17)$$

Similarly,

$$m(x_{ij} + bw) = m(x_{ij}) + m'(x_{ij})bw + \frac{1}{2} m''(x_{ij}) b^{2} w^{2} + \cdots + \frac{1}{k!} m^{(k)}(x_{ij}) b^{k} w^{k} + O(b^{2}) \quad (4.18)$$

Expanding up to the 3rd order terms, Equation (4.18) becomes

$$m(x_{ij} + bw) - m(x_{ij}) = m'(x_{ij})bw + \frac{1}{2} m''(x_{ij}) b^{2} w^{2} + \frac{1}{3!} m'''(x_{ij}) b^{3} w^{3} \quad (4.19)$$

In a similar manner, the expansion of Equation (4.16) up to order $O(b^{2})$ is given by:

$$E\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{MN}\left\{\frac{MN - mn}{mn}\int k(w)\left(m'(x_{ij})bw + \frac{1}{2} m''(x_{ij}) b^{2} w^{2}\right)\left(g(x_{ij}) + g'(x_{ij})bw\right)dw\right\} \quad (4.20)$$

Simplifying Equation (4.20) gives:

$$E\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right) g(x_{ij})\, m'(x_{ij})\, b\int w\, k(w)\,dw + \left(\frac{MN - mn}{mn}\right) g'(x_{ij})\, m'(x_{ij})\, b^{2}\int w^{2} k(w)\,dw + \left(\frac{MN - mn}{mn}\right)\frac{1}{2} g(x_{ij})\, m''(x_{ij})\, b^{2}\int w^{2} k(w)\,dw + O(b^{2})\right\} \quad (4.21)$$

Using the conditions stated in Equation (3.1), the derivation in (4.21) can further be simplified to obtain:

$$E\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right)\left[g'(x_{ij})\, m'(x_{ij}) + \frac{1}{2} g(x_{ij})\, m''(x_{ij})\right] b^{2} d_k + O(b^{2})\right\} \quad (4.22)$$

Hence the expected value of the second term in Equation (4.11) then becomes:

$$E\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right)\left[\frac{1}{2} m''(x_{ij}) + \frac{g'(x_{ij})\, m'(x_{ij})}{g(x_{ij})}\right] b^{2} d_k + O(b^{2})\right\} \quad (4.23)$$

$$= \frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right)\left[\frac{m''(x_{ij})}{2} + \left[g(x_{ij})\right]^{-1} g'(x_{ij})\, m'(x_{ij})\right] b^{2} d_k + O(b^{2})\right\} \quad (4.24)$$

$$= \frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right) b^{2} d_k\, C(x) + O(b^{2})\right\} \quad (4.25)$$

where

$$C(x) = \frac{1}{2} m''(x_{ij}) + \left[g(x_{ij})\right]^{-1} g'(x_{ij})\, m'(x_{ij}) \quad (4.26)$$

and $d_k$ is as stated in Equation (3.1).

Using the equation of the bias given in (4.4) and the conditional expectation in Equation (4.11), we obtain the following expression for the bias of the estimator:

$$\text{Bias}(\hat{\bar{\bar{Y}}}) = \frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right) b^{2} d_k\, C(x) + O(b^{2})\right\} \quad (4.27)$$
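A small helper (illustrative, not from the paper) evaluates the leading term of Equation (4.27) for given population and sample sizes, bandwidth $b$, kernel constant $d_k$ and curvature term $C(x)$:

```python
def asymptotic_bias(M, N, m, n, b, d_k, C_x):
    """Leading term of Eq. (4.27): (1/(MN)) * ((MN - mn)/(mn)) * b^2 * d_k * C(x)."""
    return ((M * N - m * n) / (m * n)) * b**2 * d_k * C_x / (M * N)
```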

5. Asymptotic Variance of the Estimator $\hat{\bar{\bar{Y}}}$

From Equations (4.9) and (4.11),

$$\hat{m}_2(x_{ij}) = \frac{1}{mnb}\sum_{i=1}^{n}\sum_{j=1}^{m} K\left(\frac{X_{ij} - x_{ij}}{b}\right) e_{ij} \quad (5.0)$$

Hence

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_2(x_{ij})\right] = \frac{1}{(MN)^{2}}\left(\frac{MN - mn}{mnb}\right)^{2}\sum_{i=1}^{n}\sum_{j=1}^{m}\text{Var}(D_x) \quad (5.1)$$

where

$$D_x = K\left(\frac{X_{ij} - x_{ij}}{b}\right) e_{ij}$$

Expressing Equation (5.1) in terms of expectation we obtain:

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_2(x_{ij})\right] = \frac{1}{(MN)^{2}}\left[\frac{(MN - mn)^{2}}{mnb^{2}}\right]\left\{E\left[D_x^{2}\right] - \left[E(D_x)\right]^{2}\right\} \quad (5.2)$$

Using the fact that the conditional expectation $E(e_{ij} \mid X_{ij}) = 0$, the second term in Equation (5.2) reduces to zero. Therefore,

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_2(x_{ij})\right] = \frac{1}{(MN)^{2}}\left[\frac{(MN - mn)^{2}}{mnb^{2}}\right]\sigma^{2}(x_{ij}) \quad (5.3)$$

where

$$E\left(e_{ij}^{2} \mid X_{ij}\right) = \sigma^{2}(x_{ij})$$

Let $X = X_{ij}$ and $x = x_{ij}$, and make the following substitutions:

$$\left.\begin{aligned} w &= \frac{X - x}{b}\\ X - x &= bw\\ dX &= b\,dw\end{aligned}\right\} \quad (5.4)$$

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_2(x_{ij})\right] = \frac{(MN - mn)^{2}}{mnb^{2}(MN)^{2}}\int K\left(\frac{X - x}{b}\right)^{2}\sigma^{2}(x_{ij})\, g(X)\,dX \quad (5.5)$$

$$= \frac{(MN - mn)^{2}}{mnb^{2}(MN)^{2}}\int K(w)^{2}\sigma^{2}(x_{ij})\, g(x + bw)\, b\,dw \quad (5.6)$$

which can be simplified to get:

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_2(x_{ij})\right] = \frac{(MN - mn)^{2}}{mnb(MN)^{2}}\int K(w)^{2} g(x)\,\sigma^{2}(x_{ij})\,dw + O\left(\frac{1}{mnb}\right) \quad (5.7)$$

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{1}{(MN)^{2}}\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\frac{1}{mnb}\sum_{i=1}^{n}\sum_{j=1}^{m} K\left(\frac{X_{ij} - x_{ij}}{b}\right)\right]\left[m(X_{ij}) - m(x_{ij})\right] \quad (5.8)$$

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{(MN - mn)^{2}}{mnb^{2}(MN)^{2}}\text{Var}\left\{K\left(\frac{X_{ij} - x_{ij}}{b}\right)\left[m(X_{ij}) - m(x_{ij})\right]\right\} \quad (5.9)$$

Hence

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{(MN - mn)^{2}}{mnb^{2}(MN)^{2}}\int E\left[K\left(\frac{X - x}{b}\right)^{2}\left[m(X) - m(x)\right]^{2}\right] g(X)\,dX \quad (5.10)$$

where $X = bw + x$ so that $dX = b\,dw$.

Changing variables and applying Taylor’s series expansion then

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = \frac{(MN - mn)^{2}}{mnb^{2}(MN)^{2}}\int K(w)^{2}\left[m(x + bw) - m(x)\right]^{2} g(x + bw)\,dw \quad (5.11)$$

$$= \frac{(MN - mn)^{2}}{mnb^{2}(MN)^{2}}\int K(w)^{2}\left[m'(x)bw + \frac{1}{2} m''(x) b^{2} w^{2}\right]^{2}\left(g(x) + g'(x)bw\right)dw \quad (5.12)$$

which simplifies to

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] = O\left[\frac{(MN - mn)^{2} b^{2}}{mnb}\right] \quad (5.13)$$

For large samples, i.e. as $n \to N$, $m \to M$ and $b \to 0$ with $mnb \to \infty$, the variance in Equation (5.12) asymptotically tends to zero, that is,

$$\text{Var}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_1(x_{ij})\right] \to 0$$

$$\text{Var}(\hat{\bar{\bar{Y}}}) = \frac{(MN - mn)^{2}}{mnb(MN)^{2}}\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\text{Var}\left[\frac{m(x_{ij}) + \hat{m}_1(x_{ij}) + \hat{m}_2(x_{ij})}{\hat{g}(x_{ij})}\right] \quad (5.14)$$

On simplification,

$$\text{Var}(\hat{\bar{\bar{Y}}}) = \frac{(MN - mn)^{2}}{mnb(MN)^{2}\left[\hat{g}(x_{ij})\right]^{2}}\text{Var}\left\{\sum_{i=n+1}^{N}\sum_{j=m+1}^{M}\left[\hat{m}_2(x_{ij})\right]\right\} \quad (5.15)$$

Substituting Equation (5.7) into Equation (5.15) yields the following:

$$\text{Var}(\hat{\bar{\bar{Y}}}) = \frac{1}{(MN)^{2}}\left\{\frac{(MN - mn)^{2}\int K(w)^{2}\sigma^{2}(x_{ij})\,dw}{mnb\, g(x_{ij})} + O\left[\frac{(MN - mn)^{2}}{mnb} + \frac{1}{mnb}\right]\right\} \quad (5.16)$$

$$= \frac{1}{(MN)^{2}}\left\{\frac{(MN - mn)^{2} H(w)\,\sigma^{2}(x_{ij})}{mnb\, g(x_{ij})} + O\left[\frac{(MN - mn)^{2}}{mnb} + \frac{1}{mnb}\right]\right\} \quad (5.17)$$

where $H(w) = \int K(w)^{2}\,dw$.

It is notable that the variance term still depends on the marginal density function $g(x_{ij})$ of the auxiliary variables $X_{ij}$. It can also be observed that the variance is inversely related to the smoothing parameter $b$: an increase in $b$ results in a smaller variance. However, increasing the bandwidth gives a larger bias. There is therefore a trade-off between the bias and the variance of the estimated population mean, and a bandwidth that provides a compromise between the two measures is desirable, as illustrated in the sketch below.
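The trade-off can be made concrete with a short sketch (all numerical values are arbitrary placeholders, not results from the paper): the squared leading bias of Equation (4.27) and the leading variance of Equation (5.17) are evaluated over a grid of bandwidths, and the bandwidth minimising their sum is taken as the compromise choice.

```python
import numpy as np

def leading_bias(M, N, m, n, b, d_k, C_x):
    """Leading bias term of Eq. (4.27)."""
    return ((M * N - m * n) / (m * n)) * b**2 * d_k * C_x / (M * N)

def leading_variance(M, N, m, n, b, H, sigma2, g_x):
    """Leading variance term of Eq. (5.17)."""
    return (M * N - m * n) ** 2 * H * sigma2 / (m * n * b * g_x) / (M * N) ** 2

# Arbitrary illustrative values for the sizes and for the unknown quantities
# d_k, C(x), H(w), sigma^2(x) and g(x).
M, N, m, n = 50, 40, 30, 25
d_k, C_x, H, sigma2, g_x = 1.0, 0.5, 0.2821, 0.04, 1.0

bandwidths = np.linspace(0.01, 1.0, 200)
mse = np.array([leading_bias(M, N, m, n, b, d_k, C_x) ** 2
                + leading_variance(M, N, m, n, b, H, sigma2, g_x)
                for b in bandwidths])
b_opt = bandwidths[np.argmin(mse)]  # bandwidth balancing squared bias against variance
print(f"approximate MSE-minimising bandwidth: {b_opt:.3f}")
```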

6. Mean Squared Error (MSE) of the Finite Population Mean Estimator $\hat{\bar{\bar{Y}}}$

The MSE of $\hat{\bar{\bar{Y}}}$ combines the bias and the variance terms of this estimator, that is,

$$MSE(\hat{\bar{\bar{Y}}}) = E\left(\hat{\bar{\bar{Y}}} - \bar{\bar{Y}}\right)^{2} \quad (6.0)$$

$$MSE(\hat{\bar{\bar{Y}}}) = E\left(\hat{\bar{\bar{Y}}} - E\left[\hat{\bar{\bar{Y}}}\right] + E\left[\hat{\bar{\bar{Y}}}\right] - \bar{\bar{Y}}\right)^{2} \quad (6.1)$$

Expanding Equation (6.1) gives:

$$MSE(\hat{\bar{\bar{Y}}}) = E\left(\hat{\bar{\bar{Y}}} - E\left[\hat{\bar{\bar{Y}}}\right]\right)^{2} + E\left(E\left[\hat{\bar{\bar{Y}}}\right] - \bar{\bar{Y}}\right)^{2} + 2E\left(\hat{\bar{\bar{Y}}} - E\left[\hat{\bar{\bar{Y}}}\right]\right)\left(E\left[\hat{\bar{\bar{Y}}}\right] - \bar{\bar{Y}}\right) \quad (6.2)$$

$$= \text{Var}(\hat{\bar{\bar{Y}}}) + \text{Bias}^{2}(\hat{\bar{\bar{Y}}}) + 0 \quad (6.3)$$

Combining the bias in Equation (4.27) and the variance in Equation (5.17), and conditioning on the values $x_{ij}$ of the auxiliary variables $X_{ij}$, then

$$MSE(\hat{\bar{\bar{Y}}} \mid X_{ij} = x_{ij}) = \frac{1}{(MN)^{2}}\left\{\frac{(MN - mn)^{2} H(w)\,\sigma^{2}(x_{ij})}{mnb\, g(x_{ij})} + O\left(\frac{1}{MN}\left\{\frac{(MN - mn)^{2}}{mnb} + \frac{1}{mnb}\right\}\right)\right\} + \left[\frac{1}{MN}\left\{\left(\frac{MN - mn}{mn}\right) b^{2} d_k\, C(x) + O(b^{2})\right\}\right]^{2} \quad (6.4)$$

$$MSE(\hat{\bar{\bar{Y}}} \mid X_{ij} = x_{ij}) = \frac{1}{(MN)^{2}}\left\{\frac{(MN - mn)^{2} H(w)\,\sigma^{2}(x_{ij})}{mnb\, g(x_{ij})} + \frac{(MN - mn)^{2}}{4(mn)^{2}} b^{4} d_k^{2}\left[m''(x_{ij}) + \frac{2 g'(x_{ij})\, m'(x_{ij})}{g(x_{ij})}\right]^{2} + O(b^{4}) + \frac{1}{MN} O\left(\frac{MN - mn}{mnb} + \frac{1}{mnb}\right)\right\} \quad (6.5)$$

where $H(w) = \int K(w)^{2}\,dw$, $d_k = \int w^{2} K(w)\,dw$ and $C(x) = \frac{1}{2} m''(x_{ij}) + \left[g(x_{ij})\right]^{-1} g'(x_{ij})\, m'(x_{ij})$, as used earlier in the derivations.

7. Conclusion

If the sample size is large enough, that is, as $n \to N$ and $m \to M$, the MSE of $\hat{\bar{\bar{Y}}}$ in Equation (6.5) tends to zero for a sufficiently small bandwidth $b$. The estimator $\hat{\bar{\bar{Y}}}$ is therefore asymptotically consistent, since its MSE converges to zero.

Cite this paper

Bii, N.K., Onyango, C.O. and Odhiambo, J. (2017) Estimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement. Open Journal of Statistics, 7, 834-848. https://doi.org/10.4236/ojs.2017.75059

References

1. Singh, S. and Horn, S. (2000) Compromised Imputation in Survey Sampling. Metrika, 51, 267-276. https://doi.org/10.1007/s001840000054

2. Lee, H., Rancourt, E. and Sarndal, C. (2002) Variance Estimation from Survey Data under Single Imputation. Survey Nonresponse, 315-328.

3. Bethlehem, J.G. (2012) Using Response Probabilities for Assessing Representativity. International Statistical Review, 80, 382-399.

4. Ouma, C. and Wafula, C. (2005) Bootstrap Confidence Intervals for Model-Based Surveys. East African Journal of Statistics, 1, 84-90.

5. Onyango, C.O., Otieno, R.O. and Orwa, G.O. (2010) Generalized Model Based Confidence Intervals in Two Stage Cluster Sampling. Pakistan Journal of Statistics and Operation Research, 6. https://doi.org/10.18187/pjsor.v6i2.128

6. Dorfman, A.H. (1992) Nonparametric Regression for Estimating Totals in Finite Populations. Proceedings of the Section on Survey Research Methods, American Statistical Association, Alexandria, VA, 622-625.

7. Nadaraya, E.A. (1964) On Estimating Regression. Theory of Probability & Its Applications, 9, 141-142. https://doi.org/10.1137/1109020

8. Watson, G.S. (1964) Smooth Regression Analysis. Sankhya: The Indian Journal of Statistics, Series A, 359-372.