Paper Menu >>
Journal Menu >>
Open Journal of Marine Science, 2011, 1, 1-17 doi:10.4236/ojms.2011.11001 Published Online April 2011 (http://www.scirp.org/journal/ojms) Copyright © 2011 SciRes. OJMS Toward Estimating the Variance in Acoustic Surveys Based on Sampling Design Magnar Aksland Department of Bi olo gy, University of Bergen E-mail: magnar.aksland@bio.uib.no Received March 8, 2011; revised March 19, 2011, accepted March 22, 2011 Abstract This paper develops a sampling method to estimate the integral of a function of the area with a strategy to cover the area with parallel lines of observation. This sampling strategy is special in that lines very close to each other are selected much more seldom than under a uniformly random design for the positions of the parallel lines. It is also special in that the positions of some of the lines are deterministic. Two different vari- ance estimators are derived and investigated by sampling different man made signal functions. They show different properties in that the estimator that estimate the biggest variance gives an error interval that, in some situations, may be more than ten times the error interval computed from the other estimator. It is also obvious that the second estimator underestimates the variance. The author has not succeeded to derive an expression for the expectation of this estimator. This work is motivated towards finding the variance of acoustic abundance estimates. Keywords: Acoustic Abundance Estimation, Line Surveys, Sampling Design, Sampling in the Area, Variance Estimation 1. Introduction This paper is about to find the variance in acoustic sur- veys to estimate the abundance of fish stocks. Here, usu- ally a ship with a downward looking echo sounder that pings sound pulses into the sea and receive sound reflec- tions from fish, covers an area where the fish stock under estimation is supposed to be located. Estimating the variance of acoustic abundance esti- mates of marine resources is an old problem, but up to now not much ha s been done based on sampling design. The author of this paper has worked with acoustic es- timation of fish populations. Then it is natural that the methods will be related to the observations of the acous- tic signal generated by fish echoes, although such meth- ods will always have the potential to be used in other applications. The echo signal received from a modern echo sounder as a function of the area position of the echo sounder is an example of a function of which the area integral is necessary to estimate. This is explained in the next sec- tion. The method presented here represents the use of un- equal probability sampling design, and can be used in many sampling problems where a resource to be esti- mated are distributed over an area. Sampling design in the area or volume is a neglected field in the sampling literature, but several applications are demanding appropriated methods. This includes es- timating resources that are distributed over the area or volume. Such resources are sometimes observed con- tinuous along lines, and sometimes at distinct positions. Reference to such problems is found in Stevens and Ol- sen (2004). An important assumption for the present method is that a quantity associated with each point in an area can be measured continuously with a movable device. Then, the quantity may be observed on a system of lines cov- ering the area of interest. Here, the unequal probability design has another pur- pose than is usual. Instead of seeking inclusion prob- abilities that are positively correlated with the quantity to be observed, as for the Horvitz-Thompson estimator in sampling design, we use a sampling design that has a reduced probability to select object very close to each other relative to a uniform probability design. The pre- sent design thus produces samples that are better spatial balanced than that produced by a uniformly random de- 2 M. AKSLAND sign. The present paper is an attempt to give a contribution to the field of line sampling. It is a hope that more specialists in sampling theory will work in this field. 2. Review of Echo Integration in Terms of Energy This is a short review of a theory of echo-integration to estimate the abundance of fish. The traditional method of Echo Integration for esti- mating the abundance of sound scatters (MacLennan, 1990) is based on th e integration of the echo-signa l. This quantity represents sound energy, but the conversion of integrator values to abundance, or density, of scatters is based on “back scattering cross-section”, which repre- sents sound intensity, or power. In 1982 R. E. Craig proposed to rewrite sonar theory that is based on power, to a theory based on energy (Craig, 1983). Before this proposal, however, a method of echo integration that bases the conversion to scatter abundance on the energy of single target echoes were in use (Aksland, 1983), although the theory of the method was published in 1986 (Aksland, 1986), and published when the split beam and dual beam system were in regu- lar use (Aksland, 2005, Aksland, 2006). This alternative method of echo integration does not use the concept “back scattering cross-section” or “Target Strength”. The method is in fact based on much fewer concepts than the traditional echo integration method. The basis of the method is given below. Let 21 2 ;, I zt z tz t be the depth integrated 20 Log r TVG (Time Varied Gain) echo signal (echo inten- sity) between times 1 and 2 after sound transmission. The parameter r is the distance from the transducer. In general, the times t 2,1, ii trci ,y2 depend on the transducer position . Here c is the sou nd speed . The downward looking transducer is free to be moved within a horizontal plane with Cartesian coordinates zx , x y, here called the transducer plan. Let Q be a given region of the transducer plan. 2()d Q EAI z A (1) where d A is the area differential, is defined as the Echo Abundance within Q subject to the depth chan- nel. In applications, the dept channel may also consist of several disjoint time intervals. This may be necessary when integrating echoes from a special type of scatters while excluding others. 11 ,tz tz Let 2p I z be the integrated echo pulse intensity from a single scatter at 20 log r TVG, as it is when the echo is resolved. 2d p Q EVCI A (2) where the integral is over an area where the integrand is significant is called the Echo Value Constant of the cor- responding scatter. As for the Echo Abundance, this in- tegral is also over the different transducer positions in the transducer plane. By transforming the Echo Value Constant (2) to polar coordinates , in the object reference system (a polar reference system with origin in the scatter and 0 pointing vertically upward), and using that 2 dtanddAr , it may be expressed as 2 00 ,d tand c tr EVC b (3) where tr b is the transmit-receive beam function (here circular symmetric), c is an angle where the in- tegrand outside c is negligible, and 2 ,(,) ptr Ib is called the back-scat- tering energy of the scatter. The back-scattering energy is the same as the beam compensated integrated 40 log r TVG echo pulse received from the scatter when the transducer is in direction , in the object reference system. A necessary condition for this is that the ratio between the 40 log r TVG and 20 log r TVG functions satisfies exactly the relation 2 22 40log TVG 20log TVG4 rc rt r . Note that the Echo Value Constant of a scatter de- pends on both the level of the echo signal and the pulse length used, as well as on the beam function. For a fixed acoustic system it is a quantity that varies with the time since movement and other kinds of behaviour of the scatters will affect their Echo Value Constants. If c in (3) is replaced with a variable angle , the corresponding function is called the Echo Value, that is, 2 00 ,d tand tr EV b (4) If increases from zero, it turns out that the Echo Value increases at first, but flattens out when ap- proaches the outer part of the main lobe. When is equal to or bigger than the angle 20 where the trans- mit-receive beam has fallen 20 dB, the Echo Value is approximately constant due to the strong beam damping. It is the value of this flat region that is called the Echo Value Constant, and this property justifies its name. It can be proved that both the Echo Abundance and the Echo Value Constants of single scatters are independent of the distance between the transducer and scatters when Copyright © 2011 SciRes. OJMS M. AKSLAND 3 2 I does not include noise. These properties are due to the 20 log r TVG. Let E A EVC be the average, or mean Echo Value Con- stant over the scatters that contribute to the Echo Abun- dance (EA). It can be proved (Aksland, 1986) that 1 N E A i i EAEVC NEVC (5) where N is the number of scatters and i is the Echo Value Constant of the i-th scatter that contributes to the Echo Abundance. Relation (5) is true if the depth integral of overlapping echoes is the same as the sum of the indi- vidual integrated resolved echo pulses over all scatters. A necessary condition for this is that the phases of the ech- oes from each single scatter are uncorrelated, and that shadow effects are negligible (Zhao and Ona, 2003). EVC By writing (5) as E A EA NEVC we see that the mean Echo Value Constant correspond- ing to an Echo Abundance is a constant that converts the Echo Abundance to the nu mber of scatters. The mean Echo Value Constant may be estimated from integrated single target echoes at 40 log r TVG and corresponding detection angles provided that the echoes are representative for all echoes that contributes to the Echo Abundance. The detection angle of an echo is the angle between the beam axis and the direction between the transducer and the scatter, and can be detected within the main lobe with split beam and dual beam echo sounder systems. See Aksland, (2005), (2006) and (20 10) for details. 3. Estimating the Echo Abundance The Echo Abundance is the area integral of the echo in- tensity as a function of the transducer position over an area. The alternative echo integrator-method defines and uses the Echo Abundance, so the sampling methods to be developed here refer to this method. A review of the the- ory behind the alternative echo integrator-method is given in Section 2. Short reviews of other methods to estimate the Echo Abundance are given below. 3.1. Review of Estimating the Echo Abundance A non statistical way of estimating the Echo Abundance within a given sea area Q is to observe the echo intensi- ties 21 2 ;, I zt z tz , (see (1)), on a system of lines covering Q. Next fit some parametric class of surfaces over Q to the observed values of 2 I . Then, the volume under the fitted surface within Q will be an estimator of the Echo Abundance. An example of th is way to estimate the Echo Abundance is given in Aksland (1983). This method does not estimate the precision of the es- timated Echo Abundance. The method can, nevertheless be recommended if the precision of the abundance estimate is not very important, or if it is believed that the precision of the Echo Abun- dance is small compared to other factors of uncertainty affecting the abundance estimate. To estimate the Echo Abundance with precision is a challenge. Since acoustic data are observed along lines, application of probabilistic survey design methods are not appropriate except for special cases where each strata are covered with parallel lines of observations. This case is equivalent with sampling points on a line, where the integrated echo intensities are projected onto a line or- thogonal to the parallel lines. The parallel lines are then projected to points of observation onto the orthogonal line. This restriction of probability sampling methods may be reduced through a generalization of the foundation of probability sampling theory. Then randomization of other types of lines covering an area may be used. In acoustic abundance estimation, coverings with zigzag lines are common. Estimation of the Echo Abundance belongs to the more general problem of estimating an integral ()d Zfz z (6) where Z is a region of the Euclidean space, and f z is observed on a subset of Z with dimension commonly less than the dimension of Z. In acoustic abundance es- timation Z is of dimension 2, while the set observed has dimension 1 (a system of lines). Foote and Stefánson (1993) have described and dis- cussed different methods for estimating fish abundance over an area from line-transects. They recommend kriging methods, but have only one reference to prob- ability sampling (Cochran ). There are mainly two classes of methods that are available for estimating (6) together with an estimate of the precision of the estimate. This is methods within random field models (geostatisics, kriging) and probabil- ity survey sampling methods, respectively. 3.2. Random Field Models Here f z is considered as a realization of a random process (field) , F zzZ . Methods based on parametric models with stationary increments are elaborated and known as “Geostatistics”, (Matheron 1963). Estimators are based on predictors for Copyright © 2011 SciRes. OJMS 4 M. AKSLAND the values attained by the process outside the sample (kriging), (Krige 1951). The sample may be selected subjectively. The predictors have certain minimum vari- ance properties for given model parameters (usually trend and auto covariance functions), and these have to be estimated in applications. Objectives against the application of random field models are that it is difficult to judge whether the models are well related to the spatial distribution they are sup- posed to describe. Also the bias caused by spatial distri- butions that cannot adequately be taken to be a normal realization of the used random field model is difficult to judge. However, the fact that these methods allow subjective selected samples has lead many scientists to choose these methods. See the book by Rivoirard et al. (2000) and references therein. 3.3. Probability Survey Methods The methods within probability survey sampling are mainly concerned with the estimation of the mean 1 1N j j y N y or the total Nyof an unknown finite vector 12 ,,, N Yyy y The foundation given below follows Cassel, Särndal and Wretman (1977). The estimation is based on a sample from the set of labels 1, 2,, J N, where J must be known. A sampling design, , is a probability measure on the set of all subsets of J. Selection of a sample is done in accordance with the selected sampling design. Ps The data associated with a sample s is denoted by , s DsY, where s Y is the restriction of Y to s. In general, an estimator is a real function , tD s . The combination ,tDPs is called a strategy. The art of probability sampling is to make use of all known information about Y to construct a strategy that will give the most precise and accurate estimates for a given sampling budget. An estimator that depends only on the values of Y on the selected sample and is otherwise independent on the labels of the sample is denoted label-independent. Most of the traditional and well-known estimators in probab il- ity sampling are label-independen t. A sampling design is non-informative if Ps is in- dependent of values of Y. Otherwise the design is infor- mative. Many methods within probabilit y sampling may easily be generalized to sampling from infinite populations. Sampling from Euclidean space, and from an area in particular, are good examples, where the infinite set of points are the sampling objects. If the population from which samples are selected are a subset of a Euclidean space, each object has a position, and there is a unique distance between each pair of points. Unfortunately, such quantities associated with sample objects are very sel- dom and poorly treated in the probability sampling the- ory. In particular, when sampling a subset of the Euclid- ean space successively to estimate an integral, estimators that are label-dependent and sampling designs that are informative are likely to be better than the traditional estimators in probability sampling. Fortunately Thompson (1990) and Thompson and Se- ber (1996) has worked with adaptive sampling methods that has informative sampling designs that give better precision than conventional estimates of populations having aggregation tendencies in their area distribution. These estimation methods have been used in biological sampling as well as in acoustic surveys (Harbitz, Ona and Pennington, 2009), (Conners and Schwager, 2002), (McQuinn et al., 2005). When sampling values of some function defined on a given subset of a Euclidean space, the observed values give information about the spatial structure of the func- tion values. Information about this structure should be used when selecting the rest of the sample, as well as in constructing a good estimator. When sampling in the area, estimators that are the volume under some fitted surface to the observed data are likely to have good properties relative to other, more traditional, estimators in the sampling literature. The adapted sampling design methods mentioned above have not been developed far enough to be appropriate for all situations when estimat- ing an integral over Euclidean space. The case of spatial label sets impose some general demands on the sampling design. a) The design should reduce the selection of “spatial unbalanced” samples (7) b) For successively selected samples, the design should be informative This indicates the need for developing new methods within the field of probability sampling. Bertil Matern (1969) has pointed out this and other problems related to the application of probability sampling methods, but not much has happened with the foundation of sampling de- sign since then. Case a) in (7) above means that samples that do not cover the space properly, or are too patchy, are “spatial unbalanced”. A way to avoid th is is to stratify the sample population into n strata, where 2n is the sample size, and chose a probability sample of size 2 in each stratum. When estimating the abundance of animals with aggre- gation tendencies, wh ich are rath er usual, there is another advantage of stratifying the sampling population as fine as possible. The population variance of small strata are likely to be smaller than the population variance of big- Copyright © 2011 SciRes. OJMS M. AKSLAND Copyright © 2011 SciRes. OJMS 5 ger strata. Then, under the assumption of equal total sample sizes, the use of many small strata is likely to give more precise estimates than an estimator based on fewer and bigger strata. See Des Raj (1968), ch. 4. lines be l x and 1h x , respectively, where 1 lh xx , and the values observed are l and h. The strata width is set equal to 1 since this will not represent any loss of generality. The function y y y x is then observed at the values 0, l x , h 1 x and 1 for x . In the follow- ing, y x will be called the signal function. Case b) in (7) holds because when starting to collect a successive sample, information on the spatial structure of the population variable is gained. To increase the preci- sion, this knowledge should be used when selecting the sampling design for the positio n of the next observation. We choose the following sampling strategy for the stratum: The estimator is given by 01 1111 2lhhll h Txyxyxyxxyx 4. A Strategy for Parallel Line Survey (8) Assume that we know approximately the area where a pelagic resource is located. This resource can be ob- served with a downward looking transducer that is moved within the actual area. Cover first the area along parallel lines that runs completely through the locations occupied by the resource so that sailing between the dif- ferent lines are outside th is region. The first covering has the role of a pilot survey that is selected subjectively. This is the same as the area under the step function shown in Figure 2, and defines a label-dependent esti- mator. The sampling design is given by the following prob- ability density: ,120 1 for0, 0,1 lhlhl h lhlh f xxxxx x xxxx (9) Next return and go two parallel legs with random lo- cations between each neighbour lines of the pilot survey. The randomized legs should also go completely through the resource. Fi gu re 1 illustrates this survey. A plot of this density in terms of x for l x and y for 1h x is shown in Figure 3. It is seen that this sampling design reduces the likely- hood considerable that the lines will be selected close together, or close to some strata boundary relative to a uniform sampling design, where the probability density is constant for all x and . y In this survey, the pilot survey, wh ich is deterministic, defines strata boundaries. Two random lines are selected within each stratum, stochastically independent of the selection within other stratums. To reduce the likelihood of selecting lines very close to each other, or to close to one or both lines of the pilot survey, the position s for the lines are selected with unequal probabilities. This sampling strategy fulfils to a certain degree de- mand (7a) for a strategy for estimating an integral over a subset of the Euclidean space. The estimator (8) repre- sents the area under a simpl e int erpol ati on t o t he obser ved In a strata let the positions of the two selected rando m Figure 1. The pilot survey: thin lines. The probability survey: thick lines. The isolines illustrate the distribution of the echo intensity that is not known during observation. The whole survey is projected onto the line perpendicular to the parallel lines. M. AKSLAND Copyright © 2011 SciRes. OJMS 6 Figure 2. Graphical illustration of the estimator (area un- der step function). This is the same as the area under the straight-line curve through the observed values. Figure 3. The joint probability density for the positions of the left ( x ) and right () random lines in a stratum. y values while the sampling design reduces the likelihood for selecting lines close together. However, the present sampling design is not informative, as it is independent of any observed values from the survey. Unfortunately, this strategy does not make use of th e observed variation along the parallel lines. A generalization of this method where the sampling design depends on the observations on the strata bounda- ries, may appear in the future. The following are some statistical properties of the present strategy, which are derived in the Appendix I: The random variables l x and h x have identical marginal distributions with probability density 3 20 1, 01xx x , and the conditional density of one of the variables, , given the other, z x , is 3 1 6,0 1 zxz z x 1 x ,. These distributions are used when generating random values for l x and h x . Further 2 2 11 E,E ,E 37 1 1 E,E 28 2 ll lh l lh hl xx xx x xx xx 2 , 21 (10) 1 01 0 1234 01 0 1 E5111 6 or 1 E()()522( ) 6 Tyyxxxxyx Tyy xxxxyxdx dx (11) In general T is biased, but T is unbiased if y x is linear. Moreover, when the graph of y x 1 is a straight line over the strata, T is equal to for every 0dyx x selected positions of l x and h x . This follows from Figure 2, and implies further that Va for the class of linear functions r 0T y xabx for all real con- stants and . ab It is possible to specify a bigger class of functions yx for which T is unbiased. However, T is in general biased, and although most estimators given in the sam- pling literature are unbiased, the lack of unbiasedness for T is caused by the fact that T was selected without this in mind. It is seen from Figure 2 that T has positive bias if y x is convex and negative bias if y x is concave on 0,1 , because then y x will be below (above) the straight-line function in Figure 2. However, T is sup- posed to be summed over several strata, and it is the total bias over all strata that are important here. This is likely to be small because the individual strata biases will be both positive and negative. An in depth theoretical analysis of the bias of the sum of T over the strata seems to be difficult. Some considerations are given in the Discussion. To be able to find a formula for , the ex- pected value of is given. Var T 2 T 1 222 234562 0101 0 11 23456 23456 01 00 1223 00 11 E34 351020186d 84 2 510522d 251015102d 601d d z Tyyyyxxxxxxyxx y xxxxxyxxyx xxxxxyxx zzyzxxzxx yxxz (12) M. AKSLAND 7 Formula (12) may also be expressed with the polynomials on factorial form. 1 222 2342 0101 0 11 23 23 01 00 0, 0 1 11 E3 41328126d 84 2 151 2d1521d 601111d d xz xz Tyyyyxxxxxxyx yxxxxyxxy xxxxyxx xzxzxzyxyzxz x mean expression given here is obtained by squaring the last (11) and subtracting it from the first ex- 2). With (11) and (12), expressions for the variance and square error of T may be derived. The variance expression of pression of (1 1 22234 5 6 0 0101 0 1 1 23456 234562 1 0 0 2 1 Var( )525402066d 126 31 52040306d 351020186d 32 y Ty yyy xxxxxxyxx y 1 1 22 3 234 00 0 60 (1)()()d d2522d z x xxxxxyxxxxxxxxyxx z (13) by computing it for te signal function zz yzxx zxxyxxxxxxyxx An estimator for Var T may be obtained from (13) h y x ugh th y x devel , a program that computes this estimator has been oped. This estimator cannot take negative values since it is the variance of T fo r a gi ven sign a l function. Another variance estimator is: given by the piecewise stnction throe observed values (see Figure 2). However, it is not easy to evaluate the expectation of this estimator. Since the integrals in (13) are tractable for piecewise linear functions for raight-line fu 2 1 ˆ E111 l h xxyxxy x 22 222 2 01 01 11 Var1311 1 126 4 1 hl lh lh Tyyyy xyxxyx y 01 11 11 1 2323 1 11 2 lhhl hll yy xx xyxxyx xyxx 4 h hl (14) where 2 ˆ Eis an estimator of 2 E11 1 hll h x yxx yx not inrm that can be directly applied . Formula (14) is because an es- timator for the last line has to be inserted. The formula is derived in Appendix I. Another reason why the proposed variance estimators cannot be directly used is that it is derived for strata of with 1. This was done to simplify mathematical deriva- ut eneralized to the case with rent strata widths. a fo tions, bit may easily be g diffe Assume that we have n strata with widths ,1,L, i x in . Let (8) be denoted by i T in strata number i. Then, i T is an estimator of 1 dyx x 0, where i x xx , and x runs from 0 to i x over strata number i. Since the Echo Abundance in strata number i is equal to 0d i xyxx , we have that ii x T is an estimator of the Echo Abundance in strata number i, while 1i n ii x T corresponding estimator r the total Echo Abundance. 2 Var Var ii ii is the fo Since x Tx T tiplied with 2 i , (7) or an alternative mul x , is an estimator for Var ii x T l strata to obtain an esti , and these mama- tor for the var iance of y be summed over al 1 n ii i x T . Note that in strata number i, lli x xx , and hhi x xx , where x and ih x x zed legs in the are the absolute positions of the two randomistrata. Estimating thence mever, t t Many acoustic surveys are carried out in a way where the Results from sampling a set of artificial functions are given in Appendix II. 5. Discussion Echo Abundaay also be useful in the classic echo integrating method. Howhe pre- sent methods are independent of what quantity that is estimated. 5.1. Estimatinghe Echo Abundance Copyright © 2011 SciRes. OJMS M. AKSLAND 8 urce is done subjectively using ns. Amng such decisions there ca sa part of the survey to decide upon ring. When covering a resource sub- never selected very close to each other, ut when using sampling design, this may happen. Many the sample. Moreover, to be able to estimate e variance of the used estimator, every pair of objects ected. This is estimator o reduce the prob- ab covering of the reso common sense decisioo n be found quite wise ideas to generalize the theory of mpling design. Although most coverings are done in some systematic ways using the available knowledge of how the actual resource distributes over the area, the cruise leaders will always also want to use observation data from the finished the remaining cove jectively, legs are b cruise leaders will consider close legs as a waist of money. But within sampling design from a finite popula- tion, every object must have a positive probability to be selected in th should have a positive probability to be sel strict requirement for the Horvitz-Thompsona based on unequal probabilities given in RAJ (1968). This is the reason why close legs should have a positive probability to be selected when using sampling design to select parallel lines. There are certain problems with un- equal probability designs. See Tillé (1996), but these do not affect the present methods. There is another way to avoid the possible selection of close legs. This is when the echo intensity as a function of the area position is modelled stochastically. In Cassel, Särndal and Wretman (1977) it is shown how super population models reduce the importance of a sampling design and may even allow estimation of variances from subjectively selected samples. However, when variances are estimated from subjective samples based on some stochastic model for the area echo intensity, all prob- abilities comes from this model, and the variance esti- mates cannot be expected to be more reliable than the underlying stochastic model. An advantage with probability sampling methods to estimate populations with difficult area distribution, is that the estimators do not depend on some underlying super population model, but on man made probabilities expressed by the chosen sampling design. As long as samples are selected in accordance with the sampling design, the estimates are objective. Subjective knowl- edge is used with probability sampling methods in vari- ous ways, for instance when forming strata, and in gen- eral when deciding upon the sampling design. Whenever it is felt that an estimate based on probability sampling design is biased, or are subject to other kinds of errors, this feeling comes from knowledge that is not used when deciding about the sampling strategy. All supplementary knowledge should be used when deciding the sampling strategy. 5.2. Miscellaneous This paper shows that it is possible t ility to select legs close to each other, but the probabil- ity should not be zero. An estimator that is biased within strata in general may bother someone, but since it is the bias of the estimator summed over all strata that is of importance, the bias within strata is not serious. The bias for every signal function y x for which (8) can be integrated can be computed analytically. If 2 yx x , which is convex, the bias is + 10.7%. When 2 1yx x , the bias has the same absolute value, but is negative. If the likelihood to observe signal functions with biases that cancel each other is similar, then the bias of the estimated sum over many strata will tend to zero when the number of strata increases. However, this is not likely to occur exactly. An example that may occur is a function that alue over tima is zero every except fo a short distance by a fish y clstrata tor within the actual stmay ha where caused ose to a rata nt fu r a very high school. If the boundary, the ve a consider- v school happens to sta es able negative bias. The opposite situation to this, where the function has a high value everywhere except for a short distance where it is zero, is very unlikely to occur. However, to find out more about the bias of the estimator (8) summed over strata, the best study would be to com- pute by programming the true value as well as the ex- pectation for a lot of differenctions y x defined over many strata. The study in Appendix II th light on this problem, but more signal functions and rs are needed. However, the figures in Ap- pendix II show that the bias usually shrinks when the number of strata is increased from 5 to 10. Otherwise, the figures in Appendix II indicate that the difference between (13) and (14) is less when the sampling density is small and when sampling signal functions with sharp and big variations. Some may have noted that the estimator i T and 1i T rows some strata numbe corresponding to strata no i and i + 1 both contain the observed value on the common strata line. But this does not violate the stochastic independence of i T and 1i T because observed values from the pilot survey (on strata boundaries) are not stochastic. 5.3. The Estimators of Variance The two estimators developed in this papy not be the best. The estimator based on (13) is obviously too small. This is indicated in the Appendix II. Estimator (14) is almost unbiased in theory, but Appendix II indicates that it is too big. The interval, Estimate SD T, based on (14), con- tains the true value in almost all cases in Appendix II, as well as in other cases. However, the square root of an unbiased estimator of the variance of an estimator has an er ma expectation that is less than the expectation of the Stan- Copyright © 2011 SciRes. OJMS M. AKSLAND 9 the estimator. Therefore, if dard Deviation of Var T is unbiased, Estimate Var T, should nt be gra than the Estimate SD T in average. If the estimator based on (13) has much less expecta- tion than Var T, it is dangerous to increase it with a constant factor, because the right factor may vary with the shape of the signal function o , and as well on the sam- why this estimator in all strata where dance generated by a fish population is th This me stimator uce estima ber of s. It is not sure th ignal fu ter pling density. One possible reason estimates a too small variance, is that the observed numbers 0 y, l y, h y and 1 y all happen to have values close to some straight line, the estimated variance in these strata are close to zero. A difference between the used man made signal func- tions in the Appendix II and real signal functions based on the Echo Abun at the latter is not static. ans that observed val- ues are hardly equal if they are observed twice at differ- ent times. As the man made signal functions used are continuous with continuous derivative, a function like this will converge to linear within strata when the strata widths goes to zero. Therefore, the variance e based on (13) may prodtes that tend too fast to zero when the numtrata increases at the estimator has this property if it used on real non-static snctions. Another way to find an estimator for Var T is to express it as a multiple polynomial 01 Var( )ijmn ijmnl h Tcyyyy where ijmn c are constants, or polynomials in l x and h x . The problem is to find coefficients ijmn c so that EVar VarTT. This is a generalization of the method given in Des Raj (1968) to find an unbiased variance estimator when sampling with unequal prob- abilities. The present author has not yet succeeded to find an estimator by this method. 5.4. Sampling with Unequal Probabilities In the literature about sampling design, interesting results about sampling with unequal probabilities have been derived (see Des Raj (1968)). It may be tempting to try the Horvitz-Thompson estimator and variance estimator here, but in the present case this estimator is not appro- prt make ndaries. Also, there are different reasons to apply unequal probabilities in the present case and iHorvitz- son’s n spread the sample bet random sampling stra The general reason for selecting unequal probabilities ties can e chosen that are positive correlated with the variable to pefully, the re is important as a special riance of an estimated integral based iate. This is because the estimator does no use of the observed values on the strata bou n the Thomp case. The reason for choosing a not uiform random sampling strategy when sampling an area is to ter than is obtained with a uniform tegy. Living resources’ have social behaviour, and then it is believed that the echo intensities at positions close to each other are seldom very different. This is the reason to spread the sample in the present case. in the Horvitz-Thompson’s case is that probabili b be observed. 5.5. The Future The author of this paper does not look upon the present results as a finite solution. It is more a start of using sampling design to estimate the variance. Ho sults of this paper build on some principles that are new in sampling design. The combination of a subjective and a randomized covering where the estimator is label dependent and depends on both deterministic and ran- domized observations is not common. It is a hope that generalized methods building on this principle can be developed. A real challenge is to combine the common subjective coverings with additional randomized obser- vations for estimating the variance. Use of adaptive sampling strategies in sampling design is difficult. But if sampling designs can be based on the non-random observations from a deterministic part of the survey, the same variance reductions may be obtained with less statistical difficulties. 6. Conclusions A special sampling strategy is proposed for covering an area with parallel lines of observation. The strategy con- sists of a deterministic covering followed by a random- ized covering between the deterministic covering. A label dependent estimator is proposed that depends on both deterministic and randomized observations. The sampling design is with unequal probabilities with the purpose to produce better spatial balanced samples. The theoretical Expectation and Variance of this esti- mator are derived, and two estimators of the variance have been found. Further properties of the estimators of variance are studied by sampling man made functions. This study showed that one of the estimators is likely to underestimate the variance. The two variance estimators may not be the best for the proposed sampling strategy. However, if the pro- posed strategy is generalized and based on similar prin- ciples, the results in this paper case. stimating the vaE on a line sample requires a generalization of the founda- tion of sampling design. This is a big job. Copyright © 2011 SciRes. OJMS M. AKSLAND Copyright © 2011 SciRes. OJMS 10 d [3] na E. Estimation and compensation models for the shadowing effect in dense fish aggregations. ICES ournal theron G. Principles of geostatistics. Economic . Osney Mead, Oxford, 2000, 206 pp. dn. John Wiley and r, G. A. F. Adaptive Sampling. ., Simard Y., Stroud W. F., Beaulieu J. L., ic cod (Gadus morhua) with es- Y. McGraw-Hill Publish- nomie et de 7. References [1] Stevens D. L. and Olsen A. R. Spatially BalanceSam-[13] Rivoirard J. Simmonds K. G., Foote P., Fernandes and N. Bez. Geostatistics for estimating fish stock abundance Blackwell Science, pling of Natural Resources. Journal of the Americal Sta- tistical Association 99, No. 465, 2004, 262-278. [2] D. N. MacLennan. Acoustical measurement of fish abundance. J. Acoust. Soc. Am.; vol. 87(1): 1990, 1-15. Craig R. E. Re-definition of sonar theory in terms of en- ergy. In: Selected papers of the ICES/FAO Symposium on fisheries acoustics. FAO Fish. Rep; (300), 1983, 331 p. [4] Aksland M. Acoustic abundance estimation of the spawning component of the local herring stock in Lin- daaspollene, western Norway. FiskDir. Skr.Ser. HavUn- ders. 1983, 297-334. [5] Aksland M. Estimating numbers of pelagic fish by echo integration. J. Cons. int. Explor. Mer. 43, 1986, 7-25. [6] Aksland M. An alternative echo-integrating method. ICES Journal of Marine Science 62, 2005, 226-235. [7] Aksland M. Applying an alternative method of echo-integration. ICES Journal of Marine Science 63, 2006, 1438-1452. [8] Zhao X. and O tima Journal of Marine Science 60, 2003, 155-163. [9] Aksland M. Analysing Estimators in the Alternative Echo Integrator Method. The Open Ocean Engineering J [20] 3, 2010, 116-128. [10] Foote K. G., and Stefánson G. Definition of the problem ing C of estimating fish abundance over an area from acoustic line-transect measurements of density. ICES Journal of Marine Science 50, 1993, 369-381. [11] Ma Geol- Statistique No 44, 1996, 177-189. ogy 58, 1963, 1246-1266. [12] Krige D. G. A statistical approach to some mine valua- tions and allied problems at the Witwatersrand. Master's thesis of the University of Witwatersrand. 1951 [14] Cassel C. M., Särndal C. E. and Wretman J. H. Founda- tions of Inference in Survey Sampling. John Wiley & Sons, 1977. [15] Thompson, S. K. Sampling, 2nd e Sons, New York, 2002. [16] Thompson, S. K., and Sebe John Wiley and Sons, New York, 1996. [17] Harbitz A., Ona E. and Pennington M. The use of an adaptive acoustic-survey design to estimate the abun- dance of highly skewed fish populations. ICES Journal of Marine Science 66, 2009, 1-6. [18] Conners M. E. and Schwager S. J. The use of adaptive cluster sampling for hydroacoustic surveys. ICES Journal of Marine Science 59, 2002, 1314–1325. [19] McQuinn I. H and Walsh S. J. An adaptive, integrated “acoustic-trawl” survey design for Atlant tion of the acoustic and trawl dead zones. ICES Journal of Marine Science 62, 2005, 93-106. Matern B. Sample Survey Problems. Bulletin of the In- ternational Statistical Institute Vol. XLII, Book 1, 1969, 143-154. [21] Des Raj. SAMPLING THEOR ompany LTD, 1968. [22] Tillé Y. Some remarks on unequal probability designs without replacement. Annales D’Éco M. AKSLAND 11 Appendix I Derivation of Results Let us write the probability distribution (9) as ,120 1,0,0,1fxzxzx z xzx z (15) Where x and are variables for l z x and h x , re- spectively. Note that this distribution is symmetric in the variables x and . This means that the marginal di- stributions of z l x and h x are identical, and EE,0, and EE nm mn lhl h lh hl xxxxmn xx xx 0 The marginal density of l x is obtained by integrating (15) with respect to . z 11 00 12 0 3 ,d 1201d 120 1d 20 1 xx x f xz zxzxz z x xz zz xx Hence, the marginal density is given by 3 201, 01 g xxx x (16) Now 13 2 0 12345 0 E201d 203 3d 1E 3 l h xxxx x xxx x x (17) and 11 01,01 11 11 00 11 112 00 23 11 0 13 1 0 E120 1dd 1dd 120 1d 11 1 120d (18) 23 120 E1d 23 nmn m lh xz z z mn z mnn nn m n nm m lh xxxzx zxz zxxzxz zzxxx zz z zz nn xxzz z nn By using special cases of (18), the expectation formu- las except the conditional in (10) are derived. The conditional distribution of h x given l x is given by 3 3 ,1201 20 1 1 6,0 11 f xzxzx z hzx gx xx zxz z x x (19) The conditional exp ectation follows as: 12 03 61 d 1 2 1 x hl zxzz x Ex xxx (20) Below, the relation 11 00 d1d f xxf xx is used some times. To show the relation, change 1 x x in the integral. By using (8), 01 01 1 EEEE1 2 E1 1 (21) 6 1E1E1 1 2 lh hl lh hll h Tyxyx xyx xy x yy xyxxy x Next evaluate the two expectations in the brackets by using (9 ). 01 01 11 00 E1 12011d d 12011dd hl x zx x xyx zxzx zyxxz x yxzzx zzx By integrating with respect to x , the following expres- sion is obtained. 1245 0 E1102 2d hl x yxxxxx yxx (22) Likewise, 01 01 11 00 E1 1 120111d d 120111d d lh xz z z xy x x xzxzyzx y zyzxxxzx z By integrating with respect to x , and then substitute 1z with x , dd x z , we get. 1345 0 E111023d lh x yx xxxyx x (23) By combining (21), (22) and (23), an expression for ETis obtained. Copyright © 2011 SciRes. OJMS 12 M. AKSLAND 1 01 0 1234 01 0 1 E( )5111d 6(24) 1522 d 6 Tyy xxxxyxx yyxx xxyxx Formula (24) has been checked with the functions for 2 2 ,and1abxx x y x. These results are identical with the results obtained by deriving ETdi- rectly from (8) and using (10). Deriving formula (12) is a bit tedious. It follows from (8) that 2 201 222 22 01 01 01 4 (1)()(1)(1) ()(1)()(1)(1) 2(1)(1)()(25) 2(1)(1)(1) 2(1)(1)()(1 lh hllh lhh llh lhhh l ll hlh hll h Tyxyxxyxxyx y xyxxyxxy x yxxyxx yx yxxyxx yx xxyxyx ) Then, the expectation of each term is derived by using (9) and (10) and trying to split double integrals. 222 0101 01 1 E3 21 lh yx yxyyyy 4 (26) 22 22 0, 0 1 11 22 22 00 E1 12011d d 120111d d hl xz xz x xyx zxzxzyxxz x yxxz z zz zx By evaluating the integral with respect to and con- tracting, the following result are obtained: z 22 12562 0 E1 23553d hl xyx x xxxyx x (27) Next 22 22 0, 0 1 11 22 22 00 E11 120111d d 1201111d d lh xz xz z xy x xxzxzyzxz zyzzxxxxx z Integrating with respect to x , contracting and chang- ing the integrator variable with z1 x , gives 22 134562 0 E11 21020 13 3d lh xy x x xxxyx Continuing 0, 0 1 11 22 00 E1 12011d d 120111d d lhl xz xz x xxyx xzxz xzyxxz x yxxzz zzzx Evaluating and contracting gives 12356 0 E1 1022d lhl xxyx x xxxyx x (29) The next term follows. 0, 1 11 23 00 E1 12011dd 1201(11dd hhl xzo xz x xxyx zzxzxzyxxz x yxx zzzzzx Evaluating and compressing gives 12456 0 E1 22 510103d hhl xxyx x xxxxyx x (30) Next term 2 0, 1 11 23 00 E1 120111d d 12011(11d d llh xzo xz x xxyx xxzxzyzxz zyzzxxxxx z Evaluating and compressing 1456 0 E11 2583 d ll h xxyx x xxyx x (31) The next last term 0, 0 1 11 22 00 E1 120111d d 1201111d d hlh xz xz x xxyx zxxzxzyzxz zyzzxx xxzx x (28) Integrating and changing with 1z x in the inte- gral with respect to gives z Copyright © 2011 SciRes. OJMS M. AKSLAND Copyright © 2011 SciRes. OJMS 13 13456 0 E1 1 10 254d hl h xxyx x xxxyx this were obtained by the control. The derivation of (11) and (12) do not involve complicated mathematics, but rather an extensive algebraic task with risk of making errors. x (32) And finally the last term 0, 0 1 E1 1133 1201111d d lhl h xz xz xxyxyx x zxzxzy x yzxz This section is concluded with deriving (14). There may be several ways to obtain an estimator for Var T. The version derived here is rather strai g h t f orward. Since 2 2 Var EETT T 2 T, we try to estimate both terms. As is an unbiased estimator of 2 ET, we can use and subtract an estimator for 2 T 2 EETT 2 . 2 T This formula could be used for the last term in (12). However, it is possible to derive alternative versions. One is given here. 11 00 1 0 12 0 1223 00 E111 1201111d d 120 11 11 1dd 1201d d lhl h z z z xxyxyx zzyz xx xzyxx zzyz zxxxxyx xz zzyzzxxxx yxxz is given by (25), but the expectation of the first term is computable from the data. Therefore is replaced by its expected 2 01 lh yx yx 0 l yx y 2 1 h x value 01 01 34 21 yy yy 22 1 in the expression for . 2 T z This will reduce the variability of the estimator. This is confirmed by trying both versions during sampling arti- ficial signal functions, and the version with the first terms equal to 22 01 01 34yy yy21 is smallest on all occasions. An expression for 2 ET follows from (21). 2 201 01 2 1 E36 1E11 1 6 1E11 1 4 hll h hll h Tyy yyxyxxy x xyxxy x Hence 1 0 223 0 E111 120 1 dd lhl h z xxyxyx zzyz x xzxxyxxz (34) Using the arguments in the expectation operators as estimators for the expectations, and performing the sub- traction 22 ˆ ET Inserting (26) – (33) or (34) i nt o ( 25 ) gi ves (12). Formula (12) has been controlled with a linear func- tion for y x. for linear functions, and Var 0TT the following estimator is ob- tained. ٛ 22 222 2 01 01 01 2 11 Var( )1311 126 4 11 111 2323 11 111E111 24 hl lh lhhllh hll hhll h Tyyyy xyxxyx yy xx xyxxyx xyxxyx xyxxyx ٛ This expression is identical with (14). The last line cannot be estimated unbiased by using , as this would yield an over estimate of 2 111 hll h xyxxy x 2 E1hl 1 1 l h x yx 22 Var EE x yx , This follows from the general result, X XX, that holds for any random vari- able X . However, and it is believed that the variance term is small relative to the expectation terms in general. Therefore, (14) with 2 1111 4hl l h xyxxy x as the last line was used in Appendix II. 2 2 E11 1 E11 1 Var 111 hll h hll h hll h xyxxy x xyxxy x x yxx yx Estimator (14) depends on and even if the signal function is linear. This means that it takes values that in general are different from zero even when the variance is zero, as in the linear case. To try estimator (14), a program that calculates it has been made. Results from using (14) in the main paper are given in Appendix II. xlxh 14 M. AKSLAND Estimator (14) depends on l x and h x even if the sig- nal function is linear. This means that it takes values that in general are different from zero even when the variance is zero, as in the linear case. To try estimator (14), a pro- gram that calculates it has been made. Results from using (14) in the main paper are given in Appendix II. In some difficult cases the variance estimator takes negative values. This happens on some sampling occa- sions with 5 strata, but also more seldom with 10 strata on the signal functions a and b in Appendix II. The stan- dard deviation is set to zero in such cases. Appendix II Sampling man made functions The two variance formulas based on (13) and given by (14) have been tried by sampling several artificial signal functions. An immediate impression from these investi- gations is that the two variance estimates show different properties with the variance estimator based on (13) be- ing the smallest. The advantage with sampling known artificial signal functions is that the parameter to be estimated can be calculated as well as the expected value (11) and varia- nce (13) of the estimator (8) used. From this, the bias and deviation between estimated and true value follows. In the present case, however, the variance (13) was not computed. From repeated independent samples the vari- ability of the estimate is shown, and th is indicates some- thing about the th eoretical variability of the estimator. Results from 30 repeated independent samples of four different signal functions are given. Each signal function is sampled with 5 and 10 strata for comparison. The signal functions were constructed by means of the 10 first terms in a Fourier series and added some higher frequencies. To ensure that the signal functions are non-negative, they were put equal to zero if they were negative. There are many parameters in such functions, and it is possible to make any function shape. However, since real signal functions are integrated echo intensities over depth and along a fixed direction in the area, they will seldom show very sharp variations. The strata widths are constant and equal to one for 5 strata, and half width for 10 strata. Then the integral over the signal function is the same for 5 and 10 strata. However, when fish concentrate to schools, real signal functions may be difficult to sample, in particular when the area of distribution is small. The signal functions a and b shown in Figure 4 and 6 are of this kind. The results of 30 repeated independent samplings are shown in Figure 5. Note that the y-axis is similar in these plots. It is seen from Figure 5 that the error interval based on (14) seems not to be wider than the error interval Figure 4. The first signal function (a) sampled with 5 and 10 strata, respectively. On the first sampling occasion, the function was observed at the locations where there are ver- tical lines, including the grid. Figure 5. Estimate, true value, expected value and estimates of two error bounds for 30 independent samples of signal function a based on sampling with 5 (upper) and 10 strata (low er graph). Copyright © 2011 SciRes. OJMS M. AKSLAND 15 based on (13). It is also seen that the error interval based on (14) has zero width several places, especially for 5 strata. For each signal function that was sampled 30 times, it was counted how many times the error intervals con- tained the true value, and how many times they con- tained the expected value of the estimator. For signal function a, these numbers are given in Table 1. The next signal function (b) is shown in Figure 6. This is also a difficult function to estimate. The result of 30 independent samples of signal func- tion b is shown in Figure 7. Here is seen that the two error interv als seems to be of the same order for 5 strata, but for 10 strata the error in- terval based on (14), main paper, is the biggest. The numbers of times the error intervals contain the true value and the expected value of the estimator are given in Table 2. See Table 1 for more information. The next signal function is shown in Figure 8. This Table 1: Numbers of times the error interval contained the true value and expected value, respectively out of 30 inde- pendent samples of signal function a. 5 strata 10 strata Estimate ± 2SD based on (13) 13 13 24 21 Estimate ± SD based on (14) 11 7 17 18 Figure 6. Signal function b sampled with 5 and 10 strata, respectively. The first sampling locations within each stra- tum are shown as red vertical lines. Figure 7. Estimate, true value, expected value and two es- timates of error bound based on 30 independent samples of signal function b based on sampling with 5 (upper) and 10 strata (lower graph). Table 2: Numbers of times the error interval contained the true value and expected value, respectively out of 30 inde- pendent samples of signal function b. 5 strata 10 strata Estimate ± 2SD based on (13) 13 13 24 25 Estimate ± SD based on (14) 15 17 30 30 Figure 8. Signal function c sampled with 5 and 10 strata, respectively. The sampling positions in each stratum are shown as vertical lines. Copyright © 2011 SciRes. OJMS M. AKSLAND 16 values reaches a maximum before the area under signal func- tio sed on (14) is the biggest bo one of gnal function (d) is shown in Figure 10. Th he area under signal func- tio te that the error intervals st e error intervals for signal fu function start with low it declines and goes to zero. The results of estimating n c is shown in Figure 9. Here the error interval ba th for 5 and 10 strata. This case is seldom in that the bias for 10 strata is bigger that the bias for 5 strata. The numbers of times the error intervals contain two parameters are given in Table 3. See Table 1 for explanation. The last si is is a function with small sharp variations, and is an easy function to estimate. The results of estimating t n d are shown in Figure 11. Figure 11 and others indica abilize, gets narrower as well as less variable when the functions are sampled with 10 strata compared with 5 strata. But it also seems that the error interval based on (13) decreases faster than that based on (14) when the sampling density increases. The numbers of times th nction d contain one of two parameters are given in Table 4. See Table 1 for more explanation. Figure 9. Estimate, true value, Expected value and two es able 3. Numbers of times the error interval contained the 5 strata 10 strata - timates of the error bounds for 30 independent samples of signal function c based on sampling with 5 (upper) and 10 strata (lower graph). T true value and expected value, respectively out of 30 inde- pendent samples of signal function c. Esti Figure 10. Signal function d sampled with 5 and 10 strata, respectively. The function is observed at the x-positions of the vertical lines, including the grid at the first sampling occasion. Figure 11. Estimate, true value, Expected value and tw es the error interval contained the trata 10 strata o estimates of the error bounds for 30 indepe ndent samples of signal function d based on sampling with 5 (upper) and 10 strata (lower graph). able 4. Numbers of timT true value and expected value, respectively out of 30 inde- pendent samples of signal function d. 5 s mate ± 2SD based on (13) 11 14 11 18 Estimate ± SD based on (14) 30 30 30 30 Estimate ± 2SD based on (13) Estimate ± SD based on (14) 29 30 30 30 22 25 8 26 Copyright © 2011 SciRes. OJMS M. AKSLAND Copyright © 2011 SciRes. OJMS 17 signn d sat t3) shrsider sa y biased. It is not ea by (14) has not good properties, as icult to es- tim The results from samplingal functiohows th he estimator based on (1inks conable from mpling with 5 strata to sampling with 10 strata. That the unknown value is within the error interval in only 8 cases for 10 strata is not good, but the expected value is contained in the error interval 26 of 30 times. Since the expected value is rather close to the true value here, it may be concluded that not many estimates are far from the true value. But this may, nevertheless, indicate an unfortunate property with th is estimator in that it may be considerably to small if the sampling density is high while the signal function is smoo th. The Figures in Appendix II indicate that the variance estimator based on (13) is negativel sy, although not impossible, to derive the expectation of this estimator. Therefore, the estimator is tried on arti- ficial signal functions. The estimator given the estimates within strata may be negative. When all within strata estimates are summed over several strata, the resulting variance has better properties, but based on the present figures it is hard to be lieve that this estimator is nearly unbiased. Note that the error interval based on (14) is calculated as Estimate ± estimated standard de- viation, while that based on (13) is calculated as Estimate ± two times the estimated standard deviation. Some of the chosen signal functions are diff ate, in particular signal function a and b as shown in the Figures 4 and 6. This may occur in fisheries acous- tics when fish concentrate to schools. To improve preci- sion in such cases, the sampling effort has to be in- creased. |