Open Journal of Statistics
Vol.1 No.2(2011), Article ID:6550,6 pages DOI:10.4236/ojs.2011.12010
Sequential Test of Fuzzy Hypotheses
Department of Statistics, Faculty of Sciences, University of Birjand, Southern Khorasan, Iran
E-mail: g_z_akbari@yahoo.com
Received May 22, 2011; revised June 10, 2011; accepted June 17, 2011
Keywords: Canonical Fuzzy Number, Fuzzy Hypotheses, Type I and II Error Sizes, Sequential Probability Ratio Test
Abstract
In testing statistical hypotheses, as in other statistical problems, we may be confronted with fuzzy concepts. This paper deals with the problem of testing hypotheses, when the hypotheses are fuzzy and the data are crisp. We first give new definitions for notion of mass (density) probability function with fuzzy parameter, probability of type I and type II errors and then state and prove the sequential probability ratio test, on the basis of these new errors, for testing fuzzy hypotheses. Numerical examples are also provided to illustrate the approach.
1. Introduction
Statistical analysis, in traditional form, is based on crispness of data, random variable, point estimation, hypotheses, parameter and so on. As there are many different situations in which the above mentioned concepts are imprecise. On the other hand, the theory of fuzzy sets is a well known tool for formulation and analysis of imprecise and subjective concepts. Therefore the sequential probability ratio test with fuzzy hypotheses can be important. The problem of statistical inference in fuzzy environments are developed in different approaches.
Delgado et al. [1] consider the problem of fuzzy hypotheses testing with crisp data. Arnold [2,3] presents an approach to test fuzzily formulated hypotheses, in which he considered fuzzy constraints on the type I and II errors. Holena [4] considers a fuzzy generalization of a sophisticated approach to exploratory data analysis, the general unary hypotheses automaton. Holena [5] presents a principally different approach and motivates by the observational logic and its success in automated knowledge discovery. Neyman-pearson lemma for fuzzy hypotheses testing and Neyman-pearson lemma for fuzzy hypotheses testing with vague data is given by Taheri et al. and Torabi et al. [6,7]. Filzmoser and Viertl [8] present an approach for statistical testing at the basis of fuzzy values by introducing the fuzzy p-value. Some methods of statistical inference with fuzzy data, are reviewed by Viertl [9]. Buckley [10,11] studies the problems of statistical inference in fuzzy environment. Thompson and Geyer [12] proposed the Fuzzy p-values in latent variable problems. Taheri and Arefi [13] exhibit an approach for testing fuzzy hypotheses based on fuzzy test statistics. Parchami et al. [14] consider the problem of testing hypotheses, when the hypotheses are fuzzy and the data are crisp. they first introduce the notion of fuzzy p-value, by applying the extension principle and then present an approach for testing fuzzy hypotheses by comparing a fuzzy p-value and a fuzzy significance level, based on a comparison of two fuzzy sets.
In present work, we first define a new approach for obtaining the probability (density) function, when the random variable is crisp and the parameter of interest is imprecise (fuzzy). Also, the type I and type II errors are introduced based on fuzzy hypotheses. Then, the sequential probability ratio test (SPRT) is defined and extended based on such hypotheses.
We organize the matter in the following way:
In section 2 we describe some basic concepts of fuzzy hypotheses, density (Mass) probability function with fuzzy parameter and necessary definitions. In section 3 we come up sequential probability ratio test based on fuzzy hypotheses. In section 4 the previous definitions and the sequential probability ratio test will be illustrated by examples.
2. Preliminaries
In this section we describe fuzzy hypotheses, density (Mass) probability function with fuzzy parameter and necessary definitions.
Let be a probability space, a random variable (RV)
is a measurable function from
to
, where
is the probability measure induced by
and is called the distribution of the RV
, i.e.,
If is dominated by a
finite measure
, i.e.
then by the Radon-Nikodym theorem (Billingsley, [15]), we have
where is the Radon-Nikodym derivative of
with respect to
and is called the probability density function of
with respect to
. In a statistical context, the measure
is usually a “counting measure” or a “Lebesgue measure”, hence
is
or
, respectively.
2.1. Canonical Fuzzy Numbers
Let be the “support” or “sample space” of
, then a fuzzy subset
of
is defined by its membership function
. We denote by
the
cut set of
and
is the closure of the set
, and
1) is called a normal fuzzy set if there exists
such that
;
2) is called a convex fuzzy set if
for all
;
3) is called a fuzzy number if
is a normal convex fuzzy set and its
cut sets, are bounded
;
4) is called a closed fuzzy number if
is a fuzzy number and its membership function
is upper semicontinuous;
5) is called a bounded fuzzy number if
is a fuzzy number and the support of its membership function
is compact.
If is a closed and bounded fuzzy number with
and
and its membership function be strictly increasing on the interval
and strictly decreasing on the interval
, then
is called a canonical fuzzy number (Klir and Yuan, [16]).
The fuzzy canonical numbers (such as triangular or trapezoidal fuzzy numbers) are very realistic in fuzzy set theory, so we use this numbers for our goal.
2.2. Fuzzy Hypotheses
We define some models, as fuzzy sets of real numbers, for modeling the extended versions of the simple, the one-sided, and the two-sided ordinary (crisp) hypotheses to the fuzzy ones.
Testing statistical hypothesis is a main branch of statistical inference. Typically, a statistical hypothesis is an assertion about the probability distribution of one or more random variable(s). Traditionally, all statisticians assume the hypothesis for which we wish provide a test are well-defined. This limitation, sometimes, force the statistician to make decision procedure in an unrealistic manner. This is because in realistic problems, we may come across non-precise (fuzzy) hypothesis. For example, suppose that is the proportion of a population which have a disease. We take a random sample of elements and study the sample for having some idea about
. In crisp hypothesis testing, one uses the hypotheses of the form:
versus
or
versus
, and so on. However, we would sometimes like to test more realistic hypotheses. In this example, more realistic expressions about
would be considered as: “small”, “very small”, “large”, “approximately 0.2”, “essentially larger” and so on. Therefore, more realistic formulation of the hypotheses might be
is small, versus
is not small. We call such expressions as fuzzy hypotheses.
We define some models, as fuzzy sets of real numbers, for modeling the extended versions of the simple, the one-sided, and the two-sided crisp hypotheses to the fuzzy ones (Akbari and Rezaei, [17]).
Definition 2.1 Let be a real number and known.
1) Any hypothesis of the form is called to be a fuzzy simple hypothesis.
2) Any hypothesis of the form is called to be a fuzzy two-sided hypothesis.
3) Any hypothesis of the form
is called to be a fuzzy right one-sided hypothesis.
4) Any hypothesis of the form
is called to be a fuzzy left one-sided hypothesis.
We denote the above definitions by
2.3. Density (Mass) Probability Function
Let is a RV and let
be the “support” or “sample” space of
and
where is the membership function of canonical fuzzy hypothesis and
is its
-cuts.
We call the new density as the fuzzy probability density (mass) function (FPDF) of
(Akbari and Rezaei [18]). We note that,
and
(substitute the summation by integral in discrete cases).
Let be arbitrary function in
. Then we define
Let be a random sample, with observed value
, where
has the FPDF
with unknown
. For testing
we state the following definitions:
Definition 2.2 Let be a test function. The probability of type I error of
is
and the probability of type II error of
is
.
Definition 2.3 A teat is said to be a test of level
if
, where
.
we call the size of
.
3. Sequential Probability Ratio Test
Consider testing a null fuzzy hypothesis against a alternative fuzzy hypothesis. In other words, suppose a sample can be drawn from one of two FPDFs and it is desired to test that the sample came from one distribution against the possibility that is came from the other. If denotes the random variables, we want to test
versus
. The simple likelihood-ratio test was of the following form:
The sequential test that we propose to consider employs the likelihood-ratios sequentially. Define
for and compute sequentially
for fixed
and
satisfying
, adopt the following procedure: take observation
and compute
; if
, reject
; if
, accept
; and if
, take observation
, and compute
. If
, reject
; if
, accept
; and if
, take observation
, and etc. The idea is to continue sampling as long as
and stop as soon as
or
, rejecting
if
and accepting
if
. The critical region of the described sequential test can be define as
, where
Similarly, the acceptance region can be defined as, where
When we considered the simple likelihood-ratio test for fixed sample size, we determined
so that the test would have preassigned size
. We know want to determine
and
so that the sequential probability ratio test will have preassigned
and
for its respective sizes of type I and type II errors. Note that
and
where, as before, is a shortened notation for
.
For fixed and
, the above equations are two equations in the two unknown
and
. A solution of these two equations would give the sequential probability ratio test having the desired preassigned error sizes
and
. As might be anticipated, the actual determination of
and
from above equations can be a major computational project.
We note that the sample size of a sequential probability ratio test is a random variable. The procedure says to continue sampling until first falls outside the interval
. The actual sample size then depend on which
s observed; it is a function of the random variables
and consequently is itself a RV. Denote it by
. Ideally, we would like to know the distribution of
or at least the expectation of
. One way of assessing the performance of the sequential probability ratio test would be to evaluate the expected sample size that is required under each hypothesis. The following lemma, given without proof (Lehmann, [20]), state that the sequential probability ratio test with crisp hypotheses is an optimal test if performance is measured using expected sample size. We can similarly prove this lemma with fuzzy hypotheses based on introduced FDPF.
Lemma 3.1 The sequential probability ratio test with error sizes and
minimizes both
and
among all tests which satisfy the following:
,
, and the expected sample size is finite.
We noted above that the determination of and
that defines that particular sequential probability ratio test which has error sizes
and
is in general computationally quite difficult. The following lemma (with simple proof) gives an approximation to
and
.
Lemma 3.2 Let and
be defined so that the sequential probability ratio test corresponding to
and
has error sizes
and
; then
and
can be approximated by, say
and
, where
Lemma 3.3 Let and
be the error sizes of the sequential probability ratio test defined by
and
given in before lemma. Then
.
Naturally, one would prefer to use that sequential probability ratio test having the desired preassigned error sizes and
; however, since it is difficult to to find the
and
corresponding to such a sequential probability ratio test, instead one can use that sequential probability ratio test defined by
and
of before equation and be assured that the the sum of the error sizes
and
is less than or equal to the sum of the desired error sizes
and
.
The procedure used in performing a sequential probability ratio test is to continue sampling as long as and stop sampling as soon as
or
. If
, an equivalent test is given by the following: continue sampling as long as
, and stop sampling as soon as
or
. As before, let
be a RV denoting the sample size of the sequential probability ratio test, and let
.
If the sequential probability ratio test leads to rejection of, then the RV
, but
is close to
since
first became less than or equal to
at the
th observation; hence
. Similarly
; hence
, where
. Using Wald
s equation (Casella and Berger, [19])
we obtain
and
4. Numerical Examples
In this section, we illustrate the proposed approach for some distributions and use the ability of package “Maple 6” [21] for this examples.
Example 4.1 (Taheri and Behboodian, [6]) Let be a continues r.v. with PDF
we want to test
where the membership functions and
are defined in the following way:
We can interpret and
as the value of
“” and “
”.
Let and
. We obtain
,
. Hence,
, and we must take
whereas
, thus we take
.
Example 4.2 Let be a random sample where
population, i.e.,
and s are our fuzzy hypotheses with membership functions given by:
We can interpret and
as the value of “
” and “
”.
Let. Hence,
, and we must take
, whereas
, thus we take
.
Example 4.3 Let be a random sample where
population, i.e.,
and s are our triangular fuzzy parameters with membership functions
for.
We can interpret the canonical parameters as having values that are “near to”.
Let,
,
and
. Hence,
, and we must take
whereas
, thus we take
.
Example 4.4 Let be a RV from the
population, i.e.,
and s are our trapezoidal fuzzy parameters with membership functions given by:
for.
Let,
and
. If
, then
, and we must take
, whereas
, thus we take
.
5. Conclusions
In this paper, an new approach for sequential test of fuzzy hypotheses based on fuzzy hypotheses for onesample and two-sample when the available data are crisp, is presented. As for this paper, it sound the introduced method is very simple and applicable in the statistics and other sciences.
Extension of the proposed method to test the variance, correlation and parameters of linear models (regression models), design of experiment is a potential area for the future work. Furthermore, we can construct sequential test of fuzzy hypotheses based on intuitionistic fuzzy hypotheses or fuzzy data for the parameters of interest.
6. References
[1] M. Delgado, J. L. Verdegay and M. A. Vila, “Testing Fuzzy-Hypotheses: A Bayesian Approach,” In: M. M. Gupta, Ed., Approximate Reasoning in Expert Systems, Elsevier Science Ltd, New York, 1995, pp. 307-316.
[2] B. F. Arnold, “An Approach to Fuzzy Hypothesis Testing,” Metrika, Vol. 44, No. 1, 1996, pp. 119-126. doi:10.1007/BF02614060
[3] B. F. Arnold, “Testing Fuzzy Hypothesis with Crisp Data,” Fuzzy Sets and Systems, Vol. 94, No. 3, 1998, pp. 323-333. doi:10.1016/S0165-0114(96)00258-8
[4] M. Holena, “Fuzzy Hypotheses for GUHA Implications,” Fuzzy Sets and Systems, Vol. 98, No. 1, 1998, pp. 101-125. doi:10.1016/S0165-0114(96)00369-7
[5] M. Holena, “Fuzzy Hypotheses Testing in the Framework of Fuzzy Logic,” Fuzzy Sets and Systems, Vol. 145, No. 2, 2004, pp. 229-252. doi:10.1016/S0165-0114(03)00208-2
[6] S. M. Taheri and J. Behboodian, “Neyman-Pearson Lemma for Fuzzy Hypotheses Testing,” Metrika, Vol. 49, No. 1, 1999, pp. 3-17. doi:10.1007/s001840050021
[7] H. Torabi, J. Behboodian and S. M. Taheri, “NeymanPearson Lemma for Fuzzy Hypotheses Testing withVague Data,” Metrika, Vol. 64, No. 3, 2006, pp. 289-304. doi:10.1007/s00184-006-0049-8
[8] P. Filzmoser and R. Viertl, “Testing Hypotheses with Fuzzy Data: The Fuzzy p-value,” Metrika, Vol. 59, No. 1, 2004, pp. 21-29. doi:10.1007/s001840300269
[9] R. Viertl, “Univariate Statistical Analysis with Fuzzy Data,” Computational Statistics and Data Analysis, Vol. 51, No. 1, 2006, pp. 133-147. doi:10.1016/j.csda.2006.04.002
[10] J. J. Buckley, “Fuzzy Probabilities: New Approach and Applications,” Springer-Verlag, Berlin, 2005.
[11] J. J. Buckley, “Fuzzy Probability and Statistics,” Springer -Verlag, Berlin, 2006.
[12] E. A. Thompson and C. J. Geyer, “Fuzzy p-values in Latent Variable Problems,” Biometrika, Vol. 94, No. 1, 2007, pp. 49-60. doi:10.1093/biomet/asm001
[13] S. M. Taheri and M. Arefi, “Testing Fuzzy Hypotheses Based on Fuzzy Statistics,” Soft Computing, Vol. 13, No. 6, 2009, pp. 617-625. doi:10.1007/s00500-008-0339-3
[14] A. Parchami, S. M. Taheri and M. Mashinchi, “Fuzzy p-value in Testing Fuzzy Hypotheses with Crisp Data,” Statistical Papers, Vol. 51, No. 1, 2010, pp. 209-226. doi:10.1007/s00362-008-0133-4
[15] P. Billingsley, “Probability and Measure,” 2nd Edition, John and Wiley, New York, 1995.
[16] G. Klir and B. Yuan, “Fuzzy Sets and Fuzzy Logic-Theory and Applications,” Prentice-Hall, Upper Saddle River, 1995.
[17] M. G. Akbari and A. Rezaei, “Bootstrap Testing Fuzzy Hypotheses and Observations on Fuzzy Statistic,” Expert Systems with Applications, Vol. 37, No. 8, 2010, pp. 5782- 5787. doi:10.1016/j.eswa.2010.02.030
[18] M. G. Akbari and A. Rezaei, “Discrete and Continuous Random Variables with Fuzzy Parameter,” Far East Journal of Theoretical Statistics, Online, 2009.
[19] G. Casella and R. L. Berger, “Statistical Inference,” 2nd Edition, Duxbury Press, Belmont, 2002.
[20] E. L. Lehmann, “Testing Statitical Hypotheses,” Chapman and Hall, London, 1994.
[21] Maple 6, Waterloo Maple Inc. Waterloo, Canada.