Some existing fuzzy regression methods impose special requirements on the object of study, such as assuming that the observed values are symmetric triangular fuzzy numbers or imposing a non-negativity constraint on the regression parameters. In this paper, we propose a left-right fuzzy regression method that is applicable to various forms of observed values. We present a fuzzy distance and a partial order between two left-right (LR) fuzzy numbers, take the mean fuzzy distance between the observed and estimated values as the mean fuzzy error, and then minimize the mean fuzzy error to obtain the regression parameters. We adopt two criteria, the mean fuzzy error (compared by means of the partial order) and SSE, to compare the performance of the proposed method with that of other methods. Finally, four different types of numerical examples are given to illustrate the feasibility and wide applicability of the proposed method.
As a widely used statistical method, regression analysis is playing a more and more important role in model establishment and prediction evaluation. The traditional regression analysis requires accurate data, but in practice, the information obtained in many cases is not accurate. In this regard, Zadeh [
For the first time, Tanaka [
In 1988, Diamond [
In recent years, scholars have improved and extended the fuzzy regression method, but some methods are not rigorous in their error estimation, and some impose special requirements on the object of study, for example that the observations be symmetric triangular fuzzy numbers or that the regression parameters be non-negative. Moreover, in multivariate fuzzy linear regression, these models give unsatisfactory results when the two inputs differ greatly in magnitude. Therefore, Roldan [
The main purpose of this paper is to introduce a fuzzy regression method using the fuzzy distance between left-right fuzzy numbers. To this end, Section 2 gives some preliminary theory on the left-right fuzzy distance and partial order. Section 3 studies some properties of the LR fuzzy distance and determines the concrete formula for calculating it. In Section 4, the fuzzy distance is used to define the mean fuzzy error, the minimized mean fuzzy error is taken as the objective function, and stepwise regression is used to solve it. In Section 5, we apply the proposed model and other previous models to four different types of examples, and compare the models by SSE and by the mean fuzzy error based on the partial order.
We will use the following notation for fuzzy numbers. Let $I = [0, 1]$, $\mathbb{R}_0^- = (-\infty, 0]$ and $\mathbb{R}_0^+ = [0, \infty)$.
Definition 2.1. (Aguilar [
Let $F$ be the set of all FNs (with compact support). Thus, for each $\alpha \in I$ the α-level set $A_\alpha$ of $A$ is a compact subinterval of $\mathbb{R}$ that can be expressed as $A_\alpha = [\underline{a}_\alpha, \overline{a}_\alpha]$, where $\underline{a}_\alpha$ is the inferior extreme and $\overline{a}_\alpha$ is the superior extreme of the interval $A_\alpha$. Following this notation, we will also denote the support of $A$ by $A_0 = [\underline{a}_0, \overline{a}_0]$. The number $D_c A = (\underline{a}_1 + \overline{a}_1)/2$ is the center of the FN $A$, and its radius is $\operatorname{spr} A = (\overline{a}_1 - \underline{a}_1)/2$.
Proposition 2.1. (Wu [
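$$(A \oplus B)_\alpha = [\underline{a}_\alpha + \underline{b}_\alpha,\ \overline{a}_\alpha + \overline{b}_\alpha] \tag{1}$$
$$(A \otimes B)_\alpha = [\min(\underline{a}_\alpha \underline{b}_\alpha,\ \underline{a}_\alpha \overline{b}_\alpha,\ \overline{a}_\alpha \underline{b}_\alpha,\ \overline{a}_\alpha \overline{b}_\alpha),\ \max(\underline{a}_\alpha \underline{b}_\alpha,\ \underline{a}_\alpha \overline{b}_\alpha,\ \overline{a}_\alpha \underline{b}_\alpha,\ \overline{a}_\alpha \overline{b}_\alpha)] \tag{2}$$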
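As an illustration of (1) and (2), the following minimal sketch applies the two operations to a single pair of α-cuts. The names `add_cut` and `mul_cut` and the representation of an α-cut as a `(lower, upper)` tuple are assumptions made for this example only.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower, upper) endpoints of one alpha-cut

def add_cut(a: Interval, b: Interval) -> Interval:
    """Alpha-cut of A (+) B: add lower to lower and upper to upper, as in Eq. (1)."""
    return (a[0] + b[0], a[1] + b[1])

def mul_cut(a: Interval, b: Interval) -> Interval:
    """Alpha-cut of A (x) B: min and max of the four endpoint products, as in Eq. (2)."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

# Illustrative alpha-cuts (hypothetical values, not taken from the paper).
a_cut = (-1.0, 2.0)
b_cut = (3.0, 4.0)
sum_cut = add_cut(a_cut, b_cut)    # (2.0, 6.0)
prod_cut = mul_cut(a_cut, b_cut)   # (-4.0, 8.0)
```

The product needs all four endpoint combinations because one of the intervals may contain negative values, as `a_cut` does here.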
Definition 2.2. (Dubois [
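$$A(x) = \begin{cases} L\left(\dfrac{x - a_1}{a_2 - a_1}\right), & a_1 < x < a_2, \\ 1, & a_2 \le x \le a_3, \\ R\left(\dfrac{a_4 - x}{a_4 - a_3}\right), & a_3 < x < a_4, \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$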
where L , R : I → I are strictly increasing, continuous mappings such that L ( 0 ) = R ( 0 ) = 0 and L ( 1 ) = R ( 1 ) = 1 . Clearly, the kernel of A is [ a 2 , a 3 ] and its support is [ a 1 , a 4 ] . Let F L R be the family of all LRFN.
Triangular fuzzy numbers (TFN) are special cases of LRFN (denoted by $A = (a_1/a_2/a_3/a_4)_{LR}$) with $L(x) = R(x) = x$ for all $x \in I$ and $a_1 \le a_2 = a_3 \le a_4$. For short, we will denote a triangular fuzzy number by $A = (a_1/a_2/a_4)_T$. Let $F_T$ be the family of all TFN.
Proposition 2.2. Let $L, R : I \to I$ be strictly increasing, continuous mappings and let $a_1 \le a_2 \le a_3 \le a_4$ be real numbers. Then there exists a unique LRFN $A$ such that, for each $\alpha \in I$,
$$\underline{a}_\alpha = \underline{a}(\alpha) = a_1 + (a_2 - a_1) L^{-1}(\alpha) \tag{4}$$
$$\overline{a}_\alpha = \overline{a}(\alpha) = a_4 - (a_4 - a_3) R^{-1}(\alpha) \tag{5}$$
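The endpoint formulas (4) and (5) can be sketched as a small function. The function name `alpha_cut` and the default choice of linear shape functions (the triangular or trapezoidal case, where $L^{-1}(\alpha) = R^{-1}(\alpha) = \alpha$) are assumptions for illustration.

```python
from typing import Callable, Tuple

def alpha_cut(a1: float, a2: float, a3: float, a4: float, alpha: float,
              L_inv: Callable[[float], float] = lambda t: t,
              R_inv: Callable[[float], float] = lambda t: t) -> Tuple[float, float]:
    """Endpoints of the alpha-cut of (a1/a2/a3/a4)_LR, following Eqs. (4) and (5).

    L_inv and R_inv are the inverses of the shape functions L and R;
    the identity defaults correspond to linear sides.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    lower = a1 + (a2 - a1) * L_inv(alpha)
    upper = a4 - (a4 - a3) * R_inv(alpha)
    return lower, upper

# Hypothetical LR fuzzy number (1/2/3/5) with linear sides.
support = alpha_cut(1, 2, 3, 5, 0.0)   # (1.0, 5.0): the support [a1, a4]
kernel = alpha_cut(1, 2, 3, 5, 1.0)    # (2.0, 3.0): the kernel [a2, a3]
half = alpha_cut(1, 2, 3, 5, 0.5)      # (1.5, 4.0)
```

A non-linear side is obtained by passing the inverse shape function, for example `L_inv=lambda t: t ** 0.5` for $L(x) = x^2$.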
Definition 2.3. (Alfonso [
A partition of the interval $I$ is a set $P = \{\delta_0, \delta_1, \cdots, \delta_n\}$ such that $0 = \delta_0 < \delta_1 < \cdots < \delta_n = 1$. The simplest partition of $I$ is $P_0 = \{0 = \delta_0 < \delta_1 = 1\}$. If $P = \{\delta_i\}_{i=0}^{n}$ is a partition of $I$ and $f : S \to \mathbb{R}$ is a mapping defined on $S \supseteq I$, we will denote, for all $i \in \{1, 2, \cdots, n\}$,
$$\Delta f_i = \Delta f[\delta_{i-1}, \delta_i] = \frac{f(\delta_i) - f(\delta_{i-1})}{\delta_i - \delta_{i-1}} \tag{6}$$
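Therefore, for the α-level sets $A_\alpha$ of $A = (a_1/a_2/a_3/a_4)_{LR}$, we have
$$\Delta \underline{a}_i = \Delta \underline{a}[\delta_{i-1}, \delta_i] = \frac{\underline{a}(\delta_i) - \underline{a}(\delta_{i-1})}{\delta_i - \delta_{i-1}} = \frac{(a_2 - a_1)\,[L^{-1}(\delta_i) - L^{-1}(\delta_{i-1})]}{\delta_i - \delta_{i-1}} \tag{7}$$
$$\Delta \overline{a}_i = \Delta \overline{a}[\delta_{i-1}, \delta_i] = \frac{\overline{a}(\delta_i) - \overline{a}(\delta_{i-1})}{\delta_i - \delta_{i-1}} = \frac{(a_4 - a_3)\,[R^{-1}(\delta_{i-1}) - R^{-1}(\delta_i)]}{\delta_i - \delta_{i-1}} \tag{8}$$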
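The difference quotients (6)-(8) can be computed over any partition as sketched below. The helper name `difference_quotients` and the example fuzzy number with linear sides are assumptions for illustration.

```python
from typing import Callable, List, Sequence

def difference_quotients(f: Callable[[float], float],
                         partition: Sequence[float]) -> List[float]:
    """Return [f(d_i) - f(d_{i-1})] / (d_i - d_{i-1}) for i = 1..n, as in Eq. (6)."""
    if partition[0] != 0 or partition[-1] != 1:
        raise ValueError("a partition of I must start at 0 and end at 1")
    if any(d1 <= d0 for d0, d1 in zip(partition, partition[1:])):
        raise ValueError("partition points must be strictly increasing")
    return [(f(d1) - f(d0)) / (d1 - d0)
            for d0, d1 in zip(partition, partition[1:])]

# Hypothetical LR fuzzy number (1/3/3/5) with linear sides, so by (4)-(5)
# the lower endpoint is 1 + 2*alpha and the upper endpoint is 5 - 2*alpha.
lower = lambda alpha: 1 + (3 - 1) * alpha
upper = lambda alpha: 5 - (5 - 3) * alpha

P = [0, 0.5, 1]
delta_lower = difference_quotients(lower, P)   # [2.0, 2.0]  == a2 - a1
delta_upper = difference_quotients(upper, P)   # [-2.0, -2.0] == -(a4 - a3)
```

With linear sides the quotients are constant on every subinterval, which matches (7) and (8) when $L^{-1}$ and $R^{-1}$ are the identity.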
Definition 2.4. (Hierro [
1) ρ ( x , x ) = 0 S ;
2) if ρ ( x , y ) = 0 , then x = y ;
3) ρ ( x , y ) = ρ ( y , x ) ;
4) ρ ( x , z ) ⊑ s ( ρ ( x , y ) , ρ ( y , z ) ) .
we also say that ( S ,0 S , s ) is a metric space w.r.t. the partial order ⊑ . The function ρ is :
1) a pseudometric if it satisfies 1), 3) and 4);
2) a semimetric (on ( S ,0 S ) ) if it satisfies 1) - 3);
3) a pseudosemimetric (on ( S ,0 S ) ) if it satisfies 1) and 3).
Definition 2.5. (Hierro [
Theorem 2.1. (Hierro [
Theorem 2.2. (Hierro [
Definition 2.6. (Hierro [
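$$\underline{D(A,B)}_\alpha = q_0\,\phi_0(D_c A, D_c B) - q\,\psi(\operatorname{spr} A, \operatorname{spr} B) - q_1(\alpha) \sum_{i=1}^{n} \phi_i(\Delta \underline{a}_i, \Delta \underline{b}_i) \tag{9}$$
$$\overline{D(A,B)}_\alpha = q_0\,\phi_0(D_c A, D_c B) + q\,\psi(\operatorname{spr} A, \operatorname{spr} B) + q_2(\alpha) \sum_{i=1}^{n} \varphi_i(\Delta \overline{a}_i, \Delta \overline{b}_i) \tag{10}$$
Let $D(A,B)$ be the unique LRFN determined by these α-cuts.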
Theorem 3.1. If q 1 , q 2 are the standard negation, A , B ∈ F L R , then D ( A , B ) ∈ F L R . In addition, if q 0 , q > 0 , then D : F L R × F L R → F L R is a semimetric on ( F L R , 0 ˜ ) .
Proposition 3.1. (Hierro [
1) D ( A , A ) = 0 ˜ ;
2) if D ( A , B ) = 0 ˜ , then A = B ;
3) D ( A , B ) = D ( B , A ) ;
4) D ( A , C ) ≼ D ( A , B ) + D ( B , C ) .
whatever the metrics ϕ 0 , ψ , ϕ i , φ i and the partition P .
Therefore, D is a metric on ( F L R , 0 ˜ ) .
Proposition 3.2. Let D be a metric on ( F L R , 0 ˜ ) , A , B , C ∈ F L R , then we have:
1) D ( 1 ˜ , 0 ˜ ) = q 0 ϕ 0 ( 1,0 ) ;
2) If A ≼ B ≼ C , ρ ( x , y ) ≤ ρ ( x , z ) when | y − x | ≤ | z − x | where ρ = ϕ 0 , ψ , ϕ i , φ i , then D ( A , B ) ≼ D ( A , C ) , D ( B , C ) ≼ D ( A , C ) ;
3) If ρ ( k x , k y ) = k ρ ( x , y ) where ρ = ϕ 0 , ψ , ϕ i , φ i , k ∈ ℝ + , then D ( k A , k B ) = k D ( A , B ) .
Proposition 3.3. Let $A = (a_1/a_2/a_3/a_4)_{LR}$ and $B = (b_1/b_2/b_3/b_4)_{LR}$. Then $A \preccurlyeq B$ w.r.t. $(D_c, P)$ only if $a_2 + a_3 \le b_2 + b_3$, $a_3 - a_2 \le b_3 - b_2$, $a_2 - a_1 \le b_2 - b_1$ and $a_4 - a_3 \le b_4 - b_3$.
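The four inequalities of Proposition 3.3 are easy to check numerically. The sketch below does so for two LR fuzzy numbers given by their corners. Note that the proposition states these as necessary conditions ("only if"); using them as a practical comparison rule, as this sketch does, is an assumption. The function name is hypothetical, and the two example errors are the KC and PM rows of the Example 2 results table, written with a repeated kernel.

```python
from typing import Tuple

Corners = Tuple[float, float, float, float]  # (a1, a2, a3, a4)

def satisfies_order_conditions(A: Corners, B: Corners) -> bool:
    """Check the four inequalities of Proposition 3.3 for A and B."""
    a1, a2, a3, a4 = A
    b1, b2, b3, b4 = B
    return (a2 + a3 <= b2 + b3          # centre
            and a3 - a2 <= b3 - b2      # kernel width
            and a2 - a1 <= b2 - b1      # left spread
            and a4 - a3 <= b4 - b3)     # right spread

# Mean fuzzy errors from the Example 2 table (kernel repeated: a2 = a3).
pm_error = (0.2269, 0.2866, 0.2866, 0.3463)
kc_error = (0.2268, 0.2868, 0.2868, 0.3468)

pm_not_worse = satisfies_order_conditions(pm_error, kc_error)   # True
kc_not_worse = satisfies_order_conditions(kc_error, pm_error)   # False
```

Because the order is only partial, both calls can return `False` for a given pair, in which case the two errors are not comparable under these conditions.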
Theorem 3.2. Assume that, in Definition 2.6, we choose $q = q_0 = 1$, $q_1(\alpha) = q_2(\alpha)$ the standard negation, $\phi_0(x, y) = \psi(x, y) = (x - y)^2$, and $\phi_i(x, y) = \varphi_i(x, y) = (\delta_i - \delta_{i-1})^2 (x - y)^2$ for all $i \in \{1, 2, \cdots, n\}$ in their respective domains. If $A = (a_1/a_2/a_3/a_4)_{LR}$ and $B = (b_1/b_2/b_3/b_4)_{LR}$, then we define the distance measure $D(A, B)$ as:
$$D(A, B) = (d_1/d_2/d_3/d_4)_{LR} \tag{11}$$
where
$$\begin{aligned} d_1 &= (a_2 - b_2)(a_3 - b_3) - [(a_2 - a_1) - (b_2 - b_1)]^2, \\ d_2 &= (a_2 - b_2)(a_3 - b_3), \\ d_3 &= [(a_3 - b_3)^2 + (a_2 - b_2)^2]/2, \\ d_4 &= [(a_3 - b_3)^2 + (a_2 - b_2)^2]/2 + [(a_4 - a_3) - (b_4 - b_3)]^2. \end{aligned} \tag{12}$$
Proof. Let $A = (a_1/a_2/a_3/a_4)_{LR}$ and $B = (b_1/b_2/b_3/b_4)_{LR}$, and start from (9)-(10). First of all, notice that
$$q_0\,\phi_0(D_c A, D_c B) = (D_c A - D_c B)^2 = [(a_2 + a_3) - (b_2 + b_3)]^2/4 \tag{13}$$
$$q\,\psi(\operatorname{spr} A, \operatorname{spr} B) = (\operatorname{spr} A - \operatorname{spr} B)^2 = [(a_3 - a_2) - (b_3 - b_2)]^2/4 \tag{14}$$
On the other hand, by (7),
$$\phi_i(\Delta \underline{a}_i, \Delta \underline{b}_i) = (\delta_i - \delta_{i-1})^2 (\Delta \underline{a}_i - \Delta \underline{b}_i)^2 = [(a_2 - a_1) - (b_2 - b_1)]^2\,[L^{-1}(\delta_i) - L^{-1}(\delta_{i-1})]^2 \tag{15}$$
Similarly, by (8),
$$\varphi_i(\Delta \overline{a}_i, \Delta \overline{b}_i) = [(a_4 - a_3) - (b_4 - b_3)]^2\,[R^{-1}(\delta_{i-1}) - R^{-1}(\delta_i)]^2 \tag{16}$$
Then we have
$$\underline{D(A,B)}_\alpha = \frac{[(a_2 + a_3) - (b_2 + b_3)]^2}{4} - \frac{[(a_3 - a_2) - (b_3 - b_2)]^2}{4} - q_1(\alpha)\,[(a_2 - a_1) - (b_2 - b_1)]^2 \sum_{i=1}^{n} [L^{-1}(\delta_i) - L^{-1}(\delta_{i-1})]^2 \tag{17}$$
$$\overline{D(A,B)}_\alpha = \frac{[(a_2 + a_3) - (b_2 + b_3)]^2}{4} + \frac{[(a_3 - a_2) - (b_3 - b_2)]^2}{4} + q_2(\alpha)\,[(a_4 - a_3) - (b_4 - b_3)]^2 \sum_{i=1}^{n} [R^{-1}(\delta_{i-1}) - R^{-1}(\delta_i)]^2 \tag{18}$$
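Finally, we evaluate (17)-(18) at $\alpha = 0$ and $\alpha = 1$, where the standard negation gives $q_1(0) = q_2(0) = 1$ and $q_1(1) = q_2(1) = 0$. Writing $u = a_2 - b_2$ and $v = a_3 - b_3$, the first two terms combine as $(u + v)^2/4 - (v - u)^2/4 = uv$ and $(u + v)^2/4 + (v - u)^2/4 = (u^2 + v^2)/2$. The stated corners correspond to the simplest partition $P_0$, for which both sums in (17)-(18) equal 1. Hence
$$\begin{aligned} d_1 &= \underline{D(A,B)}_0 = (a_2 - b_2)(a_3 - b_3) - [(a_2 - a_1) - (b_2 - b_1)]^2, \\ d_2 &= \underline{D(A,B)}_1 = (a_2 - b_2)(a_3 - b_3), \\ d_3 &= \overline{D(A,B)}_1 = [(a_3 - b_3)^2 + (a_2 - b_2)^2]/2, \\ d_4 &= \overline{D(A,B)}_0 = [(a_3 - b_3)^2 + (a_2 - b_2)^2]/2 + [(a_4 - a_3) - (b_4 - b_3)]^2. \end{aligned} \tag{19}$$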
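The closed form (12) translates directly into code. The following is a minimal sketch, where the function name `lr_distance` and the example fuzzy numbers are assumptions for illustration; only the four corners of each fuzzy number enter the formula.

```python
from typing import Tuple

Corners = Tuple[float, float, float, float]  # (a1, a2, a3, a4)

def lr_distance(A: Corners, B: Corners) -> Corners:
    """Corners (d1, d2, d3, d4) of the fuzzy distance D(A, B) from Eq. (12)."""
    a1, a2, a3, a4 = A
    b1, b2, b3, b4 = B
    u = a2 - b2                          # difference of left kernel endpoints
    v = a3 - b3                          # difference of right kernel endpoints
    left = (a2 - a1) - (b2 - b1)         # difference of left spreads
    right = (a4 - a3) - (b4 - b3)        # difference of right spreads
    d2 = u * v
    d3 = (u * u + v * v) / 2
    d1 = d2 - left ** 2
    d4 = d3 + right ** 2
    return (d1, d2, d3, d4)

# Hypothetical LR fuzzy numbers given by their corners.
A = (1, 2, 3, 5)
B = (0, 2, 4, 5)
d = lr_distance(A, B)   # (-1, 0, 0.5, 1.5)
```

As the example shows, the left corner `d1` can be negative even though `d2`, `d3` and `d4` describe squared differences; negative left corners also appear in the mean fuzzy errors reported in the numerical examples.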
In this section, the regression methodology that takes the minimization of the mean fuzzy error as its objective is introduced. The regression parameters are obtained by the least squares method based on Theorem 3.2, and Proposition 3.3 is used to determine which model has the least fuzzy error.
Let $\tilde{X} = (\tilde{x}_1, \cdots, \tilde{x}_N)'$ be a random vector of $N$ LR fuzzy inputs. Taking the four corners of each fuzzy input as explanatory variables gives a crisp random vector $X' = (x_{1,1}, x_{1,2}, x_{1,3}, x_{1,4}, \cdots, x_{N,1}, x_{N,2}, x_{N,3}, x_{N,4})$, which we relabel as $X = (x_1, x_2, x_3, \cdots, x_{4N})$. The LR fuzzy response variable is $\tilde{Y} = (y_1/y_2/y_3/y_4)_{LR}$, and we suppose there are $n$ samples. We will analyze the relationship between $\tilde{Y}$ and $X$. The regression model we consider can be formalized as:
$$\tilde{Y} = (\hat{y}_1 + \varepsilon_1 / \hat{y}_2 + \varepsilon_2 / \hat{y}_3 + \varepsilon_3 / \hat{y}_4 + \varepsilon_4)_{LR}, \tag{20}$$
where $\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4$ are the residuals (i.e., real-valued random variables such that $E[\varepsilon_1 | X] = E[\varepsilon_2 | X] = E[\varepsilon_3 | X] = E[\varepsilon_4 | X] = 0$) and the estimated variable $\hat{\tilde{Y}} = (\hat{y}_1/\hat{y}_2/\hat{y}_3/\hat{y}_4)_{LR}$ is formalized as:
$$\hat{y}_k = b_{0,k} + \sum_{i=1}^{4N} b_{i,k}\, x_i, \qquad k = 1, 2, 3, 4, \tag{21}$$
where $b_{0,k}$ and $b_{i,k}$ ($k = 1, 2, 3, 4$ denotes the $k$th corner) are the regression parameters for $\hat{y}_1$, $\hat{y}_2$, $\hat{y}_3$ and $\hat{y}_4$, respectively.
Considering the distance measure $D$ defined in (11), we take the distance between the observed and the estimated values as the objective function to be minimized. In other words, we use the mean fuzzy error $\varepsilon$ as the objective function, formalized as:
$$\min:\ \varepsilon = \frac{1}{n} \sum_{j=1}^{n} D(\tilde{Y}_j, \hat{\tilde{Y}}_j) \tag{22}$$
that is, we are looking for a function $\hat{\tilde{Y}} = (\hat{y}_1/\hat{y}_2/\hat{y}_3/\hat{y}_4)_{LR}$ such that the mean fuzzy error $\varepsilon$ in (22) is as small as possible w.r.t. the partial order $\preccurlyeq$ on $F_{LR}$. If the regression function $\hat{\tilde{Y}}$ is given by (21), then the mean fuzzy error is:
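$$\min:\ \varepsilon = \frac{1}{n} \sum_{j=1}^{n} (\varepsilon_{j,1}/\varepsilon_{j,2}/\varepsilon_{j,3}/\varepsilon_{j,4})_{LR} = \left( \frac{1}{n} \sum_{j=1}^{n} \varepsilon_{j,1} \Big/ \frac{1}{n} \sum_{j=1}^{n} \varepsilon_{j,2} \Big/ \frac{1}{n} \sum_{j=1}^{n} \varepsilon_{j,3} \Big/ \frac{1}{n} \sum_{j=1}^{n} \varepsilon_{j,4} \right)_{LR} \tag{23}$$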
where
$$\begin{aligned} \varepsilon_{j,1} &= \underline{D(\tilde{Y}, \hat{\tilde{Y}})}_{j,0} = (y_{j,2} - \hat{y}_{j,2})(y_{j,3} - \hat{y}_{j,3}) - [(y_{j,2} - y_{j,1}) - (\hat{y}_{j,2} - \hat{y}_{j,1})]^2, \\ \varepsilon_{j,2} &= \underline{D(\tilde{Y}, \hat{\tilde{Y}})}_{j,1} = (y_{j,2} - \hat{y}_{j,2})(y_{j,3} - \hat{y}_{j,3}), \\ \varepsilon_{j,3} &= \overline{D(\tilde{Y}, \hat{\tilde{Y}})}_{j,1} = [(y_{j,3} - \hat{y}_{j,3})^2 + (y_{j,2} - \hat{y}_{j,2})^2]/2, \\ \varepsilon_{j,4} &= \overline{D(\tilde{Y}, \hat{\tilde{Y}})}_{j,0} = [(y_{j,3} - \hat{y}_{j,3})^2 + (y_{j,2} - \hat{y}_{j,2})^2]/2 + [(y_{j,4} - y_{j,3}) - (\hat{y}_{j,4} - \hat{y}_{j,3})]^2. \end{aligned}$$
As a consequence, we must minimize each component to obtain the optimal solution. In summary, each of the four corners $y_1, y_2, y_3, y_4$ of $\tilde{Y}$ can be related to the $x_i$ ($i = 1, 2, \cdots, 4N$).
For each of the possible combinations of models for $y_1, y_2, y_3, y_4$, we calculate a mean fuzzy error using the previous equation. Finally, we sort the fuzzy models using the partial order $\preccurlyeq$ and choose, as the optimal solution of the fuzzy regression problem, the fuzzy model with the lowest fuzzy error.
The main purpose of this paper is to develop an LR fuzzy methodology that is both easy to understand and powerful. For this reason, we use the stepwise linear regression method to solve the problem. Stepwise regression is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. Before each new variable is introduced, unimportant explanatory variables are eliminated so that only significant variables remain in the regression equation. The final set of explanatory variables is regarded as optimal. We propose to consider models obtained by this approach in the solution of the fuzzy problem.
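To make the variable-selection idea concrete, the following sketch performs a simplified forward selection for one corner of the response. It is not the stepwise procedure used in the paper: as a stated simplification, it adds variables greedily by SSE reduction with a fixed threshold instead of significance tests, and it never removes a variable once added. All function names, the threshold `min_gain` and the toy data are hypothetical.

```python
def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting.

    Assumes M is non-singular (no exactly collinear columns).
    """
    n = len(v)
    aug = [list(row) + [rhs] for row, rhs in zip(M, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z

def ols_sse(X, y, cols):
    """Least squares fit of y on an intercept plus the chosen columns of X.

    Returns (coefficients, sum of squared residuals).
    """
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    p = len(cols) + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    beta = solve_linear(XtX, Xty)
    sse = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, t in zip(rows, y))
    return beta, sse

def forward_select(X, y, min_gain=1e-6):
    """Greedy forward selection: repeatedly add the column that lowers SSE most."""
    selected = []
    beta, best = ols_sse(X, y, selected)
    remaining = list(range(len(X[0])))
    while remaining:
        trials = [(ols_sse(X, y, selected + [c]), c) for c in remaining]
        (cand_beta, cand_sse), col = min(trials, key=lambda t: t[0][1])
        if best - cand_sse <= min_gain:
            break  # no remaining column improves the fit enough
        selected.append(col)
        remaining.remove(col)
        beta, best = cand_beta, cand_sse
    return selected, beta, best

# Toy data (hypothetical): the response depends on column 0 only, y = 1 + 2 * x0.
X = [[1, 5, 2], [2, 3, 7], [3, 8, 1], [4, 1, 6], [5, 9, 4], [6, 2, 3]]
y = [3, 5, 7, 9, 11, 13]
selected, beta, sse = forward_select(X, y)
```

In the fuzzy setting, such a selection would be run once per corner $\hat{y}_k$, with the corners of all fuzzy inputs as candidate columns.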
Applying the previous methodology, we obtain the fuzzy regression models $\hat{\tilde{Y}}$. To evaluate the goodness-of-fit of the different models, we consider the following two statistics:
1) The mean fuzzy error $\varepsilon$ given by (23);
2) $SSE = \dfrac{1}{4} \sum_{k=1}^{4} \sum_{j=1}^{n} (y_{j,k} - \hat{y}_{j,k})^2$.
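Both criteria can be computed from the corners of the observed and estimated responses. The sketch below assumes the corner-wise distance of Theorem 3.2 for the mean fuzzy error; the function names and the two toy observations are hypothetical.

```python
def lr_distance(A, B):
    """Corners (d1, d2, d3, d4) of the fuzzy distance of Theorem 3.2."""
    a1, a2, a3, a4 = A
    b1, b2, b3, b4 = B
    u, v = a2 - b2, a3 - b3
    d2 = u * v
    d3 = (u * u + v * v) / 2
    d1 = d2 - ((a2 - a1) - (b2 - b1)) ** 2
    d4 = d3 + ((a4 - a3) - (b4 - b3)) ** 2
    return (d1, d2, d3, d4)

def mean_fuzzy_error(observed, estimated):
    """Corner-wise average of D(Y_j, Y_hat_j) over the samples, as in Eq. (23)."""
    n = len(observed)
    totals = [0.0, 0.0, 0.0, 0.0]
    for obs, est in zip(observed, estimated):
        for k, d in enumerate(lr_distance(obs, est)):
            totals[k] += d
    return tuple(t / n for t in totals)

def sse(observed, estimated):
    """SSE criterion: squared corner residuals summed over samples, divided by 4."""
    return sum((y - y_hat) ** 2
               for obs, est in zip(observed, estimated)
               for y, y_hat in zip(obs, est)) / 4

# Two hypothetical observations and their estimates, as (y1, y2, y3, y4) corners.
observed = [(1, 2, 2, 3), (2, 4, 4, 5)]
estimated = [(1, 2, 2, 3), (2, 3, 3, 5)]

eps = mean_fuzzy_error(observed, estimated)   # (0.0, 0.5, 0.5, 1.0)
total_sse = sse(observed, estimated)          # 0.5
```

The mean fuzzy error is itself a fuzzy number, so two models are compared with the partial order rather than by a single number, while SSE gives a crisp summary.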
Example 1. Triangular fuzzy observations
In this example, the fuzzy input-output data from Sakawa [
We use these data to regress the fuzzy response variable $\tilde{Y} = (y_1/y_2/y_4)_T$ on the fuzzy explanatory variable $\tilde{X} = (x_1/x_2/x_4)_T$. This problem can be reduced to one that can be solved with the methodology described in the previous sections by considering the crisp random vector $X = (x_1, x_2, x_4)$ as the explanatory random variable.
In order to find a suitable fuzzy regression model capable of expressing the statistical relationship between $\tilde{Y}$ and $X$, we apply the methodology explained previously. We use the stepwise method to select the explanatory variables needed to build the model.
No. | $\tilde{X}$ | $\tilde{Y}$ | Model | $\varepsilon$ | SSE |
---|---|---|---|---|---|
1 | (1.5/2/2.5)T | (3.5/4/4.5)T | SY | (0.702/0.732/1.825)T | 7.466 |
2 | (3/3.5/4)T | (5/5.5/6)T | YL | (0.633/0.644/0.655)T | 6.738 |
3 | (4.5/5.5/6.5)T | (6.5/7.5/8.5)T | KC | (0.207/0.643/1.077)T | 7.463 |
4 | (6.5/7/7.5)T | (6/6.5/7)T | CH | (0.632/0.643/0.654)T | 5.201 |
5 | (8/8.5/9)T | (8/8.5/9)T | Wu | (0.542/0.643/0.744)T | 5.682 |
6 | (9.5/10.5/11.5)T | (7/8/9)T | ME | (0.683/0.694/0.704)T | 5.619 |
7 | (10.5/11/11.5)T | (10/10.5/11)T | PM | (0.643/0.643/0.625)T | 5.142 |
8 | (12/12.5/13)T | (9/9.5/10)T | | | |

Therefore, the proposed method (PM) model with the lowest error is given by
$$\hat{\tilde{Y}}_{PM} = (3.572 + x_1 - 0.481 x_2 / 3.572 + 0.519 x_2 / 3.572 - 0.481 x_2 + x_4)_T,$$
following Sakawa [
$$\hat{\tilde{Y}}_{SY} = (3.031/3.20/3.371)_T + (0.498/0.579/0.659)_T\, \tilde{X},$$
Yang [
$$\hat{\tilde{Y}}_{YL} = (3.203/3.497/3.788)_T + (0.525/0.529/0.534)_T\, \tilde{X},$$
Kao [
$$\hat{\tilde{Y}}_{KC} = 3.565 + 0.522\, \tilde{X} + (-0.962/-0.011/0.938)_T,$$
Chen [
$$\hat{\tilde{Y}}_{CH} = (3.272/3.572/3.872)_T + 0.519\, \tilde{X},$$
Wu [
$$\hat{\tilde{Y}}_{Wu} = 3.57 + 0.5196\, \tilde{X},$$
Modarres [
$$\hat{\tilde{Y}}_{ME} = (3.278/3.511/3.744)_T + (0.544/0.553/0.562)_T\, \tilde{X}.$$
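As a rough check of the PM model of Example 1, the sketch below evaluates it on the eight triangular observations and computes the mean fuzzy error and SSE. It assumes the corner-wise distance of Theorem 3.2 specialised to triangular numbers ($a_2 = a_3$), and it takes the kernel of the sixth input to be 10.5, an assumption based on the triangular ordering $x_1 \le x_2 \le x_4$. The helper names are hypothetical.

```python
# Example 1 data: ((x1, x2, x4), (y1, y2, y4)) for each triangular observation.
# Assumption: the kernel of input 6 is 10.5.
DATA = [
    ((1.5, 2.0, 2.5), (3.5, 4.0, 4.5)),
    ((3.0, 3.5, 4.0), (5.0, 5.5, 6.0)),
    ((4.5, 5.5, 6.5), (6.5, 7.5, 8.5)),
    ((6.5, 7.0, 7.5), (6.0, 6.5, 7.0)),
    ((8.0, 8.5, 9.0), (8.0, 8.5, 9.0)),
    ((9.5, 10.5, 11.5), (7.0, 8.0, 9.0)),
    ((10.5, 11.0, 11.5), (10.0, 10.5, 11.0)),
    ((12.0, 12.5, 13.0), (9.0, 9.5, 10.0)),
]

def predict_pm(x):
    """Corners of the PM estimate for one triangular input (x1/x2/x4)."""
    x1, x2, x4 = x
    return (3.572 + x1 - 0.481 * x2,
            3.572 + 0.519 * x2,
            3.572 - 0.481 * x2 + x4)

def tri_distance(y, y_hat):
    """Theorem 3.2 distance for triangular numbers (a2 = a3), as (d1, d2, d4)."""
    core = (y[1] - y_hat[1]) ** 2
    left = ((y[1] - y[0]) - (y_hat[1] - y_hat[0])) ** 2
    right = ((y[2] - y[1]) - (y_hat[2] - y_hat[1])) ** 2
    return (core - left, core, core + right)

n = len(DATA)
errors = [tri_distance(y, predict_pm(x)) for x, y in DATA]
mean_error = tuple(sum(e[k] for e in errors) / n for k in range(3))

# SSE over four corners, with the kernel counted twice because y2 = y3.
sse = sum((y[k] - predict_pm(x)[k]) ** 2
          for x, y in DATA for k in (0, 1, 1, 2)) / 4

print(mean_error)  # roughly (0.643, 0.643, 0.643)
print(sse)         # roughly 5.14
```

In these data the spreads of each output equal the spreads of the corresponding input, and the PM model carries the input spreads over unchanged, so the spread terms vanish and the three corners of the mean fuzzy error coincide.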
kernel value( y ^ 2 ), and the left support value( y ^ 1 ), respectively. Therefore, when the graph of the model is closer to the output, the fitting effect of the model is better. Right part of
Example 2. One crisp explanatory variable and non-triangular fuzzy response variable
Example 1 shows that when the data are single-variable symmetric triangular fuzzy numbers, the models perform similarly, except for SYM. We therefore made a few changes to the data [
Since SYM performed badly in Example 1, it is no longer included in the comparison.
Following Yang [
$$\hat{\tilde{Y}}_{YL} = (3.185/3.850/4.516)_{LR} + (0.919/0.924/0.928)_{LR}\, X,$$
in addition, Wu [
$$\hat{\tilde{Y}}_{Wu} = 3.85 + 0.924\, X, \quad (\alpha = 0.6),$$
$$\hat{\tilde{Y}}_{CH} = 0.924\, X + (3.15/3.85/4.55)_{LR},$$
Kao [
$$\hat{\tilde{Y}}_{KC} = 3.806 + 0.927\, X + (-0.681/0.019/0.719)_{LR},$$
and the proposed method's model is
$$\hat{\tilde{Y}}_{PM} = (3.185 + 0.919 x / 3.85 + 0.924 x / 4.516 + 0.928 x)_{LR}.$$
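The PM model of Example 2 can be checked in the same corner-wise way, which also illustrates that, under the distance of Theorem 3.2, only the corners of the LR responses enter the error and the quadratic membership shapes do not. The data are the five observations listed in the Example 2 table; the helper names are hypothetical, and each response is written with a repeated kernel ($y_2 = y_3$).

```python
# Example 2 data: crisp inputs and LR responses given as (y1, y2, y4).
XS = [2, 5, 7, 10, 12]
YS = [(4.5, 5, 5.5), (8, 9, 10), (10.5, 11, 11.5), (12, 13, 14), (14, 14.5, 15)]

def predict_pm(x):
    """Corners (y1, y2, y4) of the PM estimate for a crisp input x."""
    return (3.185 + 0.919 * x, 3.85 + 0.924 * x, 4.516 + 0.928 * x)

def as_four_corners(t):
    """Write (a1, a2, a4) as (a1, a2, a3, a4) with a3 = a2."""
    return (t[0], t[1], t[1], t[2])

def lr_distance(A, B):
    """Corners (d1, d2, d3, d4) of the fuzzy distance of Theorem 3.2."""
    a1, a2, a3, a4 = A
    b1, b2, b3, b4 = B
    u, v = a2 - b2, a3 - b3
    d2 = u * v
    d3 = (u * u + v * v) / 2
    d1 = d2 - ((a2 - a1) - (b2 - b1)) ** 2
    d4 = d3 + ((a4 - a3) - (b4 - b3)) ** 2
    return (d1, d2, d3, d4)

errors = [lr_distance(as_four_corners(y), as_four_corners(predict_pm(x)))
          for x, y in zip(XS, YS)]
mean_error = tuple(sum(e[k] for e in errors) / len(errors) for k in range(4))

print(mean_error)  # roughly (0.2269, 0.2866, 0.2866, 0.3464)
```

Up to rounding, these corners agree with the PM row of the Example 2 results table.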
It can be seen from
Example 3. Two explanatory variables; explanatory and response variables are all asymmetrical
Since the first two examples had a single input variable, this example uses two independent variables. The inputs and the output are triangular fuzzy numbers, and the two independent variables differ considerably in magnitude. The data consist of 15 fuzzy observations with two fuzzy explanatory variables $\tilde{X}_1 = (x_{1,1}/x_{1,2}/x_{1,4})_T$, $\tilde{X}_2 = (x_{2,1}/x_{2,2}/x_{2,4})_T$ and one fuzzy response variable $\tilde{Y}$, which are listed in the left part of
No. | $\tilde{X}$ | $\tilde{Y}$ | Membership of $\tilde{Y}$ | Model | $\varepsilon$ | SSE |
---|---|---|---|---|---|---|
1 | 2 | (4.5/5/5.5)LR | $1 - 4(y - 5)^2$ | YL | (0.2269/0.2866/0.3463)LR | 1.6322 |
2 | 5 | (8/9/10)LR | $1 - (y - 9)^2$ | Wu | (−0.2634/0.2866/0.8366)LR | 3.2665 |
3 | 7 | (10.5/11/11.5)LR | $1 - 4(y - 11)^2$ | CH | (0.2276/0.2866/0.3466)LR | 1.6331 |
4 | 10 | (12/13/14)LR | $1 - (y - 13)^2$ | KC | (0.2268/0.2868/0.3468)LR | 1.6339 |
5 | 12 | (14/14.5/15)LR | $1 - 4(y - 14.5)^2$ | PM | (0.2269/0.2866/0.3463)LR | 1.6322 |

No. | $\tilde{X}_1$ | $\tilde{X}_2$ | $\tilde{Y}$ | Model | $\varepsilon$ | SSE |
---|---|---|---|---|---|---|
1 | (151/274/322)T | (1432/2450/3461)T | (111/162/194)T | YL | (−389.5/9.1/409.1)T | 4384.7 |
2 | (101/180/291)T | (2448/3154/4463)T | (88/120/161)T | Wu | (−500.3/3.8/455.3)T | 4960.2 |
⋮ | ⋮ | ⋮ | ⋮ | CH | (−1513.9/6.9/788)T | 11873 |
15 | (216/370/516)T | (1785/2605/4042)T | (167/212/267)T | PM | (−45.8/3.8/71.7)T | 725.4 |
In Example 2, the YLM, CHM and PM models had essentially the same fuzzy error, while KCM had a larger fuzzy error. KCM was therefore not as good as PM and was removed from the comparison. Following Yang [
$$\hat{\tilde{Y}}_{YL} = 12.726 + 0.49\, \tilde{X}_1 + 0.007\, \tilde{X}_2,$$
in addition, Wu [
$$\hat{\tilde{Y}}_{Wu} = 3.453 + 0.496\, \tilde{X}_1 + 0.009\, \tilde{X}_2, \quad (\alpha = 1),$$
$$\hat{\tilde{Y}}_{CH} = 0.507\, \tilde{X}_1 + 0.009\, \tilde{X}_2 + (-18.167/0.06/10.592)_T.$$
We use these data to regress the fuzzy response variable $\tilde{Y}$ on the fuzzy explanatory variables $\tilde{X}_1 = (x_{1,1}/x_{1,2}/x_{1,4})_T$ and $\tilde{X}_2 = (x_{2,1}/x_{2,2}/x_{2,4})_T$. This problem can be reduced to one that can be solved with the methodology described in the previous sections by considering the crisp random vector $X = (x_1, x_2, x_3, x_4, x_5, x_6)$ as the explanatory random variable. We then use the stepwise method to select the explanatory variables.
Proposed Method:
$$\hat{\tilde{Y}}_{PM} = (\hat{y}_1/\hat{y}_2/\hat{y}_4)_T$$
where
$$\hat{y}_1 = -1.128 + 0.345\, x_{2,1} + 0.009\, x_{2,2},$$
$$\hat{y}_2 = 3.453 + 0.496\, x_{2,1} + 0.009\, x_{2,2},$$
$$\hat{y}_4 = 6.596 + 0.594\, x_{1,1} - 0.0009\, x_{2,1} + 0.009\, x_{2,2} + 0.241\, x_{4,1}.$$
From the perspective of fuzzy error, PM was significantly better than other models. It can be seen from
Example 4. Real-life data
When the input was a single crisp variable, PM performed about as well as the other models; when the input was fuzzy and multivariate, PM was significantly better than the other models. Thus, it makes sense to see whether our model also performs well when the input is crisp and multivariate. A set of real-life data is adopted in this example to demonstrate the proposed solution approach for the fuzzy regression problem; the data are listed in
Following Yang [
$$\hat{\tilde{Y}}_{YL} = (49.414/52.577/57.921)_T + (1.243/1.468/1.616)_T\, X_1 + (0.653/0.662/0.701)_T\, X_2$$
Chen [
$$\hat{\tilde{Y}}_{CH} = 1.854\, X_1 + 0.795\, X_2 + (38.016/43.324/51.632)_T$$
In addition, Kao [
$$\hat{\tilde{Y}}_{KC} = 51.394 + 1.481\, X_1 + 0.674\, X_2 + (-4.807/0.501/8.808)_T$$
Proposed Method:
$$\hat{\tilde{Y}}_{PM} = (\hat{y}_1/\hat{y}_2/\hat{y}_4)_T$$
where
$$\hat{y}_1 = 48.998 + 1.237\, X_1 + 0.663\, X_2$$
$$\hat{y}_2 = 52.577 + 1.468\, X_1 + 0.663\, X_2$$
$$\hat{y}_4 = 57.921 + 1.616\, X_1 + 0.701\, X_2$$
No. | $X_1$ | $X_2$ | $\tilde{Y}$ | No. | $X_1$ | $X_2$ | $\tilde{Y}$ |
---|---|---|---|---|---|---|---|
1 | 7 | 26 | (72.5/78.5/87.5)T | 8 | 1 | 31 | (70.5/72.5/78.5)T |
2 | 1 | 29 | (70.3/74.3/80.3)T | 9 | 2 | 54 | (90.1/93.1/101.1)T |
3 | 11 | 56 | (100.3/104.3/113.3)T | 10 | 21 | 47 | (109.9/115.9/125.9)T |
4 | 11 | 31 | (80.6/87.6/95.6)T | 11 | 1 | 40 | (79.8/83.8/90.8)T |
5 | 7 | 52 | (89.9/95.9/103.9)T | 12 | 11 | 66 | (107.3/113.3/123.3)T |
6 | 11 | 55 | (100.2/109.2/118.2)T | 13 | 10 | 68 | (100.4/109.4/118.4)T |
7 | 3 | 71 | (99.7/102.7/111.7)T | | | | |

Model | $\varepsilon$ | SSE | Model | $\varepsilon$ | SSE |
---|---|---|---|---|---|
YL | (1.665/4.454/4.797)T | 69.519 | KC | (−0.022/4.499/6.096)T | 82.725 |
CH | (10.570/15.091/16.689)T | 221.422 | PM | (1.646/4.454/4.798)T | 69.608 |
The four types of examples suggest that when the data structure is simple, PM is not significantly better than the other models but is also not inferior to them. This is the case in Example 1 (single-variable symmetric triangular fuzzy data), Example 2 (a single crisp input and an LR fuzzy output) and Example 4 (multivariate crisp inputs and an LR fuzzy output). When the data structure is complex, as in Example 3, where the inputs are multivariate LR fuzzy numbers and the output is also an LR fuzzy number, PM is significantly better than the other models. A possible explanation is that the previous models do not investigate the influence of the left and right values of the input on the center value of the output, or of the center value of the input on the left and right values of the output. In practice, more than one factor usually affects the dependent variable, which indicates that our model could have a wider range of application.
The authors declare no conflicts of interest regarding the publication of this paper.
Deng, J. and Lu, Q.J. (2018) Fuzzy Regression Model Based on Fuzzy Distance Measure. Journal of Data Analysis and Information Processing, 6, 126-140. https://doi.org/10.4236/jdaip.2018.63008