the response signal in the range of 700 to 900 s is in a relatively steady state. To confirm this, PCA was also carried out on the response signals at 600 s, as shown in Figure 5. The result differs obviously from Figure 4. This shows that the response at 600 s is different from that at 700, 800 and 900 s, so the range of 700 to 900 s is the relatively steady-state section.
In addition, using the response values at 600 s as features, the discrimination between Jing wines and counterfeits was also carried out well, as were the discrimination tasks using the response values at 400 and 500 s, respectively. This indicates that the tasks are simple and that the eNose is competent for them. The reason is that there are essential differences between Jing wines and counterfeits, and the response values at different times can directly reflect the characteristics of their quality and the difference between the two classes. In contrast, a portion of the difference information is likely to be removed when features are extracted by MDCV and WE; therefore MDCV and WE do not surpass the response values as features in terms of discrimination effect. This will be explained in Section 3.5.
Figure 4. The PCA results between Jing wines and counterfeits corresponding to 700, 800 and 900 s. (a) The discrimination corresponding to 700 s; (b) The discrimination corresponding to 800 s; (c) The discrimination corresponding to 900 s.
Figure 5. The PCA results between Jing wines and counterfeits corresponding to 600 s.
The relatively steady-state response reflects the sample characteristics very well under a given examination condition and makes the analysis results more stable and reliable than other response values do; the similarity of the discrimination results based on the response signals at 700 s, 800 s and 900 s illustrates this well. Therefore, we prefer the response values in the range of 700 to 900 s as features for the discrimination tasks.
Because the response of each sensor in the range of 700 to 900 s is relatively stable, and the average response value over the range better represents its overall characteristic, the average response value of each sensor over the range is better suited as a feature for the discrimination tasks. Figure 6 shows the result of PCA based on the average response values. In Figure 6 the discrimination is successful and visually similar to Figure 4, which shows again that the response values of each sensor in the range of 700 to 900 s are relatively stable. When the RSV and the average response values are taken as features, higher reliability and simplicity are obtained for the discrimination tasks.
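As an illustration of the averaging step and the subsequent PCA, a minimal Python sketch is given here. It is not the authors' implementation: the array layout (samples x sensors x time points), the function names `window_mean_features` and `pca_scores`, and the choice to mean-centre the features without scaling before the PCA are all assumptions made for the example.

```python
import numpy as np


def window_mean_features(responses, times, t_start=700.0, t_end=900.0):
    """Average each sensor's response over the window [t_start, t_end].

    responses: array of shape (n_samples, n_sensors, n_times) (assumed layout)
    times:     array of shape (n_times,), sampling instants in seconds
    returns:   array of shape (n_samples, n_sensors)
    """
    responses = np.asarray(responses, dtype=float)
    times = np.asarray(times, dtype=float)
    mask = (times >= t_start) & (times <= t_end)
    if not mask.any():
        raise ValueError("no time points fall inside the window")
    return responses[:, :, mask].mean(axis=2)


def pca_scores(features, n_components=2):
    """Project mean-centred features onto the leading principal components.

    The components are obtained from the SVD of the centred data matrix;
    no variance scaling is applied (an assumption of this sketch).
    """
    features = np.asarray(features, dtype=float)
    centred = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T
```

Calling `pca_scores(window_mean_features(responses, times))` would give the PC1 and PC2 coordinates of each sample, which could then be plotted per class in the manner of Figure 6.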
Figure 6. The PCA results between Jing wines and counterfeits based on average value.
3.5. Evaluation of Discrimination Capability of Every Kind of Feature
Except for MDCV, the features meet the demand of discrimination between Jing wines and counterfeits well; after the PCs were selected by the L-statistic, the MDCV feature could also be used to carry out the discrimination tasks. Inspired by the criterion of separability between classes introduced by , an idea is proposed for evaluating the discrimination capabilities of these features. Jing wines can be discriminated from counterfeits because they occupy two different areas in the feature space. The larger the distance between the areas, the easier the discrimination, so this distance may be selected as an evaluation parameter. How to calculate the distance between the two areas in the feature space is a key step. Because the Mahalanobis distance is a popular method of distance analysis and can eliminate the disturbance of correlation between variables and the effect of dimensions, it was selected as the evaluation parameter. Each area of the feature space contains many samples, and the distances between pairs of samples located in the two areas respectively are not equal, so the mean Mahalanobis distance was selected as a representative measure of the distance between the two areas. From the expression of the Mahalanobis distance , the concrete calculation of the mean distance is
where d is the mean distance between the two classes in the feature space, Xk the k-th sample feature vector, the feature mean vector corresponding to the total samples, n the number of Jing wine samples, m the number of counterfeit samples, n + m the total number of samples, and S the feature covariance matrix corresponding to the total samples.
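Assuming that expression (4) averages the Mahalanobis distance over every pair formed by one Jing wine sample and one counterfeit sample, with S estimated from all n + m samples about their overall mean, the calculation can be sketched in Python as follows. This reading is an interpretation of the definitions above and may differ in detail from the authors' formula. The function name `mean_mahalanobis_distance` is hypothetical, and the use of a pseudo-inverse (so that a singular S does not abort the computation) is a choice made for this sketch.

```python
import numpy as np


def mean_mahalanobis_distance(jing, counterfeit):
    """Mean pairwise Mahalanobis distance between two classes.

    jing:        array of shape (n, p), feature vectors of the Jing wines
    counterfeit: array of shape (m, p), feature vectors of the counterfeits
    The covariance matrix S is estimated from all n + m samples
    (assumed reading of expression (4)).
    """
    jing = np.asarray(jing, dtype=float)
    counterfeit = np.asarray(counterfeit, dtype=float)
    all_samples = np.vstack([jing, counterfeit])

    # Covariance about the overall mean, with the (n + m - 1) denominator.
    cov = np.atleast_2d(np.cov(all_samples, rowvar=False))
    # Pseudo-inverse: an assumption of this sketch, tolerant of singular S.
    cov_inv = np.linalg.pinv(cov)

    # diff[i, j] = jing[i] - counterfeit[j], shape (n, m, p).
    diff = jing[:, None, :] - counterfeit[None, :, :]
    # Squared distance diff^T S^-1 diff for every cross-class pair.
    sq = np.einsum("ijk,kl,ijl->ij", diff, cov_inv, diff)
    return float(np.sqrt(np.maximum(sq, 0.0)).mean())
```

Because S is built from all samples, rescaling a feature (for example changing its units) leaves d unchanged, which matches the stated motivation for choosing the Mahalanobis distance.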
According to expression (4), the mean Mahalanobis distances in the different feature spaces were calculated and are shown in Table 1. Table 1 yields the following results:
1) The discrimination capability of the features MDCV, WE and RSV is low, medium and high, respectively; this result is basically in accord with the PCA results.
2) For the feature MDCV, the Mahalanobis distance for PC1 and PC2 is the smallest, so the discrimination result is the poorest and Jing wines and counterfeits cannot be discriminated. The distance for PC1 and PC3 is larger than that for PC1 and PC2, so the discrimination result is better; this is in accord with Figure 2.
3) For the feature WE, the corresponding distance is larger than that of MDCV, so WE excels MDCV in discrimination effect; this result can also be seen by comparing Figure 3 with Figure 2.
4) The distances corresponding to all RSV are larger than those of the other features, and the distances change only slightly and uniformly among these RSV, so the distance corresponding to the average value is almost the same as the distance corresponding to the response value at 800 s. This is a second reason that RSV or average values were selected as features for the discrimination tasks.
Table 1. The distances of different features.
The above results show clearly that the larger the mean distance, the higher the discrimination capability of the feature, and that employing the mean distance to evaluate the discrimination capability of a feature is effective.
The discrimination tasks between Chinese Jing wines and counterfeits were all accomplished well using the three kinds of feature extraction methods; RSV is the best and MDCV the poorest in terms of discrimination effect. For MDCV, the discrimination can only be carried out with the help of the Wilks L-statistic, and PC1 and PC3 were selected for the discrimination tasks. For WE, the discrimination can be carried out well, but the method is more complicated and its discrimination capability is lower than that of RSV. For RSV, the discrimination capability and the simplicity are better than those of the others, and it is fully competent for the discrimination between Chinese Jing wines and counterfeits. In addition, the evaluation method of discrimination capability for these features based on the Mahalanobis distance proved to be appropriate and may be employed to evaluate the discrimination capability of different features.
Because these different features could all be used to execute the discrimination tasks, the eNose can indeed be used to discriminate Chinese Jing wines from counterfeits. This investigation presents a new attempt at the discrimination between Chinese Jing wines and counterfeits.
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 31171685. The authors would also like to thank the Chinese Jing Brand Ltd. for providing the samples.