Open Journal of Statistics
Vol. 05, No. 01 (2015), Article ID: 53437, 5 pages

Correct Classification Rates in Multi-Category Discriminant Analysis of Spatial Gaussian Data

Lina Dreižienė1,2, Kęstutis Dučinskas1, Laura Paulionienė1

1Department of Mathematics and Statistics, Klaipėda University, Klaipėda, Lithuania

2Institute of Mathematics and Informatics, Vilnius University, Vilnius, Lithuania


Copyright © 2015 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY).

Received 28 December 2014; accepted 17 January 2015; published 22 January 2015


This paper discusses the problem of classifying a multivariate Gaussian random field observation into one of several categories specified by different parametric mean models. The investigation concerns the classifier based on the plug-in Bayes classification rule (PBCR), formed by replacing the unknown parameters in the Bayes classification rule (BCR) with their estimators. This extends previous work from the two-category case to the multi-category case. Novel closed-form expressions for the Bayes probability of correct classification and the actual correct classification rate associated with the PBCR are derived. These correct classification rates are suggested as performance measures for the classification procedure. An empirical study has been carried out to analyze the dependence of the derived classification rates on the category parameters.


Gaussian Random Field, Bayes Classification Rule, Pairwise Discriminant Function, Actual Correct Classification Rate

1. Introduction

Much work has been done concerning the error rates in two-category discrimination of uncorrelated observations (see e.g. [1]). Several methods for estimation of the error rates in discriminant analysis of spatial data have recently been proposed (see e.g. [2] [3]).

The multi-category problem, however, has rarely been addressed, because most of the methods proposed for two categories do not generalize. Schervish [4] considered the problem of classification into one of three known normal populations by a single linear discriminant function. Techniques for multi-category probability estimation by combining all pairwise comparisons have been investigated by several authors (see e.g. [5]). An empirical comparison of different methods of error rate estimation in multi-category linear discriminant analysis for multivariate homoscedastic Gaussian data was performed by Hirst [6]. The Bayesian multiclass classification problem for correlated Gaussian observations was empirically studied by Williams [7]. A novel model-free estimation method for multiclass conditional probability based on conditional quantile regression functions was theoretically and numerically studied by Xu [8]. Correct classification rates in multi-category classification of independent multivariate Gaussian observations were provided by Schervish [9]. We generalize the results cited above to the problem of classification of multivariate spatially correlated Gaussian observations.

We propose a method of multi-category discriminant analysis that essentially exploits the Bayes classification rule, which is optimal in the sense of minimum misclassification probability in the case of complete statistical certainty (see [10], chapter 6). In practice, however, a complete statistical description of the populations is usually not available. Then, given a training sample, the parametric plug-in Bayes classification rule, formed by replacing the unknown parameters in the BCR with their estimators, is used.

Šaltytė and Dučinskas [11] derived the asymptotic approximation of the expected error rate when classifying an observation of a scalar Gaussian random field into one of two classes with different regression mean models and common variance. This result was generalized to the multivariate spatial-temporal regression model in [12]. However, the observations to be classified are assumed to be independent of the training samples in all publications listed above. The assumption of independence for the classification of scalar GRF observations was removed by Dučinskas [2]. The multivariate two-category case has been considered by Dučinskas [13] and by Dučinskas and Dreižienė [14]. Formulas for the error rates in multiclass classification of a scalar GRF observation are derived in [15]. The authors of the above papers focused on maximum likelihood (ML) estimators because of the tractability of the covariance matrix of these estimators. In the present paper, we extend the investigation of the performance of the PBCR to the multi-category case. Novel closed-form expressions for the actual correct classification rate (ACCR) are derived.

By using the derived formulas, the performance of the PBCR is numerically analyzed in the case of a stationary Gaussian random field on a square lattice with an exponential covariance function. The dependence of the correct classification rate and ACCR values on the range parameter is investigated.

The rest of the paper is organized as follows. Section 2 presents concepts and notions concerning the BCR applied to multi-category classification of a multivariate Gaussian random field (MGRF) observation. The Bayes probability of correct classification is derived. In Section 3, the actual correct classification rate incurred by the PBCR is considered and its closed-form expression is derived. Numerical examples based on simulated data are presented in Section 4 in order to illustrate the theoretical results. The effect of the range parameter on the ACCR values is examined.

2. The Main Concepts and Definitions

The main objective of this paper is to classify a single observation of an MGRF $Z(s)$, $s \in D \subset \mathbb{R}^2$, into one of $n$ categories, say $\Omega_1, \dots, \Omega_n$.

The model of observation $Z(s)$ in category $\Omega_l$ is

$$Z(s) = B_l' x(s) + \varepsilon(s).$$

Here $B_l' x(s)$ represents a mean component, where $x(s)$ is a $q \times 1$ vector of non-random regressors and $B_l$ is a $q \times p$ matrix of parameters. The error term $\varepsilon(s)$ is generated by a $p$-dimensional zero-mean stationary GRF with covariance function defined by the model

$$\operatorname{cov}\{\varepsilon(s), \varepsilon(u)\} = \rho(s - u)\Sigma \quad \text{for all } s, u \in D,$$

where $\rho(s - u)$ is the spatial correlation function and $\Sigma$ is the $p \times p$ variance-covariance matrix with elements $\sigma_{ij}$. So we deal with the so-called intrinsic covariance model (see [16]).

Consider the problem of classification of the vector of observation of $Z(\cdot)$ at location $s_0$, denoted by $Z_0 = Z(s_0)$, into one of the $n$ populations specified above, with a given joint training sample $T$. The joint training sample is a stratified training sample, specified by the $N \times p$ matrix $T = (T_1', \dots, T_n')'$, where $T_l$ is the $N_l \times p$ matrix of observations of $Z(\cdot)$ from $\Omega_l$, $l = 1, \dots, n$, $N = \sum_{l=1}^{n} N_l$.

Then the model of $T$ is

$$T = XB + E,$$

where $X$ is the $N \times nq$ design matrix, $B = (B_1', \dots, B_n')'$ is the $nq \times p$ matrix of category mean parameters, and $E$ is the $N \times p$ matrix of random errors that has a matrix-variate normal distribution, i.e.

$$E \sim N_{N \times p}(0, R \otimes \Sigma).$$

Here $R$ denotes the spatial correlation matrix among the components (rows) of $E$. In the rest of the paper the realization (observed value) of the training sample $T$ will be denoted by $t$.

Denote by $r_0$ the $N \times 1$ vector of spatial correlations between $Z_0$ and the observations in $T$, and set $\alpha_0 = R^{-1} r_0$, $k_0 = 1 - r_0' R^{-1} r_0$, $x_0 = x(s_0)$.

Notice that in category $\Omega_l$, the conditional distribution of $Z_0$ given $T = t$ is Gaussian, i.e.

$$(Z_0 \mid T = t, \Omega_l) \sim N_p(\mu_{lt}, \Sigma_0),$$

where the conditional means are

$$\mu_{lt} = B_l' x_0 + (t - XB)' \alpha_0, \quad l = 1, \dots, n, \qquad (1)$$

and the conditional covariance matrix is

$$\Sigma_0 = k_0 \Sigma. \qquad (2)$$

The marginal and conditional squared Mahalanobis distances between categories $\Omega_l$ and $\Omega_m$ for an observation taken at location $s_0$ are specified respectively by

$$\Delta_{lm}^2 = x_0' (B_l - B_m) \Sigma^{-1} (B_l - B_m)' x_0,$$

$$\Delta_{0lm}^2 = (\mu_{lt} - \mu_{mt})' \Sigma_0^{-1} (\mu_{lt} - \mu_{mt}).$$

It is easy to notice that, since $\mu_{lt} - \mu_{mt} = (B_l - B_m)' x_0$, the distance $\Delta_{0lm}^2$ does not depend on the realization $t$ of $T$ and depends only on the training locations.
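The relation between the two distances can be checked numerically. Below is a minimal numpy sketch, assuming (as in the intrinsic model above) that the conditional covariance is a scalar multiple $k_0 \Sigma$ of $\Sigma$; the mean difference, $\Sigma$, and $k_0$ are illustrative values, not from the paper:

```python
import numpy as np

def mahalanobis_sq(diff, cov):
    """Squared Mahalanobis distance for a mean difference under covariance cov."""
    return float(diff @ np.linalg.solve(cov, diff))

# Illustrative values (not from the paper): the mean difference between two
# categories at s0, a p x p covariance Sigma, and the conditional shrinkage
# factor k0 = 1 - r0' R^{-1} r0, which lies in (0, 1].
diff = np.array([1.0, -2.0])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
k0 = 0.6

d2_marginal = mahalanobis_sq(diff, sigma)          # uses Sigma
d2_conditional = mahalanobis_sq(diff, k0 * sigma)  # uses Sigma_0 = k0 * Sigma

# Conditioning on the training sample rescales the covariance by k0,
# so the conditional squared distance is the marginal one divided by k0.
assert np.isclose(d2_conditional, d2_marginal / k0)
print(d2_marginal, d2_conditional)
```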

Under the assumption of complete parametric certainty about the populations, and for known prior probabilities $\pi_1, \dots, \pi_n$ of the populations, the Bayes rule minimizing the probability of misclassification is based on the logarithm of the ratio of the conditional densities.

There is no loss of generality in focusing attention on category $\Omega_1$, since the numbering of the categories is arbitrary. Let the set of population parameters be denoted by $\Psi = \{B, \Sigma\}$. Set $\gamma_{1m} = \ln(\pi_1 / \pi_m)$, $m = 2, \dots, n$.

Denote the log ratio of the conditional densities in categories $\Omega_1$ and $\Omega_m$ by

$$W_{1m}(Z_0; \Psi) = \left(Z_0 - \frac{\mu_{1t} + \mu_{mt}}{2}\right)' \Sigma_0^{-1} (\mu_{1t} - \mu_{mt}) + \gamma_{1m}, \quad m = 2, \dots, n. \qquad (3)$$

These functions will be called pairwise discriminant functions (PDF).

Then the Bayes rule (BR) (see [10], chapter 6) is given by:

classify $Z_0$ to population $\Omega_1$ if $W_{1m}(Z_0; \Psi) \ge 0$ for $m = 2, \dots, n$. (4)
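For Gaussian conditional densities with a common conditional covariance, rule (4) amounts to checking the signs of linear pairwise discriminants. A minimal numpy sketch follows; the means, covariance, and priors are illustrative values, not taken from the paper:

```python
import numpy as np

def pairwise_discriminant(z, mu_l, mu_m, sigma_inv, prior_l, prior_m):
    """Linear pairwise discriminant W_lm(z): positive favours category l."""
    d = mu_l - mu_m
    return (z - 0.5 * (mu_l + mu_m)) @ sigma_inv @ d + np.log(prior_l / prior_m)

def bayes_classify(z, means, sigma, priors):
    """Assign z to the category whose every pairwise discriminant is >= 0."""
    sigma_inv = np.linalg.inv(sigma)
    n = len(means)
    for l in range(n):
        if all(pairwise_discriminant(z, means[l], means[m], sigma_inv,
                                     priors[l], priors[m]) >= 0
               for m in range(n) if m != l):
            return l
    return None  # only on decision boundaries (a measure-zero event)

# Hypothetical three-category example with equal priors.
means = [np.array([0.0, 0.0]), np.array([3.0, 0.0]), np.array([0.0, 3.0])]
sigma = np.eye(2)
priors = [1 / 3, 1 / 3, 1 / 3]
print(bayes_classify(np.array([2.9, 0.1]), means, sigma, priors))  # category 1
```

With equal priors and identity covariance this reduces to the nearest-mean rule, which is why the point near the second mean is assigned to category 1 (0-based index).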

3. Probabilities and Rates of Correct Classification

Set $r = n - 1$ and let $W$ be the $r$-dimensional vector with the $l$-th component specified as $W_l = W_{1, l+1}(Z_0; \Psi)$, $l = 1, \dots, r$, and with conditional (given $T = t$, in $\Omega_1$) mean vector $a$ and variance-covariance matrix $\Lambda$, whose entries are $a_l = \Delta_{0,1,l+1}^2 / 2 + \gamma_{1, l+1}$ and $\Lambda_{lm} = (\mu_{1t} - \mu_{l+1, t})' \Sigma_0^{-1} (\mu_{1t} - \mu_{m+1, t})$.

Lemma 1. The conditional probability of correct classification for category $\Omega_1$ due to the BCR specified in (4) is

$$P_1 = \int_0^{\infty} \cdots \int_0^{\infty} \varphi_r(w; a, \Lambda)\, dw.$$

Here $\varphi_r(\cdot\,; a, \Lambda)$ is the probability density function of the $r$-variate normal distribution with mean vector $a$ and variance-covariance matrix $\Lambda$.

Proof. Recall that, under the definition (see e.g. [4] [9]), the probability of correct classification due to the aforementioned BCR is

$$P_1 = P(W_{12}(Z_0; \Psi) \ge 0, \dots, W_{1n}(Z_0; \Psi) \ge 0 \mid T = t, \Omega_1). \qquad (5)$$

It is the probability of correct classification of $Z_0$ when it comes from $\Omega_1$. The probability measure is based on the conditional distribution of $Z_0$ given $T = t$, with means and variance-covariance matrix specified in (1), (2). Conditionally on $T = t$ and $\Omega_1$, the vector $W$ may be expressed in the form

$$W = H(Z_0 - \mu_{1t}) + a,$$

where $H$ is the $r \times p$ matrix whose $l$-th row is $(\mu_{1t} - \mu_{l+1, t})' \Sigma_0^{-1}$, and $I_p$ denotes the $p$-dimensional identity matrix.

After making the substitution of variables $Z_0 = \mu_{1t} + \Sigma_0^{1/2} u$, $u \sim N_p(0, I_p)$, in (5) we obtain that $W \sim N_r(a, \Lambda)$ with $\Lambda = H \Sigma_0 H'$.

Set $P_1 = P(W \ge 0 \mid T = t, \Omega_1)$; then the probability of correct classification can be rewritten as the integral of $\varphi_r(\cdot\,; a, \Lambda)$ over the positive orthant.

After straightforward calculations we verify that the entries of $H \Sigma_0 H'$ coincide with $\Lambda_{lm}$ as defined above. That completes the proof of the lemma.
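The probability in Lemma 1 is an orthant probability of an $r$-variate normal distribution, which has no closed form in general but is straightforward to approximate. A Monte Carlo sketch with numpy, using illustrative values of the mean vector and covariance matrix (not from the paper):

```python
import numpy as np

def orthant_probability(a, lam, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(W >= 0 componentwise) for W ~ N_r(a, lam)."""
    rng = np.random.default_rng(seed)
    w = rng.multivariate_normal(mean=a, cov=lam, size=n_draws)
    return float(np.mean(np.all(w >= 0.0, axis=1)))

# Illustrative two-dimensional case (r = 2, i.e. three categories).
a = np.array([1.0, 1.0])
lam = np.array([[1.0, 0.5], [0.5, 1.0]])
p1 = orthant_probability(a, lam)
print(round(p1, 3))
```

For larger $r$, quasi-Monte Carlo methods for multivariate normal probabilities give the same quantity more efficiently.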

In practical applications not all statistical parameters of the populations are known. Then the estimators of the unknown parameters can be found from the training sample. When estimators of the unknown parameters are plugged into the Bayes discriminant function (BDF), the plug-in BDF (PBDF) is obtained. In this paper we assume that the true values of the parameters $B$ and $\Sigma$ are unknown.

Let $\hat{B}$ and $\hat{\Sigma}$ be the estimators of $B$ and $\Sigma$ based on $T = t$. Set $\hat{\Psi} = \{\hat{B}, \hat{\Sigma}\}$.

Then, replacing $\Psi$ by $\hat{\Psi}$ in (3), we get the plug-in BDF (PBDF)

$$W_{1m}(Z_0; \hat{\Psi}), \quad m = 2, \dots, n.$$

Then the classification rule based on the PBCR is associated with the plug-in PDF (PPDF) in the following way: classify $Z_0$ to population $\Omega_1$ if $W_{1m}(Z_0; \hat{\Psi}) \ge 0$ for $m = 2, \dots, n$.

Definition 1. The actual correct classification rate incurred by the PBCR associated with the PPDF is

$$P_1(\hat{\Psi}) = P(W_{12}(Z_0; \hat{\Psi}) \ge 0, \dots, W_{1n}(Z_0; \hat{\Psi}) \ge 0 \mid T = t, \Omega_1).$$

Set $\hat{\mu}_{lt} = \hat{B}_l' x_0 + (t - X\hat{B})' \alpha_0$ and $\hat{\Sigma}_0 = k_0 \hat{\Sigma}$.

Lemma 2. The actual correct classification rate due to the PBCR is

$$P_1(\hat{\Psi}) = \int_0^{\infty} \cdots \int_0^{\infty} \varphi_r(w; \hat{a}, \hat{\Lambda})\, dw,$$

where $\hat{a}$ is the $r$-dimensional vector with components $\hat{a}_l = \left(\mu_{1t} - \frac{\hat{\mu}_{1t} + \hat{\mu}_{l+1, t}}{2}\right)' \hat{\Sigma}_0^{-1} (\hat{\mu}_{1t} - \hat{\mu}_{l+1, t}) + \gamma_{1, l+1}$, and $\hat{\Lambda}$ has entries $\hat{\Lambda}_{lm} = (\hat{\mu}_{1t} - \hat{\mu}_{l+1, t})' \hat{\Sigma}_0^{-1} \Sigma_0 \hat{\Sigma}_0^{-1} (\hat{\mu}_{1t} - \hat{\mu}_{m+1, t})$.

Proof. It is obvious that in population $\Omega_1$ the conditional distribution of the PPDF vector $\hat{W} = (W_{12}(Z_0; \hat{\Psi}), \dots, W_{1n}(Z_0; \hat{\Psi}))'$ given $T = t$ is Gaussian, i.e.

$$(\hat{W} \mid T = t, \Omega_1) \sim N_r(\hat{a}, \hat{\Lambda}).$$

Set $P_1(\hat{\Psi}) = P(\hat{W} \ge 0 \mid T = t, \Omega_1)$; then the probability of correct classification can be rewritten as the integral of $\varphi_r(\cdot\,; \hat{a}, \hat{\Lambda})$ over the positive orthant.

After straightforward calculations of the conditional mean and covariance of $\hat{W}$ we obtain the components $\hat{a}_l$ and the entries $\hat{\Lambda}_{lm}$ given above. That completes the proof of the lemma.

4. Example and Discussions

A simulation study was carried out for the three-class case in order to compare the proposed Bayes probability of correct classification and the actual correct classification rate incurred by the PBCR. The effect of the range parameter on these values is also examined.

In this example, observations are assumed to arise from a bivariate stationary Gaussian random field with constant mean and isotropic exponential correlation function given by $\rho(h) = \exp(-h/\alpha)$, where $\alpha$ is the spatial correlation (range) parameter.
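Such a field can be simulated on a lattice by Cholesky factorization of the exponential correlation matrix. A minimal numpy sketch (the lattice size and range-parameter value are illustrative, not the paper's settings):

```python
import numpy as np

def exponential_correlation(locations, alpha):
    """Spatial correlation matrix rho(h) = exp(-||h|| / alpha) between locations."""
    diffs = locations[:, None, :] - locations[None, :, :]
    h = np.linalg.norm(diffs, axis=-1)
    return np.exp(-h / alpha)

def simulate_grf(locations, alpha, seed=0):
    """One realization of a zero-mean, unit-variance GRF at the given locations."""
    r = exponential_correlation(locations, alpha)
    # Cholesky factor of the correlation matrix colours i.i.d. standard normals;
    # a tiny jitter guards against numerical loss of positive definiteness.
    chol = np.linalg.cholesky(r + 1e-10 * np.eye(len(r)))
    rng = np.random.default_rng(seed)
    return chol @ rng.standard_normal(len(locations))

# 5 x 5 square lattice; alpha chosen for illustration only.
grid = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
z = simulate_grf(grid, alpha=2.0)
print(z.shape)  # (25,)
```

Increasing `alpha` strengthens the spatial correlation between lattice sites, which is the effect studied in Table 1.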

Set, and.

Estimators of $B$ and $\Sigma$ have the following form:

$$\hat{B} = (X' R^{-1} X)^{-1} X' R^{-1} t, \qquad \hat{\Sigma} = (t - X\hat{B})' R^{-1} (t - X\hat{B}) / N,$$

where $X$ denotes the design matrix of the training sample, specified by the training locations and the category labels.
The considered set of training locations with the indicated class labels is shown in Figure 1.

So we have small training sample sizes and three different locations to be classified; furthermore, we assume equal prior probabilities $\pi_1 = \pi_2 = \pi_3 = 1/3$.

Simulations were performed with geoR, a free and open-source package for geostatistical analysis included in the statistical computing software R. Each case was simulated 100 times (runs), and the ACCR values were calculated by averaging the ACCR over the runs. The ACCR and CCR values are presented in Table 1. As might be expected, the ACCR values are lower than the CCR values. All values increase as the range parameter increases. This means that stronger spatial correlation gives better accuracy of the proposed classification procedures.
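The averaging scheme described above can be sketched in a deliberately simplified, hypothetical setting: independent observations, known identity covariance, equal priors, and the nearest-mean rule (which coincides with the Bayes rule in that setting). The sketch illustrates only how ACCR is averaged over simulation runs, not the paper's geoR experiment:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical three-category setup with independent bivariate observations:
# known common identity covariance, unknown means estimated per run.
true_means = [np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([0.0, 2.0])]
n_train, n_eval, n_runs = 10, 2000, 100

def classify(z, means):
    """Nearest-mean rule = Bayes rule for equal priors and identity covariance."""
    d2 = [np.sum((z - m) ** 2, axis=-1) for m in means]
    return np.argmin(d2, axis=0)

accr_runs = []
for _ in range(n_runs):
    # Plug-in step: estimate each category mean from a small training sample.
    est_means = [mu + rng.standard_normal((n_train, 2)).mean(axis=0)
                 for mu in true_means]
    # Evaluate the plug-in rule on fresh observations from category 0.
    z_eval = true_means[0] + rng.standard_normal((n_eval, 2))
    accr_runs.append(np.mean(classify(z_eval, est_means) == 0))

print(round(float(np.mean(accr_runs)), 3))  # averaged ACCR over the runs
```

The averaged ACCR falls short of the optimal CCR because the plug-in rule uses estimated, not true, category means; shrinking `n_train` widens that gap.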

Figure 1. Locations of the training sample: “1” denotes samples from population Ω1, “2” from Ω2, “3” from Ω3; A, B and C denote the locations of the observations to be classified.

Table 1. CCR and ACCR values.

It is seen in Figure 1 that location A is the closest to the training sample and location C is the farthest. CCR and ACCR are largest for location A and smallest for location C. It can be concluded that locations closer to the training sample are classified more accurately.


  1. McLachlan, G.J. (2004) Discriminant Analysis and Statistical Pattern Recognition. Wiley, New York.
  2. Dučinskas, K. (2009) Approximation of the Expected Error Rate in Classification of the Gaussian Random Field Observations. Statistics and Probability Letters, 79, 138-144.
  3. Batsidis, A. and Zografos, K. (2011) Errors of Misclassification in Discrimination of Dimensional Coherent Elliptic Random Field Observations. Statistica Neerlandica, 65, 446-461.
  4. Schervish, M.J. (1984) Linear Discrimination for Three Known Normal Populations. Journal of Statistical Planning and Inference, 10, 167-175.
  5. Wu, T.F., Lin, C.J. and Weng, R.C. (2004) Probability Estimates for Multi-Class Classification by Pairwise Coupling. Journal of Machine Learning Research, 5, 975-1005.
  6. Hirst, D. (1996) Error-Rate Estimation in Multiple-Group Linear Discriminant Analysis. Technometrics, 38, 389-399.
  7. Williams, C.K.I. and Barber, D. (1998) Bayesian Classification with Gaussian Processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 1342-1351.
  8. Xu, T. and Wang, J. (2013) An Efficient Model-Free Estimation of Multiclass Conditional Probability. Journal of Statistical Planning and Inference, 143, 2079-2088.
  9. Schervish, M.J. (1981) Asymptotic Expansions for the Means and Variances of Error Rates. Biometrika, 68, 295-299.
  10. Anderson, T.W. (2003) An Introduction to Multivariate Statistical Analysis. Wiley, New York.
  11. Šaltytė, J. and Dučinskas, K. (2002) Comparison of ML and OLS Estimators in Discriminant Analysis of Spatially Correlated Observations. Informatica, 13, 297-238.
  12. Šaltytė-Benth, J. and Dučinskas, K. (2005) Linear Discriminant Analysis of Multivariate Spatial-Temporal Regressions. Scandinavian Journal of Statistics, 32, 281-294.
  13. Dučinskas, K. (2011) Error Rates in Classification of Multivariate Gaussian Random Field Observation. Lithuanian Mathematical Journal, 51, 477-485.
  14. Dučinskas, K. and Dreižienė, L. (2011) Supervised Classification of the Scalar Gaussian Random Field Observations under a Deterministic Spatial Sampling Design. Austrian Journal of Statistics, 40, 25-36.
  15. Dučinskas, K., Dreižienė, L. and Zikarienė, E. (2015) Multiclass Classification of the Scalar Gaussian Random Field Observation with Known Spatial Correlation Function. Statistics and Probability Letters, 98, 107-114.
  16. Wackernagel, H. (2003) Multivariate Geostatistics: An Introduction with Applications. Springer-Verlag, Berlin.