Image classification is one of the most basic operations of digital image processing. The present review focuses on the strengths and weaknesses of traditional pixel-based classification (PBC) and the advances of object-oriented classification (OOC) algorithms employed for extracting information from remotely sensed satellite imagery. State-of-the-art classifiers are reviewed for their potential use in urban remote sensing (RS), with a special focus on cryospheric applications. Generally, classifiers for information extraction can be divided into three categories: 1) by type of learning (supervised and unsupervised), 2) by assumptions on data distribution (parametric and non-parametric), and 3) by the number of outputs for each spatial unit (hard and soft). Classification methods broadly follow either the PBC or the OOC approach. Both have advantages and disadvantages that depend on the area of application and, most importantly, on the RS datasets used for information extraction. Classification algorithms have been widely explored in the cryosphere for extracting geospatial information for various logistic and scientific applications, such as understanding temporal changes in geographical phenomena. Information extraction in cryospheric regions is challenging owing to the very similar and conflicting spectral responses of the features present in the region. The spectral responses of snow and ice, water and blue ice, and rock and shadow pose a major challenge for pixel-based classifiers. In such cases, the OOC approach is superior for extracting information from cryospheric regions. Ensemble classifiers and customized spectral index ratios (CSIR) have also proved to be very effective approaches for information extraction from cryospheric regions.
The present review would be beneficial for developing new classifiers in the cryospheric environment for better understanding of spatial-temporal changes over long time scales.
Image classification is one of the most basic operations of digital image processing (DIP). In the simplest terms, image classification is the process of dividing an image into classes or categories of similar type. Digital image classification is the process of assigning pixels to meaningful classes [
In the field of RS, numerous attempts have been made for developing an effective approach for the information extraction processes. The availability of a range of high resolution (HR) images offers an advantage for more precise extraction of information by developing advanced classification schemes. RS classification is a complex process and requires consideration of many factors. According to Lu and Weng [
A comprehensive review of information extraction techniques and algorithms has rarely been undertaken, although many research efforts have been aimed at image classification [
classification [
Arbiol et al. [
Broadly, classifiers for information extraction can be divided into three categories: 1) by type of learning (supervised and unsupervised), 2) by assumptions on data distribution (parametric and non-parametric), and 3) by the number of outputs for each spatial unit (hard and soft).
The identification of natural groups, or structures, within multi-spectral (MS) data is termed unsupervised classification. It does not require training data as the basis for classification. These classifiers group pixel reflectance values in the feature space into well-separated clusters, which are treated as classes. After spectral grouping, the analyst relates the resulting classes to some form of reference data. Thus, in the unsupervised classification procedure: 1) training datasets are not required and 2) the user only needs to specify the number of classes. Several clustering algorithms exist that can be used to identify the natural spectral clusters present in the image. The most popular are K-means and iterative self-organizing data analysis (ISODATA). Change detection is one of the main applications of such methods, where the method recognizes changes in real time. Unlike unsupervised classification, supervised classification uses training samples of known identity (ground reference sites) to classify pixels of unknown identity (i.e., unclassified pixels are assigned to one of several informational classes). Supervised classification relies on sample sites of known cover type, called training sites, which are used to establish a numerical interpretation key that describes the spectral attributes of each feature type of interest. The reflectance value of each pixel is then compared numerically with each class in the interpretation key, and the pixel is labeled with the name of the class it most closely resembles. The user either defines the decision rules for each class directly or provides training data for each class to guide the computer classification. Here, labeled information is used to train a model capable of recognizing the pre-defined classes.
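The unsupervised clustering described in this paragraph can be illustrated with a minimal K-means sketch. It is not taken from any of the reviewed studies: the function name `kmeans`, the two-band reflectance values, and the fixed iteration count are assumptions chosen for illustration, and operational implementations add convergence tests and better initialization.

```python
import random

def kmeans(pixels, k, iterations=20, seed=0):
    """Cluster pixel vectors (tuples of band reflectances) into k spectral clusters."""
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)  # initial cluster centres drawn from the data
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins the nearest centre (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(px, centers[c])))
            for px in pixels
        ]
        # Update step: move each centre to the mean of its member pixels.
        for c in range(k):
            members = [px for px, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = tuple(sum(band) / len(members) for band in zip(*members))
    return labels, centers

# Hypothetical two-band reflectances: a dark group followed by a bright group.
pixels = [(0.05, 0.04), (0.06, 0.05), (0.07, 0.03),
          (0.85, 0.80), (0.90, 0.82), (0.88, 0.79)]
labels, centers = kmeans(pixels, k=2)
```

Only the number of clusters is supplied by the user; the cluster ids carry no thematic meaning until the analyst relates them to reference data.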
Thus, in the supervised classification procedure: 1) the decision rules for each class are defined directly and 2) training data (class prototypes) for each class are provided to assist the classification. Major steps involved in supervised and unsupervised classification are depicted in
There are three major steps involved in classic supervised classification: 1) training: the user identifies representative training areas or samples and develops a numerical description of the spectral attributes (spectral signature) of each land cover class of interest in the image; 2) classification: each pixel in the image is assigned to the land cover class whose training pixels it most closely resembles, and a pixel that matches no predefined class signature is labeled unknown or unclassified; and 3) accuracy assessment: the classified thematic image is compared with a reference image or ground reference data to check the accuracy of the classification. Several classifiers fall under the category of supervised classification. The most popular are the maximum likelihood classifier (MXL), Mahalanobis distance (MhD), support vector machines (SVMs), spectral angle mapper (SAM), parallelepiped (PP), and minimum distance (MD).
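The three steps above can be sketched with a minimum distance (MD) classifier, chosen here only because it is the simplest of the classifiers listed. The class names, reflectance values, the `max_distance` threshold, and all function names are hypothetical; the accuracy step computes overall accuracy and Cohen's kappa, the two measures most often reported in the studies reviewed here.

```python
import math

def train_signatures(training):
    """Training: mean spectral signature per class from labelled pixels."""
    return {
        cls: tuple(sum(band) / len(samples) for band in zip(*samples))
        for cls, samples in training.items()
    }

def classify(pixel, signatures, max_distance):
    """Classification: nearest class mean, or 'unclassified' if too far from all means."""
    cls, dist = min(
        ((c, math.dist(pixel, sig)) for c, sig in signatures.items()),
        key=lambda item: item[1],
    )
    return cls if dist <= max_distance else "unclassified"

def overall_accuracy_and_kappa(predicted, reference):
    """Accuracy assessment: overall accuracy and Cohen's kappa."""
    n = len(reference)
    classes = set(predicted) | set(reference)
    observed = sum(p == r for p, r in zip(predicted, reference)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(predicted.count(c) * reference.count(c) for c in classes) / (n * n)
    if expected == 1.0:  # degenerate case: a single class everywhere
        return observed, 1.0
    return observed, (observed - expected) / (1.0 - expected)

# Hypothetical two-band training pixels.
training = {
    "snow": [(0.90, 0.80), (0.85, 0.82)],
    "rock": [(0.20, 0.25), (0.22, 0.20)],
}
signatures = train_signatures(training)
result = [classify(px, signatures, max_distance=0.2)
          for px in [(0.88, 0.79), (0.21, 0.22), (0.50, 0.50)]]

accuracy, kappa = overall_accuracy_and_kappa(
    ["snow", "snow", "rock", "rock"],   # classified labels
    ["snow", "rock", "rock", "rock"],   # ground reference labels
)
```

The third test pixel lies far from both class means and is therefore left unclassified, which mirrors the treatment of pixels that match no predefined signature.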
PBC is considered the typical method for classification of RS images and is based on conventional statistical techniques such as supervised and unsupervised classification. The traditional PBC process automatically assigns every pixel of an image to a thematic class using only the spectral information (relative reflectance) of that pixel. The spectral information of each pixel is also termed its spectral signature, which is estimated from the relative reflectance in different wavelength bands. The pixel-based approach is at present of limited use for HR data, such as QB images, because it produces some undesirable classification results when extracting the targeted class. To overcome this limitation of the PBC approach in HR RS, the OOC approach was suggested, in which the processing unit is no longer a single pixel but an image object.
Geographic object-based image analysis (GEOBIA) has been gaining importance in all fields of RS over the past decade, especially as a way to take advantage of the spatial-spectral characteristics of high resolution satellite data for information extraction. Since the emergence of fine spatial resolution satellite sensor imagery, OOC has been applied extensively to overcome the within-object variation that can lead to pixel-based misclassification. The literature on OOC approaches in RS suggests that the rule-based classifier and the standard nearest neighbor (NN) classifier are among the most commonly employed object classifiers, popularized by the availability of commercial software such as eCognition and ENVI. The OOC approach applies image segmentation followed by fuzzy classification of the segmentation results. Broadly, this process can be divided into two main workflow steps: 1) multi-resolution segmentation and 2) knowledge-based classification of the segments. In general, an OOC algorithm first segments the whole image into meaningful pixel groups that are referred to as segments. The user then defines a set of knowledge-based classification rules (spectral, spatial, contextual and textural information) to describe each class. Thereafter, a classifier is chosen to assign each segment to the proper class according to the defined set of rules.
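The two workflow steps can be illustrated with a deliberately simplified sketch. It does not implement the multi-resolution segmentation used in commercial software: a plain flood-fill region growing on a single band stands in for the segmentation step, and the rule set (the thresholds and the class names "snow", "bright patch" and "rock") is hypothetical.

```python
from collections import deque

def segment(image, tolerance):
    """Group 4-connected pixels whose values differ by at most `tolerance` (flood fill)."""
    rows, cols = len(image), len(image[0])
    seg = [[-1] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if seg[r][c] != -1:
                continue  # pixel already belongs to a segment
            seg[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and seg[ny][nx] == -1
                            and abs(image[ny][nx] - image[y][x]) <= tolerance):
                        seg[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return seg, next_id

def classify_segments(image, seg, n_segments):
    """Assign one class per segment from its mean value (spectral) and area (spatial)."""
    stats = {i: [0.0, 0] for i in range(n_segments)}  # segment id -> [sum, pixel count]
    for row_vals, row_ids in zip(image, seg):
        for value, sid in zip(row_vals, row_ids):
            stats[sid][0] += value
            stats[sid][1] += 1
    classes = {}
    for sid, (total, area) in stats.items():
        mean = total / area
        # Hypothetical rule set: thresholds are illustrative only.
        if mean > 0.7:
            classes[sid] = "snow" if area >= 4 else "bright patch"
        else:
            classes[sid] = "rock"
    return classes

# Hypothetical single-band image.
image = [
    [0.9, 0.9, 0.2, 0.2],
    [0.9, 0.9, 0.2, 0.8],
    [0.2, 0.2, 0.2, 0.2],
]
seg, n_segments = segment(image, tolerance=0.1)
classes = classify_segments(image, seg, n_segments)
```

The isolated bright pixel and the 2 x 2 bright block have similar spectral values but receive different classes because the rule also uses segment area, an attribute that is unavailable to a per-pixel classifier.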
The hybrid classification approach fuses elements of supervised and unsupervised algorithms. Since the early 1990s, several hybrid methods have been tried and refined, in many cases improving classification accuracy. Hybrid methods have demonstrated significant results in analyses where there is complex variability in the spectral data within information classes. Most hybrid methods involve: 1) initial arrangement of the imagery by spectral clustering, 2) assignment of clusters to user-defined classes, and 3) classification of the entire image using supervised learning. Iterative Guided Spectral Class Rejection (IGSCR) is one such hybrid classifier, which uses specific rejection criteria and large numbers of training pixels to cluster analogous pixels into two or three user-defined classes through a sequence of iterations. The method accepts and labels a spectral class if it meets a user-defined inclusion threshold and rejects it otherwise. Until a convergence threshold is met, rejected pixels continue to be classified in subsequent iterations. Finally, a supervised decision rule uses these pure classes to classify the image into the pre-defined information classes.
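The accept-or-reject step at the core of such hybrid methods can be sketched as follows. This is a simplified single pass, not the published IGSCR procedure: a plain purity threshold stands in for its rejection criteria, and the function name, the `threshold` value, and the example labels are assumptions.

```python
from collections import Counter, defaultdict

def accept_or_reject(cluster_ids, training, threshold=0.9):
    """One pass of cluster acceptance.

    cluster_ids: cluster id per pixel, from any unsupervised clustering.
    training: dict mapping pixel index -> known class for the labelled subset.
    Returns (accepted, rejected): `accepted` maps cluster id -> class, and
    `rejected` holds the cluster ids whose pixels would be re-clustered next.
    """
    votes = defaultdict(Counter)
    for idx, cls in training.items():
        votes[cluster_ids[idx]][cls] += 1
    accepted, rejected = {}, set()
    for cid in set(cluster_ids):
        counts = votes.get(cid)
        if not counts:
            rejected.add(cid)  # no training pixels fall in this cluster
            continue
        cls, top = counts.most_common(1)[0]
        purity = top / sum(counts.values())
        if purity >= threshold:
            accepted[cid] = cls  # spectrally pure cluster: label it
        else:
            rejected.add(cid)    # mixed cluster: send back for another iteration
    return accepted, rejected

# Hypothetical cluster ids for eight pixels and labels for seven of them.
cluster_ids = [0, 0, 0, 1, 1, 1, 2, 2]
training = {0: "forest", 1: "forest", 3: "forest",
            4: "non-forest", 5: "non-forest", 6: "non-forest", 7: "non-forest"}
accepted, rejected = accept_or_reject(cluster_ids, training)
```

In a full iteration, the pixels of rejected clusters would be clustered again and the pass repeated until the chosen convergence criterion is met.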
Attributes | Pixel-Based Classification | Object-Based Classification |
---|---|---|
Spectral/Color | Used | Used |
Form/Shape | Not Used | Used |
Area/Size | Not Used | Used |
Texture | Not Used | Used |
Content | Not Used | Used |

A decision tree (DT) is a classifier network in which simple classifiers are combined to solve a complex classification problem. It has a hierarchical configuration in which, at each level, a test is applied to one or more attribute values and may have one of two outcomes. The outcome may be a leaf of the DT, which defines a class, or a decision node, which specifies a further test on the attribute values and forms a branch or sub-tree of the tree. The algorithm for constructing a DT is summarized below:
If there are k classes represented by {Class1, Class2, ..., Classk}, and a training set, T, then:
1) If T contains one or more objects that all belong to a single class Classj, then the DT is a leaf identifying class Classj.
2) If T contains no objects, the DT is a leaf determined from information other than T.
3) If T contains objects that belong to a mixture of classes, then a test is chosen, based on a single attribute, that has one or more mutually exclusive outcomes {Object1, Object2, ..., Objectn}. T is partitioned into subsets T1, T2, ..., Tn, where Ti contains all the objects in T that have outcome Objecti of the chosen test. The same method is applied recursively to each subset of training objects to build the DT.
DT classifiers vary in the way they partition the training sample into subsets and thus form sub-trees, i.e. DTs differ in their criteria for evaluating splits into subsets. The induction algorithm uses information theory to evaluate splits. Many studies comparing information-theory-based DT algorithms with other classifiers have found them to be more accurate and to give reliable results. Another advantage of the algorithm is that a DT can be converted into corresponding classification rules. Rules are more comprehensible, easy to understand and easy to implement.
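The recursive construction summarized above, with splits evaluated by information gain, can be sketched as follows. This is one possible instance rather than a specific published algorithm: attributes are assumed to be categorical, the majority class is used as the "information other than T" for empty subsets, and all names and example values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, attributes, default=None):
    """rows: list of dicts (attribute -> categorical value). Returns a leaf or a node."""
    if not labels:
        return default                      # empty T: leaf from information other than T
    if len(set(labels)) == 1:
        return labels[0]                    # all objects in one class: leaf
    majority = Counter(labels).most_common(1)[0][0]
    if not attributes:
        return majority                     # no test left: fall back to the majority class

    def gain(attr):
        # Information gain = entropy before the split minus weighted entropy after it.
        remainder = 0.0
        for value in set(row[attr] for row in rows):
            subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(attributes, key=gain)
    node = {"attribute": best, "branches": {}, "default": majority}
    rest = [a for a in attributes if a != best]
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node["branches"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest, majority)
    return node

def predict(tree, row):
    """Follow the tests from the root until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attribute"]], tree["default"])
    return tree

# Hypothetical training objects described by two categorical attributes.
rows = [
    {"brightness": "high", "texture": "smooth"},
    {"brightness": "high", "texture": "rough"},
    {"brightness": "low", "texture": "rough"},
    {"brightness": "low", "texture": "smooth"},
]
labels = ["snow", "snow", "rock", "water"]
tree = build_tree(rows, labels, ["brightness", "texture"])
```

Each path from the root to a leaf reads directly as a classification rule, for example "brightness is low and texture is smooth, therefore water", which is the conversion to rules mentioned above.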
Artificial neural networks (ANNs) are computer programs or rule sets designed to simulate human learning processes by establishing and reinforcing linkages between input data and output data. The basic element of an ANN is the processing node, which corresponds to the neuron of the human brain. Each processing node receives and sums a set of input values and passes this sum through an activation function to produce the output value of the node, which in turn forms one of the inputs to a processing node in the next layer of the ANN. When considering the NNC approach to classification, several key decisions must be made beforehand. First, the number of layers must be chosen. Commonly, a three-layer network is sufficient, with the first layer serving simply to distribute the components of the input pixel vector to each of the processing elements in the second layer. The next choice relates to the number of elements in each layer. The input layer will usually be given as many nodes as there are features in the pixel vectors. The number of nodes in the output layer will depend on how the outputs are used to represent the classes. The simplest method is to let each separate output signify a different class, in which case the number of output processing elements will be the same as the number of training classes. ANNs are composed of three elements (
1) An input layer consists of the source data, which in the context of RS are the MS observations, perhaps in several bands and from several dates. ANNs are designed to work with large volumes of data, including many bands and dates of MS observations, together with related ancillary data.
2) The output layer consists of the classes required by the user. There are few restrictions on the nature of the output layer, although the process will be more consistent when the number of output labels is small or modest with respect to the number of input channels. Included are training data in which the association between output labels and input data is clearly established.
3) During the training phase, an ANN establishes a connection between input and output data by establishing weights within one or more hidden layers. In the context of RS, repeated associations between classes and digital values, as expressed in the training data, strengthen weights within hidden layers that permit the ANN to assign correct labels when given spectral values in the absence of training data.
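A forward pass through a three-layer network of this kind can be sketched in a few lines. The weights below are hand-picked for illustration and are not the result of training; the layer sizes (three input bands, two hidden nodes, two output classes), the sigmoid activation, and the class names are assumptions.

```python
import math

def layer_output(inputs, weights, biases):
    """Each node sums its weighted inputs and passes the sum through a sigmoid."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(node_w, inputs)) + b)))
        for node_w, b in zip(weights, biases)
    ]

def forward(pixel, hidden_w, hidden_b, output_w, output_b):
    """Input layer distributes the pixel vector; hidden and output layers process it."""
    hidden = layer_output(pixel, hidden_w, hidden_b)
    return layer_output(hidden, output_w, output_b)

# Hypothetical, untrained weights: one row of weights per processing node.
hidden_w = [[2.0, 2.0, 2.0], [-2.0, -2.0, -2.0]]
hidden_b = [-3.0, 3.0]
output_w = [[3.0, -3.0], [-3.0, 3.0]]
output_b = [0.0, 0.0]
class_names = ["snow", "rock"]  # one output node per class

def predict(pixel):
    outputs = forward(pixel, hidden_w, hidden_b, output_w, output_b)
    return class_names[outputs.index(max(outputs))]

bright = predict((0.9, 0.8, 0.7))
dark = predict((0.1, 0.1, 0.1))
```

During training, these weights would be adjusted repeatedly so that the outputs for the training pixels approach their known class labels; that step is omitted here.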
Ensemble learning refers to a collection of methods that learn a target function by training a number of individual classifiers and combining their predictions. The key idea of ensemble methodology is to combine a set of models, each of which solves the same original task, in order to obtain an improved composite global model with more accurate and reliable estimates or decisions than can be obtained from a single model. Ensemble methods can also be used to improve the robustness of clustering algorithms. Formation of an ensemble involves: 1) modifying the data, 2) modifying the learning task, 3) exploiting the algorithm characteristics, and 4) exploiting problem characteristics. Methods for independently constructing ensembles are: 1) majority vote, 2) bagging and random forest (RF), 3) randomness injection, 4) feature selection, and 5) error-correcting output coding. Methods for coordinated construction of ensembles are: 1) boosting and 2) stacking. The ways of combining the classifiers may be divided into two main groups: 1) simple multiple classifier combinations and 2) meta-combiners. The simple fusing methods are best suited to problems where the individual classifiers perform the same task and have comparable success. However, such combiners are more vulnerable to outliers and to unevenly performing classifiers. Simple combining methods include uniform voting, distribution summation, Bayesian combination, Dempster-Shafer, Naïve Bayes, Bayesian Augmented Naïve Bayes (BAN) and Naïve Bayes Classifiers (NBC), entropy weighting, density-based weighting, and logarithmic opinion pool. The meta-combiners, on the other hand, are theoretically more powerful but are susceptible to all the problems associated with the added learning (such as over-fitting and long training time). Meta-learning means learning from the classifiers produced by the inducers and from the classifications of these classifiers on training data.
Meta-combining methods are stacking, arbiter trees, combiner trees, and grading. Based on our literature review, we have compiled case studies which involve the application of different classifiers in RS studies and depicted them in
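Uniform (majority) voting, the simplest of the combining methods listed above, can be sketched as follows. The three label lists stand for the outputs of three hypothetical classifiers over the same four pixels; the function name and labels are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) pixel by pixel."""
    combined = []
    for votes in zip(*predictions):
        # The label with the most votes wins; every classifier has equal weight.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three classifiers for the same four pixels.
classifier_a = ["snow", "rock", "water", "rock"]
classifier_b = ["snow", "snow", "water", "rock"]
classifier_c = ["ice", "rock", "rock", "rock"]
combined = majority_vote([classifier_a, classifier_b, classifier_c])
```

Because every vote counts equally, one poorly performing classifier can sway close decisions, which is the vulnerability of simple combiners noted above; weighted schemes and meta-combiners address this at the cost of additional learning.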
RS images captured at coarse to medium spatial resolutions are contaminated with mixed pixels that represent more than one class on the ground. Hard classification may produce erroneous results for images crowded with mixed pixels, because the spectral signature of a mixed pixel may no longer match any of its component classes or may resemble the spectral signature of a different class. Allocating a mixed pixel to only one class is therefore undesirable, because pertinent class information contained in the mixed pixel is lost. To deal with mixed pixels, methods have been developed that quantitatively decompose, or unmix, a mixed pixel into its class components. This process is called sub-pixel classification, and has also been referred to as spectral unmixing, spectral decomposition, fuzzy classification, and soft classification. Sub-pixel classification decomposes a pixel into a collection of class component spectra, or endmembers. Thus, sub-pixel classification methods resolve a pixel into various class components, generating many outputs in the form of fraction images. There are two main types of sub-pixel classification procedures: 1) those based on linear models and 2) those based on non-linear models.
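For the linear case, a minimal sketch with only two endmembers shows how a mixed pixel is resolved into class fractions. It assumes that the pixel spectrum is a weighted sum of the two endmember spectra and that the fractions sum to one, in which case the least-squares fraction has a closed form. The endmember spectra and function name are hypothetical, and practical unmixing handles more endmembers and noise.

```python
def unmix_two(pixel, end_a, end_b):
    """Least-squares fractions of two endmembers under a sum-to-one constraint.

    Model: pixel = f * end_a + (1 - f) * end_b, solved for f band by band.
    """
    diff = [a - b for a, b in zip(end_a, end_b)]
    numerator = sum((p - b) * d for p, b, d in zip(pixel, end_b, diff))
    denominator = sum(d * d for d in diff)
    frac_a = min(1.0, max(0.0, numerator / denominator))  # keep the fraction in [0, 1]
    return frac_a, 1.0 - frac_a

# Hypothetical three-band endmember spectra.
snow = (0.90, 0.80, 0.70)
rock = (0.20, 0.25, 0.30)

# Synthetic mixed pixel: 30% snow and 70% rock.
mixed = tuple(0.3 * s + 0.7 * r for s, r in zip(snow, rock))
snow_fraction, rock_fraction = unmix_two(mixed, snow, rock)
```

Applying the function to every pixel would yield one fraction image per endmember, in contrast to the single label per pixel produced by hard classification.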
OOC has shown many advantages over PBC especially in HR RS [
Satellite Image | Study Area (Application) | Classifiers | Classes | Accuracy/Results |
---|---|---|---|---|
GeoEye-1 (GE-1) [ | Beijing, China. (Land use/cover) | BAN and NBC | 5 [Houses, Roads, Grass, Hills, and Rivers] | BAN: 86.2%; NBC: 82.0% |
GE-1 and QB [ | Fredericton, Canada. (Land use/land cover) | Fuzzy method (FM) and Crisp method (CM) | 5 [Shadow, vegetation, road, building and bare land] | FM (GE-1): 82%; FM (QB): 90%; CM (GE-1): 68%; CM (QB): 42% |
GE-1 and WV-2 [ | Cuevas del Almanzora, southern Spain (Land cover classification) | OOC-NNC and SVM | 6 [Greenhouses, Nets, Vegetation, Orchards, Buildings, Bare soil] | NNC (GE-1): 87.91%; SVM (GE-1): 85.71%; NNC (WV-2_4): 89.01%; SVM (WV-2_4): 84.07%; NNC (WV-2_8): 87.91%; SVM (WV-2_8): 87.36% |
IKONOS [ | Ghent in Belgium (Urban land cover mapping) | Ensemble Classifiers-DT, ANN, and RF. | 9 [water, grass, trees, buildings (with dark roof, red roof, bright roof), roads, other man-made objects, shadow] | Results indicate that ensemble classifiers generate significantly higher accuracies than a single classifier. |
IKONOS [ | Pico da Vara Natural Reserve. (Vegetation mapping) | SVM, ANN, MhD and MXL (parametric methods) | 8 [Forestry production species, aggressive alien invasive species, bare soil areas, clouds, natural pasture areas and shadows of clouds] | Despite the poor separability of some vegetation categories, MXL, SVM and ANN classifications have achieved good overall accuracies (overall accuracy > 75% and Kappa Index Agreement > 0.6). |
QB [ | Brandenburg, Germany. (Forest types) | Knowledge-based methods | 5 [Pine, Larch, Beech, Robinia and Oak] | Results show a good separability with approximately 80% to 90% overall accuracy for the tree species beech, oak, robinia, larch, and pine. |
QB [ | Lang Tengah Island. (Coral distribution mapping) | Ensemble classifier-PP, MD, MXL, Fisher and K-Nearest Neighbor (K-NN) | 4 [Dense coral, Sparse Coral, Dead Coral and Sand] | The ensemble classification approach gave the highest overall accuracy (73.02%) in comparison with PP (52.38%), MD (50.79%), MXL (60.37%), Fisher (31.75%) and K-NN (50.79%) |
WV-2 [ | São Luís, Brazil. (Classification of Mangrove Areas) | OOC | 8 [Streets, Tidal flat, Tidal channel, ceramic roof, asbestos roof, metal roof, mangroves, no-mangroves] | Kappa index value of 0.93 was found for the generated maps |
WV-2 [ | Zhengzhou city, China. (Urban land cover) | OOC | 5 [Vegetation, water, road, building, space land] | 84.3144% Kappa = 0.7807 |
SPOT [ | Central-north Poland. (Land cover mapping) | Rule-based classification-OOC approach | 13 [Continuous built-up land, discontinuous built-up land, industrial units, construction sites, green urban areas, arable land, grasslands, gardens, coniferous forests, deciduous forests, mixed forests, deforestations and water] | Overall accuracy-89.1% Kappa coefficient-0.87 |
Satellite Image | Study Area | Classifiers | Results |
---|---|---|---|
Landsat ETM+ and Terra ASTER [ | Istanbul, Turkey | MXL and ANN | MXL algorithm dominated road class over the image whilst ANN classifier was slightly sensitive to inland water class. |
Landsat 7 ETM+ [ | Ayvalık district, Turkey | PP, MD and MXL | MXL gave better results than MD and PP. |
Landsat TM [ | Saudi Arabia | ISODATA, MXL, MhD and MD | MXL method gave the best results while both MD and MhD methods overestimated agriculture land and suburban areas |
WV-2 [ | Larsemann Hills, Antarctica | SVM, MXL, NNC, SAM and Winner Takes All (WTA) | Results indicate that the WTA integration and the SVM classification methods were more accurate than the MXL, NNC, and SAM classification methods. |
QB and WV-2 [ | São Paulo, Brazil | DT, RF, SVM, Regression tree (RT) | RF achieved the highest accuracy (κ = 0.95), followed either by the RT (κ = 0.85) or the DT (κ = 0.77). The SVM, maybe due to the high dimensionality and over-fitting issues, was the algorithm that performed the worst. |
GE-1 and WV-2 [ | Cuevas del Almanzora, southern Spain | OOC-NN and SVM | The overall accuracy attained by applying NN and SVM to the four MS bands of GE-1 were very similar to those computed from WV-2, for either four or eight MS bands. The best overall accuracy values were close to 90%, and they were not improved by using multi-angle ortho-images. |
GE-1 and QB [ | Fredericton, Canada | FM and CM | The overall accuracies using FM stand higher than those of CM. The overall accuracy and kappa coefficient for QB image classification was better than that of the GE-1 image. |
IKONOS [ | Tanzania | OOC using mathematical and morphology analysis | OOC based on multi-resolution segmentation and mathematical morphology analysis procedures performs best with a spatial accuracy above 85% and a statistical accuracy above 97%. |
SPOT [ | Nile river, Egypt | Contextual classifier, MXL and MD | The MXL classifier yielded the best classification accuracy (up to 97%) compared to the other two classifiers. |
SPOT and Landsat TM [ | Northern Territory Tropical Savanna, Australia | Supervised image classification with and without ancillary data (NDVI, DEM, slope model and hydrology). | The producer accuracy on average (40%) was higher for the image classification (without ancillary data) and a marginal difference in user accuracy (5%). For the integrated approach (image plus ancillary data) producer and user accuracies were 34% and 35% respectively. |
IRS LISS III [ | East Sikkim, India | BAN and hybrid classification. | Overall accuracy was found to be 90.53% using the BAN classifier and 91.57% using the Hybrid classifier. |
GE-1 [ | Beijing, China. | BAN and NBC | The best mean overall classification accuracy is 86.2% (BAN). As expected, BAN gives better classification results than NBC. |
WV-2 [ | University Putra, Malaysia | OOC including fuzzy rule-based and SVM | Classification result of supervised SVM contained mixed objects and misclassifications of impervious surfaces and other urban features. Rule-based classifier (overall accuracy = 93.07%) performed better than supervised SVM (overall accuracy = 85.02%) resulting in finer discrimination of spatially and spectrally similar objects. |
shape and relations to adjacent regions. Many studies have been performed comparing the PBC and OOC approach [
On comparing PBC and OOC methods, Xiaoxia et al. [
Classification of RS data using PBC in mountainous terrain is challenging because of variations in the sun illumination angle. OOC, on the other hand, can utilize GIS tools to improve classification results. Gholoobi et al. [
Traditional PBC uses a combined spectral response from all training set pixels for a target class. Therefore, the resulting signature comprises spectral responses from a group of different land covers in the training samples, and the classification system merely ignores the impact of mixed pixels [
1) The traditional PBC method cannot make the best use of the relationship between a pixel and the pixels around it, which makes the classification results incoherent and causes the “salt & pepper phenomenon” [
2) While proven highly successful with low to moderate spatial resolution data, these pixel-based classifiers have produced unsatisfactory classification accuracies with hyperspatial data [
3) Major among these is that a pixel’s spatial extent may not match the extent of the land cover feature of interest [
4) Another common problem, though, and one that is less often considered, is where the object of interest is considerably larger than the pixel size [
5) PBC techniques fail because they are based on the assumption that individual classes exhibit uniform visual properties. As the spatial resolution of the data increases, the intra-class variation increases and this class uniformity breaks down, leading to very poor performance [
1) The OOC approach was developed in response to the drawbacks of PBC and visual interpretation classification methods. OOC not only uses the spectral information of land types, but also draws on the image’s spatial position, shape characteristics, texture parameters and contextual relationships, which effectively avoids the “salt & pepper phenomenon” and greatly improves the accuracy of classification [
2) Importantly, this object-based information can be integrated with other spatial data in vector-based GIS environments, and used widely in spatial analysis [
3) The change of the classification units from pixels to image objects reduces within-class spectral variation and generally removes the so-called salt-and-pepper effects that typically arise in PBC [
4) A large set of features characterizing object’s spatial, textural, and contextual properties can be derived as complementary information to the direct spectral observations to potentially improve classification accuracy [
Two types of errors often occur in image segmentation: over-segmentation and under-segmentation. These segmentation errors can affect the subsequent classification process in two ways [
1) Under-segmentation results in image objects that cover more than one class and thus introduce classification errors because all pixels in each mixed image object have to be assigned to the same class.
2) Features extracted from mis-segmented image objects with over-segmentation or under-segmentation errors do not represent the properties of real objects on the Earth’s surface (e.g. shape and area), so they may not be useful and could even reduce the classification accuracy if not chosen appropriately.
Classification algorithms have been widely explored in the cryosphere for extracting geospatial information for various scientific and logistic applications. One of the most common applications of satellite image classification in the cryosphere is understanding temporal changes in geographical phenomena. Some case studies are summarized, in terms of satellite data, classifiers, and significant results, in
Jawak and Luis [
Satellite Data | Classification | Results |
---|---|---|
WV-2 [ | Automated “spectral-shape” procedure | Accuracy for detecting actively flowing supraglacial streams, particularly in slushy areas where classification performance dramatically improves (85.2% success) versus simple threshold methods (52.9% and 59.4% success for low and moderate thresholds, respectively). |
Landsat TM [ | ISODATA, MXL | ISODATA clustering shows that ice and snow in cast shadow are partly unmapped. For MXL, regions in the cast shadow without glacier ice and also the mixed pixels with ice/snow and terrain along the glacier outline are mapped as a glacier. |
IRS P6-AWiFS and TERRA-ASTER [ | Supervised MXL | Overall accuracy of classifications obtained with the use of various band combinations has been found to range from 74.72% to 89.35%. The highest overall accuracy (i.e., 89.35%) resulted from a glacier terrain map derived from a band combination having two optical and two thermal bands: IB1, IB3, IB6 and IB8. |
ERS-1 SAR [ | Supervised MXL | Kappa coefficients for two images are 0.49 ± 0.02 & 0.48 ± 0.02, respectively, at the 95% confidence level. This represents a classification accuracy of about 50% for each of the SAR images. |
Landsat and ASTER [ | Morphometric glacial mapping (MGM) method | Glacier mapping using a TM4/TM5-ratio image in combination with an MSI analysis to eliminate misclassified pixels was successfully applied to clean-ice glaciers. MGM was found to be capable of identifying supraglacial debris. |
Landsat and MODIS [ | SIRs | The six-band ETM+ sensor discriminated surface features more sensitively than the two-band AVHRR and MODIS data, which discriminated Blue Ice Areas (BIAs) from exposed rock and snow. The higher spatial resolution and better spectral signatures of the ETM+ data improved BIA recognition. |
Landsat and ASTER [ | ANN | The overall accuracy was 0.64 with a kappa coefficient of 0.26, which is not satisfactory. In comparison with the independent vector debris layer, the overall accuracy is 0.75 and the kappa coefficient is 0.22. The performance of the ANN classifier was not convincing. |
Landsat and IRS LISS III Study Area: Alam Chal glacier, Iran [ | SIRs and K-means | The image of IRS_LISS could not be used for snow mapping due to the fact that its spectral bands were not appropriate for this application. |
TERRA SAR-X [ | RF | Classification of the glacier surface is carried out with an overall accuracy of 93.72%. |
WV-2 [ | CSIRs | The land-cover map generated from using four CSIR combinations had a K value (0.98) significantly higher than for the land-cover map generated using one CSIR combination (0.92). |
Landsat [ | Supervised MXL | The overall accuracy of classification performed for the snow- and ice-covered parts of the glaciers was 86.29% with a Kappa coefficient of 0.84. |
WV-2 [ | SVM, MXL, NNC, SAM and WTA | The overall accuracy of the WTA method was 97.23% (96.47% for SVM classifier) with a 0.96 kappa coefficient (0.95 with the SVM classifier). The accuracy of the other classifiers was 93.73% to 95.55% with kappa coefficients of 0.91 to 0.93. |
Several factors affect the accuracy of information extraction from RS data. These include inappropriate satellite data selection, insufficient resolution of the data for extracting particular information, noise in the satellite data, atmospheric errors, and cloud cover in the image, among others. After studying several cases of information extraction methods, we compiled the factors that affected the information extraction approach in each study.
Satellite Data | Classifiers | Factors Affecting Information Extraction |
---|---|---|
GE-1 and QB [ | FM and CM | The presented method performed quite well in vegetation detection; however, the CM failed to classify a large number of segments related to vegetation. Since this method was very good at detecting vegetation, the overall accuracy and kappa coefficient for the QB image were higher than those of the GE-1 image. |
GE-1 and WV-2 [ | OOC analysis-NN and SVM | Classification algorithms frequently used in OOC approaches, as is the case of NN, do not perform well on a high-dimensional feature space, due to problems related to feature correlation. On the whole, NN performed quite well with bare soil, greenhouses and nets, whereas SVM only outperformed NN in the case of the building class. However, the performance of SVM when classifying vegetation and orchards was quite poor. The shape and geometric features (B + Sh), ratios to the scene (B + Rs) and texture features based on GLCM (B + T) did not contribute to improving the classification. |
IKONOS [ | Ensemble Classifiers-DT, ANN, Adaboost with DT as base-classifier and RF, which uses a CART like DT | Ensemble classifiers brought an increase of 6% to 12% in accuracy. ANN outperforms a DT by more than 8%, and only RF has a higher accuracy (+4%) than ANN. For all ensemble classifications using DT as the base learner, computation remains very fast even with hundreds of additional classifiers. Hence, it would be beneficial to implement binary strategies with ensemble classifiers. (This, however, does not apply to ANN, for which training costs will be substantially higher and can easily become prohibitive.) |
IKONOS [ | SVM, ANN, MhD, and MXL | The non-parametric methods SVM and ANN performed better than the parametric methods (MXL and MhD), mostly for the less separable and heterogeneous classes. When classifying complex data sets, SVM and ANN appear to be better options because they do not assume any data distribution, whereas MXL gives good results when the data distribution is Gaussian. Classifications using SVM, ANN or MXL techniques could be improved by increasing the quantity and quality of training sites. |
QB [ | Knowledge-based methods | Suitable training pixels can improve the overall accuracy to κ = 0.95. Mistakes in the assignment mainly occur with pixels close to the border to the next age class. The use of knowledge bases and additional data increases the number of separable classes and leads to better results than simple supervised or unsupervised classifications. A major problem is that classes which are not considered in the knowledge base will introduce mistakes into the classification. Usually, most pixels that are assigned as training pixels are classified correctly and only a negligible minority is misclassified. |
QB [ | Ensemble classifier-PP, MD, MXL, Fisher and K-NN | The output maps from the different supervised techniques had a moderate overall accuracy, ranging from 41% to 56%. Although such results are only moderate by statistical standards, they are not unrealistic, especially in RS of marine environments. They could be caused by a number of factors, such as the delay between ground truthing and the acquisition of the satellite images, and a water column correction algorithm that may not completely remove the effects of water attenuation. |
WV-2 [ | OOC | The classification of the scene improved because the attributes related to the Yellow and Red-Edge channels are better defined, eliminating confusions which occurred in the past, for instance, between the classes Ceramic roof and Bare soil. This is because the Red-Edge channel is positioned spectrally at the end of the red absorption and the beginning of the infrared wavelengths for part of the vegetation. It is therefore partially sensitive to the spectral behavior of the different vegetation types of this region, as was expected for the spectral bands of this satellite system. |
WV-2 [ | OOC | The classification method and the corresponding classification parameters have low flexibility and portability; it is difficult to resolve the influence of building and crown shadows on the classification. |
SPOT [ | Rule-based classification-OOC | The overall accuracy of classification was 89.1%. The main misclassifications are due to confusion with two classes: mixed forests and arable land. The lowest accuracy was reached for the class “gardens”: although the producer’s accuracy is 100%, the user’s accuracy reaches only 42.5%, which means that too many objects were attributed to this class. For this class, the applied classification approach proved insufficient. This is caused by the spectral heterogeneity of the class “gardens” and by the spatial resolution of the SPOT image, which is not adequate for recognizing texture features formed by relatively small objects. |
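The contrast between a producer's accuracy of 100% and a user's accuracy of 42.5% for the class “gardens” follows directly from how the two measures are read off a confusion matrix. The sketch below assumes rows are reference classes and columns are classified classes. The object counts are invented purely to reproduce the two ratios and are not data from the cited SPOT study; the function names are likewise hypothetical.

```python
def producers_accuracy(cm, k):
    """Correctly classified reference samples of class k (1 - omission error)."""
    reference_total = sum(cm[k])
    return cm[k][k] / reference_total


def users_accuracy(cm, k):
    """Samples labelled as class k that truly are class k (1 - commission error)."""
    classified_total = sum(row[k] for row in cm)
    return cm[k][k] / classified_total


# Hypothetical counts; class order: gardens, mixed forests, arable land.
# Rows: reference class, columns: classified class.
cm = [
    [17, 0, 0],    # every reference garden was found
    [10, 80, 10],  # 10 forest objects wrongly labelled as gardens
    [13, 5, 82],   # 13 arable objects wrongly labelled as gardens
]

print(producers_accuracy(cm, 0))  # 1.0
print(users_accuracy(cm, 0))      # 0.425
```

All reference gardens are detected, so nothing is omitted, but 23 of the 40 objects labelled as gardens belong to other classes, which is the over-assignment described in the table.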
position of object borders in satellite images and complex resemblance of the segments to different classes”. Thus, we could conclude that there are several factors which affect the extraction method depending upon satellite data, resolution of the image, reference data, number of classes, spectral bands of the image, shadow effects in the image, object size, pan-sharpening algorithm, re-sampling methods [
Classification methods are broadly based on either the PBC or the OOC approach. Both have advantages and disadvantages that depend on the area of application and, most importantly, on the RS datasets used for information extraction. Traditional PBC uses the combined spectral responses of all training-set pixels for a given class. The resulting signature therefore comprises responses from a group of different land covers in the training samples, and the classification ignores the effect of mixed pixels. PBC has many disadvantages compared with OOC, especially in HR satellite data processing. Although it has proved highly successful with low to moderate spatial resolution data, PBC has often produced unsatisfactory classification accuracies with hyperspectral data, and it is not ideal for HR and VHR satellite data. Traditional PBC cannot make the best use of the relationship between a pixel and the pixels around it, so the classification results become incoherent, causing the “salt & pepper phenomenon”. PBC also cannot distinguish different objects that share the same spectral characteristics, and it lacks visual interpretation. The OOC approach covers these drawbacks and achieves outstanding classification accuracies. OOC can also be integrated with other spatial data in vector-based GIS environments and used widely in spatial analysis. From the literature survey, it can be concluded that most researchers consider the OOC approach more suitable than the PBC approach for HR satellite image classification. In almost all the case studies, the OOC approach resulted in greater accuracy, ranging from approximately 84% to 89%.
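The “salt & pepper phenomenon” can be illustrated with a toy label map in which one pixel is labelled independently of its neighbours. The sketch below applies a 3 × 3 majority filter, which is a simple post-classification smoothing step and not an OOC method; it is shown only to make concrete how neighbourhood context removes isolated pixels. The function name and the label map are hypothetical.

```python
from collections import Counter


def majority_filter(labels):
    """Replace each label by the strict-majority label of its 3x3 window."""
    rows, cols = len(labels), len(labels[0])
    smoothed = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            # Window is clipped at the image border.
            window = [
                labels[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            label, count = Counter(window).most_common(1)[0]
            # Only relabel when one class holds more than half the window.
            if count > len(window) / 2:
                smoothed[r][c] = label
    return smoothed


# Hypothetical per-pixel result: a snow field with one isolated "rock" pixel.
pbc_map = [["snow"] * 5 for _ in range(5)]
pbc_map[2][2] = "rock"

cleaned = majority_filter(pbc_map)
print(cleaned[2][2])  # snow
```

Object-oriented methods go further than this filter, since they segment the image first and classify whole segments using shape, texture and context rather than smoothing a per-pixel result afterwards.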
OOC can use not only the spectral information of land types but also the spatial position, shape characteristics, texture parameters and contextual relationships within the image, which effectively avoids the “salt & pepper phenomenon” and greatly improves the accuracy of classification. Information extraction in cryospheric regions faces several difficulties owing to the very similar spectral responses of snow and ice, water, and blue ice, rock and shadow, etc. From the case studies performed in cryospheric regions, we can conclude that HR RS data are the best fit for information extraction. CSIR [
Information extraction from HR and VHR RS data is best done with OOC methods, CSIR and ensemble classifiers. These classification approaches have yielded high accuracies with HR RS data. However, they have seldom been used so far for mapping cryospheric regions. Thus, the extraction of supraglacial lakes, supraglacial ponds, BIAs and supraglacial streams, as well as dust and debris mapping, can be done by utilizing these less explored approaches in HR RS studies. In particular, the OOC approach has rarely been applied to HR RS data in cryospheric regions, and it needs to be explored in much more detail because it has many advantages over the PBC approach.
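The idea behind an ensemble of classifiers can be sketched as a per-unit vote over the labels produced by several individual classifiers. The example below is a minimal plurality vote, assuming each classifier has already labelled the same spatial units in the same order. It is not necessarily the WTA scheme used in the cited study, and the classifier outputs and the function name are hypothetical.

```python
from collections import Counter


def plurality_vote(predictions):
    """Fuse aligned label lists (one per classifier) by plurality vote.

    Ties are broken in favour of the classifier listed first.
    """
    fused = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in classifier order, that reached the top count.
        fused.append(next(v for v in votes if counts[v] == top))
    return fused


# Hypothetical labels for four spatial units from three classifiers.
svm = ["snow", "ice", "rock", "water"]
mxl = ["snow", "snow", "rock", "shadow"]
sam = ["ice", "ice", "shadow", "water"]

print(plurality_vote([svm, mxl, sam]))
# ['snow', 'ice', 'rock', 'water']
```

Each classifier here errs on a different unit, so the fused map is correct wherever at least two of the three agree, which is the intuition behind the accuracy gains reported for ensemble approaches.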
We acknowledge Dr. S. Rajan, Director, NCAOR, for his encouragement and motivation for this research. We acknowledge Dr. T. P. Singh (Director, SIT), Dr. Kanchan Khare (HOD, Department of Civil Engineering, SIT), Prof. Sagar Kolekar (SIT), and Prof. Rushikesh Kulkarni (SIT) for their cooperation. We also thank Ms. Prachi Vaidya, India, for her constructive comments on the draft version of the manuscript. This is NCAOR contribution No. 15/2015.
Shridhar D. Jawak, Prapti Devliyal, Alvarinho J. Luis (2015) A Comprehensive Review on Pixel Oriented and Object Oriented Methods for Information Extraction from Remotely Sensed Satellite Images with a Special Emphasis on Cryospheric Applications. Advances in Remote Sensing, 04, 177-195. doi: 10.4236/ars.2015.43015