A simple and fast approach to polarimetric SAR image segmentation of land cover, based on an eigenvalue similarity metric, is proposed in this paper. The approach uses the eigenvalues of the coherency matrix to construct the similarity metric of a clustering algorithm that segments the SAR image. The Mahalanobis distance is used to measure the pairwise similarity between pixels, which avoids the manual tuning of the scale parameter required by previous spectral clustering methods. Furthermore, spatial coherence constraints and a spectral clustering ensemble are employed to stabilize and improve the segmentation performance. All experiments are carried out on three sets of polarimetric SAR data. The experimental results show that the proposed method is superior to the other methods compared.
The fully-polarimetric synthetic aperture radar (SAR) [
The segmentation process depends on the choice of features and classifier. Selecting good features can yield a better segmentation result than improving the classifier alone. In existing polarimetric SAR segmentation, the features generally used are polarimetric information and the texture or gray-level information of the image [
Existing classifiers can be divided into supervised and unsupervised ones [
Clustering usually means grouping objects according to their similarity. Distance is the most common similarity metric; it reflects the similarity between objects by measuring the difference between them. In practical applications, the choice of distance depends on the characteristics of the objects. In existing POL-SAR segmentation, based on the fact that the coherency matrix obeys the Wishart distribution, Anfinsen [
In our method, we first choose the eigenvalues of the coherency matrix as the input features; they carry the essential information of the coherency matrix and represent the intensity of scattering. Second, based on the statistical properties of the eigenvalues, we apply the Mahalanobis distance as the similarity metric, and we add consistency constraints to the similarity metric to take the neighborhood information of the image into account. Third, this similarity metric is applied in a spectral clustering algorithm to complete the segmentation. Finally, a cluster ensemble strategy is used to improve and stabilize the segmentation results.
The full-polarimetric SAR data can be expressed by complex scattering matrix S :
$$S = \begin{bmatrix} S_{hh} & S_{hv} \\ S_{vh} & S_{vv} \end{bmatrix} \tag{1}$$
where h and v denote the horizontal and vertical polarization modes, respectively. Natural targets are commonly assumed to exhibit reciprocity, $S_{hv} = S_{vh}$, so the scattering matrix can also be expressed as the scattering vector $k$.
$$k = \frac{1}{\sqrt{2}} \left[ S_{hh} + S_{vv},\; S_{hh} - S_{vv},\; 2 S_{hv} \right]^{T} \tag{2}$$
In order to better explain the physical meaning of the scattering process, the coherency matrix T is used:
$$T = k \cdot k^{*T} \tag{3}$$
The coherency matrix is a 3 × 3 Hermitian matrix, $T = T^{*T}$, where $^{*T}$ denotes the conjugate transpose.
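As a minimal sketch of Eqs. (2) and (3), the following NumPy code builds the scattering vector and a coherency matrix from 2 × 2 scattering matrices. The function names are hypothetical, the 1/√2 Pauli normalization is assumed, and the averaging over several samples (multilooking) is an assumption that the text above does not spell out.

```python
import numpy as np

def pauli_vector(S):
    """Scattering vector k of Eq. (2) for one 2x2 complex scattering matrix S.

    Reciprocity (S_hv == S_vh) is assumed, so only S[0, 1] is read.
    """
    s_hh, s_hv, s_vv = S[0, 0], S[0, 1], S[1, 1]
    return np.array([s_hh + s_vv, s_hh - s_vv, 2 * s_hv]) / np.sqrt(2)

def coherency_matrix(scattering_matrices):
    """Average of k k^{*T} over the given samples (assumed multilook average)."""
    ks = np.array([pauli_vector(S) for S in scattering_matrices])
    # T[i, j] = mean over samples of k_i * conj(k_j)
    return np.einsum("ni,nj->ij", ks, ks.conj()) / len(ks)

# Hypothetical single-pixel example
S = np.array([[1 + 2j, 0.5 - 1j],
              [0.5 - 1j, -0.3 + 0.7j]])
T = coherency_matrix([S])
print(np.allclose(T, T.conj().T))  # Hermitian check
```

With a single sample the matrix has rank one; averaging over neighboring samples is what makes all three eigenvalues informative.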
To make better use of the polarization scattering matrix in revealing the physical mechanism, polarization data are usually broken down into different components by decomposition [
Cloude decomposition:
$$[T] = [U_3] \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} [U_3]^{*T} \tag{4}$$
where $\lambda_i$ is the $i$th eigenvalue and $[U_i]$ is the eigenvector corresponding to $\lambda_i$, $i = 1, 2, 3$. Each eigenvector represents a scattering mechanism, and the corresponding eigenvalue represents the intensity of that scattering mechanism.
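The decomposition in Eq. (4) can be sketched numerically with a Hermitian eigensolver. The example below is illustrative only: the coherency matrix is synthetic, the function name is hypothetical, and `numpy.linalg.eigh` returns eigenvalues in ascending order, which may differ from the ordering convention used elsewhere.

```python
import numpy as np

def cloude_eigen(T):
    """Eigenvalues (ascending) and eigenvectors (columns) of a Hermitian matrix T."""
    eigvals, eigvecs = np.linalg.eigh(T)
    return eigvals, eigvecs

# Synthetic 3x3 coherency matrix built from 8 random scattering vectors
rng = np.random.default_rng(0)
K = rng.normal(size=(3, 8)) + 1j * rng.normal(size=(3, 8))
T = K @ K.conj().T / 8

lam, U = cloude_eigen(T)
# Rebuild T from its eigenvalues and eigenvectors, as in Eq. (4)
T_rebuilt = U @ np.diag(lam) @ U.conj().T
print(np.allclose(T, T_rebuilt))
```

The three real eigenvalues returned per pixel are the features that the rest of the method operates on.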
The eigenvalues and eigenvectors resulting from Cloude decomposition have been widely studied [
We therefore conclude that the eigenvalues of the coherency matrix contain rich polarization information. Moreover, the Gaussian distribution of the eigenvalues makes them more convenient to measure than the Wishart-distributed coherency matrix.
As described in the reference [
Euclidean distance is the most widely used similarity metric, whose characteristics are as follows: 1) the ranges of different features (
Statistic | λ1 | λ2 | λ3 |
---|---|---|---|
Maximum | 0.0090 | 0.0525 | 0.5580 |
Minimum | −0.0047 | 0.00005 | 0.0005 |
result, the Euclidean distance between two points depends largely on the feature λ3. 2) The Euclidean distance does not consider the correlation between features; it treats all features equally and simply accumulates the per-feature differences between two points. 3) The Euclidean distance is applicable to data that strictly obey a Gaussian distribution. However, our data are not strictly Gaussian, as in the area A shown in
The Mahalanobis distance, proposed by the Indian statistician P. C. Mahalanobis and based on multivariate statistics, addresses the above problems of the Euclidean distance. It is an effective similarity metric between unknown sample sets and is also called the covariance distance. Compared with the Euclidean distance, the Mahalanobis distance has the following advantages: 1) it is a normalized distance for non-uniform distributions in Euclidean space, which balances the ranges of different features; 2) it is based on the distribution of the features in the entire space and therefore describes the similarity between two points better; 3) it is applicable to data that obey a Gaussian distribution only approximately [
$$S_{\mathrm{Euclidean}} = (x - y)^{T} \cdot (x - y), \tag{5}$$
$$S_{\mathrm{Ma}} = (x - y)^{T} \cdot C^{-1} \cdot (x - y), \tag{6}$$
where $x$ and $y$ are the feature vectors of two points and $C$ is the covariance matrix, which varies with the input data. Thus, the similarity metric based on the Mahalanobis distance is adaptive.
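To illustrate how Eq. (6) balances feature ranges where Eq. (5) does not, the sketch below compares both measures on synthetic three-dimensional features. The per-feature scales are assumptions chosen only to mimic features with very different ranges, and the function names are hypothetical.

```python
import numpy as np

def squared_euclidean(x, y):
    """Eq. (5): (x - y)^T (x - y)."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(d @ d)

def squared_mahalanobis(x, y, C):
    """Eq. (6): (x - y)^T C^{-1} (x - y), using a linear solve instead of an explicit inverse."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(d @ np.linalg.solve(C, d))

# Synthetic features whose ranges differ by orders of magnitude (assumed scales)
rng = np.random.default_rng(0)
scales = np.array([0.01, 0.05, 0.5])
X = rng.normal(size=(1000, 3)) * scales
C = np.cov(X, rowvar=False)  # covariance estimated from the data itself

origin = np.zeros(3)
step_small = np.array([scales[0], 0.0, 0.0])  # one standard deviation along feature 1
step_large = np.array([0.0, 0.0, scales[2]])  # one standard deviation along feature 3

print(squared_euclidean(origin, step_small), squared_euclidean(origin, step_large))
print(squared_mahalanobis(origin, step_small, C), squared_mahalanobis(origin, step_large, C))
```

Both steps are one standard deviation long in their own feature, so the Mahalanobis values come out comparable, while the Euclidean values differ by several orders of magnitude.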
In segmentation and classification, a pixel and its neighbors have a high probability of belonging to the same class; this is called the spatial coherence constraint. We therefore choose a similarity metric with spatial coherence constraints [
In summary, the final affinity matrix is:
$$S + \alpha \bar{S} \tag{7}$$
where
$$S_{ij} = (x_i - x_j)^{T} \cdot C^{-1} \cdot (x_i - x_j), \tag{8}$$
$$\bar{S}_{ij} = \frac{1}{N_R} \sum_{x_r \in N_k} (x_i - x_r)^{T} \cdot C^{-1} \cdot (x_i - x_r) \tag{9}$$
Here $C$ is the covariance matrix, $x_i$ and $x_j$ are the features of the $i$th and $j$th image pixels, $N_k$ is the set of pixels in the $k \times k$ neighborhood window centered at $x_j$, and $N_R$ is the number of pixels in that window.
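Eqs. (7) to (9) can be sketched for a small set of pixels as follows. This is an unoptimized illustration under stated assumptions: the features are stored as an (H, W, D) array, the window is clipped at the image border (so N_R is smaller there), and all function and parameter names are hypothetical.

```python
import numpy as np

def mahalanobis_sq(a, b, C_inv):
    """(a - b)^T C^{-1} (a - b) for one pair of feature vectors."""
    d = a - b
    return float(d @ C_inv @ d)

def affinity_with_spatial_term(features, pixels, C_inv, k=3, alpha=1.0):
    """S + alpha * S_bar (Eq. 7) for the listed (row, col) pixels.

    features: (H, W, D) array of per-pixel features.
    The k x k window is centered on pixel j and clipped at the border (assumption).
    """
    H, W, _ = features.shape
    half = k // 2
    n = len(pixels)
    out = np.zeros((n, n))
    for a, (ri, ci) in enumerate(pixels):
        xi = features[ri, ci]
        for b, (rj, cj) in enumerate(pixels):
            s = mahalanobis_sq(xi, features[rj, cj], C_inv)          # Eq. (8)
            window = features[max(rj - half, 0):min(rj + half + 1, H),
                              max(cj - half, 0):min(cj + half + 1, W)]
            neighbors = window.reshape(-1, window.shape[-1])
            s_bar = np.mean([mahalanobis_sq(xi, xr, C_inv) for xr in neighbors])  # Eq. (9)
            out[a, b] = s + alpha * s_bar
    return out

# Hypothetical usage on a random 6 x 6 feature image with 3 features per pixel
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 6, 3))
C_inv = np.linalg.inv(np.cov(features.reshape(-1, 3), rowvar=False))
A = affinity_with_spatial_term(features, [(0, 0), (2, 3), (5, 5)], C_inv, k=3, alpha=3.0)
print(A.shape)
```

A practical implementation would vectorize the double loop; the loop form is kept here so that each term maps directly onto the equations.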
As described in [
With the above similarity metric, clustering is completed by grouping the data. However, some commonly used statistics-based clustering algorithms, such as EM, require the data to obey an assumed distribution and are sensitive to initialization. In addition, when Gaussians are fitted, the statistical properties of mixed pixels are unstable. The EM algorithm is therefore not suitable for the eigenvalues, as in the results displayed in
Spectral clustering is a typical clustering algorithm based on a similarity metric. It does not require the data to have a convex structure to ensure a good result, and it is also a discriminative method. Instead of making assumptions about the global structure of the data, spectral clustering first collects local information to indicate how likely two points are to belong to the same class, and then makes a global decision based on a clustering criterion to divide all the data points into disjoint sets.
Spectral clustering has good clustering results, but for larger POLSAR data, the application of classic spectral clustering algorithm has been limited [
the integral characteristic function. The method first randomly selects a small subset of the samples as representative points to solve the eigenvalue problem, and then extends the eigenvectors to the similarity matrix of the entire sample set.
The main steps of the Nyström algorithm are as follows:
Step 1. Randomly select m sample points as the sample subset;
Step 2. Form the affinity matrix of the subset, $W \in \mathbb{R}^{m \times m}$, where $W_{ij} = \exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$ for $i \neq j$ and $W_{ii} = 0$;
Step 3. Eigendecompose W to obtain its eigenvalues and the corresponding eigenvectors, then extrapolate the eigenvectors of the entire similarity matrix;
Step 4. Cluster the first n eigenvectors into n clusters via k-means to obtain the final segmentation result.
Here W is the similarity matrix between the sample points to be clustered, and it contains all the information required for clustering. In our method, W is constructed as $S + \alpha \bar{S}$.
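Steps 1 to 3 can be sketched as below. This is a simplified illustration, not the exact procedure used in the experiments: it uses the Gaussian affinity of Step 2 rather than the proposed metric, omits any normalization or orthogonalization of the extended eigenvectors, takes the sampled indices as an argument so the caller controls the random draw, and leaves the k-means of Step 4 to any standard implementation. All names are hypothetical.

```python
import numpy as np

def gaussian_affinity(X, Y, sigma):
    """Pairwise exp(-||x - y||^2 / (2 sigma^2)) between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def nystrom_embedding(X, sample_idx, n_vec, sigma=1.0):
    """Approximate leading eigenvectors for all points from a sampled subset."""
    sample_idx = np.asarray(sample_idx)
    rest_idx = np.setdiff1d(np.arange(len(X)), sample_idx)
    A = gaussian_affinity(X[sample_idx], X[sample_idx], sigma)  # m x m subset affinity
    np.fill_diagonal(A, 0.0)                                    # W_ii = 0 (Step 2)
    B = gaussian_affinity(X[sample_idx], X[rest_idx], sigma)    # subset vs. remaining points
    lam, U = np.linalg.eigh(A)                                  # Step 3: eigendecompose
    top = np.argsort(lam)[::-1][:n_vec]                         # keep the largest eigenvalues
    lam, U = lam[top], U[:, top]
    U_rest = B.T @ U / lam                                      # Nystrom extension
    embedding = np.empty((len(X), n_vec))
    embedding[sample_idx] = U
    embedding[rest_idx] = U_rest
    return embedding

# Hypothetical usage: two well-separated blobs, 7 sampled points
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(10.0, 0.1, (20, 3))])
emb = nystrom_embedding(X, sample_idx=[0, 1, 2, 3, 20, 21, 22], n_vec=2, sigma=1.0)
print(emb.shape)
```

Only the m × m block is eigendecomposed, which is where the reduction in computational cost comes from; the rows of the returned embedding are what Step 4 would cluster.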
A clustering ensemble combines multiple clustering results of a given task into a final division that has better robustness, novelty, and stability. The key issue is how to obtain a better clustering result from the combination of the memberships of the different clustering results, that is, how to construct and choose the consensus function.
The consensus function combines multiple clustering results into a final division. In [
The Nyström algorithm can effectively reduce the computational complexity. However, its clustering results are unstable because of the random sampling, so we use cluster ensembles to keep the segmentation results stable. We apply the Nyström algorithm k times, each time randomly sampling the same number of samples, to obtain the clustering labels { label_1, label_2, ⋯, label_k }, and then map the labels into the final result with MCLA.
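The idea of combining several label sets can be illustrated with a much simpler consensus function than MCLA. The sketch below is not MCLA: it relabels each run to best agree with the first run (by brute-force search over label permutations, feasible only for a small number of clusters) and then takes a per-pixel majority vote. All names are hypothetical.

```python
import itertools
import numpy as np

def align_labels(reference, labels, n_clusters):
    """Relabel `labels` with the permutation that agrees most with `reference`."""
    # contingency[a, b] = number of points with labels == a and reference == b
    contingency = np.zeros((n_clusters, n_clusters), dtype=int)
    np.add.at(contingency, (labels, reference), 1)
    best = max(itertools.permutations(range(n_clusters)),
               key=lambda p: sum(contingency[a, p[a]] for a in range(n_clusters)))
    return np.array(best)[labels]

def vote_consensus(runs, n_clusters):
    """Majority vote over runs after aligning every run to the first one."""
    reference = np.asarray(runs[0])
    aligned = np.array([align_labels(reference, np.asarray(r), n_clusters) for r in runs])
    votes = np.stack([(aligned == c).sum(axis=0) for c in range(n_clusters)])
    return votes.argmax(axis=0)

# Hypothetical label sets from three runs over six pixels
runs = [np.array([0, 0, 1, 1, 2, 2]),
        np.array([1, 1, 2, 2, 0, 0]),
        np.array([2, 2, 0, 0, 1, 0])]
consensus = vote_consensus(runs, n_clusters=3)
print(consensus)
```

The alignment step is needed because cluster indices are arbitrary in each run; MCLA handles the same correspondence problem by clustering the clusters themselves.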
Our algorithm is divided into three steps: preprocessing, similarity metric construction, and spectral clustering ensemble. Preprocessing: apply a refined Lee filter with a 7 × 7 window, then apply Cloude decomposition to obtain the eigenvalues as input features. Similarity metric: construct the similarity matrix with the Mahalanobis distance. Spectral clustering ensemble: run spectral clustering several times and combine the results with MCLA. The flow chart of the eigenvalue similarity metric based spectral clustering is shown in
1) Flevoland Data Set: This is the NASA/JPL AIRSAR L-band POLSAR dataset of Flevoland, the Netherlands, with a size of 1024 × 750 pixels. The pixel size is 6.6 m in the slant range direction and 12.10 m in the azimuth direction. In
2) San Francisco Data Set: This is fully polarimetric L-band airborne SAR data acquired with the NASA/JPL AIRSAR sensor at the San Francisco Bay test site, which contains a mixed scene of urban areas, vegetation, and ocean. The original data has a size of 1024 × 900 pixels, and the experimental data has a size of 800 × 500. In
For clustering, the separability of two categories depends on the distance between them. We therefore compare the Euclidean distance and the Mahalanobis distance between each pair of categories in
And we also can see the results of
the distance between two categories, but still not enough to separate every point.
To better demonstrate the effectiveness of our method, we choose the following comparison algorithms: 1) H/a/A_Wishart [
a) OVERALL Flevoland Data Set
For the Flevoland image, our method uses 70 random samples, a neighborhood window of size k = 3 (because this image has small blocks rather than fine texture), α = 3, and 7 classes. The number of classes for the spectral clustering_Wishart algorithm is also 7, while the H/a/A_Wishart algorithm produces a fixed 16 classes, which are then manually merged into 7.
The ground truth does not provide a label for each pixel of the entire image so the accuracy calculation is limited to only those pixels where the ground-truth provides a label. Partial ground-truth map is shown in
$$\bar{P} = \frac{1}{N} \sum_{i=1}^{K} \sum_{j=1,\, i \neq j}^{K} \mathrm{Correct}(i, j) \tag{10}$$
where Correct is the number of pixels that appear in both the ground truth and the classification result for each terrain category, N is the total number of pixels, K is the number of terrain categories, and i, j index the pixels of a given category.
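One possible reading of this accuracy, restricted to the pixels that have a ground-truth label, is sketched below. It assumes that unlabeled pixels are marked with -1 and that cluster indices have already been mapped to ground-truth class ids; both conventions, and the function name, are hypothetical.

```python
import numpy as np

def masked_accuracy(predicted, ground_truth, unlabeled=-1):
    """Overall and per-class accuracy over pixels that carry a ground-truth label."""
    predicted = np.asarray(predicted)
    ground_truth = np.asarray(ground_truth)
    mask = ground_truth != unlabeled            # ignore pixels without ground truth
    gt, pred = ground_truth[mask], predicted[mask]
    overall = float(np.mean(pred == gt))        # correctly labeled pixels / N
    per_class = {int(c): float(np.mean(pred[gt == c] == c)) for c in np.unique(gt)}
    return overall, per_class

# Hypothetical 2 x 3 label maps
ground_truth = np.array([[0, 0, -1],
                         [1, 1, 1]])
predicted = np.array([[0, 1, 2],
                      [1, 1, 0]])
overall, per_class = masked_accuracy(predicted, ground_truth)
print(overall, per_class)
```

Here the unlabeled pixel is excluded, so three of the five labeled pixels count as correct.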
From
Method / area accuracy | | | | | | | | P̄ |
---|---|---|---|---|---|---|---|---|
H/a/A_Wishart | ― | 0.90 | 0.82 | ― | 0.72 | 0.32 | 0.85 | 0.72 |
Spectral_Wishart | 0.3 | 0.68 | 0.97 | ― | 0.95 | ― | 0.886 | 0.76 |
Our Method__Eu | ― | 0.88 | ― | ― | 0.93 | 0.50 | ― | 0.33 |
Our Method__Ma | 0.76 | 0.878 | 0.95 | 0.92 | 0.94 | 0.93 | 0.93 | 0.90 |
application of spatial information and cluster ensembles. In addition, H/a/A_Wishart and spectral clustering_Wishart have similar results, but the latter converges faster.
b) PARTIAL Flevoland Data Set
In order to improve the credibility of the contrast algorithm, and contrast with reference [
c) San Francisco Data Set
To demonstrate the robustness of our method, the San Francisco Bay data is tested with the four algorithms. The number of random samples is 70, k = 3, α = 1, and the number of classes is 3. The results of the comparison algorithms are manually merged into 3 classes. As can be seen from the two marked details, our method preserves shapes better than the other methods. At the same time, ocean, city, and forest are clearly separated, and the forest in the upper left corner has good regional consistency. The overall segmentation accuracy of the four methods is shown in
Accuracy (Area/Ground Truth) | H/a/A_Wishart | Spectral_Wishart | Our Method__Eu | Our Method__Ma |
---|---|---|---|---|
/ | 0.91 | 0.63 | 0.19 | 0.98 |
/ | 0.78 | 0.31 | 0.66 | 0.87 |
In summary, our method has good segmentation performance. Its main advantages are that it is simple, fast, and effective. In terms of running time, for the Flevoland image the Nyström algorithm needs 40 s with 70 samples. We set the number of ensemble runs N to 3, so the running time of the entire program is about 2 minutes. Random sampling has little effect on the San Francisco Bay and Xi’an city images, so the cluster ensemble step can be bypassed for them. The comparison algorithms require Wishart iterations, which are time-consuming. In terms of validity, our method guarantees the accuracy of the overall segmentation while preserving details well.
The above three images also show that the method has good robustness. This is because the eigenvalues express the main information of the T matrix.
This paper introduces an approach to POLSAR data segmentation based on an eigenvalue similarity metric. From the scientific and application points of view, it is a new approach to data processing. By analyzing the characteristics of the eigenvalues, we propose a new way of constructing the similarity metric. As a result, our method reduces the complexity of spectral clustering for POL-SAR image segmentation, avoids the choice of the Gaussian kernel parameter, and completes the clustering effectively. The experimental results show that the proposed method has a low time cost, so it meets the processing requirements of land cover observation with SAR images. At the same time, our method not only maintains the overall classification accuracy but also preserves more details of the land cover, so it can be applied to polarimetric SAR image recognition. However, some problems remain, such as edge blur. Our future work is to enhance the discriminative ability of the features by adding other types of features, such as texture and deep abstract features.
This work was supported in part by the National Natural Science Foundation of China (Nos. 61472306, 91438201, 61572383), and the fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project, No. B07048).
Gou, S.P., Li, D.B., Hai, D., Chen, W.S., Du, F.F. and Jiao, L.C. (2018) Spectral Clustering with Eigenvalue Similarity Metric Method for POL-SAR Image Segmentation of Land Cover. Journal of Geographic Information System, 10, 150-164. https://doi.org/10.4236/jgis.2018.101007