
With advances in geospatial data acquisition technology, large volumes of digital data are being collected about our world. These include air- and space-borne imagery, LiDAR data, sonar data, terrestrial laser-scanning data, etc. LiDAR sensors generate huge datasets of points with multiple returns. Because of its large size, LiDAR data has costly storage and computational requirements. In this article, a LiDAR compression method based on spatial clustering and optimal filtering is presented. The method consists of classification and spatial clustering of the study area image and creation of the optimal planes in the LiDAR dataset through first-order plane-fitting. First-order plane-fitting is equivalent to the eigenvalue problem of the covariance matrix. Each eigenvalue of the covariance matrix represents the spatial variation along the direction of the corresponding eigenvector. The eigenvector of the minimum eigenvalue is the estimated normal vector of the surface formed by the LiDAR point and its neighbors. The ratio of the minimum eigenvalue to the sum of the eigenvalues approximates the change of local curvature, which determines the deviation of the surface formed by a LiDAR point and its neighbors from the tangential plane formed at that neighborhood. If the minimum eigenvalue is close to zero, for example, then the surface consisting of the point and its neighbors is a plane. The objective of this ongoing research work is to develop a LiDAR compression method that can be used in the future at the data acquisition phase to help remove fake returns and redundant points.

The large volumes of spatial data and their products have made it necessary to research new data compression techniques. Much research has focused on developing compression methods for aerial and satellite imagery [1,2]. As a result, a number of image compression techniques have been developed for these types of imagery. These compression techniques can generally be categorized into two classes: 1) one that reduces the number of bits and creates a numerically identical replica of the original image; and 2) one that creates a much more compressed replica of the image, but with much degraded quality [

LiDAR sensors generate huge datasets of unstructured point clouds of multiple returns, which may be false signals or correspond to natural or manmade features. Because of its large size, LiDAR data has costly storage and computational requirements [

The method uses the Voronoi diagram to evaluate the local density of the LiDAR points and identify clusters within the data. Then, points in close proximity with elevations within a threshold are selected. The Voronoi tree concept is then used to delete the selected points and update the Voronoi diagram. The final TIN is then built using a randomized incremental algorithm. The methods described above can help produce a compressed LiDAR dataset. However, none of these methods can be used to remove unwanted and redundant points in the LiDAR dataset. The method presented herein can be used to remove unwanted and redundant LiDAR points, producing a much more compressed LiDAR dataset through spatial clustering and optimal plane fitting.

The method adopted in this research consists of (1) classification and spatial clustering and (2) optimal first-order plane fitting of the LiDAR dataset. The schematic diagram of

First-order plane-fitting helps identify the type of surface defined by every LiDAR point and its neighbors, which makes it possible to remove unwanted and redundant LiDAR points as well as fake LiDAR returns. The technical approach adopted in this study is explained below:

The clustering process in this study was performed in two steps: (a) classification of the digital orthoimage of the study area, which was performed using the Bayesian maximum likelihood classification (BMLC), and (b) spatial clustering of the LiDAR dataset. The goal of spatial clustering is to subdivide the data into separate regions that are characterized by a unique property in every region's local neighborhood. These regions are defined by the points located inside them.

In the BMLC method, a Bayesian probability function is calculated from statistics computed on the inputs for the classes established from the training sites. The classification begins by computing statistics for user-selected training sites of land cover classes and uses the results of the statistical summary to classify the image. Each pixel is assigned to the class to which it most probably belongs. Histogram analysis was performed to locate image clusters using intensity and distance metric [
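As an illustration of the classification step, the following is a minimal sketch of a Gaussian maximum likelihood classifier, assuming each class is modeled by a multivariate normal distribution estimated from its training pixels. The function names `train_mlc` and `classify`, the use of training-sample proportions as priors, and the class labels in the usage note are illustrative assumptions, not details taken from the study.

```python
import numpy as np


def train_mlc(samples_by_class):
    """Estimate per-class mean, covariance, and prior from training pixels.

    samples_by_class maps a class name to an (n, bands) array of pixel values.
    Priors are taken from training-sample proportions (an assumption).
    """
    total = sum(len(s) for s in samples_by_class.values())
    stats = {}
    for name, samples in samples_by_class.items():
        x = np.asarray(samples, dtype=float)
        stats[name] = {
            "mean": x.mean(axis=0),
            "cov": np.atleast_2d(np.cov(x, rowvar=False)),
            "prior": len(x) / total,
        }
    return stats


def classify(pixels, stats):
    """Return the most probable class name for each row of pixels."""
    x = np.asarray(pixels, dtype=float)
    names = list(stats)
    scores = np.empty((len(x), len(names)))
    for k, name in enumerate(names):
        s = stats[name]
        diff = x - s["mean"]
        inv = np.linalg.inv(s["cov"])
        _, logdet = np.linalg.slogdet(s["cov"])
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        # Log of prior times Gaussian likelihood, dropping shared constants.
        scores[:, k] = np.log(s["prior"]) - 0.5 * logdet - 0.5 * maha
    return [names[k] for k in scores.argmax(axis=1)]
```

A caller would pass a dictionary such as `{"water": water_pixels, "grass": grass_pixels}` (hypothetical labels) to `train_mlc`, then call `classify` on the pixels of the orthoimage reshaped to one row per pixel.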

The sweeping spatial clustering algorithm was used to determine arbitrarily shaped, possibly nested clusters in the LiDAR dataset. This hierarchical spatial clustering algorithm generates spatial clusters in one pass, as it is based on the sweep-line concept, which is widely known in computational geometry and computer graphics.

This algorithm works in three phases: initializing, sweeping, and finalizing. During the initializing phase, the LiDAR points are sorted according to the direction of the sweep-line movement. In the sweeping phase, a sweep-line moves through the plane and stops to update the data structure whenever it hits a LiDAR point; it continues until the whole LiDAR point set is clustered. In the finalizing phase, the indices of the resulting clusters are ordered in a simple data structure of arrays.
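The three phases can be sketched in a much-simplified form. The code below is not the hierarchical algorithm used in the study: it assumes a single distance threshold `eps` (a hypothetical parameter), sweeps along the x axis, and links any two points closer than `eps`, so it yields flat rather than nested clusters. It is meant only to show how sorting, a moving sweep-line window, and a final label array fit together.

```python
from collections import deque
from math import hypot


def sweep_cluster(points, eps):
    """Label 2-D points so that points linked by gaps <= eps share a cluster."""
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Initializing: sort point indices along the sweep direction (x).
    order = sorted(range(n), key=lambda i: points[i][0])

    # Sweeping: the active window holds points at most eps behind the line.
    active = deque()
    for i in order:
        x, y = points[i][0], points[i][1]
        while active and x - points[active[0]][0] > eps:
            active.popleft()
        for j in active:
            if hypot(x - points[j][0], y - points[j][1]) <= eps:
                parent[find(i)] = find(j)  # merge the two clusters
        active.append(i)

    # Finalizing: map cluster roots to consecutive indices in a flat list.
    labels, ids = [0] * n, {}
    for i in range(n):
        labels[i] = ids.setdefault(find(i), len(ids))
    return labels
```

Because points leave the active window once they fall more than `eps` behind the sweep-line, each point is compared only with nearby candidates, which is what allows the clustering to complete in a single pass over the sorted data.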

The basic features found in a LiDAR point cloud are planes. Once planes are available, edges and points can be obtained by calculating plane intersections. Two methods are commonly used to identify optimal planes: least-squares fitting and principal component analysis. First-order plane fitting is equivalent to the eigenvalue problem of the covariance matrix [_{i} in

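The covariance matrix C of the point p_{i} and its k neighboring points is expressed, up to a constant normalization factor, as:

$$C = \sum_{j=0}^{k} (p_{j} - \bar{p})(p_{j} - \bar{p})^{T}, \qquad C\,v_{l} = \lambda_{l}\,v_{l}, \quad l \in \{0, 1, 2\}$$

where the sum runs over p_{i} and its k neighbors, $\bar{p}$ is the centroid of p_{i} and its k neighbors, and $v_{0}$ is the eigenvector of the smallest eigenvalue $\lambda_{0}$.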

Since the covariance matrix is a real, symmetric, positive semi-definite matrix, its eigenvalues are always greater than or equal to zero. The eigenvector of the minimum eigenvalue is the estimated normal vector of the surface formed by p_{i} and its k neighboring points. The other eigenvectors are the tangential vectors of the surface. If the minimum eigenvalue is close to zero, then the surface consisting of a LiDAR point and its neighbors is a plane. Note that each eigenvalue of the covariance matrix represents the spatial variation along the direction of the corresponding eigenvector. The ratio of the minimum eigenvalue to the sum of the eigenvalues approximates the change of local curvature, which determines the deviation of the surface formed by a LiDAR point p_{i} and its neighbors from the tangential plane formed at that neighborhood. The optimal planes have been created in this study for the clustered LiDAR point set, and a criterion for keeping or removing unwanted, redundant, and fake LiDAR points has been established based on the optimal plane of the LiDAR dataset obtained using the first-order plane fitting method. The success of this compression technique was judged by the compression ratio.
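The eigenvalue analysis described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the function names, the 1/n normalization of the covariance matrix, and the threshold `max_variation` are hypothetical, and the keep-or-remove criterion of the study is not reproduced here.

```python
import numpy as np


def fit_plane_pca(points):
    """Fit a first-order plane to an (n, 3) array of points.

    Returns (centroid, normal, eigenvalues, surface_variation), where
    surface_variation is the minimum eigenvalue over the eigenvalue sum.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # 3x3 covariance matrix of the neighborhood (1/n scaling is assumed).
    cov = centered.T @ centered / len(pts)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    normal = eigvecs[:, 0]  # eigenvector of the smallest eigenvalue
    total = eigvals.sum()
    variation = eigvals[0] / total if total > 0 else 0.0
    return centroid, normal, eigvals, variation


def is_planar(points, max_variation=0.01):
    """True if the neighborhood is close to a plane.

    max_variation is a hypothetical threshold, not a value from the study.
    """
    return fit_plane_pca(points)[3] <= max_variation


def point_plane_distances(points, centroid, normal):
    """Absolute distance of each point to the plane through centroid."""
    pts = np.asarray(points, dtype=float)
    return np.abs((pts - centroid) @ normal)
```

In use, `fit_plane_pca` would be called on a LiDAR point together with its k nearest neighbors within one cluster. A surface variation near zero indicates a planar neighborhood, and `point_plane_distances` gives one possible measure of how far individual returns lie from the fitted plane.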

The LiDAR data used in this work was acquired for a study area in the northeast region of the City of Venice, which is located in Sarasota County, Florida, United States (

Classification of the study area ortho-imagery was performed using the Bayesian maximum likelihood classification (BMLC) method. The process started by computing statistics for selected training sites of land cover classes and used the results of the statistical summary to classify the image. Histogram analysis was performed to locate ortho-image clusters using intensity and distance metrics (

The resulting vector layer was then used to initiate a sweeping spatial clustering algorithm in order to identify clusters in the LiDAR dataset following [

As it can be seen in

LiDAR data in this way helped facilitate the execution of the optimal plane fitting.

Although the optimal planes shown in

A LiDAR data compression method based on spatial clustering and optimal plane fitting was presented in this ongoing research work. The method has produced a compression ratio of 17.8% for the LiDAR dataset of the study area, which is promising. The issue this ongoing study is trying to address, however, is not only the development of a LiDAR compression method with low computational demands. The objective is to develop a compression method that can be applied at the LiDAR acquisition stage so that only the LiDAR points that lie on these optimal planes are recorded. If this goal is achieved, it will help in designing a future LiDAR sensor that records only the points located on these planes.