Holoscopic 3D imaging is a true 3D imaging technique that mimics the fly's-eye principle to acquire a true 3D optical model of a real scene. To reconstruct the 3D image computationally, an efficient implementation of an Auto-Feature-Edge (AFE) descriptor algorithm is required that provides an individual feature detector for the integration of 3D information to locate objects in the scene. The AFE descriptor plays a key role in simplifying the detection of both edge-based and region-based objects. The detector is based on a Multi-Quantize Adaptive Local Histogram Analysis (MQALHA) algorithm, which is distinctive for each Feature-Edge (FE) block, i.e. the large contrast changes (gradients) in FEs are easier to localise. The novelty of this work lies in generating a noise-free 3D-Map (3DM) through a correlation analysis of region contours, which automatically combines an available depth estimation technique with an edge-based feature shape recognition technique. The application area consists of two distinct domains, which demonstrate the efficiency and robustness of the approach: a) extracting a set of feature-edges for both the tracking and mapping processes of 3D depthmap estimation, and b) separating and recognising in-focus objects in the scene. Experimental results show that the proposed 3DM technique performs efficiently compared with state-of-the-art algorithms.
The development of 3D technologies for the acquisition and visualisation of still and moving 3D images remains a great challenge. Holoscopic 3D imaging, also known as integral imaging, is a type of autostereoscopic 3D technology that offers completely natural 3D colour effects, almost like the real world [
Today’s 3D display technology supplied to viewers is based on stereovision techniques, which require the viewers to wear a special pair of glasses to perceive the left- and right-eye images via spatial or temporal multiplexing [
Many research groups have proposed multi-view autostereoscopic 3D displays where the viewers do not need to wear glasses; however, these require the viewers to stay within a defined distance range from the display, and they also fail to simulate and produce a natural viewing experience. As a result, such techniques are liable to cause eye strain, fatigue, and headaches during continuous viewing while the viewers focus on the image [
Currently, advances in micro-lens manufacturing technologies, together with increases in processing power and storage capabilities, mean that holoscopic 3D imaging has become viable as a future 3D imaging and display technology. It has thus attracted great attention from scientists and researchers across many disciplines.
The fundamental principle of holoscopic 3D imaging is to capture the spatio-angular distribution of the light rays from the object using a lens array [
This paper presents a 3D depth estimation algorithm that uses a feature-based technique to generate robust geometric 3D structures from a holoscopic 3D image. It facilitates correspondence matching for a small set of regularly spaced pixels in the EI array. The AFE descriptor is implemented using MQALHA in the extraction stage. The extracted features are a combination of edge and point features, which have the ability to handle relatively featureless surface contours.
The quality of the segmentation for object recognition is of vital significance when evaluating whether an object presence detection algorithm has met its objectives (i.e. identifying precise object contours and coarser structures). This paper addresses the 3D depth estimation challenge in the holoscopic 3D imaging system, improving recognition by accurately identifying object contours.
3D imaging technologies overcome the limitations of 2D imaging in terms of object size processing. They offer more low-level features and cues than 2D imaging technology and are thus attracting researchers’ interest in the 3D area.
Manolache et al. [
A practical feature-based matching approach for obtaining unidirectional 3D depth measurement from the holoscopic 3D image, through disparity analysis and viewpoint image extraction, was explored and presented by Wu et al. [
rather than a macro block of pixels corresponding to a micro-lens unit. Each VPI presented a two-dimensional parallel recording of the 3D scene from a particular direction. The calculation of the object depth was from the disparity estimation between these VPIs’ displacements using a depth equation. The resulting 3D images contain enormous amounts of non-useful information and homogeneous regions.
Zarpalas et al. [
Recently, the authors in [
The authors created automatic descriptors that concentrated on extracting reliable sets of features to exploit the rich information contained in the central VPI. Furthermore, the approach was built on a new correspondence and matching technique based on an adaptive aggregation window size. A novel approach to setting feature-point blocks and matching the corresponding features was achieved automatically. The advantage of this framework was that it solved the disadvantages identified in [
Recently, an innovative approach has been proposed for 3D depth measurement [
This framework is an extension of the author’s previous work [
The proposed AFE detection algorithm can extract object boundaries and discard non-boundary edges caused by textures on object surfaces that affect the accuracy of 3D measurement.
In this approach, the model-building task of the 3D depthmap estimation depends on either the explicit recovery or indirect inference of the 3D features embodied in the depth data. The 3D objects can be adequately modelled in terms of feature extraction from the plane patches and spatial connection among the plane patches to identify the structure of the 3D world and provide the essential constraints of matching. The efficacy of the approach crucially depends upon the robust estimation of these features and their connection. The proposed approach is an unsupervised edge detection algorithm that automatically determines thresholds in image edge extraction. The non-boundary feature-edges are not desired in the depthmap estimation stage. The initial extracted feature-edges are optimized by other phases of the proposed AFE detector to extract the reliable feature-edge that is used for the depthmap estimation stage.
The three stages of the proposed approach for generating an accurate 3D depth-cues map, which is used to estimate and extract 3D objects, are shown in
The key idea behind this stage is the resampling of the collected data into the form of a VPI for the subsequent process. The holoscopic 3D imaging records a set of 2D images from a 3D scene on a photographic film placed at the focal plane of the micro-lens sheet (
To this end, “Viewpoint Image Extraction” is an important initial stage, which transforms the EIs into VPIs by exploiting the strong interval correlation between pixels displaced by one micro-lens. Obtaining a single orthographic view (VPI) from unidirectional or omnidirectional H3DI data requires sampling all pixels at the same location under different micro-lenses, resulting in a single view direction of the recorded scene. This image, called the viewpoint image (VPI), contains information about the scene from one particular view direction. The author’s previous work [
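The resampling step described above can be sketched as a gather of same-offset pixels across the micro-lens grid. The helper name, array shapes, and the assumption of a square per-lens pixel block are illustrative, not the paper's implementation:

```python
import numpy as np

def extract_viewpoint_images(h3di, lens_px):
    """Resample a holoscopic 3D image into viewpoint images (VPIs).

    h3di    : 2D grayscale holoscopic image (H, W), multiples of lens_px
    lens_px : number of pixels under each micro-lens (per axis, assumed square)

    Returns an array of shape (lens_px, lens_px, H//lens_px, W//lens_px):
    VPI (i, j) collects the pixel at offset (i, j) under every micro-lens,
    i.e. one orthographic view direction of the recorded scene.
    """
    H, W = h3di.shape
    gh, gw = H // lens_px, W // lens_px
    # Reshape so the per-lens offset becomes its own index, then reorder axes:
    # blocks[r, i, c, j] == h3di[r*lens_px + i, c*lens_px + j]
    blocks = h3di[:gh * lens_px, :gw * lens_px].reshape(gh, lens_px, gw, lens_px)
    # Result axes: (offset row i, offset col j, lens grid row r, lens grid col c)
    return blocks.transpose(1, 3, 0, 2)
```

Each slice `vpis[i, j]` is then one VPI, ready for the subsequent feature detection and disparity analysis stages.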
An Auto-Feature-Edge (AFE) detector algorithm for the holoscopic 3D imaging system is necessary to produce an accurate true volume. The main aim of using an AFE is to have a single detector that is able to integrate 3D cues and locate objects in a scene. The feature detection stage detects both edge-based and region-based features using an automatic threshold selection method for image analysis to provide efficacious features. For each FE block this is an individual process of looking for large contrast changes. The FE points are pixels at or around which the image values undergo sharp variations.
There are three steps in setting features through edge detection: first, suppress noise; then detect and enhance edges; and finally localise and accurately extract feature points. These extracted features are used for the integration of 3D cues to locate objects. Thresholding also plays a key role in feature detection due to its simplicity, high speed of operation, and ease of implementation [
achievable when using a single technique across a wide range of image content. To extract distinctive Feature-Edge (FE) blocks, i.e. large contrast changes (gradients), the integration of more than one technique is a suitable approach to enhance the performance of edge-based feature shape recognition.
1) Feature Edge Enhancement Phase
First or second order derivative masks in the spatial domain are used in the edge enhancement phase to smooth the image and calculate the potential edge features. The following explains the main steps of the phase:
a) Smooth the VPI to reduce the noise by convolving it with a discrete Gaussian window of size 3 derived from the 2D Gaussian function G(x, y) = (1/(2πσ²)) exp(−(x² + y²)/(2σ²)) (1)
where, the parameter σ indicates the width of the Gaussian distribution that defines the effective spread of the function.
b) The boundary between two regions is detected using the increase in gradient magnitude as the contrast between FEs on a smoothed VPI (
Therefore, for the smoothed VPI S(x, y) the gradient can be written as ∇S = (∂S/∂x, ∂S/∂y), with magnitude |∇S| = √((∂S/∂x)² + (∂S/∂y)²) and direction θ = arctan((∂S/∂y)/(∂S/∂x)).
c) Thin the contour to a single pixel width; thus, the non-maxima suppression process is performed to maintain only FEs whose gradient magnitude is the local maximum. Each pixel in the gradient magnitude
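The three steps above (smoothing, gradient computation, non-maxima suppression) can be sketched as follows; the Sobel operator and the 4-direction quantisation of the gradient angle are illustrative choices, not necessarily those of the original implementation:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def feature_edge_enhance(vpi, sigma=0.8):
    """Sketch of steps (a)-(c): Gaussian smoothing, gradient magnitude,
    and non-maxima suppression thinning contours to single-pixel width."""
    smoothed = gaussian_filter(vpi.astype(float), sigma)   # (a) noise suppression
    gx = sobel(smoothed, axis=1)                           # (b) gradient components
    gy = sobel(smoothed, axis=0)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    thin = np.zeros_like(mag)                              # (c) non-maxima suppression
    H, W = mag.shape
    for r in range(1, H - 1):
        for c in range(1, W - 1):
            a = ang[r, c]
            if a < 22.5 or a >= 157.5:                     # ~horizontal gradient
                n1, n2 = mag[r, c - 1], mag[r, c + 1]
            elif a < 67.5:                                 # ~45 deg (rows grow down)
                n1, n2 = mag[r + 1, c + 1], mag[r - 1, c - 1]
            elif a < 112.5:                                # ~vertical gradient
                n1, n2 = mag[r - 1, c], mag[r + 1, c]
            else:                                          # ~135 deg
                n1, n2 = mag[r + 1, c - 1], mag[r - 1, c + 1]
            if mag[r, c] >= n1 and mag[r, c] >= n2:
                thin[r, c] = mag[r, c]                     # keep only local maxima (FEs)
    return thin
```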
2) Feature Edge Extraction Phase
The reliable feature edges are selected in this phase by processing the computed gradient of the VPI. The goal of this step is to increase the continuity of the FEs, improve the computational speed of the multi-resolution aspect, and optimize the technique for use across a range of noise levels. The following sets out this edge extraction process in more detail.
The MQALHA algorithm was used to design and implement a new first-derivative operator based on the edge detection method to provide the best trade-off between detection, edge localization, and resolution [
The following two steps demonstrate the MQALHA algorithm:
First Step: Image Quantization
Using the scheme presented in [
The quantization step is required to robustly represent the local histogram without losing information, and to enable the processing to be conducted on small blocks.
Second Step: Optimal Threshold
To extract the two edge maps at different resolutions, local histogram smoothing and thresholding are undertaken on the quantized VPIs, which are divided into 4 × 4 non-overlapping blocks [
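One plausible reading of this block-wise scheme can be sketched as below. The quantisation level count, the handling of 4 × 4 non-overlapping blocks, and the Otsu-style threshold search over the smoothed local histogram are illustrative assumptions rather than the exact MQALHA formulation:

```python
import numpy as np

def local_adaptive_edge_map(grad_mag, levels=16, block=4):
    """Quantise the gradient magnitude, then threshold each non-overlapping
    block using its own smoothed local histogram (Otsu-style search)."""
    q = np.floor(grad_mag / (grad_mag.max() + 1e-9) * (levels - 1)).astype(int)
    edges = np.zeros(q.shape, dtype=bool)
    H, W = q.shape
    for r in range(0, H - H % block, block):
        for c in range(0, W - W % block, block):
            blk = q[r:r + block, c:c + block]
            hist = np.bincount(blk.ravel(), minlength=levels).astype(float)
            hist = np.convolve(hist, np.ones(3) / 3.0, mode='same')  # smoothing
            p = hist / hist.sum()
            best_t, best_var = 0, -1.0
            for t in range(1, levels):            # between-class variance search
                w0, w1 = p[:t].sum(), p[t:].sum()
                if w0 == 0 or w1 == 0:
                    continue
                m0 = (np.arange(t) * p[:t]).sum() / w0
                m1 = (np.arange(t, levels) * p[t:]).sum() / w1
                var = w0 * w1 * (m0 - m1) ** 2
                if var > best_var:
                    best_var, best_t = var, t
            if best_var <= 0:
                continue                          # flat block: no local edges
            edges[r:r + block, c:c + block] = blk >= best_t
    return edges
```

Running the same procedure at two quantisation levels would yield the two edge maps at different resolutions that the combination process then merges.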
The edge combination process is applied to combine the two FE maps shown in
The objective of this stage is the implementation of the resultant FE map within a practical approach for obtaining depth through VPI extraction and disparity analysis. The approach has been presented in [
The following steps detail the five phases for the generation:
1) Feature Descriptors Phase: For any object in an image, interesting points on the object can be extracted to provide a feature description of the object. These descriptions are procured through the use of training images. A feature edge is considered a “good” interesting point if its local maxima are greater than the preset threshold; low-contrast areas and weak feature edge points are discarded. Interesting points are extracted from the second stage using the training center VPI to provide a feature description of the object and are stored in a database. Feature descriptors have been designed so that Euclidean Distances (EDs) in feature space can be used directly to grade possible matches. The simplest matching technique for corresponding and matching feature blocks is to set a threshold and eliminate all matches that fall outside it. However, setting the matching threshold is difficult, as it depends partially on the variation of the imagery being processed. Accordingly, a novel, simple auto-threshold technique has been used to detect feature points in the matching process, avoiding the need for manual parameter specification and hence saving time. Ideally, its value should be adjusted depending on the different regions of the feature space, and it should be simple to implement and time efficient. Low thresholds result in incorrect matches being accepted as correct, whereas high thresholds result in correct matches being missed.
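A hedged sketch of such distance-graded matching follows. The statistic used to derive the threshold automatically (mean minus a multiple of the standard deviation of the nearest-neighbour distances) is an illustrative stand-in, not the paper's exact auto-threshold rule:

```python
import numpy as np

def auto_threshold_matches(desc_a, desc_b, k=1.0):
    """Match descriptors by Euclidean distance in feature space, accepting
    only matches below a data-driven threshold (no manual tuning).

    desc_a, desc_b : (n, d) and (m, d) arrays of feature descriptors
    Returns a list of (index_in_a, index_in_b) accepted matches.
    """
    # Pairwise Euclidean distances between the two descriptor sets
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn = d.argmin(axis=1)                         # nearest neighbour in desc_b
    nn_dist = d[np.arange(len(desc_a)), nn]
    # Adaptive threshold from the distance distribution itself
    thresh = nn_dist.mean() - k * nn_dist.std()
    return [(i, int(j)) for i, (j, dist) in enumerate(zip(nn, nn_dist))
            if dist <= thresh]
```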
2) Feature Tracking and Matching Phase: For the extracted 2D VPIs, it is often necessary to track the displacement of individual features from VPI to VPI. Features desirable for tracking are often selected in the same way as features suitable for matching. In the matching process, the small displacements between pairs of VPIs (i.e. reference VPI and target VPI) are calculated using the sum of squared differences (SSD) cost function. To speed up the feature tracking and matching phase, a learning algorithm has been used to build special-purpose recognizers that rapidly search for features in entire VPIs [
The SSD cost can be written as C(c, d) = Σ_{p∈w} [I1(p) − I2(p + d)]², where C(c, d) is the cost function and d is a horizontal displacement vector between pixel p of the window (block) w in the I1 center VPI and (c + d) in the I2 target VPI. The Multi-Baseline Algorithm (MBA) matching score function is used, where there are a number of stereo pairs (P) [
where, SSSD is the sum of SSD which minimizes the depth map (D) and P is the pairs of the VPIs. To determine the initial disparity of the central pixel c (xl, yl), the score for all neighboring pixels (nb) is calculated using different values of
where cb represents the window around pixel c and R is the search area. Disparity is estimated with the “winner-takes-all” method by selecting only the disparity label with the lowest cost. The disparity is then verified as to whether it is dominant in the block centered on pixel c (cb). If it is, this disparity is considered the final disparity and no further refinement is required; if it is not, further refinement is required.
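A minimal sketch of the SSD/SSSD winner-takes-all matching described above follows. The linear scaling of the candidate displacement with the baseline index of each VPI pair is an illustrative assumption:

```python
import numpy as np

def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized blocks."""
    diff = block_a.astype(float) - block_b.astype(float)
    return float((diff * diff).sum())

def winner_takes_all_disparity(center_vpi, target_vpis, x, y, w=3, R=8):
    """For the block around pixel (y, x) in the center VPI, accumulate SSD
    over all VPI pairs (the multi-baseline SSSD) for each candidate
    displacement d in the search range R, and return the disparity with
    the lowest total cost ("winner takes all")."""
    ref = center_vpi[y - w:y + w + 1, x - w:x + w + 1]
    best_d, best_cost = 0, float('inf')
    for d in range(R + 1):                      # candidate disparities
        cost = 0.0                              # SSSD over all pairs P
        for p, tgt in enumerate(target_vpis, start=1):
            dx = d * p                          # displacement scales with baseline
            if x - w + dx < 0 or x + w + dx >= tgt.shape[1]:
                cost = float('inf')
                break
            cand = tgt[y - w:y + w + 1, x - w + dx:x + w + dx + 1]
            cost += ssd(ref, cand)
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```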
3) Feature Disparity Refinement Phase: The resulting disparity map described above is not optimal because it still contains some noise and errors. Therefore, an extra step is necessary to detect and remove erroneous estimates on occlusions. The aim of this phase is to reduce artefacts and to resample the disparity maps to correct inaccurate full resolution disparity values and handle occlusion areas. For this, the following adaptive weighting factor approach by [
blocks nb, where,
where the high variance is closer to the centre block “cb” and “wc” is the weighting factor for the center point “c”. The colour difference term is calculated as the Euclidean distance between neighbor blocks in the CIELab colour space of pixel cb (Ccb = [Lcb, acb, bcb]) and pixel nb (Cnb = [Lnb, anb, bnb]), which is expressed as ΔC(cb, nb) = √[(Lcb − Lnb)² + (acb − anb)² + (bcb − bnb)²].
The voting scheme draws on the support region; each pixel p therefore collects votes from reliable neighbours within the support aggregation window of the center pixel “c”, and each disparity dn contributes one vote, which is accumulated in the set Votep(d). The final disparity of cb is decided by the maximum majority weighted vote number.
By use of the smoothing terms in Equations (6) and (8), Equation (9) can be rewritten with the weighting applied to each block nb. With this improvement, the disparity map is more accurate, as the calculation of the new disparity involves only the FE blocks belonging to the same region (object).
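The voting refinement can be sketched as below. The exponential colour weight and the choice of support radius are illustrative assumptions standing in for the paper's Equations (6)-(9); only the principle (reliable neighbours vote, weighted by CIELab colour similarity, and the maximum weighted vote wins) is reproduced:

```python
import numpy as np

def refine_disparity_votes(disp, lab, reliable, radius=2, gamma_c=10.0):
    """Each pixel gathers disparity votes from reliable neighbours inside
    its support window; each vote is weighted by colour similarity in
    CIELab, and the disparity with the maximum weighted vote wins.

    disp     : (H, W) integer disparity map
    lab      : (H, W, 3) CIELab image aligned with disp
    reliable : (H, W) boolean mask of trustworthy estimates
    """
    H, W = disp.shape
    out = disp.copy()
    for r in range(H):
        for c in range(W):
            votes = {}
            for rr in range(max(0, r - radius), min(H, r + radius + 1)):
                for cc in range(max(0, c - radius), min(W, c + radius + 1)):
                    if not reliable[rr, cc]:
                        continue
                    dc = np.linalg.norm(lab[r, c] - lab[rr, cc])  # CIELab distance
                    wgt = np.exp(-dc / gamma_c)                   # colour weight
                    d = disp[rr, cc]
                    votes[d] = votes.get(d, 0.0) + wgt
            if votes:
                out[r, c] = max(votes, key=votes.get)  # majority weighted vote
    return out
```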
4) 3D Depthmap Calculation Phase: The depth equation D = (d·ψ·F)/Δ is derived through geometrical analysis of the optical recording process, giving the mathematical relationship between the object depth and the corresponding VPI pair displacement d, where D is the corresponding depth to be calculated, ψ and F are the pitch size and the focal length of the recording micro-lens, respectively, d is the disparity of the object point within the two extracted VPIs, and Δ is the sampling distance between the two VPIs. The noise-free 3DM is calculated from the VPIs by establishing the corresponding disparity from Equation (10).
5) Feature 3D Map Smoothing: Finally, a further disparity enhancement process is used to maintain a good trade-off between accuracy and processing time. A median filter is applied to further smooth the resulting 3D depthmap. The median filter is a robust method, often used to remove impulsive noise from an image [
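Phases 4 and 5 combine into a short computation: the depth equation D = (d·ψ·F)/Δ from the text, followed by median smoothing. The 3 × 3 filter window is an illustrative assumption, as the paper does not state the kernel size:

```python
import numpy as np
from scipy.ndimage import median_filter

def disparity_to_depth(disp, psi, F, delta):
    """Depth from the paper's equation D = (d * psi * F) / delta, where psi
    and F are the micro-lens pitch and focal length and delta the sampling
    distance between the two VPIs, followed by median smoothing to
    suppress impulsive noise in the resulting 3D depthmap."""
    depth = disp.astype(float) * psi * F / delta
    return median_filter(depth, size=3)   # assumed 3x3 window
```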
This section presents the results of the above-described approach on two data sets, with comparisons against state-of-the-art 3D depthmap estimation from the H3DI system, to quantitatively evaluate the performance of the proposed algorithm. The efficiency of this technique and its core differences have been demonstrated by implementing the algorithm on a synthetic database, which provides knowledge of the ground truth. The AFE detector algorithm is tested on both real-world and computer-generated holoscopic 3D content using unidirectional and omnidirectional holoscopic 3D images.
A feature extraction module with a robust edge detection grouping technique often involves the identification of all feature edges in the disparity map analysis. The FEs map is more distinctive and more robust to camera VPI changes. These edges provide robust local feature cues that can effectively help to provide geometric constraints and establish robust feature correspondences between VPIs.
To evaluate the efficiency of the proposed descriptor, i.e. AFE descriptors, Section 4.1 provides description and properties of synthetic database [
Two types of data have been used to evaluate the performance of the proposed technique and to compare it with counterpart techniques.
1) Synthetic Holoscopic 3D Images
In this assessment 32 synthetic H3DIs with known dense ground-truth disparity maps were used to evaluate the performance of the AFE detection algorithm. A sample of 28 dataset images is available to the public in [
2) Real World Holoscopic 3D Images
The proposed algorithm has been tested on real data of unidirectional holoscopic 3D images, which are commonly used with current state-of-the-art techniques [
To evaluate the performance of the proposed feature edge detection technique, two commonly used feature detection descriptors were used to select local features in VPIs. The evaluation was carried out on real and synthetic unidirectional and omnidirectional holoscopic 3D images with different photometric transformations and scene types. The Speeded Up Robust Features (SURF) descriptor was proposed by Bay et al. in 2006 [
where the Mean Absolute Percentage Error is computed as MAPE = (100/n) Σi |Ai − Fi|/|Ai|, with Ai the ground-truth value and Fi the estimated value for sample i.
Based on the experimental results, the SIFT descriptor MAPE is 4.48%, while SURF achieves 4.59% and the AFE detector 3.55% for the same quantity of detected FEs (700 - 750). The evaluation results on
The proposed algorithm showed superior performance in identifying more reliable feature-edges and is more robust to noise, which is very important for making the estimated depth map more reliable. It therefore overcame most of the difficult situations experienced in disparity analysis techniques, i.e. untrackable regions (object borders, object occlusion and reappearance), by identifying strong feature-edges before the matching process. This matters mainly when two different displacements exist within the matching block in the matching analysis between pairs of VPIs. In other words, the performance of the AFE detector technique is of great importance in achieving correct depth estimation at object borders. In terms of speed (run time), the proposed AFE technique was faster than SIFT and slightly slower than SURF; the current SIFT therefore cannot be used for real-time image or video retrieval and matching [
This section provides a few chosen sets of results from varied captured images with multiple objects. The goal was to select a variety of practically relevant situations where long-range and different plane depths are important. The evaluation of depth map results is based on error measures using the existing ground truth depth map of the synthetic database in [
Algorithm | Number of Detected Features | MAPE % | Matching Time (s) |
---|---|---|---|
SIFT | 690 | 4.48 | 7.167 |
SURF | 720 | 4.59 | 3.756 |
Authors Previous Work [ | 2910 | 5.06 | 9.665 |
Manual Threshold [ | 2875 | 9.78 | 20.592 |
Proposed AFED (Auto Feature Edge Detection), all detected FEs | 4736 | 4.72 | 10.895 |
Proposed AFED (Auto Feature Edge Detection), restricted set | 700 - 750 | 3.55 | 4.086 |
which are used in [
The results in
The performance of the proposed approach has been evaluated on a test-set comprising a wide variety of real-world and synthetic dataset images to quantitatively compare the effects of various cues. In this section the reported results are based on the following contributions of the proposed AFE algorithm:
A feature extraction module with a robust edge detection grouping technique that often involves the identification of all feature edges in the disparity map analysis. The FEs map is more distinctive and more robust to camera VPI changes, and these edges provide robust local feature information that can effectively help to provide geometric constraints and establish robust feature correspondences between VPIs.
Parameters | Setting |
---|---|
Pairs no. of VPI | 6 |
Window size | 17 × 17 |
No. of neighbour blocks-NBN | 8 |
Max disparity-R | 20 |
Probability voting adaptive window shape size | 3 |
2D Gaussian filter size | 3 × 3 |
Colour threshold pixel neighbour | 2 |
Gaussian standard deviation (starting σr) | 0.8 |
Similarity measurement method | SSD |
Methods | MAPE % |
---|---|
[ | 38.281 |
[ | 11.609 |
[ | 27.25 |
[ | 9.278 |
Proposed approach/auto local method | 8.44 |
from the extracted VPIs technique. Furthermore, although the viewpoint is constrained, the image resolution is low, so not many details are visible; indeed, the VPIs’ quality varies (brightness, contrast, and sharpness) due to factors including illumination and focusing. Feature detection and extraction is therefore constrained by the VPI quality and the resolution.
However, 3D object segmentation is beyond the scope of the proposed work. The proposed AFE algorithm can be used to provide a simple image segmentation technique for object recognition from the holoscopic 3D imaging system. This can create separate object segments, separating non-overlapping objects in the scene from the background. In this technique, the results of the FE detection guide the identification of regions, and every connected FE is then converted into a set of connected segments (contours). The detected edge pixels are therefore used to automatically extract the seeds required for growing a region through pixel aggregation of the interior pixel area of each object. The advantages of this method are that the segmentation is obtained without prior knowledge of the number, position, size, or shape of the objects, and there is no requirement for a homogeneity criterion threshold in the region-growing process. Furthermore, this mask can be used as a filtering process to improve the accuracy of the depth map extracted by different methods, by removing background noise and separating the foreground. For more details see the author’s previous work [
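The edge-guided segmentation idea can be sketched as follows, using hole filling and connected-component labelling as a compact stand-in for the seed-based region growing described above (the specific morphological operations are illustrative assumptions, not the paper's procedure):

```python
import numpy as np
from scipy.ndimage import label, binary_fill_holes

def segment_objects_from_edges(edge_map):
    """Turn closed feature-edge contours into separate object segments,
    with no homogeneity threshold and no prior knowledge of the number,
    position, size, or shape of the objects.

    edge_map : (H, W) boolean map of detected feature-edge pixels
    Returns (labels, n_objects): an integer label image (0 = background)
    and the number of detected objects.
    """
    filled = binary_fill_holes(edge_map)   # close each contour's interior
    interiors = filled & ~edge_map         # keep interior pixels only
    labels, n_objects = label(interiors)   # one label per grown region
    return labels, n_objects
```

The resulting label image can then serve directly as the filtering mask mentioned above, separating foreground objects from background noise in the extracted depth map.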
The proposed method achieved better performance in 3D depth measurement estimation directly from disparity maps of the sampled VPIs. Setting and extracting efficient and informative features consequently played a significant role in the performance of a precise depthmap estimation with well-defined boundaries. This paper described a computational method for edge detection, feature-edge extraction, and object identification in a holoscopic 3D imaging system where edges were the main features. It functioned well with both real and synthetic unidirectional and omnidirectional holoscopic 3D images, since the feature-edge detection was formulated as a 3D object recognition problem. We devised an optimization scheme that resolved the 3D depth-contour problem by combining a semi-global disparity map estimation technique with an FE detection technique to automatically generate the 3D depthmap, where the depth cues were derived from an interactive labelling of feature-edges using a novel auto feature-edge detection algorithm. Objects at different depths were successfully detected, and it was clearly shown that the algorithm overcame the depth-contour problem for holoscopic 3D images and produced an accurate object scene. In addition, it improved the performance of the depth estimation in terms of generalisation, speed, and quality. The use of the feature-edge to estimate an accurate 3D object contour map, and the recognition and separation of objects in the optimization process along with the smoothness constraints, were the basis of its success. Experiments identifying the precise object contours and the objects present in a scene showed that the proposed method outperformed other state-of-the-art algorithms, as illustrated in
Eman Alazawi, Mohammad Rafiq Swash and Maysam Abbod (2016) 3D Depth Measurement for Holoscopic 3D Imaging System. Journal of Computer and Communications, 4, 49-67. doi: 10.4236/jcc.2016.46005