A Robust Stereo Matching Algorithm Based on...
Transcript of A Robust Stereo Matching Algorithm Based on...
A Robust Stereo Matching Algorithm Based on Disparity Growing of Non-Horizontal Edge Features
S.Shirazi1, P.Moallem2, M. Ashourian3
1Electrical Engineering, Tiran Branch, Islamic Azad University, Isfahan, Iran;
Email : [email protected] 2Electrical Engineering Department, Faculty of Engineering, University of Isfahan,
Isfahan, Iran Email : [email protected]
3Electrical Engineering Department, Majlesi Branch, Islamic Azad University Isfahan, Iran
Email : [email protected]
ABSTRACT
In this paper, a robust stereo matching technique using growing component seeds has been proposed in order to find a semi-dense disparity map. In the proposed algorithm only a fraction of the disparity space is checked to find a semi-dense disparity map. The method is a combination of both the feature based and the area based matching algorithms. The features are the non-horizontal edges extracted by the proposed feature detection algorithm. In order to compute the disparity values in non-feature points, a feature growing algorithm has been applied. The quality of correspondence seeds influences the computing time and the quality of the final disparity map. We show that the proposed algorithm achieves similar results as the ground truth disparity map. The proposed algorithm has been further compared with box filtering, belief propagation, random Growing-Component-Seeds and Canny-Growing-Component-Seeds. According to the obtained results, the proposed method increases the speed of computation ratio of the random GCS algorithm by 31% and the ratio of canny GCS algorithm by 13%. The experimental results show the proposed method achieves higher speed, more accurate disparity map, and the lowest RMS errors.
Keywords: stereo matching, feature based matching, area based matching, fast matching, Mathematics Subject Classification: Image Analysais 62H35
Computing Classification System:
1. INTRODUCTION
Stereo vision is an area of study that attempts to recreate the human vision system by using two or
more views of the same scene to derive 3D depth information about the scene. As an emerging
technology, stereo vision algorithms are constantly being revised and developed, and many
alternative approaches exist for implementing a stereo vision system (Munro, 2009), (Foggia 2007).
Stereo vision based obstacle detection is an algorithm that aims to detect and compute obstacle
depth using stereo matching and disparity map. The disparity map or motion field obtained from the
International Journal of Tomography and Simulation [ISSN 2319-3336]; Year: 2014, Volume: 25, Issue Number: 1; [Formerly known as “International Journal of Tomography & Statistics” (ISSN 0972-9976; 0973-7294)]; Copyright © 2014 by CESER PUBLICATIONS
www.ceser.in/ijts.html www.ceserp.com/cp-jour www.ceserpublications.com
matching stage may then be used to compute the 3D positions of the scene points given the imaging
geometry. Matching techniques can be divided broadly into pixel-based, area-based and feature-
based image matching, or a combination of them. Other types of stereo matching methods such as
diffusion-based (Scharstein 1998), wavelet-based (Kim 1997), phase-based (Porr 1998), and filter-
based (Jones 2000) have also been developed. Some other less common techniques for estimating
image motion or optical flow are reported in (Limongiello 2007) or for medical analysis in (Jainy
Sachdeva et. al. 2012) and Deddy Kurniadi (2010).
In our developed system, the pre-supposition of a stereo system are considered which are: (1)
Epipolar constraint; (2) Uniqueness constraint; (3) Smoothness constraint; and (4) Ordering constraint
or monotonicity constraint.
In all matching algorithms, a trade off between the accuracy and the speed exist. The smaller the field
of searching and calculation, the faster the comparison phase and the less calculation cost will be. In
this paper, the proposed algorithm is compared with box filtering, belief propagation, random Growing-
Component-Seeds and Canny- Growing-Component-Seeds.
The next presents basics of stereo matching. We used segment-based stereo matching using belief
propagation for comparison with the proposed method. The proposed method combining area based
and feature based matching algorithms using growing component seeds on the features is explained
in Section 3. Section 4 provides some experimental results on well-known databases. Finally, the
conclusion is given in the last section.
2. MATCHING ALGORITHMS
The Different comparison criterions have been suggested for the analogy of areas and disparity
computation (Haralick 1993), which most of them are the statistic equations between brightness and
light intensity of the left and right window (Sun 2000). There is some kind of function matching for
stereo matching such as: Sum of Squared Difference (SSD), Sum of Absolute Difference (SAD),
Normalized Cross Correlation (NCC) and Sum of Hamming Distance (SHD) (Moallem 2005), (Fazli
2009),(Yoon 2012). In this paper the applied technique for matching is as follows:
ggffggffgfNCC
����
�.
)).((),( (1)
Where f and g are the intensity values the left and right images within a box. The stereo problem is
solved based on an optimization algorithm. Different optimization techniques, including dynamic
programming (DP) (Mohammadi 2010), graph cuts (Jones 2000), (Limongiello 2007), (Kim 1997), and
genetic algorithm (Kim 1996) have been applied under this framework. These techniques try to find a
global optimized solution under some given conditions defined by limiting some parameters. In order
to avoid searching the entire disparity space, a number of algorithms were proposed which greedily
International Journal of Tomography and Simulation [ISSN 2319-3336]
102
grow corresponding patches from a given set of reliable seed correspondences. Such algorithms
assume that the neighboring pixels have similar disparity, not exceeding the disparity gradient limit or
other constraints (Haralick 1985). Some of the benchmarking algorithms which we compare our result
with them are: Disparity Component Growth (Radim 2006), (Radim 2002), (Sharstein 2002); Box
filtering (Xiaoyan 2012), (Sun 1997), and Segment-Based Stereo Matching Using Belief Propagation
(Sharstein, 2002) (Lankton 2009) realization of this work supposes the availability of a great number
of repetitions of samples responding to the same known theoretical model. In practice, as the
theoretical model is unknown, we use the Monte-Carlo method based on the generation of the data by
computer according to a fixed theoretical model.
3. MATCHING ALGORITHMS
In this paper, we propose a novel stereo matching technique combining the feature based and the
area based algorithms together. Fig. 1 shows an overall view of the proposed algorithm.
Figure 1. Block diagram of the proposed method
Computing final disparity map
Thresholding to smoothing the edges
Constitute S contain extracted edges
Compute image similarity for all seeds s �S
Growing Component Seeds (GCS)
Matching on the special
neighborhood around features
Left Image Right Image
Edge detecting using canny/non-horizontal edge on x orientation
Feature extraction
Phase I Phase II
International Journal of Tomography and Simulation [ISSN 2319-3336]
103
This algorithm consists of two phases; the first is feature extraction and matching and the second one
is disparity growing on a special neighborhood around the extracted features. In the following sections
each part is explained in detail.
3.1. Phase I
In this paper, the extracted features are edges. This consists of the implementation of various image
processing algorithms like edge detection using Sobels, Prewitt, Canny and Laplacian, and so on. A
different technique is reported to increase the performance of the edge detection. We first use the
ancillary conditions taken by the Canny Filter. The Canny operator is used because Canny has shown
that the ideal operator that maximizes the conventional signal-to-noise ratio in detecting a particular
edge correlates with the same edge model itself. However, this detection is not well localized and
requires an additional localization criterion (Moallem 2006), (Shriram 2010).
As mentioned before, the comparison of non-horizontal edges has difficulties in stereo vision and
therefore we choose and use these non-horizontal edges as characteristic points for comparison.
The second suggested algorithm makes it possible by the derivation of non-horizontal edges, to
reduce a huge amount of pixels and fulfill the comparison based on these pixels, which mainly play a
more important role in the comparison. In this algorithm, the volume and capacity of the calculation is
not high and the time of the progress of identifying edges’ points is relatively short. If the noise of
images is so that there is no need of filtering to eliminate them, the following actions may be taken to
identify the non-horizontal edges with the breadth of one pixel: First we use a 3×3 Sobel filter which
estimates the gradient parallel to x direction on both left and right images.
xS� �� �� � �� ��
1 0 -12 0 - 21 0 -1
(2)
To identify the real edges which have noise, the output of the filter has to be greater than a minimum
threshold. Otherwise there will be many edges which most of them are appeared because of the noise.
To solve this problem, we go on by identifying the non-horizontal edges with the breadth of one pixel
(Moallem 2005), ((Sadeghi, 2008) (Jahn 2000), ( Pollard 1985)
The edges’ points derived based on reduced light intensity on their sides are divided into two groups
of positive and negative ones. To reach a diameter of one pixel, we need to calculate the maximum
and minimum of the positive and minimum edge points. A non-horizontal thinned positive edge in a
left image is localized in relation to a pixel when the filter response exceeds a positive threshold �0+
and then a local maximum in the horizontal direction is obtained, therefore:
),1(),(),1(),(
),( 0
yxyxyxyx
yx
ll
ll
l
���
�
������
(3) Local maximum
Threshold
International Journal of Tomography and Simulation [ISSN 2319-3336]
104
Assume that �0+ is the mean positive value of the filter response. The extraction of non-horizontal
positive thinned edge points is the same as the negative ones;
),1(),(),1(),(
),( 0
yxyxyxyx
yx
ll
ll
l
�
�
������
(4)
Now we can extract feature chains (Sadeghi 2008) with lengths more than 3. Each two sequence
feature points in the same chain should be in the same sequence scan lines, and the disparity in the
horizontal direction should be less than 2. Chains with length less than 3 are ignored. In the feature
extraction stage, for both of the left and right images non-horizontal thinned edge extraction is
performed. In the matching stage only similar features are to be compared. To detect non-horizontal
edge points, the left and right images are convolved with a proper gradient mask. A non-horizontal
thinned positive edge is localized in relation to a pixel where the filter response exceeds a positive
threshold and obtains a local maximum in the direction. The positive threshold is selected dynamically
based on the mean positive value of the filter response; which is highly important in wide baseline
stereo. Then the matching of the founded seeds is examined and the matching seeds are found and
saved in Matrix S.
A
B
Figure 2. (a) Non-horizontal positive and negative edge (b) canny edge
3.2. Phase II
The second phase consists of growing extracted features around a special neighborhood. Suppose
an unsorted list of disparity seeds S is given. Each seed is a point in the disparity space, s = (x, x', y )
defined by selecting a certain neighborhood N(s)around this point formed with 16 points constructed
from four sub-sets N1(s) � N2(s) � N3(s) � N4(s), see Fig. 3 where we use the same colors for each:
Local maximum Threshold
International Journal of Tomography and Simulation [ISSN 2319-3336]
105
Figure.3. Disparity space neighborhood used in this paper.
N1(s) = {(x � 1, x' � 1, y), (x � 2, x' � 1, y),(x � 1, x' � 2, y)} ,
N2(s) = {(x + 1, x' + 1, y), (x + 2, x' + 1, y),(x + 1, x' + 2, y)} ,
N3(s) = {(x, x', y � 1), (x ± 1, x', y � 1), (x, x' ± 1, y � 1)} ,
N4(s) = {(x, x', y + 1), (x ± 1, x', y + 1), (x, x' ± 1, y + 1)} .
An empty matching table � is defined and the disparity components are grown by drawing an arbitrary
seed s from S. Then the similarity is computed from small image windows around pixels (u, v) and (u',
v) by for example the normalized cross-correlation (NCC) and adding it to �, or by individually
selecting the best-similarity neighbors qi over its four sub-neighborhoods Ni(s):
),',(maxarg),',()(),',(
yxxsimilivuuqisNiyxx
������
��
(5)
And placing each of these neighbors (qi) to the seed list if their inter-image similarity exceeds a
threshold. Hence, up to four new seeds are created. If a seed from the list Sis already a member of
the matching table, then it will be discarded. If seeds have been already selected, the matching
operation will be stopped for this seed. The development will be stopped by finishing all of the seeds
of S. The output of this phase -here on called Matching on the Special Neighborhood around Features
is a partially filled matching table which its connected regions in 3D represent disparity components
grown around the initial seeds.
The proposed method uses Growing Correspondence Seeds (GCS) for matching. A fast stereo
matching algorithm is proposed that visits only a small fraction of the disparity space in order to find a
semi-dense disparity map. It works by growing from a set of corresponding seeds. Unlike other seed-
growing algorithms, this algorithm guarantees matching accuracy and correctness, even in the
presence of repetitive patterns. In this paper the NCC function matching has been used on a 5×5
window for comparison of the image similarity in all experimental results. We prepare an empty
matching table � and start growing disparity components by drawing a seed s from S, computed by
using the proposed edge detection and adding it to �. In this way higher performance can be achieved
in compare to previous works such as (Cech, 2007). In contrast to the method proposed in (Cech,
2007) which uses random seeds, here we use an S matrix that contains the edges of left and right
images (Cech, 2007). In this way we combine area based and feature based matching. This will
improve the accuracy for real images.
x
x'
y
N1
N2
N3
N4
International Journal of Tomography and Simulation [ISSN 2319-3336]
106
where y is an 1�n vector observations of the dependent variables, X is the matrix kn� of k
explanatory variable, � the vector of n theoretical residuals and � the vector of the theoretical
regression coefficients. It is supposed that the residuals are independent random variable of the same
normal distribution of null mean and constant variance ²� . The parameters to be simulated are X , �
and � , while the vector y is calculated by the model.
4. EXPERIMENTAL RESULTS
In this section some experimental results is provided to evaluate the robustness of the proposed
method. We have used Middlebury, CMU database, the database used in (Kim 1996), and real
images captured by un-calibrated cameras. The program runs on PC with these features: AMD Athlon
64x2 Dual core processor 4800+2.5 Ghz, 896 MB of RAM. For simplicity, the following abbreviation is
used: Belief propagation algorithm as ”BP”; The box filtering as “BF”; Random Growing Component
Seeds as “GCS”; Canny Growing Component Seeds as “Canny+GCS”;non-horizontal edge Growing
Component Seeds as “NHE+GCS” .
Figures 4 to 8 show the original images and the ground truth of them. Figs. 10 to 12 show the left and
right original images. Figs. 4 and 6 are images with rich texture content, and Figs.8 and 10 are the
relative images without texture. Figs. 5, 7, 9, 11, and 13 show the results of all the explained
algorithms. Tables 1, 2, and 3 show the RMS error, time taken, and the percent of fault matching,
respectively. The RMS error is calculates based on Equation 3.
2[ ( ( , ) ( , )) ] /RMSError D i j G i j n� �� (6)
D(i,j) is the disparity value in (i,j) coordinates; and G(i,j) is the disparity value of the ground truth at (i,j)
coordinates. Tables 1, 2 and 3 compare the results.
Figures 5, 7, 9, 11, and 13 show the qualitative evaluation.
International Journal of Tomography and Simulation [ISSN 2319-3336]
107
Original image Ground truth
Figure 4. Original image and Ground truth of Bowling2
Figure 5. Results of all explained algorithms of Bowling
Figure 6. Original image and Ground truth of Aloe
Box Filtering (BF) Algorithm 1 (BP) GCS
Box Filtering (BF) Algorithm 1 (BP)
International Journal of Tomography and Simulation [ISSN 2319-3336]
108
Box Filtering (BF) Algorithm1 (BP) GCS algorithm
Canny+GCS NHE+GCS
Figure 7. Results of all explained algorithms of Aloe
Original image (corner) Ground truth
Figure 8. Original image and Ground truth of corner
International Journal of Tomography and Simulation [ISSN 2319-3336]
109
Box Filtering (BF) Algorithm1 (BP) GCS algorithm
Canny+GCS NHE+GCS
Figure 9. Results of all explained algorithms of corner
Original image (Head left) Original image (Head right)
Figure 10. Original image and Ground truth of Head
International Journal of Tomography and Simulation [ISSN 2319-3336]
110
Box Filtering Algorithm1 GCS algorithm
Canny+GCS NHE+GCS
Figure 11. Results of all explained algorithms of Head
Original real image (left) Original real image (right)
Figure 12. Original image and Ground truth of real image
International Journal of Tomography and Simulation [ISSN 2319-3336]
111
Box Filtering (BF) Algorithm1(BP) GCS algorithm
Canny+GCS NHE+GCS
Figure 13. Results of all explained algorithms of real image
Table 1. RMSError for different algorithms
Bowling2� Flower�pots Baby3 Aloe Cones� Poster
BP 300.2 412.3 201.3 150.3 212.1 100.3
GCS 233.2 351.0 193.1 125.4 147.1 91.9
Canny+GCS 228.5 214.8 200.5 124.5 142.3 75.3
NHE+GCS 225.5 203.4 193.1 109.2 129.2 66.9
International Journal of Tomography and Simulation [ISSN 2319-3336]
112
Table 2.The processing time for different algorithms (ms)
Name�
Algorithm�
Bowling2� Flower_�
pots�
Baby3 Aloe Cones poster� Rotunda
BP� 8.46� 8.25� 9.28 10.79 10.21 16.74� 25.50
BF� 9.60� 9.31� 22.90 15.82 9.80 20.65� 28.32
GCS� 4.90� 7.20� 4.06 3.39 10.43 12.40� 6.45
Canny+GCS� 4.04� 6.35� 3.92 4.56 8.56 10.64� 4.25
NHE+GCS� 3.68� 5.90� 3.60 2.80 6.3 9.40� 2.04
Table 3. Percent of fault matching for different algorithm�
Name
�
Algorithms�
Poster Cones Aloe Baby3 Flowerpots Bowling
NHE+GCS 4.47 4.71 5. 32 9.29 5.53 6.85 GCS 5.97 6.32 6.53 10.86 5.57 10.31
BF 6.10 9.18 9.11 11.30 11.43 10.71
BP 6.36 7.56 6.67 10.75 6.89 10.95
The obtained results prove that the proposed method is fairly fast and robust against shadows.As it
is shown in Figures 5 to 13, BF and BP algorithms are area based algorithm, thus they operate
soundly in highly textured images as well as low textured images. The GCS algorithm achieves an
accurate disparity map, but due to the fact that the chosen growing points are randomly selected,
longer running time is elapsed in compare to the Canny+GCS algorithm and the NHE+GCS algorithm.
The Canny+GCS uses “canny” edge detector so that the pixels of horizontal edges are also grown,
thus it produces a disparity map with lower disparity than the NHE+GCS algorithm. According to
Table 2, the NHE+GCS algorithm is almost 31% faster than basic GCS algorithm, and the
Canny+GCS is 13% faster than basic GCS algorithm.
The NHE+GCS algorithm achieves higher speed and a more accurate disparity map with the
lowest RMS errors. As shown in Figure 13, the NHE+GCS algorithm works well for un-calibrated
(uncertified) real images as well. According to experimental results the NHE+GCS algorithm improves
the speed, accuracy, and robustness, simultaneously.
International Journal of Tomography and Simulation [ISSN 2319-3336]
113
5. CONCLUSION
In this paper a novel stereo matching algorithm has been proposed, namely the NHE+GCS
algorithm. Our algorithm consists of two phases; the first phase is feature extraction and feature
matching and the second one is disparity growing on a special neighborhood around features
extracted. The method combines feature and area based matching. The proposed algorithm has
been compared with box filtering, belief propagation, random Growing-Component-Seeds and canny
Growing-Component-Seeds. The box filtering algorithm based on area increases the accuracy while
the belief propagation increases the speed. The random and canny GCS algorithms increase both the
speed and accuracy but the proposed method increases the speed even more and meanwhile
preserves the accuracy. Thus, this method achieves a high quality disparity map in lower operation
time, suitable for obstacle detection which requires high speed and accuracy in computing the
disparity map.
6. REFERENCES
Cech, Jan Radim Sara.(2007) Efficient Sampling of Disparity Space for Fast and Accurate Matching. In Proc. BenCOS Workshop CVPR, 2007. Chen Q. and G. Medioni. (1999). A volumetric stereo matching method: Application to image-based modeling. In Proc CVPR, pp. 1029–1034. Deddy Kurniadi (2010). Reconstruction of Multislice Image in Electrical Impedance Tomography. International Journal of Tomography & Simulation. 20, Number 2, 2012 Fazli, S. H. Mohammadi Dehnavi, and P. Moallem, (2009). A Robust Obstacle Detection Method in Highly Textured Environments using Stereo Vision, 2nd international conference on machine vision (IEEE), p 97-100. Foggia, P. Jean-Michel Jolion, A. Limongiello, and M. Vento, (2007) Stereo Vision for Obstacle Detection: a Graph-Based Approach, 6th IAPR – TC-15 Workshop on Graph-based Representations in Pattern Recognition (GbR 2007), Alicante June 11-13. Jainy Sachdeva, Vinod Kumar, Indra Gupta, Niranjan Khandelwal, Chirag Kamal Ahuja (2012) Hybrid model for 2-D rigid multimodal registration of brain images, International Journal of Tomography & Simulation, 15, Number 10, 2010 Jones D. G. and J. Malik, (2000) Computational framework for determining stereo correspondence from a set of linear spatial filters,” image processing, 76, no. 5, pp. 321–331. Limongiello A. et al, (2007).“Stereo Vision for Obstacle Detection: a Graph-Based Approach”, Alicante June 11-13, 2007. Kim, Y.-S. J.-J. Lee, and Y.-H. Ha,(1997) Stereo matching algorithm based on modified wavelet decomposition process, Pattern Recognition, 30, no. 6, pp. 929–952. Kim T. and Muller. J. (1996). Automated urban area building extraction from high resolution stereo imagery. IVC, 4:115–130,
International Journal of Tomography and Simulation [ISSN 2319-3336]
114
Haralick R. M., and L.G. Shapiro, (1993). Computer and Robot Vision, Addision volume I, Addison-Wesley, Reading, 1993. Haralick R. M. and L. G. Shapiro. (1985). Image segmentation techniques. CVGIP, 29:100–132, 1985. Jahn, H. (2000). Parallel Epipolar Stereo Matching, ICPR2000, 2,402-405,Spain,2000 S. Lankton (2009). Sparse Field Methods - Technical Report, Georgia Institute of Technology, 2009 Moallem P. and K, Faez, (2005) Effective Parameters in Search Space Reduction Used in a Fast Edge-Based Stereo Matching, Journal of Circuits, Systems, and Computers, 14(2): 249-266. Moallem, P. Ashourian, M. Mirzaeian, B. and Ataei, M. (2006) A Novel Fast Feature Based Stereo Matching Algorithm with Low Invalid Matching, WSEAS Transaction on Computers, Issue 3, Vol. 5, pp. 469 – 477, March. Mohammadi. H. , Shirazi S., Moallem, P. (2010) A Novel Obstacle Detection Method using Stereo Vision and two-stage Dynamic Programming”, ICEE2010. Iran.
Munro, P. Gerdelan, A. P. (2009) Stereo Vision Computer Depth Perception, Country United States City University Park Country Code US Post code 16802 Number of Publications 1,358,061 Number of Authors 1,096,162 Radim Sara.(2006) Robust Correspondence Recognition for Computer Vision. In Proc COMPSTAT, 17th Conference of IASC-ERS Roma, Italy. Radim Sara.(2002) Finding the largest unambiguous component of stereo matching. In Proc. ECCV, pp. 900-14. Sadeghi, H. Moallem, P. and Monadjemi, S. A. (2008). Feature Based Dense Stereo Matching using Dynamic Programming and Color. International Journal of Computational Intelligence, 4 , No. 3, pp. 179-186 Scharstein D.and Szeliski, R. (1998). Stereo matching with nonlinear diffusion, International Journal of Computer Vision, 28, no. 2, pp. 155–174. Shengyong Chen ; Wang, Z. (2012) Acceleration Strategies in Generalized Belief Propagation. IEEE Transactions on Industrial Informatics , 8 , Issue: 1, pp. : 41 - 48 Shriram K V et. al. (2010) Automotive Image Processing Technique using Canny’s Edge Detector International Journal of Engineering Science and Technology; 2 (7), 2632-2644, 2010. Sizintsev, M. ; Wildes, R.P. (2012) Spatiotemporal Stereo and Scene Flow via Stequel Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34 , Issue: 6, pp.: 1206 - 1219 Sun, C. (1997) A Fast Stereo Matching Method, Digital Image Computing: Techniques and Applications, pp.95-100. Sun, C. (2000) Fast stereo matching using rectangular sub-regioning and 3D maximum-surface techniques, International Journal of Computer Vision, 47, no. 1/2/3, pp. 99–117, April-June Yoon, K.-J. (2012) .Stereo matching based on nonlinear diffusion with disparity-dependent support weights, IET Computer Vision, 6 , Issue: 4 Xiaoyan Hu ; Mordohai, P.(2012). A Quantitative Evaluation of Confidence Measures for Stereo Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34 , Issue: 11 , pp. 2121 - 2133
International Journal of Tomography and Simulation [ISSN 2319-3336]
115