
New Stereo Vision Algorithm Composition Using Weighted Adaptive Histogram Equalization and Gamma Correction

Abstract

This work presents the composition of a new algorithm for a stereo vision system to acquire accurate depth measurement from stereo correspondence. Stereo correspondence produced by matching is commonly affected by image noise such as illumination variation, blurry boundaries, and radiometric differences. The proposed algorithm introduces a pre-processing step based on the combination of Contrast Limited Adaptive Histogram Equalization (CLAHE) and Adaptive Gamma Correction Weighted Distribution (AGCWD) with a guided filter (GF). The matching cost is then computed on the pre-processed images using the census transform (CT), followed by aggregation using the fixed-window and GF techniques. A winner-takes-all (WTA) approach is employed to select the disparity with the minimum aggregated cost, and final refinement is performed using left-right (LR) consistency checking along with a weighted median filter (WMF) to remove outliers. The algorithm improved the accuracy by 31.65% for all pixel errors and 23.35% for pixel errors in nonoccluded regions compared to several established algorithms on the Middlebury dataset.


1 Introduction

Stereo vision is a key topic in image processing studies, including the architecture to acquire depth estimations for translating a 2D perspective into 3D scenes. The architecture relies on stereo matching, the process that produces the disparity from which a depth map is created, and it is being studied in order to increase the disparity accuracy. Depth maps can be applied in various methods for 3D scene reconstruction in fields such as medical imaging, virtual reality, and autonomous navigation [1]. The stereo vision algorithm can be based on three types of approaches: local, global and semiglobal [2]. All techniques are structured into a basic four-stage taxonomy: matching cost computation, aggregation of cost, disparity selection, and final refinement. The simplest stereo vision computation method is a local approach using pixel-window comparison, resulting in a short execution time. Global methods use energy minimization in identifying the disparities, thus contributing to higher disparity accuracy but resulting in a longer execution time. Semiglobal methods are a trade-off between the local and global approaches.

Stereo correspondences produced by matching are commonly affected by image noise such as illumination variation, blurry boundaries, and radiometric differences. In this work, a new algorithm with local composition is proposed to focus on these issues and acquire an accurate disparity map from the stereo correspondence process. The Middlebury dataset, consisting of 15 stereo image pairs, was used as the evaluation platform [3]. The pre-processing step applies two different techniques: the weighted adaptive histogram equalization of CLAHE, and AGCWD. Low texture and edge regions are enhanced and smoothed by applying a guided filter (GF). CLAHE and AGCWD were selected because they operate on small image tiles to improve contrast, especially at large surfaces and blurry edges. The matching cost is computed on the pre-processed images using the census transform (CT), followed by aggregation using the fixed-window and GF techniques. A winner-takes-all (WTA) approach is employed to select the disparity with the minimum aggregated cost, and final refinement is done using left-right (LR) consistency checking along with a weighted median filter (WMF) to remove outliers.

The remainder of this article is structured as follows. Section 2 describes the enhancement architecture, Section 3 explains the algorithm's methodology, and the results are discussed in Section 4. Finally, Section 5 concludes the work.

2 Image Enhancement Model

A constraint in the generated color image is that it is often compromised by noise from bad illumination conditions and an undesirable environment [4]. Poor lighting, non-uniform light and sudden changes in light are factors that cause poor contrast and blurred image details, resulting in challenges in obtaining and analyzing the information from images [5]. Unlike the traditional histogram equalization methods used for image enhancement, the weighted adaptive histogram equalization technique in the CLAHE method focuses on small regions in the image, defined as tiles, instead of the entire image [6]. The parameters used in CLAHE set a threshold for contrast limiting, so-called clipping, which improves surface textures in the image without sacrificing edge textures.

The implementation of CLAHE is initialized by decomposition of the original image into several rectangular blocks [7]. The next stage is to implement histogram block fine tuning, involving histogram creation, clipping, and distribution, followed by obtaining the cumulative distribution function (CDF) and the probability density function (PDF) values. Bilinear interpolation between blocks is employed for equalization mapping to remove artefacts within the blocks and prevent their visibility in boundary regions. A series of segmentations and Rayleigh shape distributions creates 180 bins of bell-shaped histograms. A clipper point is then applied to limit histogram peak values in each histogram block to control the level of contrast. The clipper acts as an essential controller to prevent image saturation, especially in homogeneous regions. The clipped pixels are redistributed over the gray levels. The clipping point is calculated as follows [8]:

\[B = \left(\frac{M}{N} \left(1 + \frac{(\alpha(S_{max}))}{100}\right)\right) \tag{1}\]

The number of pixels in the block is represented by M and the dynamic range of the block is indicated by N. The clip factor and maximum slope are denoted as \(\alpha\) and \(S_{max}\). The clipper value is set to \(9 \times 10^{-3}\), which is the optimal value for acquiring acceptable contrast adjustment based on several experiments. The CDF and PDF produced from the clipped peak histogram are used in the mapping function to remap the gray level block as formulated in Eq. (2) and Eq. (3):

\[CDF(I) = \sum_{k=0}^{I} PDF(k)\] (2)

\[T_{map}(I) = CDF(I) \times I_{max}\] (3)

The remapping function is denoted as \(T_{map}(I)\), where I is the gray level of a pixel and \(I_{max}\) is the maximum number of block pixels. Because each block is remapped independently, blocking artefacts must be suppressed by interpolating the value of each pixel from the mapping functions of the adjacent blocks. The blocks are remapped based on bilinear interpolation in the mapping function as follows:

\[T_{map}(p(i)) = m \cdot \left(n \cdot T_a(p(i)) + (1 - n) \cdot T_b(p(i))\right) + (1 - m) \cdot \left(n \cdot T_c(p(i)) + (1 - n) \cdot T_d(p(i))\right)\] (4)

\[\text{[formula could not be rendered; see the original PDF]}\] (5)

where a, b, c, and d represent the center pixels of the four blocks and p denotes an arbitrary pixel around the blocks. \(T(\cdot)\) signifies the remapping function of each of the four blocks of pixels, \(T_a\), \(T_b\), \(T_c\) and \(T_d\), and p(i) is the value i of an arbitrary pixel at the x, y coordinates. The interpolation helps to remove blocking artefacts at a low computing cost.
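The CLAHE steps described above (clip limit, clipped-histogram remapping and bilinear blending of four block mappings) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the clipped excess is redistributed uniformly in a single pass, \(I_{max}\) is taken as the highest gray level of an 8-bit block, and the weights m and n are passed in directly because Eq. (5) is not available here.

```python
import numpy as np

def clip_limit(block_pixels, n_bins, alpha, s_max):
    """Eq. (1): B = (M / N) * (1 + alpha * S_max / 100)."""
    return (block_pixels / n_bins) * (1.0 + alpha * s_max / 100.0)

def tile_mapping(tile, n_bins=256, limit=None):
    """Remapping curve of one integer-valued block, following Eq. (2) and Eq. (3)."""
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(np.float64)
    if limit is not None:
        # Clip the histogram peaks and spread the excess evenly (single pass, assumed).
        excess = np.maximum(hist - limit, 0.0).sum()
        hist = np.minimum(hist, limit) + excess / n_bins
    pdf = hist / hist.sum()
    cdf = np.cumsum(pdf)                      # Eq. (2)
    return cdf * (n_bins - 1)                 # Eq. (3), with I_max assumed to be n_bins - 1

def blend(value, maps, m, n):
    """Eq. (4): bilinear blend of the mappings of blocks a, b, c and d."""
    t_a, t_b, t_c, t_d = (t[value] for t in maps)
    return m * (n * t_a + (1 - n) * t_b) + (1 - m) * (n * t_c + (1 - n) * t_d)
```

When all four neighboring blocks share the same mapping, `blend` returns that mapping's value regardless of m and n, which is the property that hides block boundaries in homogeneous regions.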

The issue of manual adjustment of the gamma value in image quality enhancement can be solved with the AGCWD approach [9]. The algorithm automatically obtains the values based on the implementation of a weighted distribution function. The gamma correction transform-based equation and the proposed AGCWD are derived in Eq. (6) and Eq. (7):

\[T_C(I) = I_{max} \left(\frac{I}{I_{max}}\right)^{\gamma} = I_{max} \left(\frac{I}{I_{max}}\right)^{1 - CDF(I)}\] (6)

The parameter for the weighted distribution is:

\[PDF_{w}(I) = PDF_{max} \left( \frac{PDF(I) - PDF_{min}}{PDF_{max} - PDF_{min}} \right)^{\alpha}\] (7)

where \(PDF_{min}\) and \(PDF_{max}\) are the minimum and maximum values of the PDF obtained from the histogram evaluation, while the parameter for weighted adjustment is denoted by \(\alpha\). In this work, \(\alpha\) was set to 0.5, which was found to be the optimal value when used with the CLAHE technique. The next stage is to determine the CDF value based on Eq. (8).

\[CDF_{w}(I) = \sum_{l=0}^{I} PDF_{w}(l) / \sum PDF_{w}\] (8)

The sum of \(PDF_w\) can be calculated as follows:

\[\sum PDF_w = \sum_{l=0}^{I_{max}} PDF_w(l)\] (9)

Finally, to increase the generality of the results, the gamma parameter based on the CDF is modified as follows:

\[\gamma = 1 - CDF_w(I) \tag{10}\]

This is a vital function for HE, which maps the input gray level to the output gray level through a level transformation function based on the CDF.
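Eq. (6) to Eq. (10) can be read as a lookup-table computation over the gray levels. The sketch below is an illustrative NumPy version for an 8-bit grayscale image; the function name `agcwd`, the choice \(I_{max} = 255\), and the handling of a perfectly flat histogram are assumptions made for the example, and the CDF is accumulated up to the current gray level.

```python
import numpy as np

def agcwd(gray, alpha=0.5):
    """Adaptive gamma correction with a weighted distribution on a uint8 image."""
    i_max = 255                                   # assumed 8-bit intensity range
    hist = np.bincount(gray.ravel(), minlength=i_max + 1).astype(np.float64)
    pdf = hist / hist.sum()
    pdf_min, pdf_max = pdf.min(), pdf.max()
    if pdf_max == pdf_min:
        pdf_w = pdf                               # flat histogram: nothing to reweight
    else:
        # Eq. (7): weighted distribution of the PDF.
        pdf_w = pdf_max * ((pdf - pdf_min) / (pdf_max - pdf_min)) ** alpha
    cdf_w = np.cumsum(pdf_w) / pdf_w.sum()        # Eq. (8) and Eq. (9)
    gamma = 1.0 - cdf_w                           # Eq. (10): one gamma per gray level
    levels = np.arange(i_max + 1) / i_max
    lut = i_max * levels ** gamma                 # Eq. (6)
    return np.round(lut).astype(np.uint8)[gray]
```

Because every gamma value lies in [0, 1], the lookup table never darkens a pixel; dark, sparsely populated gray levels receive the strongest lift.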

3 Methodology

The proposed stereo vision algorithm consists of five stages, as depicted in Figure 1. The process starts with the pre-processing step for color image condition improvement utilizing the combination of CLAHE and AGCWD with GF. CLAHE and AGCWD work on small regions rather than the entire image, thus improving the image enhancement. After that, matching cost computation is done by employing the CT method. The third stage produces the aggregated cost from the matching cost computation utilizing the fixed-window and GF methods. In disparity optimization, a WTA strategy is utilized to select the disparity map values. Left-right consistency checking with WMF is used in the final disparity refinement to eliminate the remaining noise and smooth the map.

The pre-processing stage with CLAHE and AGCWD to enhance original images from the Middlebury platform is shown in Figure 2. The images are provided with the parameters disparity level, resolution, dimension and camera calibration [10].


Figure 1 Taxonomy of the proposed algorithm.


Figure 2 The pre-processing stage based on CLAHE + GF and AGCWD + GF.

A GF smoothing filter is used to improve the quality of surfaces and edge areas. In the matching cost computation, the CT method, a local transform, compares the intensity of each neighboring pixel with that of the center pixel in the census window [11]. CT is applied to both images of the stereo pair using a fixed census window. The window size is defined as \((2m+1) \times (2n+1)\) and is fixed at \(9 \times 7\), so the transform can be determined as follows:

\[T(x,y) = \bigotimes_{i=-n}^{n} \bigotimes_{j=-m}^{m} \xi(I(x,y), I(x+i,y+j))\] (11)

where I(x,y) is the intensity of pixel (x,y). The operator \(\bigotimes\) describes bit-wise concatenation. The relationship function \(\xi\) is determined as follows:

\[\text{[formula could not be rendered; see the original PDF]}\] (12)

where \(q_1\) and \(q_2\) are the center pixel and a closely neighboring pixel. The difference between the census values of both transform vectors can be estimated by applying the Hamming distance function as expressed in Eq. (13):

\[C_d(x,y) = Hamm(T_l(x,y), T_r(x-d,y))\] (13)
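A census transform and its Hamming-distance matching cost, Eq. (11) and Eq. (13), might look like the following NumPy sketch. Since Eq. (12) is not reproduced in this text, the comparison rule used here (a bit is set when the neighbor is darker than the center) is an assumption, as are the function names, the edge padding, and the reading of the \(9 \times 7\) window as 9 columns by 7 rows.

```python
import numpy as np

def census_transform(img, half_h=3, half_w=4):
    """Boolean census descriptor per pixel, shape (H, W, bits); 7 x 9 window assumed."""
    img = img.astype(np.int32)
    h, w = img.shape
    padded = np.pad(img, ((half_h, half_h), (half_w, half_w)), mode="edge")
    bits = []
    for dy in range(-half_h, half_h + 1):
        for dx in range(-half_w, half_w + 1):
            if dy == 0 and dx == 0:
                continue                          # the center pixel is not compared with itself
            neighbour = padded[half_h + dy:half_h + dy + h,
                               half_w + dx:half_w + dx + w]
            bits.append(neighbour < img)          # assumed xi: 1 if neighbor < center
    return np.stack(bits, axis=-1)

def census_cost(census_left, census_right, d):
    """Eq. (13): Hamming distance between T_l(x, y) and T_r(x - d, y)."""
    h, w, n_bits = census_left.shape
    cost = np.full((h, w), n_bits, dtype=np.int32)   # maximum cost where x - d < 0
    if d < w:
        differ = census_left[:, d:] != census_right[:, :w - d]
        cost[:, d:] = np.count_nonzero(differ, axis=-1)
    return cost
```

Calling `census_cost` once per candidate disparity and stacking the results along a last axis gives the cost volume that the aggregation stage operates on.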

The disparity is denoted by d, and \(T_l\) and \(T_r\) represent the two transform vectors. The cost value of all candidates in the matching cost computation between the reference and the target image is aggregated in the cost aggregation stage. A square-fixed-window aggregation employing the box filter method is implemented [12]. The square-fixed-window size applied is \((2r + 1) \times (2r + 1)\) for r = 2, which sets the window size to \(5 \times 5\), so that the cost aggregation value is computed as in Eq. (14) and Eq. (15):

\[T^{r}(u, v, d) = \frac{1}{2r+1} \sum_{m=-r}^{r} C(u+m, v, d)\] (14)

\[A_{square}^{r}(u, v, d) = \frac{1}{2r+1} \sum_{n=-r}^{r} T^{r}(u, v + n, d)\] (15)
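The two-pass structure of Eq. (14) and Eq. (15) is a separable box filter: one averaging pass along u, then one along v on the intermediate result. A minimal NumPy sketch for a single disparity slice is given here; the function name and the edge padding at the image border are assumptions for illustration.

```python
import numpy as np

def box_aggregate(cost, r=2):
    """Separable (2r+1) x (2r+1) mean of one cost slice C(u, v) at a fixed disparity."""
    k = 2 * r + 1
    h, w = cost.shape
    padded = np.pad(cost.astype(np.float64), r, mode="edge")   # border handling assumed
    # Eq. (14): average along the first axis (index u); keeps the padded second axis.
    t = sum(padded[m:m + h, :] for m in range(k)) / k
    # Eq. (15): average the intermediate result T along the second axis (index v).
    return sum(t[:, n:n + w] for n in range(k)) / k
```

Splitting the window into two 1D passes costs 2(2r+1) additions per pixel instead of (2r+1)^2, which is why the intermediate matrix T is kept.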

T keeps the intermediate results of the cost matrix. A well-rectified stereo image pair shows the same height and width at the corresponding epipolar lines. The aggregated cost is smoothed with the GF to remove noise from the texture while preserving sharp edges at the boundaries. The usage of the WTA strategy is able to enhance the corresponding value as utilized in [13] by choosing a minimum aggregated pixel value, thus producing a more accurate disparity map. However, the WTA strategy still produces errors and noise in low-texture and occlusion areas. The WTA formula is given by Eq. (16):

\[d = \arg\min_{d \in d_r} CA(p, d) \tag{16}\]

where d is the disparity selected at the minimum aggregated cost value. The cost aggregation volume is represented by CA(p,d) and \(d_r\) indicates the range of acceptable disparity values.
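With the aggregated costs stored as a volume, Eq. (16) reduces to an arg-min along the disparity axis. The short sketch below assumes a hypothetical layout of shape (H, W, D), with one slice per candidate disparity.

```python
import numpy as np

def winner_takes_all(cost_volume):
    """Eq. (16): per pixel, the disparity index with the lowest aggregated cost."""
    return np.argmin(cost_volume, axis=-1)

# Usage: one row of two pixels with three candidate disparities each.
volume = np.array([[[3.0, 1.0, 2.0], [0.0, 5.0, 5.0]]])
disparity = winner_takes_all(volume)   # array([[1, 0]])
```

Ties are resolved in favor of the smallest disparity, which is the behavior of `np.argmin`.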

The final step of the algorithm consists of the disparity refinement stage. It involves a number of sequential processes: invalid disparity detection, occlusion handling, and filtering. Outlier locations and occlusions are detected using the left-right consistency check. The results of the left-right check are the outlier locations, which produce inconsistent disparity values between the two maps [14]. A strategy similar to that in [15] and [16] is employed for the location of disparity point p. The equation is formulated as follows:

\[\text{[formula could not be rendered; see the original PDF]}\] (17)

where \(d_{RL}\) signifies the right reference and \(d_{LR}\) is the left reference of the disparity map. The disparity map consists of the locations of outliers. The value of \(\tau_{LR}\) is fixed to zero to aim for minimum error in the disparity map [16]. Unnecessary remaining noise can be removed using WMF [17]. WMF attains an impressive accuracy level in noise removal. Corresponding to Eq. (17), the weight \(B(p, q)\) of the bilateral filter is adjusted to the total histogram \(h(p, d_r)\), contributing to Eq. (18):

\[h(p,d_r) = \sum_{q \in W_p \mid d(q) = d_r} B(p,q)\] (18)

where \(d_r\) indicates the disparity range, and \(W_p\) signifies the window of size \(r \times r\) centered at pixel p. The median of \(h(p, d_r)\) is the refined final disparity value \(d'\), expressed in Eq. (19):

\[d' = med\{h(p, d_r)\}\tag{19}\]
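The refinement stage can be illustrated with two small NumPy helpers. Eq. (17) is not reproduced in this text, so the consistency test below uses a commonly used form, \(|d_{LR}(x, y) - d_{RL}(x - d_{LR}(x, y), y)| \le \tau_{LR}\), which is an assumption rather than the authors' exact formula. The weighted median follows Eq. (18) and Eq. (19) for a single window; the bilateral weights \(B(p, q)\) are taken as an input, and both function names are hypothetical.

```python
import numpy as np

def lr_check(disp_left, disp_right, tau=0):
    """Boolean mask, True where the left disparity agrees with the right map."""
    disp_left = disp_left.astype(np.int64)
    disp_right = disp_right.astype(np.int64)
    h, w = disp_left.shape
    xs = np.arange(w)[None, :] - disp_left        # matching column in the right map
    inside = (xs >= 0) & (xs < w)
    rows = np.arange(h)[:, None]
    matched = disp_right[rows, np.clip(xs, 0, w - 1)]
    return inside & (np.abs(disp_left - matched) <= tau)

def weighted_median(disparities, weights, n_disp):
    """Eq. (18), Eq. (19): median of the weighted disparity histogram of one window."""
    hist = np.bincount(disparities.ravel(), weights=weights.ravel(), minlength=n_disp)
    cum = np.cumsum(hist)
    # First disparity at which the accumulated weight reaches half of the total.
    return int(np.searchsorted(cum, cum[-1] / 2.0))
```

Pixels where `lr_check` returns False are the outlier candidates; a refinement pass would then replace them, for example with `weighted_median` evaluated over the valid disparities of the surrounding window.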

4 Result and Discussion

The results in this work were obtained by executing the proposed algorithm. Experiments were implemented on a Windows 10 desktop PC with a 3.2 GHz processor and 8 GB RAM. Stereo images from the standard Middlebury dataset were used to evaluate the disparity accuracy level based on 15 sets of training images. The image enhancement performance was first compared through quality parameters such as entropy, MSE, PSNR and histogram distribution [18]. The disparity accuracy level was determined according to the number of bad pixels in non-occluded regions (nonocc) and all pixels (all). The performance result of several images in terms of entropy, MSE and PSNR is shown in Table 1.
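For reference, the three image quality measures named above have standard definitions that can be sketched in NumPy as follows. The source does not state the intensity scaling or normalization behind the reported values, so the 8-bit range and the peak value of 255 used here are assumptions, and the numbers produced by this sketch are not expected to match the table directly.

```python
import numpy as np

def entropy(gray):
    """Shannon entropy (bits per pixel) of a uint8 image histogram."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def mse(ref, test):
    """Mean squared error between a reference image and an enhanced image."""
    diff = ref.astype(np.float64) - test.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))
```

An 8-bit image can reach at most 8 bits of entropy, which gives a scale for reading entropy values in the 7 to 8 range.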

Table 1 Entropy, MSE and PSNR performance of several Middlebury images.

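| Image | Entropy CL | Entropy AG | Entropy CL + GF | Entropy AG + GF | MSE CL | MSE AG | MSE CL + GF | MSE AG + GF | PSNR CL | PSNR AG | PSNR CL + GF | PSNR AG + GF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adi | 7.2 | 7.7 | 7.2 | 7.7 | 0.25 | 1.97 | 0.25 | 1.96 | 24.0 | 15.1 | 24.0 | 15.2 |
| ArtL | 7.6 | 7.8 | 7.6 | 7.9 | 0.36 | 1.50 | 0.36 | 1.48 | 22.5 | 16.3 | 22.5 | 16.4 |
| Jade | 7.5 | 7.5 | 7.4 | 7.5 | 0.59 | 0.59 | 0.58 | 0.59 | 20.4 | 20.4 | 20.4 | 20.4 |
| Moto | 7.7 | 7.9 | 7.7 | 7.9 | 0.41 | 1.30 | 0.42 | 1.29 | 22.0 | 16.9 | 21.8 | 17.0 |
| Ted | 7.7 | 7.8 | 7.6 | 7.8 | 0.41 | 1.15 | 0.42 | 1.16 | 21.9 | 17.5 | 21.8 | 17.4 |
| Vint | 7.5 | 6.5 | 7.4 | 6.0 | 0.95 | 0.46 | 0.96 | 0.47 | 18.3 | 21.5 | 18.3 | 21.3 |
| Avg | 7.5 | 7.6 | 7.5 | 7.6 | 0.50 | 1.21 | 0.50 | 1.20 | 21.4 | 17.6 | 21.4 | 17.6 |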

CL denotes CLAHE while AG denotes AGCWD. The result revealed that the weight distribution and equalization offer good performance by providing an entropy value that is almost equal to that of the original images, i.e. AGCWD (101.3%) and AGCWD + GF (101.3%), and CLAHE (100%) and CLAHE + GF (100%), which is excellent for image enhancement. These findings demonstrate the importance of weight distribution adjustment to preserve the information in the image.

A related key finding is that CLAHE produced a significant improvement in PSNR performance compared with AGCWD, at 21.9% and 21.6% when combined with GF. CLAHE also contributed to a lower MSE value, at 41.3% and 41.8% with GF implementation.


Figure 3 Left image histogram distribution of Jadeplant, Motorcycle and Teddy.

The initial histogram quantitative analysis for three images (Jadeplant, Motorcycle and Teddy) shows the specific histogram distribution for each original image, as shown in Figure 3. The implementation of CLAHE and AGCWD contributed to a slight amount of plateau distribution compared to the original image histogram distribution. The guided filter applied in this work contributed to a much more uniform histogram distribution, and changed the intensity mean so that a better contrast was generated. The produced histogram distribution of CLAHE had lower intensity than the original histogram distribution due to HE equalizing the input image histogram, resulting in a loss of some intensity; however, in AGCWD, the technique preserved the intensity.

Figure 4 shows a visual comparison of the image contrast enhancement based on the Jadeplant, Motorcycle and Teddy images. CLAHE improved both image contrast and pixel values but produced halo artefacts. Halo artefacts come from over-stretching of the histogram, which is very apparent along firm edges. Additionally, tone distortion is not tolerable in contrast enhancement. AGCWD had quite good performance. It is resilient due to the use of a weighting distribution on the gamma correction, which preserves the image brightness closer to the original image than CLAHE does. The experiments showed that the guided filter was effective and efficient in detail enhancement to eliminate or reduce noise effects without blurring the image boundaries. The parameters utilized in the stereo vision algorithm were used to investigate the effectiveness of the pre-processing step with matching cost combination. A census cost computation strategy, i.e. census transform, was performed with the same parameter conditions as for the block diagram. Figure 5 displays the Jadeplant, Motorcycle and Teddy left images, ground truth and the disparity assessment produced by Middlebury evaluation. Object scenes placed at increasing depth are mapped step by step to disparity values in the disparity maps. The final value is indicated from closer to farther according to the color assignment. The result shows that the proposed method performed well around low texture regions and at the boundaries indicated by the red circles.


Figure 4 Enhanced left images of Jadeplant, Motorcycle and Teddy.


Figure 5 Middlebury evaluation of the disparity maps of Jadeplant, Motorcycle and Teddy.

Figure 6 shows a disparity map comparison for Motorcycle, Playtable and Vintage between two established algorithms, Simultaneous Edge Drawing Disparity (SED) [19] and Max-Tree Stereo (MTS) [20], and the proposed method. The disparity map indicates only minimum streak artefacts, which commonly occur in the local stereo method. The algorithm worked well at discontinuity and occlusion regions, as shown around the red box marking. It is trickier and more challenging to execute the algorithm on larger low-texture areas because of similar intensity levels among pixels.

Table 2 presents the bad pixel error percentage performance from the Middlebury evaluation. It indicates that the algorithm had better accuracy than SED and MTS (i.e. with a weight average difference for nonocc = 1.5 ~ 2.9, all = 2.0 ~ 5.2), especially for the images containing edge structures. The outcomes of the algorithm were among the lowest average error numbers, which indicates the competitiveness of the proposed algorithm's performance.


Figure 6 Disparity maps for Motorcycle, Playtable and Vintage images: (a) left image, (b) ground-truth, (c) SED, (d) MTS, (e) CLAHE + CT, (f) AGCWD + CT.

Table 2 Bad pixel errors in nonocclusion regions and all errors from the Middlebury evaluation.

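| Image | SED Nonocc % | SED All % | MTS Nonocc % | MTS All % | CLAHE + CT Nonocc % | CLAHE + CT All % | AGCWD + CT Nonocc % | AGCWD + CT All % |
|---|---|---|---|---|---|---|---|---|
| Adiron | 23.7 | 25.1 | 17.2 | 19.0 | 19.1 | 23.4 | 19.1 | 23.4 |
| ArtL | 15.6 | 17.1 | 17.3 | 22.5 | 9.78 | 27.4 | 10.8 | 27.7 |
| Jadepl | 106 | 123 | 107 | 123 | 42.6 | 64.5 | 48.3 | 69.6 |
| Motor | 18.3 | 20.6 | 14.8 | 17.5 | 3.09 | 16.8 | 14.5 | 18.1 |
| MotorE | 17.7 | 19.7 | 18.3 | 20.7 | 3.27 | 17.0 | 10.7 | 18.3 |
| Piano | 17.7 | 18.1 | 12.4 | 13.0 | 16.8 | 21.3 | 18.1 | 22.5 |
| PianoL | 29.7 | 29.5 | 32.0 | 32.0 | 30.0 | 33.4 | 31.0 | 34.4 |
| Pipes | 28.5 | 34.1 | 22.4 | 29.4 | 12.5 | 26.3 | 14.9 | 28.3 |
| Playrm | 21.3 | 22.8 | 21.9 | 26.9 | 24.2 | 39.5 | 24.3 | 39.6 |
| Playt | 18.2 | 18.8 | 25.9 | 27.4 | 31.6 | 37.3 | 33.2 | 38.7 |
| PlaytP | 15.9 | 16.5 | 10.3 | 12.0 | 13.7 | 20.3 | 15.1 | 21.5 |
| Recyc | 16.2 | 16.8 | 16.4 | 17.5 | 15.3 | 19.2 | 18.9 | 22.4 |
| Shelvs | 14.4 | 15.1 | 11.47 | 12.1 | 25.3 | 27.2 | 25.6 | 27.4 |
| Teddy | 6.65 | 7.26 | 6.96 | 8.11 | 7.68 | 16.5 | 9.89 | 18.6 |
| Vintge | 31.6 | 33.8 | 25.2 | 27.2 | 70.3 | 73.8 | 72.2 | 75.4 |
| Avg | 25.4 | 27.9 | 24.0 | 27.2 | 22.5 | 30.9 | 24.2 | 32.4 |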

5 Conclusion

This paper presented several pre-processing methods employed in a stereo vision algorithm by integrating CLAHE and AGCWD with GF for image enhancement. These methods exhibited an impressive improvement in image enhancement and disparity map accuracy. The matching cost computation was robust against illumination variation by identifying regions with intensity differences. The smoothing GF included in the algorithm increased the efficiency and preserved object edges. The disparity map results showed that the algorithm's accuracy was improved significantly compared with several other established stereo vision algorithms.

Acknowledgement

This work was supported by Universiti Teknikal Malaysia Melaka (UTeM) and the Centre for Research and Innovation Management (CRIM) with research grant no. PJP/2020/FTKEE/TD/S01726.


References

  1. Cao, Y.S., Liu, J.G., Wen, T.X. & Bi, X., Improvement of Stereo Matching Algorithm Based On Guided Filtering and Kernel Regression, J. Phys. Conf. Ser., 1213(3), pp. 1-6, 2019. DOI: 10.1088/1742-6596/1213/3/032014
  2. Bebeselea-Sterp, E., Brad, R.R. & Brad, R.R., A Comparative Study of Stereovision Algorithms, Int. J. Adv. Comput. Sci. Appl., 8(11), pp. 359-375, 2017. DOI: 10.14569/ijacsa.2017.081144
  3. Scharstein, D. & Szeliski, R., A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms, Int. J. Comput. Vis., 47(1-3), pp. 7-42, 2002. DOI: 10.1109/smbv.2001.988771
  4. Jo, H.W. & Moon, B., A Modified Census Transform Using the Representative Intensity Values, ISOCC 2015 – Int. SoC Des. Conf. SoC Internet Everything, pp. 309-310, 2016.
  5. Hamzah, R.A., Hamid, M.S., Kadmin, A.F. & Abd Gani, S.F., Improvement of Stereo Corresponding Algorithm Based on Sum of Absolute Differences and Edge Preserving Filter, IEEE International Conference on Signal and Image Processing Applications (ICSIPA), pp. 222-225, 2017. DOI: 10.1109/icsipa.2017.8120610
  6. Tang, J.R. & Mat Isa, N.A., Bi-Histogram Equalization Using Modified Histogram Bins, Applied Soft Computing, 55(June), pp. 31-43, 2017. DOI: 10.1016/j.asoc.2017.01.053
  7. Veluchamy, M. & Subramani, B., Image Contrast and Color Enhancement Using Adaptive Gamma Correction and Histogram Equalization, Optik (Stuttg)., 183(February), pp. 329-337, 2019. DOI: 10.1016/j.ijleo.2019.02.054
  8. Chang, Y., Jung, C., Ke, P., Song, H. & Hwang, J., Automatic Contrast-limited Adaptive Histogram Equalization with Dual Gamma Correction, IEEE Access, 6, pp. 11782-11792, 2018. DOI: 10.1109/access.2018.2797872
  9. Huang, S.C., Cheng, F.C. & Chiu, Y.S., Efficient Contrast Enhancement Using Adaptive Gamma Correction with Weighting Distribution, IEEE Trans. Image Process., 22(3), pp. 1032-1041, 2013.
  10. Scharstein, D., Hirschmuller, H., Kitajima, Y., Krathwohl, G., Nesic, N., Wang, X. & Westling, P., High-Resolution Stereo Datasets with Subpixel-Accurate Ground Truth, German Conference on Pattern Recognition, pp. 31-42, 2014. DOI: 10.1007/978-3-319-11752-2_3
  11. Lim, J., Kim, Y. & Lee, S., A Census Transform-based Robust Stereo Matching Under Radiometric Changes, Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), pp. 1-4, 2017.
  12. Hamzah, R.A., Wei, M.G.Y. & Anwar, N.S.N., Stereo Matching Based on Absolute Differences for Multiple Objects Detection, Telkomnika (Telecommunication Comput. Electron. Control.), 17(1), pp. 261-267, 2019. DOI: 10.12928/telkomnika.v17i1.9185
  13. Cigla, C. & Alatan, A.A., Information Permeability for Stereo Matching, Signal Process. Image Commun., 28(9), pp. 1072-1088, 2013.
  14. Zhange, J.Y. & Piao, Y., Research on Stereo Matching Algorithm Based on Improved Steady-State Matching Probability, J. Phys.: Conf. Ser., 1004, 012009, 2018. DOI: 10.1088/1742-6596/1004/1/012009
  15. Mattoccia, S., Tombari, F. & Di Stefano, L., Stereo Vision Enabling Precise Border Localization within a Scanline Optimization Framework, Asian Conference on Computer Vision, pp. 517-527, 2007. DOI: 10.1007/978-3-540-76390-1_51
  16. Kordelas, G.A., Alexiadis, D.S., Daras, P. & Izquierdo, E., Enhanced Disparity Estimation in Stereo Images, Image Vis. Comput., 35, pp. 31-49, 2015. DOI: 10.1016/j.imavis.2014.12.001
  17. Wu, W., Li, L. & Jin, W., Disparity Refinement Based on Segment-Tree and Fast Weighted Median Filter, IEEE International Conference on Image Processing (ICIP), pp. 3449-3453, 2016. DOI: 10.1109/icip.2016.7533000
  18. Kaur, R., Chawla, M., Khiva, N.K. & Ansari M.D., Comparative Analysis of Contrast Enhancement Techniques for Medical Images, Pertanika J. Sci. Technol., 26(3), pp. 965-978, 2018.