
Research Article Journal of the Optical Society of America A 1

Correction of geometrically distorted underwater images using shift map analysis

KALYAN KUMAR HALDER1,*, MANORANJAN PAUL2, MURAT TAHTALI1, SREENATHA G. ANAVATTI1, AND MANZUR MURSHED3

1School of Engineering and Information Technology, The University of New South Wales, Canberra, ACT 2600, Australia
2School of Computing and Mathematics, Charles Sturt University, Bathurst, NSW 2795, Australia
3School of Information Technology, Federation University, VIC 3842, Australia
*Corresponding author: [email protected]

Compiled December 11, 2016

In underwater imaging, the water waves cause severe geometric distortions and blurring of the acquired short-exposure images. Corrections for these distortions have been tackled reasonably well by previous efforts, but still need improvement in the estimation of pixel shift maps to increase the restoration accuracy. This paper presents a new algorithm that efficiently estimates the shift maps from the geometrically distorted video sequences and uses those maps to restore the sequences. A non-rigid image registration method is employed to estimate the shift maps of the distorted frames against a reference frame. The sharpest frame of the sequence, determined using a sharpness metric, is chosen as the reference frame. A k-means clustering technique is employed to discard too-blurry frames that could result in inaccuracy in the shift maps' estimation. The estimated pixel shift maps are processed to generate the accurate shift map that is used to dewarp the input frames into their non-distorted forms. The proposed method is applied on several synthetic and real-world video sequences, and the obtained results exhibit significant improvements over the state-of-the-art methods. © 2016 Optical Society of America

OCIS codes: (010.7295) Visibility and imaging; (100.3020) Image reconstruction-restoration; (110.3000) Image quality assessment.

http://dx.doi.org/10.1364/ao.XX.XXXXXX

1. INTRODUCTION

When imaging through an air-water surface, i.e., the camera is in the air above the water surface and the object of interest is immersed, the captured images suffer from random geometric distortions and non-uniform blurring [1–6]. These effects come from attenuation due to scattering and absorption throughout the imaging path [7–11]. Scattering is due to dissolved organic matter or small suspended biological organisms, which causes a change in the direction of the light beam. In addition, the absorption process causes a reduction in the beam energy. Even if the water is clear, i.e., free of small observable suspended particles, it is still inhomogeneous due to random temperature and salinity fluctuations [12, 13]. Furthermore, light rays bend when they travel from air to water or vice versa; the amount of bending is determined by the refractive indices of these media. This significantly contributes to the geometric distortions of the acquired images [2]. As a consequence, underwater imaging poses significant challenges at extended ranges when compared to similar problems in the air.

Researchers from different fields, including computer vision, computer graphics, and optical engineering, have attempted to address the general problems of recovering images distorted by water waves. In [14], an algorithm to solve the image recovery problem was proposed by modeling a three-dimensional (3D) structure of the water waves. The 3D shape of the water surface is reconstructed using the optical flow fields of the distorted frames. The algorithm assumes that the distance between the object and the water surface is known, and that the surface waviness is mild enough that the short-exposure images possess similar features. An adaptive optics (AO) technology was proposed in [13] as a possible solution to the underwater imaging problem. Although AO systems, which are comparatively popular in astronomical imaging, provide real-time corrections of images, the systems are very complex and the cost of full correction in real time is very high. Taking these factors into account, a simplified model was proposed and a low-cost, low-order AO system was developed in [13] that could be controlled by a standard personal computer. The complete system is developed based on the assumption that the degradations of the underwater images are predominantly low-frequency, low-order aberrations and image motion.

A simple algorithm was proposed in [15], where it is assumed that each point on an object viewed through the wavy water surface oscillates around its true position, i.e., its position when the water surface is in a steady-state condition. Also, the average of the displacements of each point adds up to zero as time tends to infinity. Based on this assumption, the algorithm restores the image of a submerged object by taking the time-average of a large number of individual frames captured over a certain period of time. This approach can provide a reasonable result when the waviness of the water is low, but with higher-energy waves it fails to restore all features of the image, providing a blurry output. Based on a similar assumption, a graph-embedding method was proposed in [16] to reconstruct through-the-water images of submerged objects from a sequence. A shortest-path algorithm, assuming the distribution of surface normals of the wavy water to be Gaussian, is employed to select the image having the shortest overall path to all the remaining images. A similar approach was presented in [17, 18], where the entire image is considered as a group of fixed small image patches. Unlike [16], the method compensates for the leakage problem, caused by false shortest distances among similar but motion-blurred patches, by analyzing the motion-blur and the refraction effects separately. The authors in [19] emphasized first estimating the shape of the distorted water surface and then reconstructing the scenes. Their algorithm assumes that the water surface follows linear wave dynamics. The radial surface slope, which is used to estimate the water surface, is deduced along extinction boundaries, considering the transmission region and internal reflection region separately, and it is recursively improved by minimizing the quadratic measurement error. The slopes of the estimated surface are then employed to reconstruct the target image by inverse ray tracing.

A lucky region technique was proposed in [20] for reconstruction of underwater images. The algorithm divides the whole image frame into an ensemble of smaller image patches, similar to [16–18], but in this case, instead of the whole image, only the image patches of good quality are selected. The good image patches are then processed further to produce a high-quality image. The algorithm was tested for both through-the-water imaging and water-surface-reflection imaging. In [6], an underwater image reconstruction approach was presented based on motion compensation, lucky patch selection, and denoising. The motion compensation is performed by an image registration technique, whereas the lucky patches are selected by calculating the nonlinear gain coefficient for each pixel of the prototype-current frame pair by means of an image quality metric. The final reconstructed frame is obtained by fusing the lucky patches. Tian and Narasimhan [3, 4] introduced an underwater image reconstruction approach using model-based tracking. In their approach, a simulator of fluctuations of the water surface was first built using the forward Euler method, and then the original images were recovered using the estimated water surface. In [5], Oreifej et al. proposed a two-stage image reconstruction approach where, in the first stage, an iterative non-rigid image registration is employed to register all the input frames, after being pre-processed, against their temporal mean, which in turn is used to produce a better-structured sequence. The second stage involves extracting the unstructured sparse errors from the sequence through rank minimization. In [21], Oreifej et al. applied the image reconstruction approach to background estimation in cases where the videos contain both turbulence effects and moving objects. Although this method works well for turbulence mitigation, it is very computationally expensive.

In this paper, a new image restoration approach is proposed to recover a non-distorted image from a distorted video sequence based on a motion compensation technique. This approach introduces a new way of choosing the reference frame for motion estimation by employing a blind image quality (BIQ) metric, and it discards the severely degraded frames from further processing by means of a k-means algorithm to avoid inaccuracies in the output. An efficient image registration technique is employed to accurately estimate the motion vector maps, in other words, the pixel shift maps of the distorted frames against the reference frame. The centroid of the estimated shift maps is then calculated and processed further to generate correct shift maps to obtain a high-quality frame. The results presented are of both theoretical and practical interest and offer a new efficient tool for substantial improvement of the infrastructure of machine vision-based systems in general and of intelligent surveillance systems in particular.

The rest of this paper is organized as follows. Section 2 presents the details of the proposed image restoration method. Section 3 provides the experimental results on several synthetic and real-world datasets and compares them against those from the state-of-the-art methods [4, 21]. Finally, Section 4 contains comments and conclusions on the results.

2. PROPOSED IMAGE RESTORATION METHOD

The proposed geometric distortion correction method is illustrated as a block diagram in Fig. 1. In the first step of the method, an image sharpness assessment metric is used to calculate the sharpness values of the input frames, which are used to select the reference frame as well as a subset of fairly sharp frames from the input frames using k-means clustering. Next, a pixel registration technique is employed to calculate the pixel shift maps of the selected frames against the reference frame. After that, the shift maps are used to calculate the correct shift maps for the input frames through two-dimensional (2D) interpolations. Finally, those shift maps are utilized to restore the distorted frames back to their true forms. The complete approach is discussed in detail below.

A. Assumptions

The proposed method is based on the Cox-Munk law [22], which states that, when the water surface is sufficiently large and the waves are stationary, the normals of the water surface, and therefore the displacements of the points on the object, are approximately Gaussian distributed, centered around their true locations. Considering this law, it can be assumed that each pixel of an image, over a certain period of time, wanders around its true position, i.e., the mean wander of the pixel would be zero. From a different viewpoint, if the pixel's wander is measured with respect to any fixed position, the mean of its wander will represent its shift from the true position. This shift can then be used to restore the pixel to its true position. The goal of the proposed method is to accurately estimate the center of each pixel's wander; hence, it is called a centroid-based method.

B. Sharpness assessment

Based on the above-mentioned assumptions, if a whole frame is considered instead of a single pixel, a reference frame is needed against which to calculate the pixel shift maps of all the input frames. The simplest way could be calculating the mean of the warped frames and using it as the reference


Fig. 1. Simple block diagram of the proposed image restoration method (input video → sharpness assessment → k-means clustering → selected frames → pixel registration → pixel shift maps → shift maps processing → corrected shift maps → frame dewarping → dewarped frames → output video; the process is iterative, with the reference frame updated on each pass).

frame. Although a mean frame is somewhat geometrically stable, it contains motion-blur and thus cannot provide a high-quality output [23, 24]. In this method, one of the warped frames from the whole video sequence is considered as the reference frame. There are three options to choose a reference frame in this approach: (1) the first frame, (2) any random frame, and (3) the sharpest frame. Although the first two options do not involve much processing time for the reference frame selection, the output video may not be of high quality if the chosen frame suffers from blur or noise [2]. Therefore, in this study, a BIQ metric is employed that calculates the sharpness of all input frames and uses these values to choose the reference frame.

Assuming P as the one-dimensional pseudo-Wigner distribution, the Rényi entropy R can be calculated as [25]

\[
R(n) = -\frac{1}{2}\log_2\left(\sum_{\kappa=1}^{N} \breve{P}_n^{3}(\kappa)\right), \tag{1}
\]

where n and κ are the discrete time and frequency variables, respectively, N is the number of pixels, and \(\breve{P}(\kappa) = P(\kappa)\cdot P^{*}(\kappa)\). Here \(P^{*}(\kappa)\) indicates the complex conjugate of \(P(\kappa)\).

Using Eq. (1), the expected entropy for an image can be computed as [25]

\[
\bar{R}(t, \theta_s) = \sum_{n} R(n, \theta_s)/M, \tag{2}
\]

where \(\bar{R}(t, \theta_s)\) is the expected value of entropy for image \(t \in [1, 2, \cdots, M]\), measured in directions \(\theta_s \in [\theta_1, \theta_2, \cdots, \theta_S]\). Here M and S represent the image size and the number of directions, respectively. The metric employed to select the sharpest frame is [25]

\[
Q = \sqrt{\sum_{s=1}^{S}\left(\mu_t - \bar{R}(t, \theta_s)\right)^2 / S}, \tag{3}
\]

where \(\mu_t\) is the mean of the values \(\bar{R}(t, \theta_s)\). The frame with the highest value of Q is deemed to be the sharpest frame.
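To make the selection concrete, here is a minimal Python sketch of Eq. (3). The function name is hypothetical, and it assumes the per-direction expected entropies \(\bar{R}(t,\theta_s)\) of Eqs. (1)-(2) have already been computed by a pseudo-Wigner analysis [25], which is outside the scope of this sketch:

```python
import numpy as np

def sharpness_q(rbar):
    """Eq. (3): sharpness Q of one frame, given its expected
    Renyi entropies rbar[s] measured along S directions.

    rbar is a 1-D array of Rbar(t, theta_s) values; computing
    them (via the pseudo-Wigner distribution) is assumed done
    elsewhere."""
    rbar = np.asarray(rbar, dtype=float)
    mu_t = rbar.mean()                           # mu_t: mean over directions
    return np.sqrt(np.mean((mu_t - rbar) ** 2))  # root-mean-square deviation
```

The frame whose entropy values deviate most from their directional mean (highest Q) would then be chosen as the reference frame.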

C. K-means clustering

Since the distortion condition changes non-uniformly and rapidly, there may be some frames in the video sequence that have poor sharpness, and use of those frames to calculate pixel shift maps may add additional errors in the restoration process.

K-means clustering is an efficient tool to select a set of sub-frames due to its high convergence rate and its invariance to data ordering [27–29]. Therefore, a k-means algorithm is applied on the sharpness values of the input frames, calculated using Eq. (3), to create two clusters of frames: poor sharpness and fair sharpness. The frames belonging to the fair sharpness cluster are passed to the next pixel registration process, whereas the frames belonging to the poor sharpness cluster are discarded.
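A minimal 1-D two-cluster Lloyd's iteration over the Q values might look like the following (an illustrative sketch with a hypothetical function name; the paper does not specify its k-means initialization or variant):

```python
def split_by_sharpness(q_values, iters=50):
    """2-means (Lloyd's algorithm) on scalar sharpness values:
    split frames into 'poor' and 'fair' sharpness clusters and
    return the indices of the fair (higher-sharpness) cluster."""
    lo, hi = min(q_values), max(q_values)   # initial centroids
    fair = list(range(len(q_values)))
    for _ in range(iters):
        # assign each frame to the nearer centroid
        fair = [i for i, q in enumerate(q_values) if abs(q - hi) <= abs(q - lo)]
        poor = [i for i in range(len(q_values)) if i not in fair]
        if not poor:                        # degenerate: everything is 'fair'
            break
        new_lo = sum(q_values[i] for i in poor) / len(poor)
        new_hi = sum(q_values[i] for i in fair) / len(fair)
        if (new_lo, new_hi) == (lo, hi):    # converged
            break
        lo, hi = new_lo, new_hi
    return fair
```

Frames at the returned indices would proceed to pixel registration; the rest are discarded.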

D. Pixel registration

A correlation coefficient-based backward mapping image registration technique [26] is employed to estimate the pixel shift maps (sx and sy for the x- and y-directions, respectively) of the selected input frames with respect to the reference frame. For the purpose of restoring geometric deformations, it is advantageous to have the shift maps in the form of backward mapping rather than forward mapping. The latter tells where a pixel went in the next frame, and can lead to gaps if two or more pixels from the source frame collapse into one pixel in the target frame, whereas the former tells where a pixel came from in the previous frame. Backward mapping is easily implementable using a 2D interpolation without any noticeable artifacts [30].

E. Shift maps processing

After estimating the shift maps for all selected frames through the backward mapping, their centroid is calculated as

\[
p_x(x,y) = \frac{\sum_{k=1}^{K} s_x(x,y,k)}{K}, \qquad
p_y(x,y) = \frac{\sum_{k=1}^{K} s_y(x,y,k)}{K}, \tag{4}
\]

where \(p_x\) and \(p_y\) are the centroid shift maps, k is the frame index, and K is the total number of selected frames.

Due to the use of backward mapping, the centroid shift maps must be inverted before they are warped with the individual shift maps in order to generate corrected shift maps, which transform each individual frame into its dewarped self. However, simply calculating the matrix inverse or changing the sign of the centroid shift maps may not provide the desired inversion. A possible solution to this inversion is described in [31], where the inverse of a shift map is approximated by warping that shift map with its negative. Therefore, the inverses of \(p_x\) and \(p_y\) are approximated as

\[
\begin{aligned}
p_x^{-1}(x,y) &\approx -p_x\big(x - p_x(x,y),\, y - p_y(x,y)\big),\\
p_y^{-1}(x,y) &\approx -p_y\big(x - p_x(x,y),\, y - p_y(x,y)\big),
\end{aligned} \tag{5}
\]

where \(p_x^{-1}\) and \(p_y^{-1}\) represent the inverses of \(p_x\) and \(p_y\), respectively. According to the initial assumption, this shift map represents the actual shift of the reference frame from its ground truth; therefore, it can be used to warp the frame back into its unwarped position through a 2D interpolation technique. For all other frames, the actual shift maps are calculated using the inverse centroid shift map and the individual shift maps as

\[
\begin{aligned}
s_x^{*}(x,y,k) &= p_x^{-1}(x,y) + s_x\big(x + p_x^{-1}(x,y),\, y + p_y^{-1}(x,y),\, k\big),\\
s_y^{*}(x,y,k) &= p_y^{-1}(x,y) + s_y\big(x + p_x^{-1}(x,y),\, y + p_y^{-1}(x,y),\, k\big),
\end{aligned} \tag{6}
\]

where \(s_x^{*}\) and \(s_y^{*}\) are the corrected shift maps for all the selected frames.
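The steps of Eqs. (4)-(6) can be sketched in Python as follows. This is an illustration only: nearest-neighbour sampling with edge clamping stands in for the paper's 2-D interpolation, and the function names are hypothetical:

```python
import numpy as np

def sample(field, dx, dy):
    """Evaluate field(x + dx, y + dy) with nearest-neighbour
    rounding and edge clamping -- a stand-in for the paper's
    2-D interpolation."""
    h, w = field.shape
    yy, xx = np.mgrid[0:h, 0:w]
    xs = np.clip(np.rint(xx + dx).astype(int), 0, w - 1)
    ys = np.clip(np.rint(yy + dy).astype(int), 0, h - 1)
    return field[ys, xs]

def corrected_shift_maps(sx, sy):
    """sx, sy: (K, H, W) backward shift maps from registration.
    Returns the corrected maps s*_x, s*_y of Eq. (6)."""
    px, py = sx.mean(axis=0), sy.mean(axis=0)  # Eq. (4): centroid maps
    inv_px = -sample(px, -px, -py)             # Eq. (5): approximate
    inv_py = -sample(py, -px, -py)             # inversion of the centroid
    s_x = np.stack([inv_px + sample(sx[k], inv_px, inv_py)
                    for k in range(sx.shape[0])])
    s_y = np.stack([inv_py + sample(sy[k], inv_px, inv_py)
                    for k in range(sy.shape[0])])
    return s_x, s_y                            # Eq. (6): corrected maps
```

Note that if every frame carries the same uniform shift, the corrected maps come out zero, consistent with the centroid being the estimated true position.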

F. Frame dewarping

Using the corrected shift maps in Eq. (6), the input frames can be dewarped as

\[
I^{*}(x,y,k) = I\big(x + s_x^{*}(x,y,k),\, y + s_y^{*}(x,y,k),\, k\big), \tag{7}
\]

where \(I^{*}\) is the dewarped version of the warped frame I.

The dewarped sequence, after the first iteration, is more stabilized than the original one. However, a few more iterations may provide higher accuracy and better stabilization of the output video. In that case, the dewarped version of the reference frame after an iteration is used as the reference frame in the next iteration.
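As an illustration, Eq. (7) might be implemented with a bilinear warp. This is a sketch in Python; the paper does not specify its exact interpolation scheme, so bilinear sampling with edge clamping is assumed here:

```python
import numpy as np

def dewarp(frame, sx, sy):
    """Eq. (7): I*(x, y) = I(x + s*_x, y + s*_y), sampled with
    bilinear interpolation and edge clamping."""
    h, w = frame.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    xs = np.clip(xx + sx, 0, w - 1)          # source x-coordinates
    ys = np.clip(yy + sy, 0, h - 1)          # source y-coordinates
    x0, y0 = np.floor(xs).astype(int), np.floor(ys).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = xs - x0, ys - y0                # fractional offsets
    top = (1 - fx) * frame[y0, x0] + fx * frame[y0, x1]
    bot = (1 - fx) * frame[y1, x0] + fx * frame[y1, x1]
    return (1 - fy) * top + fy * bot
```

With zero shift maps this is the identity; a uniform unit shift in x pulls each pixel from its right-hand neighbour, as backward mapping requires.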

3. SIMULATION EXPERIMENTS

The proposed method was implemented in MATLAB and tested on an Intel Core i7-2600 CPU 3.40 GHz machine with 8 GB RAM. Experimental analyses were carried out to verify its performance against two state-of-the-art methods, namely the Tian method [4] and the Oreifej method [21]. As discussed earlier, the Tian method is quite different from the Oreifej method and the proposed method: it recovers the distorted frames by estimating the water surface fluctuations based on a distortion model. The proposed method and the Oreifej method are somewhat similar, in that an image registration is employed to calculate the pixel shift maps that are later used to dewarp the frames. The main difference between these two methods lies in the way of calculating the pixel shift maps. The Oreifej method uses the mean of the input frames as the reference frame, blurs the input frames with the same blur present in the reference frame, and registers them against the reference frame to get the pixel shift maps. On the other hand, the proposed method uses the sharpest warped frame as the reference frame, registers the input frames directly against it, and uses the mean of the individual pixel shift maps to calculate the accurate ones. In short, both methods rely on averaging: the Oreifej method averages the frames, whereas the proposed method averages the pixel shift maps, avoiding the blurring operation and thus providing a more accurate estimation of the pixel shift maps.

All three methods were first applied on four synthetically distorted underwater video sequences, namely the Brick-wall, Small-fonts, Medium-fonts, and Large-fonts sequences. Each sequence consists of 60 frames, and underwater distortion of different strengths was added to them using a spatial distortion model developed based on Snell's law of refraction and the wave equation [3]. Later on, the methods were applied on a real degraded video sequence, namely the Water-colgate sequence [20]. This sequence contains 60 frames that were captured in the laboratory by a video camera fixed above the water surface. A still object was laid on the planar ground under clean water. Waves were generated by wind produced by a fan, while additional waves were created by dropping an object into the water. No other attenuation from a turbid medium was present during the video recording; the distortions were due only to the wavy water surface. This process generated distortions with a maximum amplitude of 15 pixels, an average amplitude of 3.2 pixels, and a standard deviation of 2.8 pixels. The test sequences contain both text and textures in order to demonstrate the underwater distortion effects clearly. Along with visual comparison, various image quality metrics are used to quantitatively evaluate the performance of the methods.

A. Quality metrics

Several relevant quality metrics are used to analyze the performance of restoration by the proposed method. These metrics require the existence of a distortion-free copy of an image, called the ground-truth image, which can be used for comparison with the image whose quality is to be measured.

A.1. Mean square error (MSE)

The MSE is a frequently used measure of how close the restored frame is to the ground-truth frame. It can be defined as

\[
\mathrm{MSE} = \frac{1}{m \times n} \sum_{x=1}^{m} \sum_{y=1}^{n} \big(R(x,y) - G(x,y)\big)^2, \tag{8}
\]

where R and G are any restored frame and the ground-truth frame, respectively, and m × n is the size of each frame.

A.2. Peak point spread function (PPSF)

The PSF describes, as the name implies, the response of an imaging system to a point source or point object. It is basically the image of a bright spot. The sharper and narrower the PSF of an imaging system, the better it is at resolving finer detail. In order to use the PSF as a quality measure for the restored frames, it is calculated as

\[
\mathrm{PSF}(x,y) = \mathcal{F}^{-1}\left(\frac{\mathcal{F}(R(x,y))}{\mathcal{F}(G(x,y))}\right), \tag{9}
\]

where \(\mathcal{F}\) denotes the Fourier transform and \(\mathcal{F}^{-1}\) its inverse. The PPSF, therefore, can be calculated as

\[
\mathrm{PPSF} = \mathrm{MAX}\left(\mathrm{PSF}(x,y)\right), \tag{10}
\]

where MAX(·) takes the maximum value.
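A direct Python transcription of Eqs. (9)-(10) could look like this. It is a sketch: the function name is hypothetical, and the small `eps` regularizer is our addition to avoid dividing by near-zero spectral coefficients, not part of the paper's definition:

```python
import numpy as np

def ppsf(restored, truth, eps=1e-8):
    """Eqs. (9)-(10): peak of the PSF relating the restored frame
    to the ground truth, via a Fourier-domain deconvolution.
    eps regularizes the spectral division (our addition)."""
    ratio = np.fft.fft2(restored) / (np.fft.fft2(truth) + eps)
    psf = np.fft.ifft2(ratio)          # Eq. (9)
    return np.abs(psf).max()           # Eq. (10)
```

For a perfectly restored frame the spectral ratio is (essentially) one everywhere, so the PSF collapses to a unit impulse and the PPSF approaches 1.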

A.3. Peak signal-to-noise ratio (PSNR)

The term PSNR is an expression for the ratio between the maximum possible value (power) of a signal and the power of the distorting noise that affects the quality of its representation. The mathematical expression for the PSNR is as follows:

\[
\mathrm{PSNR} = 10 \log_{10}\left(\frac{(\mathrm{MAX}(G))^2}{\mathrm{MSE}}\right). \tag{11}
\]
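In Python, Eqs. (8) and (11) translate directly (a sketch with hypothetical function names; MAX(G) is taken as the ground-truth frame's actual maximum, matching Eq. (11)):

```python
import numpy as np

def mse(restored, truth):
    """Eq. (8): mean squared error over an m x n frame."""
    r = np.asarray(restored, dtype=float)
    g = np.asarray(truth, dtype=float)
    return np.mean((r - g) ** 2)

def psnr(restored, truth):
    """Eq. (11): PSNR in dB against the ground-truth frame G."""
    g = np.asarray(truth, dtype=float)
    return 10.0 * np.log10(g.max() ** 2 / mse(restored, truth))
```

For example, a restored frame uniformly off by 0.1 from a unit-valued ground truth gives MSE = 0.01 and PSNR = 20 dB.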


Fig. 2. Image restoration results: (a) ground-truth frames, (b) sample distorted frames, (c) restored frames using the Tian method, (d) restored frames using the Oreifej method, and (e) restored frames using the proposed method. Simulation tests were carried out on the synthetically generated Brick-wall, Small-fonts, Medium-fonts and Large-fonts sequences [3].

A.4. Structural similarity (SSIM)

The SSIM is an image quality metric that estimates the visual impact of changes in image luminance, contrast, and structure. It can be expressed as [32]

\[
\mathrm{SSIM} = \frac{(2\mu_R \mu_G + C_1)(2\sigma_{RG} + C_2)}{(\mu_R^2 + \mu_G^2 + C_1)(\sigma_R^2 + \sigma_G^2 + C_2)}, \tag{12}
\]

where \(\mu_R\), \(\mu_G\), \(\sigma_R\), \(\sigma_G\), and \(\sigma_{RG}\) are the local means, standard deviations, and cross-covariance for the restored frame R and the ground-truth frame G. \(C_1\) and \(C_2\) are constants defined as

\[
C_1 = (0.01 \times L)^2, \qquad C_2 = (0.03 \times L)^2,
\]

where L is the specified dynamic range value.
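A whole-frame Python sketch of Eq. (12) follows. Note the simplification: the paper (and the standard SSIM of [32]) uses local windowed statistics, whereas this illustration uses global frame statistics; the function name is hypothetical:

```python
import numpy as np

def ssim_global(restored, truth, L=1.0):
    """Eq. (12) evaluated with global (whole-frame) statistics --
    a simplification of the usual local-window SSIM."""
    R = np.asarray(restored, dtype=float)
    G = np.asarray(truth, dtype=float)
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mu_r, mu_g = R.mean(), G.mean()
    var_r, var_g = R.var(), G.var()
    cov = ((R - mu_r) * (G - mu_g)).mean()     # cross-covariance
    return ((2 * mu_r * mu_g + c1) * (2 * cov + c2)) / \
           ((mu_r ** 2 + mu_g ** 2 + c1) * (var_r + var_g + c2))
```

Identical frames score exactly 1, and any mismatch in mean, variance, or covariance pulls the score below 1.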


Table 1. Comparison of quality metrics between the Tian, Oreifej and proposed methods for the frames in Fig. 2.

Quality metric             Method            Brick-wall   Small-fonts   Medium-fonts   Large-fonts
MSE (low values good)      Tian method       0.0108       0.0197        0.0150         0.0153
                           Oreifej method    0.0051       0.0029        0.0038         0.0078
                           Proposed method   0.0031       0.0015        0.0022         0.0072
PSNR (high values good)    Tian method       19.68        17.05         18.24          18.14
                           Oreifej method    22.92        25.38         24.20          21.08
                           Proposed method   25.09        28.24         26.58          21.43
PPSF (high values good)    Tian method       0.0403       0.0429        0.0651         0.0372
                           Oreifej method    0.0526       0.0435        0.1082         0.0788
                           Proposed method   0.1017       0.0856        0.1778         0.0856
SSIM (high values good)    Tian method       0.6017       0.5465        0.5778         0.6199
                           Oreifej method    0.6331       0.6190        0.7107         0.6317
                           Proposed method   0.7726       0.7326        0.8100         0.7412

B. Performance analysis and comparison

The computational times required by the Tian method and the Oreifej method to process a single frame of size 268×292 are 48 s and 37 s, respectively. For the proposed method, the processing time for the same frame is 23 s, which is 52% shorter than the Tian method and 38% shorter than the Oreifej method. The simulation results of the Tian, Oreifej and proposed methods on the four synthetic sequences are shown in Fig. 2. The ground-truth frames, which were used to synthetically generate the test sequences, are shown in Fig. 2(a). Samples of the distorted frames from each sequence are shown in Fig. 2(b). The means of the restored frames using the Tian, Oreifej and proposed methods are shown in Fig. 2(c) to 2(e), respectively. A mean frame is used here to show the image restoration performance, as it is a good measure of how close the frames are to each other: the sharper the mean frame, the better the frames are geometrically aligned. It is evident that all three methods compensate for the geometric distortions to some extent, especially when compared with the warped frames separately. However, the Tian method is much less effective in doing so, which is notable from the uncompensated regions and intensity distortions of the restored frames in Fig. 2(c). The Oreifej method performs comparatively well in both geometric and intensity corrections, as can be seen from Fig. 2(d). For the proposed method, the restored frames are more geometrically aligned and therefore provide a mean frame closer to the ground truth. It can be argued that there are still some distortions in the restored frames in Fig. 2(e), but these stem from the pixel registration and interpolation techniques, which introduce some processing errors.

The quantitative analysis is conducted for all the synthetic sequences by calculating the MSE, PSNR, PPSF, and SSIM of the mean of the restored frames with respect to their ground truths. A comparison of the calculated metrics is shown in Table 1. It can be noted from the numerical comparison that the proposed method outperforms the other two methods in all the quality metrics for each of the test sequences.

Furthermore, Fig. 3 shows the PSF of the sample frame incomparison to the mean of restored frames by the three meth-ods for the randomly selected Brick-wall sequence. As ex-pected, the PSF of the warped frame in Fig. 3(a) is indistinct

[Fig. 3 shows 3-D PSF surface plots; the PSF peaks are: (a) 0.032354, (b) 0.040328, (c) 0.052632, and (d) 0.10171.]
Fig. 3. PSF plots for the randomly selected Brick-wall sequence: (a) sample frame, (b) using the Tian method, (c) using the Oreifej method, and (d) using the proposed method.

without any notable peak at its center. In the case of the Tian method, although its PSF could provide a reasonable restoration of the warped frames, it is not very distinguishable, as it sports only a weak rise at its center, as shown in Fig. 3(b). The Oreifej method's PSF is slightly more pronounced, as shown in Fig. 3(c), whilst also exhibiting increased activity away from the center. On the other hand, the PSF of the proposed method, shown in Fig. 3(d), has a distinct peak at the center; although not sharp enough to indicate perfect correlation, the peak is significantly higher, with less activity away from the center.
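The paper's exact PSF-estimation procedure is not reproduced in this section. As a hedged sketch of one common alternative, a Wiener-style deconvolution of the restored mean frame against the ground truth yields an effective PSF whose central peak can be read off; all names and the regularization constant are illustrative assumptions:

```python
import numpy as np

def estimate_psf(mean_frame, ground_truth, eps=1e-3):
    """Wiener-style deconvolution: estimate the effective PSF that maps
    the ground truth to the restored mean frame. A tall, centred peak
    indicates good geometric alignment; `eps` regularizes the division."""
    M = np.fft.fft2(mean_frame.astype(float))
    G = np.fft.fft2(ground_truth.astype(float))
    H = M * np.conj(G) / (np.abs(G) ** 2 + eps)
    # Shift the zero-lag term to the centre of the array for plotting.
    return np.fft.fftshift(np.real(np.fft.ifft2(H)))
```

For a perfectly restored frame this PSF collapses to a delta-like peak at the centre of the array; residual misalignment spreads energy away from the centre, matching the qualitative behaviour seen in Fig. 3.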

The performances of the Tian, Oreifej, and proposed methods are also compared in terms of their geometric correction capabilities under noisy conditions. Figure 4 shows the SSIM index versus SNR for the Brick-wall sequence for the three methods. As expected, the restoration performance of all the methods degrades as the noise level increases. In the case of the proposed method, the SSIM index is consistently higher than that of the other two methods for all values of SNR.

Fig. 4. SSIM index of the restored mean frame versus SNR for the Brick-wall sequence.
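The noisy test frames for such a comparison can be synthesized by adding white Gaussian noise scaled to a target SNR. The paper does not give its noise-generation code, so the following is an assumed sketch of the setup:

```python
import numpy as np

def add_noise_at_snr(frame, snr_db, rng=None):
    """Return a copy of `frame` corrupted by white Gaussian noise whose
    power is chosen so the signal-to-noise ratio equals `snr_db` dB."""
    rng = rng if rng is not None else np.random.default_rng()
    f = frame.astype(float)
    signal_power = np.mean(f ** 2)
    # SNR(dB) = 10 log10(signal power / noise power).
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return f + rng.normal(0.0, np.sqrt(noise_power), f.shape)
```

Sweeping `snr_db` over 10–45 dB, restoring each noisy sequence, and computing the SSIM of the restored mean against the ground truth reproduces the kind of curve shown in Fig. 4.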

After verifying the performance of the three methods on the synthetic sequences, they were applied on the real-world Water-colgate sequence [20]. A sample warped frame is shown in Fig. 5(a) for visual comparison. The means of the restored frames using the Tian, Oreifej, and proposed methods are shown in Figs. 5(b) to 5(d), respectively. The Tian method improves the geometric alignment at the cost of considerable blurring in the restored frame. The Oreifej method provides better visual quality than the Tian method, though the restored frame still has some distortions. However, the proposed method outperforms both state-of-the-art methods by providing more details of the frame, which is visually evident from the better legibility of the word ‘stationery’ compared to the other frames in Figs. 5(a) to 5(c).

Since the ground-truth frame is not available in this case, the quality metrics presented in Table 1 are not applicable for quantitative assessment. Instead, an objective comparison is made for these results using the BIQ metric, defined in Eq. (3), and the calculated values are presented in Table 2. For the proposed method, the value of the metric is higher than those of the other two methods, confirming its better geometric correction capability.

Table 2. Comparison of the quality metric between the Tian, Oreifej, and proposed methods for the real-world Water-colgate sequence.

Quality metric            Method             Water-colgate
BIQ (high values: good)   Tian method        0.0056
                          Oreifej method     0.0063
                          Proposed method    0.0115

4. DISCUSSION AND CONCLUSION

An efficient algorithm to recover the non-distorted image of an underwater scene from a distorted video sequence has been successfully implemented in this paper. The proposed approach is


Fig. 5. Restoration results on the real-world Water-colgate sequence [20]: (a) sample distorted frame, (b) restored frame using the Tian method, (c) restored frame using the Oreifej method, and (d) restored frame using the proposed method.

based on estimating the centroid of the random shifting of the pixels of short-exposure frames using a motion estimation technique and using it to obtain a single high-quality image. The performance of the proposed method is highly dependent on the accuracy of the pixel shift map estimation, which is very challenging due to large random fluctuations of the refractive index. A highly accurate pixel registration technique is employed for this purpose. Also, a sharpness assessment strategy provides the sharpest frame to be used as the reference frame, while a k-means clustering algorithm admits only reasonably good-quality frames into the shift map computation. The proposed scheme therefore demonstrates a convincing way of calculating nearly accurate pixel shift maps of the distorted frames, which are further processed and used to warp the reference frame back into its true position.
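Under the assumption that the clustering operates on per-frame sharpness scores with k = 2 (the paper does not spell out this detail, and the gradient-energy score below is a hypothetical stand-in for its sharpness metric), the frame-selection step can be sketched as:

```python
import numpy as np

def sharpness(frame):
    """Gradient-energy sharpness score (hypothetical stand-in for the
    paper's sharpness metric)."""
    gy, gx = np.gradient(frame.astype(float))
    return float(np.mean(gx ** 2 + gy ** 2))

def select_frames(frames, iters=25):
    """Two-cluster 1-D k-means on sharpness scores: keep the sharper
    cluster and use the single sharpest frame as the reference frame."""
    scores = np.array([sharpness(f) for f in frames])
    # Initialize the two centres at the extreme scores.
    centers = np.array([scores.min(), scores.max()], dtype=float)
    for _ in range(iters):
        labels = np.abs(scores[:, None] - centers[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = scores[labels == k].mean()
    keep = labels == centers.argmax()          # sharper cluster
    good = [f for f, use in zip(frames, keep) if use]
    reference = frames[int(scores.argmax())]   # sharpest frame overall
    return good, reference
```

The frames in `good` then feed the shift-map estimation against `reference`, while the discarded cluster holds the overly blurry frames that would degrade the shift maps.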

The proposed method outperforms the state-of-the-art methods on both synthetic and real-world video sequences, especially in terms of visual quality and accuracy. Although an iterative scheme is considered in the proposed method, its performance with a single iteration also provides substantial improvement. The computational time of the proposed method can be easily controlled by various means, such as introducing stopping criteria in the pixel registration algorithm and using the fastest interpolation technique. Implementation of the proposed method on graphics processing units (GPUs) or field-programmable gate arrays (FPGAs) could allow its real-time application to optical communications, mine detection, and surveillance. In its current form, only geometric correction of the frames is of concern; the consideration of other imaging problems, such as localized blur and CCD noise, is a future aim. The proposed method can also be extended to imaging through water-surface reflection and imaging from underwater to observe objects above the water surface.

REFERENCES

1. J. Gilles and S. Osher, “Wavelet burst accumulation for turbulence mitigation,” J. Electron. Imaging 25(3), 033003 (2016).

2. K. K. Halder, M. Tahtali, and S. G. Anavatti, “Simple algorithm for correction of geometrically warped underwater images,” Electron. Lett. 50(23), 1687–1689 (2014).

3. Y. Tian and S. G. Narasimhan, “Seeing through water: Image restoration using model-based tracking,” in Proceedings of IEEE International Conference on Computer Vision (Institute of Electrical and Electronics Engineers, 2009), pp. 2303–2310.

4. Y. Tian and S. G. Narasimhan, “Globally optimal estimation of nonrigid image distortion,” Int. J. Comput. Vis. 98(3), 279–302 (2012).

5. O. Oreifej, S. Guang, T. Pace, and M. Shah, “A two-stage reconstruction approach for seeing through water,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (Institute of Electrical and Electronics Engineers, 2011), pp. 1153–1160.

6. A. V. Kanaev, W. Hou, S. Woods, and L. N. Smith, “Restoration of turbulence degraded underwater images,” Opt. Eng. 51(5), 057007 (2012).

7. A. Galdran, D. Pardo, A. Picón, and A. Alvarez-Gila, “Automatic Red-Channel underwater image restoration,” J. Vis. Commun. Image Represent. 26, 132–145 (2015).

8. K. Seemakurthy and A. N. Rajagopalan, “Deskewing of underwater images,” IEEE Trans. Image Process. 24(3), 1046–1059 (2015).

9. W. Hou, “A simple underwater imaging model,” Opt. Lett. 34(17), 2688–2690 (2009).

10. M. Boffety, F. Galland, and A.-G. Allais, “Influence of polarization filtering on image registration precision in underwater conditions,” Opt. Lett. 37(15), 3273–3275 (2012).

11. G. Wang, B. Zheng, and F. F. Sun, “Estimation-based approach for underwater image restoration,” Opt. Lett. 36(13), 2384–2386 (2011).

12. R. Schettini and S. Corchs, “Underwater image processing: State of the art of restoration and image enhancement methods,” EURASIP J. Adv. Signal Process. 2010, 746052 (2010).

13. M. L. Holohan and J. C. Dainty, “Low-order adaptive optics: A possible use in underwater imaging?,” Opt. Laser Technol. 29(1), 51–55 (1997).

14. H. Murase, “Surface shape reconstruction of a nonrigid transparent object using refraction and motion,” IEEE Trans. Pattern Anal. Mach. Intell. 14(10), 1045–1052 (1992).

15. R. Shefer, M. Malhi, and A. Shenhar, “Waves distortion correction using cross-correlation,” http://visl.technion.ac.il/projects/2000maor/ (2001).

16. A. Efros, V. Isler, J. Shi, and M. Visontai, “Seeing through water,” in Proceedings of Neural Information Processing Systems (Neural Information Processing Systems Foundation, 2004).

17. A. Donate, G. Dahme, and E. Ribeiro, “Classification of textures distorted by water waves,” in Proceedings of International Conference on Pattern Recognition (Institute of Electrical and Electronics Engineers, 2006), pp. 421–424.

18. A. Donate and E. Ribeiro, “Improved reconstruction of images distorted by water waves,” in Advances in Computer Graphics and Computer Vision (J. Braz, A. Ranchordas, H. Araújo, and J. Jorge, eds.), vol. 4 of Communications in Computer and Information Science (Springer Berlin Heidelberg, 2007), pp. 264–277.

19. D. M. Milder, P. W. Carter, N. L. Flacco, B. E. Hubbard, N. M. Jones, K. R. Panici, B. D. Platt, R. E. Potter, K. W. Tong, and D. J. Twisselmann, “Reconstruction of through-surface underwater imagery,” Waves Random Complex Media 16(4), 521–530 (2006).

20. Z. Wen, A. Lambert, D. Fraser, and H. Li, “Bispectral analysis and recovery of images distorted by a moving water surface,” Appl. Opt. 49(33), 6376–6384 (2010).

21. O. Oreifej, X. Li, and M. Shah, “Simultaneous video stabilization and moving object detection in turbulence,” IEEE Trans. Pattern Anal. Mach. Intell. 35(2), 450–462 (2013).

22. C. Cox and W. Munk, “Slopes of the sea surface deduced from photographs of sun glitter,” Bull. Scripps Inst. Oceanogr. 6(9), 401–479 (1956).

23. M. Tahtali, D. Fraser, and A. J. Lambert, “Restoration of non-uniformly warped images using a typical frame as prototype,” in Proceedings of IEEE Region 10 Conference (Institute of Electrical and Electronics Engineers, 2005), pp. 1–6.

24. K. K. Halder, M. Tahtali, and S. G. Anavatti, “Geometric correction of atmospheric turbulence-degraded video containing moving objects,” Opt. Express 23(4), 5091–5101 (2015).

25. S. Gabarda and G. Cristóbal, “Blind image quality assessment through anisotropy,” J. Opt. Soc. Am. A 24(12), B42–B51 (2007).

26. A. Myronenko and X. Song, “Intensity-based image registration by minimizing residual complexity,” IEEE Trans. Med. Imaging 29(11), 1882–1891 (2010).

27. A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: A review,” ACM Comput. Surv. 31(3), 264–323 (1999).

28. S. Yu, L.-C. Tranchevent, X. Liu, W. Glänzel, J. A. K. Suykens, B. De Moor, and Y. Moreau, “Optimized data fusion for kernel k-means clustering,” IEEE Trans. Pattern Anal. Mach. Intell. 34(5), 1031–1039 (2012).

29. M. H. Furhad, M. Tahtali, and A. Lambert, “Restoring atmospheric-turbulence-degraded images,” Appl. Opt. 55(19), 5082–5090 (2016).

30. M. Tahtali, A. J. Lambert, and D. Fraser, “Graphics processing unit restoration of non-uniformly warped images using a typical frame as prototype,” Proc. SPIE 7800, 78000H (2010).

31. M. Tahtali, “Imaging techniques through the atmosphere,” Ph.D. dissertation (University of New South Wales, 2007).

32. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process. 13(4), 600–612 (2004).