IEEE Final Year Projects 2011-2012 :: Elysium Technologies Pvt Ltd :: Image Processing

Elysium Technologies Private Limited ISO 9001:2008 A leading Research and Development Division Madurai | Chennai | Trichy | Coimbatore | Kollam| Singapore Website: elysiumtechnologies.com, elysiumtechnologies.info Email: [email protected]

IEEE Project List 2011 - 2012

Madurai

Elysium Technologies Private Limited

230, Church Road, Annanagar,

Madurai , Tamilnadu – 625 020.

Contact : 91452 4390702, 4392702, 4394702.

eMail: [email protected]

Trichy


3rd Floor, SI Towers,

15, Melapudur, Trichy,

Tamilnadu – 620 001.

Contact : 91431 - 4002234.


Kollam


Surya Complex, Vendor junction,

Kollam, Kerala – 691 010.

Contact : 91474 2723622.


Abstract IMAGE PROCESSING 2011 - 2012


01 1-D Transforms for the Motion Compensation Residual

Transforms used in image coding are also commonly used to compress prediction residuals in video coding. Prediction

residuals have different spatial characteristics from images, and it is useful to develop transforms that are adapted to

prediction residuals. In this paper, we explore the differences between the characteristics of images and motion

compensated prediction residuals by analyzing their local anisotropic characteristics and develop transforms adapted to

the local anisotropic characteristics of these residuals. The analysis indicates that many regions of motion compensated

prediction residuals have 1-D anisotropic characteristics and we propose to use 1-D directional transforms for these

regions. We present experimental results with one example set of such transforms within the H.264/AVC codec and the

results indicate that the proposed transforms can improve the compression efficiency of motion compensated prediction

residuals over conventional transforms.

02 A Filtering Approach to Edge Preserving MAP Estimation of Images

The authors present a computationally efficient technique for maximum a posteriori (MAP) estimation of images in the presence of both

blur and noise. The image is divided into statistically independent regions. Each region is modelled with a WSS Gaussian prior. Classical

Wiener filter theory is used to generate a set of convex sets in the solution space, with the solution to the MAP estimation problem lying at

the intersection of these sets. The proposed algorithm uses an underlying segmentation of the image, and a means of determining the

segmentation and refining it are described. The algorithm is suitable for a range of image restoration problems, as it provides a

computationally efficient means to deal with the shortcomings of Wiener filtering without sacrificing the computational simplicity of the

filtering approach. The algorithm is also of interest from a theoretical viewpoint as it provides a continuum of solutions between Wiener

filtering and Inverse filtering depending upon the segmentation used. We do not attempt to show here that the proposed method is the best

general approach to the image reconstruction problem. However, related work referenced herein shows excellent performance in the

specific problem of demosaicing.
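The relationship between Wiener filtering and inverse filtering mentioned in this abstract can be illustrated with a minimal 1-D sketch. It is not the authors' segmentation-based MAP algorithm. The circular blur model, the constant noise-to-signal ratio `nsr`, and all function names are assumptions made for illustration only.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def circular_blur(signal, kernel):
    """Circular convolution of the signal with a short kernel."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(len(kernel)))
            for i in range(n)]

def wiener_deconvolve(observed, kernel, nsr):
    """Restore a circularly blurred signal in the frequency domain.

    nsr is an assumed constant noise-to-signal power ratio:
    nsr = 0 gives plain inverse filtering, larger values damp the
    frequencies where the blur response is weak.
    """
    n = len(observed)
    padded = list(kernel) + [0.0] * (n - len(kernel))
    H = dft(padded)
    Y = dft(observed)
    X = [Y[k] * H[k].conjugate() / (abs(H[k]) ** 2 + nsr) for k in range(n)]
    return [v.real for v in dft(X, inverse=True)]

signal = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
kernel = [0.6, 0.3, 0.1]          # frequency response is nonzero everywhere
blurred = circular_blur(signal, kernel)
restored = wiener_deconvolve(blurred, kernel, nsr=0.0)
regularized = wiener_deconvolve(blurred, kernel, nsr=0.1)
```

With `nsr=0.0` and no noise, the sketch recovers the original signal, which is the inverse-filtering end of the continuum. A positive `nsr` trades sharpness for noise suppression.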

03 A Generalized Unsharp Masking Algorithm

Enhancement of contrast and sharpness of an image is required in many applications. Unsharp masking is a classical tool

for sharpness enhancement. We propose a generalized unsharp masking algorithm using the exploratory data model as a

unified framework. The proposed algorithm is designed to address three issues: 1) simultaneously enhancing contrast and

sharpness by means of individual treatment of the model component and the residual, 2) reducing the halo effect by means

of an edge-preserving filter, and 3) solving the out-of-range problem by means of log-ratio and tangent operations. We also

present a study of the properties of the log-ratio operations and reveal a new connection between the Bregman divergence

and the generalized linear systems. This connection not only provides a novel insight into the geometrical property of such

systems, but also opens a new pathway for system development. We present a new system called the tangent system

which is based upon a specific Bregman divergence. Experimental results, which are comparable to recently published

results, show that the proposed algorithm is able to significantly improve the contrast and sharpness of an image. In the

proposed algorithm, the user can adjust the two parameters controlling the contrast and sharpness to produce the desired


results. This makes the proposed algorithm practically useful.
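For orientation, the classical unsharp masking that this abstract generalizes can be sketched on a 1-D step edge. This is a minimal illustration, not the proposed algorithm. The box blur, the `gain` parameter, and the function names are assumptions. The example shows the overshoot (halo) and out-of-range values that the abstract sets out to address.

```python
def box_blur(signal, radius=1):
    """Moving-average low-pass filter with edge replication."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(i + d, 0), n - 1)] for d in range(-radius, radius + 1)]
        out.append(sum(window) / len(window))
    return out

def unsharp_mask(signal, gain=1.0, radius=1):
    """Classical unsharp masking: add back a scaled detail (high-pass) layer."""
    smooth = box_blur(signal, radius)
    return [s + gain * (s - m) for s, m in zip(signal, smooth)]

# A step edge inside an 8-bit range.
edge = [10, 10, 10, 10, 250, 250, 250, 250]
sharpened = unsharp_mask(edge, gain=1.0)
# The two samples next to the edge leave the valid range [0, 255].
```

The values next to the edge fall below 0 and rise above 255, so a practical implementation has to clip or, as the abstract proposes, use operations that stay in range by construction.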

04 A Geometric Method for Optimal Design of Color Filter Arrays

A color filter array (CFA) used in a digital camera is a mosaic of spectrally selective filters, which allows only one color

component to be sensed at each pixel. The missing two components of each pixel have to be estimated by methods known

as demosaicking. The demosaicking algorithm and the CFA design are crucial for the quality of the output images. In this

paper, we present a CFA design methodology in the frequency domain. The frequency structure, which is shown to be just

the symbolic DFT of the CFA pattern (one period of the CFA), is introduced to represent images sampled with any

rectangular CFAs in the frequency domain. Based on the frequency structure, the CFA design involves the solution of a

constrained optimization problem that aims at minimizing the demosaicking error. To decrease the number of parameters

and speed up the parameter searching, the optimization problem is reformulated as the selection of geometric points on the

boundary of a convex polygon or the surface of a convex polyhedron. Using our methodology, several new CFA patterns

are found, which outperform the currently commercialized and published ones. Experiments demonstrate the effectiveness

of our CFA design methodology and the superiority of our new CFA patterns.
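To make the sampling model concrete, the sketch below simulates a CFA that senses one color component per pixel. It assumes the common Bayer layout in RGGB order, which is not one of the new patterns the paper designs. The function names are hypothetical.

```python
def bayer_channel(row, col):
    """Index of the color sensed at (row, col) under an assumed RGGB Bayer layout."""
    if row % 2 == 0:
        return 0 if col % 2 == 0 else 1     # even rows: R G R G ...
    return 1 if col % 2 == 0 else 2         # odd rows:  G B G B ...

def mosaic(rgb_image):
    """Keep one color sample per pixel, as a single-sensor camera does."""
    return [[pixel[bayer_channel(r, c)] for c, pixel in enumerate(row)]
            for r, row in enumerate(rgb_image)]

image = [[(10, 20, 30), (11, 21, 31)],
         [(12, 22, 32), (13, 23, 33)]]
raw = mosaic(image)
```

Demosaicking is the inverse task: estimating the two discarded components at every pixel of `raw`.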

05 A Hybrid Approach to Detect and Localize Texts in Natural Scene Images

Text detection and localization in natural scene images is important for content-based image analysis. This problem is

challenging due to the complex background, the non-uniform illumination, the variations of text font, size and line

orientation. In this paper, we present a hybrid approach to robustly detect and localize texts in natural scene images. A text

region detector is designed to estimate text confidence and scale information in an image pyramid, which helps

segment candidate text components by local binarization. To efficiently filter out the non-text components, a conditional

random field (CRF) model considering unary component properties and binary contextual component relationships with

supervised parameter learning is proposed. Finally, text components are grouped into text lines/words with a learning-

based energy minimization method. Since all the three stages are learning-based, there are very few parameters requiring

manual tuning. Experimental results evaluated on the ICDAR2005 competition dataset show that our approach yields higher

precision and recall performance compared with state-of-the-art methods. We also evaluated our approach on a multilingual

image dataset with promising results.

06 A Linear Programming Approach for Optimal Contrast-Tone Mapping

This paper proposes a novel algorithmic approach of image enhancement via optimal contrast-tone mapping. In a

fundamental departure from the current practice of histogram equalization for contrast enhancement, the proposed

approach maximizes expected contrast gain subject to an upper limit on tone distortion and optionally to other constraints

that suppress artifacts. The underlying contrast-tone optimization problem can be solved efficiently by linear programming.

This new constrained optimization approach for image enhancement is general, and the user can add and fine tune the

constraints to achieve desired visual effects. Experimental results demonstrate clearly superior performance of the new

approach over histogram equalization and its variants.
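As a point of reference, the histogram equalization baseline that this abstract departs from can be sketched as follows. This is the textbook method, not the proposed linear programming formulation. The function name and the flat pixel list are assumptions for illustration.

```python
def equalize_histogram(pixels, levels=256):
    """Map gray levels through the normalized cumulative histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:              # constant image: nothing to stretch
        return list(pixels)
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [lut[p] for p in pixels]

low_contrast = [100, 100, 101, 101, 102, 102, 103, 103]
equalized = equalize_histogram(low_contrast)
```

Four adjacent gray levels are spread over the full range. Equalization offers no control over how far tones are moved, which is the kind of distortion the constrained formulation in the abstract is meant to limit.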


07 A Majorize–Minimize Strategy for Subspace Optimization Applied to Image Restoration

This paper proposes accelerated subspace optimization methods in the context of image restoration. Subspace

optimization methods belong to the class of iterative descent algorithms for unconstrained optimization. At each iteration

of such methods, a stepsize vector allowing the best combination of several search directions is computed through a

multidimensional search. It is usually obtained by an inner iterative second-order method ruled by a stopping criterion that

guarantees the convergence of the outer algorithm. As an alternative, we propose an original multidimensional search

strategy based on the majorize–minimize principle. It leads to a closed-form stepsize formula that ensures the convergence

of the subspace algorithm whatever the number of inner iterations. The practical efficiency of the proposed scheme is

illustrated in the context of edge-preserving image restoration.

08 A Maximum Likelihood Approach to Joint Image Registration and Fusion

Both image registration and fusion can be formulated as estimation problems. Instead of estimating the registration

parameters and the true scene separately as in the conventional way, we propose a maximum likelihood approach for joint

image registration and fusion in this paper. More precisely, the fusion performance is used as the criterion to evaluate the

registration accuracy. Hence, the registration parameters can be automatically tuned so that both fusion and registration

can be optimized simultaneously. The expectation maximization algorithm is employed to solve this joint optimization

problem. The Cramer-Rao bound (CRB) is then derived. Our experiments use several types of sensory images for

performance evaluation, such as visual images, IR thermal images, and hyperspectral images. It is shown that the mean

square error of estimating the registration parameters using the proposed method is close to the CRBs. At the same time,

an improved fusion performance can be achieved in terms of the edge preservation measure Q^AB/F , compared to the

Laplacian pyramid fusion approach.

09 A New Hybrid Method for Image Approximation Using the Easy Path Wavelet Transform

The easy path wavelet transform (EPWT) has recently been proposed by one of the authors as a tool for sparse

representations of bivariate functions from discrete data, in particular from image data. The EPWT is a locally adaptive

wavelet transform. It works along pathways through the array of function values and exploits the local correlations of the

given data in a simple appropriate manner. However, the EPWT suffers from its adaptivity costs that arise from the storage

of path vectors. In this paper, we propose a new hybrid method for image approximation that exploits the advantages of the

usual tensor product wavelet transform for the representation of smooth images and uses the EPWT for an efficient

representation of edges and texture. Numerical results show the efficiency of this procedure.

10 A Novel 3-D Color Histogram Equalization Method With Uniform 1-D Gray Scale Histogram


The majority of color histogram equalization methods do not yield a uniform histogram in gray scale. After converting a color

histogram equalized image into gray scale, the contrast of the converted image is worse than that of a 1-D gray scale

histogram equalized image. We propose a novel 3-D color histogram equalization method that produces uniform

distribution in gray scale histogram by defining a new cumulative probability density function in 3-D color space. Test

results with natural and synthetic images are presented to compare and analyze various color histogram equalization

algorithms based upon 3-D color histograms. We also present a theoretical analysis of the nonideal performance of existing

methods.

11 A Stratified Approach for Camera Calibration Using Spheres

This paper proposes a stratified approach for camera calibration using spheres. Previous works have exploited epipolar

tangents to locate frontier points on spheres for estimating the epipolar geometry. It is shown in this paper that other than

the frontier points, two additional point features can be obtained by considering the bitangent envelopes of a pair of

spheres. A simple method for locating the images of such point features and the sphere centers is presented. An algorithm

for recovering the fundamental matrix in a plane plus parallax representation using these recovered image points and the

epipolar tangents from three spheres is developed. A new formulation of the absolute dual quadric as a cone tangent to a

dual sphere with the plane at infinity being its vertex is derived. This allows the recovery of the absolute dual quadric,

which is used to upgrade the weak calibration to a full calibration. Experimental results on both synthetic and real data are

presented, which demonstrate the feasibility and the high precision achieved by our proposed algorithm.

12 A Uniform Framework for Estimating Illumination Chromaticity, Correspondence, and Specular Reflection

Based upon a new correspondence matching invariant called illumination chromaticity constancy, we present a new

solution for illumination chromaticity estimation, correspondence searching, and specularity removal. Using as few as two

images, the core of our method is the computation of a vote distribution for a number of illumination chromaticity

hypotheses via correspondence matching. The hypothesis with the highest vote is accepted as correct. The estimated

illumination chromaticity is then used together with the new matching invariant to match highlights, which inherently

provides solutions for correspondence searching and specularity removal. Our method differs from the previous

approaches: those treat these vision problems separately and generally require that specular highlights be detected in a

preprocessing step. Also, our method uses more images than previous illumination chromaticity estimation methods,

which increases its robustness because more inputs/constraints are used. Experimental results on both synthetic and real

images demonstrate the effectiveness of the proposed method.

13 A Variational Model for Histogram Transfer of Color Images

In this paper, we propose a variational formulation for histogram transfer of two or more color images. We study an energy

functional composed of three terms: one tends to approach the cumulative histograms of the transformed images, the

other two tend to maintain the colors and geometry of the original images. By minimizing this energy, we obtain an

algorithm that balances equalization and the conservation of features of the original images. As a result, they evolve while

approaching an intermediate histogram between them. This intermediate histogram does not need to be specified in

advance, but it is a natural result of the model. Finally, we provide experiments showing that the proposed method


compares well with the state of the art.

14 A Variational Model for Segmentation of Overlapping Objects With Additive Intensity Value

We propose a variant of the Mumford–Shah model for the segmentation of a pair of overlapping objects with additive

intensity value. Unlike standard segmentation models, it not only determines distinct objects in the image, but also

recovers the possibly multiple membership of the pixels. To accomplish this, some a priori knowledge about the

smoothness of the object boundary is integrated into the model. Additivity is imposed through a soft constraint which

allows the user to control the degree of additivity and is more robust than the hard constraint. We also show analytically

that the additivity parameter can be chosen to achieve some stability conditions. To solve the optimization problem

involving geometric quantities efficiently, we apply a multiphase level set method. Segmentation results on synthetic and

real images validate the good performance of our model, and demonstrate the model’s applicability to images with multiple

channels and multiple objects.

15 Accelerating X-Ray Data Collection Using Pyramid Beam Ray Casting Geometries

Image reconstruction from its projections is a necessity in many applications such as medical (CT), security, inspection,

and others. This paper extends the 2-D Fan-beam method in [2] to 3-D. The algorithm, called Pyramid Beam (PB), is based

upon the parallel reconstruction algorithm in [1]. It allows fast capturing of the scanned data, and in 3-D, the

reconstructions are based upon the discrete X-ray transform [1]. The PB geometries are reordered to fit parallel projection

geometry. The underlying idea is to use the algorithm in [1] by porting the proposed PB geometries to fit the algorithm in

[1]. The complexity of the algorithm is comparable with the 3-D FFT. The results show excellent reconstruction qualities

while being simple for practical use.

16 Adaptive Multiwavelet-Based Watermarking Through JPW Masking

In this paper, a multibit, multiplicative, spread spectrum watermarking using the discrete multiwavelet (including

unbalanced and balanced multiwavelet) transform is presented. Performance improvement with respect to existing

algorithms is obtained by means of a new just perceptual weighting (JPW) model. The new model incorporates various

masking effects of human visual perception by taking into account the eye’s sensitivity to noise changes depending on

spatial frequency, luminance and texture of all the image subbands. In contrast to the conventional JND threshold model, JPW,

which describes minimum perceptual sensitivity weighting to noise changes, is better suited to nonadditive watermarking. Specifically,

watermarking strength is adaptively adjusted to obtain minimum perceptual distortion by employing the JPW model.

Correspondingly, an adaptive optimum decoding is derived using a statistical model based on generalized-Gaussian

distribution (GGD) for multiwavelet coefficients of the cover-image. Furthermore, the impact of multiwavelet characteristics

on the proposed watermarking scheme is also analyzed. Finally, the experimental results show that the proposed JPW model can

improve the quality of the watermarked image and make the watermark more robust than a variety of

state-of-the-art algorithms.


17 Adaptive Sequential Prediction of Multidimensional Signals With Applications to Lossless Image Coding

We investigate the problem of designing adaptive sequential linear predictors for the class of piecewise autoregressive

multidimensional signals, and adopt an approach of minimum description length (MDL) to determine the order of the

predictor and the support on which the predictor operates. The design objective is to strike a balance between the bias and

variance of the prediction errors in the MDL criterion. The predictor design problem is particularly interesting and

challenging for multidimensional signals (e.g., images and videos) because of the increased degree of freedom in choosing

the predictor support. Our main result is a new technique of sequentializing a multidimensional signal into a sequence of

nested contexts of increasing order to facilitate the MDL search for the order and the support shape of the predictor, and

the sequentialization is made adaptive on a sample by sample basis. The proposed MDL-based adaptive predictor is applied

to lossless image coding, and its performance is empirically established to be the best among all the results that have been

published to date.
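The principle behind predictive lossless coding can be sketched on a 1-D integer signal. The sketch uses a one-tap least-squares predictor refitted on the samples seen so far. It does not implement the MDL-based selection of predictor order and support described in the abstract, and all names are hypothetical.

```python
def predict(history):
    """Least-squares one-tap predictor fitted to the samples seen so far."""
    if len(history) < 3:
        return history[-1] if history else 0
    num = sum(a * b for a, b in zip(history[1:], history[:-1]))
    den = sum(a * a for a in history[:-1])
    coeff = num / den if den else 1.0
    return round(coeff * history[-1])

def encode(samples):
    """Replace each integer sample by its prediction residual."""
    return [x - predict(samples[:i]) for i, x in enumerate(samples)]

def decode(residuals):
    """Rebuild the samples by running the same predictor on decoded data."""
    samples = []
    for r in residuals:
        samples.append(r + predict(samples))
    return samples

samples = [10, 12, 15, 19, 24, 30, 37, 45]
residuals = encode(samples)
recovered = decode(residuals)
```

Because the decoder repeats exactly the same causal predictions, the round trip is lossless. The residuals are much smaller than the samples, which is what an entropy coder would then exploit.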

18 An Augmented Lagrangian Approach to the Constrained Optimization Formulation of Imaging Inverse Problems

We propose a new fast algorithm for solving one of the standard approaches to ill-posed linear inverse problems (IPLIP),

where a (possibly nonsmooth) regularizer is minimized under the constraint that the solution explains the observations

sufficiently well. Although the regularizer and constraint are usually convex, several particular features of these problems

(huge dimensionality, nonsmoothness) preclude the use of off-the-shelf optimization tools and have stimulated a

considerable amount of research. In this paper, we propose a new efficient algorithm to handle one class of constrained

problems (often known as basis pursuit denoising) tailored to image recovery applications. The proposed algorithm, which

belongs to the family of augmented Lagrangian methods, can be used to deal with a variety of imaging IPLIP, including

deconvolution and reconstruction from compressive observations (such as MRI), using either total-variation or wavelet-

based (or, more generally, frame-based) regularization. The proposed algorithm is an instance of the so-called alternating

direction method of multipliers, for which sufficient conditions for convergence are known; we show that these conditions are

satisfied by the proposed algorithm. Experiments on a set of image restoration and reconstruction benchmark problems

show that the proposed algorithm is a strong contender for the state-of-the-art.

19 An Improved Image Compression Algorithm Using Binary Space Partition Scheme and Geometric Wavelets

The geometric wavelet is a recent development in the field of multivariate nonlinear piecewise polynomial approximation. The

present study improves the geometric wavelet (GW) image coding method by using the slope intercept representation of

the straight line in the binary space partition scheme. The performance of the proposed algorithm is compared with the

wavelet transform-based compression methods such as the embedded zerotree wavelet (EZW), the set partitioning in

hierarchical trees (SPIHT) and the embedded block coding with optimized truncation (EBCOT), and other recently


developed “sparse geometric representation” based compression algorithms. The proposed image compression algorithm

outperforms the EZW, the Bandelets and the GW algorithm. The presented algorithm reports a gain of 0.22 dB over the GW

method at the compression ratio of 64 for the Cameraman test image.

20 An Iterative Shrinkage Approach to Total-Variation Image Restoration

The problem of restoration of digital images from their degraded measurements plays a central role in a multitude of

practically important applications. A particularly challenging instance of this problem occurs in the case when the

degradation phenomenon is modeled by an ill-conditioned operator. In such a situation, the presence of noise makes it

impossible to recover a valuable approximation of the image of interest without using some a priori information about its

properties. Such a priori information—commonly referred to as simply priors—is essential for image restoration, rendering

it stable and robust to noise. Moreover, using the priors makes the recovered images exhibit some plausible features of

their original counterpart. Particularly, if the original image is known to be a piecewise smooth function, one of the standard

priors used in this case is defined by the Rudin-Osher-Fatemi model, which results in total variation (TV) based image

restoration. The current arsenal of algorithms for TV-based image restoration is vast. In the present paper, a different

approach to the solution of the problem is proposed based upon the method of iterative shrinkage (aka iterated

thresholding). In the proposed method, the TV-based image restoration is performed through a recursive application of two

simple procedures, viz. linear filtering and soft thresholding. Therefore, the method can be identified as belonging to the

group of first-order algorithms which are efficient in dealing with images of relatively large sizes. Another valuable feature

of the proposed method consists in its working directly with the TV functional, rather than with its smoothed versions.

Moreover, the method provides a single solution for both isotropic and anisotropic definitions of the TV functional, thereby

establishing a useful connection between the two formulae. Finally, a number of standard examples of image deblurring are

demonstrated, in which the proposed method can provide restoration results of superior quality as compared to the case of

sparse-wavelet deconvolution.
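The two building blocks named in this abstract, a linear step followed by soft thresholding, can be sketched with a generic iterative shrinkage loop. The sketch below minimizes an l1-regularized least-squares objective rather than the TV functional treated in the paper. The objective, the step size rule, and the function names are assumptions for illustration.

```python
def soft_threshold(v, t):
    """Shrink each entry toward zero by t; entries with |x| <= t become 0."""
    return [max(abs(x) - t, 0.0) * (1 if x > 0 else -1) for x in v]

def ista(A, y, lam, step, iterations=100):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1 by iterative shrinkage.

    step should not exceed 1 / (largest eigenvalue of A^T A) for convergence.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iterations):
        # Linear step: gradient of the quadratic data term.
        residual = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        gradient = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Nonlinear step: soft thresholding.
        x = soft_threshold([x[j] - step * gradient[j] for j in range(n)], step * lam)
    return x

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
estimate = ista(identity, [3.0, -0.5, -2.0], lam=1.0, step=1.0)
```

For the identity operator the loop reaches the closed-form answer, the soft-thresholded data, after the first iteration. Each iteration only needs products with `A` and its transpose, which is why first-order schemes of this kind scale to large images.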

21 An Optimal Data Hiding Scheme With Tree-Based Parity Check

Reducing distortion between the cover object and the stego object is an important issue for steganography. The tree-based

parity check method is very efficient for hiding a message on image data due to its simplicity. Based on this approach, we

propose a majority vote strategy that results in least distortion for finding a stego object. The lower embedding efficiency of

our method is better than that of previous works when the hidden message length is relatively large.

22 An Orientation Inference Framework for Surface Reconstruction From Unorganized Point Clouds

In this paper, we present an orientation inference framework for reconstructing implicit surfaces from unoriented point

clouds. The proposed method starts from building a surface approximation hierarchy comprising a set of unoriented

local surfaces, which are represented as a weighted combination of radial basis functions. We formulate the determination

of the globally consistent orientation as a graph optimization problem by treating the local implicit patches as nodes. An


energy function is defined to penalize inconsistent orientation changes by checking the sign consistency between

neighboring local surfaces. An optimal labeling of the graph nodes indicating the orientation of each local surface can,

thus, be obtained by minimizing the total energy defined on the graph. The local inference results are propagated over the

model in a front-propagation fashion to obtain the global solution. The reconstructed surfaces are consolidated by a simple

and effective inspection procedure to locate the erroneously fitted local surfaces. A progressive reconstruction algorithm

that iteratively includes more oriented points to improve the fitting accuracy and efficiently updates the RBF coefficients is

proposed. We demonstrate the performance of the proposed method by showing the surface reconstruction results on

some real-world 3-D data sets with comparison to those by using the previous methods.

23 Anisotropic Morphological Filters With Spatially-Variant Structuring Elements Based on Image-Dependent Gradient Fields

This paper deals with the theory and applications of spatially-variant discrete mathematical morphology. We review and

formalize the definition of spatially variant dilation/erosion and opening/closing for binary and gray-level images using

exclusively the structuring function, without resorting to complement. This theoretical framework allows us to build

morphological operators whose structuring elements can locally adapt their shape and orientation across the dominant

direction of the structures in the image. The shape and orientation of the structuring element at each pixel are extracted

from the image under study: the orientation is given by means of a diffusion process of the average square gradient field,

which regularizes and extends the orientation information from the edges of the objects to the homogeneous areas of the

image; and the shape of the oriented structuring elements can be linear or it can be given by the distance to relevant

edges of the objects. The proposed filters are used on binary and gray-level images for enhancement of anisotropic

features such as coherent, flow-like structures. Results of spatially-variant erosions/dilations and openings/closings-based

filters prove the validity of this theoretically sound and novel approach.
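A 1-D sketch may help to show what spatially-variant erosion, dilation, and opening mean when only the structuring function is used. Here each position has its own window radius, supplied by hand. In the paper the shape and orientation come from a diffused gradient field, which this sketch does not attempt. All names are hypothetical.

```python
def sv_erosion(f, radius):
    """Erosion: minimum of f over the window B(x) = [x - radius[x], x + radius[x]]."""
    n = len(f)
    return [min(f[max(0, x - radius[x]):min(n, x + radius[x] + 1)]) for x in range(n)]

def sv_dilation(f, radius):
    """Adjoint dilation: maximum of f over every y whose window B(y) contains x."""
    n = len(f)
    return [max(f[y] for y in range(n) if abs(x - y) <= radius[y]) for x in range(n)]

def sv_opening(f, radius):
    """Opening: erosion followed by the adjoint dilation."""
    return sv_dilation(sv_erosion(f, radius), radius)

signal = [0, 5, 0, 0, 0, 0, 5, 0, 0, 0]
radius = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]   # larger windows only on the right half
opened = sv_opening(signal, radius)
```

The narrow peak on the left survives because the window there has radius 0, while the identical peak on the right is removed by the wider window. Pairing the erosion with its adjoint dilation keeps the opening anti-extensive and idempotent even though the window changes from point to point.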

24 Autofluorescence Removal by Non-Negative Matrix Factorization

This paper describes a new, physically interpretable, fully automatic algorithm for removal of tissue autofluorescence (AF)

from fluorescence microscopy images, by non-negative matrix factorization. Measurement of signal intensities from the

concentration of certain fluorescent reporter molecules at each location within a sample of biological tissue is confounded

by fluorescence produced by the tissue itself (autofluorescence). Spectral mixing models use mixing coefficients to specify

how much fluorescence from each source is present and unmixing algorithms separate the two fluorescent sources.

Current spectral unmixing methods for AF removal often require a priori knowledge of mixing coefficients. Those which do

not, such as principal component analysis, generate negative mixing coefficients that are not physically meaningful. Non-

negative matrix factorization constrains mixing coefficients to be non-negative, and has been used for spectral unmixing,

but not AF removal. This paper describes a novel non-negative matrix factorization algorithm which separates fluorescent

images into true signal and AF components utilizing an estimate of the dark current. We also present a test-bed, based on

fluorescent beads, to compare the performance of different AF removal algorithms. Our algorithm out-performed previous

state of the art on validation images.
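The non-negativity constraint can be illustrated with the standard multiplicative-update rules for non-negative matrix factorization. This is a generic sketch, not the paper's algorithm: it does not use a dark current estimate, and the toy data, the random initialization, and the function names are assumptions.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    WH = matmul(W, H)
    return sum((v - w) ** 2 for rv, rw in zip(V, WH) for v, w in zip(rv, rw)) ** 0.5

def nmf(V, rank, iterations=200, seed=0, eps=1e-12):
    """Factor a non-negative V (m x n) as W (m x rank) times H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # Update H: elementwise H * (W^T V) / (W^T W H).
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # Update W: elementwise W * (V H^T) / (W H H^T).
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy data: four pixels (rows) measured in three channels (columns),
# built as non-negative mixtures of two source spectra.
V = [[1.0, 0.5, 0.0],
     [0.0, 0.5, 1.0],
     [1.0, 1.0, 1.0],
     [2.0, 1.5, 1.0]]
W, H = nmf(V, rank=2)
```

Because every update multiplies by a ratio of non-negative quantities, the mixing coefficients in `W` and the spectra in `H` can never become negative, unlike the components produced by principal component analysis.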

25 Automatic Exact Histogram Specification for Contrast Enhancement and Visual System Based Quantitative Evaluation


Histogram equalization, which aims at information maximization, is widely used in different ways to perform contrast

enhancement in images. In this paper, an automatic exact histogram specification technique is proposed and used for

global and local contrast enhancement of images. The desired histogram is obtained by first subjecting the image

histogram to a modification process and then by maximizing a measure that represents increase in information and

decrease in ambiguity. A new method of measuring image contrast based upon local band-limited approach and center-

surround retinal receptive field model is also devised in this paper. This method works at multiple scales (frequency bands)

and combines the contrast measures obtained at different scales using QQ-norm. In comparison to a few existing methods,

the effectiveness of the proposed automatic exact histogram specification technique in enhancing contrasts of images is

demonstrated through qualitative analysis and the proposed image contrast measure based quantitative analysis.
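The idea of exact histogram specification can be sketched for the simplest target, a perfectly uniform histogram. Pixels are ranked and gray levels are handed out by rank, so every level receives the same count. The sketch breaks ties by position, which is a simplification, and it does not reproduce the automatic choice of the target histogram described in the abstract. The function name is hypothetical.

```python
def exact_uniform_specification(pixels, levels):
    """Assign gray levels by rank so that every level gets the same pixel count.

    Ties between equal pixels are broken by position (stable sort). This is a
    simplification; a practical method would order tied pixels using
    information from their neighborhoods.
    """
    n = len(pixels)
    order = sorted(range(n), key=lambda i: pixels[i])   # stable: ties keep index order
    out = [0] * n
    for rank, index in enumerate(order):
        out[index] = rank * levels // n
    return out

image = [5, 5, 5, 6, 6, 7, 7, 7]
result = exact_uniform_specification(image, levels=4)
```

Ordinary equalization could only map the three input levels to three output levels. Ranking the pixels lets the output histogram match the target exactly, with two pixels in each of the four levels.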

26 Balanced Multifilter Banks for Multiple Description Coding

The parametrization for one kind of multifilter banks generating balanced multiwavelets is presented in this paper, in which

two lowpass filters are flipping filters, and two highpass filters have linear phase. Based on these parametric expressions,

some balanced multiwavelets and analysis-ready multiwavelets are constructed, which are symmetric or antisymmetric.

Moreover, on the basis of balanced multiwavelet transform, a new method of multiple description coding is given, and

experiments show that this method works well. Compared with the traditional multiple description coding method, this

method has low redundancy.

27 Balanced Multiwavelets With Interpolatory Property

Balanced multiwavelets with interpolatory property are discussed in this paper. This kind of multiwavelets can have a

sampling property like Shannon’s sampling theorem. It has been shown that the corresponding matrix-valued refinable

mask has special structure, and an orthogonal multifilter bank {H(z), G(z)} can be reduced to a scalar valued conjugate

quadrature filter (CQF) a(z). But it does not mean that any scalar CQF can form a “good” multifilter bank which can generate

a vector-valued refinable function with some degree of smoothness. In the context of balanced multiwavelets, we give the

definition of transferring balance order, which a scalar CQF a(z) satisfies, to guarantee that the multiwavelet generated is

balanced. On the basis of the parametrization of a scalar CQF with any length and conditions of transferring balance order,

a parametrization of multifilter banks that can generate interpolatory multiwavelets and interpolatory scaling functions is

obtained. Moreover, some balanced interpolatory multiwavelets have been constructed. Interpolatory analysis-ready

multiwavelets (armlets) are also discussed in this paper. It is known that the armlet conditions are easy to validate

compared with those of balanced multiwavelets. However, it is shown that if the corresponding scaling function is interpolatory, the

multiwavelet is balanced of a given order if and only if it is an armlet of the same order. Finally, the application of balanced multiwavelets

with interpolatory property in image processing is also discussed.

28 Blind Deconvolution Using Generalized Cross-Validation Approach to Regularization Parameter Estimation

In this paper, we propose and present an algorithm for total variation (TV)-based blind deconvolution. Both the unknown

image and blur can be estimated within an alternating minimization framework. With the generalized cross-validation (GCV)

method, the regularization parameters associated with the unknown image and blur can be updated in alternating

minimization steps. Experimental results confirm that the performance of the proposed algorithm is better than variational

Bayesian blind deconvolution algorithms with Student’s-t priors or a total variation prior.

29 Blind Spectral Unmixing Based on Sparse Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) is a widely used method for blind spectral unmixing (SU), which aims at obtaining

the endmembers and corresponding fractional abundances, knowing only the collected mixing spectral data. It is noted that

the abundance may be sparse (i.e., the endmembers may have sparse distributions) and sparse NMF tends to lead to a

unique result, so it is intuitive and meaningful to constrain NMF with sparseness for solving SU. However, due to the

abundance sum-to-one constraint in SU, the traditional sparseness measured by L0/L1-norm is not an effective constraint

any more. A novel measure (termed as S-measure) of sparseness using higher order norms of the signal vector is proposed

in this paper. It has a clear physical significance. By using the S-measure constraint (SMC), a gradient-based sparse NMF

algorithm (termed as NMF-SMC) is proposed for solving the SU problem, where the learning rate is adaptively selected, and

the endmembers and abundances are simultaneously estimated. In the proposed NMF-SMC, there is no pure index

assumption and no need to know the exact degree of sparseness of the abundance a priori. Moreover, it does not require the

preprocessing of dimension reduction in which some useful information may be lost. Experiments based on synthetic

mixtures and real-world images collected by AVIRIS and HYDICE sensors are performed to evaluate the validity of the

proposed method.
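
The remark that an L1-norm sparseness penalty stops being useful under the abundance sum-to-one constraint can be checked with a few lines of arithmetic. The sketch below compares a maximally sparse and a fully mixed abundance vector; the L2 norm is used only as a stand-in for "a higher order norm", since the abstract does not define the S-measure itself, and the helper name `lp_norm` is hypothetical.

```python
def lp_norm(vec, p):
    """Plain Lp norm of a list of numbers."""
    return sum(abs(v) ** p for v in vec) ** (1 / p)


# Two nonnegative abundance vectors that both sum to one.
sparse = [1.0, 0.0, 0.0, 0.0]      # a single endmember present
dense = [0.25, 0.25, 0.25, 0.25]   # all endmembers equally mixed

# The L1 norm equals the sum of the entries, so it is 1 for every feasible
# abundance vector and cannot tell the two cases apart.
print(lp_norm(sparse, 1), lp_norm(dense, 1))  # -> 1.0 1.0

# A higher-order norm (L2 here, as an assumption) does separate them.
print(lp_norm(sparse, 2), lp_norm(dense, 2))  # -> 1.0 0.5
```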

30 Boosting Color Feature Selection for Color Face Recognition

This paper introduces a new color face recognition (FR) method that makes effective use of boosting learning as a

color-component feature selection framework. The proposed boosting color-component feature selection framework is designed

for finding the best set of color-component features from various color spaces (or models), aiming to achieve the best FR

performance for a given FR task. In addition, to facilitate the complementary effect of the selected color-component

features for the purpose of color FR, they are combined using the proposed weighted feature fusion scheme. The

effectiveness of our color FR method has been successfully evaluated on the following five public face databases (DBs):

CMU-PIE, Color FERET, XM2VTSDB, SCface, and FRGC 2.0. Experimental results show that the proposed

method performs considerably better than other state-of-the-art color FR methods across different FR challenges,

including highly uncontrolled illumination, moderate pose variation, and low-resolution face images.

31 Characterization of Electrophotographic Print Artifacts: Banding, Jitter, and Ghosting

Electrophotographic (EP) print banding, jitter, and ghosting artifacts are common sources of print quality degradation.

Traditionally, the characterization of banding and jitter artifacts relies mainly on the assumption that the defect has either a

horizontal or vertical orientation which permits the simple 1-D analysis of the defect profile. However, this assumption can

easily be violated if a small amount of printer or scanner skew is introduced to the analyzed images. In some cases, the

defect can inherently be neither vertical nor horizontal. In this case, unless the defect orientation has been accurately

detected before analysis, the 1-D-based approaches could bias the estimation of the defect severity. In this paper, we

present an approach to characterize the jitter and banding artifacts of unrestricted orientation using wavelet filtering and 2-

D spectral analysis. We also present a new system for detecting and quantifying ghosting defects. It includes a design for a

printed test pattern to emphasize the ghosting defect and facilitate further processing and analysis. Wavelet filtering and a

template matching technique are used to detect the ghost location along and across the scanned test pattern. A new metric

is developed to quantify ghosting based upon its contrast, shape, and location consistency. Our experimental results show

that the proposed approaches provide objective measures that quantify EP defects with a rank ordering correlation

coefficient of 0.8 to 0.98, as compared to the subjective assessment of print quality experts.

32 Classification-Based Adaptive Filtering for Multiframe Blind Image Restoration

In this paper, the blind restoration of a scene is investigated, when multiple degraded (blurred and noisy) acquisitions are

available. An adaptive filtering technique is proposed, where the distorted images are filtered, classified and then fused

based upon the classification decisions. Finite normal-density mixture (FNM) models are used to model the filtered outputs

at each iteration. For simplicity, a fixed number of Gaussian components (classes) is initially considered for each degraded

frame and the selection of the optimal number of classes is performed according to the global relative entropy criterion.

However, there exist cases where dynamically varying FNM models should be considered, where the optimal number of

classes is selected according to the Akaike information criterion. The iterative application of classification and fusion,

followed by optimal adaptive filtering, converges to a global enhanced representation of the original scene in only a few

iterations. The proposed restoration method does not require knowledge of the point-spread-function support size or exact

alignment of the acquired frames. Simulation results on synthetic and real data, using both fixed and dynamically varying

FNM models, demonstrate its efficiency under both noisy and noise-free conditions.

33 Color Extended Visual Cryptography Using Error Diffusion

Color visual cryptography (VC) encrypts a color secret message into n color halftone image shares. Previous methods in

the literature show good results for black and white or gray scale VC schemes; however, they are not sufficient to be

applied directly to color shares due to different color structures. Some methods for color visual cryptography are not

satisfactory in terms of producing either meaningless shares or meaningful shares with low visual quality, leading to

suspicion of encryption. This paper introduces the concept of visual information pixel (VIP) synchronization and error

diffusion to attain a color visual cryptography encryption method that produces meaningful color shares with high visual

quality. VIP synchronization retains the positions of pixels carrying visual information of original images throughout the

color channels and error diffusion generates shares pleasant to human eyes. Comparisons with previous approaches show

the superior performance of the new method.
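
Error diffusion, which the method above relies on to produce visually pleasant shares, can be sketched in its textbook Floyd-Steinberg form. This is only the generic halftoning step on a single grayscale channel; VIP synchronization and the construction of the shares are not shown, and the function name `floyd_steinberg` is hypothetical.

```python
def floyd_steinberg(gray):
    """Halftone a 2-D list of intensities in [0, 1] to 0/1 values."""
    h, w = len(gray), len(gray[0])
    work = [row[:] for row in gray]  # copy so the input is left untouched
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0
            out[y][x] = new
            err = old - new
            # Push the quantization error onto the not-yet-processed neighbors.
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


# A flat 25% gray patch: roughly one pixel in four should turn on.
flat = [[0.25] * 8 for _ in range(8)]
halftone = floyd_steinberg(flat)
ones = sum(map(sum, halftone))
print(ones, "of 64 pixels are on")
```

Because each pixel's quantization error is passed on rather than discarded, the local average of the binary output tracks the input intensity, apart from error lost at the image borders.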

34 Combined Invariants to Similarity Transformation and to Blur Using Orthogonal Zernike Moments

The derivation of moment invariants has been extensively investigated in the past decades. In this paper, we construct a set

of invariants derived from Zernike moments which is simultaneously invariant to similarity transformation and to

convolution with circularly symmetric point spread function (PSF). Two main contributions are provided: the theoretical

framework for deriving the Zernike moments of a blurred image and the way to construct the combined geometric-blur

invariants. The performance of the proposed descriptors is evaluated with various PSFs and similarity transformations. The

comparison of the proposed method with the existing ones is also provided in terms of pattern recognition accuracy,

template matching and robustness to noise. Experimental results show that the proposed descriptors perform

better overall.

35 Comments on “Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering”

In order to resolve the problem that the denoising performance has a sharp drop when noise standard deviation reaches 40,

[1] proposed to replace the wavelet transform by the DCT. In this comment, we argue that this replacement is unnecessary,

and that the problem can be solved by adjusting some numerical parameters. We also present this parameter modification

approach here. Experimental results demonstrate that the proposed modification achieves better results in terms of both

peak signal-to-noise ratio and subjective visual quality than the original method for strong noise.

36 Compressibility-Aware Media Retargeting With Structure Preserving

A number of algorithms have been proposed for intelligent image/video retargeting with image content retained as much as

possible. However, they usually suffer from some artifacts in the results, such as ridge or structure twist. In this paper, we

present a structure-preserving media retargeting technique that preserves the content and image structure as best as

possible. Different from the previous pixel or grid based methods, we estimate the image content saliency from the

structure of the content. A block structure energy is introduced with a top-down strategy to constrain the image structure

inside to deform uniformly in either the horizontal or vertical direction. However, the flexibilities for retargeting are quite different for different

images. To cope with this problem, we propose a compressibility assessment scheme for media retargeting by combining

the entropies of image gradient magnitude and orientation distributions. Thus, the resized media is produced to preserve

the image content and structure as best as possible. Our experiments demonstrate that the proposed method provides

resized images/videos with better preservation of content and structure than previous methods.

37 Computational Perceptual Features for Texture Representation and Retrieval

A perception-based approach to content-based image representation and retrieval is proposed in this paper. We consider

textured images and propose to model their textural content by a set of features having a perceptual meaning and their

application to content-based image retrieval. We present a new method to estimate a set of perceptual textural features,

namely coarseness, directionality, contrast, and busyness. The proposed computational measures can be based upon two

representations: the original images representation and the autocorrelation function (associated with original images)

representation. The set of computational measures proposed is applied to content-based image retrieval on a large image

data set, the well-known Brodatz database. Experimental results and benchmarking show interesting performance of our

approach. First, the correspondence of the proposed computational measures to human judgments is shown using a

psychometric method based upon the Spearman rank-correlation coefficient. Second, the application of the proposed

computational measures in texture retrieval shows interesting results, especially when using results fusion returned by

each of the two representations. A comparison with related work is also given and shows excellent performance of our

approach compared to related approaches on both sides: correspondence of the proposed computational measures with

human judgments as well as the retrieval effectiveness.
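
The Spearman rank-correlation coefficient used to compare the computational measures with human judgments is the Pearson correlation of the two rank vectors. A minimal sketch follows; the scores and human ranking are made-up illustrative numbers, not data from the paper, and the helper names `ranks` and `spearman` are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: a computed coarseness score vs. a human ranking of five textures.
computed = [0.12, 0.35, 0.30, 0.80, 0.55]
human = [1, 2, 3, 5, 4]
rho = spearman(computed, human)
print(round(rho, 3))  # -> 0.9
```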

38 Constrained Acquisition of Ink Spreading Curves From Printed Color Images

39 Contactless and Pose Invariant Biometric Identification Using Hand Surface

This paper presents a novel approach for hand matching that achieves significantly improved performance even in the

presence of large hand pose variations. The proposed method utilizes a 3-D digitizer to simultaneously acquire intensity

and range images of the user’s hand presented to the system in an arbitrary pose. The approach involves determination of

the orientation of the hand in 3-D space followed by pose normalization of the acquired 3-D and 2-D hand images.

Multimodal (2-D as well as 3-D) palmprint and hand geometry features, which are simultaneously extracted from the user’s

pose normalized textured 3-D hand, are used for matching. Individual matching scores are then combined using a new

dynamic fusion strategy. Our experiments on a database of 114 subjects with significant pose variations yielded

encouraging results. Consistent (across various hand features considered) performance improvement achieved with the

pose correction demonstrates the usefulness of the proposed approach for hand based biometric systems with

unconstrained and contact-free imaging. The experimental results also suggest that the dynamic fusion approach

employed in this work helps to achieve performance improvement of 60% (in terms of EER) over the case when matching

scores are combined using the weighted sum rule.

40 Contextual Kernel and Spectral Methods for Learning the Semantics of Images

This paper presents contextual kernel and spectral methods for learning the semantics of images that allow us to

automatically annotate an image with keywords. First, to exploit the context of visual words within images for automatic

image annotation, we define a novel spatial string kernel to quantify the similarity between images. Specifically, we

represent each image as a 2-D sequence of visual words and measure the similarity between two 2-D sequences using the

shared occurrences of fixed-length 1-D subsequences by decomposing each 2-D sequence into two orthogonal 1-D sequences.

Based on our proposed spatial string kernel, we further formulate automatic image annotation as a contextual keyword

propagation problem, which can be solved very efficiently by linear programming. Unlike the traditional relevance models

that treat each keyword independently, the proposed contextual kernel method for keyword propagation takes into account

the semantic context of annotation keywords and propagates multiple keywords simultaneously. Significantly, this type of

semantic context can also be incorporated into spectral embedding for refining the annotations of images predicted by

keyword propagation. Experiments on three standard image datasets demonstrate that our contextual kernel and spectral

methods can achieve significantly better results than the state of the art.

41 Contextual Object Localization With Multiple Kernel Nearest Neighbor

Recently, many object localization models have shown that incorporating contextual cues can greatly improve accuracy

over using appearance features alone. Therefore, many of these models have explored different types of contextual

sources, but considering only one level of contextual interaction at a time. Thus, what context could truly contribute to

object localization, through integrating cues from all levels, simultaneously, remains an open question. Moreover, the

relative importance of the different contextual levels and appearance features across different object classes remains to be

explored. Here we introduce a novel framework for multiple class object localization that incorporates different levels of

contextual interactions. We study contextual interactions at the pixel, region and object level based upon three different

sources of context: semantic, boundary support, and contextual neighborhoods. Our framework learns a single similarity

metric from multiple kernels, combining pixel and region interactions with appearance features, and then applies a

conditional random field to incorporate object level interactions. To effectively integrate different types of feature

descriptions, we extend the large margin nearest neighbor to a novel algorithm that supports multiple kernels. We perform

experiments on three challenging image databases: Graz-02, MSRC and PASCAL VOC 2007. Experimental results show that

our model outperforms current state-of-the-art contextual frameworks and reveals individual contributions for each

contextual interaction level as well as appearance features, indicating their relative importance for object localization.

42 Convex Total Variation Denoising of Poisson Fluorescence Confocal Images With Anisotropic Filtering

Fluorescence confocal microscopy (FCM) is now one of the most important tools in biomedicine research. In fact, it makes

it possible to accurately study the dynamic processes occurring inside the cell and its nucleus by following the motion of

fluorescent molecules over time. Due to the small amount of acquired radiation and the huge optical and electronics

amplification, the FCM images are usually corrupted by a severe type of Poisson noise. This noise may be even more

damaging when very low intensity incident radiation is used to avoid phototoxicity. In this paper, a Bayesian algorithm is

proposed to remove the Poisson intensity dependent noise corrupting the FCM image sequences. The observations are

organized in a 3-D tensor where each plane is one of the images of a cell nucleus acquired over time using the

fluorescence loss in photobleaching (FLIP) technique. The method removes the noise by simultaneously considering

different spatial and temporal correlations. This is accomplished by using an anisotropic 3-D filter that may be separately

tuned in space and in time dimensions. Tests using synthetic and real data are described and presented to illustrate the

application of the algorithm. A comparison with several state-of-the-art algorithms is also presented.

43 Dealing With Parallax in Shape-From-Focus

We propose a new method that extends the capability of shape-from-focus (SFF) to estimate the depth profile of 3-D objects

in the presence of structure-dependent pixel motion. Existing SFF techniques work under the constraint that there is no

parallax in the captured stack of frames. However, in off-the-shelf cameras, there can be appreciable pixel motion among

the observations when there is relative motion between the object and the camera. In such a scenario, the depth estimates

will be erroneous if the parallax effect is not factored in. Our degradation model accounts for pixel migration effects in the

observations due to parallax resulting in a generalization of the SFF technique. We show that pixel motion and defocus blur

therein are tightly coupled to the underlying shape of the 3-D object. Simultaneous reconstruction of the underlying 3-D

structure and the all-in-focus image is carried out within an optimization framework using local image operations. The

proposed method, when tested on many examples, both synthetic and real, is very effective and delivers state-of-the-art

performance.

44 Dictionary Learning for Stereo Image Representation

One of the major challenges in multi-view imaging is the definition of a representation that reveals the intrinsic geometry of

the visual information. Sparse image representations with overcomplete geometric dictionaries offer a way to efficiently

approximate these images, such that the multi-view geometric structure becomes explicit in the representation. However,

the choice of a good dictionary in this case is far from obvious. We propose a new method for learning overcomplete

dictionaries that are adapted to the joint representation of stereo images. We first formulate a sparse stereo image model

where the multi-view correlation is described by local geometric transforms of dictionary elements (atoms) in two stereo

views. A maximum-likelihood (ML) method for learning stereo dictionaries is then proposed, where a multi-view geometry

constraint is included in the probabilistic model. The ML objective function is optimized using the expectation-maximization

algorithm. We apply the learning algorithm to the case of omnidirectional images, where we learn scales of atoms in a

parametric dictionary. The resulting dictionaries provide better performance in the joint representation of stereo

omnidirectional images as well as improved multi-view feature matching. We finally discuss and demonstrate the benefits

of dictionary learning for distributed scene representation and camera pose estimation.

45 Diffuse Prior Monotonic Likelihood Ratio Test for Evaluation of Fused Image Quality Measures

This paper introduces a novel method to score how well proposed fused image quality measures (FIQMs) indicate the

effectiveness of humans to detect targets in fused imagery. The human detection performance is measured via human

perception experiments. A good FIQM should relate to perception results in a monotonic fashion. The method computes a

new diffuse prior monotonic likelihood ratio (DPMLR) to facilitate the comparison of the H1 hypothesis that the intrinsic

human detection performance is related to the FIQM via a monotonic function against the null hypothesis that the detection

and image quality relationship is random. The paper discusses many interesting properties of the DPMLR and

demonstrates the effectiveness of the DPMLR test via Monte Carlo simulations. Finally, the DPMLR is used to score FIQMs

with test cases considering over 35 scenes and various image fusion algorithms.

46 Direct Intermode Selection for H.264 Video Coding Using Phase Correlation

The H.264 video coding standard exhibits higher performance compared to the other existing standards such as H.263,

MPEG-X. This improved performance is achieved mainly due to the multiple-mode motion estimation and compensation.

Recent research has tried to reduce the computational time using predictive motion estimation, early zero motion vector

detection, fast motion estimation, and fast mode decision. These approaches reduce the computational time

substantially, at the expense of degraded image quality and/or increased bitrates to a certain extent. In this paper, we use

phase correlation to capture the motion information between the current and reference blocks and then devise an algorithm

for direct motion estimation mode prediction, without excessive motion estimation. More computational time

is saved by the direct mode decision and by exploiting the motion vector information available from phase correlation. The

experimental results show that the proposed scheme outperforms the existing relevant fast algorithms, in terms of both

operating efficiency and video coding quality. To be more specific, 82~92% of encoding time is saved compared to the

exhaustive mode selection (against 58~74% in the relevant state-of-the-art), and this is achieved without jeopardizing

image quality (in fact, there is some improvement over the exhaustive mode selection at mid to high bit rates) and for a

wide range of videos and bitrates (another advantage over the relevant state-of-the-art).
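
The core idea of phase correlation, reading the displacement between two signals from the phase of their cross-power spectrum, can be shown in one dimension. The sketch below assumes a circular shift and uses a naive O(n^2) DFT to stay dependency-free; the paper applies the idea to 2-D blocks and then uses the result for mode decision, neither of which is reproduced here. The names `dft` and `phase_correlation_shift` are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def phase_correlation_shift(reference, current):
    """Estimate the circular shift s such that current[t] == reference[t - s]."""
    f_ref, f_cur = dft(reference), dft(current)
    cross = []
    for r, c in zip(f_ref, f_cur):
        p = c * r.conjugate()
        mag = abs(p)
        # Keep only the phase; drop bins that carry (almost) no energy.
        cross.append(p / mag if mag > 1e-12 else 0j)
    # The inverse transform of the pure-phase spectrum peaks at the shift.
    surface = [v.real for v in dft(cross, inverse=True)]
    return max(range(len(surface)), key=lambda i: surface[i])


reference = [1, 4, 2, 7, 3, 0, 5, 6]
# Shift the reference circularly by three samples.
current = [reference[(t - 3) % len(reference)] for t in range(len(reference))]
print(phase_correlation_shift(reference, current))  # -> 3
```

A single sharp peak indicates one dominant motion, which is the kind of cue that can be used to pick a block mode without an exhaustive search.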

47 Discretization Error Analysis and Adaptive Meshing Algorithms for Fluorescence Diffuse Optical Tomography in the Presence of Measurement Noise

Quantitatively accurate fluorescence diffuse optical tomographic (FDOT) image reconstruction is a computationally

demanding problem that requires repeated numerical solutions of two coupled partial differential equations and an

associated inverse problem. Recently, adaptive finite element methods have been explored to reduce the computation

requirements of the FDOT image reconstruction. However, existing approaches ignore the ubiquitous presence of noise in

boundary measurements. In this paper, we analyze the effect of finite element discretization on the FDOT forward and

inverse problems in the presence of measurement noise and develop novel adaptive meshing algorithms for FDOT that take

into account noise statistics. We formulate the FDOT inverse problem as an optimization problem in the maximum a

posteriori framework to estimate the fluorophore concentration in a bounded domain. We use the mean-square-error (MSE)

between the exact solution and the discretized solution as a figure of merit to evaluate the image reconstruction accuracy,

and derive an upper bound on the MSE which depends upon the forward and inverse problem discretization parameters,

noise statistics, a priori information of fluorophore concentration, source and detector geometry, as well as background

optical properties. Next, we use this error bound to develop adaptive meshing algorithms for the FDOT forward and inverse

problems to reduce the MSE due to discretization in the reconstructed images. Finally, we present a set of numerical

simulations to illustrate the practical advantages of our adaptive meshing algorithms for FDOT image reconstruction.

48 Distributed Multiple Description Video Coding on Packet Loss Channels

In this paper, we aim to solve the drift problem of multiple description video coding on packet loss channels by using

state-of-the-art distributed techniques. We first present an asymptotically optimal code design of multiple descriptions in the

Wyner–Ziv (MDWZ) setting. Then we propose a distributed multiple description video coding (DMDVC) scheme, which

performs MDWZ coding on each nonintra coded frame. Instead of the prediction loops used in traditional multiple

description video coding, Slepian–Wolf based coding is used to exploit interframe correlations. A bitplane extraction

scheme is proposed to improve the balance between two descriptions, so that side information can be interchanged

between the side decoders of DMDVC with negligible quality degradation, which is crucial to robust transmission over

packet loss channels. Experimental results demonstrate the robustness of our scheme, especially at high packet loss rates.

49 Efficiently Learning a Detection Cascade With Sparse Eigenvectors

Real-time object detection has many computer vision applications. Since Viola and Jones [1] proposed the first real-time

AdaBoost based face detection system, much effort has been spent on improving the boosting method. In this work, we

first show that feature selection methods other than boosting can also be used for training an efficient object detector. In

particular, we introduce greedy sparse linear discriminant analysis (GSLDA) [2] for its conceptual simplicity and

computational efficiency; and slightly better detection performance is achieved compared with [1]. Moreover, we propose a

new technique, termed boosted greedy sparse linear discriminant analysis (BGSLDA), to efficiently train a detection

cascade. BGSLDA exploits the sample reweighting property of boosting and the class-separability criterion of GSLDA.

Experiments in the domain of highly skewed data distributions (e.g., face detection) demonstrate that classifiers trained

with the proposed BGSLDA outperform AdaBoost and its variants. This finding provides a significant opportunity to argue

that AdaBoost and similar approaches are not the only methods that can achieve high detection results for real-time object

detection.

50 Elastic Sequence Correlation for Human Action Analysis

This paper addresses the problem of automatically analyzing and understanding human actions from video footage. An

“action correlation” framework, elastic sequence correlation (ESC), is proposed to identify action subsequences from a

database of (possibly long) video sequences that are similar to a given query video action clip. In particular, we show that

two well-known algorithms, namely approximate pattern matching in computer and information sciences and dynamic time

warping (DTW) method in signal processing, are special cases of our ESC framework. The proposed framework is applied

to two important real-world applications: action pattern retrieval, as well as action segmentation and recognition, where, on

average, its run time speed (in MATLAB) is about 3.3 frames per second. In addition, compared with the state-of-the-art

algorithms on a number of challenging data sets, our approach is demonstrated to perform competitively.
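
As a point of reference for the DTW special case mentioned in this abstract, here is a minimal sketch of classic dynamic time warping between two 1-D sequences. It is not the ESC framework itself; the function name `dtw_distance` and the absolute-difference local cost are illustrative assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (illustrative, not ESC)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i, j] = d + min(cost[i - 1, j],      # repeat b[j-1]
                                 cost[i, j - 1],      # repeat a[i-1]
                                 cost[i - 1, j - 1])  # advance both
    return cost[n, m]
```

Because the warping path may repeat samples, a sequence and a time-stretched copy of it can align at zero cost, which is the "elastic" behavior that sequence-matching methods of this kind rely on.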

51 Enhanced Shift and Scale Tolerance for Rotation Invariant Polar Matching With Dual-Tree Wavelets

Polar matching is a recently developed shift and rotation invariant object detection method that is based upon dual-tree

complex wavelet transforms or equivalent multiscale directional filterbanks. It can be used to facilitate keypoint

matching, neighborhood search detection, or detection and tracking with particle filters. The theory is extended here to

incorporate an allowance for local spatial and dilation perturbations. With experiments, we demonstrate that the robustness

of the polar matching method is strengthened at modest computational cost.

52 Face Recognition System Using Multiple Face Model of Hybrid Fourier Feature Under Uncontrolled Illumination Variation

The authors present a robust face recognition system for large-scale data sets taken under uncontrolled illumination

variations. The proposed face recognition system consists of a novel illumination-insensitive preprocessing method, a

hybrid Fourier-based facial feature extraction, and a score fusion scheme. First, in the preprocessing stage, a face image is

transformed into an illumination-insensitive image, called an “integral normalized gradient image,” by normalizing and

integrating the smoothed gradients of a facial image. Then, for feature extraction of complementary classifiers, multiple

face models based upon hybrid Fourier features are applied. The hybrid Fourier features are extracted from different

Fourier domains in different frequency bandwidths, and then each feature is individually classified by linear discriminant

analysis. In addition, multiple face models are generated by plural normalized face images that have different eye distances.

Finally, to combine scores from multiple complementary classifiers, a log likelihood ratio-based score fusion scheme is

applied. The proposed system is evaluated using the Face Recognition Grand Challenge (FRGC) experimental protocols;

FRGC is a large available data set. Experimental results on the FRGC version 2.0 data sets show that the proposed

method achieves an average verification rate of 81.49% on 2-D face images under various environmental variations such as

illumination changes, expression changes, and time elapses.

53 Face Recognition by Exploring Information Jointly in Space, Scale and Orientation

Information jointly contained in image space, scale and orientation domains can provide rich and important clues not seen in

any one of these domains alone. The position, spatial frequency and orientation selectivity properties are believed to

have an important role in visual perception. This paper proposes a novel face representation and recognition approach by

exploring information jointly in image space, scale and orientation domains. Specifically, the face image is first

decomposed into different scale and orientation responses by convolving multiscale and multiorientation Gabor filters.

Second, local binary pattern analysis is used to describe the neighboring relationship not only in image space, but also in

different scale and orientation responses. This way, information from different domains is explored to give a good face

representation for recognition. Discriminant classification is then performed based upon weighted histogram intersection

or conditional mutual information with linear discriminant analysis techniques. Extensive experimental results on FERET,

AR, and FRGC ver 2.0 databases show the significant advantages of the proposed method over the existing ones.
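
To make the local binary pattern and histogram intersection steps concrete, here is a minimal sketch of the basic 8-neighbor LBP operator applied to a single 2-D response map, plus an unweighted histogram intersection. The function names, the neighbor ordering and the `>=` comparison are assumptions for illustration; the Gabor filtering, the joint space-scale-orientation neighborhoods and the weighting described in the abstract are not reproduced.

```python
import numpy as np

def lbp_8(image):
    """Basic 8-neighbor LBP codes for the interior pixels of a 2-D array."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbor offsets, clockwise from the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbor is at least as large as the center.
        codes += (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def histogram_intersection(h1, h2):
    """Similarity of two histograms: sum of bin-wise minima."""
    return float(np.minimum(h1, h2).sum())
```

A 256-bin descriptor for one response map can then be formed with `np.bincount(lbp_8(response).ravel(), minlength=256)` and compared against another descriptor with `histogram_intersection`.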

54 Fast Model-Based X-Ray CT Reconstruction Using Spatially Nonhomogeneous ICD Optimization

Recent applications of model-based iterative reconstruction (MBIR) algorithms to multislice helical CT reconstructions have

shown that MBIR can greatly improve image quality by increasing resolution as well as reducing noise and some artifacts.

However, high computational cost and long reconstruction times remain as a barrier to the use of MBIR in practical

applications. Among the various iterative methods that have been studied for MBIR, iterative coordinate descent (ICD) has

been found to have relatively low overall computational requirements due to its fast convergence. This paper presents a

fast model-based iterative reconstruction algorithm using spatially nonhomogeneous ICD (NH-ICD) optimization. The NH-ICD

algorithm speeds up convergence by focusing computation where it is most needed. The NH-ICD algorithm has a

mechanism that adaptively selects voxels for update. First, a voxel selection criterion (VSC) determines the voxels in

greatest need of update. Then a voxel selection algorithm (VSA) selects the order of successive voxel updates based upon

the need for repeated updates of some locations, while retaining characteristics for global convergence. In order to speed

up each voxel update, we also propose a fast 1-D optimization algorithm that uses a quadratic substitute function to upper

bound the local 1-D objective function, so that a closed form solution can be obtained rather than using a computationally

expensive line search algorithm. We examine the performance of the proposed algorithm using several clinical data sets of

various anatomy. The experimental results show that the proposed method accelerates the reconstructions by roughly a

factor of three on average for typical 3-D multislice geometries.
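
The idea of updating one voxel at a time with a closed-form 1-D step can be sketched on a plain least-squares problem. This is only a homogeneous coordinate descent on a quadratic cost, where the 1-D minimizer is exact. The voxel selection criterion, the voxel selection algorithm and the substitute function for non-quadratic priors are not reproduced, and the name `icd_least_squares` is hypothetical.

```python
import numpy as np

def icd_least_squares(A, y, sweeps=50):
    """Coordinate descent for min_x 0.5 * ||y - A x||^2 (illustrative sketch)."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    x = np.zeros(A.shape[1])
    residual = y - A @ x
    col_norms = (A ** 2).sum(axis=0)
    for _ in range(sweeps):
        for j in range(A.shape[1]):      # fixed raster order (homogeneous ICD)
            if col_norms[j] == 0.0:
                continue
            # Exact minimizer of the 1-D quadratic in coordinate j.
            step = A[:, j] @ residual / col_norms[j]
            x[j] += step
            residual -= step * A[:, j]   # keep the residual consistent
    return x
```

Keeping the residual up to date makes each coordinate update cost one column of `A`. A nonhomogeneous variant would replace the fixed raster order with a data-driven choice of which coordinates to revisit.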

55 FAST Rate Allocation Through Steepest Descent for JPEG2000 Video Transmission

This work addresses the transmission of pre-encoded JPEG2000 video within a video-on-demand scenario. The primary

requirement for the rate allocation algorithm deployed in the server is to match the real-time processing demands of the

application. Scalability in terms of complexity must be provided to supply a valid solution by a given instant of time. The

FAst rate allocation through STeepest descent (FAST) method introduced in this work selects an initial (and possibly poor)

solution, and iteratively improves it until time is exhausted or the algorithm finishes execution. Experimental results

suggest that FAST commonly achieves solutions close to the global optimum while employing very few computational

resources.

56 Fast Sparse Image Reconstruction Using Adaptive Nonlinear Filtering

Compressed sensing is a new paradigm for signal recovery and sampling. It states that a relatively small number of linear

measurements of a sparse signal can contain most of its salient information and that the signal can be exactly

reconstructed from these highly incomplete observations. The major challenge in practical applications of compressed

sensing consists in providing efficient, stable and fast recovery algorithms which, in a few seconds, evaluate a good

approximation of a compressible image from highly incomplete and noisy samples. In this paper, we propose to approach

the compressed sensing image recovery problem using adaptive nonlinear filtering strategies in an iterative framework, and

we prove the convergence of the resulting two-step iterative scheme. The results of several numerical experiments

confirm that the corresponding algorithm possesses the required properties of efficiency, stability and low computational

cost and that its performance is competitive with that of state-of-the-art algorithms.
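
For readers unfamiliar with two-step iterative recovery, the general pattern (a data-fidelity step followed by a nonlinear filtering step) can be sketched with the standard iterative soft-thresholding algorithm (ISTA). This is a swapped-in textbook method, not the adaptive nonlinear filtering scheme of the paper; the names `soft_threshold` and `ista` and the parameters `lam` and `iters` are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (the nonlinear step)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.01, iters=500):
    """ISTA for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1 (illustrative)."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - y)                          # step 1: data fidelity
        x = soft_threshold(x - step * grad, step * lam)   # step 2: shrinkage
    return x
```

With an identity measurement matrix the result reduces to a single soft-thresholding of `y`, which is a convenient sanity check.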

57 Fine-Granularity and Spatially-Adaptive Regularization for Projection-Based Image Deblurring

This paper studies two classes of regularization strategies to achieve an improved tradeoff between image recovery and

noise suppression in projection-based image deblurring. The first is based on the simple fact that a fixed number of Landweber iterations

leads to a fixed level of regularization, which allows us to achieve fine-granularity control of projection-based iterative

deblurring by varying the number of iterations. The regularization behavior is explained by using the theory of Lagrangian multipliers for

variational schemes. The second class of regularization strategy is based on the observation that various regularized filters

can be viewed as nonexpansive mappings in the metric space. A deeper understanding about different regularization filters

can be gained by probing into their asymptotic behavior—the fixed point of nonexpansive mappings. By making an analogy

to the states of matter in statistical physics, we can observe that different image structures (smooth regions, regular edges

and textures) correspond to different fixed points of nonexpansive mappings when the temperature (regularization)

parameter varies. Such an analogy motivates us to propose a deterministic annealing based approach toward spatial

adaptation in projection-based image deblurring. Significant performance improvements over the current state-of-the-art

schemes have been observed in our experiments, which substantiates the effectiveness of the proposed regularization

strategies.
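
The link between the number of Landweber iterations and the amount of regularization can be illustrated with a minimal sketch of the plain Landweber iteration on a small linear system. The function name `landweber`, the zero initialization and the default step size are assumptions; the spatially adaptive and annealing-based strategies of the paper are not reproduced.

```python
import numpy as np

def landweber(H, y, iterations, step=None):
    """Plain Landweber iteration x <- x + step * H^T (y - H x) (illustrative)."""
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    if step is None:
        # Any step below 2 / ||H||^2 converges; 1 / ||H||^2 is a safe choice.
        step = 1.0 / np.linalg.norm(H, 2) ** 2
    x = np.zeros(H.shape[1])
    for _ in range(iterations):
        x = x + step * (H.T @ (y - H @ x))
    return x
```

For a diagonal `H` with singular values 1 and 0.1, one iteration recovers the strong component fully but only 1% of the weak one. More iterations recover more of the weak (noise-sensitive) component, so stopping early acts as a stronger regularizer.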

58 Fractal Dimension of Color Fractal Images

Fractal dimension is a very useful metric for the analysis of the images with self-similar content, such as textures. For its

computation there exist several approaches, the probabilistic algorithm being accepted as the most elegant approach.

However, all the existing methods are defined for 1-D signals or binary images, with extension to grayscale images. Our

purpose is to propose a color version of the probabilistic algorithm for the computation of the fractal dimension. To validate

this new approach, we also propose an extension of the existing algorithm for the generation of probabilistic fractals, in

order to obtain color fractal images. Then we show the results of our experiments and conclude this paper.
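
For context, the classic box-counting estimate of fractal dimension for a binary image can be sketched as follows. This is the simpler binary-image technique, not the probabilistic algorithm or its color extension proposed in the paper. The function name, the default box sizes and the requirement that the mask be non-empty are assumptions.

```python
import numpy as np

def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8)):
    """Box-counting dimension of a non-empty binary mask (illustrative)."""
    mask = np.asarray(mask, dtype=bool)
    counts = []
    for s in box_sizes:
        # Crop so the image tiles exactly into s-by-s boxes.
        h = (mask.shape[0] // s) * s
        w = (mask.shape[1] // s) * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        # Count boxes that contain at least one foreground pixel.
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
    # The slope of log N(s) against log(1/s) estimates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope
```

A filled square yields a slope of 2 and a single straight line a slope of 1; textures with self-similar structure fall in between.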

59 From Local Pixel Structure to Global Image Super-Resolution: A New Face Hallucination Framework

We have developed a new face hallucination framework termed from local pixel structure to global image super-resolution

(LPS-GIS). Based on the assumption that two similar face images should have similar local pixel structures, the new

framework first uses the input low-resolution (LR) face image to search a face database for similar example high-resolution

(HR) faces in order to learn the local pixel structures for the target HR face. It then uses the input LR face and the learned

pixel structures as priors to estimate the target HR face. We present a three-step implementation procedure for the

framework. Step 1 searches the database for K example faces that are the most similar to the input, and then warps the K

example images to the input using optical flow. Step 2 uses the warped HR version of the K example faces to learn the

local pixel structures for the target HR face. An effective method for learning local pixel structures from an individual face,

and an adaptive procedure for fusing the local pixel structures of different example faces to reduce the influence of warping

errors, have been developed. Step 3 estimates the target HR face by solving a constrained optimization problem by means of

an iterative procedure. Experimental results show that our new method can provide good performance for face

hallucination, both in terms of reconstruction error and visual quality; and that it is competitive with existing state-of-the-art

methods.

60 From Point to Local Neighborhood: Polyp Detection in CT Colonography Using Geodesic Ring Neighborhoods

Existing polyp detection methods rely heavily on curvature-based characteristics to differentiate between lesions. These

assume that the discrete triangulated surface mesh or volume closely approximates a smooth continuous surface.

However, this is often not the case and because curvature is computed as a local feature and a second-order differential

quantity, the presence of noise significantly affects its estimation. For this reason, a more global feature is required to

provide an accurate description of the surface at hand. In this paper, a novel method incorporating a local neighborhood

around the centroid of a surface patch is proposed. This is done using geodesic rings which accumulate curvature

information in a neighborhood around this centroid. This geodesic-ring neighborhood approximates a single smooth,

continuous surface upon which curvature and orientation estimation methods can be applied. A new global shape index, S,

is also introduced and computed. These curvature and orientation values are used to classify the surface as either a

bulbous polyp, ridge-like fold or semiplanar structure. Experimental results show that this method is promising (100%

sensitivity, 100% specificity for lesions > 10 mm) for distinguishing between bulbous polyps, folds and planar-like

structures in the colon.

61 From Tiger to Panda: Animal Head Detection

Robust object detection has many important applications in real-world online photo processing. For example, both Google

image search and MSN live image search have integrated human face detectors to retrieve face or portrait photos. Inspired

by the success of such face filtering approaches, in this paper, we focus on another popular online photo category, animal,

which is one of the top five categories in the MSN live image search query log. As a first attempt, we focus on the problem

of animal head detection of a set of relatively large land animals that are popular on the internet, such as cat, tiger, panda,

fox, and cheetah. First, we propose a new set of gradient-oriented features, Haar of Oriented Gradients (HOOG), to

effectively capture the shape and texture features of animal heads. Then, we propose two detection algorithms, namely

Bruteforce detection and Deformable detection, to effectively exploit the shape feature and texture feature simultaneously.

Experimental results on 14 379 well-labeled animal images validate the superiority of the proposed approach. Additionally,

we apply the animal head detector to improve the image search result through text based online photo search result

filtering.

62 Fuzzy Random Impulse Noise Removal From Color Image Sequences

In this paper, a new fuzzy filter for the removal of random impulse noise in color video is presented. By working with

different successive filtering steps, a very good tradeoff between detail preservation and noise removal is obtained. One

strong filtering step that should remove all noise at once would inevitably also remove a considerable amount of detail.

Therefore, the noise is filtered step by step. In each step, noisy pixels are detected with the help of fuzzy rules, which are very

useful for the processing of human knowledge where linguistic variables are used. Pixels that are detected as noisy are

filtered; the others remain unchanged. Filtering of detected pixels is done by block matching based on a noise-adaptive

mean absolute difference. The experiments show that the proposed method outperforms other state-of-the-art filters both

visually and in terms of objective quality measures such as the mean absolute error (MAE), the peak-signal-to-noise ratio

(PSNR) and the normalized color difference (NCD).
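
Two of the objective quality measures named in this abstract have simple standard definitions, sketched below for arrays of equal shape. The function names and the default peak value of 255 (8-bit data) are assumptions. NCD is omitted because it requires a conversion to a perceptual color space.

```python
import numpy as np

def mae(reference, test):
    """Mean absolute error between two images of the same shape."""
    r = np.asarray(reference, dtype=float)
    t = np.asarray(test, dtype=float)
    return float(np.mean(np.abs(r - t)))

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    r = np.asarray(reference, dtype=float)
    t = np.asarray(test, dtype=float)
    mse = np.mean((r - t) ** 2)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))
```

Lower MAE and higher PSNR both indicate a filtered frame that is closer to the noise-free reference.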

63 Geodesic Active Fields—A Geometric Framework for Image Registration

In this paper we present a novel geometric framework called geodesic active fields for general image registration. In image

registration, one looks for the underlying deformation field that best maps one image onto another. This is a classic ill-

posed inverse problem, which is usually solved by adding a regularization term. Here, we propose a multiplicative coupling

between the registration term and the regularization term, which turns out to be equivalent to embedding the deformation field

in a weighted minimal surface problem. Then, the deformation field is driven by a minimization flow toward a harmonic map

corresponding to the solution of the registration problem. This proposed approach for registration shares close similarities

with the well-known geodesic active contours model in image segmentation, where the segmentation term (the edge

detector function) is coupled with the regularization term (the length functional) via multiplication as well. As a matter of

fact, our proposed geometric model is actually the exact mathematical generalization to vector fields of the weighted length

problem for curves and surfaces introduced by Caselles-Kimmel-Sapiro [1]. The energy of the deformation field is

measured with the Polyakov energy weighted by a suitable image distance, borrowed from standard registration models.

We investigate three different weighting functions: the squared error and the approximated absolute error for monomodal

images, and the local joint entropy for multimodal images. As compared to specialized state-of-the-art methods tailored for

specific applications, our geometric framework involves important contributions. First, our general formulation for

registration works on any parametrizable, smooth and differentiable surface, including nonflat and multiscale images. In the

latter case, multiscale images are registered at all scales simultaneously, and the relations between space and scale are

intrinsically accounted for. Second, this method is, to the best of our knowledge, the first reparametrization invariant

registration method introduced in the literature. Third, the multiplicative coupling between the registration term, i.e. local

image discrepancy, and the regularization term naturally results in a data-dependent tuning of the regularization strength.

Finally, by choosing the metric on the deformation field one can freely interpolate between classic Gaussian and more

interesting anisotropic, TV-like regularization.

64 Geometric Calibration of Lens and Filter Distortions for Multispectral Filter-Wheel Cameras

High-fidelity color image acquisition with a multispectral camera utilizes optical filters to separate the visible

electromagnetic spectrum into several passbands. This is often realized with a computer-controlled filter wheel, where each

position is equipped with an optical bandpass filter. For each filter wheel position, a grayscale image is acquired and the

passbands are finally combined to a multispectral image. However, the different optical properties and non-coplanar

alignment of the filters cause image aberrations since the optical path is slightly different for each filter wheel position. As

in a normal camera system, the lens causes additional wavelength-dependent image distortions called chromatic

aberrations. When transforming the multispectral image with these aberrations into an RGB image, color fringes appear,

and the image exhibits a pincushion or barrel distortion. In this paper, we address both the distortions caused by the lens

and by the filters. Based on a physical model of the bandpass filters, we show that the aberrations caused by the filters can

be modeled by displaced image planes. The lens distortions are modeled by an extended pinhole camera model, which

results in a remaining mean calibration error of only 0.07 pixels. Using an absolute calibration target, we then geometrically

calibrate each passband and compensate for both lens and filter distortions simultaneously. We show that both types of

aberrations can be compensated and present detailed results on the remaining calibration errors.
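
As a rough illustration of what a pinhole model extended with lens distortion looks like, here is a sketch that projects 3-D points with a single radial distortion coefficient. The paper's actual extended model and its filter-dependent displaced image planes are not specified in this abstract, so the function name, the single coefficient `k1` and the parameterization are all assumptions.

```python
import numpy as np

def project_with_radial_distortion(points_3d, focal, center, k1):
    """Pinhole projection with one radial distortion term (illustrative)."""
    P = np.asarray(points_3d, dtype=float)
    # Normalized image coordinates of the ideal pinhole camera.
    x = P[:, 0] / P[:, 2]
    y = P[:, 1] / P[:, 2]
    r2 = x ** 2 + y ** 2
    scale = 1.0 + k1 * r2            # radial scaling grows with distance
    u = focal * x * scale + center[0]
    v = focal * y * scale + center[1]
    return np.stack([u, v], axis=1)
```

Calibrating each passband separately would amount to fitting one such parameter set per filter wheel position, so that wavelength-dependent differences can be compensated before the passbands are combined.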

65 Geometrically Induced Force Interaction for Three-Dimensional Deformable Models

In this paper, we propose a novel 3-D deformable model that is based upon a geometrically induced external force field

which can be conveniently generalized to arbitrary dimensions. This external force field is based upon hypothesized

interactions between the relative geometries of the deformable model and the object boundary characterized by image

gradient. The evolution of the deformable model is solved using the level set method so that topological changes are

handled automatically. The relative geometrical configurations between the deformable model and the object boundaries

contribute to a dynamic vector force field that changes accordingly as the deformable model evolves. The geometrically

induced dynamic interaction force has been shown to greatly improve the deformable model performance in acquiring

complex geometries and highly concave boundaries, and it gives the deformable model high invariance to initialization

configurations. The voxel interactions across the whole image domain provide a global view of the object boundary

representation, giving the external force a long attraction range. The bidirectionality of the external force field allows the

new deformable model to deal with arbitrary cross-boundary initializations, and facilitates the handling of weak edges and

broken boundaries. In addition, we show that by enhancing the geometrical interaction field with a nonlocal edge-

preserving algorithm, the new deformable model can effectively overcome image noise. We provide a comparative study on

the segmentation of various geometries with different topologies from both synthetic and real images, and show that the

proposed method achieves significant improvements against existing image gradient techniques.

66 Goal-Oriented Rectification of Camera-Based Document Images

Document digitization with either flatbed scanners or camera-based systems results in document images which often suffer

from warping and perspective distortions that deteriorate the performance of current OCR approaches. In this paper, we

present a goal-oriented rectification methodology to compensate for undesirable document image distortions aiming to

improve the OCR result. Our approach relies upon a coarse-to-fine strategy. First, a coarse rectification is accomplished

with the aid of a computationally low cost transformation which addresses the projection of a curved surface to a 2-D

rectangular area. The projection of the curved surface on the plane is guided only by the textual content’s appearance in

the document image while incorporating a transformation which does not depend on specific model primitives or camera

setup parameters. Second, pose normalization is applied on the word level aiming to restore all the local distortions of the

document image. Experimental results on various document images with a variety of distortions demonstrate the

robustness and effectiveness of the proposed rectification methodology using a consistent evaluation methodology that

takes into account OCR accuracy and a newly introduced measure using a semi-automatic procedure.

67 Gradient Profile Prior and Its Applications in Image Super-Resolution and Enhancement

In this paper, we propose a novel generic image prior—gradient profile prior, which implies the prior knowledge of natural

image gradients. In this prior, the image gradients are represented by gradient profiles, which are 1-D profiles of gradient

magnitudes perpendicular to image structures. We model the gradient profiles by a parametric gradient profile model.

Using this model, the prior knowledge of the gradient profiles is learned from a large collection of natural images; this learned knowledge

is called the gradient profile prior. Based on this prior, we propose a gradient field transformation to constrain the gradient

fields of the high resolution image and the enhanced image when performing single image super-resolution and sharpness

enhancement. With this simple but very effective approach, we are able to produce state-of-the-art results. The

reconstructed high resolution images or the enhanced images are sharp and rarely exhibit ringing or jaggy artifacts.

68 Graph Cuts for Curvature Based Image Denoising

Minimization of total variation (TV) is a well-known method for image denoising. Recently, the relationship between TV

minimization problems and binary MRF models has been much explored. This has resulted in some very efficient

combinatorial optimization algorithms for the TV minimization problem in the discrete setting via graph cuts. To overcome

limitations, such as staircasing effects, of the relatively simple TV model, variational models based upon higher order

derivatives have been proposed. The Euler’s elastica model is one such higher order model of central importance, which

minimizes the curvature of all level lines in the image. Traditional numerical methods for minimizing the energy in such

higher order models are complicated and computationally complex. In this paper, we will present an efficient minimization

algorithm based upon graph cuts for minimizing the energy in the Euler’s elastica model, by simplifying the problem to that

of solving a sequence of easy graph-representable problems. This sequence has connections to the gradient flow of the

energy function, and converges to a minimum point. The numerical experiments show that our new approach is more

effective in maintaining smooth visual results while preserving sharp features better than TV models.
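
The quantity that TV-based denoising penalizes can be written down in a few lines. The sketch below computes the anisotropic discrete total variation of an image; it shows only the baseline TV term, not the Euler's elastica energy or the graph-cut minimization of the paper. The function name is an assumption.

```python
import numpy as np

def total_variation(image):
    """Anisotropic discrete TV: sum of absolute neighbor differences."""
    img = np.asarray(image, dtype=float)
    vertical = np.abs(np.diff(img, axis=0)).sum()
    horizontal = np.abs(np.diff(img, axis=1)).sum()
    return float(vertical + horizontal)
```

A sharp step edge and a smooth ramp of the same height have the same TV, which is why TV models preserve edges but tend to turn ramps into staircases; curvature-based models add a higher order term to counter that effect.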

69 Graph Regularized Sparse Coding for Image Representation

Sparse coding has received an increasing amount of interest in recent years. It is an unsupervised learning algorithm,

which finds a basis set capturing high-level semantics in the data and learns sparse coordinates in terms of the basis set.

Originally applied to modeling the human visual cortex, sparse coding has been shown useful for many applications.

However, most of the existing approaches to sparse coding fail to consider the geometrical structure of the data space. In

many real applications, the data is more likely to reside on a low-dimensional submanifold embedded in the high-

dimensional ambient space. It has been shown that the geometrical information of the data is important for discrimination.

In this paper, we propose a graph based algorithm, called graph regularized sparse coding, to learn the sparse

representations that explicitly take into account the local manifold structure of the data. By using graph Laplacian as a

smooth operator, the obtained sparse representations vary smoothly along the geodesics of the data manifold. The

extensive experimental results on image classification and clustering have demonstrated the effectiveness of our proposed

algorithm.
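
The smoothness term that a graph Laplacian contributes can be sketched directly. Given a symmetric affinity matrix `W` over the samples and a code matrix `S` whose columns are the sparse codes, the trace of S L S^T equals half the weighted sum of squared differences between codes of connected samples. The function names and the column-per-sample layout are assumptions, and the dictionary learning and sparse coding optimization are not shown.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric affinity matrix."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def laplacian_penalty(S, W):
    """Tr(S L S^T); column i of S is the code of sample i (assumed layout)."""
    S = np.asarray(S, dtype=float)
    L = graph_laplacian(W)
    return float(np.trace(S @ L @ S.T))
```

The penalty is zero when neighboring samples share identical codes and grows as the codes of connected samples diverge, which is what makes the learned representation vary smoothly over the data graph.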

70 HAIRIS: A Method for Automatic Image Registration Through Histogram-Based Image Segmentation

Automatic image registration is still an open challenge in several fields. Although several methods for automatic image

registration have been proposed in the last few years, it is still far from broad use in several applications, such as in

remote sensing. In this paper, a method for automatic image registration through histogram-based image segmentation

(HAIRIS) is proposed. This new approach mainly consists in combining several segmentations of the pair of images to be

registered, according to a relaxation parameter on the histogram modes delineation (which itself is a new approach),

followed by a consistent characterization of the extracted objects (through the object area, the ratio between the axes of the

fitted ellipse, perimeter and fractal dimension) and a robust statistics-based procedure for object matching. The

proposed methodology is illustrated on simulated rotation and translation. The first dataset consists of a

photograph and a rotated and shifted version of the same photograph, with different levels of added noise. It was also

applied to a pair of satellite images with different spectral content and simulated translation, and to real remote sensing

examples comprising different viewing angles, different acquisition dates and different sensors. An accuracy below 1° for

rotation and at the subpixel level for translation was obtained in most of the considered situations. HAIRIS allows

for the registration of pairs of images (multitemporal and multisensor) with differences in rotation and translation, with

small differences in the spectral content, leading to a subpixel accuracy.

71 High Capacity Color Barcodes: Per Channel Data Encoding via Orientation Modulation in Elliptical Dot Arrays

We present a new high capacity color barcode. The barcode we propose uses the cyan, magenta, and yellow (C,M,Y)

colorant separations available in color printers and enables high capacity by independently encoding data in each of these

separations. In each colorant channel, payload data is conveyed by using a periodic array of elliptically shaped dots whose

individual orientations are modulated to encode the data. The orientation based data encoding provides beneficial

robustness against printer and scanner tone variations. The overall color barcode is obtained when these color separations

are printed in overlay as is common in color printing. A reader recovers the barcode data from a conventional color scan of

the barcode, using red, green, and blue (R,G,B) channels complementary, respectively, to the print C, M, and Y channels.

For each channel, first the periodic arrangement of dots is exploited at the reader to enable synchronization by

compensating for both global rotation/scaling in scanning and local distortion in printing. To overcome the color

interference resulting from colorant absorptions in noncomplementary scanner channels, we propose a novel interference

minimizing data encoding approach and a statistical channel model (at the reader) that captures the characteristics of the

interference, enabling more accurate data recovery. We also employ an error correction methodology that effectively

utilizes the channel model. The experimental results show that the proposed method works well, offering (error-free)

operational rates that are comparable to or better than the highest capacity barcodes known in the literature.

72 High Dynamic Range Image Display With Halo and Clipping Prevention

The dynamic range of an image is defined as the ratio between the highest and the lowest luminance level. In a high

dynamic range (HDR) image, this value exceeds the capabilities of conventional display devices; as a consequence,

dedicated visualization techniques are required. In particular, it is possible to process an HDR image in order to reduce its

dynamic range without producing a significant change in the visual sensation experienced by the observer. In this paper,

we propose a dynamic range reduction algorithm that produces high-quality results with a low computational cost and a

limited number of parameters. The algorithm belongs to the category of methods based upon the Retinex theory of vision

and was specifically designed in order to prevent the formation of common artifacts, such as halos around the sharp edges

and clipping of the highlights, that often affect methods of this kind. After a detailed analysis of the state of the art, we shall

describe the method and compare the results and performance with those of two techniques recently proposed in the

literature and one commercial software.
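
The definition of dynamic range given above can be made concrete with a small sketch. The function names and the sample luminance values are hypothetical, and the logarithmic compression shown is only a simple global baseline for comparison; it is not the Retinex-based algorithm proposed in the paper.

```python
import math

def dynamic_range(luminance):
    """Ratio between the highest and the lowest positive luminance value."""
    positive = [v for row in luminance for v in row if v > 0]
    return max(positive) / min(positive)

def log_tone_map(luminance, out_max=255):
    """Global logarithmic compression to [0, out_max].

    Illustrative baseline only (an assumption for this sketch), not the
    halo- and clipping-preventing method described in the abstract.
    """
    positive = [v for row in luminance for v in row if v > 0]
    v_min, v_max = min(positive), max(positive)
    lo, hi = math.log(v_min), math.log(v_max)
    span = (hi - lo) or 1.0  # avoid division by zero for flat images
    return [
        [round(out_max * (math.log(max(v, v_min)) - lo) / span) for v in row]
        for row in luminance
    ]

# Hypothetical HDR luminance samples spanning six orders of magnitude.
hdr = [[0.01, 1.0], [100.0, 10000.0]]
ldr = log_tone_map(hdr)
```

A global curve like this compresses the range but flattens local contrast, which is the kind of limitation that motivates local, Retinex-style methods.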

73 High-Resolution Imaging Via Moving Random Exposure and Its Simulation

In this correspondence, we introduce a new imaging method to obtain high-resolution (HR) images. The image acquisition

is performed in two stages, compressive measurement and optimization reconstruction. In order to reconstruct HR images

by a small number of sensors, compressive measurements are made. Specifically, compressive measurements are made by

a low-resolution (LR) camera with randomly fluttering shutter, which can be viewed as a moving random exposure pattern.

In the optimization reconstruction stage, the HR image is computed by different models according to the prior knowledge of

scenes. The proposed imaging method offers a new way of acquiring HR images of essentially static scenes when the

camera resolution is limited by severe constraints such as cost, battery capacity, memory space, transmission bandwidth,

etc. and when the prior knowledge of scenes is available. The simulation results demonstrate the effectiveness of the

proposed imaging method.

74 Human Motion Tracking by Temporal-Spatial Local Gaussian Process Experts

Human pose estimation via motion tracking systems can be considered as a regression problem within a discriminative

framework. It is always a challenging task to model the mapping from observation space to state space because of the

high-dimensional characteristic in the multimodal conditional distribution. In order to build the mapping, existing

techniques usually involve a large set of training samples in the learning process which are limited in their capability to

deal with multimodality. We propose, in this work, a novel online sparse Gaussian Process (GP) regression model to

recover 3-D human motion in monocular videos. Particularly, we investigate the fact that for a given test input, its output is

mainly determined by the training samples potentially residing in its local neighborhood and defined in the unified input-

output space. This leads to a local mixture GP experts system composed of different local GP experts, each of which

dominates a mapping behavior with the specific covariance function adapting to a local region. To handle the

multimodality, we combine temporal and spatial information to obtain two categories of local experts. The

temporal and spatial experts are integrated into a seamless hybrid system, which is automatically self-initialized and robust

for visual tracking of nonlinear human motion. Learning and inference are extremely efficient as all the local experts are

defined online within very small neighborhoods. Extensive experiments on two real-world databases, HumanEva and PEAR,

demonstrate the effectiveness of our proposed model, which significantly improves on the performance of existing models.

75 Hyperspectral BSS Using GMCA With Spatio-Spectral Sparsity Constraints

Generalized morphological component analysis (GMCA) is a recent algorithm for multichannel data analysis which was

used successfully in a variety of applications including multichannel sparse decomposition, blind source separation (BSS),

color image restoration and inpainting. Building on GMCA, the purpose of this contribution is to describe a new algorithm

for BSS applications in hyperspectral data processing. It assumes the collected data is a mixture of components exhibiting

sparse spectral signatures as well as sparse spatial morphologies, each in specified dictionaries of spectral and spatial

waveforms. We report on numerical experiments with synthetic data and application to real observations which

demonstrate the validity of the proposed method.

76 Image Denoising in Mixed Poisson–Gaussian Noise

We propose a general methodology (PURE-LET) to design and optimize a wide class of transform-domain thresholding

algorithms for denoising images corrupted by mixed Poisson–Gaussian noise. We express the denoising process as a

linear expansion of thresholds (LET) that we optimize by relying on a purely data-adaptive unbiased estimate of the mean-

squared error (MSE), derived in a non-Bayesian framework (PURE: Poisson–Gaussian unbiased risk estimate). We provide

a practical approximation of this theoretical MSE estimate for the tractable optimization of arbitrary transform-domain

thresholding. We then propose a pointwise estimator for undecimated filterbank transforms, which consists of subband-

adaptive thresholding functions with signal-dependent thresholds that are globally optimized in the image domain. We

finally demonstrate the potential of the proposed approach through extensive comparisons with state-of-the-art techniques

that are specifically tailored to the estimation of Poisson intensities. We also present denoising results obtained on real

images of low-count fluorescence microscopy.

77 Image Resolution Enhancement by Using Discrete and Stationary Wavelet Decomposition

In this correspondence, the authors propose an image resolution enhancement technique based on interpolation of the

high frequency subband images obtained by discrete wavelet transform (DWT) and the input image. The edges are

enhanced by introducing an intermediate stage by using stationary wavelet transform (SWT). DWT is applied in order to

decompose an input image into different subbands. Then the high frequency subbands as well as the input image are

interpolated. The estimated high frequency subbands are then modified by using the high frequency subbands obtained

through SWT. Then all these subbands are combined to generate a new high resolution image by using inverse DWT

(IDWT). The quantitative and visual results show the superiority of the proposed technique over conventional

and state-of-the-art image resolution enhancement techniques.

78 Image Segmentation Using Fuzzy Region Competition and Spatial/Frequency Information

This paper presents a multiphase fuzzy region competition model that takes into account spatial and frequency information for image

segmentation. In the proposed energy functional, each region is represented by a fuzzy membership function and a data fidelity term that

measures the conformity of spatial and frequency data within each region to (generalized) Gaussian densities whose parameters are

determined jointly with the segmentation process. Compared with the classical region competition model, our approach gives soft

segmentation results via the fuzzy membership functions, and moreover, the use of frequency data provides additional region information

that can improve the overall segmentation result. To efficiently solve the minimization of the energy functional, we adopt an alternate

minimization procedure and make use of Chambolle’s fast duality projection algorithm. We apply the proposed method to synthetic and

natural textures as well as real-world natural images. Experimental results show that our proposed method has very promising

segmentation performance compared with the current state-of-the-art approaches.

79 In Search of Perceptually Salient Groupings

Finding meaningful groupings of image primitives has been a long-standing problem in computer vision. This paper studies

how salient groupings can be produced using established theories in the field of visual perception alone. The major

contribution is a novel definition of the Gestalt principle of Prägnanz, based upon Koffka’s definition that image

descriptions should be both stable and simple. Our method is global in the sense that it operates over all primitives in an

image at once. It works regardless of the type of image primitives and is generally independent of image properties such as

intensity, color, and texture. A novel experiment is designed to quantitatively evaluate the groupings output by our

method, which takes human disagreement into account and is generic to outputs of any grouper. We also demonstrate the

value of our method in an image segmentation application and quantitatively show that segmentations deliver promising

results when benchmarked using the Berkeley Segmentation Dataset (BSDS).

80 Incremental Training of a Detector Using Online Sparse Eigendecomposition

The ability to efficiently and accurately detect objects plays a very crucial role for many computer vision tasks. Recently,

offline object detectors have shown a tremendous success. However, one major drawback of offline techniques is that a

complete set of training data has to be collected beforehand. In addition, once learned, an offline detector cannot make use

of newly arriving data. To alleviate these drawbacks, online learning has been adopted with the following objectives: 1) the

technique should be computationally and storage efficient; 2) the updated classifier must maintain its high classification

accuracy. In this paper, we propose an effective and efficient framework for learning an adaptive online greedy sparse

linear discriminant analysis model. Unlike many existing online boosting detectors, which usually apply exponential or

logistic loss, our online algorithm makes use of linear discriminant analysis’ learning criterion that not only aims to

maximize the class-separation criterion but also incorporates the asymmetrical property of training data distributions. We

provide a better alternative for online boosting algorithms in the context of training a visual object detector. We demonstrate

the robustness and efficiency of our methods on handwritten digit and face data sets. Our results confirm that object

detection tasks benefit significantly when trained in an online manner.

81 Information Content Weighting for Perceptual Image Quality Assessment

Many state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage structure: local

quality/distortion measurement followed by pooling. While significant progress has been made in measuring local image

quality/distortion, the pooling stage is often done in ad-hoc ways, lacking theoretical principles and reliable computational

models. This paper aims to test the hypothesis that when viewing natural images, the optimal perceptual weights for

pooling should be proportional to local information content, which can be estimated in units of bits using advanced

statistical models of natural images. Our extensive studies based upon six publicly available subject-rated image

databases concluded with three useful findings. First, information content weighting leads to consistent improvement in

the performance of IQA algorithms. Second, surprisingly, with information content weighting, even the widely criticized

peak signal-to-noise-ratio can be converted to a competitive perceptual quality measure when compared with state-of-the-

art algorithms. Third, the best overall performance is achieved by combining information content weighting with multiscale

structural similarity measures.
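
The two-stage structure described above (local measurement followed by pooling) can be sketched as a weighted average. The weight formula here, log2(1 + local variance / noise variance), is a simplified Gaussian-channel assumption chosen for illustration; the paper estimates information content with more advanced statistical models of natural images, and all names below are hypothetical.

```python
import math

def information_weights(local_variances, noise_variance=1.0):
    """Per-region information content in bits (simplified assumption)."""
    return [math.log2(1.0 + v / noise_variance) for v in local_variances]

def weighted_pool(local_quality, weights):
    """Pool local quality scores with weights proportional to information."""
    total = sum(weights)
    if total == 0:
        # No region carries information: fall back to a plain average.
        return sum(local_quality) / len(local_quality)
    return sum(w * q for w, q in zip(weights, local_quality)) / total

# Hypothetical local quality scores and local signal variances.
quality = [1.0, 0.5, 0.2]
weights = information_weights([0.0, 1.0, 3.0])  # 0, 1 and 2 bits
score = weighted_pool(quality, weights)
```

In this toy case the flat region (zero variance) contributes nothing, so the pooled score is dominated by the textured regions.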

82 Interactive Streaming of Stored Multiview Video Using Redundant Frame Structures

While much of multiview video coding focuses on the rate-distortion performance of compressing all frames of all views for

storage or non-interactive video delivery over networks, we address the problem of designing a frame structure to enable

interactive multiview streaming, where clients can interactively switch views during video playback. Thus, as a client is

playing back successive frames (in time) for a given view, it can send a request to the server to switch to a different view

while continuing uninterrupted temporal playback. Noting that standard tools for random access (i.e., I-frame insertion) can

be bandwidth-inefficient for this application, we propose a redundant representation of I-, P-, and “merge” frames, where

each original picture can be encoded into multiple versions, appropriately trading off expected transmission rate with

storage, to facilitate view switching. We first present ad hoc frame structures with good performance when the view-

switching probabilities are either very large or very small. We then present optimization algorithms that generate more

general frame structures with better overall performance for the general case. We show in our experiments that we can

generate redundant frame structures offering a range of tradeoff points between transmission and storage, e.g.,

outperforming simple I-frame insertion structures by up to 45% in terms of bandwidth efficiency at twice the storage cost.

83 Inverse Halftoning Based on the Bayesian Theorem

This study proposes a method which can generate high quality inverse halftone images from halftone images. This method

can be employed prior to any signal processing over a halftone image or the inverse halftoning used in JBIG2. The

proposed method utilizes the least-mean-square (LMS) algorithm to establish a relationship between the current processing

position and its corresponding neighboring positions in each type of halftone image, including direct binary search, error

diffusion, dot diffusion, and ordered dithering. After which, a referenced region called a support region (SR) is used to

extract features. The SR can be obtained by relabeling the LMS-trained filters with the order of importance. Moreover, the

probability of black pixel occurrence is considered as a feature in this work. According to this feature, the probabilities of

all possible grayscale values at the current processing position can be obtained by the Bayesian theorem. Consequently,

the final output at this position is the grayscale value with the highest probability. Experimental results show that the

proposed method offers better visual quality than that of Mese–Vaidyanathan’s and Chang et al.’s methods in terms of

human-visual peak signal-to-noise ratio (HPSNR). In addition, the memory consumption is also superior to Mese–

Vaidyanathan’s method.
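
The final decision step described above, picking the grayscale value with the highest posterior probability, can be sketched with Bayes' theorem. The likelihood and prior tables below are made-up illustrative numbers, and the function name is hypothetical; the paper derives its probabilities from LMS-trained support regions, which this sketch does not reproduce.

```python
def map_gray_value(likelihood, prior):
    """Return the gray value with the highest posterior, plus all posteriors.

    likelihood[g] = P(feature | g), prior[g] = P(g); both are assumed to be
    lookup tables built beforehand (hypothetical inputs for this sketch).
    """
    joint = {g: likelihood[g] * prior[g] for g in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("feature has zero probability under every gray value")
    posterior = {g: p / evidence for g, p in joint.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior

# Illustrative tables for three candidate gray values.
prior = {0: 0.5, 128: 0.3, 255: 0.2}
likelihood = {0: 0.1, 128: 0.6, 255: 0.3}
best, posterior = map_gray_value(likelihood, prior)
```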

84 Iterative Shrinkage Approach to Restoration of Optical Imagery

The problem of reconstruction of digital images from their degraded measurements is regarded as a problem of central

importance in various fields of engineering and imaging sciences. In such cases, the degradation is typically caused by the

resolution limitations of an imaging device in use and/or by the destructive influence of measurement noise. Specifically,

when the noise obeys a Poisson probability law, standard approaches to the problem of image reconstruction are based

upon using fixed-point algorithms which follow the methodology first proposed by Richardson and Lucy. The practice of

using these methods, however, shows that their convergence properties tend to deteriorate at relatively high noise levels.

Accordingly, in the present paper, a novel method for denoising and/or deblurring of digital images corrupted by Poisson

noise is introduced. The proposed method is derived under the assumption that the image of interest can be sparsely

represented in the domain of a linear transform. Consequently, a shrinkage-based iterative procedure is proposed, which

guarantees the solution to converge to the global maximizer of an associated maximum a posteriori criterion. It is shown in

a series of computer-simulated experiments that the proposed method outperforms a number of existing alternatives in

terms of stability, precision, and computational efficiency.
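
For reference, the classical Richardson and Lucy fixed-point iteration that the abstract cites as the standard approach can be sketched in 1-D as follows. This is the baseline, not the shrinkage-based method the paper proposes. The sketch assumes circular boundaries, an odd-length kernel that sums to one, and noise-free hypothetical data.

```python
def correlate_circular(signal, kernel):
    """Apply an odd-length kernel, centred on each sample, with wrap-around."""
    n, half = len(signal), len(kernel) // 2
    return [
        sum(kernel[j] * signal[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def richardson_lucy(observed, kernel, iterations=50, eps=1e-12):
    """Multiplicative update x <- x * H^T(y / (H x)) for Poisson data."""
    flipped = kernel[::-1]  # the adjoint operator uses the reversed kernel
    mean = sum(observed) / len(observed)
    estimate = [mean] * len(observed)  # positive, flat starting point
    for _ in range(iterations):
        blurred = correlate_circular(estimate, kernel)
        ratio = [y / max(b, eps) for y, b in zip(observed, blurred)]
        correction = correlate_circular(ratio, flipped)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate

# Hypothetical test case: a single spike blurred by a small kernel.
kernel = [0.25, 0.5, 0.25]
truth = [0, 0, 0, 10, 0, 0, 0, 0]
observed = correlate_circular(truth, kernel)
restored = richardson_lucy(observed, kernel)
```

The update keeps the estimate nonnegative and preserves total intensity, which is why it suits photon-count data.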

85 JPEG2000-Based Scalable Interactive Video (JSIV)

We propose a novel paradigm for interactive video streaming and we coin the term JPEG2000-based scalable interactive

video (JSIV) for it. JSIV utilizes JPEG2000 to independently compress the original video sequence frames and provide for

quality and spatial resolution scalability. To exploit interframe redundancy, JSIV utilizes prediction and conditional

replenishment of code-blocks aided by a server policy that optimally selects the number of quality layers for each code-

block transmitted and a client policy that makes the most of the received (distorted) frames. It is also possible for JSIV to

employ motion compensation; however, we leave this topic to future work. To optimally solve the server transmission

problem, a Lagrangian-style rate-distortion optimization procedure is employed. In JSIV, a wide variety of frame prediction

arrangements can be employed including hierarchical B-frames of the scalable video coding (SVC) extension of the

H.264/AVC standard. JSIV provides considerably better interactivity compared to existing schemes and can adapt

immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual

frames. Experimental results for surveillance footage, which does not suffer from the absence of motion compensation,

show that JSIV’s performance is comparable to that of SVC in some usage scenarios while JSIV performs better in others.

86 Kernel Maximum Autocorrelation Factor and Minimum Noise Fraction Transformations

This paper introduces kernel versions of maximum autocorrelation factor (MAF) analysis and minimum noise fraction (MNF)

analysis. The kernel versions are based upon a dual formulation also termed Q-mode analysis in which the data enter into

the analysis via inner products in the Gram matrix only. In the kernel version, the inner products of the original data are

replaced by inner products between nonlinear mappings into higher dimensional feature space. Via kernel substitution, also

known as the kernel trick, these inner products between the mappings are in turn replaced by a kernel function and all

quantities needed in the analysis are expressed in terms of this kernel function. This means that we need not know the

nonlinear mappings explicitly. Kernel principal component analysis (PCA), kernel MAF, and kernel MNF analyses handle

nonlinearities by implicitly transforming data into high (even infinite) dimensional feature space via the kernel function and

then performing a linear analysis in that space. Three examples show the very successful application of kernel MAF/MNF

analysis to: 1) change detection in DLR 3K camera data recorded 0.7 s apart over a busy motorway, 2) change detection in

hyperspectral HyMap scanner data covering a small agricultural area, and 3) maize kernel inspection. In the cases shown,

the kernel MAF/MNF transformation performs better than its linear counterpart as well as linear and kernel PCA. The leading

kernel MAF/MNF variates seem to possess the ability to adapt to even abruptly varying multi- and hypervariate backgrounds

and focus on extreme observations.
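
The kernel substitution described above can be illustrated by building a Gram matrix directly from a kernel function, so the nonlinear mapping never has to be written down. The Gaussian (RBF) kernel, the `gamma` value, and the sample points are assumptions for this sketch; the abstract does not state which kernel the paper uses.

```python
import math

def linear_kernel(x, y):
    """Ordinary inner product, as used in the linear (Q-mode) analysis."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel: an inner product in an implicit feature space."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def gram_matrix(data, kernel):
    """Matrix of pairwise kernel values; the data enter the analysis only here."""
    return [[kernel(xi, xj) for xj in data] for xi in data]

# Hypothetical observations.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K_linear = gram_matrix(data, linear_kernel)
K_rbf = gram_matrix(data, rbf_kernel)
```

Swapping `linear_kernel` for `rbf_kernel` is the only change needed to move from the linear to the kernel version of the Gram matrix.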

87 Large Disparity Motion Layer Extraction via Topological Clustering

In this paper, we present a robust and efficient approach to extract motion layers from a pair of images with large disparity

motion. First, motion models are established as: 1) initial SIFT matches are obtained and grouped into a set of clusters

using our developed topological clustering algorithm; 2) for each cluster with no less than three matches, an affine

transformation is estimated with least-square solution as tentative motion model; and 3) the tentative motion models are

refined and the invalid models are pruned. Then, with the obtained motion models, a graph cuts based layer assignment

algorithm is employed to segment the scene into several motion layers. Experimental results demonstrate that our method

can successfully segment scenes containing objects with large interframe motion or even with significant interframe scale

and pose changes. Furthermore, compared with the previous method invented by Wills et al. and its modified version, our

method is much faster and more robust.

88 Lazy Sliding Window Implementation of the Bilateral Filter on Parallel Architectures

The bilateral filter is one of the state-of-the-art methods for noise reduction in images. The plausible visual result the filter

produces makes it a common choice for image and video processing applications, yet its high computational complexity

makes a real-time implementation a challenging task. Presented here is a parallel version of the bilateral filter using a lazy

sliding window, suitable for SIMD-type architectures.
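
To show where the computational cost comes from, here is a brute-force 1-D bilateral filter. It is a plain reference version for illustration, not the lazy sliding window or SIMD implementation presented in the paper; the parameter names and default values are assumptions.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Brute-force bilateral filter: each output needs a full window pass."""
    n = len(signal)
    output = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Weight falls off with spatial distance and intensity difference.
            spatial = (i - j) ** 2 / (2.0 * sigma_space ** 2)
            tonal = (signal[i] - signal[j]) ** 2 / (2.0 * sigma_range ** 2)
            weight = math.exp(-spatial - tonal)
            weighted_sum += weight * signal[j]
            weight_total += weight
        output.append(weighted_sum / weight_total)
    return output

# Hypothetical step edge: samples across the edge get near-zero weight.
step = [0, 0, 0, 0, 100, 100, 100, 100]
filtered = bilateral_filter_1d(step)
```

Because the range weight depends on the centre sample, the window cannot be reused naively between neighbouring outputs, which is the difficulty that parallel formulations have to work around.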

89 Light Field Analysis for Modeling Image Formation

Image formation is traditionally described by a number of individual models, one for each specific effect in the image

formation process. However, it is difficult to aggregate the effects by concatenating such individual models. In this paper,

we apply light transport analysis to derive a unified image formation model that represents the radiance along a light ray as

a 4-D light field signal and physical phenomena such as lens refraction and blocking as linear transformations or

modulations of the light field. This unified mathematical framework allows the entire image formation process to be

elegantly described by a single equation. It also allows most geometric and photometric effects of imaging, including

perspective transformation, defocus blur, and vignetting, to be represented in both 4-D primal and dual domains. The result

matches that of traditional models. Generalizations and applications of this theoretic framework are discussed.

90 Lightweight Detection of Additive Watermarking in the DWT-Domain

This article aims at lightweight, blind detection of additive spread-spectrum watermarks in the DWT domain. We focus on

two host signal noise models and two types of hypothesis tests for watermark detection. As a crucial point of our work we

take a closer look at the computational requirements of watermark detectors. This involves the computation of the

detection response, parameter estimation and threshold selection. We show that by switching to approximate host signal

parameter estimates or even fixed parameter settings we achieve a remarkable improvement in runtime performance

without sacrificing detection performance. Our experimental results on a large number of images confirm the assumption

that there is not necessarily a tradeoff between computation time and detection performance.

91 Measuring the Quality of Quality Measures

Print quality (PQ) is a composite attribute defined by human perception. As such, the ultimate way to determine and

quantify PQ is by human survey. However, repeated surveys are time consuming and often represent a burden on

processes that involve repeated evaluations. A desired alternative would be an automatic quality rating tool. Once such a

quality evaluation measure is proposed, it should be qualified. That is, it should be shown to reflect human assessment. If

two of the human opinions conflict, the tool cannot possibly agree with both. Conflicts between human opinions are

common, which complicates the evaluation of a tool’s success in reflecting human judgment. There are many optional ways

for measuring the agreement between human assessment and tool evaluation, but different methods may have conflicting

results. It is, therefore, important to pre-establish the appropriate method for the evaluation of quality-evaluation-tools, a

method that takes the disagreement among the survey participants into account. In this paper, we model human quality

preference and derive the most appropriate method to qualify quality evaluation tools. We demonstrate the resulting

qualification method in a real life scenario—the qualification of the mechanical band meter.

92 Missing Intensity Interpolation Using a Kernel PCA-Based POCS Algorithm and its Applications

A missing intensity interpolation method using a kernel principal component analysis (PCA)-based projection onto convex

sets (POCS) algorithm and its applications are presented in this paper. In order to interpolate missing intensities within a

target image, the proposed method reconstructs local textures containing the missing pixels by using the POCS algorithm.

In this reconstruction process, a nonlinear eigenspace is constructed from each kind of texture, and the optimal subspace

for the target local texture is introduced into the constraint of the POCS algorithm. In the proposed method, the optimal

subspace can be selected by monitoring errors converged in the reconstruction process. This approach provides a solution

to the problem in conventional methods of not being able to effectively perform adaptive reconstruction of the target

textures due to missing intensities, and successful interpolation of the missing intensities by the proposed method can be

realized. Furthermore, since our method can restore any images including arbitrary-shaped missing areas, its potential in

two image reconstruction tasks, image enlargement and missing area restoration, is also shown in this paper.

93 Multidimensional Filter Bank Signal Reconstruction From Multichannel Acquisition

We study the theory and algorithms of an optimal use of multidimensional signal reconstruction from multichannel

acquisition by using a filter bank setup. Suppose that we have an N-channel convolution system, referred to as N analysis

filters, in M dimensions. Instead of taking all the data and applying multichannel deconvolution, we first reduce the

collected data set by an integer M × M uniform sampling matrix D, and then search for a synthesis polyphase matrix which

could perfectly reconstruct any input discrete signal. First, we determine the existence of perfect reconstruction (PR)

systems for a given set of finite-impulse response (FIR) analysis filters. Second, we present an efficient algorithm to find a

sampling matrix with maximum sampling rate and to find a FIR PR synthesis polyphase matrix for a given set of FIR

analysis filters. Finally, once a particular FIR PR synthesis polyphase matrix is found, we can characterize all FIR PR

synthesis matrices, and then find an optimal one according to design criteria including robust reconstruction in the

presence of noise.

94 Multiple Player Tracking in Sports Video: A Dual-Mode Two-Way Bayesian Inference Approach With Progressive Observation Modeling

Multiple object tracking (MOT) is a very challenging task yet of fundamental importance for many practical applications. In

this paper, we focus on the problem of tracking multiple players in sports video which is even more difficult due to the

abrupt movements of players and their complex interactions. To handle the difficulties in this problem, we present a new

MOT algorithm which contributes both in the observation modeling level and in the tracking strategy level. For the

observation modeling, we develop a progressive observation modeling process that is able to provide strong tracking

observations and greatly facilitate the tracking task. For the tracking strategy, we propose a dual-mode two-way Bayesian

inference approach which dynamically switches between an offline general model and an online dedicated model to deal

with single isolated object tracking and multiple occluded object tracking integrally by forward filtering and backward

smoothing. Extensive experiments on different kinds of sports videos, including football, basketball, as well as hockey,

demonstrate the effectiveness and efficiency of the proposed method.

95 Multiregion Image Segmentation by Parametric Kernel Graph Cuts

The purpose of this study is to investigate multiregion graph cut image partitioning via kernel mapping of the image data.

The image data is transformed implicitly by a kernel function so that the piecewise constant model of the graph cut

formulation becomes applicable. The objective function contains an original data term to evaluate the deviation of the

transformed data, within each segmentation region, from the piecewise constant model, and a smoothness, boundary

preserving regularization term. The method affords an effective alternative to complex modeling of the original image data

while taking advantage of the computational benefits of graph cuts. Using a common kernel function, energy minimization

typically consists of iterating image partitioning by graph cut iterations and evaluations of region parameters via fixed point

computation. A quantitative and comparative performance assessment is carried out over a large number of experiments

using synthetic grey level data as well as natural images from the Berkeley database. The effectiveness of the method is

also demonstrated through a set of experiments with real images of a variety of types such as medical, synthetic aperture

radar, and motion maps.

96 Nonlocal Mumford-Shah Regularizers for Color Image Restoration

We propose here a class of restoration algorithms for color images, based upon the Mumford-Shah (MS) model and

nonlocal image information. The Ambrosio-Tortorelli and Shah elliptic approximations are defined to work in a small local

neighborhood, which are sufficient to denoise smooth regions with sharp boundaries. However, texture is nonlocal in

nature and requires semilocal/nonlocal information for efficient image denoising and restoration. Inspired by recent

works (nonlocal means of Buades, Coll, Morel, and nonlocal total variation of Gilboa, Osher), we extend the local Ambrosio-

Tortorelli and Shah approximations to the MS functional to novel nonlocal formulations, for better restoration of fine

structures and texture. We present several applications of the proposed nonlocal MS regularizers in image processing such

as color image denoising, color image deblurring in the presence of Gaussian or impulse noise, color image inpainting,

color image super-resolution, and color filter array demosaicing. In all the applications, the proposed nonlocal regularizers

produce superior results over the local ones, especially in image inpainting with large missing regions. We also prove

several characterizations of minimizers based upon dual norm formulations.

97 Nonlocal PDEs-Based Morphology on Weighted Graphs for Image and Data Processing

Mathematical morphology (MM) offers a wide range of operators to address various image processing problems. These

operators can be defined in terms of algebraic (discrete) sets or as partial differential equations (PDEs). In this paper, we

introduce a nonlocal PDEs-based morphological framework defined on weighted graphs. We present and analyze a set of

operators that leads to a family of discretized morphological PDEs on weighted graphs. Our formulation introduces

nonlocal patch-based configurations for image processing and extends the PDEs-based approach to the processing of

arbitrary data such as nonuniform high dimensional data. Finally, we show the potential of our methodology to

process, segment and classify images and arbitrary data.

98 Non-rigid Registration of 2-D and 3-D Dynamic Cell Nuclei Images for Improved Classification of Sub-cellular Particle Motion

The observed motion of subcellular particles in fluorescence microscopy image sequences of live cells is generally a

superposition of the motion and deformation of the cell and the motion of the particles. Decoupling the two types of

movements to enable accurate classification of the particle motion requires the application of registration algorithms. We

have developed an intensity-based approach for nonrigid registration of multichannel microscopy image sequences of cell

nuclei. First, based on 3-D synthetic images we demonstrate that cell nucleus deformations change the observed motion

types of particles and that our approach allows the original motion to be recovered. Second, we have successfully applied our

approach to register 2-D and 3-D real microscopy image sequences. A quantitative experimental comparison with previous

approaches for nonrigid registration of cell microscopy has also been performed.

99 Non-uniform Directional Filter Banks With Arbitrary Frequency Partitioning

Directional filter banks (DFBs) are highly desired in directional representation of images. In this correspondence, we

propose a 2-D nonsubsampled nonuniform directional filter bank (NUDFB) and its design method. The proposed NUDFB

has nonuniform wedge-shaped subbands and allows arbitrary frequency partitioning schemes. It can extract directional

information according to the directional distribution of images. This attractive advantage cannot be achieved by the

existing directional transforms. The design method of the proposed NUDFB is based upon the pseudopolar Fourier

transform. By utilizing the geometry property of the pseudopolar grid, we employ a 1-D nonsubsampled nonuniform filter

bank to obtain a set of nonuniform wedge-shaped subbands. During the design process, only 1-D operations are involved

and, thus, the difficulty encountered in the design of 2-D fan filters is avoided. To demonstrate the potential of the proposed

NUDFB, an example on image directional decomposition is given.

100 No-Reference Blur Assessment of Digital Pictures Based on Multi-feature Classifiers

In this paper, we address the problem of no-reference quality assessment for digital pictures corrupted with blur. We start

with the generation of a large real image database containing pictures taken by human users in a variety of situations, and

with subjective tests that generate the ground truth associated with those images. Based upon this ground truth,

we select a number of high-quality pictures and artificially degrade them with different intensities of simulated blur

(Gaussian and linear motion), totalling 6000 simulated blur images. We extensively evaluate the performance of

state-of-the-art strategies for no-reference blur quantification in different blurring scenarios, and propose a paradigm for blur evaluation

in which an effective method is pursued by combining several metrics and low-level image features. We test this paradigm

by designing a no-reference quality assessment algorithm for blurred images which combines different metrics in a

classifier based upon a neural network structure. Experimental results show that this leads to an improved performance

that better reflects the images’ ground truth. Finally, based upon the real image database, we show that the proposed

method also outperforms other algorithms and metrics in realistic blur scenarios.

101 On a Derivative-Free Fan-Beam Reconstruction Formula

We clarify that the derivative-free fan-beam reconstruction formula [IEEE Trans. Image Process. 2, 543–547, 1993] only

allows exact reconstruction of an object for a circular trajectory or at the origin of the coordinate system for a radially

symmetric noncircular trajectory.

102 On the Selection of Optimal Feature Region Set for Robust Digital Image Watermarking

A novel feature region selection method for robust digital image watermarking is proposed in this paper. This method aims

to select a nonoverlapping feature region set, which has the greatest robustness against various attacks and can preserve

image quality as much as possible after watermarking. It first performs a simulated attacking procedure using some

predefined attacks to evaluate the robustness of every candidate feature region. According to the evaluation results, it then

adopts a track-with-pruning procedure to search for a minimal primary feature set which can resist the most predefined

attacks. In order to enhance its resistance to undefined attacks under the constraint of preserving image quality, the

primary feature set is then extended by adding some auxiliary feature regions. This work is formulated as a

multidimensional knapsack problem and solved by a genetic algorithm based approach. The experimental results for

StirMark attacks on some benchmark images support our expectation that the primary feature set can resist all the

predefined attacks and its extension can enhance the robustness against undefined attacks. Compared with some

well-known feature-based methods, the proposed method exhibits better performance in robust digital watermarking.

103 Online Sparse Gaussian Process Regression and Its Applications

We present a new Gaussian process (GP) inference algorithm, called online sparse matrix Gaussian processes (OSMGP),

and demonstrate its merits by applying it to the problems of head pose estimation and visual tracking. The OSMGP is based

upon the observation that for kernels with local support, the Gram matrix is typically sparse. Maintaining and updating the

sparse Cholesky factor of the Gram matrix can be done efficiently using Givens rotations. This leads to an exact, online

algorithm whose update time scales linearly with the size of the Gram matrix. Further, we provide a method for constant

time operation of the OSMGP using matrix downdates. The downdates maintain the Cholesky factor at a constant size by

removing certain rows and columns corresponding to discarded training examples. We demonstrate that, using these

matrix downdates, online hyperparameter estimation can be included at cost linear in the number of total training examples.

We describe a robust appearance-based head pose estimation system based upon the OSMGP. Numerous experiments and

comparisons with existing methods using a large dataset system demonstrate the efficiency and accuracy of our system.

Further, to showcase the applicability of OSMGP to a wide variety of problems, we also describe a regression-based visual

tracking method. Experiments show that our OSMGP algorithm generalizes well using online learning.
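
The sparsity observation behind OSMGP can be illustrated with a short sketch. It is not the authors' algorithm: the triangular kernel, the 1-D inputs, and the names `triangular_kernel`, `gram_matrix`, and `sparsity` are assumptions chosen for illustration, and the Givens-rotation update of the Cholesky factor is not shown. The sketch only checks that a kernel with local support yields a Gram matrix that is mostly zeros.

```python
def triangular_kernel(a, b, width=1.0):
    """Compactly supported kernel: exactly zero when |a - b| >= width."""
    return max(0.0, 1.0 - abs(a - b) / width)


def gram_matrix(points, width=1.0):
    """Dense list-of-lists Gram matrix for 1-D inputs (illustration only)."""
    return [[triangular_kernel(p, q, width) for q in points] for p in points]


def sparsity(matrix):
    """Fraction of entries that are exactly zero."""
    n = len(matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0.0)
    return zeros / (n * n)


points = [0.5 * i for i in range(40)]  # hypothetical inputs spaced 0.5 apart
K = gram_matrix(points, width=1.0)
print(f"zero entries: {sparsity(K):.1%}")
```

With this spacing only neighboring inputs interact, so the matrix is tridiagonal; a sparse factorization would exploit exactly this structure.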

104 Optimal Design of FIR Triplet Halfband Filter Bank and Application in Image Coding

This correspondence proposes an efficient semidefinite programming (SDP) method for the design of a class of linear

phase finite impulse response triplet halfband filter banks whose filters have optimal frequency selectivity for a prescribed

regularity order. The design problem is formulated as the minimization of the least square error subject to peak error

constraints and regularity constraints. By using the linear matrix inequality characterization of the trigonometric

semi-infinite constraints, it can then be exactly cast as an SDP problem with a small number of variables and, hence, can be

solved efficiently. Several design examples of the triplet halfband filter bank are provided for illustration and comparison

with previous works. Finally, the image coding performance of the filter bank is presented.

105 Optimal Image Alignment With Random Projections of Manifolds: Algorithm and Geometric Analysis

This paper addresses the problem of image alignment based on random measurements. Image alignment consists of

estimating the relative transformation between a query image and a reference image. We consider the specific problem

where the query image is provided in compressed form in terms of linear measurements captured by a vision sensor. We

cast the alignment problem as a manifold distance minimization problem in the linear subspace defined by the

measurements. The transformation manifold that represents synthesis of shift, rotation, and isotropic scaling of the

reference image can be given in closed form when the reference pattern is sparsely represented over a parametric

dictionary. We show that the objective function can then be decomposed as the difference of two convex functions (DC) in

the particular case where the dictionary is built on Gaussian functions. Thus, the optimization problem becomes a DC

program, which in turn can be solved globally by a cutting plane method. The quality of the solution is typically affected by

the number of random measurements and the condition number of the manifold that describes the transformations of the

reference image. We show that the curvature, which is closely related to the condition number, remains bounded in our

image alignment problem, which means that the relative transformation between two images can be determined optimally in

a reduced subspace.

106 Optimal Inversion of the Anscombe Transformation in Low-Count Poisson Image Denoising

The removal of Poisson noise is often performed through the following three-step procedure. First, the noise variance is

stabilized by applying the Anscombe root transformation to the data, producing a signal in which the noise can be treated

as additive Gaussian with unitary variance. Second, the noise is removed using a conventional denoising algorithm for

additive white Gaussian noise. Third, an inverse transformation is applied to the denoised signal, obtaining the estimate of

the signal of interest. The choice of the proper inverse transformation is crucial in order to minimize the bias error which

arises when the nonlinear forward transformation is applied. We introduce optimal inverses for the Anscombe

transformation, in particular the exact unbiased inverse, a maximum likelihood (ML) inverse, and a more sophisticated

minimum mean square error (MMSE) inverse. We then present an experimental analysis using a few state-of-the-art

denoising algorithms and show that the estimation can be consistently improved by applying the exact unbiased inverse,

particularly at the low-count regime. This results in a very efficient filtering solution that is competitive with some of the

best existing methods for Poisson image denoising.
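
The bias that motivates the choice of inverse can be seen in a small numerical sketch. It assumes the standard Anscombe transformation f(x) = 2·sqrt(x + 3/8) and compares two textbook closed-form inverses; the function names are hypothetical, and the exact unbiased, ML, and MMSE inverses of the paper are not implemented. As we understand it, an exact unbiased inverse would map `expected_anscombe(mean)` back to `mean`, which neither closed form does at low counts.

```python
import math


def anscombe(x):
    """Forward Anscombe root transformation of a Poisson count x."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)


def inverse_algebraic(d):
    """Direct algebraic inverse of the forward transformation."""
    return (d / 2.0) ** 2 - 3.0 / 8.0


def inverse_asymptotic(d):
    """Asymptotically unbiased inverse, intended for large counts."""
    return (d / 2.0) ** 2 - 1.0 / 8.0


def expected_anscombe(mean, tail=60):
    """E[anscombe(X)] for X ~ Poisson(mean), summed term by term."""
    upper = int(mean + 10.0 * math.sqrt(mean)) + tail
    p = math.exp(-mean)  # P(X = 0)
    total = 0.0
    for k in range(upper + 1):
        total += anscombe(k) * p
        p *= mean / (k + 1)  # P(X = k + 1) from P(X = k)
    return total


for mean in (0.5, 2.0, 20.0):
    d = expected_anscombe(mean)  # what an ideal denoiser would output
    print(mean, inverse_algebraic(d), inverse_asymptotic(d))
```

Comparing the printed estimates with the true mean shows how far each closed-form inverse drifts as the count drops, which is the low-count regime the abstract targets.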

107 Optimizing a Tone Curve for Backward-Compatible High Dynamic Range Image and Video Compression

For backward compatible high dynamic range (HDR) video compression, the HDR sequence is reconstructed by inverse

tone-mapping a compressed low dynamic range (LDR) version of the original HDR content. In this paper, we show that the

appropriate choice of a tone-mapping operator (TMO) can significantly improve the reconstructed HDR quality. We develop

a statistical model that approximates the distortion resulting from the combined processes of tone-mapping and

compression. Using this model, we formulate a numerical optimization problem to find the tone-curve that minimizes the

expected mean square error (MSE) in the reconstructed HDR sequence. We also develop a simplified model that reduces

the computational complexity of the optimization problem to a closed-form solution. Performance evaluations show that the

proposed methods provide superior performance in terms of HDR MSE and SSIM compared to existing tone-mapping

schemes. It is also shown that the LDR image quality resulting from the proposed methods matches that produced by

perceptually-based TMOs.

108 Paramer Mismatch-Based Spectral Gamut Mapping

A spectral agreement between the original scene and a printed reproduction is required to achieve an illuminant-invariant

visual match. This is usually impossible since the spectral gamut of typical printing systems is only a small subset of all

natural reflectances. Out-of-gamut reflectances need to be mapped into the spectral gamut of the printer minimizing the

perceived error between original and reproduction for more than one illuminant. In this paper, we propose an algorithmic

framework for spectral gamut mapping to achieve a reproduction that is as visually correct as a colorimetric reproduction

for one illuminant and is superior for a set of other illuminants. A sequence of hierarchical mappings in 3-D color spaces

is performed utilizing the observer’s color quantization to increase the spectral variability of subsequent transformations:

For the most important illuminant, a traditional colorimetric gamut mapping is performed. For any additional illuminants,

colors are mapped onto pixel-dependent paramer mismatch gamuts preserving the visual equivalence of previous

transformations. We present a separation method for investigating the spectral gamut mapping framework and show that

hue shifts and chroma gains cannot always be avoided for the second and subsequent illuminants and that the order of

illuminants has a large impact on the final reproduction.

109 Passive Polarimetric Imagery-Based Material Classification Robust to Illumination Source Position and Viewpoint

Polarization, a property of light that conveys information about the transverse electric field orientation, complements other

attributes of electromagnetic radiation such as intensity and frequency. Using multiple passive polarimetric images, we

develop an iterative, model-based approach to estimate the complex index of refraction and apply it to target classification.

110 Perceptual Segmentation: Combining Image Segmentation With Object Tagging

Human observers understand the content of an image intuitively. Based upon image content, they perform many

image-related tasks, such as creating slide shows and photo albums, and organizing their image archives. For example, to

select photos for an album, people assess image quality based upon the main objects in the image. They modify colors in

an image based upon the color of important objects, such as sky, grass or skin. Serious photographers might modify each

object separately. Photo applications, in contrast, use low-level descriptors to guide similar tasks. Typical descriptors, such

as color histograms, noise level, JPEG artifacts and overall sharpness, can guide an imaging application and safeguard

against blunders. However, there is a gap between the outcome of such operations and the same task performed by a

person. We believe that the gap can be bridged by automatically understanding the content of the image. This paper

presents algorithms for automatic tagging of perceptual objects in images, including sky, skin, and foliage, which

constitutes an important step toward this goal.

111 Performance Analysis of n-Channel Symmetric FEC-Based Multiple Description Coding for OFDM Networks

Recently, multiple description source coding has emerged as an attractive framework for robust multimedia transmission

over packet erasure channels. In this paper, we mathematically analyze the performance of n-channel symmetric FEC-based

multiple description coding for a progressive mode of transmission over orthogonal frequency division multiplexing

(OFDM) networks in a frequency-selective slowly-varying Rayleigh faded environment. We derive the expressions for the

bounds of the throughput and distortion performance of the system in an explicit closed form, whereas the exact

performance is given by an expression in the form of a single integration. Based on this analysis, the performance of the

system can be numerically evaluated. Our results show that at high SNR, the multiple description encoder does not need to

fine-tune the optimization parameters of the system due to the correlated nature of the subcarriers. It is also shown that,

despite the bursty nature of the errors in a slow fading environment, FEC-based multiple description coding without

temporal coding provides a greater advantage for smaller description sizes.

112 Practical Bounds on Image Denoising: From Estimation to Information

Recently, in a previous work, we proposed a way to bound how well any given image can be denoised. The bound was

computed directly from the noise-free image that was assumed to be available. In this work, we extend the formulation to

the more practical case where no ground truth is available. We show that the parameters of the bounds, namely the cluster

covariances and level of redundancy for patches in the image, can be estimated directly from the noise corrupted image.

Further, we analyze the bounds formulation to show that these two parameters are interdependent and they, along with the

bounds formulation as a whole, have a nice information-theoretic interpretation as well. The results are verified through a

variety of well-motivated experiments.

113 Proto-Object Based Rate Control for JPEG2000: An Approach to Content-Based Scalability

The JPEG2000 system provides scalability with respect to quality, resolution and color component in the transfer of

images. However, scalability with respect to semantic content is still lacking. We propose a biologically plausible salient

region based bit allocation mechanism within the JPEG2000 codec for the purpose of augmenting scalability with respect to

semantic content. First, an input image is segmented into several salient proto-objects (a region that possibly contains a

semantically meaningful physical object) and background regions (a region that contains no object of interest) by modeling

visual focus of attention on salient proto-objects. Then, a novel rate control scheme distributes a target bit rate to each

individual region according to its saliency, and constructs quality layers of proto-objects for the purpose of more precise

truncation comparable to original quality layers in the standard. Empirical results show that the suggested approach adds

to the JPEG2000 system scalability with respect to content as well as the functionality of selectively encoding, decoding,

and manipulation of each individual proto-object in the image, with only minor modifications to the JPEG2000

standard. Furthermore, the proposed rate control approach efficiently reduces the computational complexity and memory

usage, as well as maintains the high quality of the image to a level comparable to the conventional post-compression rate

distortion (PCRD) optimum truncation algorithm for JPEG2000.

114 Quality Assessment of Deblocked Images

We study the efficiency of deblocking algorithms for improving visual signals degraded by blocking artifacts from

compression. Rather than using only the perceptually questionable PSNR, we instead propose a block-sensitive index,

named PSNR-B, that produces objective judgments that accord with observations. The PSNR-B modifies PSNR by including

a blocking effect factor. We also use the perceptually significant SSIM index, which produces results largely in agreement

with PSNR-B. Simulation results show that the PSNR-B results in better performance for quality assessment of deblocked

images than PSNR and a well-known blockiness-specific index.
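
The idea of adding a blocking effect factor to PSNR can be sketched as follows. This is a simplified illustration, not the paper's PSNR-B definition: the penalty used here (excess mean squared step across vertical block edges only, for a hypothetical 8-pixel block size) and the names `blocking_penalty` and `block_sensitive_psnr` are assumptions.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized grayscale images."""
    h, w = len(ref), len(ref[0])
    return sum((ref[y][x] - test[y][x]) ** 2
               for y in range(h) for x in range(w)) / (h * w)


def psnr(ref, test, peak=255.0):
    err = mse(ref, test)
    return float("inf") if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def blocking_penalty(img, block=8):
    """Excess mean squared step across vertical block edges (simplified)."""
    edge, inner = [], []
    for row in img:
        for x in range(len(row) - 1):
            step = (row[x] - row[x + 1]) ** 2
            (edge if (x + 1) % block == 0 else inner).append(step)
    if not edge or not inner:
        return 0.0
    return max(0.0, sum(edge) / len(edge) - sum(inner) / len(inner))


def block_sensitive_psnr(ref, test, peak=255.0, block=8):
    """PSNR computed on MSE plus the blocking penalty of the test image."""
    err = mse(ref, test) + blocking_penalty(test, block)
    return float("inf") if err == 0 else 10.0 * math.log10(peak ** 2 / err)


ref = [[100] * 16 for _ in range(16)]
blocky = [[100] * 8 + [110] * 8 for _ in range(16)]  # step at a block edge
print(psnr(ref, blocky), block_sensitive_psnr(ref, blocky))
```

An image whose error is concentrated at block boundaries scores lower under the block-sensitive index than under plain PSNR, while an image without boundary discontinuities scores the same under both.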

115 Random N-Finder (N-FINDR) Endmember Extraction Algorithms for Hyperspectral Imagery

The N-finder algorithm (N-FINDR) has been widely used in endmember extraction. When it comes to implementation, several

issues need to be addressed. One is determination of the number of endmembers, p, that N-FINDR is required to generate. Another is its

computational complexity resulting from an exhaustive search. A third one is its requirement of dimensionality reduction. A

fourth and probably the most critical issue is its use of random initial endmembers, which results in inconsistent final

endmember selection, so that results are not reproducible. This paper re-invents the wheel by re-designing the N-FINDR in

such a way that all the above-mentioned issues can be resolved while making the last issue an advantage. The idea is to

implement the N-FINDR as a random algorithm, called random N-FINDR (RN-FINDR) so that a single run using one set of

random initial endmembers is considered as one realization. If there is an endmember present in the data, it should appear

in any realization regardless of what random set of initial endmembers is used. In this case, the N-FINDR is terminated

when the intersection of all realizations produced by two consecutive runs of RN-FINDR remains the same, in which case

the number of endmembers, p, is then automatically determined by the intersection set without appealing to any criterion. In order to substantiate

the proposed RN-FINDR, custom-designed synthetic image experiments with complete knowledge are conducted for

validation and real image experiments are also performed to demonstrate its utility in applications.

116 Random Phase Textures: Theory and Synthesis

This paper explores the mathematical and algorithmic properties of two sample-based texture models: random phase noise

(RPN) and asymptotic discrete spot noise (ADSN). These models permit the synthesis of random phase textures. They

arguably derive from linearized versions of two early Julesz texture discrimination theories. The ensuing mathematical

analysis shows that, contrary to some statements in the literature, RPN and ADSN are different stochastic processes.

Nevertheless, numerous experiments also suggest that the textures obtained by these algorithms from identical samples

are perceptually similar. The relevance of this study is enhanced by three technical contributions providing solutions to

obstacles that prevented the use of RPN or ADSN to emulate textures. First, RPN and ADSN algorithms are extended to

color images. Second, a preprocessing step is proposed to avoid artifacts due to the nonperiodicity of real-world texture

samples. Finally, the method is extended to synthesize textures with arbitrary size from a given sample.
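
A random phase texture keeps the Fourier modulus of a sample and randomizes its phase. The sketch below shows this on a 1-D signal with a plain DFT; the 1-D setting, the function names, and the choice to leave the DC and Nyquist coefficients untouched are assumptions made for brevity, and the color, periodicity, and arbitrary-size extensions described in the abstract are not covered.

```python
import cmath
import math
import random


def dft(signal):
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]


def random_phase_noise(sample, rng):
    """Keep the Fourier modulus of a real `sample`, draw new random phases."""
    n = len(sample)
    coeffs = dft(sample)
    spectrum = list(coeffs)  # DC (and Nyquist, if n is even) stay unchanged
    for k in range(1, (n + 1) // 2):
        theta = rng.uniform(-math.pi, math.pi)
        spectrum[k] = abs(coeffs[k]) * cmath.exp(1j * theta)
        spectrum[n - k] = spectrum[k].conjugate()  # keeps the output real
    return [c.real for c in idft(spectrum)]


sample = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 2.0, 1.0]
texture = random_phase_noise(sample, random.Random(0))
print(texture)
```

The output has the same mean and the same power spectrum as the sample, but a different arrangement in the signal domain.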

117 Real-Time Discriminative Background Subtraction

The authors examine the problem of segmenting foreground objects in live video when background scene textures change

over time. In particular, we formulate background subtraction as minimizing a penalized instantaneous risk functional—

yielding a local online discriminative algorithm that can quickly adapt to temporal changes. We analyze the algorithm’s

convergence, discuss its robustness to nonstationarity, and provide an efficient nonlinear extension via sparse kernels. To

accommodate interactions among neighboring pixels, a global algorithm is then derived that explicitly distinguishes

objects versus background using maximum a posteriori inference in a Markov random field (implemented via graph-cuts).

By exploiting the parallel nature of the proposed algorithms, we develop an implementation that can run efficiently on the

highly parallel graphics processing unit (GPU). Empirical studies on a wide variety of datasets demonstrate that the

proposed approach achieves quality that is comparable to state-of-the-art offline methods, while still being suitable for real-

time video analysis.

118 Reference Sharing Mechanism for Watermark Self-Embedding

This paper proposes two novel self-embedding watermarking schemes based upon a reference sharing mechanism, in

which the watermark to be embedded is a reference derived from the original principal content in different regions and

shared by these regions for content restoration. After identifying tampered blocks, both the reference data and the original

content in the reserved area are used to recover the principal content in the tampered area. By using the first scheme, the

original data in five most significant bit layers of a cover image can be recovered and the original watermarked image can

also be retrieved when the content replacement is not too extensive. In the second scheme, the host content is

decomposed into three levels, and the reference sharing methods with different restoration capabilities are employed to

protect the data at different levels. Therefore, the lower the tampering rate, the more levels of content data are recovered,

and the better the quality of restored results.
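
The split between protected content and embedding space in the first scheme can be pictured with a bit-level sketch. It assumes 8-bit pixels and assumes that the watermark occupies the three least significant bits, which the abstract does not state; the names `embed` and `extract` are hypothetical, and the reference sharing computation itself is not shown.

```python
def embed(pixel, payload):
    """Keep the 5 most significant bits of a pixel; store 3 payload bits below."""
    if not 0 <= pixel <= 255 or not 0 <= payload <= 7:
        raise ValueError("pixel must be 8-bit and payload 3-bit")
    return (pixel & 0b11111000) | payload


def extract(pixel):
    """Return (value of the five MSB layers, 3-bit payload)."""
    return pixel & 0b11111000, pixel & 0b00000111


marked = embed(0b10110101, 0b010)
print(bin(marked), extract(marked))
```

Under this assumption, embedding changes any pixel value by at most 7 gray levels, and the five most significant bit layers are exactly what a restoration step would need to rebuild.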

119 Regularized Background Adaptation: A Novel Learning Rate Control Scheme for Gaussian Mixture Modeling

To model a scene for background subtraction, Gaussian mixture modeling (GMM) is a popular choice for its capability of

adaptation to background variations. However, GMM often suffers from a tradeoff between robustness to background

changes and sensitivity to foreground abnormalities and is inefficient in managing the tradeoff for various surveillance

scenarios. By reviewing the formulations of GMM, we identify that such a tradeoff can be easily controlled by adaptive

adjustments of the GMM’s learning rates for image pixels at different locations and of distinct properties. A new rate control

scheme based on high-level feedback is then developed to provide better regularization of background adaptation for GMM

and to help resolve the tradeoff. Additionally, to handle lighting variations that change too fast to be caught by GMM, a

heuristic rooted in frame difference is proposed to assist the proposed rate control scheme in reducing false foreground

alarms. Experiments show the proposed learning rate control scheme, together with the heuristic for adaptation to

over-quick lighting change, gives better performance than conventional GMM approaches.
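
The tradeoff governed by the learning rate can be made concrete with a single-Gaussian simplification. This is not the proposed rate control scheme: it tracks only one mean per pixel with an exponential running update, and the names `update_background` and `frames_to_absorb` and the intensity values are assumptions for illustration.

```python
def update_background(mean, pixel, rate):
    """Exponential running update of one Gaussian mean with learning rate `rate`."""
    return (1.0 - rate) * mean + rate * pixel


def frames_to_absorb(rate, old=50.0, new=200.0, tol=10.0):
    """Frames needed before the model mean is within `tol` of a new intensity."""
    mean, frames = old, 0
    while abs(new - mean) > tol:
        mean = update_background(mean, new, rate)
        frames += 1
    return frames


for rate in (0.005, 0.05, 0.5):
    print(rate, frames_to_absorb(rate))
```

A high rate absorbs a lighting change in a few frames but would absorb a stationary foreground object just as quickly; a low rate does the opposite. Choosing the rate per pixel from higher-level feedback is the lever the abstract describes.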

120 Resolution Scalable Image Coding With Reversible Cellular Automata

In a resolution scalable image coding algorithm, a multiresolution representation of the data is often obtained using a linear

filter bank. Reversible cellular automata have been recently proposed as simpler, nonlinear filter banks that produce a

similar representation. The original image is decomposed into four subbands, such that one of them retains most of the

features of the original image at a reduced scale. In this paper, we discuss the utilization of reversible cellular automata and

arithmetic coding for scalable compression of binary and grayscale images. In the binary case, the proposed algorithm that

uses simple local rules compares well with the JBIG compression standard, in particular for images where the foreground

is made of a simple connected region. For complex images, more efficient local rules based upon the lifting principle have

been designed. They provide compression performances very close to or even better than JBIG, depending upon the image

characteristics. In the grayscale case, and in particular for smooth images such as depth maps, the proposed algorithm

outperforms both the JBIG and the JPEG2000 standards under most coding conditions.

121 Robust Principal Component Analysis Based on Maximum Correntropy Criterion

Principal component analysis (PCA) minimizes the mean square error (MSE) and is sensitive to outliers. In this paper, we

present a new rotational-invariant PCA based on maximum correntropy criterion (MCC). A half-quadratic optimization

algorithm is adopted to compute the correntropy objective. At each iteration, the complex optimization problem is reduced

to a quadratic problem that can be efficiently solved by a standard optimization method. The proposed method exhibits the

following benefits: 1) it is robust to outliers through the mechanism of MCC which can be more theoretically solid than a

heuristic rule based on MSE; 2) it requires no assumption about the zero-mean of data for processing and can estimate data

mean during optimization; and 3) its optimal solution consists of principal eigenvectors of a robust covariance matrix

corresponding to the largest eigenvalues. In addition, kernel techniques are further introduced in the proposed method to

deal with nonlinearly distributed data. Numerical results demonstrate that the proposed method can outperform robust

rotational-invariant PCAs based on L1 norm when outliers occur.
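
The half-quadratic mechanism can be sketched for the simplest piece of the problem, estimating a data mean in the presence of outliers. The sketch is 1-D and does not compute principal components; the Gaussian kernel width `sigma`, the median initialization, and the name `correntropy_mean` are assumptions. Each iteration fixes the weights and then solves a weighted least-squares problem, which here is just a weighted mean.

```python
import math


def correntropy_mean(values, sigma=1.0, iterations=50):
    """Alternate between Gaussian reweighting and a weighted-mean update."""
    estimate = sorted(values)[len(values) // 2]  # start from the median
    for _ in range(iterations):
        weights = [math.exp(-((v - estimate) ** 2) / (2.0 * sigma ** 2))
                   for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate


data = [1.0, 1.2, 0.9, 1.1, 0.8, 50.0]  # five inliers and one gross outlier
plain = sum(data) / len(data)
robust = correntropy_mean(data)
print(plain, robust)
```

The outlier receives a weight close to zero, so the robust estimate stays near the inliers while the plain average is pulled far away, which is the behavior the MSE criterion cannot provide.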

122 Salient Motion Features for Video Quality Assessment

Design of algorithms that are able to estimate video quality as perceived by human observers is of interest for a number of

applications. Depending on the video content, the artifacts introduced by the coding process can be more or less

pronounced and diversely affect the quality of videos, as estimated by humans. While it is well understood that motion

affects both human attention and coding quality, this relationship has only recently started gaining attention among the

research community, when video quality assessment (VQA) is concerned. In this paper, the effect of calculating several

objective measure features, related to video coding artifacts, separately for salient motion and other regions of the frames

of the sequence is examined. In addition, we propose a new scheme for quality assessment of coded video streams, which

takes into account salient motion. A standardized procedure has been used to calculate the Mean Opinion Score (MOS),

based on experiments conducted with a group of non-expert observers viewing standard definition (SD) sequences. MOS

measurements were taken for nine different SD sequences, coded using MPEG-2 at five different bit-rates. Eighteen

different published approaches related to measuring the amount of coding artifacts objectively on a single-frame basis

were implemented. Additional features describing the intensity of salient motion in the frames, as well as the intensity of

coding artifacts in the salient motion regions were proposed. Automatic feature selection was performed to determine the

subset of features most correlated to video quality. The results show that salient-motion-related features enhance

prediction and indicate that the presence of blocking effect artifacts and blurring in the salient regions and variance and

intensity of temporal changes in non-salient regions influence the perceived video quality.

123 Size-Controllable Region-of-Interest in Scalable Image Representation

Differentiating the region-of-interest (ROI) from the non-ROI in an image in terms of relative size as well as fidelity becomes an

important functionality for future visual communication environments with a variety of display devices. In this paper, we

propose a scalable image representation with the ROI functionality in the spatial domain, which allows us to generate a

hierarchy of images with arbitrary sizes. The ROI functionality of our scalable representation is a result of a nonuniform grid

transformation in the spatial domain, where only the center of ROI and an expansion parameter are to be known. Our grid

transformation guarantees no loss of information within the area of ROI.

124 Spatial Sparsity-Induced Prediction (SIP) for Images and Video: A Simple Way to Reject Structured Interference

We propose a prediction technique that is geared toward forming successful estimates of a signal based on a correlated

anchor signal that is contaminated with complex interference. The corruption in the anchor signal involves intensity

modulations, linear distortions, structured interference, clutter, and noise, to name a few. The proposed setup reflects

nontrivial prediction scenarios involving images and video frames where statistically related data is rendered ineffective for

traditional methods due to cross-fades, blends, clutter, brightness variations, focus changes, and other complex

transitions. Rather than trying to solve a difficult estimation problem involving nonstationary signal statistics, we obtain

simple predictors in linear transform domain where the underlying signals are assumed to be sparse. We show that these

simple predictors achieve surprisingly good performance and seamlessly allow successful predictions even under

complicated cases. None of the interference parameters are estimated as our algorithm provides completely blind and

automated operation. We provide a general formulation that allows for nonlinearities in the prediction loop and we consider

prediction optimal decompositions. Beyond an extensive set of results on prediction and registration, the proposed method

is also implemented to operate inside a state-of-the-art compression codec and results show significant improvements on

scenes that are difficult to encode using traditional prediction techniques.

125 Spatiotemporal Localization and Categorization of Human Actions in Unsegmented Image Sequences

In this paper we address the problem of localization and recognition of human activities in unsegmented image sequences.

The main contribution of the proposed method is the use of an implicit representation of the spatiotemporal shape of the

activity which relies on the spatiotemporal localization of characteristic ensembles of feature descriptors. Evidence for the

spatiotemporal localization of the activity is accumulated in a probabilistic spatiotemporal voting scheme. The local nature

of the proposed voting framework allows us to deal with multiple activities taking place in the same scene, as well as with

activities in the presence of clutter and occlusion. We use boosting in order to select characteristic ensembles per class.

This leads to a set of class specific codebooks where each codeword is an ensemble of features. During training, we store

the spatial positions of the codeword ensembles with respect to a set of reference points, as well as their temporal

positions with respect to the start and end of the action instance. During testing, each activated codeword ensemble casts

votes concerning the spatiotemporal position and extent of the action, using the information that was stored during

training. Mean Shift mode estimation in the voting space provides the most probable hypotheses concerning the

localization of the subjects at each frame, as well as the extent of the activities depicted in the image sequences. We

present classification and localization results for a number of publicly available datasets, and for a number of sequences

where there is a significant amount of clutter and occlusion.
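
The mode-seeking step mentioned above can be sketched as follows. This is a generic Gaussian-kernel mean shift over weighted votes, assuming each vote is a point such as (x, y, t); the function name, the bandwidth, and the stopping rule are illustrative choices and are not taken from the paper.

```python
import numpy as np

def mean_shift_mode(votes, weights, start, bandwidth=1.0, iters=100, tol=1e-6):
    """Climb from `start` to a nearby mode of the weighted vote density."""
    votes = np.asarray(votes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    x = np.asarray(start, dtype=float)
    for _ in range(iters):
        d2 = ((votes - x) ** 2).sum(axis=1)                # squared distances to x
        k = weights * np.exp(-0.5 * d2 / bandwidth ** 2)   # Gaussian kernel weights
        new_x = (k[:, None] * votes).sum(axis=0) / k.sum() # weighted mean of votes
        if np.linalg.norm(new_x - x) < tol:
            return new_x
        x = new_x
    return x
```

Starting the iteration from an actual vote keeps the kernel weights away from numerical underflow; running it from several votes and keeping the distinct end points gives multiple hypotheses.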

126 Structured Max-Margin Learning for Inter-Related Classifier Training and Multilabel Image Annotation

In this paper, a structured max-margin learning algorithm is developed to achieve more effective training of a large number

of inter-related classifiers for multilabel image annotation application. To leverage multilabel images for classifier training,

each multilabel image is partitioned into a set of image instances (image regions or image patches) and an automatic

instance label identification algorithm is developed to assign multiple labels (which are given at the image level) to the most

relevant image instances. A K-way min-max cut algorithm is developed for automatic instance clustering and kernel weight

determination, where multiple base kernels are seamlessly combined to address the issue of huge intra-concept visual

diversity more effectively. Second, a visual concept network is constructed for characterizing the inter-concept visual

similarity contexts more precisely in the high-dimensional multimodal feature space. The visual concept network is used to

determine the inter-related learning tasks directly in the feature space rather than in the label space because feature space

is the common space for classifier training and image classification. Third, a parallel computing platform is developed to

achieve more effective learning of a large number of inter-related classifiers over the visual concept network. A structured

max-margin learning algorithm is developed by incorporating the visual concept network, max-margin Markov networks

and multitask learning to address the issue of huge inter-concept visual similarity more effectively. By leveraging the

inter-concept visual similarity contexts for inter-related classifier training, our structured max-margin learning algorithm can

significantly enhance the discrimination power of the inter-related classifiers. Our experiments have also obtained very

positive results for a large number of object classes and image concepts.

127 Studentized Dynamical System for Robust Object Tracking

This paper describes a studentized dynamical system (SDS) for robust target tracking using a subspace representation.

Dynamical systems (DS) provide a powerful framework for the probabilistic modeling of temporal sequences. Visual

tracking problems are often cast as a sequential inference problem within the DS framework and a compact way to model

the observation distributions (i.e., object appearances) is through probabilistic principal component analysis (PPCA). PPCA

is a classic Gaussian based subspace representation method and a popular tool for appearance modeling. Although

the Gaussian density has theoretically nice properties, resulting in models that are always tractable, such models are also severely

limited in practical settings. One of the central issues in the use of PPCA for target appearance modeling is that it is very

sensitive to outliers. The Gaussian density has a very light tail, while real world data with outliers exhibit heavy tails.

Recently, more heavy-tailed distributions (e.g., Student’s t-distribution) have been introduced to increase the robustness of

the original PPCA. We propose to augment the traditional target tracking DS by adding a set of auxiliary latent variables to

adjust the shape of the observation distribution. We show that by carefully choosing the probability density of these

auxiliary latent variables, a more robust observation distribution can be obtained with tails heavier than Gaussian.

Numerical experiments verify that the proposed SDS is better able to handle a considerable amount of outlier noise

and achieves improved tracking performance over a DS with a Gaussian-based observation model.
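
One standard way to obtain heavier-than-Gaussian tails from an auxiliary latent variable is the Gaussian scale mixture: drawing a precision scale u from a Gamma distribution and then a Gaussian with variance 1/u yields a Student's t sample. The sketch below shows that construction only; it assumes a scalar observation and is not claimed to be the exact latent-variable density chosen in the paper.

```python
import numpy as np

def sample_gaussian(n, rng):
    """n draws from a standard Gaussian."""
    return rng.standard_normal(n)

def sample_student_t(n, dof, rng):
    """n Student's t draws built as a Gaussian scale mixture.

    u ~ Gamma(shape=dof/2, rate=dof/2) is the auxiliary latent variable;
    conditioned on u, the sample is Gaussian with variance 1/u.
    """
    u = rng.gamma(shape=dof / 2.0, scale=2.0 / dof, size=n)
    return rng.standard_normal(n) / np.sqrt(u)

def tail_fraction(x, threshold):
    """Fraction of samples whose magnitude exceeds threshold."""
    return float(np.mean(np.abs(x) > threshold))
```

Comparing tail_fraction for the two samplers at a threshold of about 4 shows how much more mass the mixture places on outlier-sized values.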

128 Sub-Hexagonal Phase Correlation for Motion Estimation

We present a novel frequency-domain motion estimation technique, which operates on hexagonal images and employs the

hexagonal Fourier transform. Our method involves image sampling on a hexagonal lattice followed by a normalised

hexagonal cross-correlation in the frequency domain. The term subpixel (or subcell) is defined on a hexagonal grid in order

to achieve floating point registration. Experiments using both artificially induced motion and actual motion demonstrate

that the proposed method outperforms the state-of-the-art in frequency-domain motion estimation operating on a square

lattice, in the shape of phase correlation, in terms of subpixel accuracy for a range of test material and motion scenarios.
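
For reference, the square-lattice baseline named above (phase correlation) can be sketched in a few lines. This version recovers integer circular shifts only; the hexagonal sampling, the hexagonal Fourier transform, and the subpixel refinement of the paper are not reproduced, and the function name is an assumption.

```python
import numpy as np

def phase_correlation_shift(ref, moved):
    """Estimate the integer (row, col) shift that maps ref onto moved."""
    F_ref = np.fft.fft2(ref)
    F_mov = np.fft.fft2(moved)
    cross = F_mov * np.conj(F_ref)
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep phase only
    corr = np.fft.ifft2(cross).real             # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (circular wrap).
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))
```

Normalising the cross spectrum to unit magnitude is what separates phase correlation from plain cross-correlation: it whitens the spectrum so the peak stays sharp regardless of image content.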

129 Subpixel Registration With Gradient Correlation

We address the problem of subpixel registration of images assumed to be related by a pure translation. We present a

method which extends gradient correlation to achieve subpixel accuracy. Our scheme is based on modeling the dominant

singular vectors of the 2-D gradient correlation matrix with a generic kernel which we derive by studying the structure of

gradient correlation assuming natural image statistics. Our kernel has a parametric form which offers flexibility in modeling

the functions obtained from various types of image data. We estimate the kernel parameters, including the unknown

subpixel shifts, using the Levenberg-Marquardt algorithm. Experiments with LANDSAT and MRI data show that our scheme

outperforms recently proposed state-of-the-art phase correlation methods.

130 Text From Corners: A Novel Approach to Detect Text and Caption in Videos

Detecting text and caption from videos is important and in great demand for video retrieval, annotation, indexing, and

content analysis. In this paper, we present a corner based approach to detect text and caption from videos. This approach

is inspired by the observation that there exist dense and orderly presences of corner points in characters, especially in text

and caption. We use several discriminative features to describe the text regions formed by the corner points. These

features can be used flexibly and thus adapted to different applications. Language independence is an

important advantage of the proposed method. Moreover, based upon the text features, we further develop a novel algorithm

to detect moving captions in videos. In the algorithm, the motion features, extracted by optical flow, are combined with text

features to detect the moving caption patterns. The decision tree is adopted to learn the classification criteria. Experiments

conducted on a large volume of real video shots demonstrate the efficiency and robustness of our proposed approaches

and the real-world system. Our text and caption detection system was recently highlighted in a worldwide multimedia

retrieval competition, Star Challenge, achieving superior performance and the top ranking.

131 The Roadmaker’s Algorithm for the Discrete Pulse Transform

The discrete pulse transform (DPT) is a decomposition of an observed signal into a sum of pulses, i.e., signals that are

constant on a connected set and zero elsewhere. Originally developed for 1-D signal processing, the DPT has recently been

generalized to more dimensions. Applications in image processing are currently being investigated. The time required to

compute the DPT as originally defined via the successive application of LULU operators (members of a class of minimax

filters studied by Rohwer) has been a severe drawback to its applicability. This paper introduces a fast method for obtaining

such a decomposition, called the Roadmaker’s algorithm because it involves filling pits and razing bumps. It acts

selectively only on those features actually present in the signal, flattening them in order of increasing size by subtracting an

appropriate positive or negative pulse, which is then appended to the decomposition. The implementation described here

covers 1-D signal processing as well as 2-D and 3-D image processing in a single framework. This is achieved by considering the

signal or image as a function defined on a graph, with the geometry specified by the edges of the graph. Whenever a feature

is flattened, nodes in the graph are merged, until eventually only one node remains. At that stage, a new set of edges for the

same nodes as the graph, forming a tree structure, defines the obtained decomposition. The Roadmaker’s algorithm is

shown to be equivalent to the DPT in the sense of obtaining the same decomposition. However, its simpler operators are

not in general equivalent to the LULU operators in situations where those operators are not applied successively. A

by-product of the Roadmaker’s algorithm is that it yields a proof of the so-called Highlight Conjecture, stated as an open

problem in 2006. We pay particular attention to algorithmic details and complexity, including a demonstration that in the 1-D

case, and also in the case of a complete graph, the Roadmaker’s algorithm has optimal complexity: it runs in time O(Q),

where Q is the number of arcs in the graph.
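
To make the notion of a pulse concrete, the sketch below implements the two smallest 1-D LULU operators, L_1 and U_1, which remove upward and downward pulses of width one. This is the slow, successive-operator route that the abstract describes as a drawback, not the Roadmaker's algorithm itself. The function names and the replicated-edge boundary handling are assumptions.

```python
import numpy as np

def lulu_L1(x):
    """L_1: max over the two length-2 windows containing i of the window minimum.
    Removes upward pulses (bumps) of width 1."""
    p = np.pad(np.asarray(x, dtype=float), 1, mode="edge")
    left = np.minimum(p[:-2], p[1:-1])     # min(x[i-1], x[i])
    right = np.minimum(p[1:-1], p[2:])     # min(x[i], x[i+1])
    return np.maximum(left, right)

def lulu_U1(x):
    """U_1: min over the two length-2 windows containing i of the window maximum.
    Removes downward pulses (pits) of width 1."""
    p = np.pad(np.asarray(x, dtype=float), 1, mode="edge")
    left = np.maximum(p[:-2], p[1:-1])     # max(x[i-1], x[i])
    right = np.maximum(p[1:-1], p[2:])     # max(x[i], x[i+1])
    return np.minimum(left, right)
```

For x = [0, 0, 5, 0, 0, -3, 0, 0], the difference x - lulu_L1(x) is the width-one bump of height 5, and lulu_L1(x) - lulu_U1(lulu_L1(x)) is the width-one pit of depth 3. Repeating this with wider windows peels off progressively larger pulses.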

132 The Sparse Matrix Transform for Covariance Estimation and Analysis of High Dimensional Signals

Covariance estimation for high dimensional signals is a classically difficult problem in statistical signal analysis and

machine learning. In this paper, we propose a maximum likelihood (ML) approach to covariance estimation, which employs

a novel non-linear sparsity constraint. More specifically, the covariance is constrained to have an eigen decomposition

which can be represented as a sparse matrix transform (SMT). The SMT is formed by a product of pairwise coordinate

rotations known as Givens rotations. Using this framework, the covariance can be efficiently estimated using greedy

optimization of the log-likelihood function, and the number of Givens rotations can be efficiently computed using a

cross-validation procedure. The resulting estimator is generally positive definite and well-conditioned, even when the sample size

is limited. Experiments on a combination of simulated data, standard hyperspectral data, and face image sets show that the

SMT-based covariance estimates are consistently more accurate than both traditional shrinkage estimates and recently

proposed graphical lasso estimates for a variety of different classes and sample sizes. An important property of the new

covariance estimate is that it naturally yields a fast implementation of the estimated eigen-transformation using the SMT

representation. In fact, the SMT can be viewed as a generalization of the classical fast Fourier transform (FFT) in that it uses

“butterflies” to represent an orthonormal transform. However, unlike the FFT, the SMT can be used for fast eigen-signal

analysis of general non-stationary signals.
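
The "butterfly" structure can be illustrated with a short sketch: each Givens rotation touches only two coordinates, so a transform built from K rotations is applied to a vector in O(K) operations and is orthonormal by construction. The helper names are hypothetical, and the greedy selection of rotation pairs and angles described in the abstract is not shown.

```python
import numpy as np

def apply_givens(x, i, j, theta):
    """Rotate coordinates i and j of x by angle theta; other entries are unchanged."""
    y = np.array(x, dtype=float)
    c, s = np.cos(theta), np.sin(theta)
    xi, xj = y[i], y[j]
    y[i] = c * xi - s * xj
    y[j] = s * xi + c * xj
    return y

def smt_apply(x, rotations):
    """Apply a sequence of (i, j, theta) Givens rotations to x, in order."""
    y = np.array(x, dtype=float)
    for i, j, theta in rotations:
        y = apply_givens(y, i, j, theta)
    return y

def smt_dense(n, rotations):
    """Dense n x n matrix of the same transform (for checking only)."""
    eye = np.eye(n)
    return np.column_stack([smt_apply(eye[:, k], rotations) for k in range(n)])
```

Building the dense matrix defeats the purpose in practice; it is included only so the fast path can be checked against an ordinary matrix product.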

133 Tomographic Reconstruction of Gated Data Acquisition Using DFT Basis Functions

In image reconstruction gated acquisition is often used in order to deal with blur caused by organ motion in the resulting

images. However, this is achieved almost inevitably at the expense of reduced signal-to-noise ratio in the acquired data. In

this work, we propose a reconstruction procedure for gated images based upon use of discrete Fourier transform (DFT)

basis functions, wherein the temporal activity at each spatial location is regulated by a Fourier representation. The gated

images are then reconstructed through determination of the coefficients of the Fourier representation. We demonstrate this

approach in the context of single photon emission computed tomography (SPECT) for cardiac imaging, which is often

hampered by the increased noise due to gating and other degrading factors. We explore two different reconstruction

algorithms: a penalized least-squares approach and a maximum a posteriori approach. In our experiments,

we conducted a quantitative evaluation of the proposed approach using Monte Carlo simulated SPECT imaging. The results

demonstrate that use of DFT-basis functions in gated imaging can improve the accuracy of the reconstruction. As a

preliminary demonstration, we also tested this approach on a set of clinical acquisitions.
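
The idea of regulating temporal activity with a few Fourier coefficients can be sketched on already-reconstructed gate values: keep the DC term and the first few harmonics of each time-activity curve and discard the rest. In the paper the coefficients are determined inside the reconstruction itself, so the post hoc fit below, its function name, and the choice of harmonics are simplifying assumptions.

```python
import numpy as np

def fit_low_order_fourier(curve, n_harmonics):
    """Project time-activity curves (last axis = gate index) onto the DC term
    plus the first n_harmonics DFT harmonics."""
    curve = np.asarray(curve, dtype=float)
    coeffs = np.fft.rfft(curve, axis=-1)
    coeffs[..., n_harmonics + 1:] = 0.0          # drop higher harmonics
    return np.fft.irfft(coeffs, n=curve.shape[-1], axis=-1)
```

Because this is an orthogonal projection, any noise component outside the retained harmonics is removed, while a curve that already lies in their span is returned unchanged.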

134 Topological Well-Composedness and Glamorous Glue: A Digital Gluing Algorithm for Topologically Constrained Front Propagation

We propose a new approach to front propagation algorithms based on a topological variant of well-composedness which

contrasts with previous methods based on simple point detection. This provides for a theoretical justification, based on the

digital Jordan separation theorem, for digitally “gluing” evolved well-composed objects separated by well-composed

curves or surfaces. Additionally, our framework can be extended to more relaxed topologically constrained algorithms

based on multisimple points. For both methods this framework has the additional benefit of obviating the requirement for

both a user-specified connectivity and a topologically consistent marching cubes/squares algorithm in meshing the

resulting segmentation.

135 Total Variation Projection With First Order Schemes

This article proposes a new algorithm to compute the projection on the set of images whose total variation is bounded by a

constant. The projection is computed through a dual formulation that is solved by first order non-smooth optimization

methods. This yields an iterative algorithm that applies iterative soft thresholding to the dual vector field, and for which we

establish convergence rate on the primal iterates. This projection algorithm can then be used as a building block in a

variety of applications such as solving inverse problems under a total variation constraint, or for texture synthesis.

Numerical results are reported to illustrate the usefulness and potential applicability of our TV projection algorithm on

various examples including denoising, texture synthesis, inpainting, deconvolution and tomography problems. We also

show that our projection algorithm competes favorably with state-of-the-art TV projection methods in terms of convergence

speed.
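
Two building blocks named in this abstract are easy to state in code: the soft-thresholding operator and a discrete total variation. The sketch below uses the scalar form of soft thresholding and an isotropic forward-difference TV with a zero difference at the last row and column; both are common conventions assumed here, and the dual projection iteration itself is not reproduced.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries with |v| <= t become 0."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def total_variation(img):
    """Isotropic discrete TV: sum over pixels of the gradient magnitude."""
    img = np.asarray(img, dtype=float)
    dx = np.diff(img, axis=1, append=img[:, -1:])   # horizontal forward differences
    dy = np.diff(img, axis=0, append=img[-1:, :])   # vertical forward differences
    return float(np.sqrt(dx ** 2 + dy ** 2).sum())
```

An image lies in the constraint set when total_variation(img) is at most the chosen bound; the projection algorithm is what finds the closest such image when it is not.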

136 Transferring Boosted Detectors Towards Viewpoint and Scene Adaptiveness

In object detection, disparities in distributions between the training samples and the test ones are often inevitable, resulting

in degraded performance for application scenarios. In this paper, we focus on the disparities caused by viewpoint and

scene changes and propose an efficient solution to these particular cases by adapting generic detectors, assuming

boosting style. A pretrained boosting-style detector encodes a priori knowledge in the form of selected features and weak

classifier weighting. Towards adaptiveness, the selected features are shifted to the most discriminative locations and

scales to compensate for the possible appearance variations. Moreover, the weighting coefficients are further adapted with

covariate boost, which maximally utilizes the related training data to enrich the limited new examples. Extensive

experiments validate the proposed adaptation mechanism towards viewpoint and scene adaptiveness and show

encouraging improvement on detection accuracy over state-of-the-art methods.

137 Unequal Protection of Video Data According to Slice Relevance

In this paper, we devise a procedure that mimics the behavior of a progressive video stream starting from a

non-progressive one such as H.264/AVC encoded video. This allows one to unequally protect the video data in an efficient way,

according to their importance and the network state. The reported results demonstrate the superior performance of the

proposed approach in comparison to state-of-the-art methods for resilient transmission of H.264/AVC data. Moreover, the

flexibility in terms of redundancy insertion and achieved quality levels allows one to span different applications, possibly

including P2P video streaming.

138 Uniform Motion Blur in Poissonian Noise: Blur/Noise Tradeoff

In this paper we consider the restoration of images corrupted by both uniform motion blur and Poissonian noise. We

formulate an image formation model that explicitly takes into account the length of the blur point-spread function and the

noise level as functions of the exposure time. Further, we present an analysis of the achievable restoration performance by

showing how the root mean squared error varies with respect to the exposure time. It turns out that the worst situations are

represented by either too short or too long exposure times. In between there exists an optimal exposure time that

maximizes the restoration performance, balancing the amount of blur and noise in the observation. We justify this result

through a mathematical analysis of the signal-to-noise ratio in Fourier domain; this study is then validated by deblurring

synthetic data as well as camera raw data.
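
The tradeoff can be explored with a toy 1-D simulation. The sketch assumes that the blur length grows linearly with exposure time (constant speed) and that the expected photon count also grows linearly with exposure time; the abstract states only that both depend on exposure time, so these linear laws, the parameter names, and the function names are assumptions.

```python
import numpy as np

def uniform_motion_psf(exposure, speed=1.0):
    """Box-shaped PSF whose length (in samples) is proportional to exposure time."""
    length = max(1, int(round(speed * exposure)))
    return np.full(length, 1.0 / length)

def simulate_observation(signal, exposure, rng, speed=1.0, flux=1.0):
    """Blur the signal, scale it by the collected light, and draw Poisson counts.
    Returns (noisy_counts, expected_counts)."""
    psf = uniform_motion_psf(exposure, speed)
    blurred = np.convolve(np.asarray(signal, dtype=float), psf, mode="same")
    expected = flux * exposure * blurred
    return rng.poisson(expected), expected
```

Short exposures give a narrow PSF but few counts, so the relative Poisson noise is high; long exposures give many counts but a wide PSF. Sweeping the exposure value and deblurring each observation is one way to reproduce the qualitative behavior described above.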

139 Variable Length Open Contour Tracking Using a Deformable Trellis

This paper focuses on contour tracking, an important problem in computer vision, and specifically on open contours that

often directly represent a curvilinear object. Compelling applications are found in the field of bioimage analysis where

blood vessels, dendrites, and various other biological structures are tracked over time. General open contour tracking, and

biological images in particular, pose major challenges including scene clutter with similar structures (e.g., in the cell), and

time varying contour length due to natural growth and shortening phenomena, which have not been adequately answered

by earlier approaches based on closed and fixed end-point contours. We propose a model-based estimation algorithm to

track open contours of time-varying length, which is robust to neighborhood clutter with similar structures. The method

employs a deformable trellis in conjunction with a probabilistic (hidden Markov) model to estimate contour position,

deformation, growth and shortening. It generates a maximum a posteriori estimate given observations in the current frame

and prior contour information from previous frames. Experimental results on synthetic and real-world data demonstrate the

effectiveness and performance gains of the proposed algorithm.
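
A maximum a posteriori path through a trellis under a hidden Markov model is typically found with the Viterbi recursion. The generic sketch below assumes one trellis column per contour position and one state per candidate offset; the function name and the score layout are illustrative, and the deformable trellis construction and the growth and shortening model of the paper are not reproduced.

```python
import numpy as np

def viterbi(log_init, log_trans, log_obs):
    """Most probable state path.

    log_init:  (S,)   log prior score of each state in the first column
    log_trans: (S, S) log score of moving from state i (row) to state j (column)
    log_obs:   (T, S) log observation score of each state in each column
    """
    T, S = log_obs.shape
    score = log_init + log_obs[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans            # cand[i, j]: best path ending i -> j
        back[t] = np.argmax(cand, axis=0)            # best predecessor of each state j
        score = cand[back[t], np.arange(S)] + log_obs[t]
    path = np.zeros(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):                    # trace the back pointers
        path[t - 1] = back[t, path[t]]
    return path
```

Transition scores that penalise large jumps between neighboring columns act as a smoothness prior on the contour, and the observation scores carry the image evidence.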

140 Variational Bayesian Super Resolution

In this paper, we address the super resolution (SR) problem from a set of degraded low resolution (LR) images to obtain a

high resolution (HR) image. Accurate estimation of the sub-pixel motion between the LR images significantly affects the

performance of the reconstructed HR image. In this paper, we propose novel super resolution methods where the HR image and

the motion parameters are estimated simultaneously. Utilizing a Bayesian formulation, we model the unknown HR image,

the acquisition process, the motion parameters and the unknown model parameters in a stochastic sense. Employing a

variational Bayesian analysis, we develop two novel algorithms which jointly estimate the distributions of all unknowns.

The proposed framework has the following advantages: 1) Through the incorporation of uncertainty of the estimates, the

algorithms prevent the propagation of errors between the estimates of the various unknowns; 2) the algorithms are robust

to errors in the estimation of the motion parameters; and 3) using a fully Bayesian formulation, the developed algorithms

simultaneously estimate all algorithmic parameters along with the HR image and motion parameters, and therefore they are

fully-automated and do not require parameter tuning. We also show that the proposed motion estimation method is a

stochastic generalization of the classical Lucas-Kanade registration algorithm. Experimental results demonstrate that the

proposed approaches are very effective and compare favorably to state-of-the-art SR algorithms.

141 ViBe: A Universal Background Subtraction Algorithm for Video Sequences

This paper presents a technique for motion detection that incorporates several innovative mechanisms. For example, our

proposed technique stores, for each pixel, a set of values taken in the past at the same location or in the neighborhood. It

then compares this set to the current pixel value in order to determine whether that pixel belongs to the background, and

adapts the model by choosing randomly which values to substitute from the background model. This approach differs from

those based upon the classical belief that the oldest values should be replaced first. Finally, when the pixel is found to be

part of the background, its value is propagated into the background model of a neighboring pixel. We describe our method

in full detail (including pseudo-code and the parameter values used) and compare it to other background subtraction

techniques. Efficiency figures show that our method outperforms recent and proven state-of-the-art methods in terms of

both computation speed and detection rate. We also analyze the performance of a version of our algorithm downscaled to

the absolute minimum of one comparison and one byte of memory per pixel. It appears that even such a simplified version

of our algorithm performs better than mainstream techniques.
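
The three mechanisms described in this abstract (a per-pixel sample set, random replacement, and propagation to a neighbor) can be sketched for grayscale frames as follows. This is not the authors' pseudo-code: the function names are hypothetical, and the default values (matching radius 20, 2 required matches, update probability 1/16) are assumptions that should be checked against the paper.

```python
import numpy as np

def classify_background(samples, frame, radius=20, min_matches=2):
    """samples: (N, H, W) uint8 model; frame: (H, W) uint8.
    A pixel is background if at least min_matches stored samples lie within radius."""
    diff = np.abs(samples.astype(np.int32) - frame.astype(np.int32)[None])
    return (diff < radius).sum(axis=0) >= min_matches

def update_model(samples, frame, is_background, rng, subsample=16):
    """Update the model in place and return it."""
    n, h, w = samples.shape
    # Random replacement: a background pixel overwrites a randomly chosen
    # sample of its own model with probability 1/subsample.
    chosen = is_background & (rng.integers(0, subsample, size=(h, w)) == 0)
    ys, xs = np.nonzero(chosen)
    samples[rng.integers(0, n, size=ys.size), ys, xs] = frame[ys, xs]
    # Spatial propagation: with the same probability, the value is also written
    # into the model of a random pixel in the 3x3 neighborhood (the random
    # offset may be zero, in which case the pixel updates itself again).
    chosen = is_background & (rng.integers(0, subsample, size=(h, w)) == 0)
    ys, xs = np.nonzero(chosen)
    ny = np.clip(ys + rng.integers(-1, 2, size=ys.size), 0, h - 1)
    nx = np.clip(xs + rng.integers(-1, 2, size=xs.size), 0, w - 1)
    samples[rng.integers(0, n, size=ys.size), ny, nx] = frame[ys, xs]
    return samples
```

Only pixels classified as background ever write into the model, so foreground values do not contaminate it directly; they can be absorbed only after neighboring background values have propagated in.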

142 Video Tracking Based on Sequential Particle Filtering on Graphs

In this paper, we develop a novel solution for particle filtering on general graphs. We provide an exact solution for particle

filtering on directed cycle-free graphs. The proposed approach relies on a partial-order relation in an antichain

decomposition that forms a high-order Markov chain over the partitioned graph. We subsequently derive a closed-form

sequential updating scheme for conditional density propagation using particle filtering on directed cycle-free graphs. We

also provide an approximate solution for particle filtering on general graphs by splitting graphs with cycles into multiple

directed cycle-free subgraphs. We then use the sequential updating scheme by alternating among the directed cycle-free

subgraphs to obtain an estimate of the density propagation. We rely on the proposed method for particle filtering on general

graphs for two video tracking applications: 1) object tracking using high-order Markov chains; and 2) distributed multiple

object tracking based on multi-object graphical interaction models. Experimental results demonstrate the improved

performance of the proposed approach to particle filtering on graphs compared with existing methods for video tracking.

143 Window-Level Rate Control for Smooth Picture Quality and Smooth Buffer Occupancy

In rate control, smooth picture quality and smooth buffer occupancy are both important but contrary to each other at a

given bit rate. How to achieve a good tradeoff between them has received little attention previously. To deal with this

problem, a theoretical window model is proposed in this paper, in which several adjacent frames grouped as a window are

considered together. The smoothness of both picture quality and buffer occupancy can be gracefully achieved by

regulating the size of the window. To illustrate the usage of the window model, a window-level rate control algorithm

combined with the traditional Q-domain rate-distortion model is further introduced. In experiments, we first show how the

proposed window model achieves the tradeoff between picture quality smoothness and buffer smoothness, and then

demonstrate the significant PSNR improvement, accuracy of bit control and consistency of visual quality of the proposed

window-level rate control algorithm.
