Random Forests-Based 2D-to-3D
Video Conversion
Authors: Mahsa T. Pourazad, Panos Nasiopoulos,
Ali Bashashati; University of British Columbia
Presenter: Prabhat Kumar Tiwary
Outline:
I. Introduction
II. Need to use multi-monocular depth cues
III. The proposed scheme
IV. Steps involved
V. Experimental Results
VI. Conclusion
VII. References
I. Introduction
To meet the initial demand for 3D video content,
it is not realistic to rely only on the production
of new 3D videos.
Solution: convert existing popular 2D movies
and documentaries into 3D format.
Process: depth-image-based rendering (DIBR).
II. Need to use multi-monocular
depth cues
The human visual depth perception mechanism relies
on more than binocular parallax:
it also uses monocular depth cues such as light and
shade, relative size, motion parallax, interposition
(partial occlusion), texture variation, haze
(atmospheric scattering) and edge information
(perspective).
These monocular cues can be extracted from
consecutive video frames.
A single depth cue alone is not sufficient.
III. The proposed scheme
Effective 2D to 3D video conversion using multiple
monocular depth cues.
Use of RF machine learning algorithm.
Use of training data sets.
Random Forest (RF): a classification and regression
technique consisting of a collection of individual
Decision Trees (DTs), each constructed from a
random selection of input feature vectors.
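The idea on this slide can be illustrated with a toy sketch in plain Python (not the authors' implementation): each "tree" here is just a depth-1 regression stump fitted on a bootstrap resample and one randomly chosen feature, and the forest averages the stump predictions.

```python
import random

def train_stump(data, feature_idx):
    """Fit a depth-1 regression tree: split at the median of one
    feature, predict the mean target value on each side."""
    xs = sorted(x[feature_idx] for x, _ in data)
    split = xs[len(xs) // 2]
    left = [y for x, y in data if x[feature_idx] <= split]
    right = [y for x, y in data if x[feature_idx] > split]
    left_mean = sum(left) / len(left)            # left is never empty
    right_mean = sum(right) / len(right) if right else left_mean
    return (feature_idx, split, left_mean, right_mean)

def predict_stump(stump, x):
    f, split, lo, hi = stump
    return lo if x[f] <= split else hi

def train_forest(data, n_trees, n_features, seed=0):
    """Each 'tree' sees a bootstrap resample of the training data
    and one randomly selected feature; this randomization is what
    decorrelates the ensemble members."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]   # bootstrap resample
        feat = rng.randrange(n_features)            # random feature selection
        forest.append(train_stump(sample, feat))
    return forest

def predict_forest(forest, x):
    # regression output: average the individual tree predictions
    return sum(predict_stump(t, x) for t in forest) / len(forest)
```

Real random forests grow full CART trees and randomize the feature subset at every split; the stump version above only shows the bagging-plus-averaging structure.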
IV. Steps involved
In our proposed scheme, we use video
sequences with available depth information to
train the RF model.
The trained RF model is then used to estimate the
depth map information of unknown 2D video
sequences based on their extracted monocular
depth cues.
Steps :
A) Extracting Features Representing Depth Cues
1. Motion parallax
2. Texture variation
3. Haze
4. Perspective (edge information)
5. Vertical spatial coordinate
6. Sharpness
7. Occlusion (interposition)
B) RF Model Estimation
A. Extracting Features
Representing Depth Cues
1) Motion parallax:
near objects move faster across the retina than
further objects do.
Software used: DERS (Depth Estimation Reference Software)
Fig. 1: Arrangement of 20 video frames in three different sequences for
estimating disparity over time
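The authors estimate disparity over time with DERS; as a far simpler stand-in (a hypothetical proxy, not the paper's method), a per-pixel absolute frame difference gives a crude motion-magnitude map:

```python
def motion_cue(prev_frame, next_frame):
    """Per-pixel absolute intensity difference between consecutive
    grayscale frames (2D lists): larger values suggest faster apparent
    motion and, under motion parallax, a nearer object."""
    return [[abs(a - b) for a, b in zip(row_p, row_n)]
            for row_p, row_n in zip(prev_frame, next_frame)]
```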
2) Texture variation:
the surface texture of a material (such as
fabric or wood) is more apparent when it is closer.
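One simple way to quantify this cue (an illustrative choice, not necessarily the authors') is the local intensity variance in a small window:

```python
def texture_cue(img, y, x, r=1):
    """Local intensity variance in a (2r+1) x (2r+1) window around
    (y, x) of a grayscale image (2D list): a crude measure of visible
    texture detail, which is stronger for nearer surfaces."""
    vals = [img[j][i]
            for j in range(max(0, y - r), min(len(img), y + r + 1))
            for i in range(max(0, x - r), min(len(img[0]), x + r + 1))]
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)
```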
3) Haze:
haze occurs when the direction and power of
light propagating through the atmosphere are
altered by the scattering of radiation off small
particles in the atmosphere.
Distant objects appear less distinct
and more bluish than nearby objects.
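Since scattering makes distant regions look more bluish, one crude proxy (an assumption for illustration, not the paper's feature) is to score how much the blue channel dominates the other two:

```python
def haze_cue(pixel):
    """Crude haze proxy for an (r, g, b) pixel: atmospheric scattering
    shifts distant regions toward blue, so score blue dominance.
    Higher values hint at greater distance."""
    r, g, b = pixel
    return b - (r + g) / 2.0
```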
4) Perspective (edge information):
the more the lines converge, the farther away
they appear to be.
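Perspective information is usually derived from an edge map; a minimal sketch (assumed simple finite differences rather than whatever edge detector the authors used) computes the gradient magnitude at an interior pixel:

```python
def edge_cue(img, y, x):
    """Edge strength at interior pixel (y, x) of a grayscale image
    (2D list) from central finite differences; strong, converging
    edges in the resulting map indicate perspective lines."""
    gx = img[y][x + 1] - img[y][x - 1]   # horizontal gradient
    gy = img[y + 1][x] - img[y - 1][x]   # vertical gradient
    return (gx * gx + gy * gy) ** 0.5
```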
5) Vertical spatial coordinate:
In practice, movie content is recorded such that
objects closer to the bottom border of the camera
image are closer to the viewer.
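This cue reduces to the pixel's normalized vertical position (in image coordinates, where larger y is nearer the bottom border):

```python
def vertical_cue(y, height):
    """Normalized vertical position in [0, 1]: pixels nearer the
    bottom border of the frame score higher, i.e. are assumed
    closer to the viewer."""
    return y / (height - 1)
```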
6) Sharpness:
the closer objects appear sharper.
Use of the diagonal Laplacian method.
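The slide names the diagonal Laplacian method without detail; one common 8-neighbour ("diagonal") Laplacian kernel, assumed here for illustration, responds strongly on in-focus detail:

```python
# 8-neighbour Laplacian kernel, including the diagonal neighbours.
KERNEL = [[1,  1, 1],
          [1, -8, 1],
          [1,  1, 1]]

def sharpness_cue(img, y, x):
    """Absolute diagonal-Laplacian response at an interior pixel of a
    grayscale image (2D list): larger values indicate sharper, often
    nearer, content."""
    acc = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += KERNEL[dy + 1][dx + 1] * img[y + dy][x + dx]
    return abs(acc)
```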
7) Occlusion (intreposition):
an object that overlaps or partly obscures our
view of another object is considered to be
closer.
B. RF Model Estimation
Use of training sets with feature vectors.
Application of the mean-shift image segmentation
algorithm.
Once the RF model is trained, the depth information
of unseen video sequences is predicted based on
monocular depth cues.
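The slides do not say how the mean-shift segmentation is used; a plausible role (an assumption here, not confirmed by the source) is to keep the predicted depth consistent within each segmented region by averaging per segment:

```python
def smooth_depth_by_segment(depth, labels):
    """Replace each pixel's predicted depth with the mean depth of its
    segment, so depth stays consistent inside an object region.
    `depth` is a 2D list of per-pixel predictions; `labels` is a
    same-shaped per-pixel segment-id map (e.g. from mean-shift)."""
    sums, counts = {}, {}
    for drow, lrow in zip(depth, labels):
        for d, l in zip(drow, lrow):
            sums[l] = sums.get(l, 0.0) + d
            counts[l] = counts.get(l, 0) + 1
    means = {l: sums[l] / counts[l] for l in sums}
    return [[means[l] for l in lrow] for lrow in labels]
```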
V. Experimental Results
VI. Conclusion
A new, fast and efficient method that
approximates the depth map of a 2D video
sequence using Random Forest regression.
It incorporates several monocular depth cues to
produce an accurate depth estimate.
The experimental results show that this approach
outperforms the motion-parallax-based depth map
estimation technique by providing very realistic
depth information of a scene.
VII. References
[1] W. J. Tam, A. Soung Yee, J. Ferreira, S. Tariq, F. Speranza,
"Stereoscopic image rendering based on depth maps created from blur
and edge information," Proc. of Stereoscopic Displays and Applications
XII, Vol. 5664, pp. 104-115, 2005.
[2] S. H. Lai, C. W. Fu, S. Chang, "A generalized depth estimation
algorithm with a single image," PAMI, Vol. 14(4), pp. 405-411, 1992.
[3] D. Kim, D. Min, and K. Sohn, "Stereoscopic video generation method
using motion analysis," Proc. of 3DTV Con., pp. 1-4, 2007.
[4] M. T. Pourazad, P. Nasiopoulos, and R. K. Ward, "Generating the Depth
Map from the Motion Information of H.264-Encoded 2D Video
Sequence," EURASIP Journal on Image and Video Processing, vol.
2010, Article ID 108584, 13 pages, 2010.
[5] L. Breiman, "Random Forests," Machine Learning, 45, pp.
5-32, 2001.
Thank You