
B. Gunsel et al. (Eds.): MRCS 2006, LNCS 4105, pp. 562 – 569, 2006. © Springer-Verlag Berlin Heidelberg 2006

Accelerating Depth Image-Based Rendering Using GPU

Man Hee Lee and In Kyu Park

School of Information and Communication Engineering, Inha University 253 Yonghyun dong, Nam-gu, INCHEON 402-751, Korea [email protected], [email protected]

Abstract. In this paper, we propose a practical method for hardware-accelerated rendering of depth image-based representation (DIBR) objects, which are defined in the MPEG-4 Animation Framework eXtension (AFX). The proposed method overcomes the drawbacks of conventional rendering, namely that it is slow, since it is rarely assisted by graphics hardware, and that surface lighting is static. Utilizing the new features of the modern graphics processing unit (GPU) and programmable shader support, we develop an efficient hardware-accelerated rendering algorithm for depth image-based 3D objects. Surface shading in response to varying illumination is performed inside the vertex shader, while adaptive point splatting is performed inside the fragment shader. Experimental results show that the rendering speed increases considerably compared with software-based rendering and the conventional OpenGL-based rendering method.

1 Introduction

Image-based rendering has received wide interest from computer vision and graphics researchers since it can represent complex 3D models with just a few reference images, which are inherently photorealistic. When the camera position changes, the novel view can be synthesized using the reference images along with their camera information, without reconstructing the actual 3D shape [2]. Therefore, compared with conventional polygon-based modeling, it is a simple and efficient way to represent photorealistic 3D models.

However, image-based rendering also has clear drawbacks. One of them is the lack of hardware acceleration. Since image-based rendering does not use the conventional polygon-filling rasterization, rendering its primitives, i.e. point splatting, must be done in software. In image-based rendering, as in more general point-based rendering [5], splat size optimization and splat rendering are the most common and basic operations. Another disadvantage is that lighting in image-based rendering is mostly static; therefore the novel view usually appears artificial, since there is no change in surface shading, specular reflection, or shadow.

On the other hand, recent advances in modern GPUs have shown two directions of evolution. One is to increase raw processing power; for example, a GeForce FX 5900 GPU performs 20 Giga FLOPS, while a 3GHz Pentium 4 CPU performs only 6 Giga FLOPS. The other, more recent, direction is to allow user programmability of internal GPU functionality, known as the vertex and fragment (pixel) shaders. Using shaders, it is possible to customize the functionality of the graphics pipeline in the GPU. For example, we can change the type of projective transformation, the illumination models, and the texture mapping methods, allowing more general and photorealistic effects which the conventional fixed pipeline cannot handle.

Fig. 1. Reference images (surrounding) and a result of novel view synthesis (center) of a DIBR model (SimpleTexture model)

In this paper, we propose a practical method of accelerating image-based rendering using the GPU and shaders. Although any particular image-based rendering algorithm could be accelerated, our target is the one used in an international standard, i.e. the SimpleTexture format in the depth image-based representation (DIBR) of MPEG-4 AFX (Animation Framework eXtension) [1][8]. The SimpleTexture format is a set of reference images covering the visible surfaces of the 3D object. Similar to the relief texture [3], each reference image consists of a texture (color) image and the corresponding depth map, in which each pixel value denotes the distance from the image plane to the surface of the object. An example of a 3D model in SimpleTexture format is shown in Figure 1. From now on, we will use the term DIBR to denote the SimpleTexture format. The proposed method can be extended and applied to other image-based rendering algorithms without significant modification.

This paper is organized as follows. In Section 2, we start with a brief review of previous related work. In Section 3, we describe the proposed method in detail. Experimental results are shown in Section 4. Finally, we give concluding remarks in Section 5.

2 Previous Work

The DIBR format stores the regular layout of a model's 3D points in separate depth and texture images [1][8]. This representation is similar to the relief texture [3] in that the pixel values represent the distance of a 3D point from the camera plane and the point's color, respectively. Rendering of DIBR can be done by the usual 3D warping algorithm [2], which requires geometric transformations and splatting in screen space. However, splatting is slow, since until now it could not be accelerated by hardware.
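The 3D warping of a single DIBR pixel can be sketched as a back-projection of the depth sample into 3D 
followed by reprojection into the novel view. The following is a minimal sketch; the function name, 
parameter names, and pinhole-camera convention are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def warp_depth_pixel(u, v, depth, K_ref, K_novel, R, t):
    """Back-project a reference-image pixel (u, v) with its depth value
    into 3D, then project it into the novel view.
    K_ref, K_novel: 3x3 camera intrinsics; R, t: reference-to-novel
    rotation and translation. Names are illustrative."""
    # Back-project: X = depth * K_ref^{-1} [u, v, 1]^T
    ray = np.linalg.inv(K_ref) @ np.array([u, v, 1.0])
    X = depth * ray
    # Transform into the novel camera frame and project to the screen
    X_novel = R @ X + t
    x = K_novel @ X_novel
    return x[:2] / x[2]  # pixel position in the novel view
```

With identity intrinsics and no camera motion, a pixel maps back to itself, which is a quick sanity check of the convention.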


There have been two noticeable methodologies of image-based rendering: light field rendering and point-based rendering. Although light field rendering and its extension, the surface light field [4][6], can provide high-quality rendering results with varying illumination, this approach is somewhat impractical, since too many input images are needed and the amount of data to handle is huge. Therefore, in this paper, we focus on 'multiview' image-based rendering, in which only a few images are used as the reference images. Based on this idea, point-based rendering [5] is an off-the-shelf rendering method for image-based rendering. The common point is that both use 3D points as the rendering primitive, while the main difference is that in image-based rendering the points are arranged in a regular grid structure. GPU acceleration of point-based rendering has been reported in a few previous works [7][9], in which the point blending and splat shape and size computation problems have been addressed.

A noticeable previous work is Botsch and Kobbelt's GPU-accelerated rendering framework for point-sampled geometry [7]. In order to provide high visual quality and fast rendering together, they performed most of the major procedures of point-based rendering on the GPU, including splat size and shape computation, splat filtering, and per-pixel normalization. Although their general framework can be applied to image-based rendering, the proposed approach is more customized for DIBR data, utilizing fast texture buffer access inside the GPU.

3 Proposed Approach

In this paper, we propose a method for rendering DIBR objects using GPU acceleration. The proposed approach consists of a sequence of reference data setting, projective transformation, surface reflection computation, and point splatting, which are described in detail in the following subsections.

3.1 Caching Reference Images on Texture Buffer

DIBR data is well-suited for GPU-based acceleration, since the reference images can be regarded as texture data and set on the texture buffer of the graphics card. Compared with the conventional method, in which geometry primitives are stored in system memory or partially in video memory, this scheme reduces heavy traffic on the AGP bus significantly and therefore increases the frame rate. In other words, all further processing, including transformation, shading, and splatting, is performed by the GPU. This alleviates the burden on the CPU, freeing it for other non-graphics processes.

In order to access the texture buffer in the vertex shader, a new feature of Shader Model 3.0, Vertex Texture Fetch [13], is employed. Vertex Texture allows us to access texture data just as the fragment shader does. One limitation of Vertex Texture is that the maximum number of textures the vertex shader can access is only four. However, since at least twelve (cube map) and usually more reference images are used in DIBR, not all the reference images can be loaded into the texture buffer together. To solve this problem, the reference images are reconfigured into fewer than five texture images. In Figures 2(a) and 2(b), 32 reference images are merged into two reference images, one for the depth image and the other for the color (texture) image. They are fed into the Vertex Texture using standard OpenGL commands. Further geometric processing is performed by the commands in the vertex shader code.

Fig. 2. Merged reference images and normal map for Vertex Texture. The resolution is 1024x1024. The resolution of each individual reference image is 256x256. (a) Merged depth image. (b) Merged color image. (c) Normal map computed from (a) and (b).
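The reconfiguration step can be sketched as a row-major tiling of the 256x256 reference images into 
1024x1024 atlases, so that no more than four textures are ever needed. This is an illustrative sketch 
of the packing, not the paper's actual code:

```python
import numpy as np

def merge_reference_images(refs, tile=256, atlas=1024):
    """Pack square reference images (tile x tile) into as few
    atlas x atlas textures as possible, in row-major order, to respect
    the four-texture limit of Vertex Texture Fetch. The paper merges
    32 such images into two 1024x1024 textures."""
    per_row = atlas // tile                 # 4 tiles per row
    per_atlas = per_row * per_row           # 16 tiles per atlas
    n_atlases = -(-len(refs) // per_atlas)  # ceiling division
    out = np.zeros((n_atlases, atlas, atlas) + refs[0].shape[2:],
                   dtype=refs[0].dtype)
    for i, img in enumerate(refs):
        a, slot = divmod(i, per_atlas)
        r, c = divmod(slot, per_row)
        out[a, r*tile:(r+1)*tile, c*tile:(c+1)*tile] = img
    return out
```

With 32 input images this yields exactly two atlases, matching the configuration shown in Figure 2.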

3.2 Geometric Transformation and Lighting

Geometric transformation consists of three intermediate transformations: reference-to-world coordinates, world-to-camera coordinates, and camera projection. Since the camera position and viewing direction constantly change, the first transformation is the only one which remains unchanged. The total transformation matrix is computed and assigned in the vertex shader.

Fig. 3. Real test data [10]. Each resolution is 2048x1536. (a) Break-dance – Color. (b) Break-dance – Depth. (c) Break-dance – Normal map. (d) Ballet – Color. (e) Ballet – Depth. (f) Ballet – Normal map.
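The composition of the three stages into the single total matrix assigned to the vertex shader can be 
sketched with homogeneous 4x4 matrices; the function and matrix names below are illustrative assumptions:

```python
import numpy as np

def total_transform(M_ref2world, M_world2cam, M_proj):
    """Compose the three transformation stages into the single
    reference-to-screen matrix uploaded to the vertex shader.
    Only M_ref2world stays fixed; the other two change per frame."""
    return M_proj @ M_world2cam @ M_ref2world

def transform_point(M, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    h = M @ np.append(p, 1.0)
    return h[:3] / h[3]
```

Composing on the CPU keeps the per-vertex work in the shader to a single matrix-vector product.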

Normal vectors are computed in the preprocessing stage using the 4-neighbor pixels of the depth image. Since they are used to compute the surface shading under lighting models, they must somehow be transferred to the vertex shader. In our approach, a normal map is constructed for the points in the reference images. In the normal map, the (x, y, z) elements of the normal vector are scaled to [0, 255] and stored in its RGB channels. Thus, the normal map is eventually just another texture that can be set onto the texture buffer in video memory. In Figure 2(c), the normal map which corresponds to the reference images in Figures 2(a) and 2(b) is shown in RGB color image format. Subsequently, the vertex shader can access the normal map and use it to compute the surface lighting of each point. In our implementation, the common Phong illumination model is used in order to compare the performance with the fixed-function pipeline, although any other illumination model can be adopted. As a result, the proposed surface lighting method is very efficient, in that all it requires is a Vertex Texture fetch, which involves only a small amount of communication between the GPU and video memory.
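The normal-map preprocessing can be sketched as follows, assuming normals are estimated from the 
4-neighbor depth gradients via central differences; the exact estimation scheme and the wrap-around 
border handling here are simplifying assumptions, not the paper's:

```python
import numpy as np

def normal_map(depth):
    """Estimate per-pixel normals from a depth image using the
    4-neighbor pixels (central differences), then scale the (x, y, z)
    components from [-1, 1] to [0, 255] for storage in the RGB
    channels of a texture. Borders wrap for simplicity."""
    d = depth.astype(np.float64)
    dzdx = (np.roll(d, -1, axis=1) - np.roll(d, 1, axis=1)) / 2.0
    dzdy = (np.roll(d, -1, axis=0) - np.roll(d, 1, axis=0)) / 2.0
    # Surface normal of the height field z = d(x, y)
    n = np.dstack([-dzdx, -dzdy, np.ones_like(d)])
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return ((n + 1.0) * 0.5 * 255.0).astype(np.uint8)
```

A flat depth image produces normals pointing straight at the camera, stored as RGB (127, 127, 255).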

4 Experimental Results

In order to evaluate the performance of the proposed approach, experiments have been carried out on the synthetic DIBR data shown in Figure 2 and the real calibrated color-depth image set shown in Figure 3.

4.1 Experiment Setup

The experiments are performed on an NVIDIA GeForce 7800GTX GPU and a 2.0GHz AMD Athlon64 CPU with 2GB memory. The code is written in GLSL (the OpenGL Shading Language [11]), which has been formally included in OpenGL 2.0 [12].

4.1.1 Implementation Overview

In Figure 4, the block diagram of the developed rendering framework is shown. In the main program, the reference images are reconfigured first, yielding fewer than five textures. Next, normal maps are computed from the depth images. The reconfigured reference images and the normal maps are set as Vertex Textures and stored in video memory, to be accessed in the vertex shader. Finally, the reference-to-screen transformation matrices are computed and made ready for use in the vertex shader.

4.1.2 Vertex and Fragment Shaders

The vertex shader performs the per-vertex operations for geometric transformation, surface reflection, and splat size computation. All the data required for those operations are fetched from the Vertex Texture cache; therefore, there is little data traffic between the GPU and system memory, which is usually a bottleneck in conventional rendering systems.

In the fragment shader, the per-pixel operations are performed to compute the final pixel color in the frame buffer. The output of the vertex shader, which includes the position in screen coordinates, the albedo of the surface reflection, and the splat size, is transferred to the fragment shader. Depending on the splat size, each individual pixel is usually overlapped by a few splats. The fragment shader computes each splat's color using the Phong illumination model and blends those overlaps, yielding a smooth filtering effect across the whole target image.

Fig. 4. Block diagram of the developed rendering framework
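A CPU-side sketch of the per-pixel Phong evaluation might look like the following; the ambient, diffuse, 
and specular coefficients and the shininess exponent are illustrative values, not taken from the paper:

```python
import numpy as np

def phong(normal, light_dir, view_dir, albedo,
          ka=0.1, kd=0.7, ks=0.2, shininess=32.0):
    """Per-pixel Phong illumination as used for splat shading.
    All direction vectors are numpy arrays pointing away from the
    surface point; coefficients are illustrative."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    diffuse = max(np.dot(n, l), 0.0)
    r = 2.0 * np.dot(n, l) * n - l          # reflected light direction
    specular = max(np.dot(r, v), 0.0) ** shininess if diffuse > 0 else 0.0
    return albedo * (ka + kd * diffuse) + ks * specular
```

When the light is behind the surface, only the ambient term survives, which is the source of the static-looking shading that the paper's dynamic lighting replaces.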

4.2 Rendering Results

The rendering speed of the proposed approach is compared with that of the conventional methods. The statistical summary is given in Tables 1 and 2. As shown in the tables, in contrast to software rendering and conventional OpenGL rendering, the rendering frame rate increases remarkably: shader-based rendering is more than six times faster than OpenGL-based rendering. The main difference from the conventional OpenGL rendering is where the splat size is computed; there is no way to accelerate this unless shaders are used. In our implementation, the splat size is computed as inversely proportional to the distance from the viewpoint. The proportionality coefficient is found empirically, and further intensive research is required to find the optimal splat size that would be best for any DIBR data.
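The splat size rule described above can be sketched as follows; the coefficient value and the clamping 
range are illustrative assumptions, since the paper only states that the coefficient is found empirically:

```python
def splat_size(distance, coeff=40.0, min_px=1.0, max_px=8.0):
    """Splat size in pixels, inversely proportional to the distance
    from the viewpoint, clamped to a sensible range. The coefficient
    is empirical in the paper; these values are illustrative."""
    return min(max(coeff / distance, min_px), max_px)
```

Clamping keeps very near surfaces from producing huge splats and very far surfaces from vanishing, consistent with the behavior visible in Figure 6.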

The effect of dynamic illumination is shown in Figure 5. The left image is the result of conventional DIBR rendering, which has no surface lighting effect. In contrast, the proposed method enables the lighting effect, which greatly increases photorealism. In Figure 5(b), the light position is approximately at the lower-left corner of the model, yielding the shown reflection pattern.

In Figure 6, a rendered example of splat size control is shown. It can be clearly seen that the nearer surface has larger splats (magnification in the lower-right corner) while the farther surface has smaller splats (magnification in the upper-left corner).

[Fig. 4 diagram contents – Main Program: Reference Image Reconfigure, Normal Map Computation, Vertex Texture Setup, Geometric Transform Setup. Vertex shader: Per-Vertex Geometric Transform, Per-Vertex Lighting (Albedo Computation), Per-Vertex Splat Size Computation. Fragment shader: Vertex Color Fetch, Per-Pixel Lighting (Phong Illumination), Image-Space Filtering for Antialiasing (Optional).]


Table 1. Comparison of rendering speed (Synthetic DIBR data [8]). Each cell: Frames Per Second (FPS) / Mega Points Per Second (MPPS).

# of Reference Images | # of points | Shader         | OpenGL        | Software
4                     | 45,012      | 559.84 / 25.19 | 94.17 / 4.24  | 14.59 / 0.66
9                     | 83,521      | 310.18 / 25.74 | 50.96 / 4.23  |  7.89 / 0.65
16                    | 167,980     | 152.32 / 25.44 | 25.25 / 4.22  |  3.93 / 0.66

Table 2. Comparison of rendering speed (Real image data – Ballet [10]). Each cell: Frames Per Second (FPS) / Mega Points Per Second (MPPS).

# of Reference Images | # of points | Shader        | OpenGL       | Software
1                     | 786,432     | 33.70 / 26.35 | 5.44 / 4.25  | 0.85 / 0.67
4                     | 3,145,728   |  8.29 / 26.02 | 1.35 / 4.23  | 0.22 / 0.69
8                     | 6,291,456   |  4.78 / 30.04 | 0.78 / 4.87  | 0.12 / 0.77
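The two throughput figures reported for each configuration are related by MPPS = FPS x (number of points) / 10^6, which can be used to sanity-check the tables:

```python
def mpps(fps, n_points):
    """Mega points per second from frame rate and per-frame point count."""
    return fps * n_points / 1e6

# e.g. the shader row of Table 2 with 1 reference image: 33.7 FPS on
# 786,432 points gives roughly 26.5 MPPS, matching the reported 26.35
# (the small gap is rounding in the reported FPS).
```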


Fig. 5. Result of surface lighting. (a) Without surface lighting. (b) With surface lighting.

Fig. 6. An example of controlling splat size


5 Conclusion

In this paper, we presented a practical method of GPU-based DIBR object rendering. By employing the new features of the modern graphics processing unit and programmable shader support, we developed an efficient hardware-accelerated rendering algorithm for image-based 3D objects. The rendering speed increased remarkably compared with software-based rendering and conventional OpenGL-based rendering.

Future research will focus on the optimization of splat shape and size. Furthermore, given a 3D model, it will also be valuable to design an efficient algorithm that selects the optimal set of reference images, i.e. one that minimizes the number of reference images.

Acknowledgement

This work was supported by Korea Research Foundation Grant funded by Korea Government (MOEHRD, Basic Research Promotion Fund) (KRF-2005-003D00320).

References

1. Information Technology – Coding of Audio-Visual Objects – Part 16: Animation Framework eXtension (AFX), ISO/IEC Standard JTC1/SC29/WG11 14496-16:2003.

2. L. McMillan and G. Bishop, “Plenoptic modeling: An image-based rendering system,” Proc. SIGGRAPH ’95, pp. 39–46, Los Angeles, USA, August 1995.

3. M. Oliveira, G. Bishop, and D. McAllister, “Relief texture mapping,” Proc. SIGGRAPH ’00, pp. 359–368, July 2000.

4. D. Wood et al., “Surface light fields for 3D photography,” Proc. SIGGRAPH ’00, pp. 287–296, July 2000.

5. M. Zwicker, H. Pfister, J. Van Baar, and M. Gross, “Surface splatting,” Proc. SIGGRAPH ’01, pp. 371–378, Los Angeles, USA, July 2001.

6. W. Chen et al., “Light field mapping: Efficient representation and hardware rendering of surface light fields,” ACM Trans. on Graphics, 21(3): 447–456, July 2002.

7. M. Botsch and L. Kobbelt, “High-quality point-based rendering on modern GPUs,” Proc. 11th Pacific Conference on Computer Graphics and Applications, October 2003.

8. L. Levkovich-Maslyuk et al., “Depth image-based representation and compression for static and animated 3D objects,” IEEE Trans. on Circuits and Systems for Video Technology, 14(7): 1032–1045, July 2004.

9. R. Pajarola, M. Sainz, and P. Guidotti, “Confetti: Object-space point blending and splatting,” IEEE Trans. on Visualization and Computer Graphics, 10(5): 598–608, September/October 2004.

10. C. Zitnick et al., “High-quality video view interpolation using a layered representation,” ACM Trans. on Graphics, 23(3): 600–608, August 2004.

11. R. Rost, OpenGL® Shading Language, Addison Wesley, 2004.

12. J. Leech and P. Brown (editors), The OpenGL® Graphics System: A Specification (Version 2.0), October 2004.

13. NVIDIA GPU Programming Guide Version 2.2.0, http://developer.nvidia.com/object/gpu_programming_guide.html