Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the...

21
Towards Efficient Video Compression Using Scalable Vector Graphics on the Cell Broadband Engine Andreea Sandu, Emil Slusanschi , Alin Murarasu, Andreea Serban, Alexandru Herisanu, Teodor Stoenescu University Politehnica of Bucharest Computer Science and Engineering Department International Workshop on Multi-core Software Engineering, Cape Town, South Africa, May 1 st 2010

Transcript of Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the...

Page 1: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

Towards Efficient Video Compression Using Scalable Vector Graphics

on the Cell Broadband Engine

Andreea Sandu, Emil Slusanschi , Alin Murarasu, Andreea Serban, Alexandru Herisanu, Teodor Stoenescu

University Politehnica of Bucharest Computer Science and Engineering Department

International Workshop on Multi-core Software Engin eering, Cape Town, South Africa, May 1 st 2010

Page 2: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

2

Outline

• Video Codecs & Image Characteristics

• NURBS Curves• NURBS Curves

• Image Representation

• Image Encoding

• Porting to the Cell/B.E.

• Results

• Related Projects @cs.pub.ro

• Conclusions & Outlook

Page 3: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

3

Video Codecs

• A software program or library

• Encodes/Decodes the video component of a movie/clip in a digital formatdigital format

• Aim: create a decoder using scalar vector graphics (SVG)

• Advantages of SVG:• Data Compression – efficient representation• Losseless display at any resolution – shape preservation

• Disadvantages of SVG: difficult conversion from raster• Disadvantages of SVG: difficult conversion from raster

Vector Raster

Page 4: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

4

NURBS Curves

� NURBS = Non-uniform relational B-splines

� Can be used to represent curves and surfaces

� Used extensively in Computer Aided Design (CAD)

� Parameters� Degree (1,2,3,5,…)� Control points & weights� Knots

� NURBS advantages:� NURBS advantages:� Invariant to scalar transformations� Computable with stable algorithms (e.g. DeBoor)� Can represent complex features with few parameters� A curve can be handled easily through its parameters

Page 5: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

5

NURBS Conversion

� Polygonal approximation:� Curve evaluation – deBoor’s algorithm� Initial approximation to curve knots� Initial approximation to curve knots� Iterative process of adding nodes

� Integrated in ffmpeg & ogg & used

in the VLC player

Video frameInternal Representation

Page 6: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

6

Image Encoding

Despeckling Quantization Follow curves NURBSVideo frame Raster EdgesRaster

� Modular Design� Stage algorithms can be treated independently:

� Despeckling & noise filtering� Create big pieces of same color zones – similar to AutoTrace, by

smoothing/combining neighboring pixels of similar colors

Colors Curves

smoothing/combining neighboring pixels of similar colors

� Color quantization� Create a new color scale

� The algorithm is based on octrees

� Reduce number of colors in order to reduce the image size in the vector representation – loses details/quality vs. original image

Page 7: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

7

Feature Extraction with NURBS

� Determine zones of constant color

� Determine edges between these � Determine edges between these zones using NURBS curves

� Determine knots of sharing edges

� The approximation is passing through these knotsthrough these knots

� The approximation uses a least-squares approach

Page 8: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

8

IBM’s Cell/B.E. Processor

• Heterogeneous multi-core system architecture

SPE

LS

SXUSPU

LS

SXUSPU

LS

SXUSPU

LS

SXUSPU

LS

SXUSPU

LS

SXUSPU

LS

SXUSPU

LS

SXUSPU

– Power Processor Element for control tasks

– Synergistic Processor Elements for data-intensive processing

• Synergistic Processor Element (SPE) consists of

– Synergistic Processor Unit (SPU)

16B/cycle (2x)16B/cycle

BICMIC

16B/cycle

EIB (up to 96B/cycle)

16B/cycle

PPE

SMF

PXUL1

PPU

L2

SMFSMFSMFSMFSMFSMFSMF

Unit (SPU)– Synergistic Memory Flow

Control (SMF)• Data movement and

synchronization• Interface to high-

performance Element Interconnect Bus

FlexIO TMDual XDRTM

64-bit Power Architecture with VMX

PXUL116B/cycle

L232B/cycle

Page 9: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

9

400

Encoding – Serial Profiling

• Profiling is done on slices of 308 x 400 pixels

50

100

150

200

250

300

350

Despeckling

Quantize

Follow

Curves

0

50

Time (ms)

Phase Despeckling Quantize Follow Curves

Time (ms) 368.8 61.051 25.822 68.884

Percentage 70.31% 11.64% 4.92% 13.13%

Page 10: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

10

Porting to the Cell/B.E . Architecture

• IBM’s Cell/B.E. is heterogeneous: PPE/SPE

• Usual image processing algorithms methodology• Usual image processing algorithms methodology– Divide the problem – process data – reconstruct results

• Image Despeckling– Whole algorithm is run on the SPEs

• Image Quantization– The PPE divides the frame– The PPE divides the frame– The SPEs generate the octrees– The PPE fuses the octrees together

• NURBS curves– Entire algorithm is run on SPEs

Page 11: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

11

Despeckling Design Tradeoffs

• Split image in slices and distribute them to SPUs

• Smoothing is done independently by each SPU• Smoothing is done independently by each SPU

• The PPU rebuilds the image from the processed fragments

• Tradeoffs:

• The slices are too small – the smoothing will be • The slices are too small – the smoothing will be exaggerated

• The slices are too big – they will not fit on the SPU local storage memory

Page 12: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

12

Quantization Design Tradeoffs

• The PPU decides if/when to reduce the number of colors

• The SPUs generate partial color trees with a • The SPUs generate partial color trees with a maximal number of levels by counting pixels of each color

• The PPU combines the SPU generated trees in a global tree

• Tradeoffs:• Tradeoffs:• The slices are too big – generate too many partial trees

and too many DMA transfers & significant overhead in the global tree reconstruction

• The slices are too small – processing on the PPU may be more efficient

Page 13: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

13

Quantization SIMD/Vectorization

• Groups of 3 bits from the three basic color (RGB) components are forming paths in the partial trees built by SPUs: built by SPUs:

• bit_R<<2

• bit_G<<1

• bit_B<<0

• Computing the paths serially is done with successive shifts in 8 iterations

• The vector/SIMD version allows the computing of entire vector paths in the partial tree at once

Page 14: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

14

Ongoing developments

• Edge detection component in the quantization phase moved from PPU to SPUs– Color trees are aligned to ease transfer and processing– Color trees are aligned to ease transfer and processing– Each SPU makes a local copy & converts pixels to codes– After conversion release memory to allow edge detection

passes to continue

• Tradeoffs:– Edge detection algorithm generates useless edges around – Edge detection algorithm generates useless edges around

the current slice to avoid lots of coordinate testing– Big slices are good because of code serialization – no

more branching code– Small slices – generate lots of useless edges thus

increasing storage requirements

Page 15: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

15

Results – Image Quality

Original 4SPUsOriginal Image

4SPUs

� x86 codec speed: 4-6 fps� Compression ratio to date: 0.982 – 1.754

16SPUs8SPUs

Page 16: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

16

Results – Despeckling@SPUs

SPUs 1 2 4 8 16

Time (ms) 368.80 204.96 104.10 54.35 29.21

8

10

12

14

Spe

edup

Speedup 1.00 1.80 3.54 6.79 12.63

0

2

4

6

1 2 4 8 16

Spe

edup

Number of SPUs

Page 17: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

17

Results – Quantization@SPUs

SPUs 1 2 4 8 16Time (ms) 61.05 23.83 12.64 10.29 12.06Speedup 1.00 2.56 4.83 5.93 5.06

3

4

5

6

7

Spe

edup

Speedup 1.00 2.56 4.83 5.93 5.06

0

1

2

3

1 2 4 8 16

Spe

edup

Number of SPUs

Page 18: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

18

Feature Extraction from Satellite Images on Hybrid x86/CellBE Systems

Related Projects @cs.pub.ro

Original Grayscale Image Detection(Sobel)

Hough Accumulator

X86_64

Hough Peaksover image edges

Mark road segment edges

Final identified feature (road)

X86_64

Cell/B.E.

Saved as SVG

Page 19: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

19

Related Projects @ cs.pub.ro

Interactive 3D Map of Romania

SVG for Map Representation

Page 20: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

20

Conclusions & Outlook

• Conclusions– The performance of the SVG codec benefits from its

deployment on the Cell/B.E. architecturedeployment on the Cell/B.E. architecture– The quality and performance of the codec are strongly

dependent on design choices in the processing steps– The codec compression still requires further improvement

• Outlook– Currently only Intra-coded-frames (I-frame) are encoded – Currently only Intra-coded-frames (I-frame) are encoded

leading to big SVG file sizes– Add support for Predicted (previous) & Bi-coded (previous

& next) frames thus improving SVG storage requirements• Use motion estimation techniques between reference I-frame

blocks & blocks in subsequent frames (translation/rotation/etc)• The offsets/differences are stored in motion vectors

Page 21: Towards Efficient Video Compression Using Scalable Vector ...€¦ · – The performance of the SVG codec benefits from its deployment on the Cell/B.E. architecture – The quality

Thank you for your attention

Q & A

[email protected]