SOC Consortium Course Material SoC Design Laboratory

Case Study: ARM Platform-based JPEG Codec

HW/SW Co-design

Teaching Assistant : Yu-Ju Cho

Advisor : Prof. An-Yeu Wu

Outline

– Introduction to JPEG Codec
– Lab ─ Case study
– Reference

ISO/IEC 10918-1 JPEG

JPEG: Joint Photographic Experts Group
The JPEG standard was published as an international standard (ISO/IEC 10918-1) in 1994
The JPEG standard defines four compression methods:

– Baseline sequential DCT-based coding
– Progressive DCT-based coding
– Lossless coding method

• Sampling and quantization are not used in the lossless coding scheme

– Hierarchical coding method

Compression Method

[Figure: Baseline sequential vs. progressive DCT-based coding]

[Figure 10 – Hierarchical multi-resolution encoding]

Block Diagram of JPEG Encoder

[Figure: JPEG encoder block diagram. R G B → Y Cb Cr → 8x8 blocks f(i,j) → DCT → F(u,v) → Quantization (Quantization Table) → Fq(u,v) → DC: DPCM; AC: zig-zag scan, RLC → Entropy Coding (Coding Tables) → bitstream 01001011101… (Header, Tables, Data)]

DPCM: Differential Pulse Code Modulation
RLC: Run-Length Code

Color Model in Video ─ YCbCr

– Y: luminance
– Cb, Cr: chrominance
– The YCbCr color model is used in JPEG and MPEG

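Color Model in Video ─ YCbCr: CCIR-601 transform formula

$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix}
=
\begin{bmatrix}
 0.299 &  0.587 &  0.114 \\
-0.168 & -0.331 &  0.499 \\
 0.5   & -0.419 & -0.081
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$

The color space transform is lossless apart from rounding, because the matrix is invertible.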
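As a rough illustration of the CCIR-601 transform, the sketch below applies the three rows of the matrix to one pixel. The function name `rgb_to_ycbcr` is made up for this example, and no offset is added to Cb and Cr because the slide does not show one.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr) using the CCIR-601 coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b     # luminance
    cb = -0.168 * r - 0.331 * g + 0.499 * b   # blue-difference chrominance
    cr = 0.5 * r - 0.419 * g - 0.081 * b      # red-difference chrominance
    return y, cb, cr


# A grey or white pixel carries no chrominance: Cb and Cr come out (nearly) zero.
white = rgb_to_ycbcr(255, 255, 255)
```

Each chrominance row sums to zero, which is why any pixel with R = G = B maps to Cb = Cr = 0.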

Chroma Sub-sampling

4:1:1 and 4:2:0 are the sub-sampling formats most commonly used in JPEG and MPEG

[Figure: sampling patterns for 4:4:4, 4:2:2, 4:1:1 and 4:2:0. Legend: pixel with only Y value; pixel with only Cr and Cb value; pixel with Y, Cr and Cb value]
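One way to picture 4:2:0 sub-sampling is to keep a single chrominance sample for every 2x2 group of pixels. The sketch below averages the four values; averaging is an assumption made here for illustration (picking one of the four samples is another common choice), and `subsample_420` is a hypothetical helper name.

```python
def subsample_420(plane):
    """Reduce a chroma plane (list of rows, even width and height) by 2 in each
    direction, averaging every 2x2 neighbourhood into one sample."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


cb_plane = [
    [1, 3, 10, 10],
    [5, 7, 10, 10],
    [0, 0, 2, 2],
    [0, 0, 4, 4],
]
cb_small = subsample_420(cb_plane)  # 4x4 chroma plane becomes 2x2
```

The Y plane is left at full resolution, so a 4:2:0 image stores half as many samples overall as 4:4:4.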

2-D DCT (Discrete Cosine Transform)

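$$
X_{k_1,k_2} = \frac{2}{N}\, c(k_1)\, c(k_2) \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} x_{n_1,n_2} \cos\frac{(2n_1+1)k_1\pi}{2N} \cos\frac{(2n_2+1)k_2\pi}{2N}, \quad n_1, n_2, k_1, k_2 = 0, 1, \ldots, N-1
$$

where $c(0) = 1/\sqrt{2}$ and $c(n) = 1$ for $n \neq 0$.

$x_{n_1,n_2}$: space domain; $X_{k_1,k_2}$: frequency domain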
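The 2-D DCT definition can be evaluated directly with four nested loops. The sketch below is a plain, unoptimized transcription of the formula (the name `dct2` is chosen here, not taken from the lab code) and is useful mainly as a reference to check faster versions against.

```python
import math


def dct2(block):
    """Direct 2-D DCT of an N x N block, following the definition term by term."""
    n = len(block)

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for k1 in range(n):
        for k2 in range(n):
            total = 0.0
            for n1 in range(n):
                for n2 in range(n):
                    total += (
                        block[n1][n2]
                        * math.cos((2 * n1 + 1) * k1 * math.pi / (2 * n))
                        * math.cos((2 * n2 + 1) * k2 * math.pi / (2 * n))
                    )
            out[k1][k2] = (2 / n) * c(k1) * c(k2) * total
    return out


# A flat block has all its energy in the DC coefficient X[0][0].
flat = [[10] * 8 for _ in range(8)]
coeffs = dct2(flat)
```

For an 8x8 block of constant value v the DC term is 8v, and every other coefficient is zero up to floating-point error.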

Basis Image of 2-D DCT

[Figure: basis images of the 2-D DCT. Axes: horizontal frequency (low to high) and vertical frequency (low to high)]

Frequency Distribution of 2-D DCT

[Figure: two partitions of the 8x8 coefficient block.
By frequency: DC, low frequency, medium frequency, high frequency.
By direction: DC, vertical edges, horizontal edges, diagonal edges, high frequency.]

8 point 1-D DCT Algorithm (1/2)

$$
Y_{k,l} = C_k \sum_{i=0}^{L-1} x_{i,l} \cos\frac{(2i+1)k\pi}{2L}, \quad k = 0, 1, \ldots, L-1
$$

where $C_k = 1/\sqrt{2}$ for $k = 0$ and $C_k = 1$ otherwise.

Better for VLSI design implementation!

8 point 1-D DCT Algorithm (2/2)

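$$
Y_{k,l} = C_k \sum_{i=0}^{L-1} x_{i,l} \cos\frac{(2i+1)k\pi}{2L}, \quad k = 0, 1, \ldots, L-1
$$

where $C_k = 1/\sqrt{2}$ for $k = 0$ and $C_k = 1$ otherwise.

In matrix form, $Y = CX$ with $c_i = \cos\frac{i\pi}{16}$:

$$
\begin{bmatrix} Y_0 \\ Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ Y_5 \\ Y_6 \\ Y_7 \end{bmatrix}
=
\begin{bmatrix}
c_4 &  c_4 &  c_4 &  c_4 &  c_4 &  c_4 &  c_4 &  c_4 \\
c_1 &  c_3 &  c_5 &  c_7 & -c_7 & -c_5 & -c_3 & -c_1 \\
c_2 &  c_6 & -c_6 & -c_2 & -c_2 & -c_6 &  c_6 &  c_2 \\
c_3 & -c_7 & -c_1 & -c_5 &  c_5 &  c_1 &  c_7 & -c_3 \\
c_4 & -c_4 & -c_4 &  c_4 &  c_4 & -c_4 & -c_4 &  c_4 \\
c_5 & -c_1 &  c_7 &  c_3 & -c_3 & -c_7 &  c_1 & -c_5 \\
c_6 & -c_2 &  c_2 & -c_6 & -c_6 &  c_2 & -c_2 &  c_6 \\
c_7 & -c_5 &  c_3 & -c_1 &  c_1 & -c_3 &  c_5 & -c_7
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
$$

The symmetry of the rows splits this into two 4x4 products:

$$
\begin{bmatrix} Y_0 \\ Y_2 \\ Y_4 \\ Y_6 \end{bmatrix}
=
\begin{bmatrix}
c_4 &  c_4 &  c_4 &  c_4 \\
c_2 &  c_6 & -c_6 & -c_2 \\
c_4 & -c_4 & -c_4 &  c_4 \\
c_6 & -c_2 &  c_2 & -c_6
\end{bmatrix}
\begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix}
$$

$$
\begin{bmatrix} Y_1 \\ Y_3 \\ Y_5 \\ Y_7 \end{bmatrix}
=
\begin{bmatrix}
c_1 &  c_3 &  c_5 &  c_7 \\
c_3 & -c_7 & -c_1 & -c_5 \\
c_5 & -c_1 &  c_7 &  c_3 \\
c_7 & -c_5 &  c_3 & -c_1
\end{bmatrix}
\begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix}
$$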
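The even/odd split of the 8-point 1-D DCT can be checked numerically. The sketch below computes the transform both ways, once straight from the summation and once with the two 4x4 products on the sums and differences of mirrored inputs. Function names are illustrative, and no extra scaling factor is applied beyond C_k.

```python
import math

C = [math.cos(i * math.pi / 16) for i in range(8)]  # c_i = cos(i*pi/16)


def dct8_direct(x):
    """8-point DCT from the summation: 64 multiplications."""
    return [
        (1 / math.sqrt(2) if k == 0 else 1.0)
        * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / 16) for i in range(8))
        for k in range(8)
    ]


def dct8_even_odd(x):
    """Same transform via two 4x4 products: 32 multiplications."""
    s = [x[i] + x[7 - i] for i in range(4)]  # sums feed the even outputs
    d = [x[i] - x[7 - i] for i in range(4)]  # differences feed the odd outputs
    c1, c2, c3, c4, c5, c6, c7 = C[1:]
    y0 = c4 * s[0] + c4 * s[1] + c4 * s[2] + c4 * s[3]
    y2 = c2 * s[0] + c6 * s[1] - c6 * s[2] - c2 * s[3]
    y4 = c4 * s[0] - c4 * s[1] - c4 * s[2] + c4 * s[3]
    y6 = c6 * s[0] - c2 * s[1] + c2 * s[2] - c6 * s[3]
    y1 = c1 * d[0] + c3 * d[1] + c5 * d[2] + c7 * d[3]
    y3 = c3 * d[0] - c7 * d[1] - c1 * d[2] - c5 * d[3]
    y5 = c5 * d[0] - c1 * d[1] + c7 * d[2] + c3 * d[3]
    y7 = c7 * d[0] - c5 * d[1] + c3 * d[2] - c1 * d[3]
    return [y0, y1, y2, y3, y4, y5, y6, y7]


sample = [12, 16, 19, 12, 11, 27, 51, 47]
direct = dct8_direct(sample)
fast = dct8_even_odd(sample)
```

Halving the multiplier count at the cost of eight extra additions is the kind of trade that makes this form attractive for a hardware datapath.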

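2-D DCT Implementation Example: Row-Column Decomposition

The 2-D DCT is separable, so it can be computed by row-column decomposition:

$$
Z = A X A^T
$$

$$
Y_{k,l} = \frac{1}{2} C_k \sum_{i=0}^{L-1} x_{i,l} \cos\frac{(2i+1)k\pi}{2L}, \quad k = 0, 1, \ldots, L-1
$$

where $C_k = 1/\sqrt{2}$ for $k = 0$ and $C_k = 1$ otherwise. For $L = 8$ the factor $\frac{1}{2}$ makes $A$ orthonormal, so applying it twice reproduces the $\frac{2}{N}$ factor of the 2-D definition.

[Figure: X → 1-D DCT unit (Y = AX) → transpose memory (Y) → 1-D DCT unit (Z = YA^T) → Z]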
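The row-column scheme can be sketched in software as two matrix products with a transpose in between, mirroring the two 1-D DCT units and the memory between them. The helper names below are invented for this sketch, and the 1/2 scaling of A is assumed so that A is orthonormal for L = 8.

```python
import math

L = 8
# A[k][i] = (1/2) * C_k * cos((2i+1) k pi / 2L)
A = [
    [
        0.5 * (1 / math.sqrt(2) if k == 0 else 1.0)
        * math.cos((2 * i + 1) * k * math.pi / (2 * L))
        for i in range(L)
    ]
    for k in range(L)
]


def matmul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2_row_column(x):
    y = matmul(A, x)                 # first 1-D pass: Y = A X
    return matmul(y, transpose(A))   # second 1-D pass: Z = Y A^T


flat = [[10] * L for _ in range(L)]
z = dct2_row_column(flat)
```

Two passes of eight 8-point transforms replace the four nested loops of the direct 2-D definition, and the same 1-D unit can be reused for both passes.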

Quantization Table for Luminance

16 11 10 16 24 40 51 61

12 12 14 19 26 58 60 55

14 13 16 24 40 57 69 56

14 17 22 29 51 87 80 62

18 22 37 56 68 109 103 77

24 35 55 64 81 104 113 92

49 64 78 87 103 121 120 101

72 92 95 98 112 100 103 99

Quantization Table for Chrominance

17 18 24 47 99 99 99 99

18 21 26 66 99 99 99 99

24 26 56 99 99 99 99 99

47 66 99 99 99 99 99 99

99 99 99 99 99 99 99 99

99 99 99 99 99 99 99 99

99 99 99 99 99 99 99 99

99 99 99 99 99 99 99 99
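Quantization divides each DCT coefficient by the matching table entry and keeps an integer; the decoder multiplies back. The sketch below uses only the first row of the luminance table and rounds to the nearest integer. The rounding rule is an assumption made here (the worked example later in these slides appears to truncate instead), and the coefficient values are made up.

```python
import math

LUMA_Q_ROW0 = [16, 11, 10, 16, 24, 40, 51, 61]  # first row of the luminance table


def quantize(f, q):
    """Round f/q to the nearest integer, halves away from zero."""
    return int(math.copysign(math.floor(abs(f) / q + 0.5), f))


def dequantize(fq, q):
    """Decoder side: recover an approximation of the original coefficient."""
    return fq * q


coeff_row = [185, -17, 14, -7, 23, -9, -13, -18]  # illustrative values only
quantized = [quantize(f, q) for f, q in zip(coeff_row, LUMA_Q_ROW0)]
restored = [dequantize(fq, q) for fq, q in zip(quantized, LUMA_Q_ROW0)]
```

The larger divisors toward the high-frequency end push small coefficients to zero, which is where the loss (and most of the compression) comes from.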

Predictive Coding of DC Coefficients

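[Figure: the DC coefficient DC_{i-1} of block i-1 (previous sample) is subtracted from the DC coefficient DC_i of block i (current sample); the difference DC_i - DC_{i-1} is what gets coded]

• Differential Pulse Code Modulation (DPCM)
• Storing the difference between successive DC values is more efficient than storing each exact value, because neighboring blocks usually have similar DC levels.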
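A minimal sketch of DPCM on a sequence of DC coefficients follows. The predictor is assumed to start at 0 for the first block, and the DC values are invented for illustration.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    previous = 0  # assumed initial prediction
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dpcm_decode(diffs):
    """Rebuild the DC values by accumulating the differences."""
    previous = 0
    values = []
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


dc_sequence = [61, 63, 60, 60]
differences = dpcm_encode(dc_sequence)
```

After the first block the differences stay small, so they fall into low categories and get short Huffman codes.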

Zig-zag Scan (AC Coefficients)

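[Figure: zig-zag scan path over the 8x8 block, starting next to the DC coefficient in the top-left corner]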
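The zig-zag order can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below returns (row, column) pairs; `zigzag_order` is a name chosen for this example.

```python
def zigzag_order(n=8):
    """Return the (row, col) visiting order of the zig-zag scan for an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left up to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order


order = zigzag_order()
```

Index 0 is the DC position; the remaining 63 positions give the AC coefficients roughly in order of increasing frequency, which groups the zeros at the end.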

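Run-Length Coding (RLC)

Quantized block (DC = 30 in the top-left corner):

30 -3 -1  0  0  0  0  0
-2 -2  0  0  0  0  0  0
-1 -1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

AC coefficients in zig-zag order, coded as (R, L) = (run of preceding zeros, level):

(R,L) => (0,-3)(0,-2)(0,-1)(0,-2)(0,-1)(2,-1)(EOB)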
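The run-length result for this block can be reproduced with a few lines. This sketch reads the AC coefficients in zig-zag order and emits (run, level) pairs; it ignores runs longer than 15 and always ends with EOB, both simplifications chosen for brevity.

```python
BLOCK = [
    [30, -3, -1, 0, 0, 0, 0, 0],
    [-2, -2, 0, 0, 0, 0, 0, 0],
    [-1, -1, 0, 0, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(5)]


def zigzag(block):
    """Flatten an n x n block in zig-zag order."""
    n = len(block)
    out = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        for r, c in (reversed(diagonal) if s % 2 == 0 else diagonal):
            out.append(block[r][c])
    return out


def run_length(ac):
    """Turn AC coefficients into (zero run, level) pairs followed by 'EOB'."""
    pairs, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros collapse into one end-of-block symbol
    return pairs


scanned = zigzag(BLOCK)
rlc = run_length(scanned[1:])  # skip the DC coefficient
```

The 57 trailing zeros cost nothing beyond the single EOB symbol.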

Huffman Coding for DC and AC Coefficients

Category   AC Coefficient Range
1          -1, 1
2          -3, -2, 2, 3
3          -7,…,-4, 4,…,7
4          -15,…,-8, 8,…,15
5          -31,…,-16, 16,…,31
6          -63,…,-32, 32,…,63
7          -127,…,-64, 64,…,127
8          -255,…,-128, 128,…,255
9          -511,…,-256, 256,…,511
10         -1023,…,-512, 512,…,1023
11         -2047,…,-1024, 1024,…,2047

SSSS   Value
0      0
1      -1, 1
2      -3, -2, 2, 3
3      -7,…,-4, 4,…,7
4      -15,…,-8, 8,…,15
5      -31,…,-16, 16,…,31
……     ……

(R,L) => (0,-3)(0,-2)(0,-1)(0,-2)(0,-1)(2,-1)(EOB)

(Run, SSSS/Category)(value) => (0,2)(-3),(0,2)(-2),(0,1)(-1),(0,2)(-2),…,(0,0)

The (Run, SSSS/Category) symbols are then coded with the Huffman table.
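The category (SSSS) of a value is simply the number of bits needed to write its magnitude. A one-line sketch, with `category` as an illustrative name:

```python
def category(value):
    """Return the size category: the bit length of |value| (0 for value 0)."""
    return abs(value).bit_length()


example = [(v, category(v)) for v in (-3, -2, -1, 0, 61)]
```

This is how (0,-3) in the run-length output becomes the symbol (0,2) followed by the value -3.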

Huffman Coding for DC and AC Coefficients

Table for luminance DC coefficient differences

Category   Code length   Code word
0          2             00
1          3             010
2          3             011
3          3             100
4          3             101
5          3             110
6          4             1110
7          5             11110
8          6             111110
9          7             1111110
10         8             11111110
11         9             111111110

Table for luminance AC coefficients

Run/Size    Code length   Code word
0/0 (EOB)   4             1010
0/1         2             00
0/2         2             01
0/3         3             100
0/4         4             1011
0/5         5             11010
0/6         7             1111000
0/7         8             11111000
0/8         10            1111110110
0/9         16            1111111110000010
0/A         16            1111111110000011
1/1         4             1100
1/2         5             11011
1/3         7             1111001
1/4         9             111110110

(0,2)(-3),(0,2)(-2),(0,1)(-1),(0,2)(-2),…,(0,0) => (01)(00) (01)(01) (00)(0) (01)(01) … (1010)
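Putting the category rule and the AC table together, each (run, level) pair becomes a Huffman code for Run/Size followed by the amplitude bits. The sketch below carries only a few entries of the luminance AC table and encodes negative levels as level + 2^size - 1, the usual JPEG convention. Helper names are invented for this example.

```python
# A few entries of the luminance AC table, keyed by (run, size).
AC_CODES = {
    (0, 0): "1010",  # EOB
    (0, 1): "00",
    (0, 2): "01",
    (0, 3): "100",
    (0, 4): "1011",
    (1, 1): "1100",
    (1, 2): "11011",
}


def amplitude_bits(level):
    """Return the 'size' extra bits that identify level within its category."""
    size = abs(level).bit_length()
    if size == 0:
        return ""
    if level < 0:
        level += (1 << size) - 1  # negative values use the lower half of the range
    return format(level, "0{}b".format(size))


def encode_ac(pairs):
    """Encode (run, level) pairs and append the EOB code."""
    bits = ""
    for run, level in pairs:
        size = abs(level).bit_length()
        bits += AC_CODES[(run, size)] + amplitude_bits(level)
    return bits + AC_CODES[(0, 0)]


stream = encode_ac([(0, -3), (0, -2), (0, -1), (0, -2)])
```

Here (0,-3) becomes 01 followed by 00, (0,-2) becomes 01 followed by 01, and (0,-1) becomes 00 followed by 0.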

An Example of Baseline DCT-based Coding

Original Y block (8*8 pixels * 8 bits/pixel = 512 bits):

140 144 147 140 140 155 179 179
144 152 140 147 140 148 167 179
152 155 136 167 163 162 152 172
168 145 156 160 152 155 136 160
162 148 156 148 140 136 147 162
147 167 140 155 155 140 136 162
136 156 123 167 162 144 140 147
148 155 136 155 152 147 147 136

After level shift (-128):

12 16 19 12 11 27 51 47
16 24 12 19 12 20 39 51
24 27  8 39 35 34 24 44
40 17 28 32 24 27  8 32
34 20 28 20 12  8 19 34
19 39 12 27 27 12  8 34
 3 28 -5 39 34 16 12 19
20 27  8 27 24 19 19  8

After FDCT:

185 -17  14  -8  23  -9 -13 -18
 20 -34  26  -9 -10  10  13   6
-10 -23  -1   6 -18   3 -20   0
 -8  -5  14 -14  -8  -2  -3   8
 -3   9   7   1 -11  17  18  15
  3  -2 -18   8   8  -3   0  -6
  8   0  -2   3  -1  -7  -1  -1
  0  -7  -2   1   1   4  -6   0

Q Table:

 3  5  7  9 11 13 15 17
 5  7  9 11 13 15 17 19
 7  9 11 13 15 17 19 21
 9 11 13 15 17 19 21 23
11 13 15 17 19 21 23 25
13 15 17 19 21 23 25 27
15 17 19 21 23 25 27 29
17 19 21 23 25 27 29 31

After quantization (Q):

61 -3  2  0  2  0  0 -1
 4 -4  2  0  0  0  0  0
-1 -2  0  0 -1  0 -1  0
 0  0  1  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0 -1  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

Zig-zag and run-length:

(6)(61),(0,2)(-3), (0,3)(4),(0,1)(-1), (0,3)(-4),(0,2)(2), (1,2)(2),(0,2)(-2), (0,2)(-2),(5,2)(2), (3,1)(1),(6,1)(-1), (2,1)(-1),(4,1)(-1), (7,1)(-1),(0,0)

Huffman:

(1110)(111101)(01)(00)(100)(100)(00)(0)(100)(001)(01)(10)(11011)(10)(01)(01)(01)(01)(11111110111)(10)(111010)(1)(1111011)(0)(11100)(0)(111011)(0)(11111010)(0)(1010)

Total: 102 bits
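The quantization step of this example can be checked numerically. The Q table in the example follows the pattern 3 + 2(u + v), and the quantized block is reproduced when each quotient is truncated toward zero. That is an observation about these particular numbers, not a statement about how the lab code rounds.

```python
FDCT = [
    [185, -17, 14, -8, 23, -9, -13, -18],
    [20, -34, 26, -9, -10, 10, 13, 6],
    [-10, -23, -1, 6, -18, 3, -20, 0],
    [-8, -5, 14, -14, -8, -2, -3, 8],
    [-3, 9, 7, 1, -11, 17, 18, 15],
    [3, -2, -18, 8, 8, -3, 0, -6],
    [8, 0, -2, 3, -1, -7, -1, -1],
    [0, -7, -2, 1, 1, 4, -6, 0],
]

# Q table of the example: 3 in the corner, growing by 2 per step right or down.
Q = [[3 + 2 * (u + v) for v in range(8)] for u in range(8)]

# int() on a float truncates toward zero, e.g. int(-3.4) == -3.
quantized = [[int(FDCT[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]
```

With truncation, 185 / 3 gives 61 rather than 62, which matches the DC value shown in the example.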

Block Diagram of JPEG Decoder

[Figure: JPEG decoder block diagram. Bitstream 01001011101… (Header, Tables, Data) → Entropy Decoder (Coding Tables) → Fq(u,v) → Inverse Quantization (Quantization Table) → F(u,v) → IDCT → 8x8 blocks f(i,j)]

JPEG Bitstream

[Figure: JPEG bitstream hierarchy.
Image: Start_of_image, Frame, End_of_image.
Frame: tables etc., header, scan, ......, scan.
Scan: tables etc., header, segments separated by Restart markers.
Segment: block, block, ......, block.]
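The outermost level of the bitstream is easy to check in code: a JPEG stream opens with the Start_of_image marker (bytes FF D8) and closes with End_of_image (FF D9). The sketch below only tests those two markers and does not parse frames or scans; `looks_like_jpeg` is a hypothetical helper.

```python
SOI = b"\xff\xd8"  # Start_of_image marker
EOI = b"\xff\xd9"  # End_of_image marker


def looks_like_jpeg(data):
    """Return True if the byte string is wrapped in SOI ... EOI."""
    return len(data) >= 4 and data[:2] == SOI and data[-2:] == EOI


dummy_stream = SOI + b"\x00\x01\x02" + EOI  # placeholder payload, not a real image
is_jpeg = looks_like_jpeg(dummy_stream)
```

A real parser would continue from here by reading the marker segments that carry the tables and headers.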

File Structure

Final_project
|---- sw.bat              - batch file for the software-only run
|---- hw.bat              - batch file for the HW/SW co-execution run
|---- Download.brd        - file for programming the bit file into the LM module
|---- sw
|     |---- _dct.cpp      - DCT code for the hardware
|     |---- bmp.cpp       - reads *.bmp files
|     |---- jpeg.cpp      - JPEG block encoding/decoding engine
|     |---- jpeg.h        - declarations for jpeg.cpp
|     |---- main.cpp      - main body of the JPEG codec program
|     |---- marker.h      - markers of the JPEG image file
|     |---- picture.cpp   - still-image access
|     |---- stream.cpp    - reads the bitstream
|     |---- stream.h      - declarations for stream.cpp
|     |---- type.h        - basic type declarations
|---- hw
      |---- ahb2apb.v
      |---- ahbahbtop.bit - Xilinx programming file of the IP
      |---- ahbahbtop.v
      |---- ahbapbsys.v
      |---- ahbdecoder.v
      |---- ahbmuxs2m.v
      |---- ahbzbtram.v
      |---- apbintcon.v
      |---- apbregs.v
      |---- dct.v         - Chen's DCT/IDCT core circuit
      |---- LM_flash_load.bit
      |---- map.ucf
      |---- myip.v        - DCT/IDCT IP

Read & Write Address

[Figure: system memory map and the logic-module address space]

System memory map:

Core Module / Motherboard memory and peripherals
PCI
Core module alias memory
0xC0000000   Logic module 0
0xD0000000   Logic module 1
0xE0000000   Logic module 2
0xF0000000   Logic module 3

Logic module 0 (0xC0000000 to 0xCFFFFFFF):

0xC0000000               LM registers
0xC1000000               Interrupt
0xC2000000               SSRAM
0xC2100000, 0xC2100004   test_register
up to 0xCFFFFFFF         Bus Error response

DCT/IDCT IP registers:

Unit   Port         Addresses (8 words each)
FDCT   Write_head   0xcc000000 to 0xcc00001c
FDCT   Read_head    0xcc000020 to 0xcc00003c
IDCT   Write_head   0xcc000040 to 0xcc00005c
IDCT   Read_head    0xcc000060 to 0xcc00007c
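The register layout is regular enough to be computed rather than listed. The sketch below assumes, from the order in which they appear on the slide, that the first write/read pair belongs to the FDCT and the second to the IDCT, and that each port spans eight 32-bit words. The constant and function names are made up for illustration.

```python
BASE = 0xCC000000      # first Write_head address on the slide
WORD = 4               # bytes per 32-bit register
WORDS_PER_PORT = 8     # each port covers eight consecutive words


def port_addresses(start):
    """List the eight word addresses of one port."""
    return [start + WORD * i for i in range(WORDS_PER_PORT)]


# Assumed pairing of units and ports to address groups.
PORTS = {
    ("FDCT", "write"): port_addresses(BASE + 0x00),
    ("FDCT", "read"): port_addresses(BASE + 0x20),
    ("IDCT", "write"): port_addresses(BASE + 0x40),
    ("IDCT", "read"): port_addresses(BASE + 0x60),
}

fdct_read = [hex(a) for a in PORTS[("FDCT", "read")]]
```

Eight words per port would fit one row or column of an 8x8 block per transfer, though the slide does not state how the words are packed.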

Result for SW Simulation

[Figure: Original → Encoder → Decoder]

Result for HW Simulation

[Figure: Original → Encoder → Decoder]

Profiling Result of SW Simulation

Lab ─ Case Study

Goal
– Implement the JPEG codec system on the ARM platform

Principles
– Implement the ARM platform-based JPEG codec using HW/SW co-design

Requirement
– Analyze the profiling results of the pure software simulation
– Explain how to partition the JPEG codec between HW and SW
– Implement the JPEG codec with HW/SW co-design

Discussion
– Explain where the stack and heap are located, and what initializes them

Reference

– Wen-Hsiung Chen, C. Harrison Smith, and S. C. Fralick, "A Fast Computational Algorithm for the Discrete Cosine Transform," IEEE Trans. Commun., vol. COM-25, pp. 1004-1009, Sept. 1977.
– William B. Pennebaker and Joan L. Mitchell, JPEG: Still Image Data Compression Standard, Kluwer Academic Publishers, ISBN: 0442012721.