CS/EE 5810 CS/EE 6810
F00: 1
Multimedia
New Architecture Direction
• “… media processing will become the dominant force in computer architecture and microprocessor design”
• “… new media-rich applications … involve significant real-time processing of continuous media streams and make heavy use of vectors of packed 8-, 16-, and 32-bit integer and f.p.”
– “How Multimedia Workloads will Change Processor Design,” Diefendorff & Dubey, IEEE Computer (9/97)
• Needs include high memory bandwidth, high network bandwidth, continuous media data types, real-time response, fine-grain parallelism
• Also significant focus on system bus performance
– Common bridge to the memory system and I/O
– Critical performance component for SMP server platforms
Multimedia Workloads
• Multimedia
– Video conferencing
– Video authoring
– Animation
– Games
• Algorithms
– Image compression (JPEG)
– Video compression (MPEG)
– 3-D graphics
– Encryption
Multimedia Characteristics
• Real-time response
– Video, audio
• Continuous media data types
– 8-16 bits sufficient for many applications
• Data parallelism
– E.g. apply the same operation to the whole image
– Vector or SIMD work well here
• Coarse-grained parallelism
– E.g. video encoding/decoding, audio encoding/decoding
• Small loops
– Most time spent in kernels
– Amenable to hand-optimization
• High memory bandwidth
– Video, 3-D graphics
– Caches not large enough
Multimedia ISA Extensions
• HP PA-RISC
– MAX-2
• Sun SPARC
– VIS
• Intel x86
– MMX
• MIPS
– MDMX
• PowerPC
– AltiVec
MMX
• “MMX Technology Extension to the Intel Architecture” Alex Peleg and Uri Weiser, IEEE Micro, August 1996
• Goals
– Improve performance of multimedia applications
» Graphics, MPEG video
» Image processing, speech recognition
– Remain completely compatible with Intel x86 ISA
– Minimize cost
• Approach
– Use packed data types
– Exploit SIMD parallelism
– Make use of existing wide data paths
Data Types and Operands
• Three fixed-point integer types packed into a 64-bit quadword
– Packed byte: 8 8-bit bytes
– Packed word: 4 16-bit words
– Packed doubleword: 2 32-bit doublewords
• User-controlled fixed point
• Eight 64-bit general-purpose registers (mm0-mm7)
• MMX shares the FPU
– Can’t do FP and MMX at the same time
• Random access to registers
– Learned lesson from FP unit design (the x87 register stack)
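The three packed types are three views of the same 64 bits. A minimal C sketch of that idea follows. The type name `mmx64` and the accessor names are hypothetical, chosen only for illustration, and lane 0 is assumed to be the least-significant element.

```c
#include <stdint.h>

/* Hypothetical name for one 64-bit quadword; not part of any real API. */
typedef uint64_t mmx64;

/* View the quadword as 8 packed bytes and return lane i (0 = least significant). */
uint8_t get_byte(mmx64 q, unsigned i)
{
    return (uint8_t)(q >> (8 * i));
}

/* View the quadword as 4 packed 16-bit words and return lane i. */
uint16_t get_word(mmx64 q, unsigned i)
{
    return (uint16_t)(q >> (16 * i));
}

/* View the quadword as 2 packed 32-bit doublewords and return lane i. */
uint32_t get_dword(mmx64 q, unsigned i)
{
    return (uint32_t)(q >> (32 * i));
}
```

Shifts are used instead of a union so the lane numbering does not depend on the byte order of the host machine.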
MMX Operations
• 57 MMX instructions work on all data types
• Support for saturation arithmetic
– Simplifies handling of underflow and overflow
– Matches physical behavior
• Packed operations
– Addition/subtraction, multiplication, compares, shifts
• Conversion operations
– Pack/unpack
• Performance improvement
– Fewer loads and stores
– Fewer arithmetic operations, but more conversion
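To illustrate why saturation "matches physical behavior", the sketch below adds eight packed unsigned bytes in two ways: with wraparound, where a bright pixel overflows to a dark one, and with saturation, where it clamps at 255. It is portable C that emulates the lane-wise behavior rather than using MMX itself, and the function names are hypothetical. The two variants correspond in spirit to the MMX `paddb` and `paddusb` instructions.

```c
#include <stdint.h>

/* Add 8 packed unsigned bytes; each lane wraps modulo 256. */
uint64_t add_wrap_u8x8(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        r |= (uint64_t)(uint8_t)(x + y) << (8 * i);
    }
    return r;
}

/* Add 8 packed unsigned bytes; each lane clamps at 255 on overflow. */
uint64_t add_sat_u8x8(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < 8; i++) {
        unsigned x = (uint8_t)(a >> (8 * i));
        unsigned y = (uint8_t)(b >> (8 * i));
        unsigned s = x + y;
        if (s > 255)
            s = 255;            /* saturate instead of wrapping */
        r |= (uint64_t)s << (8 * i);
    }
    return r;
}
```

For example, 200 + 100 in one lane gives 44 with wraparound but 255 with saturation, so no explicit overflow check is needed in the caller.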
MMX Operations
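• Packed multiply-add (word to doubleword)
– Inputs: A3 A2 A1 A0 and B3 B2 B1 B0 (packed words)
– Products: A3×B3, A2×B2, A1×B1, A0×B0
– Result: A3×B3 + A2×B2 | A1×B1 + A0×B0 (two doublewords)
• Packed compare greater-than (word)
– Inputs: 51 3 5 23 > 73 2 5 6 (compared lane by lane)
– Result: 00…0 11…1 00…0 11…1 (all-zeros or all-ones mask per word)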
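The two operations on this slide can be sketched in portable C as follows. This emulates the lane-wise semantics and is not MMX code. The function names are hypothetical, and arrays are indexed so that element 0 is the rightmost lane in the slide (A0, B0).

```c
#include <stdint.h>

/* Packed multiply-add: multiply four pairs of signed 16-bit words and
 * sum adjacent products into two 32-bit results.
 *   out[1] = a[3]*b[3] + a[2]*b[2]
 *   out[0] = a[1]*b[1] + a[0]*b[0]
 * The sum is formed in 64 bits; the single corner case where all four
 * inputs of a pair are -32768 does not fit in int32_t and is converted
 * in an implementation-defined way. */
void packed_madd(const int16_t a[4], const int16_t b[4], int32_t out[2])
{
    int64_t lo = (int64_t)a[0] * b[0] + (int64_t)a[1] * b[1];
    int64_t hi = (int64_t)a[2] * b[2] + (int64_t)a[3] * b[3];
    out[0] = (int32_t)lo;
    out[1] = (int32_t)hi;
}

/* Packed compare greater-than: each 16-bit lane of the mask becomes
 * all ones where a[i] > b[i], and all zeros otherwise. */
void packed_cmpgt(const int16_t a[4], const int16_t b[4], uint16_t mask[4])
{
    for (int i = 0; i < 4; i++)
        mask[i] = (a[i] > b[i]) ? 0xFFFF : 0x0000;
}
```

With the slide's inputs (51 3 5 23 compared against 73 2 5 6, leftmost lane first), the mask lanes come out as zeros, ones, zeros, ones. Note that 5 > 5 is false, so that lane is all zeros.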
Using MMX
• Assembly language coding
• Use of libraries
– E.g. IDCT, DCT, matrix multiply…
• Use of C macros (“intrinsics”)
– Generate optimized assembly code
– Performs register allocation and instruction scheduling
» MMX64 t0, t1;
» t0 = padd(t0, t1);
– Requires intimate knowledge of MMX
• Could a compiler generate MMX code?
Chroma Keying
• Weatherman example
» for (i = 0; i < imagesize; i++)
»     new_image[i] = (x[i] == blue) ? y[i] : x[i];
– MMX version (mm1 is assumed to already hold the blue key value in all 8 bytes)
    movq    mm3, mem1 ; load 8 pixels from weatherman
    movq    mm4, mem2 ; load 8 pixels from map
    pcmpeqb mm1, mm3  ; generate select mask
    pand    mm4, mm1  ; AND map with mask
    pandn   mm1, mm3  ; AND weatherman with inverse mask
    por     mm4, mm1  ; OR masked images together
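The chroma-keying loop and the MMX sequence above compute the same result. The sketch below shows the scalar loop next to a branch-free mask-and-merge version that follows the compare / AND / AND-NOT / OR steps one pixel at a time. It assumes 8-bit pixels and uses hypothetical function and parameter names; it is portable C, not MMX code.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: where the foreground pixel equals the key colour,
 * take the background (map) pixel; otherwise keep the foreground. */
void chroma_key_scalar(const uint8_t *x, const uint8_t *y, uint8_t *out,
                       size_t n, uint8_t blue)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (x[i] == blue) ? y[i] : x[i];
}

/* Branch-free version: build a per-pixel mask, then merge with AND/OR.
 * Each step corresponds to one step of the MMX sequence, which does the
 * same thing for 8 pixels per instruction. */
void chroma_key_masked(const uint8_t *x, const uint8_t *y, uint8_t *out,
                       size_t n, uint8_t blue)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t mask     = (x[i] == blue) ? 0xFF : 0x00;   /* compare equal   */
        uint8_t from_map = (uint8_t)(y[i] & mask);          /* AND with mask   */
        uint8_t from_fg  = (uint8_t)(x[i] & (uint8_t)~mask);/* AND-NOT         */
        out[i] = (uint8_t)(from_map | from_fg);             /* OR the halves   */
    }
}
```

The masked form replaces the data-dependent branch with logic operations, which is what lets the MMX version process a whole quadword of pixels with no conditional jumps.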