Post on 18-Jan-2016
Spatiotemporal Saliency Map of a Video Sequence in FPGA Hardware
David Boland
Acknowledgements: Professor Peter Cheung, Mr Yang Liu
What is Spatiotemporal Saliency?
Saliency – parts of a scene that appear pronounced
Spatiotemporal Saliency – parts of a scene that appear pronounced in video
Why Important?
General environments are complex and dynamic
The human eye handles this by focusing on salient objects
A real-time algorithm to emulate this has many uses:
Image processing
Surveillance
Machine vision
Navigation…
The Problem
Spatiotemporal Saliency algorithms have high computational complexity
They store a stack of video frames
Unsuitable for real-time use
Need algorithm with reduced memory requirements
Overview
Introduce the algorithm and the section completed
Brief background
Implementation
Software Model
Hardware Model
Results
Optimisations (if time)
Summary
Algorithm For Spatiotemporal Saliency
Feature Tracking Module
Object tracking generally achieved through monitoring optical flow
Optical flow: “the distribution of apparent velocities of movement of brightness patterns in an image”
Several algorithms exist, none perfect
Good trade-off of complexity vs. accuracy: Lucas-Kanade Algorithm
Lucas-Kanade Algorithm
Definition of problem:
Let I and J be two consecutive images
Let u = [ux, uy] be an image point in I
Find v = u + d = [ux+dx, uy+dy], where v is a similar point on J
Points not tracked equally due to aperture problem.
Solution is to minimise error:
$$\epsilon(d) = \epsilon(d_x, d_y) = \sum_{x=u_x-w_x}^{u_x+w_x} \; \sum_{y=u_y-w_y}^{u_y+w_y} \left( I(x,y) - J(x+d_x,\, y+d_y) \right)^2$$
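The error being minimised can be evaluated directly for a candidate displacement. The sketch below is only an illustration of that error function, not the Lucas-Kanade solver: it assumes images stored as lists of rows (`I[y][x]`), integer displacements, and a window that stays inside both images. The names `window_error` and `best_integer_shift` are hypothetical.

```python
def window_error(I, J, u, d, w):
    """Sum of squared differences between I around u and J around u + d.

    I, J: images as lists of rows, indexed I[y][x] (assumed layout).
    u = (ux, uy): point in I; d = (dx, dy): integer displacement;
    w = (wx, wy): half-window sizes.
    """
    ux, uy = u
    dx, dy = d
    wx, wy = w
    total = 0
    for y in range(uy - wy, uy + wy + 1):
        for x in range(ux - wx, ux + wx + 1):
            diff = I[y][x] - J[y + dy][x + dx]
            total += diff * diff
    return total


def best_integer_shift(I, J, u, w, search=1):
    """Brute-force the integer displacement with the smallest window error."""
    candidates = [(dx, dy)
                  for dy in range(-search, search + 1)
                  for dx in range(-search, search + 1)]
    return min(candidates, key=lambda d: window_error(I, J, u, d, w))
```

An exhaustive search like this grows quickly with the search range, which is one reason a gradient-based solution is attractive.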
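Lucas-Kanade Solution
$$v_{opt} = G^{-1} b$$
where
$$G = \sum_{x=u_x-w_x}^{u_x+w_x} \; \sum_{y=u_y-w_y}^{u_y+w_y} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad b = \sum_{x=u_x-w_x}^{u_x+w_x} \; \sum_{y=u_y-w_y}^{u_y+w_y} \begin{bmatrix} \delta I \, I_x \\ \delta I \, I_y \end{bmatrix}$$
$$I_x(x,y) = \left( I^L(x+1,y) - I^L(x-1,y) \right) / 2$$
$$I_y(x,y) = \left( I^L(x,y+1) - I^L(x,y-1) \right) / 2$$
$$\delta I(x,y) = I^L(x,y) - J^L(x + g_x^L + d_x^L,\; y + g_y^L + d_y^L)$$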
Find (Iteratively Refine)
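A single update of this solution can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation from the talk: images are lists of rows (`I[y][x]`), the current guess `d` is an integer offset, the window stays inside both images, and the 2x2 system is solved in closed form. The function names are hypothetical.

```python
def gradients(I, x, y):
    """Central-difference image gradients at (x, y)."""
    ix = (I[y][x + 1] - I[y][x - 1]) / 2.0
    iy = (I[y + 1][x] - I[y - 1][x]) / 2.0
    return ix, iy


def lk_step(I, J, u, w, d=(0, 0)):
    """One Lucas-Kanade update v = G^-1 b around point u.

    Returns None when G is (numerically) singular, which is the
    aperture problem: the window has no usable 2-D gradient structure.
    """
    ux, uy = u
    wx, wy = w
    gxx = gxy = gyy = bx = by = 0.0
    for y in range(uy - wy, uy + wy + 1):
        for x in range(ux - wx, ux + wx + 1):
            ix, iy = gradients(I, x, y)
            di = I[y][x] - J[y + d[1]][x + d[0]]  # image difference
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
            bx += di * ix
            by += di * iy
    det = gxx * gyy - gxy * gxy
    if abs(det) < 1e-9:
        return None
    # Closed-form inverse of the symmetric 2x2 matrix G applied to b.
    vx = (gyy * bx - gxy * by) / det
    vy = (gxx * by - gxy * bx) / det
    return vx, vy
```

Iterative refinement would repeat this step, adding each `v` to the running guess until the update becomes small.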
Pyramidal Lucas-Kanade Algorithm
The Lucas-Kanade algorithm assumes small motion
Larger motion can be handled with a larger window size, but accuracy is lost
Solution: create a hierarchy of images
Each image is ½ as large as the level below
Perform Lucas-Kanade on each level to get a guess
Map the guess to lower levels
Pyramidal Lucas-Kanade Algorithm
Track feature between two images at the highest level to obtain guess for new feature location
Map guess to lower levels, obtain better guess
Find final pixel location
[Figure: image pyramid. Apply LK at the highest level; at each lower level, apply LK starting at the mapped guess.]
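The coarse-to-fine flow of guesses described above can be sketched as follows. It assumes each level halves the image size, so a guess doubles when it is mapped one level down; the per-level tracker is passed in as a function, and `pyramidal_track` and `track_level` are hypothetical names rather than anything from the original design.

```python
def pyramidal_track(pyr_I, pyr_J, u, track_level):
    """Track point u from image I to image J using image pyramids.

    pyr_I[0], pyr_J[0] are the full-resolution images; the last entries
    are the coarsest level. track_level(IL, JL, uL, g) must return the
    residual displacement (dx, dy) found at one level, given guess g.
    """
    top = len(pyr_I) - 1
    g = (0.0, 0.0)  # guess carried down from the level above
    d = (0.0, 0.0)  # residual displacement found at the current level
    for L in range(top, -1, -1):
        uL = (u[0] / 2 ** L, u[1] / 2 ** L)  # point position at level L
        d = track_level(pyr_I[L], pyr_J[L], uL, g)
        if L > 0:
            # Map the refined guess to the next lower (larger) level.
            g = (2 * (g[0] + d[0]), 2 * (g[1] + d[1]))
    # Final location in the full-resolution image.
    return (u[0] + g[0] + d[0], u[1] + g[1] + d[1])
```

Because each level only needs to recover a small residual, several small steps add up to a large motion at full resolution.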
Implementation – Software Model
Why?
Results to test the hardware against
Useful during debugging stage
Choice of software language: Matlab
Matrix calculations
Maps well to hardware
Simple for fast development
Method:
Apply feature detection algorithm to find co-ordinates
Apply Pyramidal Lucas-Kanade to track co-ordinates
Software Model - Demo
Implementation – Hardware
Aims:
Fit onto the FPGA
Clock frequency of 65 MHz for VGA
Not straightforward. The initial design emulated the software correctly, but:
It was well over 200% of the size of the FPGA
It ran at 4 MHz
Hardware Considerations
Choice of language: Handel-C
Minimise expensive operations:
Memory accesses
Multiplication
Division
Maintain precision
Floating point precision unavailable
General optimisations:
Minimise delay path or logic depth
Minimise fan-out
Memory Considerations – Building Hierarchy
To build the image of a higher level:
Iterate over even pixels
Collect a mask of values surrounding the pixel
Weight as shown on right
Sum
Repeat recursively on the output for higher levels
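The steps above can be sketched as below. The weights from the slide's figure are not recoverable from the text, so this sketch assumes a 3x3 binomial mask (1 2 1 / 2 4 2 / 1 2 1, summing to 16), integer pixel values, and clamping at the image borders; all three are assumptions, and the function names are hypothetical.

```python
# Assumed weights: the slide's own mask is not available in the text.
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # sums to 16


def next_level(img):
    """Build the next pyramid level by filtering around every even pixel."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            acc = 0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    yy = min(max(y + ky, 0), h - 1)  # clamp at borders
                    xx = min(max(x + kx, 0), w - 1)
                    acc += KERNEL[ky + 1][kx + 1] * img[yy][xx]
            row.append(acc >> 4)  # divide by 16 with a right shift
        out.append(row)
    return out


def build_pyramid(img, levels):
    """Level 0 is the input; each further level is built from the previous."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(next_level(pyr[-1]))
    return pyr
```

With power-of-two weights, every multiplication and the final division reduce to shifts, which suits a hardware implementation.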
Memory Considerations – Building Hierarchy
Pixels are re-used:
Store locally
Reduce memory reads
Memory Considerations – Building Hierarchy
Memory Considerations – Optical Flow
Only read values once from main memory
Also reduce fan-out
$$I_x(x,y) = \left( I^L(x+1,y) - I^L(x-1,y) \right) / 2$$
$$I_y(x,y) = \left( I^L(x,y+1) - I^L(x,y-1) \right) / 2$$
Multiplications
Avoid via left-shifting
Pre-compute results whenever possible
Use dedicated multipliers
Combined for large multiplications
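As a small illustration of avoiding a multiplier through left-shifting, a multiplication by a known constant can be decomposed into shifts and additions. This is a generic sketch for non-negative integers, not code from the design; the function name is hypothetical.

```python
def times_constant_by_shifts(x, c):
    """Multiply non-negative integers x and c using only shifts and adds."""
    result = 0
    bit = 0
    while c:
        if c & 1:
            result += x << bit  # add x * 2**bit for each set bit of c
        c >>= 1
        bit += 1
    return result
```

For example, multiplying by 10 becomes `(x << 3) + (x << 1)`, so a constant with few set bits costs only a few adders.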
Division Considerations
Division is a costly process
Handel-C designs hardware to implement it in one cycle
A large number of bits implies a large delay
Solution: spread over multiple cycles
Long division: slow, with an unbounded stage
Binary search: possible if the range of optical flow per iteration is limited to [-1, 1]
Division Considerations
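[Figure: binary search tree. A is compared first against 0.5 B, then against 0.25 B or 0.75 B, then against 0.125 B, 0.375 B, 0.625 B or 0.875 B, following the ≥ or < branch at each node. The leaves are 0 B, 0.25 B, 0.5 B, 0.75 B and 1 B.]
A/B = x ≡ A = B*x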
Division Considerations
[Figure: the binary search tree again, with each ≥ branch labelled 1 and each < branch labelled 0. The leaves carry the resulting bit codes: 111, 110/101, 100/011, 010/001 and 000.]
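The binary search on these slides replaces a divider with one comparison per result bit, using the identity A/B = x ≡ A = B*x. The sketch below assumes 0 ≤ A ≤ B with B > 0 and integer operands; handling the sign needed for the [-1, 1] range is left out, and the function name is hypothetical.

```python
def divide_by_search(a, b, bits=3):
    """Approximate a / b as code / 2**bits using compares, shifts and adds.

    Assumes integers with 0 <= a <= b and b > 0. Each iteration tests one
    threshold of the tree (0.5 b, then 0.25 b or 0.75 b, and so on) and
    records a 1 for a >= outcome or a 0 for a < outcome.
    """
    code = 0
    threshold = 0            # equals code * b at every step
    scaled_a = a << bits     # compare a * 2**bits against code * b
    for i in range(bits - 1, -1, -1):
        trial = threshold + (b << i)
        if scaled_a >= trial:
            threshold = trial
            code |= 1 << i
    return code
```

For instance, `divide_by_search(5, 8)` yields the code `0b101`, which reads as 0.625. Each extra bit of precision costs one more comparison stage, so the work can be spread over a fixed number of cycles.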
Hardware Testing
Test against software model:
Store feature co-ordinates & tracked locations from software model
Load feature co-ordinates in hardware
Track in hardware
Compare difference
Vary number of fractional bits:
Examine importance/cost of different fractional precision
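The comparison step can be mimicked in software by quantising values to a chosen number of fractional bits and counting how many results differ from the reference. This is a hedged sketch: truncation as the rounding mode, the `tolerance` used to separate "different" from "significantly different", and all function names are assumptions, since the deck does not specify them.

```python
import math


def to_fixed(value, frac_bits):
    """Fixed-point integer for value with frac_bits fractional bits (truncating)."""
    return math.floor(value * (1 << frac_bits))


def quantise(value, frac_bits):
    """Value as it would be represented with frac_bits fractional bits."""
    return to_fixed(value, frac_bits) / (1 << frac_bits)


def percent_different(software, hardware, tolerance=0.0):
    """Percentage of co-ordinates that differ by more than tolerance."""
    differing = sum(1 for s, h in zip(software, hardware)
                    if abs(s - h) > tolerance)
    return 100.0 * differing / len(software)
```

A sweep over `frac_bits` from 0 to 7, with `hardware = [quantise(s, frac_bits) for s in software]` as a stand-in for the hardware results, gives the kind of accuracy-versus-precision curve the test plan aims for.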
Accuracy Results (I)
Percentage of co-ordinates tracked differently in hardware and software models
[Chart: percentage (axis 0 to 100 %) against number of fractional bits (0 to 7)]
Accuracy Results (II)
Percentage of co-ordinates tracked significantly differently in hardware and software models
[Chart: percentage (axis 0 to 60 %) against number of fractional bits (0 to 7)]
Area Results
Resource Usage for Fractional Bits
[Chart: number of resources (axis 0 to 12000) against number of fractional bits (0 to 7); series: Flip Flops, Slices, Look-up tables]
Speed Results
Clock Frequency vs Number of Fractional Bits
[Chart: clock frequency in Hz (axis 0 to 60,000,000) against number of fractional bits (0 to 7)]
Results Summary
Final design only uses 1/6 of the FPGA
Use 4, 5 or 6 fractional bits for good accuracy
Speed short of desired (approx. 50 MHz)
ISE estimates are cautious
Pipelining can increase this
Reduced loop control
Optimisations
Final design only uses 1/6 of the FPGA. Use the space to increase speed:
Pipelined hardware
Parallel hardware
Pipelined Architecture I
Pipelined Architecture II
Parallel Architecture
Summary
Spatiotemporal Saliency framework
Role of optical flow within framework
Steps to create & test hardware implementation
Effective method to find optical flow
High speed/accuracy, small area
Optimisations to achieve this
Further improvements possible
Some performance advantages over other hardware optical flow implementations
Optical flow useful beyond Spatiotemporal Saliency framework