CDDC - Specification FormTemplate


{Your Projects Name} Specification Document


{FPGA Stereovision}

{ChandraKanth Pamrthi}{[email protected]}

{Prasanth Verma}{[email protected]}

Submitted for CoreEL Digilent Design Contest 2013

{19 - 04 - 2013}

Advisor: {Joycee Mekie}

{IIT Gandhinagar}{Ahmedabad, Gujarat}



Project:

FPGA Stereovision

Brief Overview:

This project describes the development of an integrated stereovision sensor intended for mobile platforms such as robots or intelligent vehicles. Companies that build Intelligent Transportation Systems (ITS) are eager to integrate sensors and perceptual algorithms into cars for a range of applications: obstacle detection on motorways or in urban traffic, lane departure detection, parking assistance, navigation, cockpit and driver monitoring, etc.

Monocular vision has been proposed to detect obstacles (cars or pedestrians) in urban scenes, but without assumptions about the environment (for example, with no flat-road approximation) it cannot cope with complex situations, so it is generally combined with other kinds of sensors (e.g. radar or laser devices).

Stereovision is widely used in the robotics community, typically to evaluate terrain navigability at short distances (Matthies 1992). Several companies (e.g. Videre Design Company) offer stereo rigs with a short baseline (10 cm), well suited to indoor perception. Stereovision has also been evaluated in ITS applications for many years, but the real-time requirements, the limitations of the depth field, and the lack of robustness make it difficult to use in changing contexts.

This approach is used in many applications. For example, stereo is the main sensor for outdoor terrestrial robots, and it is also used for detecting free parking slots, assisting parking maneuvers, and detecting pedestrians in urban scenes.

Design Overview:

Initially, the original right and left images are processed independently. The distortion correction and rectification step produces two aligned, rectified images. The key components of this project are:

- Multiple image sensors (only 2 cameras in this project)

- FPGA interfacing with Ethernet

- Interface board to connect the image sensors to the FPGA board

Image rectification is a crucial first step in many image processing tasks, especially in stereovision. Most stereovision algorithms depend on the input images conforming to a simplified epipolar geometry with coplanar image planes. This allows the assumption that a given point in one image can be found in the same row of the other image (provided that the point is not occluded), which dramatically reduces the search space.
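
To illustrate why the same-row assumption shrinks the search, here is a minimal Python sketch of a one-dimensional match along a single row. The function name `match_along_row`, the sum-of-absolute-differences cost, and the `half` and `max_disp` parameters are illustrative assumptions, not part of this project's design; images are plain lists of rows of grey levels.

```python
def match_along_row(left, right, row, col, half=1, max_disp=8):
    """Return the disparity d in 0..max_disp that minimises the sum of
    absolute differences (SAD) between the window centred on (row, col)
    in the left image and the window centred on (row, col - d) in the
    right image. Only the column changes: rectification is what lets
    the search stay on one row.

    The caller must keep the window inside both images vertically and
    on the right-hand side (hypothetical helper, no bounds checks there).
    """
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        if col - d - half < 0:
            break  # the window would fall off the right image
        cost = 0
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                cost += abs(left[row + dy][col + dx]
                            - right[row + dy][col - d + dx])
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic example: the left image is the right image shifted by 3 pixels.
right_img = [[(c * c + 3 * r) % 97 for c in range(24)] for r in range(5)]
left_img = [[row[c - 3] if c >= 3 else 0 for c in range(24)]
            for row in right_img]
disparity = match_along_row(left_img, right_img, row=2, col=12)
```

Without rectification the inner search would have to cover a two-dimensional neighbourhood instead of `max_disp + 1` candidates on one row.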



We cannot easily build a system with distortion-free cameras and lenses in perfect alignment, so the image rectification step is required to take real-world image data and turn it into something resembling the ideal case. A calibration process is run on the uncorrected stereo image data to determine what transformation the rectification step has to perform.

FPGA logic:

The most straightforward way for an FPGA to implement this rectification step is with look-up tables: for each rectified output pixel, a table entry indicates the source pixel. A naive implementation that allows each table entry to reference any location in the source image would be very memory-intensive; sub-pixel resolution would only compound matters.
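
The memory cost of the naive table can be estimated with a short Python sketch. The function names are hypothetical, and the 752 x 480 frame size is an assumption based on the MT9V032 sensor requested for this project; the second function shows the table being applied with nearest-neighbor sampling.

```python
def naive_lut_bits(width, height, frac_bits=0):
    """Bits needed for a full table that stores one absolute (x, y)
    source coordinate per output pixel, with `frac_bits` bits of
    sub-pixel precision added to each axis."""
    x_bits = (width - 1).bit_length() + frac_bits
    y_bits = (height - 1).bit_length() + frac_bits
    return width * height * (x_bits + y_bits)


def rectify_nearest(src, lut):
    """Apply a full table: lut[y][x] = (src_x, src_y), integer
    coordinates, one entry per output pixel."""
    return [[src[sy][sx] for (sx, sy) in row] for row in lut]


# Assumed frame size: 752 x 480 (10-bit X, 9-bit Y).
whole_pixel_bits = naive_lut_bits(752, 480)              # 6,858,240 bits
quarter_pixel_bits = naive_lut_bits(752, 480, frac_bits=2)  # 8,302,080 bits
```

Under these assumptions the table alone takes roughly 0.86 MB per camera at whole-pixel resolution and about 1.04 MB with two fractional bits per axis, which is why the more compact encodings matter.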

A better implementation might, for example, encode coordinate differences between adjacent pixels (under the perfectly reasonable assumption that the source coordinate for a particular output pixel will be very similar to the source coordinate for its neighbor). An 8-bit value could encode both a 4-bit X difference and a 4-bit Y difference, each of which could itself be a signed fixed-point fraction (e.g. with one fractional bit, a range of -4.0 to +3.5).
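
A difference-encoded table entry could be packed and unpacked as in the following Python sketch. The nibble layout (X in the high nibble, Y in the low nibble), the two's-complement representation, and the single fractional bit (step 0.5, range -4.0 to +3.5) are all assumptions chosen for illustration; the names are hypothetical.

```python
FRAC_BITS = 1  # assumption: one fractional bit per 4-bit field (step 0.5)


def encode_delta(dx, dy):
    """Pack two coordinate differences into one byte.
    High nibble = dx, low nibble = dy, each 4-bit two's complement."""
    def to_nibble(v):
        q = round(v * (1 << FRAC_BITS))
        if not -8 <= q <= 7:
            raise ValueError("delta out of range for a 4-bit field")
        return q & 0xF
    return (to_nibble(dx) << 4) | to_nibble(dy)


def decode_delta(byte):
    """Unpack one byte back into (dx, dy) as floats."""
    def from_nibble(n):
        q = n - 16 if n & 0x8 else n  # sign-extend the 4-bit value
        return q / (1 << FRAC_BITS)
    return from_nibble(byte >> 4), from_nibble(byte & 0xF)


def walk_row(start, deltas):
    """Rebuild absolute source coordinates for one output row from a
    full-precision start coordinate and one packed delta per step."""
    x, y = start
    coords = [(x, y)]
    for b in deltas:
        dx, dy = decode_delta(b)
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


row_coords = walk_row((10.0, 5.0), [encode_delta(1.0, -0.5),
                                    encode_delta(1.0, 0.0)])
```

Each row still needs one absolute starting coordinate, but every following pixel costs 8 bits instead of a full coordinate pair.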



Alternatively (or possibly in conjunction with this), one could use a lower-resolution look-up table with simple linear interpolation between entries. The size of that lower-resolution table would depend on the severity of the distortion being corrected and on the amount of error that is tolerable (relative to an ideal full-resolution table).

With coordinates in hand, the FPGA can then use a simple sampling algorithm (e.g. bilinear interpolation) to generate the output pixels. For all but nearest-neighbor interpolation, the sampling algorithm would need to read multiple source pixels for each output pixel, so a cache would be needed to reduce external memory accesses.
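
The two interpolation steps can be sketched in Python as follows. This is a floating-point illustration under stated assumptions, not the FPGA implementation: the coarse table is assumed to be stored as two planes (`table_x`, `table_y`) with one entry every `step` output pixels, coordinates are assumed non-negative, and all names are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in 0..1."""
    return a + (b - a) * t


def sample_bilinear(img, fx, fy):
    """Blend the four entries of `img` that surround the fractional
    point (fx, fy). Reads up to four source values per output value,
    which is why a cache or row buffer is useful."""
    x0, y0 = int(fx), int(fy)
    x1 = min(x0 + 1, len(img[0]) - 1)  # clamp at the right edge
    y1 = min(y0 + 1, len(img) - 1)     # clamp at the bottom edge
    tx, ty = fx - x0, fy - y0
    top = lerp(img[y0][x0], img[y0][x1], tx)
    bottom = lerp(img[y1][x0], img[y1][x1], tx)
    return lerp(top, bottom, ty)


def coarse_lookup(table_x, table_y, step, x, y):
    """Interpolate the source coordinate for output pixel (x, y) from a
    low-resolution table holding one entry every `step` pixels."""
    gx, gy = x / step, y / step
    return (sample_bilinear(table_x, gx, gy),
            sample_bilinear(table_y, gx, gy))


# 2 x 2 coarse table covering an 8 x 8 block of output pixels.
tx_plane = [[0.0, 8.0], [0.5, 8.5]]
ty_plane = [[0.0, 0.25], [8.0, 8.25]]
src_xy = coarse_lookup(tx_plane, ty_plane, 8, 4, 4)   # (4.25, 4.125)
pixel = sample_bilinear([[0, 10], [20, 30]], 0.5, 0.5)  # 15.0
```

The same bilinear routine serves both purposes here: once to interpolate coordinates from the coarse table, and once to sample the source image at the resulting fractional coordinate.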

For well-controlled distortion, it would be possible to perform rectification in a streaming fashion without any reliance on external memory. Each image sensor would write directly into an internal memory large enough to hold several rows' worth of image data. Each output row could then be generated entirely from this input buffer, provided that the range of source Y-coordinates within each output row spans less than the height of the input buffer.
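
Whether a given calibration fits this streaming scheme can be checked offline. The Python sketch below computes, for a hypothetical table `lut[y][x] = (src_x, src_y)`, the largest span of source rows touched by any single output row; `taps` is an assumed parameter for the number of source rows the sampler reads per pixel (2 for bilinear, 1 for nearest-neighbor). The result is a lower bound on the buffer height and ignores the rows the sensor is writing concurrently.

```python
import math


def rows_needed(lut, taps=2):
    """Smallest input-buffer height (in rows) that can hold every
    source row needed by the most demanding output row."""
    worst = 0
    for row in lut:
        ys = [sy for (_, sy) in row]
        first = math.floor(min(ys))            # topmost source row read
        last = math.floor(max(ys)) + taps - 1  # bottommost source row read
        worst = max(worst, last - first + 1)
    return worst


# Tiny example table: the first output row wanders from y = 0.0 to y = 2.6.
example_lut = [
    [(0, 0.0), (1, 0.4), (2, 2.6)],
    [(0, 1.0), (1, 1.2), (2, 1.9)],
]
buffer_rows = rows_needed(example_lut)  # 4 with bilinear sampling
```

If the returned height exceeds the internal memory available for one camera, the distortion is not "well-controlled" in the sense above and an external frame buffer would be needed.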

Implementation plans:

Development Environment: ISE WebPACK Simulator and Analysis Tool, PlanAhead, ChipScope Pro

FPGA board chosen: Spartan 3E FPGA



Digilent products requested for the project:

No.  Product             Quantity  Notes
1.   Spartan 3E          1         -
2.   MT9V032 LVD camera  2         Image sensor

Signatures

{Participants printed name, signature} {Advisors printed name, signature}

{Participants printed name, signature}

{Participants printed name, signature}
