Computational Neuroscience Final Project – Depth Vision
COMPUTATIONAL NEUROSCIENCE FINAL PROJECT – DEPTH VISION
Omri Perez, 2013
INTRO
DEPTH CUES
– Pictorial Depth Cues
– Physiological Depth Cues
– Motion Parallax
– Stereoscopic Depth Cues
PHYSIOLOGICAL DEPTH CUES
Two Physiological Depth Cues:
1. Accommodation
2. Convergence
PHYSIOLOGICAL DEPTH CUES
Accommodation:
– relaxed lens = far away
– accommodating lens = near
What must the visual system be able to compute unconsciously?
PHYSIOLOGICAL DEPTH CUES
Convergence:
– small angle of convergence = far away
– large angle of convergence = near
– What two sensory systems is the brain integrating?
– What happens to images closer or farther away from fixation point?
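The geometry behind the convergence cue can be made concrete with a small sketch. Assuming symmetric fixation and an average interpupillary distance (both numbers below are illustrative assumptions, not from the slides), the fixation distance follows from simple trigonometry:

```python
import numpy as np

def depth_from_convergence(angle_rad, ipd=0.064):
    """Estimate fixation distance from the convergence angle.

    Symmetric-fixation geometry: each eye rotates inward by angle/2,
    so the fixation point lies at (ipd/2) / tan(angle/2).
    ipd = interpupillary distance in metres (0.064 is an assumed average).
    """
    return (ipd / 2.0) / np.tan(angle_rad / 2.0)

# A large convergence angle means near, a small angle means far.
near = depth_from_convergence(np.deg2rad(10.0))  # ~0.37 m
far = depth_from_convergence(np.deg2rad(0.5))    # ~7.3 m
```

This also hints at why convergence is informative mostly at near distances: the angle changes rapidly for near fixation but barely at all beyond a few metres.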
MOTION DEPTH CUES
– Parallax: points at different locations in the visual field move at different speeds depending on their distance from fixation
http://www.youtube.com/watch?v=ktdnA6y27Gk&NR=1
SEEING IN STEREO
It’s very hard to read words if there are multiple images on your retina
But how many images are there on your retinae?
BINOCULAR DISPARITY
Your eyes have a different image on each retina.
Hold a pen at arm’s length and fixate the spot:
– how many pens do you see?
– which pen matches which eye?
BINOCULAR DISPARITY
Now fixate the pen:
– how many spots do you see?
– which spot matches which eye?
BINOCULAR DISPARITY
Binocular disparity is the difference between the two images
BINOCULAR DISPARITY
Disparity depends on where the object is relative to the fixation point:
– objects closer than fixation project images that “cross”
– objects farther than fixation project images that do not “cross”
BINOCULAR DISPARITY
Corresponding retinal points
BINOCULAR DISPARITY
Points in space that project onto corresponding retinal points define a surface called the horopter.
BINOCULAR DISPARITY
Points not on the horopter will be disparate on the retina (they project images onto non-corresponding points)
The nature of the disparity depends on where they are relative to the horopter
BINOCULAR DISPARITY
– points nearer than the horopter have crossed disparity
– points farther than the horopter have uncrossed disparity
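In the simplified pinhole model commonly used in computational stereo (standard textbook material, not stated on the slides), the magnitude of the disparity maps to depth as Z = f·B/d. A sketch with hypothetical numbers:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole-camera stereo: Z = f * B / d.

    disparity_px: horizontal pixel shift between the two images,
    focal_px: focal length expressed in pixels,
    baseline_m: separation between the two eyes/cameras.
    Larger disparity (nearer than fixation) -> smaller depth.
    """
    return focal_px * baseline_m / disparity_px

# hypothetical values: 600 px focal length, 6.4 cm baseline
z_near = depth_from_disparity(40.0, 600.0, 0.064)  # 0.96 m
z_far = depth_from_disparity(4.0, 600.0, 0.064)    # 9.6 m
```

The inverse relationship is why the project reports depth simply as pixel shifts: given the camera geometry, shift and metric depth are interchangeable.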
BINOCULAR DISPARITY
Why don’t we see double vision?
Images with a small enough disparity are fused into a single image
The region of space that contains images with close enough disparity to be fused is called Panum’s Area
BINOCULAR DISPARITY
Panum’s Area extends just in front of and just behind the horopter
STEREOPSIS
Our brains interpret crossed and uncrossed disparity as depth
That process is called stereoscopic depth perception or simply stereopsis
STEREOPSIS
Stereopsis requires that the brain can encode the two retinal images independently
STEREOPSIS
Primary visual cortex (V1) has bands of neurons that keep input from the two eyes separate
CORTICAL HYPERCOLUMNS IN V1
The basic processing unit of depth perception
A cortical hypercolumn consists of a complete set of orientation columns covering a full 180º cycle, together with a pair of right- and left-eye ocular dominance columns. A hypercolumn is about 1 mm wide.
OUR GOAL
To compute the binocular depth of stereo images
[Figure: left and right stereo images]
SIMULATING RECEPTIVE FIELDS IN V1
To emulate the receptive fields of V1 neurons we use the Gabor function.
[Figure: even- and odd-symmetric Gabor filters. Sinusoid .* 2D Gaussian = Gabor]
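The “sinusoid .* 2D Gaussian” recipe can be sketched in a few lines. The course supplies generate_filters.m; the NumPy version below is only an illustration, and all parameter values are made up:

```python
import numpy as np

def gabor(size=51, wavelength=10.0, theta=0.0, sigma=8.0, phase=0.0):
    """2D Gabor: a sinusoidal grating windowed by a 2D Gaussian.

    phase=0 gives the even-symmetric (cosine) filter,
    phase=np.pi/2 the odd-symmetric (sine) one.
    All parameter values here are illustrative, not taken from
    the course's generate_filters.m.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)        # rotated coordinate
    grating = np.cos(2 * np.pi * xr / wavelength + phase)
    gaussian = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return grating * gaussian                         # elementwise, MATLAB's .*

even = gabor(phase=0.0)          # symmetric about the centre
odd = gabor(phase=np.pi / 2)     # antisymmetric about the centre
```

Rotating theta produces the full bank of orientation-tuned filters, mirroring the orientation columns of a V1 hypercolumn.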
FILTERING THE IMAGES WITH GABOR FILTERS
We filter by computing a 2D convolution of each filter with the image; the individual filter responses are then averaged together.
[Figure: filtered left and right images]
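A minimal NumPy sketch of this filtering step. The course supplies filter_with_all_filters.m; the stand-in below assumes odd-sized kernels and zero padding:

```python
import numpy as np

def conv2_same(image, kernel):
    """Zero-padded, 'same'-size 2D convolution (flip kernel, slide, sum).

    Assumes an odd-sized kernel so the output aligns with the input.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    flipped = kernel[::-1, ::-1]                 # convolution flips the kernel
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def filter_with_all_filters(image, filters):
    """Convolve the image with every filter and average the responses."""
    responses = [conv2_same(image, f) for f in filters]
    return np.mean(responses, axis=0)
```

The explicit double loop is slow but transparent; in practice one would use an FFT-based convolution for realistically sized images.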
ESTIMATION OF BINOCULAR DISPARITY
2D cross correlation (xcorr2)
Tip: in most cases, the peak cross-correlation shift along the x axis (columns) between the left- and right-eye patches should only be positive!
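One way to picture the “maximum of cross correlation” idea: slide a left-image patch along the matching strip of the right image, searching only non-negative shifts as the tip suggests. This pure-NumPy sketch is a stand-in for xcorr2, not the supplied course code:

```python
import numpy as np

def patch_disparity(left_patch, right_strip, max_shift=30):
    """Disparity of one left-image patch by 'maximum of cross correlation'.

    Slides the patch along the matching rows of the right image and keeps
    the shift with the largest raw correlation. Only non-negative shifts
    are searched, following the tip above. A pure-NumPy stand-in for
    MATLAB's xcorr2, not the supplied course code.
    """
    h, w = left_patch.shape
    best_shift, best_score = 0, -np.inf
    for shift in range(0, max_shift + 1):
        if shift + w > right_strip.shape[1]:
            break                                 # patch would run off the strip
        window = right_strip[:, shift:shift + w]
        score = np.sum(left_patch * window)       # raw cross correlation
        if score > best_score:
            best_score, best_shift = score, shift
    return best_shift
```

Raw correlation favours bright regions, which is one reason a later slide recommends normalizing the xcorr2 output before feeding it to the LIF-based methods.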
BASIC ALGORITHM – MAXIMUM OF CROSS CORRELATION
RANDOM DOT STEREOGRAM (RDS)
[Figure: random dot stereogram]
3 VARIATIONS:
1. Maximum of cross correlation.
2. First neuron to fire in a 2D LIF array (winner take all). The input is the cross-correlation result.
3. Population vector of the 2D LIF array after X simulation steps, with the same input.
Note: the horizontal smearing in methods 2 and 3 is caused by cross-over activity when switching patches.
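Variation 2 (“first neuron to fire wins”) can be sketched with a toy LIF array. The course’s LIF_one_step_multiple_neurons.m defines its own dynamics and constants; everything below is illustrative:

```python
import numpy as np

def lif_winner_take_all(input_current, tau=10.0, threshold=1.0,
                        dt=1.0, max_steps=1000):
    """First-to-fire in a 2D array of leaky integrate-and-fire neurons.

    Every cell integrates dV/dt = (-V + I) / tau from V = 0; the first
    neuron to cross threshold 'wins', and its (row, col) index gives the
    estimated disparity. All constants are illustrative; the course's
    LIF_one_step_multiple_neurons.m defines its own dynamics.
    """
    v = np.zeros_like(input_current, dtype=float)
    for _ in range(max_steps):
        v += dt * (-v + input_current) / tau      # Euler step for all cells
        fired = np.argwhere(v >= threshold)
        if fired.size:
            return tuple(fired[0])                # first neuron to fire
    return None                                   # nothing crossed threshold

# the neuron driven by the strongest (normalized) correlation fires first
winner = lif_winner_take_all(np.array([[0.5, 2.0], [1.5, 0.3]]))
```

Because the steady-state voltage equals the input current, the cell receiving the largest correlation value always reaches threshold first, which is exactly the winner-take-all readout.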
YOUR TASK
1. Load a pair of stereo images. You can use the supplied function image_loader.m, which makes sure the image is in grayscale and has a proper dynamic range.
2. Generate the filters. You can use the supplied function generate_filters.m to generate the array of filters. I urge you to try out sizes (3rd parameter) other than the defaults in the function.
3. Filter each of the two images using the filters from the previous stage. You can use the function filter_with_all_filters.m to do this.
4. Now, using the two filtered images, iterate over patches, calculate the cross-correlation matrix and determine the current depth using methods 1–3 described on the previous slide. You can tweak the overlap of the patches to reduce computation time.
Notes for methods 2 and 3:
a. You can use the supplied function LIF_one_step_multiple_neurons.m to simulate the LIF neurons.
b. It is wise to normalize the xcorr2 results, e.g. by dividing by the maximum value or some similar normalization.
YOUR TASK – CONTINUED
5. Incorporate all these into a function of the form:

result = find_depth_with_LIF( Left_im_name, Right_im_name, method_num, patch_size, use_filters )

where:
– result is the matrix representing the estimated depth (pixel shifts).
– Left_im_name and Right_im_name are the names of the stereo images.
– method_num is the number of the method (1–3, see above).
– patch_size is the ratio of the patch size to the image dimensions; e.g. in an image that is 640x480, a ratio of 1/15 will produce a patch of size ~43x32. Please note that in MATLAB the indexing convention is rows x cols (not x,y), so the image is actually 480x640.
– use_filters is a flag that determines whether to filter the images (step 3 on the previous slide) before computing the depth map. This is useful for debugging, but it should be set to true when generating the final results.
In addition to the supplied stereo image pairs, you should also generate a left and right random dot stereogram image pair using the supplied function RDS_generator.m together with a mask (I supplied you with an example mask, RDS_Pac-Man_mask.png)
You can find other stereo image pairs online, e.g. http://vasc.ri.cmu.edu/idb/html/stereo/index.html
Bonus: You can add a fourth depth estimation method of your choice. This can be something you read somewhere or an original idea. For example, you can use one of methods 1–3 but change the patch scan so it is not an orderly scan (right to left, then down one row and right to left again). Instead it can be a random scan which, among other things, will leave several regions uncalculated while sampling other regions more tightly.
WHAT TO HAND IN (E-MAIL IN)
1. Your code together with any images (regular and RDS) you used and the supplied images.
2. A document showing the depth results on the two supplied stereo image pairs and on one RDS you generated, for each of the 3 methods. (If you chose to do the bonus, show the results for the bonus method as well.) When showing results for the RDS, don’t forget to relate them to the mask used to generate it. The document should contain a concise explanation of what you did, your algorithms, and your interpretation of the results.
HOW TO HAND IN
The project should be submitted by mail to Omri.
Good Luck and a successful test period (and vacation?)!!!