Accelerometers to Augmented Reality


iOSDevCampDC 2011 talk by Jonathan Blocksom


From Accelerometers to Augmented Reality

Jonathan Blocksom (@jblocksom)

iOSDevCamp 2011, August 13, 2011

Jonathan Saggau (@jonmarimba), present in spirit

The Theme

• Mobile devices are not just for viewing information on a screen; they can be a gateway to interacting with the world around us.

• What can we do with them and how do we do it?

iOS and the World Around Us

Purpose           Sensor                 SDK
Device Movement   Accelerometer, Gyros   Core Motion
Geolocation       GPS, Magnetometer      Core Location
Video             Camera                 AVFoundation
Audio             Microphone             AVFoundation, Core Audio

Core Motion

• High level interface to orientation and movement data

• iOS 4+

• Accelerometer and Gyroscope

• Sensor Fusion and Data Filtering

Motion Sensors: Accelerometer

• Accelerometer

• From iPhone 1

• Noisy Gravity Detector

• 100Hz, +/- 2.3G
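Since the accelerometer is essentially a noisy gravity detector, samples are usually smoothed before use. Below is a minimal sketch, not from the talk, of the classic single-pole low-pass filter pattern from Apple's era-appropriate accelerometer samples; the kFilteringFactor value and the UIAccelerometerDelegate callback shown are assumptions.

#define kFilteringFactor 0.1  // assumption: smaller = smoother but laggier gravity estimate

static UIAccelerationValue gravityX, gravityY, gravityZ;

// UIAccelerometerDelegate callback (the pre-Core Motion API this slide covers).
- (void)accelerometer:(UIAccelerometer *)accelerometer
        didAccelerate:(UIAcceleration *)acceleration
{
    // Low-pass filter: blend a little of the new sample into the running estimate.
    gravityX = acceleration.x * kFilteringFactor + gravityX * (1.0 - kFilteringFactor);
    gravityY = acceleration.y * kFilteringFactor + gravityY * (1.0 - kFilteringFactor);
    gravityZ = acceleration.z * kFilteringFactor + gravityZ * (1.0 - kFilteringFactor);
    // Subtracting the gravity estimate from the raw sample leaves user-driven motion.
}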

!"#$#%&'%()*+,%-#,.#/%"012334445+67+$58#93:;+,<3=9)><39<$)3=!?@A1BCDEE@4E&>%

Motion Sensors: Gyroscopes

• Gyroscope

• iPhone 4, iPad 2

• Rotation Rate

• 200/500/2500 dps

Demo: Vibration

• Visualize Accelerometer Data

• App Store, $4.99: http://itunes.apple.com/app/vibration/id301097580

Demo 1: Core Motion Teapot

Using Core Motion

• Poll for data, or register block handlers for updates

• Data as Yaw/Pitch/Roll, Quaternion, or 4x4 Transform (see the sketch below)

• Timestamps included
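As a concrete illustration of the block-handler style, here is a minimal sketch; the 60 Hz interval and the logging are illustrative only, and in real code motionManager would be a retained ivar.

CMMotionManager *motionManager = [[CMMotionManager alloc] init];
motionManager.deviceMotionUpdateInterval = 1.0 / 60.0;  // 60 Hz updates

[motionManager startDeviceMotionUpdatesToQueue:[NSOperationQueue mainQueue]
                                   withHandler:^(CMDeviceMotion *motion, NSError *error) {
    if (error) return;
    CMAttitude *attitude = motion.attitude;

    // The same orientation, in each of the representations listed above:
    NSLog(@"yaw %f, pitch %f, roll %f", attitude.yaw, attitude.pitch, attitude.roll);
    CMQuaternion q = attitude.quaternion;          // quaternion form
    CMRotationMatrix m = attitude.rotationMatrix;  // rotation matrix (3x3; pad to 4x4 for GL)
    NSLog(@"q.w %f, m11 %f", q.w, m.m11);

    NSLog(@"timestamp %f", motion.timestamp);      // timestamps included
}];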

Demo: Core Motion Viewer

• https://bitbucket.org/jblocksom/coremotionviewer

Core Motion Resources

• Core Motion Framework Reference
• Event Handling Guide for iPhone OS: “Core Motion” under “Motion Events”
• WWDC ’10: CoreMotionTeapot
• J. R. Powers talk from VTM November ’10
• O’Reilly Basic Sensors in iOS

Computer Vision

Computer Vision: OpenCV

• The SDK I love to hate

Building OpenCV on iOS

• Follow this 20-step recipe: http://computer-vision-talks.com/2010/12/building-opencv-for-ios/

• Or go here: https://github.com/jonmarimba/OpenCV-iOS

Face Detection

• Use Haar wavelet classification
• Built-in classifiers in OpenCV to find front- and side-facing faces
• Not perfect, not too fast, but not bad
• Video: http://vimeo.com/12774628

Haar classification

• “Cascade of boosted classifiers working with haar-like features”

Loading the Classifier

• Just call cvLoad:

NSString *path = [[NSBundle mainBundle]
    pathForResource:@"haarcascade_frontalface_default"
             ofType:@"xml"];

CvHaarClassifierCascade *cascade =
    (CvHaarClassifierCascade *)cvLoad(
        [path cStringUsingEncoding:NSASCIIStringEncoding],
        NULL, NULL, NULL);

Running the classifier

• Image, Haar cascade, spare storage
• 1.2f: size increase for features per stage
• 2: minimum rectangle neighbors
• Canny pruning: throw out areas with too few / too many edges

CvSeq *faces = cvHaarDetectObjects(small_image, cascade, storage, 1.2f, 2,
                                   CV_HAAR_DO_CANNY_PRUNING, cvSize(20, 20));
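For context, a hedged sketch of the surrounding plumbing: cvCreateMemStorage supplies the "spare storage" argument, and the result loop below is my illustration rather than the talk's code (small_image is whatever downscaled grayscale frame you hand the detector).

CvMemStorage *storage = cvCreateMemStorage(0);  // scratch space for detections

CvSeq *faces = cvHaarDetectObjects(small_image, cascade, storage, 1.2f, 2,
                                   CV_HAAR_DO_CANNY_PRUNING, cvSize(20, 20));

// Each hit is a CvRect in small_image coordinates; overlapping hits were merged.
for (int i = 0; i < (faces ? faces->total : 0); i++) {
    CvRect *r = (CvRect *)cvGetSeqElem(faces, i);
    NSLog(@"face %d at (%d, %d), %d x %d", i, r->x, r->y, r->width, r->height);
}

cvClearMemStorage(storage);  // reuse the storage between frames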

What it’s Doing

• Windows show where wavelets are being checked

• Overlapping rectangles are a detection

Defeating Face Detection

• cvDazzle project
• Can also just turn to the side

Demo: Face Detection

• OpenCV based

Feature Matching

• Feature matching is the workhorse of modern computer vision:
• Panoramas
• Image stabilization
• Superresolution
• 3D reconstruction

Feature Matching

• SIFT, SURF, FLANN: salient points in an image

[Figure: the Gaussian pyramid (left) and difference-of-Gaussian pyramid (right), for the first and next octaves of scale space.]

Figure 1: For each octave of scale space, the initial image is repeatedly convolved with Gaussians to produce the set of scale space images shown on the left. Adjacent Gaussian images are subtracted to produce the difference-of-Gaussian images on the right. After each octave, the Gaussian image is down-sampled by a factor of 2, and the process repeated.

In addition, the difference-of-Gaussian function provides a close approximation to the scale-normalized Laplacian of Gaussian, $\sigma^2 \nabla^2 G$, as studied by Lindeberg (1994). Lindeberg showed that the normalization of the Laplacian with the factor $\sigma^2$ is required for true scale invariance. In detailed experimental comparisons, Mikolajczyk (2002) found that the maxima and minima of $\sigma^2 \nabla^2 G$ produce the most stable image features compared to a range of other possible image functions, such as the gradient, Hessian, or Harris corner function.

The relationship between $D$ and $\sigma^2 \nabla^2 G$ can be understood from the heat diffusion equation (parameterized in terms of $\sigma$ rather than the more usual $t = \sigma^2$):

$$\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G.$$

From this, we see that $\nabla^2 G$ can be computed from the finite difference approximation to $\partial G / \partial \sigma$, using the difference of nearby scales at $k\sigma$ and $\sigma$:

$$\sigma \nabla^2 G = \frac{\partial G}{\partial \sigma} \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}$$

and therefore,

$$G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\,\sigma^2 \nabla^2 G.$$

This shows that when the difference-of-Gaussian function has scales differing by a constant factor it already incorporates the $\sigma^2$ scale normalization required for the scale-invariant Laplacian.
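To make the approximation concrete, here is a minimal sketch (my illustration, not Lowe's code) of computing one difference-of-Gaussian level with the OpenCV C API; the grayscale IplImage *gray and the sigma and k values are assumptions.

double sigma = 1.6;    // base blur for this octave (Lowe's suggested value)
double k = sqrt(2.0);  // assumed scale ratio between adjacent levels

// Work in float with pixel values in [0, 1], matching the paper's convention.
IplImage *gray32 = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
cvConvertScale(gray, gray32, 1.0 / 255.0, 0);

IplImage *g1  = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
IplImage *g2  = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);
IplImage *dog = cvCreateImage(cvGetSize(gray), IPL_DEPTH_32F, 1);

// Blur at two nearby scales, sigma and k*sigma (kernel size derived from sigma).
cvSmooth(gray32, g1, CV_GAUSSIAN, 0, 0, sigma, sigma);
cvSmooth(gray32, g2, CV_GAUSSIAN, 0, 0, k * sigma, k * sigma);

// D = G(k*sigma) - G(sigma), approximately (k - 1) * sigma^2 * Laplacian of G.
cvSub(g2, g1, dog, NULL);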

Figure 5 (panels a-d): This figure shows the stages of keypoint selection. (a) The 233x189 pixel original image. (b) The initial 832 keypoint locations at maxima and minima of the difference-of-Gaussian function. Keypoints are displayed as vectors indicating scale, orientation, and location. (c) After applying a threshold on minimum contrast, 729 keypoints remain. (d) The final 536 keypoints that remain following an additional threshold on ratio of principal curvatures.

As suggested by Brown, the Hessian and derivative of $D$ are approximated by using differences of neighboring sample points. The resulting 3x3 linear system can be solved with minimal cost. If the offset $\hat{\mathbf{x}}$ is larger than 0.5 in any dimension, then it means that the extremum lies closer to a different sample point. In this case, the sample point is changed and the interpolation performed instead about that point. The final offset $\hat{\mathbf{x}}$ is added to the location of its sample point to get the interpolated estimate for the location of the extremum.

The function value at the extremum, $D(\hat{\mathbf{x}})$, is useful for rejecting unstable extrema with low contrast. This can be obtained by substituting equation (3) into (2), giving

$$D(\hat{\mathbf{x}}) = D + \frac{1}{2} \frac{\partial D}{\partial \mathbf{x}}^T \hat{\mathbf{x}}.$$

For the experiments in this paper, all extrema with a value of $|D(\hat{\mathbf{x}})|$ less than 0.03 were discarded (as before, we assume image pixel values in the range [0, 1]).

Figure 5 shows the effects of keypoint selection on a natural image. In order to avoid too much clutter, a low-resolution 233 by 189 pixel image is used and keypoints are shown as vectors giving the location, scale, and orientation of each keypoint (orientation assignment is described below). Figure 5 (a) shows the original image, which is shown at reduced contrast behind the subsequent figures. Figure 5 (b) shows the 832 keypoints at all detected maxima


[Figure 7: image gradients (left) and keypoint descriptor (right).]

Figure 7: A keypoint descriptor is created by first computing the gradient magnitude and orientation at each image sample point in a region around the keypoint location, as shown on the left. These are weighted by a Gaussian window, indicated by the overlaid circle. These samples are then accumulated into orientation histograms summarizing the contents over 4x4 subregions, as shown on the right, with the length of each arrow corresponding to the sum of the gradient magnitudes near that direction within the region. This figure shows a 2x2 descriptor array computed from an 8x8 set of samples, whereas the experiments in this paper use 4x4 descriptors computed from a 16x16 sample array.

6.1 Descriptor representation

Figure 7 illustrates the computation of the keypoint descriptor. First the image gradient magnitudes and orientations are sampled around the keypoint location, using the scale of the keypoint to select the level of Gaussian blur for the image. In order to achieve orientation invariance, the coordinates of the descriptor and the gradient orientations are rotated relative to the keypoint orientation. For efficiency, the gradients are precomputed for all levels of the pyramid as described in Section 5. These are illustrated with small arrows at each sample location on the left side of Figure 7.

A Gaussian weighting function with $\sigma$ equal to one half the width of the descriptor window is used to assign a weight to the magnitude of each sample point. This is illustrated with a circular window on the left side of Figure 7, although, of course, the weight falls off smoothly. The purpose of this Gaussian window is to avoid sudden changes in the descriptor with small changes in the position of the window, and to give less emphasis to gradients that are far from the center of the descriptor, as these are most affected by misregistration errors.

The keypoint descriptor is shown on the right side of Figure 7. It allows for significant shift in gradient positions by creating orientation histograms over 4x4 sample regions. The figure shows eight directions for each orientation histogram, with the length of each arrow corresponding to the magnitude of that histogram entry. A gradient sample on the left can shift up to 4 sample positions while still contributing to the same histogram on the right, thereby achieving the objective of allowing for larger local positional shifts.

It is important to avoid all boundary effects in which the descriptor abruptly changes as a sample shifts smoothly from being within one histogram to another or from one orientation to another. Therefore, trilinear interpolation is used to distribute the value of each gradient sample into adjacent histogram bins. In other words, each entry into a bin is multiplied by a weight of $1 - d$ for each dimension, where $d$ is the distance of the sample from the central value of the bin as measured in units of the histogram bin spacing.
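As a worked example of that $1 - d$ weighting, here is a hypothetical sketch for the orientation dimension alone; a real SIFT descriptor applies the same weighting in x and y as well (hence trilinear), and the 8-bin layout matches the figure.

#include <math.h>

// Soft-bin one gradient sample into an 8-bin orientation histogram.
// Assumes angle is in [0, 2*pi).
static void addToOrientationHistogram(float hist[8], float angle, float magnitude)
{
    float bin = angle / (2.0f * (float)M_PI) * 8.0f;  // continuous bin coordinate in [0, 8)
    int   b0  = ((int)floorf(bin)) % 8;               // lower bin
    int   b1  = (b0 + 1) % 8;                         // neighboring bin (wraps around)
    float d   = bin - floorf(bin);                    // distance from the lower bin center

    hist[b0] += magnitude * (1.0f - d);  // weight 1 - d here...
    hist[b1] += magnitude * d;           // ...and the remainder to the neighbor
}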


Application: Automatic Panoramas

• Application: Panoramas

Tracking

• Feature finding and matching is slow

• Lower-quality features can be matched faster with comparable results (see the sketch below)
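The demo uses FAST corners; as a stand-in sketch of the same "detect cheap features, then track them" pattern, here is Shi-Tomasi detection plus pyramidal Lucas-Kanade tracking with the OpenCV C API. The grayscale IplImages prevGray and currGray (consecutive frames) and all parameter values are assumptions.

#define MAX_CORNERS 200

CvPoint2D32f prevPts[MAX_CORNERS], currPts[MAX_CORNERS];
char status[MAX_CORNERS];
int count = MAX_CORNERS;

// Detect corners once (the expensive step)...
IplImage *eig  = cvCreateImage(cvGetSize(prevGray), IPL_DEPTH_32F, 1);
IplImage *temp = cvCreateImage(cvGetSize(prevGray), IPL_DEPTH_32F, 1);
cvGoodFeaturesToTrack(prevGray, eig, temp, prevPts, &count,
                      0.01, 5.0, NULL, 3, 0, 0.04);

// ...then track them cheaply into the next frame.
cvCalcOpticalFlowPyrLK(prevGray, currGray, NULL, NULL, prevPts, currPts, count,
                       cvSize(15, 15), 3, status, NULL,
                       cvTermCriteria(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 20, 0.03), 0);
// status[i] == 1 means feature i was found again in the new frame.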

Demo: FAST Tracking

Augmented Reality

Augmented Reality

• Fuse live video with generated pixels based on device sensors

• Geolocated
• Marker based

• Commercial SDKs available

Geolocated AR

• AR based on GPS location
• Fuse rendered objects with real-world locations
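A minimal sketch of the geometry behind this fusion: given the user's GPS fix and a geolocated point of interest, the initial great-circle bearing tells you where the overlay belongs relative to the magnetometer heading. The function name and usage here are hypothetical.

#include <math.h>
#import <CoreLocation/CoreLocation.h>

// Initial bearing from the user to a target, in radians clockwise from
// true north; compare against the device heading to place the overlay.
static double bearingToTarget(CLLocationCoordinate2D user, CLLocationCoordinate2D target)
{
    double lat1 = user.latitude   * M_PI / 180.0;
    double lat2 = target.latitude * M_PI / 180.0;
    double dLon = (target.longitude - user.longitude) * M_PI / 180.0;

    double y = sin(dLon) * cos(lat2);
    double x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon);
    return atan2(y, x);
}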

3DAR Toolkit

• http://spotmetrix.com/
• Drop-in replacement for MKMapView
• Shows AR view based on phone orientation
• Free if branded
• $5K for unbranded

Demos: LAYAR, 3DAR

Marker Based AR

• Find a marker
• Figure out the camera transform to it
• Render something on top of it (see the sketch after this list)

• String SDK
• Qualcomm AR SDK
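A hedged sketch of step two, "figure out the camera transform": given the marker's four corners in the image, the OpenCV C API can recover the pose. The marker size, corner coordinates, and camera intrinsics below are made-up placeholders, and the commercial SDKs do all of this for you.

// Marker corners in object space (an 8 cm square on the z = 0 plane)...
float objPts[] = { 0, 0, 0,   8, 0, 0,   8, 8, 0,   0, 8, 0 };
// ...and where the detector found those corners in the image (placeholder values).
float imgPts[] = { 120, 80,   220, 85,   215, 190,   115, 185 };
// Placeholder pinhole intrinsics (focal lengths and principal point, in pixels).
float K[] = { 600, 0, 160,   0, 600, 240,   0, 0, 1 };

CvMat objectPoints = cvMat(4, 3, CV_32FC1, objPts);
CvMat imagePoints  = cvMat(4, 2, CV_32FC1, imgPts);
CvMat cameraMatrix = cvMat(3, 3, CV_32FC1, K);

float rdata[3], tdata[3];
CvMat rvec = cvMat(3, 1, CV_32FC1, rdata);
CvMat tvec = cvMat(3, 1, CV_32FC1, tdata);

// Pose of the marker relative to the camera (NULL = assume no lens distortion).
cvFindExtrinsicCameraParams2(&objectPoints, &imagePoints, &cameraMatrix,
                             NULL, &rvec, &tvec);

// Expand rvec with cvRodrigues2 and combine with tvec into a 4x4 modelview
// matrix to render on top of the marker (step three).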

Demo: Marker Based AR

SDK        License              Notes
NYAR       GPL                  Old
String     Commercial ($)       http://poweredbystring.com
Qualcomm   Commercial, no cost  Still in beta; http://developer.qualcomm.com/dev/augmented-reality

Qualcomm SDK

• FLANN to find initial features
• FAST to update after marker is found

That’s It!

• Qualcomm AR SDK: http://developer.qualcomm.com/dev/augmented-reality

• String SDK: http://poweredbystring.com

• Me: http://twitter.com/jblocksom/