Imbue Reality: An Augmented Reality Mobile Application for Displaying User-Relevant Information

Patrick Green
University of Puget Sound
1500 N Warner
Tacoma, WA
[email protected]

Christopher Livingston
University of Puget Sound
1500 N Warner
Tacoma, WA
[email protected]

Billy Rathje
University of Puget Sound
1500 N Warner
Tacoma, WA
[email protected]

Eli Spiegel
University of Puget Sound
1500 N Warner
Tacoma, WA
[email protected]

ABSTRACT

This paper describes the mobile application Imbue, which uses a GPS/compass-enabled AR algorithm to display relevant events, building information, and other pertinent details within the user's current surroundings. The mobile application has two AR views, a Camera AR view and a Map AR view. The Camera AR view is fully augmented, while the Map view simply orients the map to the current heading and location of the user. The app improves on the accuracy of recent attempts to offer AR in conjunction with a worldwide point-of-interest database, and it is one of the first consumer AR mobile apps to be released. The application was developed over the course of a semester and is available to the public at no cost via the Google Play Store and the Apple App Store.

General Terms

Human Factors, Performance, Algorithms

1. INTRODUCTION

We present Imbue, a novel mobile application for Android and iOS that uses Augmented Reality (AR) to display social media and location-related information to users. The application displays information about nearby buildings, Facebook events, and the locations of nearby friends. In addition to letting users view information about buildings and events around them, it also lets users share their current location with Facebook friends.

The goal of Imbue was largely to create an app that brings AR functionality to the general public, offering data of broad interest such as social media and location information. While other apps exist for displaying information about nearby points of interest using AR, there are few publicly available applications intended for the consumer market. Moreover, there are virtually no applications that draw from worldwide databases of points of interest to display AR-related information in worldwide environments, and those that do are aimed at highly specialized audiences such as blind users [?]. That particular app suffers from significant accuracy issues as well. Popular apps have recently begun adding Augmented Reality technology; Yelp has an AR feature called "Monocle," but its algorithm suffers from significant accuracy and display issues.

Additionally, applications that display points of interest throughout the world, such as Google Maps and Google Earth, do not presently include AR technology, even though integrating AR into such apps would not be especially difficult.

In designing and implementing Imbue, we demonstrate that providing AR for points of interest worldwide, Facebook events, and other social media data such as user location can be done to a high degree of accuracy on mobile phones. Imbue is also, to our knowledge, the first consumer-level social media app using AR released for Android and iOS.

The app can be used in a variety of contexts. We have currently added points of interest for all campus buildings at the University of Puget Sound, which could allow the app to easily facilitate campus tours and navigation; a similar AR-based approach could be used on other campuses. We have also acquired points of interest from Google Places, and events and other information from Facebook, to demonstrate the app's ability to display content using AR throughout the world.

Imbue is the first mobile application of its kind and attempts to fully integrate the user's social media experience into their physical interpretation of reality.

2. BACKGROUND AND RELATED WORK

2.1 Augmented Reality

Augmented reality refers to software that presents an environment to the user along with additional data and information not contained in that environment. Typically, an AR


application will overlay informative text and graphics on a video of an environment. AR has been used in a wide variety of settings, including in surgery, in navigation and tour applications, and to enhance popular location-aware apps like Yelp.

In order to provide AR features, an application must first identify objects and locations in the user's environment and then augment the environment with additional information. Typically, this is done by first identifying features of the user's environment and then using a visual display to overlay text, images, and other data on a video of the environment drawn from a camera.

Presently, there are three common approaches to AR: 1) sensor-based approaches, 2) computer vision approaches, and 3) hybrid approaches [?]. Sensor-based approaches use GPS, accelerometer, compass, camera, and other sensors to identify locations. Computer vision approaches use image-detection algorithms and cameras to identify nearby objects. Hybrid methods combine both sensor and computer vision techniques.

Sensor-based approaches are faster than computer vision-based approaches, and they are also simpler to implement. Their primary disadvantage is that they can be susceptible to electromagnetic interference, making them difficult to use in indoor environments. Computer vision-based approaches are comparatively slower in addition to being more difficult to implement. Their primary advantage is that they perform well indoors. Hybrid approaches aim to overcome the disadvantages of both by combining sensors and computer vision [?].

In developing Imbue, we chose a sensor-based approach. We found that we can achieve acceptable performance indoors with sensors alone, particularly because we are not trying to identify interior objects with a high degree of sensitivity. The majority of objects we identify are buildings outside, and the only interior identification we must perform is identifying the building in which a user is located. Moreover, computer vision is relatively slow compared to sensor-based approaches, and we wanted to maximize identification speed in order to identify buildings immediately and to limit CPU usage on mobile devices. Finally, given the benefits and performance of sensor-based approaches, we saw no need to implement more complicated computer vision approaches for this app.

While off-the-shelf packages exist for computer vision-based AR (such as OpenCV), there are few packages for sensor-based AR. As a result, we implemented our own sensor-based AR algorithm. In addition to the fact that no pre-existing libraries implemented the algorithm we wanted to use, writing our own version of sensor-based AR gives us significant control over the algorithm and has allowed us the flexibility to optimize and adjust it as needed.

Mobile phones are well suited to AR presentation because they offer both video cameras and displays, and their portability makes them more flexible than computers for AR in outdoor environments.

2.2 Related Work

Recently, several mobile applications have been developed to help users navigate large environments. Both sensor- and computer vision-based AR systems have been proposed and implemented for mobile AR applications.

Mata and Claramunt [?] developed an AR-based mobile app for navigating cities that uses a combination of GPS and compass bearing to identify nearby buildings. The app draws building information from an SQLite database. The authors note that the compass algorithm is in need of further refinement.

Guan et al. [?] provide an alternate approach to AR-based navigation of large cities. They developed a mobile app that uses computer vision to identify nearby buildings. The app produces a list of nearby buildings using GPS, then uses a learning algorithm (k-nearest neighbor) to identify images from the phone's camera. The app compares images from the camera against web-based image search results. The combination of real-time web searches and computer vision techniques makes the app's performance relatively slow, operating at peak speeds of 40-345 milliseconds. These peak speeds were achieved after optimizing the algorithm to match only portions of images, and they still likely produce a human-perceptible lag.

Reitmayr and Drummond [?] have developed a hybrid augmented reality mobile application for navigating urban environments. The app combines GPS and compass location with computer vision-based recognition of buildings, comparing images of buildings to a database of 3D models. The hybrid approach is intended to use computer vision to overcome the limitations of GPS in urban, building-heavy environments and the impact of magnetic interference on compasses. The disadvantage is that the computer vision approach to building recognition introduces a 40 ms delay into the app which, as with Guan et al., likely produces a human-perceptible lag.

Blum et al. [?] come closest to our own implementation. They developed an iPhone app to help blind users navigate urban environments using AR, relying on compass and GPS to determine the user's location relative to points of interest. They use the Google Places API to generalize the app so that it can help users navigate any location on earth. Publishing in 2013, the authors note that theirs is one of the first AR apps to draw from a dataset that can provide points of interest for any location in the world. They conclude, however, that the current GPS and compass accuracy of iPhones makes accurate identification of urban buildings intractable, and that such AR technology will have to wait until hardware improves. We find that our app performs with a high degree of accuracy even in urban environments such as Seattle, and conclude that iPhone hardware is not presently too limited to perform accurate GPS/compass-based AR identification.

Watts has developed TigerEye [?], a mobile application for providing AR-based campus tours of Clemson University. As with Mata and Claramunt, the app uses GPS and compass heading to provide AR information about nearby buildings. The app stores building corners in an SQLite database.


The algorithm for detecting buildings within the camera's field of view involves calculating a vector between the user's location and every building corner in the database. The app then eliminates vectors falling outside of the camera's field of view and averages together headings that correspond to corners of the same building. From there, it determines a single heading and positions building information onscreen by converting between heading and pixels with a simple conversion factor. The app appears to include a number of redundant calculations. It calculates a bearing to every building corner on campus, even for buildings far from the camera, which could be eliminated with a simpler SQLite query. The app also calculates a bearing to each building corner only to average the bearings of corners belonging to the same building; since the app does not otherwise use building corner information, it could simply store one average bearing per building, as long as identifications remain accurate.

Beyond mobile AR apps, there are a variety of apps available for viewing content about points of interest, the best known being Google Maps and Google Earth. The Google Earth app displays icons for clickable information about points of interest, ranging from photos taken at a location to wiki pages about a point of interest. Google Earth displays clicked information similarly to our Building Info and Event Info activities. It is difficult to develop an app with the same access to data as Google's, since Google limits how much content can be pulled from its Places API and does not expose as many sources of information as it has internal access to (until recently, for example, it was difficult to integrate Google Street View with Google Maps). By adding other sources of information, however, we provide more information than most mapping apps currently display to users.

3. IMPLEMENTATION

3.1 Overview

The following section details the implementation of the Android and iOS versions of Imbue. We first discuss the Augmented Reality algorithm that we developed for Imbue, then document the Android and iOS implementations in detail. We then discuss the two AR views Imbue presents: a map view that rotates and moves with the user's location, and a camera view that displays overlays with information about buildings and events within the camera's field of view. From there, we discuss the types of data that Imbue displays to the user with AR: buildings, events, and user locations.

We maintain a database of points of interest and user locations on a server hosted by Parse. Parse is a database and app backend platform that allows for cloud hosting of MongoDB databases. It also provides a wide variety of API methods for interfacing with the database, and with social media platforms like Facebook, that significantly simplify the process of accessing the database from Android and iOS.

We use the Parse server in order to allow POIs to be updated

and expanded at any time. The app communicates with the server instead of storing POIs locally. We also use the database to maintain user locations. At this time, we have added points of interest for all buildings at the University of Puget Sound. In the future, the database could store additional information about points of interest acquired from Google Places.

3.2 Design Patterns and Philosophy

Both the iOS and Android apps use the Model View Controller (MVC) pattern. MVC separates an app into three components: a model that stores data, a view that displays data, and a controller that interfaces between the two. The data that we host on the Parse server resides outside of the iOS and Android implementations and serves as the app's model. Each app contains its own set of views, which are separated from program logic. Program logic is contained in the controllers of the iOS and Android apps: sets of classes that perform program operations, download data from Parse, and display data in the app views.

The app also makes substantial use of the client-server design pattern. This allowed us to separate our data from the Android and iOS implementations, placing it in an easily updatable and implementation-agnostic environment. Because much of the data, such as events, is subject to frequent change, and because the app is designed to be used in locations throughout the world rather than one fixed location, offloading the app's data to a server allows the app to be used with the greatest flexibility and to be readily expanded. Since the app was originally intended for use at the University of Puget Sound with goals to expand beyond it, using a server architecture from early in the implementation process made expansion relatively seamless, and will allow the app to expand further in the future.

We used agile design techniques for both Android and iOS development. We made frequent use of pair programming, brainstorming and implementing ideas in group meetings. We often prototyped features on one platform before porting them to the other. Generally, we also aimed to rapidly implement features before going back and refining designs and layouts.

3.3 Augmented Reality Algorithms

3.3.1 AR Algorithm

We developed our own AR algorithm for both iOS and Android. It is used to display overlays on the camera view. The algorithm is detailed below:

1. Create a list of all points of interest within a 111 meter radius of the user's current location.

2. Find all points of interest in this list that fall within plus or minus 1/2 the angle of view (50 degrees). This is accomplished by first determining the heading from the user to each point of interest, then determining whether the difference between this value and the user's compass bearing is less than or equal to half the angle of view.


There are two special cases in which the calculated headings must be adjusted. Generally speaking, these special cases occur when the user's compass bearing is within the field of view of a point of interest, but the compass bearing and point of interest sit across the wraparound from 360 degrees to 0 degrees. This would result in reporting a large difference in bearings when there is really a small one. In these cases, 360 degrees is added to one of the values to normalize the difference in bearing as follows:

(a) If the user’s current heading is >360-1/2 the angleof view and the heading to the point of interest is< 1/2 angle of view, then 360 degrees is added tothe heading to the point of interest.

(b) If the user's current heading is < 1/2 the angle of view and the heading to the point of interest is > 360 - 1/2 the angle of view, then 360 degrees is added to the user's heading.

3. Add 1/2 the angle of view to the difference between the heading to the point of interest and the user's compass bearing (yielding a raw value in the range 0 to the angle of view).

4. Divide the screen width in pixels by the angle of view to obtain a scaling factor.

5. Multiply the raw value by the scaling factor to find the x-coordinate at which the overlay should be drawn (see below).

3.3.2 Displaying Overlays Onscreen

Screen x-axis coordinates for displaying overlays for points of interest within the user's field of view are calculated using the following formula. Let s be the device's screen width in pixels, f be the field of view size in degrees, b be the bearing to a building coordinate in degrees, and c be the current compass bearing in degrees.

x = (s / f) × ((b − c) + f / 2)

The formula first calculates a conversion factor, s/f, to convert between degrees in the field of view and screen pixels. Half the field of view is added to the difference between the bearing to the point of interest and the current bearing (a difference of at most half the field of view in either direction), so the sum ranges from 0 to f. Multiplying by the conversion factor then yields an x-coordinate ranging from 0 at the left edge of the field of view to s at the right edge.

Y-coordinates are generated arbitrarily. Since there may be multiple points of interest at the same GPS location, each new coordinate is placed a constant y value away from the previous coordinate in order to prevent overlap.
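To make the computation concrete, the following Java sketch implements steps 2-5 above, including the wraparound normalization; all names are illustrative, since the paper does not show its actual code.

    // Minimal sketch of the overlay-placement math described above.
    public class OverlayMath {
        // Returns the screen x-coordinate for a point of interest, or -1
        // if it falls outside the field of view.
        // screenWidthPx = s, fovDeg = f, poiBearingDeg = b, compassDeg = c.
        public static double overlayX(double screenWidthPx, double fovDeg,
                                      double poiBearingDeg, double compassDeg) {
            double half = fovDeg / 2.0;
            // Normalize the 360-to-0 degree wraparound (step 2 special cases).
            if (compassDeg > 360 - half && poiBearingDeg < half) {
                poiBearingDeg += 360;
            } else if (compassDeg < half && poiBearingDeg > 360 - half) {
                compassDeg += 360;
            }
            double diff = poiBearingDeg - compassDeg;  // in [-f/2, f/2] if visible
            if (Math.abs(diff) > half) {
                return -1;                             // outside the field of view
            }
            // x = (s / f) * ((b - c) + f / 2)
            return (screenWidthPx / fovDeg) * (diff + half);
        }
    }

For a 1080-pixel-wide screen and a 50-degree field of view, a point of interest dead ahead (b = c) lands at x = 540, the center of the screen.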

3.3.3 Augmented Map View Algorithm

Another, simpler augmented reality algorithm is used for the augmented map view. It simply aligns the map's camera heading in response to compass sensor changes to keep the map oriented with the user's heading.
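On Android, this can be done with the Google Maps camera API roughly as follows; the method name is illustrative and the call would be made from the compass sensor callback.

    import com.google.android.gms.maps.CameraUpdateFactory;
    import com.google.android.gms.maps.GoogleMap;
    import com.google.android.gms.maps.model.CameraPosition;

    // Sketch: keep the map oriented with the user's compass heading.
    static void alignMapToHeading(GoogleMap map, float headingDeg) {
        CameraPosition rotated = new CameraPosition.Builder(map.getCameraPosition())
                .bearing(headingDeg)  // rotate the camera to the compass heading
                .build();
        map.moveCamera(CameraUpdateFactory.newCameraPosition(rotated));
    }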

Figure 1: Points of interest within the blue region are currently nearby, and the yellow region shows the angle of view; a point of interest within both is shown on the screen.

3.4 Android

Android is one of the two mobile OS platforms the app was developed for; Android applications are written in Java. Many aspects go into creating a mobile application, such as design guidelines and the Android framework. The Android framework is what makes up a mobile application: it provides many different APIs for using the features Android has to offer, and many small parts all work together to run an application on a phone. We touch on a few of the most important aspects of the framework below.

3.4.1 Activities

An activity is best thought of as a point of interaction between the user and the app; it provides a place for UI to be displayed. When the app first starts, a map is loaded along with a pullout side menu; this entire screen can be thought of as an activity. The application has many different activities, each serving a particular purpose, such as displaying building or event information, the augmented view, or the map view. It is important to manage activity lifecycles: as a user moves throughout the application, they may close different activities and open new ones, and unmanaged activities add up and consume more of the phone's resources. Activities can be paused, resumed, restarted, and stopped, and we stop all activities when they are left. When a new activity is accessed, a new instance of it is loaded. The only activity that is not killed when the user navigates away is the map view activity, though it restarts when the user navigates back in order to provide an accurate location and query of nearby buildings.

3.4.2 Layouts

A layout defines the user interface of an activity. Layouts are written in XML, which is similar to HTML in its use of tags, attributes, and elements. When an activity is created, an XML layout can be loaded along with other user interface elements. Layout elements can also be created at runtime, such as the event list in the Building Info activity. Every XML layout must have a particular layout structure as its root element, such as the commonly used linear, grid, or relative layouts. Custom layouts can also be created to further


Figure 2: The view when the application is started in Android

customize the user interface. Everything from margins, text size, font, and colors to titles is determined in each element of a layout.

3.4.3 Map Activity

The map view is loaded on launch of the app. The activity performs several important tasks on creation, listed here (a sketch follows the list):

1. The app makes a connection to Parse.

2. The sensor service is initialized.

3. The main view of the activity is paired with its XML layout.

4. Next, a Google Map is created and placed on the screen, with on-touch listeners enabled.

5. Last, the side drawer menu is created to provide the user with an easy way to navigate throughout the application.
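A hedged sketch of that startup sequence follows; the resource identifiers, Parse keys, and tap handler are placeholders rather than the app's actual code.

    // Illustrative onCreate for the map activity, mirroring steps 1-5 above.
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        // 1. Make a connection to Parse (keys are placeholders).
        Parse.initialize(this, "APPLICATION_ID", "CLIENT_KEY");

        // 2. Initialize the sensor service for compass readings.
        SensorManager sensors =
                (SensorManager) getSystemService(Context.SENSOR_SERVICE);

        // 3. Pair the activity's main view with its XML layout.
        setContentView(R.layout.activity_map);

        // 4. Create the Google Map and register an on-touch (click) listener.
        MapFragment fragment =
                (MapFragment) getFragmentManager().findFragmentById(R.id.map);
        fragment.getMapAsync(map ->
                map.setOnMapClickListener(tap -> stopFollowingHeading()));

        // 5. Build the side drawer menu used for navigation.
        DrawerLayout drawer = (DrawerLayout) findViewById(R.id.drawer_layout);
    }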

3.4.4 Camera Activity

Android handles the camera view differently from the iOS version. In Android, the Camera Activity is where the majority of the math to calculate augmented reality takes place. Some work also goes into setting up this user interface for the user to interact with.

3.5 iOS

The iOS app is structured similarly to the Android app, but there are a number of differences in implementation stemming from differences inherent to iOS and Android development.

The iOS implementation comprises three Objective-C classes: ViewController, ContentViewController, and Bearing.

While Google Maps provides an Android method for calculating bearings to points of interest, iOS does not. The Bearing class calculates the bearing from the user's location to a point of interest using the following intermediate values, where lat1 and lat2 are the latitudes of the user and the point of interest and dLon is the difference in their longitudes:

y = sin(dLon) × cos(lat2)

x = cos(lat1) × sin(lat2) − sin(lat1) × cos(lat2) × cos(dLon)
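These are the two arguments of the standard initial-bearing (forward azimuth) formula; the bearing itself is atan2(y, x), converted from radians to degrees. Since the Bearing class is Objective-C, the following Java version is an illustrative translation only.

    // Standard initial-bearing computation between two coordinates,
    // matching the y and x terms above (illustrative Java translation).
    static double initialBearingDeg(double lat1Deg, double lon1Deg,
                                    double lat2Deg, double lon2Deg) {
        double lat1 = Math.toRadians(lat1Deg);
        double lat2 = Math.toRadians(lat2Deg);
        double dLon = Math.toRadians(lon2Deg - lon1Deg);
        double y = Math.sin(dLon) * Math.cos(lat2);
        double x = Math.cos(lat1) * Math.sin(lat2)
                 - Math.sin(lat1) * Math.cos(lat2) * Math.cos(dLon);
        // atan2 yields (-180, 180]; shift into [0, 360) compass degrees.
        return (Math.toDegrees(Math.atan2(y, x)) + 360) % 360;
    }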

The ViewController class provides the majority of the app's functionality. When the app is first launched, it establishes a connection with Parse and downloads points of interest, storing them in a mutable array. It also downloads Facebook events on launch, along with any available user locations. When a GPS and compass reading is first acquired, it orients the map in relation to these readings.

When the app is rotated into landscape mode, the ViewController allocates and initializes a UIImagePickerController. The image picker controller is typically used to quickly take a picture and add it to a user's library, but with the addition of overlays it provides an easy way to create a customized camera view on iOS; there are few other approaches to displaying a customized camera view. To implement this custom view, text and image layers were added to the view as subviews to display AR overlays. Additionally, the camera view has been transformed to fill the entire screen, as the default camera view does not match the screen size.

The ContentViewController displays detail views for points of interest. It is passed the title and description of a point of interest. It acquires images either from the ViewController, for buildings that are loaded into the Parse database, or by making a call to Facebook's API for Facebook event images. This lazy initialization avoids loading a large number of Facebook images upfront.

There are some other minor differences in implementationbetween iOS and Android. The iOS app does not display


events in the augmented camera view while the Android version does. This was a performance concession; after finding that events significantly impacted the performance of the Android app, we decided that they were not worth implementing for the iOS app. The iOS app also does not implement a search bar. By contrast, the Android app does not display the user's own location but does display the locations of other users. The iOS app makes use of a lightweight menu rather than a swipe-out "drawer" menu because Apple has recently discouraged the use of this type of layout.

3.6 Map View

The map view snaps to the user's current location. The map rotates like a compass based on the user's bearing. When the screen is tapped or a marker is clicked, the map stops rotating with the user's bearing to allow the user to zoom and otherwise adjust the map. The user can quickly snap back to their location by pressing the compass icon in the top right.

The map view displays data via several different types of markers: red markers represent buildings, yellow markers represent interior points of interest, purple markers represent the locations of nearby Facebook friends, and blue markers represent public and private Facebook events.
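On Android, each marker category can be added through the Google Maps marker API roughly as follows; this sketch shows one category, and the hue constant stands in for the app's actual colors.

    import com.google.android.gms.maps.GoogleMap;
    import com.google.android.gms.maps.model.BitmapDescriptorFactory;
    import com.google.android.gms.maps.model.LatLng;
    import com.google.android.gms.maps.model.MarkerOptions;

    // Sketch: one colored marker per data category, as in the map view.
    static void addBuildingMarker(GoogleMap map, LatLng pos, String title) {
        map.addMarker(new MarkerOptions()
                .position(pos)
                .title(title)
                .icon(BitmapDescriptorFactory.defaultMarker(
                        BitmapDescriptorFactory.HUE_RED)));  // buildings are red
    }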

The map view activity also has a search function in the top action bar. The magnifying glass button opens the search bar. The search queries the titles of all currently visible buildings to check for a match.

3.7 Camera View

The user can reach the camera activity by selecting the camera menu button (on Android only) or by rotating the phone to a landscape position. This starts the augmented view, which shows all nearby buildings and, on Android, Facebook events. The camera activity has two parts: the camera interface and the camera overlay interface. The camera interface displays the image at which the device's camera is pointing. The camera overlay interface is used to draw overlay content, such as objects and text, on top of the camera interface. This method was found to be the most efficient given the time and timeline constraints of the project, but as future work it would be desirable to implement this overlay in OpenGL, which would allow content to be overlaid in a 3D plane instead of the 2D plane the camera overlay currently uses.
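On Android, the overlay half of this pair can be sketched as a custom View whose onDraw paints a label at each x-coordinate produced by the AR algorithm; class and field names below are illustrative, not the app's actual code.

    import android.content.Context;
    import android.graphics.Canvas;
    import android.graphics.Color;
    import android.graphics.Paint;
    import android.view.View;

    // Sketch of a 2D camera overlay: paints a label for each visible
    // point of interest at the x position computed by the AR algorithm.
    public class AROverlayView extends View {
        private final Paint paint = new Paint(Paint.ANTI_ALIAS_FLAG);
        private String[] labels = new String[0];
        private float[] xPositions = new float[0];

        public AROverlayView(Context context) {
            super(context);
            paint.setColor(Color.WHITE);
            paint.setTextSize(42f);
        }

        // Called whenever the AR algorithm recomputes overlay positions.
        public void setOverlays(String[] labels, float[] xPositions) {
            this.labels = labels;
            this.xPositions = xPositions;
            invalidate();  // redraw on top of the camera preview
        }

        @Override
        protected void onDraw(Canvas canvas) {
            super.onDraw(canvas);
            for (int i = 0; i < labels.length; i++) {
                // Stagger y values so co-located labels do not overlap.
                canvas.drawText(labels[i], xPositions[i], 100f + i * 60f, paint);
            }
        }
    }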

3.8 Buildings

Imbue displays information about nearby buildings and points of interest to users in its AR views. Building information comes from two data sources: Google Places and our own Parse server database. We use Google Places to collect and display information about buildings throughout the world, querying it directly for buildings near the user.

We also maintain our own server database with building names, descriptions, images, coordinates, and rectangular coordinate regions used to calculate whether the user is inside a building. The server currently contains information about all buildings at the University of Puget

Figure 3: An example of the Map View on Android that orients itself to the user's heading

Sound. This allows the app to be used for campus tours and by students of the University of Puget Sound, but in the future we could also store building information acquired from Google Places in order to minimize Google Places queries (which are limited to 1,000 per day within the free tier) and to augment Google Places data with additional sources of information.

3.8.1 Google Places Buildings

We include building information from Google Places to acquire AR overlays for buildings throughout the world. Google Places is a rich source of information; Blum et al. found it superior to other data sources such as Foursquare in terms of coverage.


Google Places buildings are drawn from Google's data repository using a JavaScript API query performed with Parse Cloud Code; we used Cloud Code because Google Places provides only a JavaScript API. When the phone launches the main screen, we call a Cloud Code method to query the Google Places API. The process of querying Google Places is as follows (a sketch of the client-side call appears after the list):

1. The current user location is placed in a dictionary that is passed to our Cloud Code function as a parameter.

2. The app calls the "placesRequest" Parse Cloud Code function using an asynchronous HTTP request method provided through the Parse API. This function submits the following API query to Google Places in the form of a GET request:

https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=’ + request.params.location+’&radius=1000&sensor=false&key=[Our API Key]

The query returns a JSON object containing the names and locations of nearby buildings within a 1000 m radius or, if unsuccessful, returns an error. The location of the query is determined from the dictionary passed to Cloud Code as a parameter.

3. When the function completes, we parse the JSON object to obtain the names and locations of all buildings and store these in a list of buildings.
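From the Android client, the Cloud Code call in steps 1 and 2 can be sketched with the Parse SDK as follows; the parameter formatting and the parsePlaces helper are illustrative.

    import com.parse.FunctionCallback;
    import com.parse.ParseCloud;
    import com.parse.ParseException;
    import java.util.HashMap;

    // Sketch: invoke the "placesRequest" Cloud Code function asynchronously
    // with the user's location.
    void fetchNearbyPlaces(double latitude, double longitude) {
        HashMap<String, Object> params = new HashMap<String, Object>();
        params.put("location", latitude + "," + longitude);           // step 1
        ParseCloud.callFunctionInBackground("placesRequest", params,  // step 2
                new FunctionCallback<Object>() {
                    @Override
                    public void done(Object placesJson, ParseException e) {
                        if (e == null) {
                            parsePlaces(placesJson);  // step 3: extract names/locations
                        }
                    }
                });
    }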

The information provided is extremely limited (only building names and locations are returned, with some limited user review information if available). In the future, it would be ideal to query for more building information from other sources. We could also crowdsource the acquisition of information and allow users to upload information about buildings to our server.

3.8.2 Parse Server Buildings

We collected information for all buildings at the University of Puget Sound and store it in a Parse database. The database contains fields for building title, description, image, coordinate location, and coordinate boundary.

The coordinate boundary allows the app to determine whether the user is currently inside a building. It also allows us to present information about what is happening in the building, such as which Facebook events fall within those coordinate boundaries.
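A minimal version of that containment test, assuming the boundary is stored as the rectangle's south-west and north-east corners (names are illustrative, and crossing of the antimeridian is ignored):

    // Sketch: is the user inside a building's rectangular coordinate boundary?
    static boolean isInsideBuilding(double userLat, double userLng,
                                    double swLat, double swLng,
                                    double neLat, double neLng) {
        return userLat >= swLat && userLat <= neLat
            && userLng >= swLng && userLng <= neLng;
    }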

In the future, we would like to host our own database of building information for points of interest throughout the world, relying on Google Places only when the database does not contain sufficient information about nearby places. This would reduce our reliance on Google Places, which offers a limited number of daily queries, allow us to synthesize information from additional sources, and let us offer additional information not available from Google Places. As long as we pull information directly from Google Places without also maintaining our own database of points of interest, we will be limited by the information available from Google Places.

Figure 4: An example of the information contained in a building

3.9 Events

To show users Facebook events, we use FQL (Facebook Query Language). We wrote a query that pulls any events a user can see. Originally we limited the events to within a few miles of the user, but we decided to generalize the query to return events throughout the world, as it still returns a relatively small amount of data. This lets us show a user all events that are public or that are available to them or their friends, which means there is a unique map for each user based on their social media connections. The FQL query for obtaining events is as follows:

SELECT name, pic_big, venue, description, start_time, end_time, eid
FROM event
WHERE eid IN (SELECT eid FROM event_member
              WHERE (uid IN (SELECT uid2 FROM friend WHERE uid1 = me())
                     OR uid = me()) LIMIT 2500)
  AND (end_time > now() OR start_time > now())
  AND venue.latitude > (-360) AND venue.latitude < (360)
  AND venue.longitude > (-360) AND venue.longitude < (360)

The query obtains pictures, coordinate locations (contained in the "venue" term), descriptions, and start and end times for events. It obtains events for the user's friends ("FROM friend") or for the user ("uid = me()"). It checks whether the event is currently taking place or upcoming, and it limits the number of events obtained to 2500, since the map view begins to lag when it must display more than several thousand markers. The event information returned is parsed and stored in a list of events.

3.10 User Location

Imbue allows users to log their current locations in order to share them with their Facebook friends. Users can choose to log their current location along with a message about it. The message is recorded with the user's name and current GPS coordinate, and this information is stored on the Parse server for each user. In addition, the Parse server maintains a boolean field shareLocation

that specifies whether or not the user's location is currently being shared. Users can manually choose to stop sharing their location, in which case this field is set to false. The field is also set to false if the user moves a certain radius away from the logged location or if a certain amount of time elapses since the location was last logged.

Locations of the current user's Facebook friends are plotted on the map view. To do this, the app's controller filters the list of app users down to those who are Facebook friends with the current user and then plots the locations of any of them choosing to share their locations. The controller does this by first querying Facebook for the set of unique IDs of users who are Facebook friends with the current user. It then queries the Parse server for a dictionary of app users keyed by Facebook unique ID. For each Facebook friend whose unique ID is in the dictionary, it checks whether the shareLocation field is set to true and, if it is, obtains the GPS coordinates and message for the user's location and plots these on the map.

The FQL query for identifying Facebook friends of the current user is:

SELECT uid FROM user WHERE is_app_user AND uid IN
    (SELECT uid2 FROM friend WHERE uid1 = me())
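With the Parse Android SDK, the filtering step described above can be sketched as follows; the class name UserLocation, the field fbId, and the addFriendMarker helper are illustrative.

    import com.parse.FindCallback;
    import com.parse.ParseException;
    import com.parse.ParseObject;
    import com.parse.ParseQuery;
    import java.util.List;

    // Sketch: fetch shared locations for the current user's Facebook friends.
    void plotFriendLocations(List<String> friendFacebookIds) {
        ParseQuery<ParseObject> query = ParseQuery.getQuery("UserLocation");
        query.whereContainedIn("fbId", friendFacebookIds);  // Facebook friends only
        query.whereEqualTo("shareLocation", true);          // currently sharing
        query.findInBackground(new FindCallback<ParseObject>() {
            @Override
            public void done(List<ParseObject> locations, ParseException e) {
                if (e == null) {
                    for (ParseObject loc : locations) {
                        addFriendMarker(loc);  // plot GPS coordinate and message
                    }
                }
            }
        });
    }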

3.11 Updating User Data

Because the app is highly dependent on the user's current location, we re-query our various data sources as the user's location changes. We query for Facebook events, the locations of nearby Facebook friends, Google Places, and our own database of points of interest when the app is first launched, when the app is relaunched from the background, and whenever the user moves outside a significant radius.
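The significant-radius check can be as simple as comparing against the location of the last query; a sketch on Android follows, where the threshold value and the refresh helper are illustrative.

    import android.location.Location;

    // Sketch: re-query data sources only after the user moves far enough
    // from the location of the last query.
    private Location lastQueried;
    private static final float REFRESH_RADIUS_M = 100f;  // illustrative threshold

    void onLocationChanged(Location current) {
        if (lastQueried == null
                || current.distanceTo(lastQueried) > REFRESH_RADIUS_M) {
            lastQueried = current;
            refreshEventsPlacesAndFriends();  // hit Facebook, Google Places, Parse
        }
    }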

Table 1: Performance of AR Algorithm on iOS

Trial   Avg Speed (ms)
1       2.02
2       2.05
3       1.98
4       1.95
5       2.01

3.12 Libraries

3.12.1 Facebook

Facebook is used for many of Imbue's features. We use the Facebook API to display events to users. We also use Facebook in conjunction with our nearby friends feature: we query the Facebook API to obtain a list of Facebook friends in order to determine whether a nearby user is a Facebook friend who should be displayed on the map.

3.12.2 Google Maps

The Google Maps API brings in features such as 3D rendering, vector tiles, indoor floor plans, gesture controls, location services, and the ability to draw on and customize the map. Imbue uses all of these components. 3D rendering is shown where available. Indoor floor plans are very useful in Imbue because a user can share their location with their Facebook friends, and friends will be able to see what part of the building they are in. In addition, indoor maps provide a variety of free points of interest displayed directly on the Google map.

3.12.3 Google Places

The Google Places API provides a database of general information about points of interest throughout the world. Google Places allows 1,000 requests per day; with credit card verification, that can be increased to 100,000 per day. The API allows requests for nearby buildings and can return details and images of those buildings. Given Google Places's severe restrictions on daily queries and our restricted budget, we will need to find a solution for obtaining or storing point of interest data without, or in addition to, Google Places in the near future.

3.12.4 Parse

Parse, acquired by Facebook in 2013, is an app backend development service. It provides hosting for MongoDB databases, as well as APIs for interacting with those databases from Android and iOS apps, and it hosts data for many popular mobile applications. Parse offers other services such as Parse Cloud Code, which hosts JavaScript files that can be called from mobile devices and executed on Parse's cloud servers. Parse recently changed its pricing plan to allow up to 30 requests per second, which should be enough to accommodate Imbue until it acquires a significant user base, and even then Parse is designed to be scalable and relatively inexpensive.

4. EVALUATION

We measured the performance in milliseconds of the AR algorithm across a series of trials on each platform. The time is measured for the


Table 2: Performance of AR Algorithm on Android

Trial   Avg Speed (ms)
1       1.16
2       3.26
3       3.24

Table 3: Performance of AR Algorithm on Android with Facebook Events

Trial   Avg Speed (ms)
1       270
2       282
3       300

entire AR algorithm’s operation on both devices, startingat the creation of a list of nearby buildings using GPS tothe display of an onscreen overlay for each building withinthe camera’s field of vision. Time values are averaged across1000 calls to the AR routine. The average speed of the algo-rithm falls consistently within 1-2 milliseconds (avg. 1.98)on iOS and 1-2 millisecond averages on Android.

Our AR algorithm performs roughly 20 times faster on iOS than the most optimized version of Guan et al.'s computer vision AR algorithm and roughly 270 times faster than the least optimized version, and the Android version achieves comparable speedups. Their algorithm achieves a maximum speed of 40-60 ms after several optimizations and with fast sensors, and a worst-case performance of 490-540 ms with no optimizations and the slowest sensors [?]. The app likewise outperforms the algorithm developed by Reitmayr and Drummond which, like Guan et al.'s, achieves a peak performance of 40 ms. We credit this to the superior performance of sensor-based approaches over computer vision techniques.

When we add Facebook events to what is displayed on screen on Android, we observe an average time of 274 ms. The events dramatically slow down the algorithm simply due to the size of the dataset; some form of sorting could dramatically improve both the event and event-free versions of the algorithm. We do not report event data for iOS because we elected not to display events in the camera view, largely due to the significant performance drop we observed on the Android platform.

5. DISCUSSION

As noted throughout this paper, Imbue provides a variety of functionality that improves upon AR applications currently available.

Imbue runs faster than Guan et al.'s computer vision AR algorithm, which ran on top-quality hardware in 2010. It similarly outperforms Reitmayr and Drummond's 2006 computer vision/sensor hybrid algorithm. In general use, we have found that Imbue also does not suffer from the degradation in sensitivity characteristic of sensor-based AR methods, although we have not yet evaluated its accuracy at the level of sensitivity used to evaluate computer vision-based algorithms. However, we have found that

it does consistently recognize points of interest accurately even in urban environments, which commonly trouble compass and GPS readings. Moreover, even as our datasets have grown to include buildings, events, and user locations, Imbue's AR algorithm has not suffered significant performance degradation.

In fact, developers of the Oculus Rift have noted that an algorithm running below 20 ms does not suffer from human-perceptible lag, although others have argued that applications should aim for 7-15 ms; anything over 40 ms results in a "disturbing lag" [?]. Imbue consistently performs at approximately 2 ms, while the majority of recently published computer vision-based AR applications perform at or above 40 ms.

Imbue has also expanded past work such as Google Maps and Yelp's mobile application, creating a unique combination of social media, augmented reality, and maps. We feel that Imbue, when used in combination with mapping services like Google Maps, offers useful functionality that can be easily integrated with existing navigation technologies. Imbue supersedes other technologies that have attempted to integrate AR with existing social media, like Yelp's AR feature "Monocle." We account for the rotation of the phone around the x-axis where the Yelp app does not, which means Yelp's labels can appear in incorrect locations. Yelp's camera view does currently display graphical overlays superior in quality to Imbue's, but Imbue's camera view draws its overlaid markers smoothly, while Yelp's are choppy and stutter frequently.

6. FUTURE WORK

Because Imbue includes both AR and social media features, we could provide a variety of new functionality on either the augmented or the social media side of the app. On the augmented side, storing additional points of interest in our own database is one of our first goals. This will allow markers to display significantly more information not contained in Google Places, such as picture galleries and other information tailored to the specific point of interest, like menus for restaurants. Expanding our database will also reduce our Google Places query load, which will become unreasonably high if the app gains even a small or average-sized user base. In order to expand our database, we will need to look into collecting data from other public domain information sources or from crowd-sourced, user-inputted data.

On the social media side, improvements can be made that provide additional ways of interacting with friends. Implementing group chat for friends sharing their location would be a useful feature. Showing more information about friends sharing their location, such as profile pictures, statuses, tweets from Twitter, and real-time updates of what friends are sharing, could also be a useful addition to the app.

Implementing graphics in OpenGL is left as future work. While we attempted to implement OpenGL rendering, we ran into a variety of issues that made displaying text and graphics unfeasible; the implementation was too inefficient and stuttered. However, OpenGL is widely used on mobile devices and should, with additional time, be relatively simple to implement. Aligning graphics in three-dimensional space will make the relationship between buildings and their overlays much clearer.

Finally, it would be ideal to implement Imbue for additional devices, such as the iPad and Google Glass. Preparing the app for iPad could be done extremely quickly, while implementing it for Google Glass, though more time consuming, could allow users to integrate the app into everyday life.

7. CONCLUSIONS

In developing Imbue, we created several novel features not currently available in apps for mobile devices while achieving many of the goals we set out to accomplish. We believe that many of the features we have been working on will soon appear in the applications we use today. We developed an efficient augmented reality algorithm ourselves, and we made strides in displaying information about the campus and surrounding areas. We also developed an interactive map view, which has since been adopted in Google Maps. Overall, we created an engaging augmented reality application for two platforms that we believe offers a fun, interesting, and new way to view location-based information.

8. ACKNOWLEDGMENTS

We would like to thank Prof. Brad Richards for overseeing the capstone and providing project guidance; Mike Rottersman, Associate Director of Admissions, for providing links to Puget Sound campus and building information for our database; and the University of Puget Sound Computer Science Department for a quality computer science education.