VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for...

8
VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania {jebyrne,jiwoong,jshi}@cis.upenn.edu Abstract What if every vision algorithm ever written was ready to run via browser, on images from any network connected camera? In this paper, we introduce visym.com, a new plat- form to easily share and distribute computer vision as a real time web service. We believe that this platform can have a broad impact for the vision research community. Specifi- cally, it can provide a means for seamless MATLAB algo- rithm and data sharing, transparent coarse grained paral- lelization, intermediate results reuse, and resource pooling, all without changing current R&D software practices. Fur- thermore, exposing vision as a real time web service opens up development to the web, to address the long tail of vi- sion applications. A beta test of this platform is currently underway at http://beta.visym.com. 1. Introduction Modern computer vision research and development has exploded since its origins in the 1950s to include a wide variety of applications. For example, a scan of research archives including CVPR, ICCV and PAMI shows over 107,000 papers with 7300 keywords associated with com- puter vision, with algorithms and applications ranging from abnormal event detection to zip code recognition. In gen- eral, these applications following a long tailed distribution, such that there are some popular applications such as face recognition, but many more niche applications. It can be ar- gued that the ”killer app” for vision is not a single popular application, rather it is a platform for distribution of niche apps. The challenge of distribution is that vision is a big data problem. Recent work in internet scale vision has moved to databases with millions of labeled training images [1, 2, 3], and this number will only increase [7]. Simply sharing ter- abyte scale databases is a challenge due to limited band- width, which means that practical sharing of datasets re- quires physically shipping hard drives between groups. Fur- thermore, sharing functions such as data driven classifiers trained on big datasets often have parameters which require Figure 1. VISYM is a platform for sharing and distribution of web- services for computer vision sharing the majority of the dataset as well [2, 5, 12]. Also, processing big datasets requires a large computational plat- form. Most vision algorithms exhibit coarse grained image parallelism, which can be exploited for large scale process- ing. However, setting up a cluster to handle peak load may not be cost effective for individual research groups, and these computational resources still need datasets close by. So, there is a need for a large scale vision platform to share big data with the computational resources to process this data by exploiting data locality. Vision is also a big, diverse data problem. Complex im- ages may require many functions operating in parallel on a single image to extract all relevant content [9]. Further- more, many algorithms share similar preprocessing and fea- ture extraction components, and these components are con- stantly evolving due to the dynamic nature of vision soft- ware research. So, there is a need for a big library of di- verse vision services to address complex images, and to al- low components to be seamlessly updated as necessary. Finally, vision would benefit from more reproducible re- search. There are of course many reasons why an author would not provide source code to accompany a paper, how- ever common issues include: lack of incentive to main- 1

Transcript of VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for...

Page 1: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

VISYM: A Platform for Large Scale Computer Vision

Jeffrey Byrne, Jack Sim and Jianbo ShiUniversity of Pennsylvania

{jebyrne,jiwoong,jshi}@cis.upenn.edu

Abstract

What if every vision algorithm ever written was readyto run via browser, on images from any network connectedcamera? In this paper, we introduce visym.com, a new plat-form to easily share and distribute computer vision as a realtime web service. We believe that this platform can have abroad impact for the vision research community. Specifi-cally, it can provide a means for seamless MATLAB algo-rithm and data sharing, transparent coarse grained paral-lelization, intermediate results reuse, and resource pooling,all without changing current R&D software practices. Fur-thermore, exposing vision as a real time web service opensup development to the web, to address the long tail of vi-sion applications. A beta test of this platform is currentlyunderway at http://beta.visym.com.

1. IntroductionModern computer vision research and development has

exploded since its origins in the 1950s to include a widevariety of applications. For example, a scan of researcharchives including CVPR, ICCV and PAMI shows over107,000 papers with 7300 keywords associated with com-puter vision, with algorithms and applications ranging fromabnormal event detection to zip code recognition. In gen-eral, these applications following a long tailed distribution,such that there are some popular applications such as facerecognition, but many more niche applications. It can be ar-gued that the ”killer app” for vision is not a single popularapplication, rather it is a platform for distribution of nicheapps.

The challenge of distribution is that vision is a big dataproblem. Recent work in internet scale vision has moved todatabases with millions of labeled training images [1, 2, 3],and this number will only increase [7]. Simply sharing ter-abyte scale databases is a challenge due to limited band-width, which means that practical sharing of datasets re-quires physically shipping hard drives between groups. Fur-thermore, sharing functions such as data driven classifierstrained on big datasets often have parameters which require

Figure 1. VISYM is a platform for sharing and distribution of web-services for computer vision

sharing the majority of the dataset as well [2, 5, 12]. Also,processing big datasets requires a large computational plat-form. Most vision algorithms exhibit coarse grained imageparallelism, which can be exploited for large scale process-ing. However, setting up a cluster to handle peak load maynot be cost effective for individual research groups, andthese computational resources still need datasets close by.So, there is a need for a large scale vision platform to sharebig data with the computational resources to process thisdata by exploiting data locality.

Vision is also a big, diverse data problem. Complex im-ages may require many functions operating in parallel ona single image to extract all relevant content [9]. Further-more, many algorithms share similar preprocessing and fea-ture extraction components, and these components are con-stantly evolving due to the dynamic nature of vision soft-ware research. So, there is a need for a big library of di-verse vision services to address complex images, and to al-low components to be seamlessly updated as necessary.

Finally, vision would benefit from more reproducible re-search. There are of course many reasons why an authorwould not provide source code to accompany a paper, how-ever common issues include: lack of incentive to main-

1

Page 2: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

tain open source, embarrassment about sharing quick anddirty code, IP or ITAR restrictions and headaches of pack-aging and regression testing. Also, even when a MATLABtoolbox is available, it can be difficult to use due to MAT-LAB version incompatibilities, licensed toolbox require-ments and library dependencies. Since MATLAB is thedominant language in vision, there is a need to enable re-searchers to quickly and easily share MATLAB code with-out exposing source, to allow colleagues to reproduce re-sults and test on new images. Furthermore, making suchfunctions available would allow non-vision developers totest and evaluate, opening development to the broader com-munity on the web.

In this paper, we introduce the Visym platform to theresearch community to address these observations. TheVisym platform is:

1. Server appliance: We provide a software appliance,a virtual machine image containing a light web server,database, and web service handler supporting MATLABservices. This appliance is the core building block of theVisym platform. Appliances can be provisioned to runon standard hypervisors in any infrastructure-as-a-serviceprovider or deployed in a user’s local cluster, to providelarge scale distributed processing. Every appliance hasaccess to all data and services available on the platformthrough a sharded URI to URL naming scheme, and everyappliance exposes a web service API for web based servicecalls and low latency data access.

2. Web service API: A web based programming interfacethat provides a language agnostic method of remote proce-dure calls of vision services. The API is consumable by anylanguage, and is accessible from any appliance via browseror mobile devices.

3. MATLAB client tools: We provide an installable MAT-LAB toolbox which abstracts the API to native MATLABobjects. Calling vision services on the platform is simplyprogramming in MATLAB, and building new vision ser-vices from existing MATLAB code is a one-click operation.All API calls are asynchronous, and they provides trans-parent parallelization based on referentially transparent ser-vices, lazy evaluation of results, and strict consistency fromMATLAB variable causality. MATLAB clients get coarsegrained parallelization for free.

In this paper, we describe on the architectural design ofthe server appliance and API, then describe the MATLABclient tools from a user perspective. Next, we evaluate thetransparent parallelization performance of three commonvision toolboxes built using the MATLAB client tools, exe-cuted using multiple server appliances deployed in an oper-ational cloud. This demonstrates scalability and usability tojustify large scale deployment.

2. Related WorkIn the early 90s, DARPA introduced the Image Under-

standing Environment (IUE), a five year research programto provide a common development environment for vision[6]. The goal was to increase research productivity andtechnology transfer by enabling sharing and reuse of func-tions and intermediate results. This goal was not fully real-ized, however there were two primary lessons learned ofrelevance to this effort. First, the IUE provided an APIdefining common data structures, however this centralizedprocess was carried out in closed workshops and did notaddress the needs of the long tail. Second, IUE forced re-searchers to use C++ and Lisp, so those who develop inother languages were required to port their code, a time con-suming process with few research incentives. The lessonslearned are that a vision platform must be language agnos-tic, and the API development must be open to the commu-nity.

Internet Vision is a growing research area, where onelong term goal is the Visipedia [9], which highlights thetechnical challenges for vision on the web. This platformcould provide the computational engine and web servicesfor the automata in such a community driven effort. Re-lated efforts in large scale labeled image datasets includeLabelMe [1], ImageNet [2], Lotus Hill [3], where total im-ages have exceeded eleven million. However, even withthese dataset sizes, analysis shows that larger datasets areneeded [7]. Recent examples of large scale vision thatwould benefit from a vision platform include scene parsing[10], scene understanding [11], object categorization [12],localization [5], structure from motion [13], landmark clas-sification [14].

A platform for large scale vision is an example of dataintensive, high performance computing (HPC). Condor isa popular workload management system for grid comput-ing, however Condor requires distributed computing exper-tise for deployment and maintenance, and Condor job sub-mission tools can be unintuitive for non-experts. Hadoopprovides tools for MapReduce based batch processing oflarge datasets, however these tools are optimized for highthroughput over low latency of data access due to the as-sumptions of the hadoop distributed file system containinghuge files split into blocks. Vision data has a natural atomicsize (e.g. ”the image”), which does not require the over-head of a full distributed file system, and also web deploy-ment requires low latency data access. Microsoft Dryad [8]provides coarse grained parallelism for jobs with sparse de-pendencies between subtasks, rather than embarrassinglyparallel operations expressible in MapReduce or Condor.However, this requires the user to provide a directed acyclicgraph expressing job causality constraints for scheduling.

Platform-as-a-service (PaaS) is a commercial offeringwhich provides a platform for distribution of custom de-

Page 3: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

signed webapps. Platforms such as Google AppEngine,Microsoft Windows Azure, Joyent, Heroku and Engineyardallow users to deploy Python, Java or Ruby on Rails we-bapps without worrying about scalability. Infrastructure-as-a-Service (IaaS) providers such as Amazon, Rackspace, orGoGrid provide utility computing and distributed storagerented on demand to customers. This paper can be consid-ered a PaaS provider for MATLAB apps, leveraging IaaSfor scalability.

Finally, MathWorks provides tools for MATLAB webdeployment and distributed computing. The MATLABcompiler and MATLAB builder NE/JA provides tools forbuilding and deploying MATLAB applications against theMATLAB compiler runtime (MCR) royalty free. Howeverthese solutions are not language agnostic, require ¿200MBMCR installation and are synchronous, limiting client par-allelization. MATLAB Parallel Computing and DistributedComputing toolbox provide fine and coarse grained paral-lelism, however they require license fees which limit scala-bility. This effort is most closely aligned with MATLAB*P[21] for transparent parallelization through operator over-loading.

3. Platform ArchitectureThe design goals of the platform architecture were de-

fined to address the observed need in the research com-munity as outlined in section 1. Specifically, these goalsare: scalable, decentralized, interactive, sharable and userfriendly. Scalable and decentralized operation, as well asother distributed systems goals such as fault tolerance, loadbalancing and scheduling are well known and achievablegoals in the distributed system literature, and will not beaddressed in detail here.

The primary challenges that are unique to this platformare client interactivity and sharing. As discussed in section2, batch processing tools do exist to create massively dataparallel pipelines. However, while monitoring tools suchas Flume and Hue allow a user to submit batch jobs andmonitor status of Hadoop clusters, such tools do not tightlyintegrate with the developer’s design workflow. We believethat vision researchers want to write in familiar languageslike MATLAB and they want to monitor the intermediateresults as MATLAB variables to quickly view the resultsduring evaluation. This is possible since vision datasetsand results are typically characterized by a large numberof megabyte scale files (images, metadata and parameters),rather than a small number of terabyte scale files commonin batch jobs. Furthermore, client interactivity includes webclients, and web interactivity is not a use case for batch pro-cessing. Second, since vision datasets are characterized bya large number of files and more importantly a large numberof functions for data analytics, our goal is to enable sharing.A typical batch job assumes that input data is manually pro-

vided as input to the job creation script, and analytics arecustom designed by the batch submitter. In contrast, com-plex images require many functions that are developed byother researchers, and many share intermediate results fromprevious executions. Vision researchers should have a plat-form optimized for image data analytics, so our goal is tomake it easy to share and discover functions and data cre-ated by others on the platform.

The platform architecture is shown in figure 2. TheVisym platform is a set of server appliances deployed in hy-pervisors at visym.com, IaaS providers and at partner loca-tions. Visym client tools allow MATLAB and Python usersto easily connect to the visym platform, and to access anyURI deployed on the platform.

A Visym server appliance is a virtual machine imagethat includes: a light weight HTTP server for static con-tent, a FastCGI server for dynamic content, a Python basedweb service handler, a non-relational database, and a se-cure chroot jailed MATLAB process server containing theMATLAB compiler runtime (MCR). The MATLAB serverspawns K processes, each with an independent MCR, totake advantage of multicore servers. Each MATLAB pro-cess executes shared object libraries built using the MAT-LAB compiler, reusing the MCR initialization to avoid sig-nificant startup overhead. These appliances handle APIcalls to run MATLAB services, where inputs and functionobjects are fetched by URI, and results are cached locallyby URI for reuse. This API is described in section 3.1.

Uniform Resource Identifiers (URI) are used to provideidentification and versioning to all data and functions on theplatform. Appliance databases are sharded by URI, to hor-izontally partition and load balance across appliances. Theplatform API provides name server functionality, to forwardURI requests to primary or secondary appliances, which re-spond with a URL for resource fetch. The URI scheme isdescribed in section 3.2.

The Visym client tools enable users to natively con-nect to the Visym platform API. Currently, only MATLABclients are supported, however Python support is in develop-ment. The MATLAB client tools abstract the serializationof native datatypes into uploaded MAT or HDF5 files, gen-erate URLs for remote procedure calls, and handle unseri-alization of results back to MATLAB datatypes. The clienttools provide support for monitoring service calls, throw-ing exceptions and displaying service logs. Furthermore,the client tools allow users to create new services from theirMATLAB code. The MATLAB client tools are describedin section 3.3 and building new services in section 3.4.

Consider the following use case. Researcher AL writesa new paper, and has a MATLAB function (f1.m) usedfor evaluation and a set of support functions (f2.m,, fj.m,fk.mex,). AL installs the Visym MATLAB client tools andruns visym.build(’f1.m’), which locates all dependencies,builds a service for f1.m and returns a service URI F. Re-

Page 4: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

Figure 2. VISYM Architecture

searcher BE wants to test AL’s code. Instead of sharing atoolbox on his website, AL shares URI F with BE. BE in-stalls the client tools and runs f = visym.function(F), whichreturns a function handle f. BE can process N of her imagesin parallel using AL’s function:

>> for k=1:N>> myoutput{k} = f(myimage{k})>> end

This loop will return immediately, and the cell array my-output stores ticket variables. Each ticket has a URI refer-encing the result of ALs function on the platform. Thesevariables overload all MATLAB operators, and can be usednatively just like any other variable. When the variable isreferenced, the result is downloaded (or blocked until theresult is available) and cached. The client tools make ser-vice calls easy, and the platform handles the parallelizationby scheduling API calls to available appliances, all trans-parently to BE.

3.1. Web Services

The Visym platform includes a REST-RPC hybridweb service API to provide vision-as-a-service function-ality. RESTful web services define an API using stan-dard HTTP operations (GET, POST, PUT, DELETE) onURL identified resources, while remote procedure call(RPC) web services define unique functions and argu-ments posted to a service endpoint. In this platform, asimple helloworld service call is expressed by an HTTPGET to http://beta.visym.com/api?f=:func:test.helloworld&args=[”me”]. The arguments in the querystring (follow-ing ?) define a built-in function URI :func:test.helloworldidentifying the function to execute, called with a javascriptobject notation (JSON) encoded argument list ([”me”]).Simple vision examples include KLT features [15], andminimum spanning tree segmentation [16] available atbeta.visym.com/demo.

Features of this web service include: (i) asynchronous,responses where service calls return immediately with aJSON encoded transaction URI response or ticket. Tickets

can be fetched to check status, throw exceptions and showMATLAB command window output of a long running ser-vice call, (ii) session based authentication, (iii) JSON serial-ization to enable JSONP cross domain web services whichenable web mashups, (iv) argument formatted as URI, URLor JSON encoded literals and (v) service controls for emailnotification and response formatting by adding keywords tothe querystring.

3.2. URIs

Effective sharing of vision data means providing aunique ID to identify images, intermediate results and func-tions. These IDs should be easy for humans to remember,assigned in a decentralized manner without conflicts, easilyversionable, persistent, and should provide a unique ID forevery pixel in every image. In this section, we introduce avariation of the tag Uniform Resource Identifier (tag-URI)[17] to achieve these design goals.

Visym URIs are defined by the following general syntax:

:owner:tagname-tagversion:type:id#fragment

where the full ABNF syntax specification is provided in[17]. The owner is a well formed email address, thetagname-tagversion defines a set label and option dotteddecimal version number, the type defines the data type [img,func, data, txn, var] for images, functions, data, transactionsor variables. The id identifies an element in the set (e.g. im-age in a dataset), and the fragment identifies a subset ofelements in the id (e.g. region of interest in an image).

Visym URIs can be vectorized by dropping the id. Forexample, :[email protected]:myset-1.2:img: is a well formedURI which refers to the set of all images tagged myset-1.2.Furthermore, for well known datasets, the owner can im-plied :myset-1.2:img: This is a concise means to referencelarge, user defined datasets.

Finally, Visym URIs are immutable. All URIs havewrite once, read many access model. This facilitates datacoherency given concurrent access, and also provides ver-sioned ”snapshots” of data and functions that never change.

Page 5: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

3.3. Matlab Client ToolsWe provide MATLAB client tools to seamlessly inter-

face with the Visym service. These MATLAB tools are in-stallable from the MATLAB prompt:

>> eval(urlread(’http://beta.visym.com/installer.m’));

which will guide the user through the installation and setupprocess. The MATLAB client tools provide the followingfeatures deployed as a package of classes: service func-tion calls, user management, configuration file manage-ment, operator overloading, data management for data up-load/download, exception handling and service building.

The client tools were designed to be MATLAB nativeand seamless for MATLAB users. Calling a Visym ser-vice function looks to the end user just like calling a localMATLAB function with MATLAB variables. For example,a function call to compute normalized cuts [18]:

>> f_ncuts = visym.function(’:func:ncuts’);>> C = f_ncuts(img, 10);>> imagesc(C());

visym.function is a method provided by the client tools forremote service calls. Visym.function is passed a URI iden-tifying a service function (a built-in URI :func:ncuts) andreturns a function handle (f ncuts). When this function han-dle is called, the client tools serialize and upload the usersupplied image variable (img) to the platform, then com-pute multiscale ncuts on the platform with 10 segments.The function call is asynchronous, and returns immediately,such that output argument ’C’ encodes a transaction URI.This variable overloads all standard MATLAB operators,and acts to the user like the actual ncuts result in a variableC. The user can monitor the status of the function execu-tion (C.state()), and seamlessly download results for userdisplay when execution completes.

The same function can be called using an existing imageURI.

>> C = f_ncuts(’:img:tiger’, 15)

would perform ncuts on a platform built-in image URI(:img:tiger), which does not require upload of the full im-age prior to execution. Furthermore, output arguments canbe shared and reused. For example, suppose there existsan second function g which uses the output C as an input.The client tools exhibit lazy evaluation, where variables arenot downloaded until contents are accessed. When g(C) iscalled, the client tools include the data URI for C in theservice call rather than downloading the intermediate re-sult. However, if C is accessed (e.g. region of interestC(1:10,1:10)), then the variable is downloaded and cachedlocally as expected.

Figure 3. Experimental Results

3.4. Building MATLAB Services

Finally, the visym.build function provides a means forusers to build their own services. With one click, the toolsparse a directory to determine all M and MEX source de-pendencies, warn the user about incompatible functions,and upload the package to the platform for building. Whencomplete, the platform emails the user with a function URIfor use by visym.function and to share this function withothers.

Building services requires that functions are referentiallytransparent. Referential transparency means that a functionhas no ”side effects” and can be replaced by its output ar-guments without changing the program. This approach hasbeen used in functional programming to enable caching andtransparent coarse grained parallelism. Referential trans-parency is enforced during building by parsing dependen-cies for disallowed keywords that change global state (e.g.global variables, file writes, system calls) and prompt theuser for modification. MEX files are exempt from thisparse, and are recompiled on the server assuming they arereferentially transparent.

The resulting MATLAB services are called usingvisym.function using the returned function URI or could bedirectly called via REST-RPC API. The semantics of theinput and output arguments are defined by the user, whichleaves the specific API for sharing vision functions up to thecommunity.

4. Experimental Results

In this section, we evaluate the parallelization perfor-mance and service overhead for the platform. We chosethree commonly used MATLAB toolboxes: multiscale nor-malized cuts [18], multiscale global Pb [19] and part basedobject detection [20]. These toolboxes were chosen due totheir wide citation, and since they have nontrivial runtimeper image and would benefit from parallelization and reuse.

Page 6: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

Figure 4. Service Overhead

We were able to build a service for these toolboxes fromeach of their latest releases without making any modifica-tions to the source code.

The experimental evaluation included the Visym serverappliance deployed on four 512MB provisioned slices inRackspace Hosting. Each appliance was running K=1MATLAB processes, and one appliance was manuallyelected to provide simple round robin scheduling of APIcalls. A remote MATLAB user was running the client toolsto build services for interface functions in each toolbox, andto upload 200 Berkeley segmentation training set images tothe appliances. The resulting image and function URIs wereused to process N images on between one and four servers,with a 2 sec poll interval to check completion status of eachservice call. Analysis was limited to four servers simply todemonstrate the scalability trend.

Figure 3 shows the total runtime for each algorithm us-ing 1, 2 or 4 appliances. This shows that the runtime per-formance is inversely proportional to the number of servers,since a 512MB slice from Rackspace provides at most onevirtual core. This demonstrates transparent parallelizationfor an embarrassingly parallel input. Figure 4 shows theoverhead statistics for this experiment. The average over-head for the service is 0.267 second per call. This overheadincludes time for downloading results of algorithms to lo-cal client. We assumed a scenario where there is no uploadrequired except parameters where all input images are pre-uploaded.

Next, we compared the performance to existing paral-lelization tools using Condor. We installed Condor on allappliance servers, compiled NCuts to a standalone exe-cutable using the MATLAB compiler, and wrote a Condorjob submission script to replicate service execution. Re-sults show that runtime performance is 1140.3 s, 583.3 s,and 292.1 s for 1, 2 and 4 server execution, nearly identicalto platform performance in Figure 3. Standalone executa-bles reload the MATLAB MCR on every execution, and themean reload time is 1.4 sec. This is approximately equalto the 2 second polling interval for service execution. Thisshows that Visym platform is equivalent in performance totraditional batch execution, but provides additional featuresof interactivity, native MATLAB integration and web de-ployment.

4.1. MATLAB Demo

Figure 5 shows a demonstration of the Matlab clienttools included in the visym installation. This provides awalkthrough of the primary feature of the Matlab clienttools and how they are used for platform interactivity andsharing. This figure is also available in a larger format athttp://beta.visym.com/overview.

4.2. Web Demo

Finally, we provide a demonstration of calling vision ser-vices from javascript in browser. This demonstration isavailable at http://beta.visym.com/demo and allowsthe user to call specific functions on images from any URLand view results. This interface can also be used for eval-uation of new MATLAB services, where the image resultdisplayed is the contents of the last figure plotted by theMATLAB service. This provides a built-in browser demofor any vision service for free. The supplementary materialshows a video of this demo and usage of the client tools.

5. Future Work

In this paper, we have described a platform for largescale vision and evaluated common use cases. However,there are a number of remaining challenges. We describedthe server appliance and end user tools of relevance to thevision community, but only briefly touched on distributedsystems issues: scheduling, fault tolerance, load balancing,data archival, data consistency, synchronization, replicationand security. We believe that these issues can be cleanlyaddressed within the proposed framework, however suchissues will be explored further during the open beta. Fur-thermore, we addressed embarrassingly image parallel jobs,rather than parallelization with sparse dependencies amongcomponents which characterize the majority of vision al-gorithms. This parallelization is a natural extension of theresults shown, however evaluation will be future work.

References[1] A. Torralba, R. Fergus and W. Freeman 80 million tiny im-

ages: a large dataset for non-parametric object and scenerecognition PAMI, vol.30(11), pp. 1958-1970, 2008. 1, 2

[2] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, ImageNet: A Large-Scale Hierarchical Image Database.CVPR, 2009 1, 2

[3] Z.Yao, X. Yang, and S. Zhu, Introduction to a large scalegeneral purpose ground truth dataset: methodology, annota-tion tool, and benchmarks, 6th Int’l Conf on EMMCVPR,August 2007. 1, 2

[4] J. Hays and A. Efros. Scene Completion Using Millions ofPhotographs. ACM Transactions on Graphics (SIGGRAPH2007). August 2007, vol. 26, No. 3.

Page 7: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

Figure 5. MATLAB client tools demonstration script

[5] J. Hays and A. Efros: IM2GPS: estimating geographic infor-mation from a single image. CVPR 2008 1, 2

[6] J. Mundy, The Image Understanding Environment Program,IEEE Intelligent Systems, pp. 64-73, Dec, 1995 2

[7] J. Ponce, M. Hebert, C. Schmid, A. and Zisserman, eds.2006. ”Dataset issues in Object Recognition”, in TowardCategory-Level Object Recognition, volume 4170 of LectureNotes in Computer Science. Springer. 1, 2

[8] M. Isard, M. Budiu, Y. Yu, A. Birrell, and D Fetterly, Dryad:Distributed Data-Parallel Programs from Sequential Build-ing Blocks European Conference on Computer Systems (Eu-roSys), Lisbon, Portugal, March 21-23, 2007 2

[9] P. Perona, Visions of a Visipedia. Proceedings of the IEEEAugust 2010. Vol. 98, Issue: 8, Pages 1526-1534 1, 2

[10] C. Liu, J. Yuen, A. Torralba. Nonparametric scene parsing:label transfer via dense scene alignment. CVPR 2009. 2

[11] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba: SUNdatabase: Large-scale scene recognition from abbey to zoo.CVPR 2010: 3485-3492 2

[12] J. Deng, A. Berg, K. Li and L. Fei-Fei. What does classifyingmore than 10,000 image categories tell us? ECCV, 2010. 1,2

[13] S. Agarwal, N. Snavely, I. Simon, S. Seitz and R. SzeliskiBuilding Rome in a Day, ICCV, 2009, Kyoto, Japan. 2

[14] Y. Li, D. Crandall, and D. Huttenlocher, Landmark Classifi-cation in Large-Scale Image Collections, ICCV 2009 2

[15] J Shi and C. Tomasi, Good Features to Track, CVPR’94,1994, pp. 593 - 600. 4

[16] P. Felzenszwalb and D. Huttenlocher , Efficient Graph-BasedImage Segmentation International Journal of Computer Vi-sion, Volume 59, Number 2, September 2004 4

[17] T. Kindberg, S. Hawke, The ”tag” URI Scheme, RFC 41514

Page 8: VISYM: A Platform for Large Scale Computer Vision€¦ ·  · 2013-05-01VISYM: A Platform for Large Scale Computer Vision Jeffrey Byrne, Jack Sim and Jianbo Shi University of Pennsylvania

[18] T. Cour, F. Benezit, and J. Shi. Spectral Segmentation withMultiscale Graph Decomposition. CVPR, 2005 5

[19] M. Maire, P. Arbelaez, C. Fowlkes and J. Malik. Using Con-tours to Detect and Localize Junctions in Natural Images.CVPR 2008 5

[20] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ra-manan, Object Detection with Discriminatively Trained PartBased Models. To appear in the IEEE PAMI. (v4) 5

[21] R Choy, A. Edelman and C Moler, Parallel MATLAB: Doingit right, Proceedings of the IEEE, 2005, p331341 3