Setalyzer
Adam Lerner & Will Scott

Introduction & Motivation

Setalyzer is a program for solving a common problem when playing the Set card game: no player sees a legal move, but nobody is confident that a valid move does not exist. To address this problem, we built an Android application which uses a set of computer vision techniques to identify the cards in play and then augment the game with the remaining moves.

The Set game itself is a computationally tractable problem space. There will typically be 12 cards in play, and a move consists of matching three cards. This means that there are 12 choose 3 = 220 possible moves at any given time, which can be fully explored very quickly. The harder problem is correctly identifying the cards in play quickly and without user interaction, so that the application can be useful in context.
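As a concrete illustration of why this search is cheap: if each attribute is encoded as an integer in {0, 1, 2}, three cards form a set exactly when every attribute's sum is divisible by 3 (all-same sums to 0, 3, or 6; all-different sums to 3). The following minimal sketch, with hypothetical names rather than the actual Setalyzer code, enumerates all C(12, 3) = 220 triples this way:

    // Solver sketch: attributes encoded as ints in {0, 1, 2}.
    final class SetSolver {
        static final class Card {
            final int[] attrs;  // {color, fill, count, shape}
            Card(int color, int fill, int count, int shape) {
                this.attrs = new int[] { color, fill, count, shape };
            }
        }

        // Three cards form a set iff each attribute is all-same or
        // all-different, i.e. each attribute column sums to 0 mod 3.
        static boolean isSet(Card a, Card b, Card c) {
            for (int i = 0; i < 4; i++) {
                if ((a.attrs[i] + b.attrs[i] + c.attrs[i]) % 3 != 0) return false;
            }
            return true;
        }

        // Exhaustively checks all C(n, 3) triples; 220 checks for n = 12.
        static java.util.List<int[]> findSets(Card[] cards) {
            java.util.List<int[]> moves = new java.util.ArrayList<int[]>();
            for (int i = 0; i < cards.length; i++)
                for (int j = i + 1; j < cards.length; j++)
                    for (int k = j + 1; k < cards.length; k++)
                        if (isSet(cards[i], cards[j], cards[k]))
                            moves.add(new int[] { i, j, k });
            return moves;
        }
    }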

Approach

We built Setalyzer as an Android application. This means that Setalyzer is programmed in pure Java, and feature extraction and classification of camera data must be calculated within tight memory and time bounds. Because transformations of the image are expensive, we rely on the underlying image representation used by Android; surprisingly, this meant there was very sparse existing machinery for us to build upon, and much of our time was spent building the system for extracting, manipulating, and classifying images.

Our analysis process is split into two main components: a training phase, itself split into two parts, and a live evaluation phase. The resulting application performs the following steps in live evaluation: we take an image from the phone's camera, then segment and classify that image. Segmentation identifies cards in the image and reconstructs them as face-on rectangles. Classification extracts features from each card image and evaluates them using a pre-trained classifier. After classification, results are drawn back on the device screen.
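In outline, the live evaluation path looks roughly like the following sketch; all of these class and method names are hypothetical, not the actual Setalyzer API:

    // One pass of live evaluation (hypothetical names throughout).
    Bitmap frame = cameraPreview.capture();                  // image from the camera
    List<Bitmap> cardImages = segmenter.extractCards(frame); // face-on rectangles
    List<Card> cards = new ArrayList<Card>();
    for (Bitmap cardImage : cardImages) {
        double[] features = extractor.features(cardImage);   // per-card features
        cards.add(classifier.classify(features));            // pre-trained classifier
    }
    overlay.drawResults(cards);                              // results drawn on screen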

In order to train the classifier, we created a desktop training program to allow us to hand-label a test corpus of images we took. Once these images are labeled, we can extract features from them to produce the attribute relation files used by the Weka classification system. Since the feature extraction code is platform specific (it relies on image manipulation of the native Android Bitmap image format, which is a Java interface to ARM native code), determining features on the test image set requires extraction on the phone after labeling, to ensure that we have representative features for those images.
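The ARFF generation step could look like the following sketch, assuming a recent Weka; LabeledCard and the feature names are hypothetical placeholders, and only the shape characteristic is shown:

    import java.io.File;
    import java.util.ArrayList;
    import java.util.Arrays;
    import weka.core.Attribute;
    import weka.core.DenseInstance;
    import weka.core.Instances;
    import weka.core.converters.ArffSaver;

    // Writes an ARFF file for the shape characteristic from labeled cards.
    static void writeShapeArff(Iterable<LabeledCard> labeledCards) throws Exception {
        ArrayList<Attribute> attrs = new ArrayList<Attribute>();
        for (int i = 0; i < 12; i++) {
            attrs.add(new Attribute("f" + i));  // the 12 numeric features
        }
        attrs.add(new Attribute("shape", Arrays.asList("oval", "diamond", "squiggle")));

        Instances data = new Instances("setalyzer-shape", attrs, 0);
        data.setClassIndex(data.numAttributes() - 1);

        for (LabeledCard card : labeledCards) {
            double[] row = Arrays.copyOf(card.features(), 13);  // 12 features + class
            row[12] = attrs.get(12).indexOfValue(card.shapeLabel());
            data.add(new DenseInstance(1.0, row));
        }

        ArffSaver saver = new ArffSaver();
        saver.setInstances(data);
        saver.setFile(new File("shape.arff"));
        saver.writeBatch();
    }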


Our classifier must be robust on four axes. Each card in the game has one value for each of four trinary characteristics: color, fill pattern, count, and shape.

Classification and Features

Our classification algorithm takes as input a series of features extracted from a tightly cropped image of a card and uses those features as inputs to a trained classifier. Using our current features, we ran preliminary tests using a variety of classifier types provided by Weka, including Naive Bayes, Bayes Nets, IB1 (a nearest-neighbor classifier), several types of meta boosting classifiers, and decision trees. None of these approaches produced satisfactory results even over a small set of test cases, with the problem stemming from low signal in our feature extraction rather than from the classification technique.
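A comparison along these lines can be scripted directly against Weka's API; the sketch below (the file name carried over from the hypothetical ARFF example above) cross-validates a few of the classifier types we tried:

    import java.util.Random;
    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.bayes.BayesNet;
    import weka.classifiers.bayes.NaiveBayes;
    import weka.classifiers.lazy.IB1;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    // 10-fold cross-validation of several candidate classifiers.
    static void compareClassifiers() throws Exception {
        Instances data = DataSource.read("shape.arff");
        data.setClassIndex(data.numAttributes() - 1);

        Classifier[] candidates = { new NaiveBayes(), new BayesNet(), new IB1(), new J48() };
        for (Classifier c : candidates) {
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(c, data, 10, new Random(1));
            System.out.printf("%s: %.1f%% correct%n",
                    c.getClass().getSimpleName(), eval.pctCorrect());
        }
    }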

For features, we currently extract 12 features from each card image. Nine of those features are pixel color value averages, each sampled from a different portion of the image, including the whole image, diagonal transects, and the center of the image. The final three features are built from a distance metric based upon SURF (Speeded Up Robust Features). We calculate SURF features for the image to be classified, which provides a list of points of interest in the image. A strength of SURF for this application is that it encourages these points to be calculated similarly for images which vary in both rotation and size, making our features scale- and rotation-invariant.
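Each of the nine color features reduces to a mean over a pixel region. A minimal sketch of one such feature on the Android Bitmap representation (the region arguments are illustrative, not our actual transects):

    import android.graphics.Bitmap;
    import android.graphics.Color;

    // Mean RGB over a rectangular region of the cropped card image.
    static float[] meanColor(Bitmap card, int left, int top, int width, int height) {
        int[] pixels = new int[width * height];
        card.getPixels(pixels, 0, width, left, top, width, height);
        long r = 0, g = 0, b = 0;
        for (int p : pixels) {
            r += Color.red(p);
            g += Color.green(p);
            b += Color.blue(p);
        }
        float n = pixels.length;
        return new float[] { r / n, g / n, b / n };
    }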

We also calculate SURF features for three canonical examples of cards. Then we apply a greedy association algorithm which matches similar SURF points between our example and those canonical cards. From this we derive one feature per canonical card: the number of matches which the association algorithm finds between the features of the example and the canonical card. The example card's similarity to known cards seemed a likely candidate for a useful feature.
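The association step can be summarized as greedy nearest-descriptor matching under a distance cutoff; the sketch below uses plain double[] descriptors and hypothetical names rather than BoofCV's own association classes, and our actual algorithm may differ in its details:

    import java.util.ArrayList;
    import java.util.List;

    // Greedily pairs each example descriptor with its closest remaining
    // canonical descriptor, counting pairs closer than maxDistance.
    static int countGreedyMatches(List<double[]> example, List<double[]> canonical,
                                  double maxDistance) {
        List<double[]> pool = new ArrayList<double[]>(canonical);
        int matches = 0;
        for (double[] desc : example) {
            int best = -1;
            double bestDist = maxDistance;
            for (int i = 0; i < pool.size(); i++) {
                double d = euclidean(desc, pool.get(i));
                if (d < bestDist) { bestDist = d; best = i; }
            }
            if (best >= 0) { pool.remove(best); matches++; }  // one match per point
        }
        return matches;
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; sum += d * d; }
        return Math.sqrt(sum);
    }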

Between these features, we expected that the average color value features would help characterize the color property of the cards, as well as possibly the fill pattern and count; for example, central pixel values will on average be lighter in a card which has a count of 2 (and thus no symbol directly centered in the card). We expected that the distance metric features should primarily capture count and shape, since they are invariant to color, and SURF features don't seem to vary significantly with fill pattern (they appear largely on the edges of the symbols). We discuss a number of possible improvements to our features at the end of the evaluation section, below.


Evaluation

The client interface. Setalyzer marks cards which form a set. The bottom toolbar allows the phone to act as a feature extractor for previously labeled images.

The Setalyzer training interface, allowing labeling of captured images. The labels are stored, and a second application computes features for each labeled card.


An important requirement of the Setalyzer system is that it run on an Android handset. To evaluate this, we looked at the performance of both training and execution on samples. We will note that the code uses several idioms which are known to be expensive on Android, following how the existing BoofCV library extracts features: for example, image buffers are recreated several times, which is highly discouraged because repeated interaction with Android heap management is much slower than reusing a single buffer. Thus we consider these numbers primarily as diagnostics for where to focus optimization efforts, rather than as characterizing the runtime our application might eventually achieve.
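For illustration, the buffer reuse idiom in question looks like the following (hypothetical names; not the actual extraction code):

    // Reusing one pixel buffer across frames avoids repeated Android heap
    // allocations, which dominate when a fresh buffer is created per frame.
    private int[] pixelBuffer;  // allocated once, grown only when needed

    void processFrame(Bitmap frame) {
        int needed = frame.getWidth() * frame.getHeight();
        if (pixelBuffer == null || pixelBuffer.length < needed) {
            pixelBuffer = new int[needed];
        }
        frame.getPixels(pixelBuffer, 0, frame.getWidth(), 0, 0,
                        frame.getWidth(), frame.getHeight());
        // ... feature extraction reads from pixelBuffer ...
    }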

Feature generation for the 179 test images we used took approximately 30 minutes (approximately 10 seconds per image). Part of this time was spent on data transfer overhead, since the full image was transferred to the phone for each calculation.

Full processing of an image currently takes 40 seconds when cards are present, which is unacceptably long. Three quarters of that time is spent on SURF feature extraction, which is by far the most expensive part of the analysis. Segmentation completes in well under a second, and much of the rest of the time is spent in the overhead of transforming detected cards and converting image formats.

Visual representation of the confusion matrix for identifying shapes using a Naive Bayes classifier against 179 training samples. The classifier preferred ovals even though the samples were evenly distributed across shapes, correctly classifying 33.5% of test instances in cross-validation.

The accuracy of each of the classifiers we tried ranged between 30% and 35% for each trinary characteristic. For example, the Bayes Net classifier had 33.2% accuracy, and a Bagging meta classifier was able to reach 36.3% accuracy for the shape property. This is not significantly better than the 33.3% a random classifier would achieve on a trinary characteristic, and we believe this indicates that our features do not provide sufficient signal to identify cards.


We have considered a number of additional features and improvements to the current features. We currently use only three canonical cards, which do not fully represent the space of possible cards. An extension would be to calculate the distance metric from each of the 81 possible cards in the deck, or from a well-distributed subset of the deck if calculating all 81 distance features takes too long given the computational resources of an average Android device. We might also tweak the distance metric by incorporating the quality of point matching and the locations of the matched points. Other possible features include those derived from corner or edge detection algorithms; color distributions from the overall image, or from all cards extracted from the containing image, to account for lighting variation; and simpler features such as the total proportion of "white" vs. "dark" pixels in the image, or the number of contiguous regions of each type of pixel. Since segmentation involves computing a binary threshold of the image, we can reuse the results of those computations as classification features for free.
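For example, the white-versus-dark proportion is a single pass over the binary mask that segmentation already produces; a minimal sketch, assuming the mask is exposed as a boolean array (a hypothetical representation):

    // Proportion of "white" pixels in the already-computed threshold mask.
    static double whiteProportion(boolean[] thresholdMask) {
        int white = 0;
        for (boolean isWhite : thresholdMask) {
            if (isWhite) white++;
        }
        return white / (double) thresholdMask.length;
    }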

Conclusions & Future Work

One of the results of this project which we are most excited about is that we were able to build a native Android system for evaluating feature extraction routines. OpenCV is a library providing existing feature extractors, but it does not support Android natively, nor does it provide an integrated mechanism for evaluating the significance of extracted features. As part of Setalyzer we ended up building a generic component allowing for iterative evaluation of feature extractors, and while we were unsuccessful in finding appropriate features for our vision problem, we think the evaluation framework can become a valuable artifact.

The future work remaining for Setalyzer rests on determining a set of tractable, high quality features. Additionally, the application could be restructured so that, instead of evaluating the features in each camera frame individually, we loosen the time requirement for evaluation and run a simple motion tracking algorithm between frames to update the overlay. That change would allow for a less jerky user experience, but requires an additional layer of computation.

Appendix: Logistics

This project was naturally parallelizable. Will worked primarily on the training system and the pipeline for evaluating feature detection, while Adam worked on the primary execution pathway of segmentation and feature extraction.

Setalyzer uses Weka for classification, and the BoofCV library for some feature extraction algorithms. BoofCV was chosen because it is a native Java codebase which runs on Android, unlike the more powerful and more commonly used OpenCV. The downside is that BoofCV provides only a minimal set of tools, such as edge extraction and thresholding.

To run Setalyzer, check out the source from https://github.com/willscott/setalyzer. The setalyzer project contains a functioning Android application, which can be installed on Android devices using the Eclipse Android plugin. Alternatively, packaged APK binaries are uploaded to GitHub.


Appendix: References

Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. "SURF: Speeded Up Robust Features." Computer Vision–ECCV 2006 (2006): 404-417.

BoofCV: http://boofcv.org/index.php?title=Main_Page

Canny, John. "A Computational Approach to Edge Detection." IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (1986): 679-698.

Hall, Mark, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. "The WEKA Data Mining Software: An Update." SIGKDD Explorations 11.1 (2009).

Chen, W.C., Y. Xiong, J. Gao, N. Gelfand, and R. Grzeszczuk. "Efficient Extraction of Robust Image Features on Mobile Devices." ISMAR '07.