Toward Learning Mixture-of-Parts Pictorial Structures
Robin Hess and Alan Fern
School of Electrical Engineering and Computer Science
Oregon State University
Alan Fern Oregon State University
Talk Objectives
Overview:
• OSU Digital Scout Project
• Describe the problem of initial formation labeling: representational and inference challenges
• Mixture-of-Parts Pictorial Structures: model definition, inference
• Opportunities for learning: parameters and structure, speedup learning, active learning, transfer learning
The OSU Digital Scout Project
Objective: compute semantic interpretations of football video
[Figure: raw video and high-level interpretation of play]
• Professional/college teams spend many hours attaching semantic tags to video for DB access
• We want to make this process much more automatic
• Support computer-assisted strategic analysis of opponents
Previous Work: S. Intille. Visual Recognition of Multi-Agent Action. PhD Thesis, MIT, 1999.
Raw Video Data
• Obtained several games' worth of home-field video from the OSU football team
• One video file per play
• Exact same video used by coaches
• Video shot from a single fixed location at the top of Reser Stadium
• Camera is constantly panning and zooming
Registered Video Data
• Semantic interpretation requires registration of video data to football field coordinates
• Developed robust registration approach [Hess & Fern, CVPR’07]
[Figure: planar homography]
Problem: Formation Labelling
We consider a subproblem of full play interpretation
• Given: initial registered video frame of a play
• Output: offensive formation (types and locations of 11 offensive players)
• Thousands of possible formations
[Figure: player locations & types]
Challenges in Formation Labelling
• Player appearances nearly identical
    Appearance not useful for inferring player type
• Difficult to robustly segment individual players
    “Part detector” style approaches are difficult to apply
Challenges in Formation Labelling
• Different formations can differ in subtle ways
Problem Constraints
A number of hard constraints are imposed by the rule book
• Exactly 11 players
• Exactly 7 players on the line and 4 players behind the line
• Exactly 1 quarterback and 1 center
• Location of center is at midfield or a “hash line”
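The hard constraints above can be checked mechanically. The following is a minimal sketch only: the split of player types into "line" and "backfield" groups and most of the type names are hypothetical (the talk does not list all 34 types), and the constraint on the center's location is left out because it depends on field coordinates.

```python
from collections import Counter

# Hypothetical grouping of player types; illustrative names, not the talk's list.
LINE_TYPES = {"C", "LG", "RG", "LT", "RT", "LTE", "RTE", "LWR", "RWR"}
BACKFIELD_TYPES = {"QB", "FB", "HB", "LFL", "RFL", "SB"}


def is_legal_part_set(types):
    """Return True if a list of player types satisfies the rule-book counts."""
    counts = Counter(types)
    on_line = sum(1 for t in types if t in LINE_TYPES)
    behind = sum(1 for t in types if t in BACKFIELD_TYPES)
    return (
        len(types) == 11          # exactly 11 players
        and on_line == 7          # exactly 7 on the line
        and behind == 4           # exactly 4 behind the line
        and counts["QB"] == 1     # exactly 1 quarterback
        and counts["C"] == 1      # exactly 1 center
    )


formation = ["C", "LG", "RG", "LT", "RT", "LTE", "LWR", "QB", "FB", "HB", "RFL"]
print(is_legal_part_set(formation))
```

A check like this only filters candidate part sets; it says nothing about where the players are, which is what the soft spatial constraints address.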
Problem Constraints
• Soft constraints on relative spatial locations of players
• Constraints strongly depend on the set of player types
Previous Attempt
• Intille used a KB of hard constraints to cast the task as a SAT-like problem
    Constraints: “near”, “to the left of”, “bit of vertical space between”, etc.
• Simplified the problem by hand-labelling the field locations of the 11 players
    Only tried to infer player types
• The approach could not be made to work well and was abandoned in that work
S. Intille. Visual Recognition of Multi-Agent Action. PhD Thesis, MIT, 1999.
Structured Output Representations
• Infer type & location for all of 11 players (22 output variables)
    t_i ∈ {QBS, QB, C, LG, RG, LTE, ...}, 34 types
    l_i ∈ {(0,0), (0,1), ..., (n,m)}, pixel location
• Our representation must capture
    Hard joint constraints among types
    Soft joint constraints among locations, conditioned on types and image data
• Possible to encode constraints via standard discrete factor-graph models (e.g. CRFs, weighted CSPs, ILP, etc.)
• Such encodings appear problematic with respect to off-the-shelf inference techniques (?)
    Domains of variables are huge (many values)
    Large factors (e.g. exactly 7 “line type” players)
    Location constraints are inherently numeric
Pictorial Structures
• Offensive formations can be viewed as multi-part articulated objects (parts correspond to players)
• Pictorial structure models have been successful for multi-part objects in computer vision
    Local part appearance models
    Deformable connections
    Joint estimation of part locations
• Simply pairwise graphical models; node values are part locations
[Figure courtesy Fischler & Elschlager]
• When the edge structure forms a tree, can use DP to compute the MAP in O(nh^2) time
    n = # of parts, h = # of pixels
    h^2 is often impractical
• If in addition d_ij(. , .) is a Mahalanobis distance, then can do the computation in O(nh) time!
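The tree-structured DP mentioned above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes the usual pictorial-structure energy (a sum of per-part appearance costs plus per-edge deformation costs), and the names `map_tree`, `unary`, and `pair` are hypothetical. It is the plain O(nh^2) version; the O(nh) speedup for Mahalanobis deformation costs relies on a distance transform and is not shown.

```python
def map_tree(children, root, unary, pair, locations):
    """Minimum-cost placement of tree-structured parts via dynamic programming.

    children:  dict mapping a part to the list of its child parts
    unary:     unary(part, loc) -> appearance cost of placing part at loc
    pair:      pair(parent, child, parent_loc, child_loc) -> deformation cost
    locations: list of candidate locations (h of them)
    Returns (best_cost, {part: location}).
    """
    best_child_loc = {}  # (child, parent_loc) -> best child location

    def message(part, parent):
        # cost[l]: best cost of the subtree rooted at `part` when part sits at l
        cost = {l: unary(part, l) for l in locations}
        for child in children.get(part, []):
            msg = message(child, part)
            for l in locations:
                cost[l] += msg[l]
        if parent is None:
            return cost
        out = {}
        for pl in locations:  # h parent locations, each scanning h child locations
            best = min(locations, key=lambda l: cost[l] + pair(parent, part, pl, l))
            best_child_loc[(part, pl)] = best
            out[pl] = cost[best] + pair(parent, part, pl, best)
        return out

    root_cost = message(root, None)
    root_loc = min(locations, key=lambda l: root_cost[l])

    # Backtrack from the root to recover every part's location.
    assignment = {root: root_loc}
    stack = [root]
    while stack:
        part = stack.pop()
        for child in children.get(part, []):
            assignment[child] = best_child_loc[(child, assignment[part])]
            stack.append(child)
    return root_cost[root_loc], assignment
```

With h equal to the number of pixels in a frame, the inner scan over child locations is what makes the h^2 term impractical.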
Pictorial Structures for Football
• For a fixed set of player types, locations can be well approximated by a pictorial structure
• But part sets (i.e. player types) vary across plays
    Can’t use standard pictorial structures for our problem
• Can we still leverage the benefits of pictorial structures?
Mixture of Parts Pictorial Structures (MoPPS)
Captures constraints on legal part sets via pv
Captures spatial constraints among parts via f
MoPPS Inference
Find MAP estimate of most likely set of parts and their locations:
• Worst case: evaluate pictorial structure of each legal part set
    Requires over an hour of processing for our problem
• Need a structured MoPPS representation that can be exploited for fast inference
    We use a “MoPPS Tree”
MoPPS Tree Representation
Pictorial structure for a legal part set is projection of global tree onto part set
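One plausible reading of "projection" is sketched below. The slide does not define the operation, so the rule used here is an assumption: each part kept in the part set is attached to its nearest ancestor in the global tree that is also in the part set. The function name `project_tree` and the example tree are hypothetical.

```python
def project_tree(parent, part_set):
    """Project a global tree onto a subset of its parts.

    parent:   dict mapping each part to its parent in the global tree
              (the root maps to None)
    part_set: set of parts to keep
    Returns a dict mapping each kept part to its parent in the projected tree.
    ASSUMPTION: a kept part attaches to its nearest kept ancestor.
    """
    projected = {}
    for part in part_set:
        ancestor = parent[part]
        # Skip over ancestors that are absent from this part set.
        while ancestor is not None and ancestor not in part_set:
            ancestor = parent[ancestor]
        projected[part] = ancestor
    return projected


global_tree = {"C": None, "QB": "C", "FB": "QB", "HB": "FB", "LG": "C"}
print(project_tree(global_tree, {"C", "HB", "LG"}))
```

Under this reading the result is a single tree whenever the root is in the part set, which fits the football setting if the center is the root, since every legal formation has exactly one center.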
MoPPS Tree for Football
34 parts in model (one for each possible player type)
Includes local observation models
Includes pairwise spatial constraints
Also provides constraints for evaluating legal part sets
MoPPS Tree Inference
Becomes combinatorial optimization over legal part sets
We use Branch-and-Bound Search (BBS)
Branch-and-Bound Search
• Search nodes are part sets
    Internal nodes represent sets of legal part sets
    Leaves are legal part sets
• While solution not found
    Expand least node according to ordering relation
    Compute upper and lower bounds
    Prune any dominated node
Lower Bound Computations
• Monotonicity: adding to a set of parts will never result in reduced cost
• Simply compute pictorial structure match of tree projected on parts in search node
• Can improve on this by adding cost for “missing parts”
Upper Bound Computations
• Match entire MoPPS tree to image data
• Use as a heuristic for quickly finding a legal completion of the current part set
• Cost of completion is an upper bound
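The interplay of lower bounds, upper bounds, and pruning can be sketched with a generic best-first branch-and-bound loop. This is a toy illustration rather than the MoPPS search itself: the cost of a part set is simply a sum of hypothetical per-part costs instead of a pictorial structure match, "legal" means "exactly k parts", and the names `branch_and_bound` and `toy_problem` are made up for the example.

```python
import heapq
from itertools import count


def branch_and_bound(root, expand, lower, upper):
    """Best-first branch-and-bound minimization.

    expand(node) -> child nodes (empty for leaves)
    lower(node)  -> value <= cost of every legal leaf below node
    upper(node)  -> (cost, leaf) for some legal leaf below node, or None
    """
    best_cost, best_leaf = float("inf"), None
    tie = count()  # tie-breaker so the heap never compares nodes directly
    frontier = [(lower(root), next(tie), root)]
    while frontier:
        lb, _, node = heapq.heappop(frontier)
        if lb >= best_cost:
            break  # every remaining node is dominated by the incumbent
        completion = upper(node)
        if completion is not None and completion[0] < best_cost:
            best_cost, best_leaf = completion
        for child in expand(node):
            child_lb = lower(child)
            if child_lb < best_cost:  # prune dominated children
                heapq.heappush(frontier, (child_lb, next(tie), child))
    return best_cost, best_leaf


def toy_problem(costs, k):
    """Choose exactly k parts with minimum total (non-negative) cost."""
    names = list(costs)
    n = len(names)

    def total(chosen):
        return sum(costs[p] for p in chosen)

    def expand(node):
        chosen, i = node
        if len(chosen) == k or i == n:
            return []
        kids = [(chosen + (names[i],), i + 1)]      # include part i
        if n - (i + 1) >= k - len(chosen):          # exclude only if still completable
            kids.append((chosen, i + 1))
        return kids

    def lower(node):
        return total(node[0])  # monotone: adding parts never lowers the cost

    def upper(node):
        chosen, i = node
        rest = names[i:i + k - len(chosen)]         # quick (not optimal) completion
        if len(chosen) + len(rest) < k:
            return None
        done = chosen + tuple(rest)
        return total(done), done

    return ((), 0), expand, lower, upper


root, expand, lower, upper = toy_problem({"A": 4, "B": 1, "C": 3, "D": 2}, k=2)
print(branch_and_bound(root, expand, lower, upper))
```

The lower bound here mirrors the monotonicity argument (a partial part set never costs more than any of its completions), and the upper bound mirrors the idea of scoring a quickly found legal completion. Tighter bounds prune more of the frontier.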
MoPPS Tree Parameters for Football
• 34 parts, 3200+ legal formations
    16 basic player types plus subtypes
• Connections modeled as Gaussian over ideal location relative to “parent” player
    Parameters manually set using training images
• Observation model uses two independent components
    One based on background model
    One based on color histogramming
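A Gaussian connection of this kind turns into a quadratic (Mahalanobis-style) deformation cost once the negative log is taken. The sketch below assumes an axis-aligned Gaussian with per-axis standard deviations; the function name `connection_cost` and all numeric values are hypothetical, not the hand-set parameters from the talk.

```python
import math


def connection_cost(child_loc, parent_loc, ideal_offset, sigma):
    """Negative log density of an axis-aligned Gaussian over the child's
    offset from its parent, centred on the ideal offset.

    All arguments are (x, y) tuples; sigma holds per-axis standard deviations.
    """
    cost = 0.0
    for c, p, mu, s in zip(child_loc, parent_loc, ideal_offset, sigma):
        z = (c - p - mu) / s                      # standardized deviation
        cost += 0.5 * z * z + math.log(s * math.sqrt(2 * math.pi))
    return cost


# Child exactly at its ideal offset from the parent, then displaced by 2 sigma in x.
print(connection_cost((10, 5), (0, 0), (10, 5), (1.0, 1.0)))
print(connection_cost((12, 5), (0, 0), (10, 5), (1.0, 1.0)))
```

The quadratic term is what makes the cost a Mahalanobis distance, which is the condition under which the O(nh) matching applies.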
Background Model
• Register lots of video to field model
• Learn kernel density estimate of color at each pixel
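A per-pixel kernel density estimate can be sketched as below. To stay short, the example makes several simplifying assumptions: it uses a single intensity channel instead of color, a Gaussian kernel with a fixed hypothetical bandwidth, and frames given as small nested lists already registered to the field model. The names `kde_density` and `background_model` are illustrative.

```python
import math


def kde_density(samples, x, bandwidth):
    """Gaussian kernel density estimate at value x from 1-D samples."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


def background_model(frames, bandwidth=10.0):
    """Build a per-pixel background likelihood from registered frames.

    frames: list of equally sized 2-D grids (lists of rows) of intensities.
    Returns likelihood(row, col, value): the estimated density of observing
    `value` at that field position under the background model.
    """
    def likelihood(row, col, value):
        samples = [frame[row][col] for frame in frames]
        return kde_density(samples, value, bandwidth)

    return likelihood


# Three tiny registered "frames" of a mostly green field (hypothetical values).
frames = [[[100, 101]], [[102, 99]], [[98, 100]]]
model = background_model(frames)
print(model(0, 0, 100) > model(0, 0, 200))
```

Pixels whose observed value has low density under this model are unlikely to be background, which is the kind of evidence an observation model for players can build on.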
Results
Anytime Behavior: % Correct
• Exhaustive search requires close to an hour
• Greedy search is fast but achieves only 80% accuracy
• Mean-squared location error less than a yard
Directions: Learning MoPPS Models
• Successfully hand-coded a MoPPS model
    Was quite time-consuming to get parameters right
    Motivates supervised structure and parameter learning
• MoPPS model takes an average of 4 minutes per play
    Still too slow for weekly volume of game video
    Motivates speedup learning
• MoPPS model will sometimes need to be relearned/adapted to different sets of video
    Want to reduce labelling effort
    Motivates active and transfer learning
Structure and Parameter Learning
• Goal: learn structure and parameters of MoPPS tree from labelled data
    Assume hard constraints on legal part sets provided
• There are algorithms for learning the structure of pictorial structures
    Can easily modify to learn MoPPS tree
    Easy to combine with generative parameter learning
Structure and Parameter Learning
• Issue: pure generative parameter learning will likely not be sufficient
    Hand-coded model incorporates “reward terms” to make up for deficiencies in the generative observation model
    Suggests augmenting the generative model with discriminatively trained components
• Issue: inference time of 4 minutes makes most generative training methods quite expensive
    Suggests using approaches that do not perform full joint inference for each parameter update
Speedup Learning
How can we speed up branch-and-bound search? There are a number of interesting settings
• Setting 1: Given a MoPPS model & upper/lower bound functions
    Learn effective search space operators
• Setting 2: Given a MoPPS model & search space
    Learn more accurate upper/lower bound functions
• Setting 3: Given a MoPPS model & search space & possibly bounds
    Learn an effective priority queue ranking function
Active Model Calibration
• Want to minimize labelling effort for new video set
    Active learning and/or semi-supervised
• Want to leverage experience with previous videos
    Transfer learning
• How can we combine these two paradigms for label-efficient active model calibration?
    User interface is also critical
• Very rough idea:
    Assume fixed model structure
    Learn prior on parameters from previous data sets
    Use prior for regularization and example selection
Summary and Future Work
• New structured output challenge problem
    We will provide labelled data set
    Can off-the-shelf structured learning approaches work?
• Suggests investigating lesser-studied directions
    Speedup learning
    Active calibration
• On the horizon
    Applying to defensive formations
    Full temporal play interpretation
    Mining strategic knowledge
    Strategic planning
The Digital Scout Project
http://eecs.oregonstate.edu/football