Consistent Visual Information Processing
Axel Pinz
EMT – Institute of Electrical Measurement and Measurement Signal Processing
TU Graz – Graz University of Technology
[email protected]
http://www.emt.tu-graz.ac.at/~pinz
“Consistency”
• Active vision systems / 4D data streams
• Imprecision
• Ambiguity
• Contradiction
• Multiple visual information
This Talk: Consistency in
• Active vision systems:
  – Active fusion
  – Active object recognition
• Immersive 3D HCI:
  – Augmented reality
  – Tracking in VR/AR
AR as Testbed
Consistent perception in 4D:
• Space
  – Registration
  – Tracking
• Time
  – Lag-free
  – Prediction
Agenda
• Active fusion
• Consistency
• Applications
  – Active object recognition
  – Tracking in VR/AR
• Conclusions
Active Fusion
Simple top level decision-action-fusion loop:
[Diagram: fusion, control and interaction drive the chain world → scene selection → scene → exposure → image → image processing / segmentation → image description → grouping / 3D modeling → scene description → integration → world description.]
Active Fusion (2)
• Fusion schemes
  – Probabilistic
  – Possibilistic (fuzzy)
  – Evidence theoretic (Dempster & Shafer)
Probabilistic Active Fusion
N measurements, sensor inputs: mi
M hypotheses: oj, O = {o1, …, oM}
Bayes formula:
P(oj | m1, …, mN) = P(m1, …, mN | oj) · P(oj) / P(m1, …, mN)
Use entropy H(O) to measure the quality of P(O)
H(O) = − Σj=1…M P(oj) log P(oj)
Probabilistic Active Fusion (2)
Flat distribution: P(oj) = const. → Hmax
Pronounced distribution: P(oc) = 1; P(oj) = 0 for j ≠ c → H = 0
• Measurements can be:
  – difficult,
  – expensive,
• N can be prohibitively large, …
⇒ Find an iterative strategy to minimize H(O)
Probabilistic Active Fusion (3)
Start with A ≥ 1 measurements: P(oj | m1, …, mA), HA
Iteratively take more measurements: mA+1, …, mB
Until: P(oj | m1, …, mB), HB ≤ Threshold
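This iterative scheme is easy to make concrete. Below is a minimal sketch in Python, assuming a hypothetical take_measurement() callback that returns the likelihoods P(m | oj) for one new measurement; the update is Bayes' formula and the stopping test is the entropy threshold from this slide.

```python
import numpy as np

def entropy(p):
    """H(O) = -sum_j P(o_j) log P(o_j), skipping zero-probability terms."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def active_fusion(prior, take_measurement, threshold, max_steps=100):
    """Fuse measurements until the hypothesis entropy drops below
    `threshold` (or `max_steps` is exhausted).

    prior            -- P(o_j), shape (M,)
    take_measurement -- callable returning likelihoods P(m | o_j), shape (M,)
    """
    p = prior.copy()
    for _ in range(max_steps):
        if entropy(p) <= threshold:
            break
        likelihood = take_measurement()   # P(m_i | o_j) for the new m_i
        p = p * likelihood                # Bayes: posterior ∝ likelihood · prior
        p /= p.sum()                      # normalize over the M hypotheses
    return p
```

With a flat prior P(oj) = 1/M the loop starts at Hmax = log M and stops as soon as the distribution becomes pronounced enough.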
Summary: Active Fusion
• Multiple (visual) information, many sensors, measurements,…
• Selection of information sources
• Maximize information content / quality
• Optimize effort (number / cost of measurements, …)
Information gain by entropy reduction
Summary: Active Fusion (2)
• Active systems (robots, mobile cameras)
  – Sensor planning
  – Control
  – Interaction with the scene
• “Passive” systems (video, wearable, …)
  – Filtering
  – Selection of sensors / measurements
Consistent Subsets
Hypotheses O = {o1, …, oM}
Ambiguity: P(O) is multimodal
→ Consistent unimodal subsets Ok ⊂ O
Benefits:
• Application domains
• Support of hypotheses
• Outlier rejection
Distance Measures
Depend on representations, e.g.:
• Pixel-level: SSD, correlation, rank
• Eigenspace: Euclidean
• 3D models: Euclidean
• Feature-based: Mahalanobis, …
• Symbolic: Mutual information
• Graphs: Subgraph isomorphism
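For illustration, two of the listed measures as short Python/NumPy functions; the covariance matrix `cov` used by the Mahalanobis distance is an assumed input, e.g. estimated from training features.

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Pixel-level sum of squared differences between two patches."""
    d = patch_a.astype(float) - patch_b.astype(float)
    return np.sum(d * d)

def mahalanobis(x, y, cov):
    """Feature-based Mahalanobis distance with covariance `cov`."""
    d = x - y
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))
```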
Mutual Information
Shannon's measure of mutual information:
O = {o1, …, oM}, A ⊂ O, B ⊂ O
I(A,B) = H(A) + H(B) – H(A,B)
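For discrete distributions this formula is a direct transcription. A minimal sketch, assuming the joint probabilities of A and B are given as a table `p_ab`:

```python
import numpy as np

def H(p):
    """Shannon entropy of a (possibly multi-dimensional) distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(p_ab):
    """I(A,B) = H(A) + H(B) - H(A,B) from the joint table P(A,B)."""
    p_a = p_ab.sum(axis=1)   # marginal P(A)
    p_b = p_ab.sum(axis=0)   # marginal P(B)
    return H(p_a) + H(p_b) - H(p_ab)
```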
Applications
• Active object recognition
  – Videos
  – Details
• Tracking in VR / AR
  – Landmark definition / acquisition
  – Real-time tracking
Active Object Recognition in Parametric Eigenspace
• Classifier for a single view (sketched after this list)
• Pose estimation per view
• Fusion formalism
• View planning formalism
• Estimation of object appearance at unexplored viewing positions
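As a sketch of the single-view classifier: in parametric eigenspace, recognition reduces to projecting a normalized view into a PCA basis and finding the nearest training view. The training matrix `views` and the (object, pose) `labels` are assumed inputs; this covers only the per-view step, not the fusion or view-planning formalisms.

```python
import numpy as np

def build_eigenspace(views, k):
    """PCA: `views` is (n_views, n_pixels); keep the top-k eigenvectors."""
    mean = views.mean(axis=0)
    u, s, vt = np.linalg.svd(views - mean, full_matrices=False)
    basis = vt[:k]                       # (k, n_pixels) eigenspace basis
    coords = (views - mean) @ basis.T    # training points on the manifold
    return mean, basis, coords

def classify_view(image, mean, basis, coords, labels):
    """Project a new view and return the (object, pose) label of the
    nearest training point in eigenspace (Euclidean distance)."""
    q = (image.ravel() - mean) @ basis.T
    j = np.argmin(np.linalg.norm(coords - q, axis=1))
    return labels[j]
```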
Applications
• Active object recognition
  – Videos
  – Details
→ Control of active vision systems
• Tracking in VR / AR
  – Landmark definition / acquisition
  – Real-time tracking
→ Selection, combination, evaluation; constraining of huge spaces
Automatic Landmark Acquisition
• Capture a dataset of the scene:
  – calibrated stereo rig
  – trajectory (by magnetic tracking)
  – n stereo pairs
• Process this dataset
  → visually salient landmarks for tracking
Automatic Landmark Acquisition
Visually salient landmarks for tracking:
• salient points in 2D image
• 3D reconstruction
• clusters in 3D:
  – compact, many points
  – consistent feature descriptions
• cluster centers → landmarks (see the sketch below)
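A sketch of the final step, assuming reconstructed 3D points `pts` (one per salient 2D feature): greedy radius clustering keeps compact clusters with many points and returns their centers as landmarks. `radius` and `min_points` are illustrative parameters; the consistency check on feature descriptions is omitted here.

```python
import numpy as np

def landmarks_from_points(pts, radius=0.05, min_points=10):
    """Greedy 3D clustering: grow a cluster around each unassigned point;
    keep the centers of sufficiently populated clusters as landmarks."""
    unassigned = np.ones(len(pts), dtype=bool)
    centers = []
    while unassigned.any():
        seed = pts[unassigned][0]
        member = unassigned & (np.linalg.norm(pts - seed, axis=1) < radius)
        if member.sum() >= min_points:
            centers.append(pts[member].mean(axis=0))  # cluster center -> landmark
        unassigned &= ~member
    return np.array(centers)
```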
Real-Time Tracking
• Measure position and orientation of object(s)
• Obtain trajectories of object(s)
• Stationary observer – “outside-in”
  – Vision-based
• Moving observer, egomotion – “inside-out”
  – Hybrid
• Degrees of Freedom – DoF
  – 3 DoF (mobile robot)
  – 6 DoF (head and device tracking in AR)
Outside-in Tracking (1)
• stereo rig, IR illumination
• wireless devices
• 1 marker/device: 3 DoF
• 2 markers: 5 DoF
• 3 markers: 6 DoF
[Diagram: outside-in tracking pipeline. For the left and right image: blob detection (tile quantisation, prediction) → 2D blob tracking → epipolar geometry constraints → 3D correspondence → 3D objects and pose, supported by workspace and object models; 3D prediction and 2D backprojection feed back to the per-image stages.]
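The “3 markers: 6 DoF” case amounts to the absolute-orientation problem: align the reconstructed 3D marker positions with the device's marker model. A minimal sketch using the standard SVD (Kabsch) solution; `model` and `measured` are assumed to be corresponding (3, 3) point arrays from the 3D-correspondence stage.

```python
import numpy as np

def pose_from_markers(model, measured):
    """Rigid 6 DoF pose (R, t) with measured ≈ R @ model + t,
    via the SVD (Kabsch) solution to the absolute-orientation problem."""
    cm, cs = model.mean(axis=0), measured.mean(axis=0)
    H = (model - cm).T @ (measured - cs)     # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                       # enforce a proper rotation (det = +1)
    t = cs - R @ cm
    return R, t
```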
Outside-in Tracking (2)
Consistent Tracking (1)
• Complexity
  – Many targets
  – Exhaustive search vs. real-time
• Occlusion
  – Redundancy (targets | cameras)
• Ambiguity in 3D
  – Constraints
Consistent Tracking (2)
• Dynamic interpretation tree
  – Geometric / spatial consistency
• Local constraints
  – Multiple interpretations can happen
  – Global consistency is impossible
• Temporal consistency
  – Filtering, prediction (see the sketch below)
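A constant-velocity Kalman filter is the simplest instance of the “filtering, prediction” step. The sketch below tracks one coordinate per filter (position and velocity); it is an illustrative choice, not necessarily the filter used in the original system.

```python
import numpy as np

class ConstantVelocityFilter:
    """Linear Kalman filter with a constant-velocity motion model.
    State x = (position, velocity); predicts where a target will appear
    in the next frame and smooths noisy measurements."""

    def __init__(self, dt, q=1e-3, r=1e-2):
        self.F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
        self.H = np.array([[1.0, 0.0]])              # we measure position only
        self.Q = q * np.eye(2)                       # process noise
        self.R = np.array([[r]])                     # measurement noise
        self.x = np.zeros(2)
        self.P = np.eye(2)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[0]                             # predicted position

    def update(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)     # Kalman gain
        self.x = self.x + (K @ (np.array([z]) - self.H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
```

predict() is called once per frame to gate the 2D search for each target; update(z) then fuses the actual measurement.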
Hybrid Inside-Out Tracking (1)
Inertial tracker:
• 3 accelerometers
• 3 gyroscopes (integration sketched below)
• signal processing
• interface
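On the inertial side, orientation follows from integrating the three gyroscope rates. A minimal sketch of one update step via the Rodrigues formula; drift compensation, the reason for hybridization with vision, is deliberately omitted.

```python
import numpy as np

def integrate_gyro(R, omega, dt):
    """One orientation update: rotate the current rotation matrix R by the
    angular-velocity vector `omega` (rad/s, body frame) over time step `dt`."""
    angle = np.linalg.norm(omega) * dt
    if angle < 1e-12:
        return R
    k = omega / np.linalg.norm(omega)           # rotation axis
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])            # skew-symmetric cross matrix
    dR = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)  # Rodrigues
    return R @ dR
```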
Summary: Consistency in
• Active vision systems:
  – Active fusion
  – Active object recognition
• Immersive 3D HCI:
  – Augmented reality
  – Tracking in VR/AR
Conclusion
Consistent processing of visual information can significantly improve the performance of active and real-time vision systems.