Hand Pose Estimation RnD Project - CSE, IIT Bombaypratikm/projectPages/deepLearningForPo… ·...
Hand Pose Estimation
Author: Pratik Kalshet
Supervisor: Parag Chaudhuri
Department of Computer Science and Engineering
Indian Institute of Technology Bombay
RnD Project
Outline
Introduction
Problem Statement
Previous Work
Approach
Results
Introduction: Motivation
Applications – Human-computer interaction, Augmented and Virtual Reality, …
Hot research topic – ICCV, CVPR, SIGGRAPH. 2016
Robert Wang. Nimble VR 2014
Introduction: Challenges
Self-occlusion
Self-similarity
Noise
High Degree-of-freedom
Problem Statement: Hand Pose Estimation
Input – Depth Image (of hand)
Output – Joint Locations in 3-D
Aim – Accuracy and Efficiency
Previous Work: Types of Techniques
Model-driven (Generative) [Mak+15(CVPR)]
Synthesize a hand image from a model, then optimize an energy (the discrepancy with the observed image) to get the hand pose
Advantage – accurate, valid poses
Disadvantage – slow, prone to local minima (initialization problem)
Data-driven (Discriminative) [Sun+15(CVPR)]
Direct regression function from the observed image to the hand pose
Advantage – fast (real-time)
Disadvantage – coarse results, can violate hand geometry
Figure panels: Generative Methods, Discriminative Methods
Theobalt. “Real-time Capture of Hands in Motion”. CVPR. 2015
Previous Work: Types of Techniques
Hybrid [Tay+16(SIGGRAPH)]
Initialization using a discriminative method, refinement using a generative method
Advantage – accurate, fast
Disadvantage – separate stages lead to sub-optimal results
Tompson et al. “Real-time continuous pose recovery of human hands using convolutional networks”. TOG. 2014
Previous Work: Issues in Existing Systems
Hand prior
Non-linear regression
Ge et al. “Robust 3D hand pose estimation in single depth images: from single-view CNN to multi-view CNNs”. CVPR. 2016
Approach: Overview
Input (Depth Image) → Deep Network (ConvNet, Kinematic Layer) → Output (3-D Joint Positions)
Zhou et al. “Model-based Deep Hand Pose Estimation”. IJCAI. 2016
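One way to picture the kinematic layer is as forward kinematics: the ConvNet predicts joint angles, and a fixed, differentiable function maps those angles to 3-D joint positions, so bone lengths stay valid by construction. The following is a minimal sketch for a single planar finger, not the project's actual layer. The function name `forward_kinematics`, the constant `BONE_LENGTHS`, and the bone lengths themselves are assumptions made for illustration.

```python
import math

# Hypothetical bone lengths in mm (proximal, middle, distal); not from the slides.
BONE_LENGTHS = [40.0, 25.0, 20.0]


def forward_kinematics(angles, bone_lengths=BONE_LENGTHS, root=(0.0, 0.0, 0.0)):
    """Map joint angles (radians) to 3-D joint positions for one planar finger.

    Every joint flexes about the x-axis, so the chain moves in the y-z plane.
    Returns the root position followed by the end position of each bone.
    """
    x, y, z = root
    positions = [(x, y, z)]
    total_angle = 0.0
    for angle, length in zip(angles, bone_lengths):
        total_angle += angle  # angles accumulate along the chain
        y += length * math.cos(total_angle)
        z += length * math.sin(total_angle)
        positions.append((x, y, z))
    return positions


# Straight first bone, 90-degree bend at the second joint, straight third bone.
joints = forward_kinematics([0.0, math.pi / 2, 0.0])
```

Whatever angles are passed in, the distance between consecutive joints equals the corresponding bone length, which is the kind of geometric validity a direct regression of joint positions does not guarantee.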
Approach: Pre-processing
1. Hand detection
2. Depth normalization
*Both steps are assumed to be already done.
Zhang et al. “Accurate per-pixel hand detection from a single depth image”. Optical Engineering. 2017
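Step 2 could look roughly like the sketch below, which takes an already cropped hand patch and rescales depth to [-1, 1] around the hand's centre depth. This is an assumption about what "depth normalization" means here. The function name `normalize_depth`, the 150 mm half range, and the convention that 0 marks a missing reading are all illustrative choices, not details from the slides.

```python
def normalize_depth(patch, center_depth, half_range=150.0, background=0.0):
    """Rescale a cropped depth patch (2-D list, mm) to the range [-1, 1].

    Depths are centred on `center_depth` and divided by `half_range`
    (an assumed value), then clamped. Pixels equal to `background`
    (assumed to mean "no reading") are pushed to the far plane, 1.0.
    """
    normalized = []
    for row in patch:
        new_row = []
        for depth in row:
            if depth == background:
                new_row.append(1.0)
            else:
                value = (depth - center_depth) / half_range
                new_row.append(max(-1.0, min(1.0, value)))
        normalized.append(new_row)
    return normalized


# Toy 2x2 patch: one missing pixel, two hand pixels, one far pixel.
patch = [[0.0, 600.0], [650.0, 900.0]]
normalized = normalize_depth(patch, center_depth=650.0)
```

Centring on the hand makes the network input independent of how far the hand is from the camera.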
Approach: Deep Network
Loss:
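The formula on this slide was an image and is not reproduced in the transcript. As a minimal sketch, assuming the loss resembles the one in the cited Zhou et al. paper, it can be written as a squared joint-position error plus a penalty on joint angles that leave their valid range. The function name `pose_loss`, the weight `lam`, and the angle limits are placeholders, not values from the project.

```python
def pose_loss(pred_joints, true_joints, angles, lower, upper, lam=1.0):
    """Joint-position loss plus a joint-angle limit penalty (assumed form).

    pred_joints, true_joints: lists of (x, y, z) tuples.
    angles, lower, upper: predicted joint angles and their assumed limits.
    lam: assumed weight of the limit penalty.
    """
    # Half the sum of squared coordinate differences over all joints.
    joint_term = 0.5 * sum(
        (p - t) ** 2
        for pred, true in zip(pred_joints, true_joints)
        for p, t in zip(pred, true)
    )
    # Zero inside [lower, upper], growing linearly outside it.
    limit_term = sum(
        max(lo - a, 0.0) + max(a - hi, 0.0)
        for a, lo, hi in zip(angles, lower, upper)
    )
    return joint_term + lam * limit_term


loss = pose_loss(
    pred_joints=[(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)],
    true_joints=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    angles=[0.5, 2.0],
    lower=[0.0, 0.0],
    upper=[1.5, 1.5],
)
```

In this toy call the joint term is 0.5 * (4 + 4) = 4.0 and the second angle exceeds its limit by 0.5, so the total is 4.5.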
Results: Data
NYU Hand Pose Dataset
Training samples: 10000
Test samples: 1200
Joints: 31
DoF: 26
Input – Depth Image
Label – Joint Positions in 3-D
Tompson et al. “Real-time continuous pose recovery of human hands using convolutional networks”. TOG. 2014
Results: Qualitative Results
Figure columns: Input, Prediction, Ground Truth
Results: Comparative Study
Figure: Input, Prediction, and Ground Truth, shown Without Kinematic Layer and With Kinematic Layer
Results: Comparison with state-of-the-art
Technique            Error
No prior             6395.45
Existing best prior  4699.16
Kinematic prior      3079.38
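The slide gives no unit for the error, but the relative improvement does not depend on it. A short sketch of that arithmetic, using the three values from the table (the helper name `relative_reduction` is illustrative):

```python
# Error values copied from the table above; the unit is not stated on the slide.
errors = {
    "No prior": 6395.45,
    "Existing best prior": 4699.16,
    "Kinematic prior": 3079.38,
}


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


vs_no_prior = relative_reduction(errors["No prior"], errors["Kinematic prior"])
vs_best_prior = relative_reduction(
    errors["Existing best prior"], errors["Kinematic prior"]
)

print(f"{vs_no_prior:.1%} lower than no prior")
print(f"{vs_best_prior:.1%} lower than the existing best prior")
```

On these numbers the kinematic prior's error is roughly half that of the no-prior baseline and about a third lower than the existing best prior.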
Conclusion: Summary
Hand kinematic prior in a deep network
Achieved competitive results
Future Work
Multi-view CNN
Temporal data for tracking
Physics-based constraint layer