Deeply Learned Compositional Models for Human Pose...
Transcript of Deeply Learned Compositional Models for Human Pose...
![Page 1: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/1.jpg)
Deeply Learned Compositional Models for Human Pose Estimation
Presenter: Shen SangCSE 291D
Wei Tang, Pei Yu and Ying Wu Northwestern University
![Page 2: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/2.jpg)
CONTENTS
• Background• Related works• Technical details
• Compositional model
• Spatially local information summarization
• Bone-based part representation
• Network initialization
• Experiments• Conclusion
![Page 3: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/3.jpg)
BACKGROUND
![Page 4: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/4.jpg)
HUMAN POSE ESTIMATION
Task: • Determine the precise pixel location of important
keypoints of the body • Estimate the configuration of the human body• Achieve an understanding of a person’s posture
and limb articulation
Useful for higher level tasks: • human-computer/robot interaction • video surveillance• gaming• sport performance analysis• scene understanding Sarafianos, N., Boteanu, B., Ionescu, B., Kakadiaris, I.A.: 3d human
pose estimation: A review of the literature and analysis of covariates
![Page 5: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/5.jpg)
DIFFICULTIES
overlapping parts nearby persons clutter backgrounds
occlusion, severe deformation, changes in appearance due to factors like clothing and lighting ……
![Page 6: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/6.jpg)
RELATED WORKS
![Page 7: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/7.jpg)
RELATED WORKS
Explore “bidirectional” architectures that also reason with top-down feedback, in which way neural units are influenced by both lower and higher-level units.
Hu, P., Ramanan, D.: Bottom-up and top-down reasoning with hierarchical rectified gaussians
![Page 8: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/8.jpg)
RELATED WORKS
During inference, a node sends a message to each of its neighbors and receives messages from each neighbor. The framework can be viewed as two components: • a front-end deep convolutional neural networks for learning feature representations of body parts• message passing layers for conducting inference and learning on mixture of parts with deformation constraints
between parts.
Yang, W., Ouyang, W., Li, H., Wang, X.: End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation
![Page 9: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/9.jpg)
RELATED WORKS
• Repeatedly produce 2D belief maps for the location of each part
• Learn rich image-dependent spatial models of the relationships between parts.
• Provide a natural learning objective function that enforces intermediate supervision
Wei, S.E., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines
![Page 10: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/10.jpg)
RELATED WORKS
A detection-followed-by-regression CNN cascaded architecture specifically designed for learning part relationships and spatial context.
• The first part of the cascade outputs part detection heatmaps.• The second part performs regression on these heatmaps.
Bulat, A., Tzimiropoulos, G.: Human pose estimation via convolutional part heatmap regression
![Page 11: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/11.jpg)
RELATED WORKS: CONCLUSION
Keypoints:
• Explore the structure of the body and relastionships between parts • Hierarchical/sequential structure to refine the result of former estimation
![Page 12: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/12.jpg)
TECHNICAL DETAILS
![Page 13: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/13.jpg)
EXISTING PROBLEMS
• Assuming a Gaussian distribution on the subpart displacement with the subpart’s anchor position being its mean.
• Using a set of discrete type variables (e.g orientation, scale of a part, semantic classes) to model the compatibility among parts.
• Existing loops in compositional structure.
• It is incapable to characterize the complex compositional relationships among body parts.• State spaces for higher-level parts can be exponentially large, making both computation and
storage demanding.• Approximate inference algorithms must be used, learning and testing will be affected.
![Page 14: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/14.jpg)
MOTIVATION
• Explicitly learn the hierarchical compositionality of visual patterns via deep neural networks.
• Encode the orientation, scale and shape of each part compactly and avoids their potentially large state spaces.
• Use hierarchical compositional structure and bottom-up/top-down inference stages across multiple semantic levels.
![Page 15: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/15.jpg)
COMPOSITIONAL MODEL: NOTATIONS
• Compisitional model:
• Graph structure:
• Potential function:
• State variable:
• All state variables:
(ν, ϵ, ϕand, ϕleaf )(ν, ϵ)
(ϕand, ϕleaf )wu = {pu, tu}, ∀u ∈ ν
Ω
![Page 16: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/16.jpg)
COMPOSITIONAL MODEL: ARCHITECTURE
p(Ω |I) =1Z
exp(−E(Ω, I))
Probability distribution:
E: energy function of state variables given an imageZ: partition function (normalizing constant)
![Page 17: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/17.jpg)
COMPOSITIONAL MODEL: SCORE FUNCTION
• The first term acts like a detector: it determines how likely the primitive modeled by leaf-node u is present at location pu and of type tu.
• The second term models the state compatibility between a subpart v and its parent u.
![Page 18: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/18.jpg)
COMPOSITIONAL MODEL: TWO STAGES
• Bottom-up:
• Top-down:
![Page 19: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/19.jpg)
SLIS: RECURSIVE EQUATIONS
• Bottom-up:
• Top-down:
• Recursive equations:
equal to 1 if and 0 otherwise wu = w*u
![Page 20: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/20.jpg)
SLIS: INFORMATION SUMMARIZATION
• Pooling: combine features in a way that preserves task-related information while removing inrelevant details, lead to more compact representations and better robustness to noise and clutter.
• In compositional inference, score maps of some parts are combined to get relevant information about the states of other related parts.
pooling
![Page 21: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/21.jpg)
SLIS: LOCAL SPATIAL REGION
Intuition: child and parent parts should not be far from each other
![Page 22: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/22.jpg)
SLIS FUNCTION
1.Information summarization2.Acting in a local region3.Location invarient
![Page 23: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/23.jpg)
CNN: WHY AND HOW?
• CNNs aggregate information on a local spatial support using location-invariant parameters. • CNNs are known for their capability to approximate inference functions.
![Page 24: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/24.jpg)
PART SHARING AND HIGH-ORDER POTENTIALS
PART SHARING
High-order potential function
![Page 25: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/25.jpg)
PART SHARING AND HIGH-ORDER POTENTIALS
PART SHARING
![Page 26: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/26.jpg)
PART REPRESENTATION: COMPLEXITY
High storage and computation demanding!!!
Example:
• Left lower/upper leg: • Left leg: • Left and right leg:
O(N2)O(N4)
O(N )
![Page 27: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/27.jpg)
BONE-BASED PART REPRESENTATION
• have three semantic levels, including 16, 12 and 6 parts respectively. • all the children sharing a common parent are linked to each other.• represent each part with its bones, which are generated by putting
Gaussian kernels along the part segments. • taken as the ground truth of score maps when training networks
![Page 28: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/28.jpg)
BONE-BASED PART REPRESENTATION
Advantages• Score maps are now 2-D matrices instead of 3-D tensors. This reduces space and
computation complexities in score map predictions. • The bones compactly encode orientations, scales and shapes of parts.
![Page 29: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/29.jpg)
DEEPLY LEARNED COMPOSITIONAL MODEL
![Page 30: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/30.jpg)
MODEL INITIALIZATION: HOURGLASS NETWORK
Why?
• The hourglass module extends the fully convolutional network by processing and consolidating features across multiple scales. This enables it to capture the various spatial relationships associated with the input score maps.
• The eight-stack hourglass network, formed by sequentially stacking eight hourglass modules, has achieved state-of-the-art results on several HPE benchmarks.
Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation
![Page 31: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/31.jpg)
HOURGLASS NETWORK: OVERVIEW
Motivation: capture information at every scale
• Stacked hourglass networks
• The person’s orientation, the arrangement of their limbs, and the relationships of adjacent joints are among the many cues that are best recognized at different scales in the image.
• While local evidence is essential for identifying features like faces and hands, a final pose estimate requires a coherent understanding of the full body.
Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation
![Page 32: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/32.jpg)
HOURGLASS NETWORK: ARCHITECTURE
• Using skip layers to preserve spatial information at each resolution.• The topology of the hourglass is symmetric, so for every layer present on the way down there is a
corresponding layer going up.
• Stacking multiple hourglasses end-to-end, feeding the output of one as input into the next.• Repeated bottom-up, top-down processing used in conjunction with intermediate supervision.
• A single “hourglass” module • Residual module
Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation
![Page 33: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/33.jpg)
HOURGLASS NETWORK: INTERMEDIATE SUPERVISION
The network splits and produces a set of heatmaps (outlined in blue) where a loss can be applied.
Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation
![Page 34: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/34.jpg)
HOURGLASS NETWORK: SUMMARY
Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation
![Page 35: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/35.jpg)
EXPERIMENTS
![Page 36: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/36.jpg)
DATASETS AND METRICS
• Frames Labeled In Cinema (FLIC)composed of 5003 images taken from popular Hollywood movies, 3987 for training 1016 for testing
• Leeds Sports Pose (LSP)extented LSP consists of 11k training images and 1k testing images from sports activities
• MPII Human Poseincludes around 25K images with annotated body joints, covers 410 human activities and a great variety of full-body poses
Percentage of Correct Keypoints (PCK): calculate the percentage of detections that fall within a normalized distance of the ground truth
![Page 37: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/37.jpg)
QUANTITATIVE COMPARISON
![Page 38: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/38.jpg)
QUANTITATIVE COMPARISON
![Page 39: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/39.jpg)
QUANTITATIVE COMPARISON
The proposed approach clearly outperforms the eight-stack hourglass networks
![Page 40: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/40.jpg)
COMPLEXITY
GFLOPS: Giga floating point operations per second
• Use only five hourglass modules instead of eight, the model has less parameters and lower computational complexity.
• The prior top-performing method on the benchmarks has 74% more parameters and needs 37% more GFLOPS.
![Page 41: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/41.jpg)
QUALITATIVE RESULT: SCORE MAPS
![Page 42: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/42.jpg)
QUALITATIVE RESULT: COMPARISON
DLCM can resolve the ambiguities that appear in bottom-up pose predictions of an 8-stack hourglass network.
![Page 43: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/43.jpg)
COMPONENT ANALYSIS
Evaluation metric : Mean [email protected] over hard joints, i.e., ankle, knee, hip, wrist and elbow
![Page 44: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/44.jpg)
CONCLUSION
![Page 45: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/45.jpg)
CONCLUSION
• The first attempt to explicitly learn the hierarchical compositionality of visual patterns via deep neural networks.
• Propose a novel bone-based part representation to avoid potentially large state spaces for higher-level parts.
• Outperform state-of-the-art methods on three benchmark datasets.
![Page 46: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/46.jpg)
POSSIBLE EXTENSIONS
• Multi-Person 2D Pose Estimation • 3D Human pose estimation • Human pose/action recognition
![Page 47: Deeply Learned Compositional Models for Human Pose Estimationcseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2019/... · 2019-03-08 · • a front-end deep convolutional neural](https://reader033.fdocuments.in/reader033/viewer/2022050200/5f538ca3377de501903c5464/html5/thumbnails/47.jpg)
THANK YOU