Learning deep representation from coarse to fine for face alignment
Zhiwen Shao, Shouhong Ding, Yiru Zhao, Qinchuan Zhang, and Lizhuang Ma
{shaozhiwen, feiben, yiru.zhao, qinchuan.zhang}@sjtu.edu.cn, [email protected]
Department of Computer Science and Engineering
Shanghai Jiao Tong University, China
Problem
Face alignment is the task of locating facial landmarks
Motivation & Challenges
Face alignment has many applications
• Face animation
• Face beautification
• Face preprocessing
There are some challenges
• Large variations in pose, illumination and expression
• Partial occlusion
• Low quality
We need an effective method to represent highly complex faces
Ours vs. Others
Conventional methods
• Their results depend heavily on the initial shape
• Our network takes raw faces as input without any initialization
Deep learning methods
• They use cascaded networks or multitask learning
• Our method uses a single network and does not require extra facial attributes
Coarse-to-fine Training Algorithm
Detecting dense landmarks is difficult because each face carries a large number of labels
A few key landmarks coarsely determine the face shape
The given landmarks can therefore be split into a principal subset and an elaborate subset
[Figure: principal subset and elaborate subset of landmarks]
Loss function
A weighting coefficient controls the relative weight of the principal subset
Predicting the locations of the principal subset lets the network capture the intrinsic facial structure
We then fine-tune the learned model by adjusting the relative weight of the principal subset
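As a rough illustration of the weighting idea, the Python sketch below computes a squared-error loss in which the principal subset is scaled by a weight. The function name `weighted_landmark_loss`, the squared-error form, the 1/2 factor, and the assumption that the principal subset occupies the first 24 output coordinates are illustrative choices, not the poster's own formula.

```python
def weighted_landmark_loss(pred, target, weight, principal_dim=24):
    """Squared-error loss over landmark coordinates (illustrative sketch).

    pred, target  : flat sequences of n coordinates (x1, y1, x2, y2, ...)
    weight        : relative weight applied to the principal subset
    principal_dim : number of leading coordinates assumed to belong to
                    the principal subset (assumption: 24)
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if len(pred) < principal_dim:
        raise ValueError("fewer coordinates than principal_dim")

    # Per-coordinate squared errors.
    sq_err = [(p - t) ** 2 for p, t in zip(pred, target)]

    # Split the errors into the two subsets and weight the principal part.
    principal = sum(sq_err[:principal_dim])
    elaborate = sum(sq_err[principal_dim:])
    return 0.5 * (weight * principal + elaborate)
```

Under this sketch, a coarse stage would call the function with a large `weight` so that the principal subset dominates, and later fine-tuning stages would call it with a different `weight`; the concrete values are not given here.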
Deep convolutional network
[Figure: network architecture. Input 50×50×3; convolutional layers 3×3/1/1 and max-pooling layers 2×2/2/0; feature maps 50×50×64, 25×25×64, 25×25×128, 13×13×128, 13×13×192, 7×7×192, 7×7×256, 4×4×256; fully-connected unit with 256 outputs; principal unit with 24 outputs and elaborate unit with n-24 outputs]
Algorithm discussions
The input is a 50×50×3 color face patch. n equals twice the total number of landmarks
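The feature-map sizes listed for the network can be checked with a little arithmetic. The sketch below assumes one 3×3/1/1 convolution followed by one 2×2/2/0 max-pooling per stage, channel counts of 64, 128, 192 and 256, and pooling that rounds up (as in Caffe-style pooling); these are readings of the diagram labels rather than confirmed details, and the helper names are hypothetical.

```python
import math


def conv_out(size, kernel=3, stride=1, pad=1):
    """Spatial size after a convolution (3x3, stride 1, pad 1 keeps the size)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2, pad=0):
    """Spatial size after max-pooling, rounding up (assumed ceil mode)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


def trace_shapes(input_size=50, channels=(64, 128, 192, 256)):
    """Return (height, width, channels) after every conv and pooling layer."""
    shapes = []
    size = input_size
    for c in channels:
        size = conv_out(size)       # convolution changes the channel count
        shapes.append((size, size, c))
        size = pool_out(size)       # pooling roughly halves the spatial size
        shapes.append((size, size, c))
    return shapes


def output_dims(num_landmarks, principal_dim=24):
    """n = 2 * landmarks, split into principal and elaborate outputs."""
    n = 2 * num_landmarks
    return n, principal_dim, n - principal_dim


if __name__ == "__main__":
    for shape in trace_shapes():
        print("×".join(str(v) for v in shape))
    print(output_dims(68))
```

With these assumptions the spatial size goes 50, 25, 13, 7, 4, which matches the sizes in the diagram. For a 68-landmark annotation such as the one used in 300-W, n is 136, leaving 112 outputs for the elaborate unit.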
Three face alignment benchmarks
• Helen, 300-W, COFW
Direct training algorithm (DT)
Coarse-to-fine training algorithm (CFT)
Results of RCPR and CFT on several images from COFW
Results of CFT on several images from Helen and IBUG
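Comparisons on these benchmarks are reported as a mean error in percent. A minimal sketch of such a metric follows, assuming the common convention of averaging the point-to-point Euclidean error over landmarks and normalizing by the inter-ocular distance; the normalization and the function name `mean_error_percent` are assumptions, since the poster does not spell out the definition.

```python
import math


def mean_error_percent(pred, truth, left_eye, right_eye):
    """Mean landmark error as a percentage of the inter-ocular distance.

    pred, truth         : lists of (x, y) landmark positions
    left_eye, right_eye : (x, y) reference points used for normalization
                          (assumed convention, not stated in the source)
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equal in length")

    norm = math.dist(left_eye, right_eye)
    if norm == 0:
        raise ValueError("inter-ocular distance must be non-zero")

    # Average Euclidean distance between predicted and true landmarks.
    total = sum(math.dist(p, t) for p, t in zip(pred, truth))
    return 100.0 * total / (len(pred) * norm)
```

For example, two landmarks with errors of 5 and 0 pixels on a face whose eyes are 50 pixels apart give a mean error of 5%.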
Comparison of mean errors (%) with other methods
Conclusion
We propose a novel coarse-to-fine algorithm for training a deep convolutional network for facial landmark detection
Our method directly predicts the coordinates of landmarks using a single network without any additional operation, while significantly improving the accuracy of face alignment under severe occlusion
We believe that the proposed algorithm can be applied to other problems that use deep convolutional networks