SynRhythm: Learning a Deep Heart Rate Estimator from ... · Spatial-temporal HR signal...

1
Experiments Database 25.9 13.6 21 7.62 10.7 6.23 8.72 11.31 4.49 RMSE(bpm) Problem: Remote heart rate (HR) estimation from face video Existing methods for remote HR estimation are mainly based on some certain physical models or skin reflection priors, which may not hold in realistic situations because of various challenges in head movement, illumination variation and recording device. SynRhythm: Learning a Deep Heart Rate Estimator from General to Specific Xuesong Niu, Hu Han, Shiguang Shan, Xilin Chen Institute of Computing Technology, Chinese Academy of Sciences [email protected], {hhan, sgshan, xlchen}@ict.ac.cn Proposed Method Spatial-temporal HR signal representation Transform a video sequence of face to a good spatial-temporal representation General-to-specific learning approach with synthetic signal maps Synthetic HR signal HR signal from a real face video synthetic rhythm signal State II : Pre-train on large scale synthetic spatial- temporal maps State I : Pre- train on lmageNet State III : Fine-tune on real-life spatial- temporal maps HR regression model HR estimator 68 bpm ImageNet Synthetic signals Synthetic spatial- temporal maps Face videos Real-life spatial- temporal maps cardiac cycle Breath activity Motion related signal Problem Motivation Learning-based methods, especially deep convolutional neural networks (DCNN), have shown great power in modeling complicated variations in a number of computer vision tasks, such as object detection, and face recognition. The challenges of applying deep learning for remote HR estimation lie in that there is not sufficient data for training a robust DCNN that can generalize to unseen scenarios. The periodical color change cause by HR can be viewed as a one dimensional signal with low PSNR. Such a signal can be simulated using a periodic function. Therefore, we propose to learn a deep HR estimator from general to specific by synthesizing a large number of HR signals. Limited Training Data Limited Spatial- temporal Maps Ground truth: 70 bpm Prediction: 59 bpm Heart Rate Estimator Synthetical Spatial -temporal Maps Limited Training Data Limited Spatial- temporal Maps Ground truth: 70 bpm Prediction: 68 bpm Heart Rate Estimator 19.95 13.97 11.37 6.83 Li2014 Haan2013 Tulyakov2016 Proposed RMSE(bpm) Results Performance on MAHNOB-HCI Performance on MMSE-HR Effectiveness of synthetic data 中科院计算所 70 bpm Heart Rate Estimator Hand-crafted feature Pros: Interpretable Performing well in well-controlled environment Cons: Based on certain assumptions or priors which may not suitable for realistic situation Learning-based representation Pros: Strong modeling ability Robust in realistic situations Cons: Large-scale data is needed for training a robust estimator. Database Subject Video Video length protocol MAHNOB-HCI 27 527 30s three-fold MMSE-HR 40 102 30s three-fold MAHNOB-HCI MMSE-HR

Transcript of SynRhythm: Learning a Deep Heart Rate Estimator from ... · Spatial-temporal HR signal...

Page 1: SynRhythm: Learning a Deep Heart Rate Estimator from ... · Spatial-temporal HR signal representation Transform a video sequence of face to a good spatial-temporal representation

ExperimentsDatabase

25.9

13.6

21

7.6210.7

6.238.72

11.31

4.49RMSE

(bpm

)

Problem: Remote heart rate (HR) estimation from face video

Existing methods for remote HR estimation are mainly based on somecertain physical models or skin reflection priors, which may not hold inrealistic situations because of various challenges in head movement,illumination variation and recording device.

SynRhythm: Learning a Deep Heart Rate Estimator from General to Specific

Xuesong Niu, Hu Han, Shiguang Shan, Xilin ChenInstitute of Computing Technology, Chinese Academy of Sciences

[email protected], {hhan, sgshan, xlchen}@ict.ac.cn

Proposed Method

Spatial-temporal HR signal representation Transform a video sequence of face to a good spatial-temporal representation

General-to-specific learning approach with synthetic signal maps

Synthetic HR signal

HR signal from a real face video

synthetic rhythm signal

State II : Pre-train on large scale

synthetic spatial-temporal maps

State I : Pre-train on

lmageNet

State III : Fine-tune on real-life spatial-

temporal maps

HR regression

model

HR estimator

68 bpm

ImageNet

Synthetic signals

Synthetic spatial-temporal maps

Face videos

Real-life spatial-temporal maps

cardiac cycle

Breath activity

Motion related signal

Problem Motivation Learning-based methods, especially deep convolutional neural networks

(DCNN), have shown great power in modeling complicated variations in anumber of computer vision tasks, such as object detection, and facerecognition.

The challenges of applying deep learning for remote HR estimation lie inthat there is not sufficient data for training a robust DCNN that cangeneralize to unseen scenarios. The periodical color change cause by HRcan be viewed as a one dimensional signal with low PSNR. Such a signalcan be simulated using a periodic function. Therefore, we propose to learna deep HR estimator from general to specific by synthesizing a largenumber of HR signals.

Limited Training DataLimited Spatial-temporal Maps

Ground truth: 70 bpmPrediction: 59 bpm

Heart Rate Estimator

Synthetical Spatial-temporal Maps

Limited Training Data

Limited Spatial-temporal Maps

Ground truth: 70 bpmPrediction: 68 bpm

Heart Rate Estimator

19.95

13.9711.37

6.83

Li2014 Haan2013 Tulyakov2016 Proposed

RMSE

(bpm

)

Results

Performance on MAHNOB-HCI

Performance on MMSE-HR Effectiveness of synthetic data

中科院计算所

70 bpmHeart Rate Estimator

Hand-crafted feature

Pros: Interpretable Performing well in well-controlled

environmentCons: Based on certain assumptions or

priors which may not suitable forrealistic situation

Learning-based representation Pros: Strong modeling ability Robust in realistic situations

Cons: Large-scale data is needed for

training a robust estimator.

Database Subject Video Video length protocolMAHNOB-HCI 27 527 30s three-fold

MMSE-HR 40 102 30s three-fold

MAHNOB-HCI MMSE-HR