Learning from Few Subjects with Large Amounts of Voice ... · John V. Guttag Robert E. Hillman...

Jose Javier Gonzalez Ortiz

Learning from Few Subjects with Large Amounts of Voice Monitoring Data

John V . Gut tagRober t E . H i l lman Daryush D . Mehta Ja r rad H . Van S tan Marzyeh Ghassemi


• Few subjects and large amounts of data→ Overfitting to subjects

• No obvious mapping from signal to features→ Feature engineering is labor intensive

• Subject-level labels→ In many cases, no good way of getting sample

specific annotations

1

Challenges of Many Medical Time Series


• Few subjects and large amounts of data→ Overfitting to subjects

• No obvious mapping from signal to features→ Feature engineering is labor intensive

• Subject-level labels→ In many cases, no good way of getting sample

specific annotations

2

Challenges of Many Medical Time Series

Unsupervised feature extraction

Multiple Instance Learning


128 x 64 128 x 64Conv + BatchNorm + ReLUPoolingDenseUpsamplingSigmoid

3

Learning Features

• Segment signal into windows• Compute time-frequency representation• Unsupervised feature extraction

Jose Javier Gonzalez Ortiz4

Ours

Raw Waveform Spectrogram Encoder

Logistic Regression

• % Positive

Prediction

Per Window Per Subject

Raw Waveform Spectrogram Encoder

Logistic Regression

• % Positive

Prediction

Per Window Per Subject

Classification Using Multiple Instance Learning

• Logistic regression on learned features with subject labels

• Aggregate prediction using % positive windows per subject

Jose Javier Gonzalez Ortiz5

Application: Voice Monitoring Data

• Voice disorders affect 7% of the US population

• Data collected through neck placed accelerometer

1 week = ~4 billion samples

~100


Previous work relied on expert designed features[1]

6

Results

AUC Accuracy

Expert LRTrain 0.70 ± 0.05 0.71 ± 0.04

Test 0.68 ± 0.05 0.69 ± 0.04

OursTrain 0.73 ± 0.06 0.72 ± 0.04

Test 0.69 ± 0.07 0.70 ± 0.05

[1] Marzyeh Ghassemi et al. Learning to detect vocal hyperfunction from ambulatory neck-surface acceleration features: initial results for vocal fold nodules. IEEE Trans. Biomed Engineering

Comparable performance without task-specific feature engineering!


• Our method learns features from large time series data

• Reduces the need for laborious task-specific feature engineering

• Applied to large voice monitoring dataset

• Comparable performance to previous work that relied on expert engineered features

7

Summary

Learning from Few Subjects with Large Amounts of Voice ... · John V. Guttag Robert E. Hillman...

Documents

Transcript of Learning from Few Subjects with Large Amounts of Voice ... · John V. Guttag Robert E. Hillman...