Melody Detection in Polyphonic Audio
Rui Pedro Paiva
Melody Detection in Polyphonic Audio
9º Encontro da APEA
October 20, 2007
Centro de Informática e Sistemas da Universidade de Coimbra
Outline
Introduction
Overview
Multi-Pitch Detection
From Pitches to Notes
Identification of Melodic Notes
Evaluation Procedures
Overall Results
Conclusions and Future Work
Introduction
Motivation and Objectives
• Goal
- Extract a symbolic representation of melody from a polyphonic audio musical signal
• Broad range of applications
- MIR, music education, plagiarism detection, metadata
• No general-purpose, robust, accurate solution developed so far
Overall Approach
• Focus on melody, no matter what other sources are present (figure-ground separation)
Introduction
Melody Definition
• Subjective concept
- Different people may define and perceive melody in different ways
- Proposals addressing cultural, perceptual, emotional or musicological facets of melody
• Context of our work
- “Melody is the individual dominant pitched line in a musical ensemble”
Overview
Input: Raw Musical Signal
Multi-Pitch Detection (output: pitches in 46-msec frames)
- 1) Cochleagram
- 2) Correlogram
- 3) Pitch Salience Curve
- 4) Salient Pitches
Determination of Musical Notes (output: entire set of notes)
- 1) Pitch Trajectory Construction
- 2) Trajectory Segmentation (with onset detection directly on the raw signal)
Identification of Melodic Notes
- 1) Elimination of Ghost Octave Notes
- 2) Selection of the Most Salient Notes
- 3) Melody Smoothing
- 4) Elimination of False Positives
Output: Melody Notes
Multi-Pitch Detection
[Figure: a) Sound waveform, x(t) vs. time (s); b) Cochleagram frame, frequency channel vs. time (s); c) Correlogram frame, frequency channel vs. time lag (ms); d) Summary correlogram, pitch salience vs. time lag (ms)]
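The step from correlogram to summary correlogram can be illustrated with a heavily simplified sketch. It assumes the per-channel signals are already available (the cochlear filterbank itself is not modelled here), uses a plain autocorrelation per channel, and sums over channels to obtain a salience value per time lag. The function names, the 8 kHz sampling rate and the toy two-channel input are assumptions made for illustration only, not the author's implementation.

```python
import math

def autocorrelation(x, max_lag):
    """Short-time autocorrelation of one channel for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def summary_correlogram(channels, max_lag):
    """Sum the per-channel autocorrelations (one correlogram frame) over channels."""
    acfs = [autocorrelation(ch, max_lag) for ch in channels]
    return [sum(acf[lag] for acf in acfs) for lag in range(max_lag + 1)]

def most_salient_lag(summary, min_lag):
    """Lag with the largest summary value, ignoring lags below min_lag."""
    return max(range(min_lag, len(summary)), key=lambda lag: summary[lag])

# Toy input (assumption): two "channels" holding a 200 Hz tone and its 400 Hz harmonic.
SAMPLE_RATE = 8000                       # assumed sampling rate in Hz
FRAME_LEN = round(0.046 * SAMPLE_RATE)   # one 46-msec frame
channels = [
    [math.sin(2 * math.pi * f * i / SAMPLE_RATE) for i in range(FRAME_LEN)]
    for f in (200.0, 400.0)
]
summary = summary_correlogram(channels, max_lag=100)
lag = most_salient_lag(summary, min_lag=10)
pitch_hz = SAMPLE_RATE / lag             # period of 40 samples corresponds to 200 Hz
```

Both channels reinforce each other at the lag of the common period, which is why the summary peaks at the fundamental rather than at the harmonic.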
From Pitches to Notes
Goal
• Quantize temporal sequences of detected pitches into MIDI notes
Approach
• Pitch Trajectory Construction [Serra]
• Frequency-Based Track Segmentation
• Salience-Based Track Segmentation
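As a minimal sketch of what quantizing a detected pitch into a MIDI note involves, the snippet below uses the standard equal-temperament mapping with A4 = 440 Hz = MIDI note 69. The reference tuning and the function names are assumptions; the slides do not state which reference is used.

```python
import math

A4_HZ = 440.0   # assumed reference tuning
A4_MIDI = 69    # MIDI note number of A4

def hz_to_midi(freq_hz: float) -> float:
    """Fractional MIDI note number of a frequency in Hz (equal temperament)."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)

def quantize_to_midi(freq_hz: float) -> int:
    """Nearest integer MIDI note number."""
    return int(math.floor(hz_to_midi(freq_hz) + 0.5))
```

For example, `quantize_to_midi(450.0)` still maps to note 69, since 450 Hz lies less than half a semitone above A4.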
From Pitches to Notes
Pitch Trajectory Construction
[Figure: pitch trajectories, MIDI note number vs. frame number (1, 2, ..., n-2, n-1, n)]
From Pitches to Notes
Frequency-Based Track Segmentation
• Approximation of frequency curves by piecewise-constant functions (PCFs)
- 1) MIDI quantization
- 2) Merging of PCFs: oscillation, delimited sequences, glissando
- 3) Time correction
- 4) Assignment of final MIDI note numbers: tuning compensation
[Figure: a) Original PCFs; b) Filtered PCFs]
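The first two steps above can be sketched as follows. This is a simplification under stated assumptions: the frequency curve f[k] is given in cents with 100 cents per semitone and MIDI note = cents / 100, and the merging rule shown (absorbing very short PCFs into the preceding one) is a stand-in for the oscillation, delimited-sequence and glissando rules named on the slide, which are not specified here. All names are hypothetical.

```python
import math
from itertools import groupby

def cents_to_midi(cents):
    """Nearest MIDI note, assuming MIDI note = cents / 100."""
    return int(math.floor(cents / 100 + 0.5))

def to_pcfs(track_cents):
    """Group consecutive frames with the same quantized note into (note, start, length)."""
    notes = [cents_to_midi(c) for c in track_cents]
    pcfs, start = [], 0
    for note, run in groupby(notes):
        length = len(list(run))
        pcfs.append((note, start, length))
        start += length
    return pcfs

def merge_short_pcfs(pcfs, min_length):
    """Absorb PCFs shorter than min_length frames into the preceding PCF,
    then fuse neighbours that end up with the same note (simplified rule)."""
    merged = []
    for note, start, length in pcfs:
        if merged and (length < min_length or merged[-1][0] == note):
            prev_note, prev_start, prev_length = merged[-1]
            merged[-1] = (prev_note, prev_start, prev_length + length)
        else:
            merged.append((note, start, length))
    return merged

# Toy track (assumption): a note around 6000 cents with one outlier frame, then 6200 cents.
track = [6000, 6010, 5990, 6060, 6005, 6000, 6200, 6190, 6210, 6205]
original = to_pcfs(track)
filtered = merge_short_pcfs(original, min_length=2)
```

Here `original` contains four PCFs, including a one-frame excursion to note 61, while `filtered` keeps only two notes.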
From Pitches to Notes
Frequency-Based Track Segmentation
[Figure: frequency curves f[k] (cents) vs. time (msec): a) Eliades Ochoa's excerpt; b) Female opera excerpt]
From Pitches to Notes
Salience-Based Track Segmentation
- 1) Candidate Segmentation Points
- 2) Onset Detection on the audio signal [Scheirer; Klapuri]
- 3) Validation of Segmentation Points by Onset Matching
[Figure: salience sF[k] vs. time (sec)]
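Step 3, validation by onset matching, might look roughly like the sketch below: a candidate segmentation point is kept only if some detected onset lies within a tolerance window around it. The tolerance value, the millisecond units and the function name are assumptions for illustration; the slides do not give the matching rule.

```python
def validate_segmentation_points(candidates_ms, onsets_ms, tolerance_ms):
    """Keep the candidate points that have a detected onset within tolerance_ms."""
    return [
        c for c in candidates_ms
        if any(abs(c - o) <= tolerance_ms for o in onsets_ms)
    ]

# Toy data (assumption): three candidates from the salience curve, two audio onsets.
candidates = [300, 620, 950]
onsets = [310, 900]
validated = validate_segmentation_points(candidates, onsets, tolerance_ms=30)
```

Only the candidate at 300 ms survives here, because the other two have no onset within 30 ms.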
Identification of Melodic Notes
Goal
• Separate the melodic notes from the accompaniment
Approach
• Elimination of Ghost Harmonically-Related Notes
• Selection of the Most Salient Notes
• Melody Smoothing
• Elimination of Spurious Accompaniment Notes
• Note Clustering
13 Melody Detection in Polyphonic Audio
Identification of Melodic Notes
Elimination of Ghost Harmonically-Related Notes
• Delete harmonically-related notes that satisfy the “common fate” principle
[Figure: normalized frequency vs. time (msec)]
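One way to read the "common fate" test is that two harmonically related notes are treated as one source when their frequency curves, normalized by their means, evolve together. The sketch below implements that reading with a mean absolute distance and fixed thresholds. The distance measure, both thresholds and the function names are assumptions, and the curves are assumed to have equal length and to cover the same frames.

```python
def normalize(curve):
    """Divide a frequency curve by its mean so curves at different pitches are comparable."""
    mean = sum(curve) / len(curve)
    return [f / mean for f in curve]

def is_harmonically_related(f_a, f_b, tolerance=0.03):
    """True if the frequency ratio is close to an integer of 2 or more."""
    ratio = max(f_a, f_b) / min(f_a, f_b)
    return round(ratio) >= 2 and abs(ratio - round(ratio)) <= tolerance

def share_common_fate(curve_a, curve_b, max_distance=0.01):
    """True if the normalized curves stay within max_distance of each other on average."""
    a, b = normalize(curve_a), normalize(curve_b)
    distance = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return distance <= max_distance

# Toy curves in Hz (assumption): a note, its octave "ghost", and an independent note.
fundamental = [220.0, 222.0, 219.0, 221.0]
ghost = [2 * f for f in fundamental]
independent = [440.0, 430.0, 450.0, 440.0]
```

In this toy case the octave copy follows the fundamental exactly and would be a deletion candidate, while the independent note at a similar pitch would be kept.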
Identification of Melodic Notes
[Figure: two panels of MIDI note number vs. time (sec), titled "Selection of the Most Salient Notes" and "Melody Smoothing"; schematic of MIDI note number vs. time with labels R1-R4 and a1-a3]
Identification of Melodic Notes
Elimination of Spurious Accompaniment Notes
- 1) Delete deep valleys in the salience contour
- 2) Delete isolated abrupt duration transitions
[Figure: average pitch salience vs. note sequence]
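A rough sketch of step 1 is given below: a note is flagged as a deep valley when its average salience falls well below that of both neighbours in the note sequence. The relative-depth criterion, the 0.7 ratio and the function name are assumptions; the slides do not define how deep a valley must be.

```python
def delete_deep_valleys(saliences, depth_ratio=0.7):
    """Return the indices of notes to keep.

    A note is dropped when its average salience is below depth_ratio times
    the smaller of its two neighbours' saliences (assumed criterion).
    The first and last notes are always kept.
    """
    keep = []
    for i, s in enumerate(saliences):
        has_both_neighbours = 0 < i < len(saliences) - 1
        is_valley = (
            has_both_neighbours
            and s < depth_ratio * min(saliences[i - 1], saliences[i + 1])
        )
        if not is_valley:
            keep.append(i)
    return keep

# Toy average saliences for a five-note sequence (assumption).
kept = delete_deep_valleys([80, 78, 45, 82, 79])
```

The third note, with salience 45 between neighbours at 78 and 82, is the only one removed.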
Identification of Melodic Notes
Elimination of Spurious Accompaniment Notes
[Figure: pitch salience vs. time (s)]
Identification of Melodic Notes
Note Clustering
• Feature Extraction
- Spectral shape, harmonicity, attack transient, intensity, pitch-related features
• Feature Selection and Dimensionality Reduction
- Forward feature selection
- Principal Component Analysis
• Clustering
- Gaussian Mixture Models
– 2 clusters: melodic and “garbage” cluster
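To make the clustering step concrete, here is a toy two-component Gaussian mixture fitted by expectation-maximization on a single scalar feature. It is only a sketch: the real system works on several features after selection and PCA, whereas the single feature, the initialisation at the extremes, the iteration count and all names here are assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(values, iterations=50):
    """EM for a 1-D mixture of two Gaussians; returns (weights, means, variances)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                               # crude initialisation at the extremes
    variances = [((hi - lo) / 2) ** 2 or 1.0] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)          # floor to avoid a collapsed component
    return weights, means, variances

def assign_clusters(values, weights, means, variances):
    """Index (0 or 1) of the most likely component for each value."""
    return [
        max(range(2), key=lambda k: weights[k] * gaussian_pdf(x, means[k], variances[k]))
        for x in values
    ]

# Toy feature values (assumption): two well-separated groups of notes.
features = [0.9, 1.0, 1.1, 1.05, 4.9, 5.0, 5.1, 5.2]
params = fit_two_gaussians(features)
labels = assign_clusters(features, *params)
```

Which of the two components plays the role of the melodic cluster and which is the "garbage" cluster is not decided by the mixture itself and would need a separate rule.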
Evaluation Procedures
Ground Truth Data
ID  Song Title                                Category       Solo Type
1   Pachelbel’s “Kanon”                       Classical      Instrumental
2   Handel’s “Hallelujah”                     Choral         Vocal
3   Enya – “Only Time”                        New Age        Vocal
4   Dido – “Thank You”                        Pop            Vocal
5   Ricky Martin – “Private Emotion”          Pop            Vocal
6   Avril Lavigne – “Complicated”             Pop/Rock       Vocal
7   Claudio Roditi – “Rua Dona Margarida”     Jazz/Easy      Instrumental
8   Mambo Kings – “Bella Maria de Mi Alma”    Bolero         Instrumental
9   Eliades Ochoa – “Chan Chan”               Son            Vocal
10  Juan Luis Guerra – “Palomita Blanca”      Bachata        Vocal
11  Battlefield Band – “Snow on the Hills”    Scottish Folk  Instrumental
12  daisy2                                    Pop            Vocal
13  daisy3                                    Pop            Vocal
14  jazz2                                     Jazz           Instrumental
15  jazz3                                     Jazz           Instrumental
16  midi1                                     Pop            Instrumental
17  midi2                                     Folk           Instrumental
18  opera female 2                            Opera          Vocal
19  opera male 3                              Opera          Vocal
20  pop1                                      Pop            Vocal
21  pop4                                      Pop            Vocal
Evaluation Procedures
Evaluation Metrics
• Pitch Contour Accuracy
- Melodic Raw Pitch Accuracy (MRPA)
- Overall Raw Pitch Accuracy (ORPA)
- Melodic Raw Note Accuracy (MRNA)
- Overall Raw Note Accuracy (ORNA)
• Note Extraction Accuracy
- Percentage of correct frames
• Melody/Accompaniment Discrimination
- Recall
- Precision
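Recall and precision follow their usual definitions, and a frame-level accuracy can be computed as the share of frames whose detected note matches the reference. The sketch below shows both under simple assumptions: notes are identified by hypothetical integer IDs, frames are compared by exact MIDI note equality, and the function names are illustrative. The precise definitions of MRPA, ORPA, MRNA and ORNA are not given on the slide and are not reproduced here.

```python
def recall_precision(extracted, reference):
    """Recall and precision of a set of extracted notes against the reference melody notes."""
    true_positives = len(extracted & reference)
    recall = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(extracted) if extracted else 0.0
    return recall, precision

def percent_correct_frames(detected, reference):
    """Percentage of frames whose detected MIDI note equals the reference note."""
    correct = sum(1 for d, r in zip(detected, reference) if d == r)
    return 100.0 * correct / len(reference)

# Toy data (assumption): note IDs kept by the system vs. annotated melody notes.
recall, precision = recall_precision({1, 2, 3, 4}, {2, 3, 5})
frame_accuracy = percent_correct_frames([60, 61, 62, 62], [60, 60, 62, 62])
```

In this toy case two of the three reference notes are found (recall 2/3) and two of the four extracted notes are melodic (precision 0.5).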
Overall Results
Multi-Pitch Detection
• MRPA = 81.0%
Note Determination
• Frequency-Based Track Segmentation: Re = 72%; Pr = 94.7%
• Salience-Based Track Segmentation: Re = 75%; Pr = 41.2%
Identification of Melodic Notes
• Elimination of Ghost Notes (EGN)
- Eliminated 37.8% of notes (only 0.3% melodic)
• Overall Melody Extraction
- MRNA = 84.4%; ORNA = 77.0%
• Melody/Accompaniment Discrimination
- Recall = 31.0%; Precision = 52.8%
• MIREX’2004
- Train (note accuracy) = 80.1 / 77.1%; Test = 77.4 / 75.1%
- Train (pitch contour accuracy) = 75.1; 71.1%
• MIREX’2005
- 61.1% overall raw pitch accuracy
Conclusions and Future Work
Main Contributions
• Note determination
• Identification of melodic notes in a mixture
Conclusions
• Adequate MPD approach in a melody context
• Note determination with satisfactory accuracy
• Identification of melodic notes successful in medium SNR conditions
Future Work
• Larger musical corpus
• Reduce computational time for pitch detection
• Melody/Accompaniment discrimination