Song-level Multi-pitch Tracking by Heavily Constrained Clustering
Zhiyao Duan, Jinyu Han and Bryan Pardo
EECS Dept., Northwestern Univ.
Interactive Audio Lab, http://music.cs.northwestern.edu
For presentation in ICASSP 2010, Dallas, Texas, USA.
Multi-pitch Estimation & Tracking Task
• Given polyphonic music played by several monophonic harmonic instruments (number of instruments known)
• Estimate a pitch trajectory for each instrument
Northwestern University, Interactive Audio Lab. http://music.cs.northwestern.edu
Potential Applications
• Automatic music transcription
• Harmonic source separation
• Other applications
  – Melody-based music search
  – Chord recognition
  – Source localization
  – Music education
  – …
The 2-stage Standard Approach
• Stage 1: Multi-pitch Estimation (MPE): estimate pitches in each single time frame
  – Z. Duan, B. Pardo and C. Zhang, “Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-peak Regions”, IEEE Trans. Audio Speech Language Process., in press.
• Stage 2: Multi-pitch Tracking (MPT): connect pitch estimates across frames into pitch trajectories
[Figure: pitch estimates plotted over Time (x-axis) and Frequency (y-axis)]
State of the Art of MPT
• What existing MPT methods do
  – Form short pitch trajectories within a note (note-level) according to local time-frequency proximity of pitch estimates
• Our contribution
  – Form long pitch trajectories through multiple notes (song-level) using a new constrained clustering algorithm
Try Clustering by Timbre
• Each trajectory is a cluster of pitch estimates
• One cluster per instrument
• Clustering principle: maintain timbre consistency in each cluster
Timbre Feature of Pitch Estimates
• Harmonic structure: relative amplitudes of first 50 harmonics
[Figure: pitch estimates over Time and Frequency, with the Harmonic Structure of a pitch estimate plotted as Amplitude (dB) against Harmonic number]
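A minimal sketch of how such a feature could be computed, assuming one frame's one-sided magnitude spectrum and a pitch estimate in Hz are available. The function name `harmonic_structure`, the nearest-bin lookup, the dB floor, and the normalization to the strongest harmonic are illustrative assumptions, not the procedure used in the talk.

```python
import numpy as np

def harmonic_structure(mag_spectrum, f0_hz, sample_rate, n_harmonics=50, floor_db=-100.0):
    """Relative amplitudes (dB) of the first n_harmonics of a pitch estimate.

    mag_spectrum: one-sided magnitude spectrum of a single frame
                  (length n_fft // 2 + 1, as returned by np.abs(np.fft.rfft(...))).
    """
    mag_spectrum = np.asarray(mag_spectrum, dtype=float)
    n_bins = len(mag_spectrum)
    n_fft = 2 * (n_bins - 1)

    # Harmonics that are missing or above Nyquist stay at the floor value (assumption).
    feature = np.full(n_harmonics, floor_db)
    for h in range(1, n_harmonics + 1):
        bin_idx = int(round(h * f0_hz * n_fft / sample_rate))  # nearest FFT bin
        if bin_idx >= n_bins:
            break  # this and all higher harmonics lie above Nyquist
        amp = mag_spectrum[bin_idx]
        if amp > 0:
            feature[h - 1] = max(20.0 * np.log10(amp), floor_db)

    # Express amplitudes relative to the strongest harmonic (which becomes 0 dB).
    return feature - feature.max()
```

Each pitch estimate then maps to one 50-dimensional vector, which is the point that gets clustered.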
Minimize This Objective Function
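$$f(\Pi) = \sum_{k=1}^{K} \sum_{x_i \in T_k} \| x_i - c_k \|^2$$
• $\Pi$: a partition into $K$ clusters
• $x_i$: the 50-d harmonic structure of the $i$-th pitch estimate
• $K$: number of clusters
• $c_k$: center of the $k$-th cluster
• $T_k$: all pitch estimates in the $k$-th cluster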
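The objective is the within-cluster sum of squared distances between harmonic-structure vectors and their cluster centers. A small sketch follows, assuming the center of a cluster is the mean of its members (the usual K-means convention; the slide only says "center"). The name `timbre_objective` is hypothetical.

```python
import numpy as np

def timbre_objective(features, labels, n_clusters):
    """Sum over clusters of squared distances from each member to the cluster center.

    features: array of shape (n_estimates, dim), e.g. dim = 50 harmonic amplitudes.
    labels:   cluster index in [0, n_clusters) for each pitch estimate.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in range(n_clusters):
        members = features[labels == k]
        if len(members) == 0:
            continue  # an empty cluster contributes nothing
        center = members.mean(axis=0)  # assumption: center = mean of members
        total += np.sum((members - center) ** 2)
    return float(total)
```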
Objective Function Is Not Enough
Add Pitch-locality Constraints
• Must-link: pitch estimates close in both time and frequency should be in the same cluster
• Cannot-link: simultaneous pitches should not be in the same cluster (only for monophonic instruments)
[Figure: pitch estimates plotted over Time and Frequency]
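The two constraint types can be sketched as follows, assuming each pitch estimate is a `(frame_index, midi_pitch)` pair. The thresholds `max_frame_gap` and `max_semitone_gap` and the function name `build_constraints` are illustrative assumptions; the slides do not state how "close" is defined.

```python
from itertools import combinations

def build_constraints(estimates, max_frame_gap=1, max_semitone_gap=0.3):
    """Return (must_links, cannot_links) as lists of index pairs (i, j), i < j.

    estimates: list of (frame_index, midi_pitch) tuples.
    """
    must_links, cannot_links = [], []
    for (i, (ti, pi)), (j, (tj, pj)) in combinations(enumerate(estimates), 2):
        if ti == tj:
            # Simultaneous pitches: a monophonic instrument cannot play both.
            cannot_links.append((i, j))
        elif abs(ti - tj) <= max_frame_gap and abs(pi - pj) <= max_semitone_gap:
            # Close in both time and frequency: likely the same note.
            must_links.append((i, j))
    return must_links, cannot_links
```

The pairwise loop is quadratic in the number of estimates; it is kept for clarity rather than speed.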
Properties of Our Problem
• Objective: timbre consistency
• Constraints: pitch locality
• Previous constrained clustering algorithms do not apply due to the following properties:
  – Inconsistent constraints: pitch estimates are sometimes erroneous, which may make the constraints unsatisfiable
  – Heavily constrained: nearly every pitch estimate is involved in at least one constraint
The Proposed Clustering Algorithm
$\Pi_n$: clustering in the $n$-th iteration;
$C_n$: {all constraints satisfied by $\Pi_n$};
1. Start from an initial clustering $\Pi_0$, which satisfies $C_0$, a subset of all constraints; $n=1$;
2. Find a new clustering $\Pi_n$ that decreases the objective and also satisfies $C_{n-1}$;
3. $C_n$ = {all constraints satisfied by $\Pi_n$};
4. Set $n = n+1$ and repeat 2-4 until the objective (nearly) cannot be decreased.
$C_0 \subseteq C_1 \subseteq \cdots \subseteq C_n$
$f(\Pi_0) > f(\Pi_1) > \cdots > f(\Pi_n)$
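The iteration above can be sketched as a loop that accepts a candidate clustering only if it lowers the objective and keeps every constraint that is already satisfied. This is a sketch under assumptions: `candidates` is a hypothetical hook that proposes alternative labellings, `single_moves` is a simple stand-in for it (the talk searches over swap sets instead), and cluster centers are taken to be means.

```python
import numpy as np

def objective(features, labels):
    """Within-cluster sum of squared distances to the cluster means."""
    total = 0.0
    for k in np.unique(labels):
        members = features[labels == k]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return float(total)

def satisfied(labels, must, cannot):
    """Set of constraints that the labelling currently satisfies."""
    ok = {("must", i, j) for i, j in must if labels[i] == labels[j]}
    ok |= {("cannot", i, j) for i, j in cannot if labels[i] != labels[j]}
    return ok

def single_moves(labels, n_clusters=2):
    """Stand-in candidate generator: move one point to another cluster."""
    for i in range(len(labels)):
        for k in range(n_clusters):
            if k != labels[i]:
                new = labels.copy()
                new[i] = k
                yield new

def constrained_clustering(features, labels0, must, cannot, candidates=single_moves, tol=1e-9):
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels0).copy()
    current = satisfied(labels, must, cannot)      # C_0
    f_cur = objective(features, labels)
    improved = True
    while improved:                                # stop when nothing decreases f
        improved = False
        for new in candidates(labels):
            new_sat = satisfied(new, must, cannot)
            f_new = objective(features, new)
            # Accept only if f decreases and all of C_{n-1} is still satisfied.
            if f_new < f_cur - tol and current <= new_sat:
                labels, f_cur, current = new, f_new, new_sat   # C_n contains C_{n-1}
                improved = True
                break
    return labels, current, f_cur
```

Because a candidate is accepted only when the satisfied set does not shrink, the set of satisfied constraints grows monotonically while the objective falls.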
Initial Clustering
• Trivial one
  – $\Pi_0$: a random partition
  – $C_0$: constraints satisfied by $\Pi_0$, may be empty
• A more informative one for MPT
  – $\Pi_0$: label pitches according to pitch order in each frame: highest, second-highest, third-highest, fourth-highest, …
  – $C_0$: will contain all cannot-links
[Figure: two plots of pitch estimates over Time and Frequency]
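The pitch-order initialization can be sketched as below, assuming each pitch estimate is a `(frame_index, pitch)` pair; the function name `pitch_order_labels` is hypothetical. Since simultaneous estimates receive distinct ranks, every cannot-link between pitches in the same frame is satisfied by construction.

```python
from collections import defaultdict

def pitch_order_labels(estimates):
    """Label each estimate by its pitch rank within its frame.

    estimates: list of (frame_index, pitch) tuples.
    Returns a list of labels: 0 for the highest pitch in a frame,
    1 for the second-highest, and so on.
    """
    by_frame = defaultdict(list)
    for idx, (frame, pitch) in enumerate(estimates):
        by_frame[frame].append((pitch, idx))

    labels = [0] * len(estimates)
    for members in by_frame.values():
        # Sort from highest to lowest pitch and assign the rank as the label.
        for rank, (_, idx) in enumerate(sorted(members, reverse=True)):
            labels[idx] = rank
    return labels
```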
Find A New Clustering
• 1. Satisfy current constraints
• 2. Decrease the objective function
• Swap set: A connected subgraph between two clusters.
• Traverse all swap sets until finding a new clustering that decreases the objective function
[Figure: graphs of numbered pitch estimates connected by constraint links. Legend: satisfied cannot-link, unsatisfied cannot-link, satisfied must-link, unsatisfied must-link]
$C_0 \subseteq C_1 \subseteq \cdots \subseteq C_n$, $f(\Pi_0) > f(\Pi_1) > \cdots > f(\Pi_n)$
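One reading of the swap-set idea, sketched under assumptions: take the graph whose edges are the currently satisfied links among points in two clusters `a` and `b`, find the connected component containing a seed point, and exchange the two labels for every point in that component. Satisfied must-links stay satisfied because both endpoints flip together, and satisfied cannot-links stay satisfied because the endpoints still differ. The names `swap_set` and `apply_swap` are hypothetical, and the seed is assumed to lie in cluster `a` or `b`.

```python
from collections import defaultdict, deque

def swap_set(labels, satisfied_links, seed, a, b):
    """Connected component of `seed` in the graph of satisfied links,
    restricted to points currently labelled a or b."""
    in_pair = {i for i, lab in enumerate(labels) if lab in (a, b)}
    graph = defaultdict(set)
    for i, j in satisfied_links:
        if i in in_pair and j in in_pair:
            graph[i].add(j)
            graph[j].add(i)

    # Breadth-first search from the seed point.
    component, queue = {seed}, deque([seed])
    while queue:
        node = queue.popleft()
        for nxt in graph[node] - component:
            component.add(nxt)
            queue.append(nxt)
    return component

def apply_swap(labels, component, a, b):
    """Return a new labelling with labels a and b exchanged inside the component."""
    new = list(labels)
    for i in component:
        new[i] = b if new[i] == a else a
    return new
```

A search over candidates would call `swap_set` for each pair of clusters and each seed, apply the swap, and keep the first result that lowers the objective.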
Algorithm Review
$\Pi_k$: partition of points into clusters
$S_k$: feasible solution space under constraints $C_k$
Experiments
• Data set
  – 10 J.S. Bach chorales (quartets, played by violin, clarinet, saxophone and bassoon)
  – Each instrument is recorded individually, then mixed
• Ground-truth pitch trajectories
  – Use YIN on monophonic tracks before mixing
• Input pitch estimates
  – Our previous work in [1]
  – Input accuracy: 70.0 ± 3.1%
[1] Zhiyao Duan, Bryan Pardo and Changshui Zhang, “Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-peak Regions”, IEEE Trans. Audio Speech Language Process., in press.
Overall Multi-pitch Tracking Results
Mean % of correct pitch estimates
Among Correctly Estimated Pitches
An Example
[Figure: Ground-truth Pitch Trajectories. x-axis: Time (second), 0 to 25; y-axis: Pitch (MIDI number), 40 to 90]
An Example
[Figure: Our Results. x-axis: Time (second), 0 to 25; y-axis: Pitch (MIDI number), 40 to 90]
Conclusion
• Formulate the song-level Multi-pitch Tracking problem as a constrained clustering problem
  – Objective: timbre consistency
  – Constraints: pitch locality
• Existing constrained clustering algorithms do not apply due to problem properties
• Propose a new constrained clustering algorithm
• Experimental results are promising
Thank you!