Automatic Chord Recognition using Neural Networks · Automatic Chord Recognition using Neural...

Post on 23-Aug-2020

10 views 0 download

Transcript of Automatic Chord Recognition using Neural Networks · Automatic Chord Recognition using Neural...

Automatic Chord Recognition using Neural Networks

Audio Signal Processing

Ayoub Ghriss

MVA, 2017

Computational Statistics MVA, 2017 p 1 / 20

Contents

1 Introduction

2 Features Extraction

3 Features Processing

4 Learning Architecture

5 Conclusion

Computational Statistics MVA, 2017 p 2 / 20

Introduction

Computational Statistics MVA, 2017 p 3 / 20

A bit of music and history

• Automatic Chord Recognition, first start in 1991• Classic methods : KNN, Logistic, HMM• Leap in precision with the introduction of NeuralNetworks

Computational Statistics MVA, 2017 p 4 / 20

Chords and Pitch

• Pitch : subjective perception of a note’s height• The pitch system is composed of 12 classes,• An octave means a doubling of the frequency,each octave : fn = 2

112 fn−1

• Define an equivalence relation between notes• A chord is combination of 2 or more pitches

Computational Statistics MVA, 2017 p 5 / 20

Chords and Pitch

Computational Statistics MVA, 2017 p 6 / 20

Features Extraction

Computational Statistics MVA, 2017 p 7 / 20

Short-Time Fourier Transform

• Discrete FT on equally spaced segments of thesong

• Frequency mapping can be scaled (Mel,Logarithmic)

Drawbacks :• Constant resolution and frequency difference• Not suited to represent the pitch class concept

Computational Statistics MVA, 2017 p 8 / 20

Constant Q Transform

fk = fmin.2kB (1)

where k is the frequency index,B is the number ofbins per octave.

X (k ; r) =1Nk

∑n

x(nr)w(n)e j2πnQ/Nk (2)

Q = 12B−1 , and the resolution−1Nk = [Q fs

fk]

Computational Statistics MVA, 2017 p 9 / 20

PCP, the first ACR System

• Can be seen as 1-bin Constant Q Transform• Uses pitches pattern matching using NN andhand crafted score

Takuya Fujishima, Realtime Chord Recognition ofMusical Sound

Computational Statistics MVA, 2017 p 10 / 20

CQT vs STFT : who’s the best ?

Muhammad Huzaifah, Comparison ofTime-Frequency Representations.

Computational Statistics MVA, 2017 p 11 / 20

Features Processing

Computational Statistics MVA, 2017 p 12 / 20

Pre-processing

• Time slicing : concatenating adjacent frames• First low pass filter : yn = αyn−1 + (1− α)xn• A pair of low pass filters : exponentiallyweighted mean ∑

i=−r

ra−|i |x [.+ r ]

• Other papers : Geometric mean/ Median filter

Computational Statistics MVA, 2017 p 13 / 20

Learning Architecture

Computational Statistics MVA, 2017 p 14 / 20

Computational Statistics MVA, 2017 p 15 / 20

Deep Belief Networks

• Layers of fully connected layers of RestrictedBoltzmann Machines

• A stochastic neural network :• One layer of visible units : chords• One layer of hidden units : latent variables• the hidden units of layer i are the visible for thelayer i + 1

• Seen as an out-dated model in the Deepcommunity

Computational Statistics MVA, 2017 p 16 / 20

Deep architectures

• Recurrent Neural NetworksNicolas Boulanger-Lewandowski, AudioChord Recogntion with Recurent NeuralNetworks.

• Convolutional NN:Anis Rojb al., Music Transcription by DeepLearning with Data and “Artificial Semantic”Augmentation

Computational Statistics MVA, 2017 p 17 / 20

Conclusion

Computational Statistics MVA, 2017 p 18 / 20

Conclusions

• Improvement with the introduction with neuralnetworks

• little difference in performance between differentstructures

• Smoothness of solutions to be improved

Matthias Mauch & Katy Noland & SimonDixon (2009) Using Musical Structure toEnhance Automatic Chord Transcription.

Computational Statistics MVA, 2017 p 19 / 20

Conclusions

Computational Statistics MVA, 2017 p 20 / 20