MUMT611: Music Information Acquisition, Preservation, and Retrieval Presentation on Timbre Similarity Alexandre Savard March 2006

MUMT611: Music Information Acquisition, Preservation, and Retrieval

Presentation on Timbre Similarity

Alexandre Savard

March 2006

Content
• Introduction
• Measurement of timbre
• Measurement of similarity
• Systems Evaluation
• Recent developments
• Conclusion

Introduction
Incomplete timbre definition

– Timbre is a fundamental dimension of sound.
– Timbre has too often been described as the dimension of sound that lets the listener distinguish between two sounds that have the same pitch and the same loudness.

Introduction
Incomplete timbre definition

– An efficient operational definition of timbre has not yet been achieved.
– Previous research demonstrated the multidimensional nature of timbre.
– Existing timbre research has already compared the timbre similarity of single instrumental notes.

Introduction
Physical features of timbre

– Attack transients
– Spectral flux
– Spectral gravity centre
– Harmonicity ratio
– Spectral/temporal envelope
– Other factors:
  • Pitch
  • Loudness

Introduction
Global Timbre

– A local definition of timbre appears to be of little use for electronic music distribution or music recommendation systems.
– Researchers use the concept of “global” timbre, which attributes a single timbre quality to an entire piece.
– This idea only makes sense if there is little variation in texture and instrumentation.

Measurement of timbre
Mel-Frequency Cepstrum Coefficient

– Mel-Frequency Cepstrum Coefficient (MFCC)
  • Spectral gravity centre
  • Spectral envelope
  • Spectral flux
  • Combines those measures in a “feature vector”

Measurement of timbre
Mel-Frequency Cepstrum Coefficient

– MFCCs measure variations of the spectral envelope.
– The computation maps linear frequencies onto the psychoacoustically based Mel scale.
– The result is an ordered sequence of coefficients.
– Low-order coefficients describe slow variations of the spectral envelope along the frequency axis.
– High-order coefficients describe fast variations.
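
The steps on this slide can be sketched in code. The following is a minimal, unoptimized illustration and not the implementation used in the cited work: it assumes one common Hz-to-Mel formula, triangular Mel filters, and a DCT-II of the log band energies. The function names, the 20-filter bank, and the example frame are all assumptions made for illustration.

```python
import cmath
import math


def hz_to_mel(hz):
    """Map a frequency in Hz to the Mel scale (one common formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2; adequate for short frames."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose centres are equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, converted back to Hz
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for b in range(n_bins):
            f = b * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=8):
    """Return the first n_coeffs cepstral coefficients of one frame."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(spec), sample_rate)
    # Log energy in each Mel band (the small floor avoids log(0))
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    # DCT-II of the log energies yields the ordered sequence of coefficients
    return [
        sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters) for j, e in enumerate(log_e))
        for c in range(n_coeffs)
    ]


# Example: one 16 ms frame of a 440 Hz sine sampled at 8 kHz (illustrative values)
frame = [math.sin(2 * math.pi * 440.0 * t / 8000.0) for t in range(128)]
coefficients = mfcc(frame, sample_rate=8000)
```

Keeping only the first few coefficients, as `n_coeffs` does here, retains the smooth shape of the spectral envelope and discards fine spectral detail.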

Measurement of Similarity
Similarity Metric

– A metric is applied to compute the distance between two timbre representations, which determines how similar the music is.
– The metric should reflect the strategy humans use in similarity judgments of timbre.
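
As a minimal illustration of a distance between two representations, the sketch below ranks songs by the Euclidean distance between fixed-length feature vectors. Euclidean distance is chosen only because it is the simplest metric; it is an assumption here and not the model-based distance discussed elsewhere in this presentation. The song names and vectors are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query, collection):
    """Return the titles in collection, most similar to the query first."""
    return sorted(collection, key=lambda title: euclidean_distance(query, collection[title]))


# Hypothetical two-dimensional feature vectors, for illustration only
library = {"song_a": [1.0, 2.0], "song_b": [5.0, 5.0], "song_c": [1.5, 2.5]}
ranking = rank_by_similarity([1.0, 2.0], library)
```

A smaller distance is read as a higher similarity, so the ranking lists the closest song first.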

Measurement of Similarity
Gaussian Mixture Model

– MFCC extraction produces a large number of coefficients.
– A more compact representation is needed to handle those results.

Measurement of Similarity
Gaussian Mixture Model

– A GMM is composed of one or more Gaussian probability distribution components.
– The distance between GMMs can be seen as a measurement of similarity.
– Samples are taken from both songs to be compared.
– The probabilities of those random samples are computed for each song to be compared.

Measurement of Similarity
Gaussian Mixture Model

– The “distance” between GMMs can be seen as a measurement of similarity.
– The “distance” is the amount of change needed to obtain the samples of the second song from the first one.
– The higher the probability of one song's samples under the other song's model, the higher the similarity.
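
A sampling-based distance of this kind can be sketched as follows. This is an illustration under stated assumptions, not the cited authors' code: the mixtures use diagonal covariances, the `DiagonalGMM` class and `sampling_distance` function are hypothetical names, and the symmetric combination of log-likelihoods is one common way to turn the probabilities into a distance.

```python
import math
import random


class DiagonalGMM:
    """Gaussian mixture with diagonal covariances.

    components is a list of (weight, means, variances) tuples.
    """

    def __init__(self, components):
        self.components = components

    def sample(self, rng):
        """Draw one point: pick a component by weight, then sample each dimension."""
        weights = [w for w, _, _ in self.components]
        _, means, variances = rng.choices(self.components, weights=weights)[0]
        return [rng.gauss(m, math.sqrt(v)) for m, v in zip(means, variances)]

    def log_likelihood(self, x):
        """Log probability density of point x under the mixture."""
        logs = []
        for weight, means, variances in self.components:
            log_p = math.log(weight)
            for xi, m, v in zip(x, means, variances):
                log_p += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            logs.append(log_p)
        peak = max(logs)  # log-sum-exp for numerical stability
        return peak + math.log(sum(math.exp(value - peak) for value in logs))


def sampling_distance(gmm_a, gmm_b, n_samples=100, seed=0):
    """Symmetric distance estimated from samples drawn from both models."""
    rng = random.Random(seed)
    samples_a = [gmm_a.sample(rng) for _ in range(n_samples)]
    samples_b = [gmm_b.sample(rng) for _ in range(n_samples)]

    def total(model, samples):
        return sum(model.log_likelihood(s) for s in samples)

    # High cross-likelihoods (each song's samples under the other model)
    # shrink the distance, i.e. indicate higher similarity.
    return (total(gmm_a, samples_a) + total(gmm_b, samples_b)
            - total(gmm_b, samples_a) - total(gmm_a, samples_b))


# Hypothetical two-dimensional models, for illustration only
song_a = DiagonalGMM([(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [2.0, 0.0], [1.0, 1.0])])
song_near = DiagonalGMM([(1.0, [1.0, 0.5], [1.0, 1.0])])
song_far = DiagonalGMM([(1.0, [8.0, 8.0], [1.0, 1.0])])
```

With these toy models, `sampling_distance(song_a, song_near)` should come out smaller than `sampling_distance(song_a, song_far)`, and the distance of a model to itself is zero up to rounding.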

Measurement of Similarity
Gaussian Mixture Model

J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.

Measurement of Similarity
Different Approaches

– Neural Networks
– Hidden Markov Models
– Gaussian Mixture Models
– Self-Organizing Maps

Systems Evaluation
Evaluation criteria

– Timbre similarity judgment is based on a set of objective and subjective perceptual, cognitive and cultural aspects.
– Measures are highly dependent on the music present in the database.

Systems Evaluation
Objective Evaluation

– The objective evaluation of a timbral similarity measure is problematic.
– The metadata of a given database include descriptions of the artist and of the genre. However, timbre quality is usually not described there.

Systems Evaluation
Subjective Evaluation

– Requires conducting a psychoacoustic survey.
– Deciding whether two songs have a similar timbre can be uncertain, since timbre similarity is an ill-defined concept.

Recent Developments
Aucouturier and Pachet (2002)

– Each song is segmented using fixed 50 ms windows.
– Each segment is characterized by 8 MFCCs.
– Each song is modelled by a Gaussian Mixture Model composed of three Gaussian probability distributions.
– 100 random samples are taken for the similarity measurement.
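
The first step, fixed-length segmentation, can be sketched as below. The slide gives only the 50 ms window length, so treating the windows as consecutive and non-overlapping, and dropping a trailing partial window, are assumptions; `frame_signal` and the `SETTINGS` dictionary are hypothetical names that simply restate the values listed above.

```python
def frame_signal(signal, sample_rate, window_ms=50.0):
    """Split a signal into consecutive, non-overlapping fixed-length frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    frame_len = int(round(sample_rate * window_ms / 1000.0))
    if frame_len <= 0:
        raise ValueError("window is shorter than one sample")
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# Values taken from the slide, gathered in one place for a later pipeline
SETTINGS = {
    "window_ms": 50.0,        # fixed analysis window
    "n_mfcc": 8,              # coefficients kept per frame
    "n_gmm_components": 3,    # Gaussians per song model
    "n_distance_samples": 100 # samples drawn for the similarity measurement
}

# Example: one second of silence at 8 kHz (illustrative input)
frames = frame_signal([0.0] * 8000, sample_rate=8000, window_ms=SETTINGS["window_ms"])
```

At 8 kHz a 50 ms window holds 400 samples, so one second of audio yields 20 frames, each of which would then be reduced to 8 MFCCs.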

Recent Developments
Aucouturier and Pachet (2002)

J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.

Recent Developments
Aucouturier and Pachet (2004)

– Finding the best set of parameters:
  • Sampling rate of the music signal
  • Number of MFCCs extracted from each frame of data
  • Number of components used in the GMM
  • Distance sample rate: the number of samples used to estimate the likelihood of one model given another
  • Window size
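
One straightforward way to explore such a parameter space is an exhaustive grid search, sketched below. This is an illustration and not the procedure reported by the authors: the candidate values in `PARAMETER_GRID` are hypothetical, and `evaluate` stands for any user-supplied function that scores a similarity system built with the given parameters (higher is better).

```python
from itertools import product

# Hypothetical candidate values for the five parameters listed above
PARAMETER_GRID = {
    "sample_rate_hz": [11025, 22050, 44100],
    "n_mfcc": [8, 20],
    "n_gmm_components": [3, 10],
    "n_distance_samples": [100, 1000],
    "window_ms": [25, 50],
}


def grid_search(grid, evaluate):
    """Try every parameter combination and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if best_score is None or score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Stand-in scoring function; a real one would run a retrieval experiment."""
    return params["n_mfcc"]


best_params, best_score = grid_search(PARAMETER_GRID, toy_evaluate)
```

The cost grows with the product of the list lengths (48 combinations here), which is why each evaluation needs to be reasonably cheap.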

Recent Developments
Aucouturier and Pachet (2004)

J. Aucouturier et al., 2004, “The Way It Sounds”: Timbre Models for Analysis and Retrieval of Music Signals.

Recent Developments
Aucouturier and Pachet (2004)

– Tested alternative similarity measurements using Earth Mover’s Distance and Hidden Markov Models.
– Those techniques did not improve performance.
– This suggests that there could be a ceiling on the performance of techniques based on timbre similarity.

Recent Developments
Liu and Huang (2002)

– Developed an algorithm for the singing voice.
– Used MFCCs as well as GMMs for their timbre representation.
– The audio signal is segmented according to the phonemes in the singing.

Recent Developments
Logan and Salomon (2001)

– Characterized timbre with MFCCs.
– Used K-means clustering instead of a GMM.
– Computed the amount of similarity using the Earth Mover’s Distance.
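
To give an intuition for the Earth Mover’s Distance, the sketch below computes it for the one-dimensional special case of two histograms over the same equally spaced bins. This is a deliberate simplification and not Logan and Salomon's method: comparing sets of clusters in MFCC space requires solving a general transportation problem, which is not shown here. The function name `emd_1d` is hypothetical.

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two histograms on the same bins.

    Both histograms are normalized to unit mass; the result is expressed
    in units of bin widths.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")
    carried = 0.0  # mass that still has to be moved to the next bin
    work = 0.0     # accumulated mass times distance moved
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        work += abs(carried)
    return work


# All mass moves two bins to the right in the first case, one bin in the second
far = emd_1d([1, 0, 0], [0, 0, 1])
near = emd_1d([1, 0, 0], [0, 1, 0])
```

The distance grows with how far the mass has to travel, which is what makes it a natural way to compare two distributions of clusters.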

Conclusion

Bibliography
• J. Aucouturier, F. Pachet, and M. Sandler. 2004. “The way it sounds”: Timbre models for analysis and retrieval of music signals. IEEE Transactions on Multimedia.
• J. Aucouturier and F. Pachet. 2004. Improving timbre similarity: How high’s the sky? Proceedings of the International Conference on Music Information Retrieval.
• J. Aucouturier and F. Pachet. 2002. Music similarity measures: What’s the use? Proceedings of the International Conference on Music Information Retrieval.
• C. Liu and C. Huang. 2002. A singer identification technique for content-based classification of MP3 music objects. Proceedings of the Conference on Information and Knowledge Management.
• B. Logan and A. Salomon. 2001. A music similarity function based on signal analysis. Proceedings of the International Conference on Multimedia and Expo.