Music Information Retrieval - KTH
Introduction to Music Information Retrieval
Markus Schedl
KTH Stockholm, Sweden; March 2013
Music Information Retrieval
Markus Schedl
Department of Computational Perception
Johannes Kepler University (JKU)
Linz, Austria
Overview
Introduction to and Applications of Music Information Retrieval
Perceptual Music Features
Context- and Web-based Methods
Summary and Future Directions
What is Music Information Retrieval?
“MIR is a multidisciplinary research endeavor that strives to develop innovative
content-based searching schemes, novel interfaces, and evolving networked
delivery mechanisms in an effort to make the world’s vast store of music accessible to
all.”
[Downie, 2004]
“...actions, methods and procedures for recovering stored data to provide information
on music.”
[Fingerhut, 2004]
“MIR is concerned with the extraction, analysis, and usage of information about any
kind of music entity (for example, a song or a music artist) on any representation
level (for example, audio signal, symbolic MIDI representation of a piece of music, or
name of a music artist).”
[Schedl, 2008]
Selected Tasks and Challenges
1. feature extraction (audio-based vs. context-based approaches)
2. music similarity measurement (e.g. for retrieval and recommendation)
3. music identification via audio fingerprinting (e.g. Shazam or SoundHound)
4. music recommendation, automated playlist generation
(e.g. Last.fm, Pandora, EchoNest)
5. clustering, visualization, intelligent user interfaces to music collections
6. classification (e.g., genre, instruments, moods) and music auto-tagging
7. speech/music discrimination
8. structural analysis (segmentation, summarization, audio-to-lyrics
alignment, score following)
Music Retrieval Schemes
Browsing
Motivation:
- music collections are becoming larger and larger (on PCs as well as on mobile players)
- most UIs of music players still only allow organization and searching by textual properties, according to the scheme (genre-)artist-album-track
→ novel and innovative strategies to access music are sought in MIR
„intelligent iPod“ by CP.JKU, 2006
Browsing
„Mapping Music in the Palm of Your Hand“ by van Gulik et al., 2004
Browsing
„PlaySOM“, IFS, TU Wien, 2005
Browsing
„nepTune“, CP.JKU, 2007
Browsing
„Musicream“ by Goto and Goto, 2005
Browsing
„MusicTweetMap“ by Hauger and Schedl, 2012
Direct Querying
Query equals the feature representation, e.g. a string representation of music (pitch, interval, contour, etc.)
Example: themefinder.org
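A minimal sketch of direct querying on string representations. It assumes melodies are given as MIDI pitch numbers and uses exact substring search. The function names and the toy collection are hypothetical, and this is not how themefinder.org is implemented.

```python
def intervals(pitches):
    """Semitone differences between consecutive MIDI pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def contour(pitches):
    """Coarse contour string: U (up), D (down), S (same) for each step."""
    return "".join(
        "U" if step > 0 else "D" if step < 0 else "S"
        for step in intervals(pitches)
    )


def find_by_contour(query_pitches, collection):
    """Return titles whose contour string contains the query's contour."""
    query = contour(query_pitches)
    return [title for title, pitches in collection.items()
            if query in contour(pitches)]


# Hypothetical toy collection of melody openings as MIDI pitches.
collection = {
    "Ode to Joy (opening)": [64, 64, 65, 67, 67, 65, 64, 62],
    "Twinkle Twinkle (opening)": [60, 60, 67, 67, 69, 69, 67],
}

# The query is transposed up a fourth; the contour string does not change.
matches = find_by_contour([69, 69, 70, 72], collection)
```

The contour string discards exact pitches and interval sizes, so a transposed or slightly mistuned query can still match, at the cost of more false positives than an interval string would give.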
Query by Example
query by example
Query: excerpt of song
Aim: find actual song (meta-data)
Challenges: usually bad quality, background noise, etc.
Example: www.shazam.com
query by humming
Query: song excerpt hummed by user („lalala“, „nanana“)
Aim: find actual song (e.g. via pitch contours/changes – up/down/same)
Challenges: large collections, poor performance quality
Example: MelodieSuchmaschine, SoundHound
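A rough sketch of query by humming via up/down/same contours. Hummed input is imprecise, so the sketch scores each song by the smallest edit distance between the query contour and any substring of the song's contour. The helper names and toy data are hypothetical, and real systems such as SoundHound are not claimed to work this way.

```python
def contour(pitches):
    """Up/down/same string for a sequence of MIDI pitches."""
    return "".join(
        "U" if b > a else "D" if b < a else "S"
        for a, b in zip(pitches, pitches[1:])
    )


def best_match_cost(query, text):
    """Minimum edit distance between query and any substring of text."""
    prev = [0] * (len(text) + 1)      # a match may start anywhere in text
    for i, qc in enumerate(query, start=1):
        cur = [i]                     # query prefix against an empty substring
        for j, tc in enumerate(text, start=1):
            cur.append(min(
                prev[j] + 1,              # drop a query symbol
                cur[j - 1] + 1,           # skip a text symbol
                prev[j - 1] + (qc != tc)  # match or substitute
            ))
        prev = cur
    return min(prev)                  # a match may end anywhere in text


def rank(hummed_pitches, collection):
    """Sort songs by how well the hummed contour fits, best first."""
    query = contour(hummed_pitches)
    return sorted(
        (best_match_cost(query, contour(pitches)), title)
        for title, pitches in collection.items()
    )


# Hypothetical toy collection of melody openings as MIDI pitches.
collection = {
    "Ode to Joy (opening)": [64, 64, 65, 67, 67, 65, 64, 62],
    "Twinkle Twinkle (opening)": [60, 60, 67, 67, 69, 69, 67],
}

# Hummed in a different key, with one repeated note too many.
ranking = rank([60, 60, 62, 64, 64, 64, 62, 60], collection)
```

The quadratic cost per song is why large collections are listed as a challenge: a linear scan with this matcher does not scale, and practical systems need indexing on top.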
Feature Extraction for Music Retrieval
To create all these fancy applications, we need musical features that relate to how we perceive music, i.e. a (simplified) representation of the music items.
Feature categories:
- content-based, audio-based, signal-based: energy, pitch, beat, rhythm, harmony, timbre, melody, etc.
  “Content-based Music Retrieval and Access: An Overview” (22-03-2013, 10.00-12.00)
- context-based, community-based, web-based, cultural features: editorial meta-data, collaborative tags, web pages, microblogs, lyrics, playlist information, etc.
  Now.
Perceptual Music Features
Aspects of music perception:
music content. Examples:
- rhythm
- timbre
- melody
- harmony
- loudness
music context. Examples:
- semantic labels
- song lyrics
- album cover artwork
- artist's background
- music video clips
user context. Examples:
- mood
- activities
- social context
- spatio-temporal context
- physiological aspects
user properties. Examples:
- music preferences
- musical training
- musical experience
- demographics
Context- and Web-based Features
Data sources:
• lists of purchased music from (online) music stores [Ellis et al., 2002], [Whitman and Lawrence, 2002]
• music collections made available via music sharing services [Ellis et al., 2002], [Whitman and Lawrence, 2002]
• playlists of radio stations and compilation CDs [Pachet et al., 2001]
• music-related web pages [Cohen and Fan, 2000], [Whitman and Lawrence, 2002], [Knees et al., 2004], [Schedl et al., 2011]
• RSS feeds [Celma et al., 2005]
• user tags, especially from music information systems (e.g. Last.fm) [Pohle et al., 2007], [Geleijnse et al., 2007]
• microblogs [Schedl, 2012], [Schedl, 2013]
Why Context-based MIR?
Audio similarity may find that these two sound similar:
Foxboro Hot Tubs
“Ruby Room”
The Staggers
“Little Boy Blue”
But, for example, it won’t tell you that…
• “Foxboro Hot Tubs” are better known as “Green Day”
• “The Staggers” are a band from Graz
Why Context-based MIR?
What do these songs have in common?
NOFX
“Idiot Son of an Asshole”
Eminem
“Mosh”
Answer:
Both are Anti-Bush protest songs.
Why Context-based MIR?
What do these artists have in common?
(Example borrowed from Lamere & Celma’s Music Recommendation Tutorial)
Ravi Shankar and Norah Jones
Answer:
Half of their DNA. Norah Jones is Ravi Shankar’s daughter.
Why Context-based MIR?
What do these songs have in common?
Antonio Carlos Jobim
“Insensatez”
Rammstein
“Rammstein”
Answer:
Both were featured on the soundtrack of David Lynch’s movie “Lost Highway”.
Why Context-based MIR?
There is a lot of perceptually relevant information that is not
encoded in the audio signal, or cannot be extracted from it.
Context-based Similarity Measurement
Text-IR methods:
use the vector space model, TF-IDF weighting, cosine similarity, etc.
Example: microblogs
Co-occurrence analysis:
music items frequently co-occurring in “virtual documents” are
considered similar
Example: web pages, shared folders in P2P networks
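The text-IR route can be sketched in a few lines. This assumes each artist is described by one list of terms and uses plain TF-IDF weighting with cosine similarity. The artist names and term lists are invented for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs maps a name to a list of terms; returns sparse TF-IDF vectors."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for terms in docs.values() for term in set(terms))
    vectors = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        vectors[name] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical term lists, e.g. collected from texts mentioning each artist.
docs = {
    "artist_a": "metal guitar loud metal tour".split(),
    "artist_b": "metal guitar riff tour".split(),
    "artist_c": "piano sonata classical tour".split(),
}
vectors = tfidf_vectors(docs)
sim_ab = cosine(vectors["artist_a"], vectors["artist_b"])
sim_ac = cosine(vectors["artist_a"], vectors["artist_c"])
```

The term "tour" occurs in every document, so its IDF is zero and it contributes nothing to any similarity. That is the intended effect of the weighting.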
Text-IR: Microblogs
Query microblogs by artist name (+music): „Lady Gaga“, „Mozart“, „Alcest“, …
→ artist term profiles
→ similarity estimate between artist A and artist B
[Schedl, 2012] Information Retrieval 15:3-4, June 2012.
Text-IR: Microblogs
Investigating different aspects in modeling artist term profiles from microblogs:
- term frequency
- inverse document frequency
- virtual document modeling: concatenate all tweets of the artist or perform aggregation via mean, max, etc.
- normalization with respect to document length
- similarity measure
- index term set
- query scheme
Implemented in our CoMIRVA framework, available from http://www.cp.jku.at/comirva
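The two virtual document modeling options can be contrasted on a toy example. The sketch assumes whitespace tokenization and raw term counts. The function names and tweets are hypothetical and do not reflect the CoMIRVA API.

```python
from collections import Counter


def concatenated_profile(tweets):
    """Option 1: treat all tweets of an artist as one virtual document."""
    return Counter(term for tweet in tweets for term in tweet.lower().split())


def aggregated_profile(tweets, how="mean"):
    """Option 2: count terms per tweet, then aggregate across tweets."""
    per_tweet = [Counter(tweet.lower().split()) for tweet in tweets]
    vocabulary = set().union(*per_tweet) if per_tweet else set()
    if how == "mean":
        return {t: sum(c[t] for c in per_tweet) / len(per_tweet)
                for t in vocabulary}
    if how == "max":
        return {t: max(c[t] for c in per_tweet) for t in vocabulary}
    raise ValueError("how must be 'mean' or 'max'")


# Hypothetical tweets retrieved for one artist.
tweets = [
    "listening to mozart piano concerto",
    "mozart piano piano",
    "mozart live",
]
concatenated = concatenated_profile(tweets)
mean_profile = aggregated_profile(tweets, "mean")
max_profile = aggregated_profile(tweets, "max")
```

Concatenation lets artists with many tweets accumulate large counts, whereas mean and max keep the profile on a per-tweet scale. This is one reason the choice interacts with document length normalization.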
Text-IR: Microblogs
- query scheme: use “artist name”
- term frequency and inverse document frequency: use logarithmic formulations
- index term set: a music-specific dictionary is favorable
- similarity measure: don’t use Euclidean distance; use Jeffrey divergence or inner product
- no document length normalization
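A sketch of what these recommendations might look like in code. The slide does not give the exact formulas, so the logarithmic TF and IDF variants and the Jeffrey divergence definition below are common textbook forms chosen as assumptions, not necessarily those evaluated in [Schedl, 2012].

```python
import math


def log_tf(count):
    """Logarithmic term frequency (assumed form: 1 + ln(count))."""
    return 1.0 + math.log(count) if count > 0 else 0.0


def log_idf(n_docs, doc_freq):
    """Logarithmic inverse document frequency (assumed form: ln(N / df))."""
    return math.log(n_docs / doc_freq)


def inner_product(u, v):
    """Inner product of two sparse term profiles (higher is more similar)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def jeffrey_divergence(u, v):
    """Symmetric divergence of two non-negative profiles (lower is closer).

    Both profiles are first scaled to sum to 1, then each is compared
    with their average m: sum of p*ln(p/m) + q*ln(q/m) over all terms.
    """
    sum_u, sum_v = sum(u.values()), sum(v.values())
    if sum_u <= 0 or sum_v <= 0:
        raise ValueError("profiles must contain positive weights")
    total = 0.0
    for term in set(u) | set(v):
        p = u.get(term, 0.0) / sum_u
        q = v.get(term, 0.0) / sum_v
        m = (p + q) / 2.0
        if p > 0:
            total += p * math.log(p / m)
        if q > 0:
            total += q * math.log(q / m)
    return total
```

Note that the inner product is a similarity while the Jeffrey divergence is a dissimilarity, so rankings built from them sort in opposite directions.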
Co-occurrence Analysis: Web Pages
Pipeline: query for artist name +music → 100 top-ranked URLs → retrieve Web pages → indexing → calculate DFs → (co-occurrence) page counts

Top-ranked URLs for Alice Cooper:
http://www.geocities.com/sfloman/alicecooperband.html
http://music.yahoo.com/ar-307112-reviews--Alice-Cooper
http://music.yahoo.com/release/165446
http://www.popmatters.com/music/reviews/c/cooperalice-dirty.shtml
http://www.popmatters.com/music/reviews/c/cooperalice-billion.shtml
…

Top-ranked URLs for BB King:
http://www.amazon.com/exec/obidos/tg/detail/-/B000AA4M9U?v=glance
http://www.amazon.com/exec/obidos/tg/detail/-/B00004THAY?v=glance
http://www.rollingstone.com/artists/4610/reviews
http://www.rollingstone.com/artists/4610/albums/album/7600591
http://www.popmatters.com/music/reviews/k/kingbb-anthology.shtml
…

Retrieved page in which another artist name occurs:
<html> … Metallica … </html>

Artist names used for indexing:
„Alice Cooper“, „BB King“, „Beethoven“, „Prince“, „Metallica“, …

(co-occurrence) page counts:
99 3 5 4
0 91 27 2
13 8 96 19
0 1 12 84
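One way to turn such page counts into a similarity estimate is sketched below with the matrix from the slide. The reading of the matrix (row i holds counts for pages retrieved for artist i, the diagonal is the artist's own page count) and the symmetrization by averaging are assumptions. The slide does not label the rows, so artists are referred to by index only.

```python
# Page-count matrix as shown on the slide; rows and columns are artists.
page_counts = [
    [99, 3, 5, 4],
    [0, 91, 27, 2],
    [13, 8, 96, 19],
    [0, 1, 12, 84],
]


def cooccurrence_similarity(counts, i, j):
    """Average of the two relative co-occurrence frequencies.

    counts[i][j] / counts[i][i] is the share of artist i's pages that
    also mention artist j. Averaging both directions makes it symmetric.
    """
    return 0.5 * (counts[i][j] / counts[i][i] + counts[j][i] / counts[j][j])


def most_similar(counts, i):
    """Index of the artist with the highest similarity to artist i."""
    others = [j for j in range(len(counts)) if j != i]
    return max(others, key=lambda j: cooccurrence_similarity(counts, i, j))


nearest_to_1 = most_similar(page_counts, 1)
```

Dividing by the diagonal normalizes for how many pages were found per artist, which reduces the effect of popular artists but does not remove it.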
Co-occurrence Analysis: P2P Networks
[Shavitt and Weinsberg, 2009] Proc. IEEE ISM: AdMIRe
Approach:
Meta-data of shared files in Gnutella P2P network gathered in November 2007
(.mp3 and .wav): 530,000 songs shared by 1.2 million users
Co-occurrence-based distance measure on the song level, which corrects for
popularity bias.
uc(Si, Sj): number of users that share both songs Si and Sj
Ci, Cj: popularity of Si and Sj, measured as total number of occurrences
Evaluation in a music recommendation setting:
30% of songs in each user collection used to predict remaining 70%
about 12% precision @ 13% recall
heavy inconsistencies in meta-data (ID3 tags)
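The distance formula itself is not reproduced in this transcript. The sketch below therefore uses one plausible popularity-normalized form built from the listed symbols, -log2(uc(Si, Sj) / sqrt(Ci * Cj)). This is an assumption and not necessarily the measure of [Shavitt and Weinsberg, 2009]. The user collections are invented, and the precision/recall helper only illustrates the evaluation setting.

```python
import math
from collections import Counter
from itertools import combinations


def song_distances(user_collections):
    """Distance per co-shared song pair; formula is an assumed form."""
    # C_i: number of users sharing song i (its total occurrences).
    popularity = Counter(
        song for songs in user_collections.values() for song in set(songs)
    )
    # uc(S_i, S_j): number of users sharing both songs.
    shared = Counter()
    for songs in user_collections.values():
        for pair in combinations(sorted(set(songs)), 2):
            shared[pair] += 1
    return {
        (a, b): -math.log2(uc / math.sqrt(popularity[a] * popularity[b]))
        for (a, b), uc in shared.items()
    }


def precision_recall(recommended, held_out):
    """Precision and recall of a recommendation list against hidden songs."""
    hits = len(set(recommended) & set(held_out))
    return hits / len(recommended), hits / len(held_out)


# Hypothetical shared folders: "hit" is popular, "a" and "b" always co-occur.
users = {
    "u1": ["hit", "a", "b"],
    "u2": ["hit", "a", "b"],
    "u3": ["hit", "c"],
    "u4": ["hit", "d"],
}
distances = song_distances(users)
```

In the toy data, "a" and "hit" are shared together as often as "a" and "b", yet "a" ends up closer to "b" because the popularity of "hit" is divided out.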
Challenges for Context-based Methods
Data Sparsity
depending on the data source, there might be no data available
(especially in the "long tail")
Popularity Bias
disproportionately more information is available for popular artists than for
lesser known ones, which can easily distort similarity estimation
Community / Population Bias
only participants of the community under consideration are taken into
account (e.g. P2P, Last.fm, Twitter); users of such communities do not
represent the average music listener
Summary and Future Directions
• Music Information Retrieval is a broad and diverse field
• Various approaches to extract information directly from the audio signal
• Many data sources and approaches to extract contextual data and similarity
information from the web
• Multi-modal retrieval is promising and allows for exciting applications
Some open challenges:
• understand how low-level features relate to human perception of music
• model user context
• personalized and user-aware music retrieval and recommendation
(emotion, location, social context, activity, etc.)
Project “Personalized Music Retrieval via Music Content, Music Context, and User Context”
http://www.cp.jku.at/research/projects/P22856-N23/project.html
Tack! (Thank you!)