1 Music Representation, Searching, and Retrieval (a.k.a. Organization of and Searching in Musical...

1 Music Representation, Searching, and Retrieval (a.k.a. Organization of and Searching in Musical Information) Donald Byrd School of Informatics & School of Music Indiana University 12 January 2008



Music Representation, Searching, and Retrieval

(a.k.a. Organization of and Searching in Musical Information)

Donald Byrd

School of Informatics & School of Music

Indiana University

12 January 2008


rev. Jan. 2006 2

Overview

1. Introduction and Motivation

2. Basic Representations

3. Why is Musical Information Hard to Handle?

4. Music vs. Text and Other Media

5. OMRAS and Other Projects

6. Summary


22 Nov. 07 3

Classification: Logician General’s Warning

• Classification is dangerous to your understanding
  – Almost everything in the real world is messy, ill-defined
  – Absolute correlations between characteristics are rare
    • Example: some mammals lay eggs; some are “naked”
    • Example: was the first real piano Cristofori's (ca. 1700), Broadwood's (ca. 1790), or another?
  – People say “an X has characteristics A, B, C…”
  – Usually mean “an X has A, & usually B, C…”
  – Leads to:
    • People who know better claiming absolute correlations
    • “Is it this or that or that?” questions that don’t have an answer
    • Don changing his mind
• But lack of classification is dangerous to understanding!
• So should we abandon hierarchic classifications?
  – Of course not; they're much too useful
  – Just need to be on guard for misleading things in classifications


11 Jan. 08 4

1. Introduction and Motivation (1)

• The Fundamental Theorem of Music Informatics (maybe)
  – Music is created by humans for other humans
    • Humans can bring a tremendous amount of contextual knowledge to bear
    • In fact, they can't avoid it, and they're rarely conscious of it
  – But (as of early 2008) computers can never bring much contextual knowledge to bear, often none, & never without being specifically programmed to
  – => doing almost anything with music by computers is very difficult; many problems essentially intractable
  – For the foreseeable future, only way to make significant progress is by doing as well as possible with little context, sidestepping intractable problems

• Not a theorem (I recently made it up), but important


rev. 11 Jan. 08 5

1. Introduction and Motivation (2)

• Three basic forms (representations) of music are important
  – Audio: most important for most people (general public)
    • All Music Guide (www.allmusicguide.com) has info on >>230,000 CD’s
  – MIDI files: often best or essential for some musicians, especially for pop, rock, film/TV
    • Hundreds of thousands of MIDI files on the Web
  – CWMN (Conventional Western Music Notation): often best or essential for musicians (even amateurs) & music researchers
    • Music holdings of Library of Congress: over 10M items
      – Most is notation, especially CWMN, not audio
      – Includes over 6M pieces of sheet music and tens/hundreds of thousands of scores of operas, symphonies, etc.
• Differences among the forms are profound
  – NB: all statistics above are several years old


27 Jan. 6

2. Basic Representations of Music & Audio (1)

Audio (e.g., CD, MP3): like speech

Time-stamped Events (e.g., MIDI file): like unformatted text

Music Notation: like text with complex formatting

Digital Audio

Time-stamped Events

Music Notation


27 Jan. 7

Basic Representations of Music & Audio (2)

                     Audio                    Time-stamped Events      Music Notation
Common examples      CD, MP3 file             Standard MIDI File       Sheet music
Unit                 Sample                   Event                    Note, clef, lyric, etc.
Explicit structure   none                     little (partial voicing  much (complete voicing
                                              information)             information)
Avg. rel. storage    2000                     1                        10
Convert to left      -                        easy                     OK job: easy
Convert to right     1 note: pretty easy;     OK job: fairly hard      -
                     other: hard or very hard
Ideal for            music, bird/animal       music                    music
                     sounds, sound effects,
                     speech
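The "Avg. rel. storage" figures (roughly 2000 : 1 for audio vs. time-stamped events) can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope estimate: the CD parameters are the standard ones, but the note density (10 notes per second) and the cost of about 8 bytes per note in a MIDI file (a 3-byte note-on, a 3-byte note-off, and two 1-byte delta-times) are assumptions chosen for illustration, not figures from the slides.

```python
# Back-of-envelope storage comparison: CD audio vs. MIDI-style events.
CD_SAMPLE_RATE = 44_100      # samples per second per channel
CD_BYTES_PER_SAMPLE = 2      # 16-bit samples
CD_CHANNELS = 2              # stereo


def cd_audio_bytes_per_second():
    """Uncompressed CD audio data rate in bytes per second."""
    return CD_SAMPLE_RATE * CD_BYTES_PER_SAMPLE * CD_CHANNELS


def midi_bytes_per_second(notes_per_second, bytes_per_note=8):
    """Rough MIDI data rate; both parameters are illustrative assumptions."""
    return notes_per_second * bytes_per_note


audio_rate = cd_audio_bytes_per_second()      # 176400 bytes/s
event_rate = midi_bytes_per_second(10)        # 80 bytes/s
ratio = audio_rate / event_rate               # 2205.0

print(f"audio: {audio_rate} B/s, events: {event_rate} B/s, ratio: {ratio:.0f}")
```

With denser or sparser music the ratio moves accordingly, so the table's 2000 is best read as an order of magnitude.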


2 Oct. 07 8

Basic Representations of Music & Audio (3)

Notation: very complex structure

Audio: no (explicit) structure

Events/MIDI: simple structure


The Four Parameters of Notes

• Four basic parameters of a definite-pitched musical note

1. pitch: how high or low the sound is: perceptual analog of frequency

2. duration: how long the note lasts

3. loudness: perceptual analog of amplitude

4. timbre or tone quality

• Above is decreasing order of importance for most Western music

• …and decreasing order of explicitness in CWMN!
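As a rough illustration of the four parameters, a note can be modeled as a small record. This is a minimal sketch under stated assumptions: the class name `Note`, the use of MIDI note numbers for pitch, quarter-note units for duration, and MIDI-style velocity for loudness are all illustrative choices, not something the slides prescribe. The pitch-to-frequency formula assumes twelve-tone equal temperament with A4 = 440 Hz.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """Hypothetical record holding the four basic parameters of a note."""
    pitch: int        # MIDI note number (69 = A above middle C); assumed encoding
    duration: float   # length in quarter notes; assumed unit
    loudness: int     # MIDI-style velocity, 1-127; assumed scale
    timbre: str       # instrument or playing-style label, e.g. "violin"


def frequency_hz(pitch, a4=440.0):
    """Equal-tempered frequency for a MIDI note number (A4 = note 69)."""
    return a4 * 2 ** ((pitch - 69) / 12)


note = Note(pitch=69, duration=1.0, loudness=80, timbre="violin")
print(note, frequency_hz(note.pitch))   # frequency is 440.0 Hz
```

Pitch and duration map onto numbers easily, while timbre ends up as a free-text label, which mirrors the decreasing explicitness noted above.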


rev. 11 Jan. 08 10

How to Read Music Without Really Trying

• CWMN shows at least six aspects of music:
  – NP1. Pitches (how high or low): on vertical axis
  – NP2. Durations (how long): indicated by note/rest shapes
  – NP3. Loudness: indicated by signs like p, mf, etc.
  – NP4. Timbre (tone quality): indicated with words like “violin”, “pizzicato”, etc.
  – Start times: on horizontal axis
  – Voicing: mostly indicated by staff; in complex cases also shown by stem direction, beams, etc.
• See “Essentials of Music Reading” musical example
  – At the other extreme, see the “Gallery of Interesting Music Notation”!
    • http://www.informatics.indiana.edu/donbyrd/InterestingMusicNotation.html


How People Find Text Information

[Diagram: the Query and the Database are each turned, by understanding, into Query concepts and Database concepts; matching the two sets of concepts produces the Results.]

• What user wants is almost always concepts…

• But computer can only recognize words


How Computers Find Text Information

[Diagram: the Query and the Database are each processed with no understanding, using stemming, stopping, query expansion, etc.; matching the processed forms produces the Results.]

• “Stemming, stopping, query expansion” are all tricks to increase precision & recall (avoid false positives & false negatives) caused by synonyms, variant forms of words, etc.
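To make the three tricks concrete, here is a toy sketch of word matching with stopping, a deliberately naive suffix-stripping stemmer, and synonym-based query expansion, followed by a precision/recall calculation. Everything in it (the stopword list, the suffix rules, the synonym table, the three documents) is made up for illustration; real text-IR systems use far more careful versions of each step.

```python
STOPWORDS = {"the", "a", "of", "in", "is", "and", "for"}   # toy stop list
SYNONYMS = {"tune": {"melody"}, "melody": {"tune"}}        # toy thesaurus
SUFFIXES = ("ing", "s")                                    # naive stemming rules


def stem(word):
    """Strip one known suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase, drop stopwords (stopping), and stem what remains."""
    return {stem(w) for w in text.lower().split() if w not in STOPWORDS}


def expand(terms):
    """Query expansion: add stemmed synonyms of each query term."""
    expanded = set(terms)
    for term in terms:
        expanded |= {stem(s) for s in SYNONYMS.get(term, ())}
    return expanded


def search(query, docs, expand_query=True):
    """Return ids of documents sharing at least one term with the query."""
    terms = index_terms(query)
    if expand_query:
        terms = expand(terms)
    return {doc_id for doc_id, text in docs.items() if terms & index_terms(text)}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


DOCS = {
    1: "searching for tunes in the database",
    2: "the melody of the song",
    3: "the history of the piano",
}
RELEVANT = {1, 2}   # assumed relevance judgments for the query "tune"

plain = search("tune", DOCS, expand_query=False)   # {1}: stemming matches "tunes"
full = search("tune", DOCS)                        # {1, 2}: expansion adds "melody"
print(precision_recall(plain, RELEVANT))           # (1.0, 0.5)
print(precision_recall(full, RELEVANT))            # (1.0, 1.0)
```

Note that nothing here involves understanding: document 2 is found only because someone listed "melody" as a synonym in advance.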


rev. 12 Jan. 08 13

3. Why is Musical Information Hard to Handle?

1. Units of meaning: not clear there are any—assuming music even has meaning! (all representations)

2. Polyphony: “parallel” independent voices, something like characters in a play (all representations)

3. Recognizing notes (audio only)

4. Other reasons
  – Musician-friendly I/O is difficult
  – Diversity: of styles of music, of people interested in music

• Music is an art!
• Cf. Byrd & Crawford (2002)
• But what is the information, in the first place?


rev. Jan. 2007 14

Units of Meaning (Problem 1)

• Handling text information nearly always via words
  – “What we want is concepts; what we have is words”

• Not clear anything in music is analogous to words
  – No explicit delimiters (like Chinese)

– Experts don’t agree on “word” boundaries (unlike Chinese)

– Music is always art => “meaning” much more subtle!

• Are notes like words?

• No. Relative, not absolute, pitch is important

• Are pitch intervals like words?

• No. They’re too low level: more like characters


rev. Jan. 07 15

Units of Meaning (Problem 1)

• Are pitch intervals like words?
  • No. They’re too low level: more like characters
• Are pitch-interval sequences like words?
  • In some ways, but
    – Ignores rhythm
    – Ignores relationships between voices (harmony)
    – Probably little correlation with semantics
• Are chords like words? (Christy Keele)
  – If so, chord progressions may be like sentences
  – In some ways, but ignores melody & rhythm, most relevant for tonal music, etc.

• Anyway, in much music, pitch isn’t important, and/or notes aren’t important!
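The points about relative pitch and pitch-interval sequences can be seen in a few lines of code. This is a minimal sketch: pitches are assumed to be MIDI note numbers and melodies are assumed to be monophonic lists of (pitch, duration) pairs, both illustrative choices. It shows that an interval sequence survives transposition, which is why it is more useful than absolute pitches, and that it is blind to rhythm, which is one of the weaknesses listed above.

```python
def intervals(pitches):
    """Signed pitch intervals, in semitones, between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


# The same four-note figure with two different rhythms (durations in beats).
melody_even = [(60, 1.0), (62, 1.0), (64, 1.0), (60, 1.0)]
melody_dotted = [(60, 1.5), (62, 0.5), (64, 1.5), (60, 0.5)]

pitches = [p for p, _ in melody_even]
transposed = [p + 7 for p in pitches]            # up a perfect fifth

print(intervals(pitches))                        # [2, 2, -4]
print(intervals(transposed))                     # [2, 2, -4]: transposition-invariant
print(pitches == transposed)                     # False: absolute pitches differ
print(intervals([p for p, _ in melody_dotted]))  # [2, 2, -4]: rhythm is ignored
```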


Independent Voices in Music (Problem 2)

J.S. Bach: “St. Anne” Fugue, beginning


Independent Voices in Text

MARLENE. What I fancy is a rare steak. Gret?

ISABELLA. I am of course a member of the / Church of England.*

GRET. Potatoes.

MARLENE. *I haven’t been to church for years. / I like Christmas carols.

ISABELLA. Good works matter more than church attendance.

--Caryl Churchill: “Top Girls” (1982), Act 1, Scene 1

Performance (time goes from left to right):

M: What I fancy is a rare steak. Gret? I haven’t been...

I: I am of course a member of the Church of England.

G: Potatoes.


Music Notation vs. Audio

• Relationship between notation and its sound is very subtle

• Not at all one symbol <=> one symbol
  – Notes w/ornaments (trills, etc.) are one => many
  – All symbols but notes are one => zero!
  – Bach F-major Toccata example

• Style-dependent
  – Swing (jazz), dotting (baroque art music)
  – Improvisation (baroque art music, jazz)
  – “Events” (20th-century art music)
  – How well-defined is style-dependent

• Interpretation is difficult even for musicians
  – Can take 50-90% of lesson time for performance students
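Two of the notation-to-sound mismatches above, swing and ornaments, can be sketched in code. The sketch is heavily simplified and every numeric choice is an assumption: the 2:1 long-short swing ratio is only a common textbook approximation, and the trill realization (starting on the written note, alternating with the note two semitones above, in fixed 32nd-note steps) is one of many possibilities. Real practice varies with style, tempo, and performer, which is exactly the point of the slide.

```python
import math


def swing_onset(onset, ratio=2.0):
    """Move an off-beat eighth (x.5 in beats) later; the ratio is an assumption."""
    beat = math.floor(onset)
    if abs((onset - beat) - 0.5) < 1e-9:
        return beat + ratio / (ratio + 1.0)
    return onset


def expand_trill(pitch, start, duration, step=2, note_len=0.125):
    """One notated trill => many sounding notes as (onset, pitch) pairs."""
    count = int(round(duration / note_len))
    return [(start + i * note_len, pitch + (step if i % 2 else 0))
            for i in range(count)]


notated = [0.0, 0.5, 1.0, 1.5]                      # straight eighths, in beats
performed = [swing_onset(t) for t in notated]       # off-beats land at 2/3 of the beat
trill = expand_trill(pitch=72, start=0.0, duration=0.5)

print(performed)
print(trill)   # [(0.0, 72), (0.125, 74), (0.25, 72), (0.375, 74)]
```

Going the other way, from the performed onsets back to "straight eighths" or from four short notes back to one trilled note, requires knowing the style, which is part of why converting audio or events to notation is hard.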


Music Perception and Music IR

• Salience is affected by texture, loudness, etc.
  – Inner voices in orchestral music rarely salient

• Streaming effects and cross-voice matching
  – produced by timbre: Wessel’s illusion (Ex. 1, 2)
  – produced by register: Telemann example (Ex. 3)

• Octave identities, timbre and texture
  – Beethoven “Hammerklavier” Sonata example (Ex. 4, 5)
  – Affects pitch-interval matching
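One way octave identity can affect pitch-interval matching is sketched below. It is an illustration only, not a description of the Hammerklavier example: pitches are assumed to be MIDI note numbers, and the "octave-displaced" version of the figure is invented. Exact interval matching fails once notes are moved by octaves, while matching intervals modulo 12 treats the two versions as the same.

```python
def intervals(pitches):
    """Signed intervals in semitones between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_classes(pitches):
    """Intervals reduced modulo the octave (0-11), ignoring register."""
    return [(b - a) % 12 for a, b in zip(pitches, pitches[1:])]


figure = [60, 64, 67]        # C4, E4, G4
displaced = [60, 76, 55]     # C4, E5, G3: same pitch classes, other octaves

print(intervals(figure), intervals(displaced))                # [4, 3] vs. [16, -21]
print(interval_classes(figure), interval_classes(displaced))  # [4, 3] vs. [4, 3]
```

The modulo-12 version finds more matches, but it also discards contour, so whether it helps depends on what listeners actually hear as "the same".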


4. Music vs. Text and Other Media

                  -------- Explicit Structure --------
                  least                  medium                 most                   Salience increasers
Music             audio                  events                 notation               loud; thin texture
Text              audio (speech)         ordinary written text  text with markup       “headlining”: large, bold, etc.
Images            photo, bitmap          PostScript             drawing-program file   bright color
Video             videotape w/o sound    MPEG?                  Premiere file          motion, etc.
Biological data   DNA sequences,         MEDLINE abstracts      ??
                  3D protein structures


Features of Music: Text Analogies

• Simultaneous independent voices and texture
  • Analogy in text: characters in a play

• Chords within a voice
  • Analogy in text: character in a play writing something visible to the audience while saying something different out loud

• Rhythm
  • Analogy in text: rhythm in poetry

• Notes and intervals
  • Note pitches rarely important
  • Intervals more significant, but still very low-level
  • Analogy in text: interval = (very roughly!) letter, not word


Features of Text: Music Analogies

• Words
  • Analogy in music: for practical purposes, none

• Sentences
  • Analogy in music: phrases (but much less explicit)

• Paragraphs
  • Analogy in music: sections of a movement (but less explicit)

• Chapters
  • Analogy in music: movements


Jan. 07 23

Course Overview

• II. Organization of Musical Information (music representation)
  – “What we want is concepts; what we have is words”
  – Audio, MIDI, notation

• III. Finding Musical Information
  – A Similarity Scale for Content-Based Music IR

• IV. Musical Similarity and Finding Music by Content

• V. Finding music via Metadata
  – Digital music libraries (Variations2), iTunes, etc.
  – Music recommender systems


11 Jan. 08 24

1. Programming in R: No Problem!

• R is very interactive: can use as powerful calculator
• Assignments will be fairly simple (though not for MusInfo & CompSci grad students :-) )
• Much help available: from Don & other students
• Why R?
  – designed for statistics, but that’s NOT why!
  – easy to do simple things with it
  – easy to do many fairly complex things, incl. graphs & handling audio files
  – probably not good for really complex programs
  – free, & available for all popular operating systems
  – very interactive => easy to experiment
  – has good documentation
  – in use in other Music Informatics classes, & standardizing is good


3 Sep. 2006 25

1. Rudiments of R

• Originally for statistics, but useful for far more
• How to get R
  – Web site: http://cran.us.r-project.org/
  – Versions for Linux, Mac OS X, Windows
  – Already on STC Windows machines; will be in M373
• Tutorial:
  • http://xavier.informatics.indiana.edu/~craphael/teach/symbolic_music/
• Can use R interactively as a powerful graphing, musicing, etc. calculator
• …but it’s not perfect: sometimes very cryptic


11 Jan. 08 26

Music Recommender Systems

• Work by genre classification and/or collaborative filtering
• Major interest recently
• Best known include:
  – Pandora (cf. “Music Genome Project”)
  – Last.fm
  – MusicStrands
• Other interesting sites
  – Hype Machine (“for savants”?)
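For readers unfamiliar with collaborative filtering, the idea is to recommend what similar listeners liked, without looking at the music itself. The following is a minimal user-based sketch using cosine similarity over made-up ratings; the listener names, pieces, scores, and the `recommend` helper are all hypothetical, and this is not a description of how Pandora, Last.fm, or any other named service actually works.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, ratings, top_n=3):
    """Rank items the target has not rated, weighted by listener similarity."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == target:
            continue
        similarity = cosine(ratings[target], their_ratings)
        if similarity <= 0:
            continue                      # ignore listeners with nothing in common
        for item, rating in their_ratings.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


RATINGS = {   # made-up data: listener -> {piece: rating from 1 to 5}
    "ann": {"Bach fugue": 5, "Telemann sonata": 4},
    "bob": {"Bach fugue": 5, "Telemann sonata": 5, "Beethoven sonata": 4},
    "cat": {"Jazz standard": 5, "Film score": 4},
}

print(recommend("ann", RATINGS))   # ['Beethoven sonata']
print(recommend("cat", RATINGS))   # []: no overlapping listeners to learn from
```

The empty result for "cat" illustrates the cold-start weakness of pure collaborative filtering, one reason it is often combined with content-based or genre information.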


Typke’s MIR System Survey

• Rainer Typke’s “MIR Systems: A Survey of Music Information Retrieval Systems” lists many systems
  – http://mirsystems.info/
• Commercial system: Shazam
• Some research systems can be used over the Web, incl.:
  – C-Brahms
  – Meldex/Greenstone
  – Mu-seek
  – MusicSurfer
  – Musipedia/Tuneserver/Melodyhound
  – QBH at NYU
  – Themefinder


Machinery to Evaluate Music-IR Research

• Problem: how do we know if one system is really better than another, or an earlier version?
• Solution: standardized tasks, databases, evaluation
  – In use for speech recognition, text IR, question answering, etc.
  • Important example: TREC (Text Retrieval Conference)
• For music IR, we now have...
• IMIRSEL (International Music Information Retrieval Systems Evaluation Laboratory) project
  – http://www.music-ir.org/evaluation/
• MIREX (Music IR Evaluation eXchange) modeled on TREC
  – 2005: audio only
  – 2006, 2007: audio and symbolic


Collections (a.k.a. Databases) (1 of 2)

• Collections are improving, but very slowly
• For research: poor to fair
  – “Candidate Music IR Test Collections”
    • http://mypage.iu.edu/~donbyrd/MusicTestCollections.HTML
  – Representation “CMN” vs. CMN (assume Western)
• For practical use: pathetic (symbolic) to good (pop audio)
  – Most are commercial, especially audio
  – Very little free/public domain
  – …especially audio! (cf. RWC)

• IPR issues are a total mess


Collections (a.k.a. Databases) (2 of 2)

• Why is so little available?
  – Symbolic form: no efficient way to enter
  – Solution: OMR? AMR? research challenges
  – Music is an art!
  – Cf. “Searching CMN” slides: chicken & egg problem
  – IPR issues are a total mess


rev. Jan. 2006 31

6. Summary (1 of 2)

• Basic representations of music: audio, events, notation
  – Fundamental difference: amount of explicit structure
• Have very different characteristics => each is by far best for some users and/or applications
• Converting to reduce structure much easier than to add
• Music in all forms very hard to handle mostly because of:
  – Units of meaning problem
  – Polyphony
• Both problems are much less serious with text
• Huge range of searching tasks people want to do => very different techniques appropriate
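The claim that reducing structure is much easier than adding it can be illustrated by flattening a voiced, notation-like score into time-stamped events. This is a minimal sketch with assumed conventions: the score is a hypothetical dictionary mapping voice names to (onset, duration, pitch) notes, times are in beats, and pitches are MIDI note numbers. The conversion is a few lines; the reverse direction would have to guess which voice each event belongs to, because that information is simply gone.

```python
def to_events(voices):
    """Flatten per-voice notes into a time-ordered list of on/off events."""
    events = []
    for notes in voices.values():          # voice names are discarded here
        for onset, duration, pitch in notes:
            events.append((onset, "on", pitch))
            events.append((onset + duration, "off", pitch))
    # Sort by time; at equal times put note-offs before note-ons, then by pitch.
    return sorted(events, key=lambda e: (e[0], e[1] == "on", e[2]))


score = {   # hypothetical two-voice fragment: (onset, duration, pitch)
    "soprano": [(0.0, 1.0, 72), (1.0, 1.0, 74)],
    "bass": [(0.0, 2.0, 48)],
}

for event in to_events(score):
    print(event)
```

The output is a single stream of six events with no trace of "soprano" or "bass", which is the sense in which events carry only partial voicing information.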


rev. Jan. 2007 32

6. Summary (2 of 2)

• Projects include
  – Audio-based: via recognition of polyphonic music (OMRAS, query-by-humming, etc.)
  – CMN-based: monophonic query vs. polyphonic database (emphasis on UI) (OMRAS)
  – Style-genre identification from audio
  – Music recommender systems (Pandora, Last.fm, etc.)
  – Digital music libraries (Variations2, etc.; iTunes, sort of)
  – Creative applications: music IR for improvisation, etc.
• Machinery to evaluate research is coming along (MIREX)
• Collections
  – For research: poor to fair
  – For practical use: pathetic (symbolic) to good (pop audio)
  – Improving, but…
  – Serious problems with IPR as well as technology