Songsmith:

Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team

Microsoft Research

Using Machine Learning to Help People Make Music

Computational User Experiences (CUE) group

In general: HCI + (sensors, devices, machine learning, health, physiology)

Using physiological signals for input
Health and wellness
Creativity support tools

What is Songsmith?

[Diagram: You, singing... → Songsmith → Your music]

“Automatic accompaniment generation for vocal melodies”

High-Risk Live Demo

(What other kind of live demo is there?)

Who is Songsmith for?

Today’s Talk

Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research

Songsmith: 5,000-Foot Overview

[Figure: example chord labels (G, Amin, C, Daug) and the twelve pitch classes C, C#, D, D#, E, F, F#, G, G#, A, A#, B]

Chords from Melody

Songsmith’s core: Hidden Markov Model

Song start → Chord 1 → Chord 2 → Chord 3 → Chord 4 → Song end

Hidden states (chords)

Observations (note vectors)

HMMs in 5 Minutes or Less

[Diagram labels: States, Observations]

What does an HMM do for me?

What does an HMM want from me?

Possible states:
C major
C minor
C diminished
…

Transition probabilities:
P(C Major | A Minor)?
P(C Major | D Major)?
P(F# Major | F# Major)?

Observation probabilities:
P([observed notes] | A Minor)?
P([observed notes] | D Major)?

Building our HMM

Things my HMM wants from me:
Possible states
Transition probabilities
Observation probabilities
Observations

Data-driven, not heuristic-driven

Finding the Probabilities

Training Data

~300 lead sheets (vocal melodies with chords)

Processing the Database

Convert all chords to five basic triads
Transpose every song into the same key
Count the number of transitions from every chord to every other chord (transition probabilities):

            C Major   C Minor   C# Major   C# Minor   D Major   D Minor   …
C Major         472        22         35         50        76       189
C Minor           9       314          0         44        39        71
C# Major          …         …          …          …         …         …
C# Minor          …         …          …          …         …         …
D Major           …         …          …          …         …         …
D Minor           …         …          …          …         …         …
…                 …         …          …          …         …         …

Count the total duration of each melody note occurring while each chord is playing (observation probabilities):

[Figure: bar chart, "Notes played over C Major"; x-axis: pitch classes C, C#, D, D#, E, F, F#, G, G#, A, A#, B; y-axis: 0 to 0.25]
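The counting steps on the "Processing the Database" slide can be sketched roughly as follows. This is a minimal illustration, not Songsmith's actual code: the lead-sheet representation (a song as a list of `(chord, notes)` segments, with notes given as `(pitch_class_index, duration)` pairs), the function names, and the toy song are all assumptions made for the example.

```python
from collections import Counter, defaultdict

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def count_statistics(songs):
    """Count chord transitions and per-chord note durations.

    songs: list of songs (hypothetical format); each song is a list of
    (chord, notes) segments, and notes is a list of
    (pitch_class_index, duration) pairs sung while that chord plays.
    """
    transitions = defaultdict(Counter)            # transitions[prev][next] = count
    note_time = defaultdict(lambda: [0.0] * 12)   # note_time[chord][pitch class] = duration
    for song in songs:
        # Every adjacent pair of chords contributes one transition count.
        for (prev, _), (nxt, _) in zip(song, song[1:]):
            transitions[prev][nxt] += 1
        # Every melody note adds its duration to the chord it sounds over.
        for chord, notes in song:
            for pitch_class, duration in notes:
                note_time[chord][pitch_class] += duration
    return transitions, note_time

def normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values] if total else list(values)

# Toy data, already transposed to a common key (invented for illustration).
toy_songs = [[
    ("C Major", [(0, 2.0), (4, 2.0)]),
    ("F Major", [(5, 3.0), (9, 1.0)]),
    ("G Major", [(7, 4.0)]),
    ("C Major", [(0, 3.0), (7, 1.0)]),
]]
transitions, note_time = count_statistics(toy_songs)
c_major_hist = normalize(note_time["C Major"])
```

Normalizing each row of `transitions` in the same way would turn the counts into transition probabilities; in practice some smoothing is usually needed so that unseen transitions do not get probability zero.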

Building our HMM

Things my HMM wants from me:
Possible states
Transition probabilities
Observation probabilities
Observations

Observations: what did the user sing?

Input:

Hard!

Building a Pitch Histogram

FFT
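The slides give no detail on the pitch-tracking step beyond "FFT", so the sketch below skips it and assumes some pitch tracker has already produced per-frame frequency estimates. It only shows how such estimates could be folded into a 12-bin pitch-class histogram; the function names and the frame format are hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz):
    """Map a frequency to the nearest pitch class (0 = C, ..., 11 = B)."""
    midi = 69 + 12 * math.log2(freq_hz / 440.0)   # A4 = 440 Hz = MIDI note 69
    return round(midi) % 12

def pitch_histogram(frames):
    """Build a normalized pitch-class histogram.

    frames: list of (frequency_hz, duration_seconds) pairs; a frequency
    of 0 marks an unvoiced or silent frame and is ignored.
    """
    hist = [0.0] * 12
    for freq_hz, duration in frames:
        if freq_hz > 0:
            hist[pitch_class(freq_hz)] += duration
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Roughly C4, E4, G4, then a silent frame (values invented for illustration).
frames = [(261.63, 0.5), (329.63, 0.25), (392.0, 0.25), (0.0, 0.1)]
hist = pitch_histogram(frames)
```

One such histogram per measure would play the role of the "note vector" observation in the HMM.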

Building our HMM

Things my HMM wants from me:
Possible states
Transition probabilities
Observation probabilities
Observations

Run Viterbi algorithm, get “best” sequence of chords… thank you, HMM!

One hitch: key determination…
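For readers who have not seen it, the Viterbi step can be sketched in a few lines. This is a generic textbook version in log space over a two-chord toy model, not Songsmith's implementation; the probabilities and the `viterbi` signature are assumptions made for the example, and key determination is not handled.

```python
import math

def viterbi(states, log_start, log_trans, log_obs):
    """Return the most likely state sequence.

    log_start[s]    : log P(first state = s)
    log_trans[p][s] : log P(s | p)
    log_obs[t][s]   : log P(observation at time t | s)
    """
    best = [{s: log_start[s] + log_obs[0][s] for s in states}]
    back = []
    for t in range(1, len(log_obs)):
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at time t.
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_obs[t][s]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

states = ["C", "G"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {
    "C": {"C": math.log(0.6), "G": math.log(0.4)},
    "G": {"C": math.log(0.4), "G": math.log(0.6)},
}
# One entry per measure: how well the sung notes match each chord (toy values).
log_obs = [
    {"C": math.log(0.9), "G": math.log(0.1)},
    {"C": math.log(0.1), "G": math.log(0.9)},
    {"C": math.log(0.1), "G": math.log(0.9)},
]
path = viterbi(states, log_start, log_trans, log_obs)
```

With these toy numbers the melody evidence outweighs the preference for staying on the same chord, so the path switches from C to G after the first measure.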

Today’s Talk

Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research

So we can choose optimal chords... so what?

My demo took 10 seconds… is that fun?

What would a songwriter do next?

How can we build creative exploration into a learning-driven system?

UI: Exposing Learning Parameters

There are always hard-coded “magic numbers” in machine learning

Machine learning also uses lots of learned parameters

Can we let users control those numbers?

A Bad User Interface

Songsmith: A fun way to make music (if you have a PhD in math and/or computer science)

Observation weight:

Transition matrix (edit me!)

      C     C#    D     D#    E
C     0.2   0.9   0.1   0.2   0.4
C#    0.1   0.6   0.4   0.6   0.2
D     0.5   0.6   0.4   0.7   0.5
D#    0.1   0.3   0.9   0.1   0.7
E     0.4   0.1   0.1   0.3   0.3

Expected pitch histograms (edit me!)

C     0.3   0.5   1.0   0.0   0.7
C#    0.9   0.3   0.1   0.8   0.0
D     0.4   0.8   0.5   0.6   0.1
D#    1.0   0.9   0.2   0.7   0.2
E     0.8   0.4   0.4   0.2   1.0
F     0.9   0.1   0.6   0.6   0.4

Frequency smoothing:

Conjugate prior:

Can we let users control those numbers?

UI: Exposing Learning Parameters

Can we let users intuitively control those numbers?

Exposing Model Parameters in Songsmith

The “Happy Factor”

Happy Factor: Implementation

Partition database into two databases (major and minor songs) using clustering

Build separate transition probability matrix for each database

When we actually run our HMM, blend the two transition matrices together according to user input…

The “Happy Factor”

$$\log P(c_i \mid c_{i-1}) = \alpha \log P_{maj}(c_i \mid c_{i-1}) + (1 - \alpha) \log P_{min}(c_i \mid c_{i-1}), \qquad 0 \le \alpha \le 1$$

Bonus question: what’s wrong with this equation?
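The blending of the major and minor transition matrices could look roughly like the sketch below. The name `alpha` for the slider value, the nested-dictionary matrix format, and the toy probabilities are all assumptions for illustration; the code also assumes both matrices have been smoothed so that no entry is exactly zero (otherwise the logarithm is undefined).

```python
import math

def blend_transitions(p_major, p_minor, alpha):
    """Blend two transition matrices in the log domain.

    Returns blended[prev][nxt] = alpha * log P_maj(nxt | prev)
                               + (1 - alpha) * log P_min(nxt | prev).
    alpha = 1 uses only the major-song matrix, alpha = 0 only the minor one.
    """
    return {
        prev: {
            nxt: alpha * math.log(p_major[prev][nxt])
                 + (1 - alpha) * math.log(p_minor[prev][nxt])
            for nxt in p_major[prev]
        }
        for prev in p_major
    }

# Toy two-chord matrices (invented values).
p_major = {"C": {"C": 0.5, "Am": 0.5}, "Am": {"C": 0.8, "Am": 0.2}}
p_minor = {"C": {"C": 0.2, "Am": 0.8}, "Am": {"C": 0.3, "Am": 0.7}}

happy = blend_transitions(p_major, p_minor, alpha=1.0)
sad = blend_transitions(p_major, p_minor, alpha=0.0)
middle = blend_transitions(p_major, p_minor, alpha=0.5)
```

The blended values are scores to plug into the chord search in place of the usual log transition probabilities.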

Another “Happy Factor” Example

Happy

Sad

The “Jazz Factor”

Jazz Factor: Implementation

When running our HMM, we need to make chords match the voice and each other

Computing how well each chord fits at a given position:

k • log( P(this chord | what the user sang) ) + (1-k) • log( P(this chord | the previous chord) )

Just put k on a slider!
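A small sketch of the weighted score, showing how moving k changes which chord wins at a single position. The candidate chords and their probabilities are invented for the example, and `chord_score` is a hypothetical helper name rather than anything from Songsmith.

```python
import math

def chord_score(k, log_p_melody, log_p_transition):
    """Weighted fit of one chord: k weights the melody term,
    (1 - k) weights the term for following the previous chord."""
    return k * log_p_melody + (1 - k) * log_p_transition

# For each candidate: (melody-fit probability, probability after the previous chord).
candidates = {
    "F Major": (0.6, 0.1),   # fits the sung notes well, unusual after the previous chord
    "G Major": (0.2, 0.5),   # fits the notes poorly, common after the previous chord
}

def best_chord(k):
    return max(
        candidates,
        key=lambda c: chord_score(k, math.log(candidates[c][0]), math.log(candidates[c][1])),
    )

melody_driven = best_chord(0.9)
progression_driven = best_chord(0.1)
```

With k near 1 the chord that matches the voice wins; with k near 0 the chord that best follows its predecessor wins.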

Chord Locking

Global sliders are very coarse

Chords can be “locked” by the user

$$P(c_i \mid c_{i-1}) = \begin{cases} 1 & \text{if } c_i = C_{lock} \\ 0 & \text{if } c_i \ne C_{lock} \end{cases}$$
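One way to apply a lock inside a log-space chord search is to override the transition term at the locked position, as in this sketch. The helper name `locked_log_transition` and the toy matrix are assumptions; log 1 = 0 and log 0 is represented as negative infinity.

```python
import math

def locked_log_transition(log_trans, prev, nxt, locked_chord=None):
    """Log transition score into a position that may be locked.

    If the position is locked, the locked chord gets log(1) = 0 and every
    other chord gets log(0) = -inf, whatever the previous chord was.
    """
    if locked_chord is None:
        return log_trans[prev][nxt]
    return 0.0 if nxt == locked_chord else -math.inf

log_trans = {
    "C": {"C": math.log(0.7), "G": math.log(0.3)},
    "G": {"C": math.log(0.4), "G": math.log(0.6)},
}

free_choice = max(["C", "G"], key=lambda c: locked_log_transition(log_trans, "C", c))
forced_choice = max(["C", "G"], key=lambda c: locked_log_transition(log_trans, "C", c, locked_chord="G"))
```

Because the override only touches one position, the rest of the chord sequence is still free to adapt around the locked chord.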

“Suggested Chords”

Songwriters will often explore “chord substitutions”…

…but we’re assuming our audience doesn’t know that much music theory…

Expose suboptimal marginal probabilities at each node as “suggestions”

$$L_i = k \log P(x_i \mid c_i) + (1 - k) \log P(c_i \mid c_{i-1}) + (1 - k) \log P(c_{i+1} \mid c_i)$$
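A rough sketch of how suggestions might be ranked at one position: score every candidate chord by how well it fits the sung notes and both of its neighbors, then list the candidates from best to worst. The weight name `k`, the `suggest` function, and all probabilities are assumptions for illustration, not details confirmed by the slides.

```python
import math

def suggest(chords, prev_chord, next_chord, p_obs, p_trans, k=0.5):
    """Rank candidate chords for one position, best first.

    p_obs[c]      : P(sung notes at this position | chord c)
    p_trans[a][b] : P(chord b | previous chord a)
    """
    def local_score(c):
        return (k * math.log(p_obs[c])
                + (1 - k) * math.log(p_trans[prev_chord][c])
                + (1 - k) * math.log(p_trans[c][next_chord]))
    return sorted(chords, key=local_score, reverse=True)

chords = ["C", "F", "G"]
p_obs = {"C": 0.5, "F": 0.3, "G": 0.2}
p_trans = {
    "C": {"C": 0.4, "F": 0.3, "G": 0.3},
    "F": {"C": 0.5, "F": 0.3, "G": 0.2},
    "G": {"C": 0.7, "F": 0.1, "G": 0.2},
}

ranking = suggest(chords, prev_chord="C", next_chord="C", p_obs=p_obs, p_trans=p_trans)
alternatives = ranking[1:]   # everything except the top choice
```

The runners-up in `alternatives` are the kind of "suboptimal" options a user could try without knowing any theory about chord substitution.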

Interactive Machine Learning

Songsmith is one example of IML
Roughly: moving what used to be in the domain of ML experts into users’ hands

Related work: image classification
Fogarty et al, CHI 2008: CueFlik
Fails and Olsen, IUI 2003: Crayons

Why?
Harness end-user knowledge
Use ML as a tool for data exploration
Use ML as a tool for creative expression

Today’s Talk

Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research

Dynamic Mapping of Physical Controls for Tabletop Groupware

(Fiebrink, CHI 2009)

One project, two problems:

1. Direct-touch, tabletop input is great for collaboration… but suffers from serious precision issues.

2. Working on music alone is boring.

Incorporate high-precision controllers into a tabletop environment

Evaluate in a collaborative audio-editing app

Data-Driven Exploration of Musical Chord Sequences

(Nichols, IUI 2009)

Problem: let people create and explore music by moving around in a reduced-dimensionality space

Genres, artists make for intuitive labels

Why isn’t this easy?

Solution(s):

Divergence-maximizing clustering, PCA

Future work

Make Songsmith freakin’ amazing, and port it to a bajillion platforms, and make an amazing community Web site, and build it into audio hosts… etc…

Other applications of machine learning in creativity support tools…
Writing? Painting? Web Design? CAD? Music?

Future work?

Songsmith

Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team

Microsoft Research

research.microsoft.com/[email protected]

Live/Google: songsmith