Songsmith: Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team Microsoft...
Songsmith:
Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team
Microsoft Research
Using Machine Learning to Help People Make Music
Computational User Experiences (CUE) group
In general: HCI + (sensors, devices, machine learning, health, physiology)
Using physiological signals for input
Health and wellness
Creativity support tools
What is Songsmith?
[Diagram: You, singing... -> Songsmith -> Your music]
“Automatic accompaniment generation for vocal melodies”
Today’s Talk
Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research
Chords from Melody
Songsmith’s core: Hidden Markov Model
[Figure: HMM chain. Song start -> Chord 1 -> Chord 2 -> Chord 3 -> Chord 4 -> Song end. The chords are the hidden states; each chord emits an observation (a note vector).]
What does an HMM do for me?
What does an HMM want from me?
HMMs in 5 Minutes or Less
Possible states: C major, C minor, C diminished, …
Transition probabilities: P(C Major | A Minor)? P(C Major | D Major)? P(F# Major | F# Major)?
Observation probabilities: P(note vector | A Minor)? P(note vector | D Major)?
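In standard HMM notation, these ingredients combine into a single joint probability over a chord sequence and the notes sung over it. The symbols here are an assumption chosen for illustration: c_i is the chord at position i, x_i is the note vector observed there, and c_0 stands for the song-start state.

```latex
P(c_1, \dots, c_n, x_1, \dots, x_n) = \prod_{i=1}^{n} P(c_i \mid c_{i-1}) \, P(x_i \mid c_i)
```

Picking chords then means finding the sequence c_1, …, c_n that makes this product as large as possible for the observed x_1, …, x_n.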
Building our HMM
Things my HMM wants from me:
Possible states
Transition probabilities
Observation probabilities
Observations
            C Major   C Minor   C# Major   C# Minor   D Major   D Minor   …
C Major     472       22        35         50         76        189       …
C Minor     9         314       0          44         39        71        …
C# Major    …         …         …          …          …         …         …
C# Minor    …         …         …          …          …         …         …
D Major     …         …         …          …          …         …         …
D Minor     …         …         …          …          …         …         …
…           …         …         …          …          …         …         …
Convert all chords to five basic triads
Transpose every song into the same key
Count the number of transitions from every chord to every other chord (transition probabilities)
Count the total duration of each melody note occurring while each chord is playing (observation probabilities)
Processing the Database
[Figure: histogram of notes played over C Major. x-axis: pitch classes C, C#, D, D#, E, F, F#, G, G#, A, A#, B; y-axis: 0 to 0.25]
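The two counting steps can be sketched in a few lines of Python. This is a minimal illustration, not Songsmith's actual code: the function names and the toy data are hypothetical, and the songs are assumed to be already reduced to basic triads and transposed to a common key. A real system would probably also smooth the counts so that unseen transitions do not get probability zero.

```python
from collections import defaultdict


def count_transitions(songs):
    """Count chord-to-chord transitions; each song is a list of chord names."""
    counts = defaultdict(lambda: defaultdict(float))
    for chords in songs:
        for prev, cur in zip(chords, chords[1:]):
            counts[prev][cur] += 1
    return counts


def accumulate_durations(melody_events):
    """Sum melody-note durations per chord.

    Each event is a (chord, pitch_class, duration) tuple.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for chord, pitch, duration in melody_events:
        totals[chord][pitch] += duration
    return totals


def normalize(table):
    """Turn every row of counts into a probability distribution."""
    return {
        row_key: {k: v / sum(row.values()) for k, v in row.items()}
        for row_key, row in table.items()
    }


# Toy data (made up for illustration).
songs = [["C", "F", "G", "C"], ["C", "G", "C"]]
events = [("C", "E", 1.0), ("C", "G", 0.5), ("C", "E", 0.5), ("G", "D", 2.0)]

transition_probs = normalize(count_transitions(songs))
observation_probs = normalize(accumulate_durations(events))
print(transition_probs["C"])
print(observation_probs["C"])
```

Each row of `transition_probs` corresponds to one row of the count table, divided by its row total; each row of `observation_probs` corresponds to one pitch histogram like the one for C Major.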
Building our HMM
Things my HMM wants from me:
Possible states
Transition probabilities
Observation probabilities
Observations
Run Viterbi algorithm, get “best” sequence of chords… thank you, HMM!
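For readers who have not seen it, here is a minimal sketch of the Viterbi algorithm in log space. It is a generic textbook version, not Songsmith's implementation: the function signature, the two-chord toy model, and all the numbers are assumptions made for illustration, and the song-start state is folded into a `log_start` vector (the song-end state is omitted).

```python
import math


def viterbi(obs_loglik, log_trans, log_start):
    """Return the most likely state sequence.

    obs_loglik[t][s]: log P(observation at step t | state s)
    log_trans[r][s]:  log P(state s | previous state r)
    log_start[s]:     log P(state s at the first step)
    """
    n_states = len(log_start)
    score = [log_start[s] + obs_loglik[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(obs_loglik)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best previous state for arriving in state s at step t.
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + obs_loglik[t][s])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy model: state 0 = C major, state 1 = A minor (numbers are made up).
log = math.log
log_start = [log(0.5), log(0.5)]
log_trans = [[log(0.7), log(0.3)],
             [log(0.4), log(0.6)]]
obs_loglik = [[log(0.9), log(0.1)],
              [log(0.2), log(0.8)],
              [log(0.1), log(0.9)]]
print(viterbi(obs_loglik, log_trans, log_start))
```

Working in log space turns the products of probabilities into sums, which avoids numerical underflow on long songs.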
One hitch: key determination…
Today’s Talk
Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research
So we can choose optimal chords... so what?
My demo took 10 seconds… is that fun?
What would a songwriter do next?
How can we build creative exploration into a learning-driven system?
There are always hard-coded “magic numbers” in machine learning
Machine learning also uses lots of learned parameters
Can we let users control those numbers?
UI: Exposing Learning Parameters
A Bad User Interface
Songsmith: A fun way to make music (if you have a PhD in math and/or computer science)
Observation weight:
Transition matrix (edit me!)
      C     C#    D     D#    E
C     0.2   0.9   0.1   0.2   0.4
C#    0.1   0.6   0.4   0.6   0.2
D     0.5   0.6   0.4   0.7   0.5
D#    0.1   0.3   0.9   0.1   0.7
E     0.4   0.1   0.1   0.3   0.3
Expected pitch histograms (edit me!)
C     0.3   0.5   1.0   0.0   0.7
C#    0.9   0.3   0.1   0.8   0.0
D     0.4   0.8   0.5   0.6   0.1
D#    1.0   0.9   0.2   0.7   0.2
E     0.8   0.4   0.4   0.2   1.0
F     0.9   0.1   0.6   0.6   0.4
Frequency smoothing:
Conjugate prior:
Can we let users intuitively control those numbers?
Partition database into two databases (major and minor songs) using clustering
Build separate transition probability matrix for each database
When we actually run our HMM, blend the two transition matrices together according to user input…
Happy Factor: Implementation
The “Happy Factor”
\[
\log P(c_i \mid c_{i-1}) = \alpha \log P_{maj}(c_i \mid c_{i-1}) + (1 - \alpha) \log P_{min}(c_i \mid c_{i-1}), \qquad 0 \le \alpha \le 1
\]
Bonus question: what’s wrong with this equation?
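A minimal sketch of the blending step, assuming a slider value `alpha` between 0 and 1 (1 = all major, 0 = all minor) that weights the logs of the two transition matrices. The function name, the tiny floor used to avoid log(0), and the 2x2 matrices are hypothetical.

```python
import math


def blend_transitions(p_maj, p_min, alpha, floor=1e-12):
    """Blend two transition matrices in the log domain.

    Returns alpha * log(p_maj) + (1 - alpha) * log(p_min), entry by entry.
    The floor keeps log() defined when a probability is exactly zero.
    """
    return [
        [alpha * math.log(max(a, floor)) + (1 - alpha) * math.log(max(b, floor))
         for a, b in zip(row_maj, row_min)]
        for row_maj, row_min in zip(p_maj, p_min)
    ]


# Made-up 2x2 matrices; each row sums to 1.
p_maj = [[0.8, 0.2], [0.6, 0.4]]
p_min = [[0.3, 0.7], [0.5, 0.5]]

blended = blend_transitions(p_maj, p_min, alpha=0.5)
row_sums = [sum(math.exp(v) for v in row) for row in blended]
print(row_sums)
```

One thing worth noticing, and possibly what the bonus question is after: a weighted sum of logs is a geometric mean, so the blended rows generally sum to less than 1 and are no longer proper probability distributions. That is harmless if the values are only used as scores for ranking chord sequences, but it matters if they are read as probabilities.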
When running our HMM, we need to make chords match the voice and each other
Computing how well each chord fits at a given position:
k • log( P(this chord | what the user sang) ) + (1-k) • log( P(this chord | the previous chord) )
Just put k on a slider!
Jazz Factor: Implementation
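To see what the slider does, the sketch below scores two candidate chords with the weighted sum from this slide and shows the winner flipping as k moves. The function names and the probabilities are invented for illustration.

```python
import math


def chord_score(log_p_melody, log_p_transition, k):
    """Weighted fit of a chord: k on the melody term, 1 - k on the transition term."""
    return k * log_p_melody + (1 - k) * log_p_transition


def best_chord(candidates, k):
    """Pick the candidate with the highest score for a given slider value k."""
    return max(candidates, key=lambda name: chord_score(*candidates[name], k))


# Each candidate maps to (log melody fit, log transition fit); numbers are made up.
candidates = {
    "C": (math.log(0.6), math.log(0.1)),  # fits the voice, unusual progression
    "F": (math.log(0.2), math.log(0.5)),  # fits the voice less, common progression
}

print(best_chord(candidates, k=0.9))  # melody term dominates
print(best_chord(candidates, k=0.1))  # transition term dominates
```

With k near 1 the chord that best matches the sung notes wins; with k near 0 the chord that follows most naturally from the previous chord wins.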
Chord Locking
Global sliders are very coarse
Chords can be “locked” by the user
\[
P(c_i \mid c_{i-1}) =
\begin{cases}
1 & \text{if } c_i = C_{lock} \\
0 & \text{if } c_i \neq C_{lock}
\end{cases}
\]
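One way to realize a lock in a log-space decoder is to swap in a special transition table at the locked position: log 1 = 0 for the locked chord and log 0 = -infinity for every other chord, whatever the previous chord was. The helper below is a hypothetical sketch of that idea, not Songsmith's code.

```python
import math


def locked_log_transitions(n_states, locked_state):
    """Log transition table for a locked position.

    Entry [r][s] is 0.0 (log 1) when s is the locked chord and -inf (log 0)
    otherwise, independent of the previous chord r.
    """
    return [
        [0.0 if s == locked_state else -math.inf for s in range(n_states)]
        for _ in range(n_states)
    ]


table = locked_log_transitions(n_states=3, locked_state=1)
print(table[0])
```

Because every path through a different chord picks up a score of -infinity at that position, the best path is forced through the locked chord while the chords around it are still chosen freely.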
“Suggested Chords”
Songwriters will often explore “chord substitutions”…
…but we’re assuming our audience doesn’t know that much music theory…
Expose suboptimal marginal probabilities at each node as “suggestions”
\[
L_i = k \log P(x_i \mid c_i) + (1 - k) \log P(c_i \mid c_{i-1}) + (1 - k) \log P(c_{i+1} \mid c_i)
\]
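A rough sketch of how suggestions could be ranked. It assumes a local score for putting chord c at position i that adds a melody term weighted by k and two transition terms weighted by 1 - k (from the previous chord into c, and from c into the next chord), with the neighbouring chords held fixed. That scoring form, the function name, and the toy numbers are all assumptions for illustration.

```python
import math


def suggest_chords(i, path, obs_loglik, log_trans, k, top_n=3):
    """Rank alternative chords for position i, keeping the neighbours fixed.

    path:             current chord sequence (state indices)
    obs_loglik[t][s]: log P(observation at step t | state s)
    log_trans[r][s]:  log P(state s | previous state r)
    """
    scored = []
    for c in range(len(log_trans)):
        score = k * obs_loglik[i][c]
        if i > 0:
            score += (1 - k) * log_trans[path[i - 1]][c]
        if i + 1 < len(path):
            score += (1 - k) * log_trans[c][path[i + 1]]
        scored.append((score, c))
    scored.sort(reverse=True)
    # Drop the chord already in use; what remains are the suggestions.
    return [c for _, c in scored if c != path[i]][:top_n]


# Toy three-chord model (numbers are made up).
trans = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2], [0.3, 0.3, 0.4]]
log_trans3 = [[math.log(p) for p in row] for row in trans]
obs3 = [[math.log(p) for p in row]
        for row in [[0.6, 0.2, 0.2], [0.2, 0.5, 0.3], [0.7, 0.2, 0.1]]]
current_path = [0, 1, 0]

print(suggest_chords(1, current_path, obs3, log_trans3, k=0.5, top_n=2))
```

The user never sees the scores, only a short list of chords that would also fit at that spot.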
Interactive Machine Learning
Songsmith is one example of IML
Roughly: moving what used to be in the domain of ML experts into users’ hands
Related work: image classification
Fogarty et al., CHI 2008: CueFlik
Fails and Olsen, IUI 2003: Crayons
Why?
Harness end-user knowledge
Use ML as a tool for data exploration
Use ML as a tool for creative expression
Today’s Talk
Overview and demo
How Songsmith works
Exposing machine learning parameters
What are people doing with Songsmith?
Creativity support tools @ Microsoft Research
Dynamic Mapping of Physical Controls for Tabletop Groupware
(Fiebrink, CHI 2009)
One project, two problems:
1. Direct-touch, tabletop input is great for collaboration… but suffers from serious precision issues.
2. Working on music alone is boring.
Incorporate high-precision controllers into a tabletop environment
Evaluate in a collaborative audio-editing app
Data-Driven Exploration of Musical Chord Sequences
(Nichols, IUI 2009)
Problem: let people create and explore music by moving around in a reduced-dimensionality space
Genres, artists make for intuitive labels
Why isn’t this easy?
Solution(s): divergence-maximizing clustering, PCA
Future work
Make Songsmith freakin’ amazing, and port it to a bajillion platforms, and make an amazing community Web site, and build it into audio hosts… etc…
Other applications of machine learning in creativity support tools… Writing? Painting? Web Design? CAD? Music?
Future work?
Songsmith
Dan Morris, Ian Simon, Sumit Basu, and the MSR Advanced Development Team
Microsoft Research
research.microsoft.com/[email protected]
Live/Google: songsmith