
A Variational Parametric Model for Audio Synthesis

Krishna Subramani, supervised by Prof. Preeti Rao

Electrical Engineering, Indian Institute of Technology Bombay, India

Audio Synthesis

• What comes to your mind when you hear 'Audio Synthesis'?

Figure: One of the early Moog Modular Synthesizers.

• More generally, it involves specifying controlling parameters to a synthesizer to obtain an audio output.

• What are the main parameters that govern the audio generation (at a high level)?

1. Timbre
2. Pitch
3. Loudness

• Early analog synthesizers used voltage-controlled oscillators, filters and amplifiers to generate the waveform, and 'envelope generators' to shape it.

• Data-driven statistical modeling + computing power ⇒ Deep Learning for audio synthesis.

Generative Models for Audio Synthesis

• Rely on the ability of algorithms to extract musically relevant information from vast amounts of data.

• Autoregressive modeling, Generative Adversarial Networks and Variational Autoencoders are some of the proposed generative modeling methods.

• Most methods in the literature try to model audio signals directly in the time or frequency domain.

Our Nearest Neighbours

• [Sarroff and Casey, 2014] were the first to use autoencoders to perform frame-wise reconstruction of short-time magnitude spectra.

  ✓ Learn perceptually relevant lower-dimensional representations ('latent space') of audio to help in synthesis.
  ✗ 'Graininess' in the reconstructed sound.

• [Roche et al., 2018] extended this analysis to try out different autoencoder architectures.

  ✓ Helped by the release of NSynth [Engel et al., 2017].
  ✓ Usability of the latent space in audio interpolation.

• [Esling et al., 2018] regularized the VAE latent space in order to effect control over the perceptual timbre of synthesized instruments.

  ✓ Enabled them to 'generate' and 'interpolate' audio in this space.

Our Nearest Neighbours

• A major issue with the previous methods was frame-wise analysis-synthesis based reconstruction, which fails to model temporal evolution.

• [Engel et al., 2017], inspired by WaveNet's [Oord et al., 2016] autoregressive modeling capabilities for speech, extended it to musical instrument synthesis.

  ✓ Generation of realistic and creative sounds.
  ✗ Complex network which was difficult to train and sample from.

• [Wyse, 2018] also modelled the audio autoregressively, albeit by conditioning the waveform samples on additional parameters like pitch, velocity (loudness) and instrument class.

  ✓ Achieved better control over generation by conditioning.
  ✗ Unable to generalize to untrained pitches.

Why Parametric?

• Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesis of a given instrument sound with flexible control over the pitch and loudness dynamics.

• Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) being kept constant [Roebel and Rodet, 2005].

• A powerful parametric representation, as opposed to the raw waveform or spectrogram, has the potential to achieve high quality with less training data.

  • [Blaauw and Bonada, 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way.

Dataset

• Good-sounds dataset [Romani Picas et al., 2015], consisting of individual note and scale recordings for 12 different instruments.

• We work with the 'Violin', played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71).

  Why we chose the Violin: popular in Indian music, human voice-like timbre, and the ability to produce continuous pitch.

Figure: Instances per note in the overall dataset (between 26 and 34 instances per MIDI note).

• The average note duration for the chosen octave is about 4.5 s per note.

Dataset

• We split the data into train (80%) and test (20%) instances across MIDI note labels.

• Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances.

Non-Parametric Reconstruction

• The frame-wise magnitude spectra reconstruction procedure in [Roche et al., 2018] cannot generalize to untrained pitches.

• We consider two cases - training including and excluding MIDI 63, along with its 3 neighbouring pitches on either side.

Figure: Block diagram - sustain portion → FFT → Encoder → latent Z → Decoder (AE).

• We repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim, 1984].
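For reference, a minimal sketch of the inversion step, assuming the librosa library is available; the frame and hop sizes are illustrative placeholders rather than the values used in this work:

```python
# Minimal sketch: invert a reconstructed magnitude spectrogram to audio with Griffin-Lim.
import numpy as np
import librosa

def invert_magnitude_spectrogram(mag_frames, n_fft=1024, hop_length=256, n_iter=60):
    """mag_frames: magnitude spectrogram (freq_bins x frames) assembled from the
    decoder's frame-wise outputs. Returns a time-domain signal via phase estimation."""
    return librosa.griffinlim(mag_frames, n_iter=n_iter,
                              hop_length=hop_length, win_length=n_fft)

# Usage: stack the reconstructed frame spectra column-wise, then
# audio = invert_magnitude_spectrogram(reconstructed_mag)
```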

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1).
Figure: Reconstruction with MIDI 63 included in training (audio example 2); reconstruction with MIDI 63 excluded (audio example 3).

Parametric Model

1. Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al., 1997] (currently we neglect the residual).

Figure: Block diagram - sustain portion → FFT → HpR → f0 + harmonic magnitudes.

• Output of the HpR block ⇒ log-dB magnitudes + harmonics.
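To make this output concrete, below is a minimal sketch of harmonic-magnitude sampling for one frame with a known f0; it is a simplification (nearest-bin lookup, residual ignored), not the full HpR analysis of [Serra et al., 1997]:

```python
# Minimal sketch: sample the magnitude spectrum at integer multiples of a known f0.
import numpy as np

def harmonic_magnitudes(frame, fs, f0, n_harmonics=40, n_fft=2048):
    """Return harmonic frequencies and their log-dB magnitudes for one analysis frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft))
    harm_freqs = f0 * np.arange(1, n_harmonics + 1)
    harm_freqs = harm_freqs[harm_freqs < fs / 2]           # keep harmonics below Nyquist
    bins = np.round(harm_freqs * n_fft / fs).astype(int)   # nearest FFT bin per harmonic
    mags_db = 20 * np.log10(spectrum[bins] + 1e-10)        # log-dB harmonic magnitudes
    return harm_freqs, mags_db
```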

Parametric Model

2. log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet, 2005; Imai, 1979].

• TAE ⇒ iterative cepstral liftering:

  A_0(k) = log|X(k)|,  V_0 = -∞
  while (A_i - A_0 < Δ):
      A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
      V_i = FFT(lifter(A_i))

Figure: Block diagram - f0 + magnitudes → TAE → cepstral coefficients (CCs) + f0.
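A minimal sketch of this iteration in Python follows; the cepstral order, threshold and iteration cap are illustrative, and this is not the authors' released implementation:

```python
# Minimal sketch of the True Amplitude Envelope (TAE) iteration described above:
# A_i = max(A_{i-1}, V_{i-1}); V_i = cepstrally-liftered (smoothed) version of A_i.
import numpy as np

def true_amplitude_envelope(log_mag, n_ceps=91, threshold_db=0.1, max_iter=100):
    """log_mag: log-magnitude half-spectrum of one frame (length n_fft//2 + 1).
    Returns the smooth envelope V and its low-quefrency cepstral coefficients."""
    A0 = log_mag.copy()
    A = log_mag.copy()
    V = np.full_like(A, -np.inf)
    for _ in range(max_iter):
        A = np.maximum(A, V)                         # push envelope above spectral peaks
        ceps = np.fft.irfft(A)                       # real cepstrum (log spectrum is real/even)
        ceps[n_ceps:len(ceps) - n_ceps + 1] = 0.0    # lifter: keep low-quefrency part only
        V = np.fft.rfft(ceps).real                   # back to a smooth spectral envelope
        if np.max(A0 - V) < threshold_db:            # stop once envelope covers all peaks
            break
    return V, ceps[:n_ceps]
```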

Parametric Model

• No open-source implementation was available for the TAE, so we implemented it following the procedure highlighted in [Roebel and Rodet, 2005; Caetano and Rodet, 2012].

• The figure shows a TAE snapshot from [Caetano and Rodet, 2012], similar to the results we get (audio examples 4 and 5).

Parametric Model

Figure: TAE spectral envelopes (magnitude in dB vs. frequency in kHz) for MIDI 60, 65 and 71.

• The spectral envelope shape varies across pitch:

  1. Dependence of the envelope on pitch [Slawson, 1981; Caetano and Rodet, 2012].
  2. Variation due to the TAE algorithm.

• TAE → a smooth function to estimate harmonic amplitudes.

Parametric Model

• The number of CCs (cepstral coefficients), K_cc, depends on the sampling rate (Fs) and the pitch (f0):

  K_cc ≤ Fs / (2 f0)

• For our network, we choose the maximum K_cc (lowest pitch) = 91 and zero-pad the higher-pitch CC vectors to 91-dimensional vectors.
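For instance, assuming a 48 kHz sampling rate and the lowest pitch MIDI 60 (f0 ≈ 261.6 Hz), the bound gives K_cc ≤ 48000 / (2 × 261.6) ≈ 91.7, which would be consistent with the choice of 91.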

Generative Models

• Autoencoders [Hinton and Salakhutdinov, 2006] - minimize the MSE (mean squared error) between the input and the network reconstruction.

  • Simple to train, and perform good reconstruction (since they minimize MSE).
  • Not truly a generative model, as you cannot generate new data.

• Variational Autoencoders [Kingma and Welling, 2013] - inspired by variational inference, enforce a prior on the latent space.

  • Truly generative, as we can 'generate' new data by sampling from the prior.

• Conditional Variational Autoencoders [Doersch, 2016; Sohn et al., 2015] - same principle as a VAE, but the distribution is learnt conditioned on an additional conditioning variable.

Generative Models

Why VAE over AE?
A continuous latent space from which we can sample points (and synthesize the corresponding audio).

Why CVAE over VAE?
Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control.

Network Architecture

Figure: Block diagram - CCs + f0 → Encoder → latent Z → Decoder (CVAE).

• The network input is the CCs → the MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log-magnitude spectral envelopes.

• We train the network on frames from the train instances.

• For evaluation, the MSE values reported ahead are the average reconstruction error across all the test-instance frames.

Network Architecture

• Main hyperparameters:

1. β - controls the relative weighting between reconstruction and prior enforcement:

   L ∝ MSE + β·KLD

Figure: CVAE reconstruction MSE vs. latent space dimensionality for β = 0.01, 0.1 and 1.

• There is a tradeoff between both terms; we choose β = 0.1.
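A minimal PyTorch-style sketch of this weighted objective, assuming a diagonal-Gaussian posterior and a standard-normal prior (variable names are illustrative):

```python
# Minimal sketch of the weighted CVAE objective L ∝ MSE + β·KLD, with β = 0.1.
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, log_var, beta=0.1):
    """x_hat: decoder output, x: input CC vector,
    mu/log_var: encoder outputs parameterizing q(z | x, f0)."""
    mse = F.mse_loss(x_hat, x, reduction='mean')
    # KL divergence between N(mu, sigma^2) and N(0, I), averaged over the batch
    kld = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return mse + beta * kld
```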

Network Architecture

• Main hyperparameters:

2. Dimensionality of the latent space - governs the network's reconstruction ability.

Figure: Reconstruction MSE vs. latent space dimensionality, CVAE (β = 0.1) vs. AE.

• Steep fall initially, flatter later; we choose dimensionality = 32.

Network Architecture

• Network size: [91, 91, 32, 91, 91].
  • 91 is the dimension of the input CCs; 32 is the latent space dimensionality.

• Linear fully connected layers + Leaky ReLU activations.

• ADAM [Kingma and Ba, 2014] with initial learning rate 10^-3.

• Training for 2000 epochs with batch size 512.
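Below is a minimal PyTorch sketch consistent with these sizes; concatenating the (normalized) f0 to both the encoder and decoder inputs is an assumption about how the conditioning is injected, not the exact released model:

```python
# Minimal sketch of a CVAE matching the layer sizes [91, 91, 32, 91, 91].
import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, cc_dim=91, latent_dim=32, cond_dim=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, 91), nn.LeakyReLU())
        self.mu = nn.Linear(91, latent_dim)
        self.log_var = nn.Linear(91, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim + cond_dim, 91), nn.LeakyReLU(),
                                 nn.Linear(91, cc_dim))

    def forward(self, x, f0):
        h = self.enc(torch.cat([x, f0], dim=-1))
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterization trick
        x_hat = self.dec(torch.cat([z, f0], dim=-1))
        return x_hat, mu, log_var

# Training sketch: optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```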

Experiments

• Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch.

2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches.

Reconstruction

• Two training contexts:

1. T (✗) is the target, ✓ are training instances:

   MIDI   T-3   T-2   T-1   T    T+1   T+2   T+3
   Kept    ✓     ✓     ✓    ✗     ✓     ✓     ✓

2. Octave endpoints:

   MIDI   60   61   62   63   64   65   66   67   68   69   70   71
   Kept    ✓    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✓

• In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances.

Reconstruction

Figure: Reconstruction MSE vs. target pitch (MIDI 60-72) for the AE and the CVAE.

• Both the AE and the CVAE reasonably reconstruct the target pitch (audio examples 6, 7 and 8).

• The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data, as in the plot above (audio examples 9, 10 and 11).

• Conditioning helps to capture the pitch dependency of the spectral envelope more accurately.

Reconstruction

• To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below.

Figure: 'Conditional' AE (cAE) - [CCs, f0] → AE → [CCs', f0'].

• [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model.

• The cAE is comparable to our proposed CVAE, in that the network might potentially learn something from f0.

Reconstruction

Figure: Reconstruction MSE vs. target pitch (MIDI 60-72) for the AE, CVAE and cAE.

• We train the cAE on the octave endpoints and evaluate performance as shown above.

• Appending f0 does not improve performance; it is in fact worse than the AE.

Generation

• We are interested in 'synthesizing' audio.

• We see how well the network can generate an instance of a desired pitch (which the network has not been trained on).

Figure: Sampling from the network - latent Z + f0 → Decoder.

• We follow the procedure in [Blaauw and Bonada, 2016] - a random walk to sample points coherently from the latent space.
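A minimal sketch of this sampling procedure, reusing the hypothetical CVAE decoder sketched earlier; the step size and number of frames are illustrative:

```python
# Minimal sketch: generate frame-wise CCs by a random walk in the latent space,
# conditioning every frame on the desired pitch.
import torch

@torch.no_grad()
def generate_note(model, f0, n_frames=200, latent_dim=32, step=0.05):
    """f0: 1-element tensor with the (normalized) target pitch.
    Returns a (n_frames, 91) tensor of cepstral coefficients."""
    z = torch.randn(latent_dim)                    # start from the prior
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn(latent_dim)     # small random step -> smooth evolution
        x_hat = model.dec(torch.cat([z, f0]))      # decode CCs for this frame
        frames.append(x_hat)
    return torch.stack(frames)
```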

Generation

• We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65.

• On listening, we find that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13 and 14).

• For a more realistic synthesis, we generate a violin note with vibrato (audio example 15).

Putting it all together

• We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones.

• Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model the inter-dependencies.

• Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously.

Putting it all together

• Our model is still far from perfect, however, and we have to improve it further:

1. An immediate next step will be to model the residual of the input audio. This will help in making the generated audio sound more natural.

2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics.

3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music.

• Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE.

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research.

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

Audio examples description

1. Input MIDI 63 to the Spectral Model
2. Spectral Model reconstruction (trained on MIDI 63)
3. Spectral Model reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to the Parametric Model
5. Parametric reconstruction of the input note
6. Input MIDI 63 note
7. Parametric AE reconstruction of the input
8. Parametric CVAE reconstruction of the input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of the input (endpoint-trained model)
11. Parametric CVAE reconstruction of the input (endpoint-trained model)
12. CVAE-generated MIDI 65 violin note
13. Similar MIDI 65 violin note from the dataset
14. CVAE reconstruction of the MIDI 65 violin note
15. CVAE-generated MIDI 65 violin note with vibrato

Page 2: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

Figure One of the early Moog Modular Synthesizers

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I More generally it involves us specifying controlling parametersto a synthesizer to obtain an audio output

I What are the main parameters which govern the audiogeneration(at a high level)

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

1 Timbre

2 Pitch

3 Loudness

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

336

Generative Models for Audio Synthesis

I Rely on ability of algorithms to extract musically relevantinformation from vast amounts of data

I Autoregressive modeling Generative Adversarial Networks andVariational Autoencoders are some of the proposed generativemodeling methods

I Most methods in literature try to model audio signals directlyin the time or frequency domain

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches


6/36

Why Parametric?

▶ Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesis of a given instrument sound with flexible control over the pitch and loudness dynamics

▶ Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) being kept constant [Roebel and Rodet 2005]

▶ A powerful parametric representation over raw waveform or spectrogram has the potential to achieve high quality with less training data

1. [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way


7/36

Dataset

▶ Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

▶ We work with the 'Violin', played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

Why we chose the Violin: popular in Indian music, human voice-like timbre, ability to produce continuous pitch

[Figure: Instances per note in the overall dataset (26 to 34 instances per MIDI note, 60-71)]

▶ Average note duration for the chosen octave is about 4.5 s per note


8/36

Dataset

▶ We split the data into train (80%) and test (20%) instances across MIDI note labels

▶ Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances

9/36

Non-Parametric Reconstruction

▶ The framewise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

▶ We consider two cases: train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

[Block diagram: Sustain segment → FFT → Encoder → Z → Decoder (AE)]

▶ Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984]
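A minimal sketch of this resynthesis step, assuming librosa for the STFT and Griffin-Lim; the audio file, frame and hop sizes are illustrative placeholders, not the exact values used here.

# Sketch: invert a frame-wise magnitude spectrogram back to audio with Griffin-Lim.
import numpy as np
import librosa

def reconstruct_from_magnitudes(mag_frames, n_fft=1024, hop_length=256, n_iter=60):
    """mag_frames: (freq_bins x frames) magnitude spectrogram, e.g. stacked decoder outputs."""
    return librosa.griffinlim(mag_frames, n_iter=n_iter,
                              hop_length=hop_length, win_length=n_fft)

# Example round trip: |STFT| of a sustain segment -> Griffin-Lim phase estimation
y, sr = librosa.load(librosa.ex('trumpet'), sr=None)      # placeholder audio
mag = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))
y_hat = reconstruct_from_magnitudes(mag)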


10/36

Non-Parametric Reconstruction

Figure: Input MIDI 63 [audio example 1]
Figure: Including MIDI 63 [audio example 2]    Figure: Excluding MIDI 63 [audio example 3]


11/36

Parametric Model

1. Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

[Block diagram: Sustain segment → FFT → HpR → f0, harmonic magnitudes]

▶ Output of the HpR block ⇒ log-dB magnitudes + harmonics
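A simplified stand-in for the harmonic part of this analysis (the real HpR model does proper sinusoidal peak picking and residual subtraction); the function name and band width are assumptions for illustration only.

# Sketch: given a frame and its f0, read off log-dB magnitudes near each harmonic.
import numpy as np

def harmonic_log_magnitudes(frame, f0, sr, n_fft=4096):
    """Return (harmonic_freqs, log_dB_mags) for harmonics of f0 up to Nyquist."""
    window = np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(frame * window, n=n_fft)) + 1e-12
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    n_harmonics = int((sr / 2) // f0)
    harm_freqs, harm_mags = [], []
    for k in range(1, n_harmonics + 1):
        # take the strongest bin in a band of +/- f0/2 around the k-th harmonic
        band = (freqs > (k - 0.5) * f0) & (freqs < (k + 0.5) * f0)
        idx = np.argmax(spectrum * band)
        harm_freqs.append(freqs[idx])
        harm_mags.append(20.0 * np.log10(spectrum[idx]))    # log-dB magnitude
    return np.array(harm_freqs), np.array(harm_mags)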


12/36

Parametric Model

2. log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

▶ TAE ⇒ Iterative Cepstral Liftering:

A_0(k) = log(|X(k)|),  V_0 = −∞
while (A_i − A_0 < ∆):
    A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
    V_i = FFT(lifter(A_i))

[Block diagram: f0, harmonic magnitudes → TAE → CCs, f0]
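A sketch of this iteration in Python, using the usual true-envelope stopping rule (iterate until the log spectrum no longer exceeds the smoothed envelope by more than ∆); the cepstral order and threshold values are illustrative assumptions.

# Sketch: true-amplitude-envelope style iterative cepstral liftering.
import numpy as np

def true_amplitude_envelope(log_mag, cepstral_order, delta_db=0.1, max_iter=200):
    """log_mag: log-magnitude spectrum on the non-negative-frequency (rfft) grid."""
    A = log_mag.copy()              # A_0(k) = log|X(k)|
    V = np.full_like(A, -np.inf)    # V_0 = -inf
    for _ in range(max_iter):
        A = np.maximum(A, V)                        # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        ceps = np.fft.irfft(A)                      # cepstrum of the updated spectrum
        # lifter: keep only the low-quefrency coefficients (and their mirrored tail)
        ceps[cepstral_order + 1: len(ceps) - cepstral_order] = 0.0
        V = np.fft.rfft(ceps).real                  # V_i = FFT(lifter(A_i))
        if np.max(log_mag - V) < delta_db:          # envelope now covers the spectrum
            break
    return V    # cepstrally smoothed spectral envelope on the same frequency grid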


13/36

Parametric Model

▶ No open source implementation was available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

▶ Figure shows a TAE snapshot from [Caetano and Rodet 2012]

▶ Similar to the results we get [audio examples 4, 5]


14/36

Parametric Model

[Figure: TAE spectral envelopes, Magnitude (dB, −80 to −20) vs Frequency (0-5 kHz), for MIDI 60, 65 and 71]

▶ Spectral envelope shape varies across pitch:

1. Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]
2. Variation due to the TAE algorithm

▶ TAE → smooth function to estimate harmonic amplitudes

15/36

Parametric Model

▶ The number of CCs (Cepstral Coefficients, K_cc) depends on the sampling rate (F_s) and pitch (f_0):

K_cc ≤ F_s / (2 f_0)

▶ For our network, we choose the maximum K_cc (lowest pitch) = 91 and zero pad the higher-pitch CC vectors to 91-dimensional vectors
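A small sketch of this bookkeeping; the 48 kHz sampling rate is an assumption (it is consistent with K_cc = 91 at MIDI 60, about 261.6 Hz), and the helper names are illustrative.

# Sketch: cepstral order per pitch and zero padding to the maximum length.
import numpy as np

FS = 48000      # assumed sampling rate
K_MAX = 91      # maximum K_cc (lowest pitch in the octave)

def midi_to_hz(m):
    return 440.0 * 2 ** ((m - 69) / 12)

def kcc_for_pitch(f0, fs=FS):
    return int(fs / (2 * f0))            # K_cc <= Fs / (2 * f0)

def pad_ccs(ccs, k_max=K_MAX):
    out = np.zeros(k_max)
    out[: len(ccs)] = ccs[:k_max]
    return out

for m in (60, 65, 71):
    print(m, kcc_for_pitch(midi_to_hz(m)))   # roughly 91, 68, 48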

16/36

Generative Models

▶ Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (Mean Squared Error) between the input and the network reconstruction

• Simple to train and performs good reconstruction (since they minimize MSE)
• Not truly a generative model, as you cannot generate new data

▶ Variational Autoencoders [Kingma and Welling 2013] - inspired by Variational Inference, enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

▶ Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE, however learns the conditional distribution over an additional conditioning variable


17/36

Generative Models

Why VAE over AE?
Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?
Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control


18/36

Network Architecture

[Block diagram: CCs, f0 → Encoder → Z → Decoder (CVAE)]

▶ The network input is CCs → the MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log magnitude spectral envelopes

▶ We train the network on frames from the train instances

▶ For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

19/36

Network Architecture
▶ Main hyperparameters:

1. β - controls the relative weighting between reconstruction and prior enforcement

L ∝ MSE + β · KLD

[Figure: CVAE, varying β - MSE (up to 8·10⁻⁵) vs latent space dimensionality (0-70), for β = 0.01, 0.1, 1]

▶ Tradeoff between both terms; we choose β = 0.1
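A sketch of this β-weighted objective in PyTorch, assuming the usual diagonal-Gaussian posterior q(z|x) parameterized by (mu, logvar); the function name is illustrative.

# Sketch: L = MSE + beta * KLD, with beta = 0.1 as chosen above.
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    mse = F.mse_loss(x_hat, x, reduction='mean')
    # KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld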


20/36

Network Architecture

▶ Main hyperparameters:

2. Dimensionality of the latent space - the network's reconstruction ability

[Figure: CVAE (β = 0.1) vs AE - MSE (up to 8·10⁻⁵) vs latent space dimensionality (0-70)]

▶ Steep fall initially, flatter later; we choose dimensionality = 32


21/36

Network Architecture

▶ Network size: [91, 91, 32, 91, 91]
▶ 91 is the dimension of the input CCs, 32 is the latent space dimensionality
▶ Linear fully connected layers + Leaky ReLU activations
▶ ADAM [Kingma and Ba 2014] with initial lr = 10⁻³
▶ Training for 2000 epochs with batch size 512
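A sketch of such a CVAE with the layer sizes above; how f0 is encoded and where it is concatenated (here, a normalized scalar appended to both the encoder input and the latent code) are assumptions, not the exact recipe from these slides.

# Sketch: pitch-conditioned VAE over 91-dimensional CC frames.
import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, n_ccs=91, hidden=91, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_ccs + 1, hidden), nn.LeakyReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(
            nn.Linear(latent + 1, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, n_ccs),
        )

    def forward(self, ccs, f0):
        # ccs: (batch, 91) cepstral coefficients, f0: (batch,) normalized pitch
        cond = f0.unsqueeze(-1)
        h = self.enc(torch.cat([ccs, cond], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        x_hat = self.dec(torch.cat([z, cond], dim=-1))
        return x_hat, mu, logvar

model = CVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)    # ADAM, initial lr = 10^-3

A training step would pair this module with the β-weighted loss sketched earlier, run for 2000 epochs with batch size 512.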

22/36

Experiments

▶ Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch
2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches


23/36

Reconstruction

▶ Two training contexts:

1. T (×) is the target, ✓ marks training instances:
MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
Kept:   ✓    ✓    ✓    ×    ✓    ✓    ✓

2. Octave endpoints:
MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
Kept:   ✓   ×   ×   ×   ×   ×   ×   ×   ×   ×   ×   ✓

▶ In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances


24/36

Reconstruction

[Figure: MSE (up to 5·10⁻²) vs target pitch (MIDI 60-72) for AE and CVAE]

▶ Both AE and CVAE reasonably reconstruct the target pitch [audio examples 6, 7, 8]

▶ CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above) [audio examples 9, 10, 11]

▶ Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

▶ To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Block diagram: (f0, CCs) → cAE → (f0′, CCs′)]
Figure: 'Conditional' AE (cAE)

▶ [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

▶ The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0

26/36

Reconstruction

[Figure: MSE (up to 5·10⁻²) vs pitch (MIDI 60-72) for AE, CVAE and cAE]

▶ We train the cAE on the octave endpoints and evaluate performance as shown above

▶ Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

▶ We are interested in 'synthesizing' audio

▶ We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Block diagram: Z, f0 → Decoder]
Figure: Sampling from the Network

▶ We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
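A sketch of this sampling idea: small Gaussian steps in the 32-dimensional latent space, decoded frame by frame at the target (untrained) pitch. The step size, frame count and the trained `model` (the CVAE sketched earlier) are assumptions for illustration.

# Sketch: latent random walk, decoded into a trajectory of CC frames.
import torch

@torch.no_grad()
def random_walk_ccs(model, f0_norm, n_frames=200, latent_dim=32, step=0.05):
    z = torch.randn(latent_dim)                       # start from the prior
    cond = torch.tensor([f0_norm])
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn(latent_dim)        # small coherent step
        x_hat = model.dec(torch.cat([z, cond]).unsqueeze(0))
        frames.append(x_hat.squeeze(0))               # one frame of cepstral coefficients
    return torch.stack(frames)                        # (n_frames, 91) CC trajectory

Each decoded CC frame would then be converted back to a spectral envelope, sampled at the harmonics of the desired f0 contour (e.g. with vibrato) and additively synthesized.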

28/36

Generation

▶ We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

▶ On listening, we can see that the generated audio still lacks the soft, noisy sound of the violin bowing [audio examples 12, 13, 14]

▶ For a more realistic synthesis, we generate a violin note with vibrato [audio example 15]


29/36

Putting it all together

▶ We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

▶ Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

▶ Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together
▶ Our model is still far from perfect, however, and we have to improve it further:

1. An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural
2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics
3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

▶ Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE
2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model Reconstruction (trained on MIDI 63)
3. Spectral Model Reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric Reconstruction of input note
6. Input MIDI 63 note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint trained model)
10. Parametric AE reconstruction of input (endpoint trained model)
11. Parametric CVAE reconstruction of input (endpoint trained model)
12. CVAE generated MIDI 65 violin note
13. Similar MIDI 65 violin note from dataset

36/36

Audio examples description II

14. CVAE reconstruction of the MIDI 65 violin note
15. CVAE generated MIDI 65 violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 3: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

Figure One of the early Moog Modular Synthesizers

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I More generally it involves us specifying controlling parametersto a synthesizer to obtain an audio output

I What are the main parameters which govern the audiogeneration(at a high level)

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

1 Timbre

2 Pitch

3 Loudness

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

336

Generative Models for Audio Synthesis

I Rely on ability of algorithms to extract musically relevantinformation from vast amounts of data

I Autoregressive modeling Generative Adversarial Networks andVariational Autoencoders are some of the proposed generativemodeling methods

I Most methods in literature try to model audio signals directlyin the time or frequency domain

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable


17/36

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ Network captures dependencies between the timbre and the pitch ⇒ More accurate envelope generation + Pitch control


18/36

Network Architecture

[Block diagram of the CVAE: CCs + f0 → Encoder → Z → Decoder]

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log-magnitude spectral envelopes

I We train the network on frames from the training instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

19/36

Network Architecture

I Main hyperparameters -

1. β - Controls relative weighting between reconstruction and prior enforcement

L ∝ MSE + β·KLD

[Figure: CVAE, varying β - MSE (·10^−5) vs latent space dimensionality for β = 0.01, 0.1, 1]

I Tradeoff between both terms; choose β = 0.1
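A minimal PyTorch sketch of this loss, assuming the standard closed-form Gaussian KL term and a batch-mean reduction (both assumptions; the slide only states L ∝ MSE + β·KLD):

```python
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    """L ∝ MSE + β·KLD for a Gaussian latent with a standard-normal prior."""
    mse = F.mse_loss(x_hat, x, reduction='mean')
    # KL divergence between q(z|x, f0) = N(mu, sigma^2) and the prior N(0, I)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld
```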


20/36

Network Architecture

I Main hyperparameters -

2. Dimensionality of latent space - the network's reconstruction ability

[Figure: CVAE (β = 0.1) vs AE - MSE (·10^−5) vs latent space dimensionality]

I Steep fall initially, flatter later. Choose dimensionality = 32


21/36

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba, 2014] with initial lr = 10^−3

I Training for 2000 epochs with batch size 512
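A sketch of this architecture in PyTorch is shown below. The layer sizes, Leaky ReLU activations and ADAM settings follow the slide; how f0 is fed in (here as a single normalized scalar concatenated to the encoder input and to the latent code) is an assumption rather than a detail given on the slides.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    """CVAE sketch with layer sizes [91, 91, 32, 91, 91]: linear fully connected
    layers + Leaky ReLU, with the pitch appended to both encoder and decoder inputs."""

    def __init__(self, cc_dim=91, hidden_dim=91, latent_dim=32, cond_dim=1):
        super().__init__()
        # Encoder: (CCs, f0) -> hidden -> (mu, logvar)
        self.enc_hidden = nn.Linear(cc_dim + cond_dim, hidden_dim)
        self.enc_mu = nn.Linear(hidden_dim, latent_dim)
        self.enc_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder: (z, f0) -> hidden -> reconstructed CCs
        self.dec_hidden = nn.Linear(latent_dim + cond_dim, hidden_dim)
        self.dec_out = nn.Linear(hidden_dim, cc_dim)

    def encode(self, x, f0):
        h = F.leaky_relu(self.enc_hidden(torch.cat([x, f0], dim=-1)))
        return self.enc_mu(h), self.enc_logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z, f0):
        h = F.leaky_relu(self.dec_hidden(torch.cat([z, f0], dim=-1)))
        return self.dec_out(h)

    def forward(self, x, f0):
        mu, logvar = self.encode(x, f0)
        z = self.reparameterize(mu, logvar)
        return self.decode(z, f0), mu, logvar


# ADAM with initial lr = 1e-3; training would loop for 2000 epochs with batch size 512
model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```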

22/36

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1. Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2. Generation - How well the model 'synthesizes' note instances with new, unseen pitches


23/36

Reconstruction

I Two training contexts -

1. T (✗) is the target, ✓ marks training instances:
   MIDI: T−3  T−2  T−1  T   T+1  T+2  T+3
   Kept: ✓    ✓    ✓    ✗   ✓    ✓    ✓

2. Octave endpoints:
   MIDI: 60  61  62  63  64  65  66  67  68  69  70  71
   Kept: ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
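A small illustration of how the two splits can be constructed (the helper names and list representation are assumptions, not the authors' code):

```python
OCTAVE = list(range(60, 72))        # MIDI 60-71, the 4th octave used here

def neighbour_context(target):
    """Context 1: omit the target note T, keep its three neighbours on either side."""
    return [m for m in range(target - 3, target + 4) if m != target and m in OCTAVE]

def endpoint_context():
    """Context 2: keep only the octave endpoints; MIDI 61-70 are the targets."""
    return [60, 71]

# Example: target MIDI 63 -> train on [60, 61, 62, 64, 65, 66]
train_notes = neighbour_context(63)
```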


24/36

Reconstruction

[Figure: average frame-wise MSE (·10^−2) vs target pitch (MIDI 60-72) for AE and CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above) (audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Block diagram: (CCs, f0) → cAE → (CCs′, f0′)]

Figure: 'Conditional' AE (cAE)

I [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0

26/36

Reconstruction

[Figure: average frame-wise MSE (·10^−2) vs target pitch (MIDI 60-72) for AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Block diagram: Z + f0 → Decoder]

Figure: Sampling from the Network

I We follow the procedure in [Blaauw and Bonada, 2016] - a random walk to sample points coherently from the latent space

28/36

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we can hear that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


29/36

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further

1. An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2. We have not taken into account dynamics. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with wavenet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI, 1979] IMAI, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient Spectral Envelope Estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical signal processing, pages 91-122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an rnn for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model

2. Spectral Model Reconstruction (trained on MIDI 63)

3. Spectral Model Reconstruction (not trained on MIDI 63)

4. Input MIDI 60 note to Parametric Model

5. Parametric Reconstruction of input note

6. Input MIDI 63 Note

7. Parametric AE reconstruction of input

8. Parametric CVAE reconstruction of input

9. Input MIDI 65 note (endpoint trained model)

10. Parametric AE reconstruction of input (endpoint trained model)

11. Parametric CVAE reconstruction of input (endpoint trained model)

12. CVAE Generated MIDI 65 Violin note

13. Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14. CVAE Reconstruction of the MIDI 65 violin note

15. CVAE Generated MIDI 65 Violin note with vibrato

Page 4: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I More generally it involves us specifying controlling parametersto a synthesizer to obtain an audio output

I What are the main parameters which govern the audiogeneration(at a high level)

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

1 Timbre

2 Pitch

3 Loudness

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

336

Generative Models for Audio Synthesis

I Rely on ability of algorithms to extract musically relevantinformation from vast amounts of data

I Autoregressive modeling Generative Adversarial Networks andVariational Autoencoders are some of the proposed generativemodeling methods

I Most methods in literature try to model audio signals directlyin the time or frequency domain

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15


Putting it all together

I We explored autoencoder frameworks as generative models for audio synthesis of instrumental tones.

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model the inter-dependencies.

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously.



Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1. An immediate step is to model the residual of the input audio. This will help make the generated audio sound more natural.

2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics.

3. We should conduct more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music.

I Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE.

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research.


References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al. 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068-1077. JMLR.org.

[Esling et al. 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al. 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al. 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al. 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al. 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.


References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al. 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model Reconstruction (trained on MIDI 63)
3. Spectral Model Reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric Reconstruction of input note
6. Input MIDI 63 Note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE Generated MIDI 65 Violin note
13. Similar MIDI 65 Violin note from dataset


Audio examples description II

14. CVAE Reconstruction of the MIDI 65 violin note
15. CVAE Generated MIDI 65 Violin note with vibrato


Our Nearest Neighbours

▶ A major issue with the previous methods was frame-wise analysis-synthesis based reconstruction, which fails to model temporal evolution.

▶ [Engel et al 2017], inspired by WaveNet's [Oord et al 2016] autoregressive modeling capabilities for speech, extended it to musical instrument synthesis.

  ✓ Generation of realistic and creative sounds
  ✗ Complex network which was difficult to train and sample from

▶ [Wyse 2018] also modelled the audio autoregressively, albeit conditioning the waveform samples on additional parameters like pitch, velocity (loudness) and instrument class.

  ✓ Better control over generation through conditioning
  ✗ Unable to generalize to untrained pitches


Why Parametric?

▶ Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesizing a given instrument's sound with flexible control over the pitch and loudness dynamics.

▶ Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) kept constant [Roebel and Rodet 2005].

▶ A powerful parametric representation, over a raw waveform or spectrogram, has the potential to achieve high quality with less training data.

  1. [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way.


Dataset

▶ Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments.

▶ We work with the 'Violin', played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71).

▶ Why we chose the Violin: popular in Indian music, human voice-like timbre, and the ability to produce continuous pitch.

Figure: Instances per note in the overall dataset (between 26 and 34 instances per MIDI note).

▶ Average note duration for the chosen octave is about 4.5 s per note.


Dataset

▶ We split the data into train (80%) and test (20%) instances across MIDI note labels.

▶ Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances (a small splitting/framing sketch follows below).
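The split and framing step can be sketched as below; the Good-sounds loading details, the sampling rate and the hop size here are our assumptions, not the authors' code.

    import numpy as np

    def split_instances(note_files, train_frac=0.8, seed=0):
        """Split the recordings of one MIDI note into train/test instances (80/20)."""
        rng = np.random.default_rng(seed)
        files = rng.permutation(list(note_files))
        n_train = int(train_frac * len(files))
        return files[:n_train], files[n_train:]

    def frame_signal(y, sr=48000, frame_dur=0.0213, hop_frac=0.5):
        """Cut a note recording into short (~21.3 ms) analysis frames."""
        frame_len = int(frame_dur * sr)
        hop = int(frame_len * hop_frac)
        n_frames = 1 + max(0, (len(y) - frame_len) // hop)
        return np.stack([y[i * hop:i * hop + frame_len] for i in range(n_frames)])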

Non-Parametric Reconstruction

▶ The frame-wise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches.

▶ We consider two cases - training including and excluding MIDI 63, along with its 3 neighbouring pitches on either side.

[Block diagram: sustain portion → FFT → Encoder → Z → Decoder (AE)]

▶ Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984] (see the sketch below).
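A minimal sketch of this frame-wise pipeline, where `ae` is a placeholder for a trained frame-wise autoencoder and the STFT parameters are assumptions rather than the authors' settings:

    import numpy as np
    import librosa

    def reconstruct_nonparametric(y, ae, n_fft=1024, hop=256):
        """Frame-wise magnitude reconstruction followed by Griffin-Lim phase recovery."""
        mag = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))    # (freq bins, frames)
        rec = np.stack([ae(frame) for frame in mag.T], axis=1)        # reconstruct each frame
        return librosa.griffinlim(rec, hop_length=hop, n_iter=60)     # invert the magnitude spectrogram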


Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1); reconstruction with MIDI 63 included in training (audio example 2); reconstruction with MIDI 63 excluded from training (audio example 3).

Parametric Model

1. Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual).

[Block diagram: sustain portion → FFT → HpR → (f0, harmonic magnitudes)]

▶ Output of the HpR block ⇒ log-dB magnitudes + harmonics (a rough extraction sketch follows below)
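A rough sketch of extracting f0 and log-dB harmonic magnitudes from one analysis frame; the nearest-bin peak lookup is a simplification of a full HpR analysis, and the pitch range, harmonic count and sampling rate are assumptions:

    import numpy as np
    import librosa

    def harmonic_frame(frame, sr=48000, fmin=196.0, fmax=1000.0, n_harm=40):
        """Estimate f0 and log-dB harmonic magnitudes for one windowed frame."""
        f0 = float(librosa.yin(frame, fmin=fmin, fmax=fmax, sr=sr,
                               frame_length=len(frame)).mean())
        spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
        freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
        harm_freqs = f0 * np.arange(1, n_harm + 1)
        harm_freqs = harm_freqs[harm_freqs < sr / 2]
        bins = np.array([np.argmin(np.abs(freqs - f)) for f in harm_freqs])
        mags_db = 20.0 * np.log10(spec[bins] + 1e-12)    # log-dB harmonic magnitudes
        return f0, mags_db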


Parametric Model

2. log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

▶ TAE ⇒ iterative cepstral liftering (a NumPy sketch follows below):

    A_0(k) = log(|X(k)|),  V_0 = -∞
    while (A_i - A_0 < Δ):
        A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        V_i = FFT(lifter(A_i))

[Block diagram: (f0, magnitudes) → TAE → (CCs, f0)]
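A minimal NumPy sketch of this iteration; the convergence test and iteration cap are our assumptions, and the efficiency refinements of [Roebel and Rodet 2005] are not shown:

    import numpy as np

    def true_amplitude_envelope(log_mag, n_cc, delta=0.1, max_iter=200):
        """Iterative cepstral liftering (TAE) on one log-magnitude frame.

        log_mag : log-magnitude spectrum A_0 (length n_fft//2 + 1)
        n_cc    : number of cepstral coefficients kept by the lifter (K_cc)
        """
        A0 = np.asarray(log_mag, dtype=float).copy()
        A = A0.copy()
        V = np.full_like(A, -np.inf)
        c = None
        for _ in range(max_iter):
            A = np.maximum(A, V)              # clip the spectrum to lie above the current envelope
            c = np.fft.irfft(A)               # real cepstrum of the clipped spectrum
            n = len(c)
            c[n_cc:n - n_cc + 1] = 0.0        # lifter: keep only the low-quefrency coefficients
            V = np.fft.rfft(c).real           # smoothed envelope estimate V_i
            if np.max(A0 - V) < delta:        # assumed stop rule: envelope within delta of every peak
                break
        return V, c[:n_cc]                    # envelope and the cepstral coefficients (CCs)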


Parametric Model

▶ No open-source implementation is available for the TAE, so we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012].

▶ The figure shows a TAE example from [Caetano and Rodet 2012].

▶ It is similar to the results we get (audio examples 4 and 5).

Parametric Model

Figure: TAE spectral envelopes (magnitude in dB, -80 to -20, vs. frequency, 0-5 kHz) for MIDI 60, 65 and 71.

▶ The spectral envelope shape varies across pitch:

  1. Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]
  2. Variation due to the TAE algorithm

▶ TAE → a smooth function to estimate harmonic amplitudes

Parametric Model

▶ The number of CCs (cepstral coefficients, K_cc) depends on the sampling rate (F_s) and the pitch (f_0):

    K_cc ≤ F_s / (2 f_0)

▶ For our network, we choose the maximum K_cc (lowest pitch) = 91 and zero-pad the higher-pitch CCs to 91-dimensional vectors (see the helpers below).
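Two small illustrative helpers; the 48 kHz sampling rate in the example is an assumption, chosen only because it is consistent with K_cc = 91 at the lowest pitch (MIDI 60):

    import numpy as np

    def num_cepstral_coeffs(fs, f0):
        """K_cc <= F_s / (2 * f_0)."""
        return int(fs // (2.0 * f0))

    def pad_ccs(ccs, target_dim=91):
        """Zero-pad a CC vector to the fixed 91-dimensional network input."""
        ccs = np.asarray(ccs)
        return np.pad(ccs, (0, target_dim - len(ccs)))

    # num_cepstral_coeffs(48000, 261.63) -> 91 for MIDI 60 (assumed Fs = 48 kHz)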

Generative Models

▶ Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (mean squared error) between the input and the network reconstruction.

  • Simple to train, and perform good reconstruction (since they minimize the MSE)
  • Not truly a generative model, as you cannot generate new data

▶ Variational Autoencoders [Kingma and Welling 2013] - inspired by variational inference, enforce a prior on the latent space.

  • Truly generative, as we can 'generate' new data by sampling from the prior

▶ Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE, but learn the conditional distribution over an additional conditioning variable (the standard objectives are written out below).
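For reference, the standard training objectives from the cited papers, in notation assumed here (x = input frame, z = latent variable, c = conditioning variable, later the pitch f0):

    L_VAE(x)    = E_{q(z|x)}  [ log p(x|z) ]   - KL( q(z|x)   || p(z) )
    L_CVAE(x,c) = E_{q(z|x,c)}[ log p(x|z,c) ] - KL( q(z|x,c) || p(z|c) )

with the prior p(z), or p(z|c) in the conditional case, usually taken as a standard Gaussian.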


Generative Models

Why VAE over AE?
A continuous latent space from which we can sample points (and synthesize the corresponding audio).

Why CVAE over VAE?
Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control.


Network Architecture

[Block diagram: (CCs, f0) → Encoder → Z → Decoder (CVAE)]

▶ The network input is the CCs → the MSE represents a perceptually relevant distance, in terms of the squared error between the input and reconstructed log-magnitude spectral envelopes.

▶ We train the network on frames from the train instances.

▶ For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames.

Network Architecture

▶ Main hyperparameters:

  1. β - controls the relative weighting between reconstruction and prior enforcement (a PyTorch sketch of this loss follows below):

     L ∝ MSE + β · KLD

Figure: CVAE reconstruction MSE vs. latent space dimensionality for β = 0.01, 0.1 and 1.

▶ Tradeoff between both terms; we choose β = 0.1.
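A sketch of this loss in PyTorch; the closed-form Gaussian KL divergence is standard, and the variable names are ours:

    import torch
    import torch.nn.functional as F

    def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
        """L ∝ MSE + beta * KLD, with the closed-form Gaussian KL term."""
        mse = F.mse_loss(x_hat, x, reduction="mean")
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return mse + beta * kld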


Network Architecture

▶ Main hyperparameters:

  2. Dimensionality of the latent space - the network's reconstruction ability.

Figure: Reconstruction MSE vs. latent space dimensionality, CVAE (β = 0.1) vs. AE.

▶ Steep fall initially, flatter later. We choose a dimensionality of 32.


Network Architecture

▶ Network size: [91, 91, 32, 91, 91]
  • 91 is the dimension of the input CCs, 32 is the latent space dimensionality.

▶ Linear fully connected layers + Leaky ReLU activations

▶ ADAM [Kingma and Ba 2014] with initial learning rate 10^-3

▶ Training for 2000 epochs with batch size 512 (a model sketch follows below)
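Putting those choices together, a minimal PyTorch sketch of such a CVAE; the exact conditioning scheme (here, concatenating the pitch to the encoder and decoder inputs) is our assumption:

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        """[91, 91, 32, 91, 91] fully connected CVAE with pitch conditioning."""
        def __init__(self, in_dim=91, hid_dim=91, z_dim=32, cond_dim=1):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(in_dim + cond_dim, hid_dim), nn.LeakyReLU())
            self.mu = nn.Linear(hid_dim, z_dim)
            self.logvar = nn.Linear(hid_dim, z_dim)
            self.dec = nn.Sequential(nn.Linear(z_dim + cond_dim, hid_dim), nn.LeakyReLU(),
                                     nn.Linear(hid_dim, in_dim))

        def forward(self, ccs, f0):
            h = self.enc(torch.cat([ccs, f0], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

    model = CVAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)            # ADAM, initial lr = 1e-3

Training would then loop over batches of 512 CC frames for 2000 epochs, using the β-weighted loss sketched earlier.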

Experiments

▶ Two kinds of experiments to demonstrate the network's capabilities:

  1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch.

  2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches.


Reconstruction

▶ Two training contexts:

  1. T (✗) is the target pitch, ✓ marks training instances:

     MIDI   T-3   T-2   T-1    T    T+1   T+2   T+3
     Kept    ✓     ✓     ✓     ✗     ✓     ✓     ✓

  2. Octave endpoints:

     MIDI   60   61   62   63   64   65   66   67   68   69   70   71
     Kept    ✓    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✗    ✓

▶ In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances.


Reconstruction

Figure: Reconstruction MSE vs. target pitch (MIDI 60-71) for the AE and CVAE.

▶ Both the AE and CVAE reasonably reconstruct the target pitch (audio examples 6-8).

▶ The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9-11).

▶ Conditioning helps to capture the pitch dependency of the spectral envelope more accurately.

Reconstruction

▶ To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below (a two-line sketch follows after the figure).

[Block diagram: (f0, CCs) → cAE → (f0', CCs')]
Figure: 'Conditional' AE (cAE)

▶ [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model.

▶ The 'cAE' is comparable to our proposed CVAE, in that the network might potentially learn something from f0.
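A two-line sketch of that input construction (an illustration, not the authors' code):

    import torch

    def cae_io(ccs, f0):
        """cAE input/target: the pitch is appended to the CCs and reconstructed along with them."""
        x = torch.cat([f0, ccs], dim=-1)   # input  = (f0, CCs)
        return x, x                        # target = the same appended vector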

Reconstruction

Figure: Reconstruction MSE vs. target pitch (MIDI 60-71) for the AE, CVAE and cAE (octave-endpoint training).

▶ We train the cAE on the octave endpoints and evaluate performance as shown above.

▶ Appending f0 does not improve performance; it is in fact worse than the AE.

Generation

▶ We are interested in 'synthesizing' audio.

▶ We see how well the network can generate an instance of a desired pitch (which the network has not been trained on).

[Block diagram: Z, f0 → Decoder]
Figure: Sampling from the network

▶ We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (see the sketch below).
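A sketch of that sampling step, assuming the CVAE `model` sketched earlier; the step size and walk length are assumptions:

    import torch

    @torch.no_grad()
    def generate_frames(model, f0_track, z_dim=32, step=0.05):
        """Decode a slow random walk in the latent space, conditioned on a desired f0 contour."""
        z = torch.zeros(z_dim)
        frames = []
        for f0 in f0_track:                              # one f0 value per synthesis frame
            z = z + step * torch.randn(z_dim)            # coherent random walk in the latent space
            x = model.dec(torch.cat([z, f0.view(1)]))    # decoder conditioned on the desired pitch
            frames.append(x)
        return torch.stack(frames)                       # CC frames, to be rendered back to audio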

Generation

▶ We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65.

▶ On listening, we can see that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12-14).

▶ For a more realistic synthesis, we generate a violin note with vibrato (audio example 15).

Putting it all together

▶ We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones.

▶ Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies.

▶ Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously.


Putting it all together

▶ Our model is still far from perfect, however, and we have to improve it further:

  1. An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural.

  2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics.

  3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music.

▶ Our contributions:

  1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE.

  2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research.


References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model reconstruction (trained on MIDI 63)
3. Spectral Model reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric reconstruction of input note
6. Input MIDI 63 note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE-generated MIDI 65 Violin note
13. Similar MIDI 65 Violin note from dataset

Audio examples description II

14. CVAE reconstruction of the MIDI 65 Violin note
15. CVAE-generated MIDI 65 Violin note with vibrato

Page 7: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

236

Audio Synthesis

I What comes to your mind when you hear lsquoAudio Synthesisrsquo

I Early analog synthesizers used voltage controlled oscillatorsfilters amplifiers to generate the waveform and lsquoenvelopegeneratorsrsquo to shape it

I Data-driven statistical modeling + computing power=rArr Deep Learning for audio synthesis

336

Generative Models for Audio Synthesis

I Rely on ability of algorithms to extract musically relevantinformation from vast amounts of data

I Autoregressive modeling Generative Adversarial Networks andVariational Autoencoders are some of the proposed generativemodeling methods

I Most methods in literature try to model audio signals directlyin the time or frequency domain

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

6/36

Why Parametric?

I Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesizing a given instrument's sound with flexible control over the pitch and loudness dynamics

I Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) kept constant [Roebel and Rodet 2005]

I A powerful parametric representation, over raw waveform or spectrogram, has the potential to achieve high quality with less training data

1 [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way


7/36

Dataset

I Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

I We work with the 'Violin', played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

I Why we chose the Violin: popular in Indian music, human voice-like timbre, and the ability to produce continuous pitch

Figure: Instances per note in the overall dataset (26-34 instances per MIDI note, 60-71)

I Average note duration for the chosen octave is about 4.5 s per note


8/36

Dataset

I We split the data into train (80%) and test (20%) instances across MIDI note labels

I Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances
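A minimal sketch of such a per-label 80/20 split is given below; the (midi_note, wav_path) bookkeeping and the helper name are assumptions made for illustration, not the authors' actual code.

    import random
    from collections import defaultdict

    def split_by_midi(instances, train_frac=0.8, seed=0):
        """Split note instances into train/test separately for each MIDI label."""
        rng = random.Random(seed)
        by_note = defaultdict(list)
        for midi, path in instances:          # instances: list of (midi_note, wav_path)
            by_note[midi].append(path)
        train, test = [], []
        for midi, paths in by_note.items():
            rng.shuffle(paths)
            k = int(round(train_frac * len(paths)))
            train += [(midi, p) for p in paths[:k]]
            test += [(midi, p) for p in paths[k:]]
        return train, test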

9/36

Non-Parametric Reconstruction

I Frame-wise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - train including and excluding MIDI 63 along with its 3 neighbouring pitches on either side

Figure: Block diagram - Sustain segment → FFT → Encoder → Z → Decoder (AE)

I Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984]
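A rough sketch of this frame-wise analysis-synthesis loop, assuming some trained autoencoder callable `ae_model` that maps one magnitude frame to its reconstruction (the FFT size and hop length are illustrative choices, not values from the slides):

    import numpy as np
    import librosa

    def reconstruct(y, ae_model, n_fft=1024, hop=256):
        S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))      # magnitude frames
        S_hat = np.stack([ae_model(frame) for frame in S.T], axis=1)  # decode frame by frame
        # Invert the magnitude spectrogram; phase is estimated iteratively by Griffin-Lim
        return librosa.griffinlim(S_hat, hop_length=hop, n_fft=n_fft)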


10/36

Non-Parametric Reconstruction

Figure: Input MIDI 63 | Figure: Reconstruction, trained including MIDI 63 | Figure: Reconstruction, trained excluding MIDI 63 (audio examples 1-3)


11/36

Parametric Model

1 Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

Figure: Block diagram - Sustain segment → FFT → HpR → (f0, harmonic magnitudes)

I Output of HpR block ⇒ log-dB magnitudes + harmonics
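As a crude illustration only (not the sinusoidal peak-picking of the actual HpR model), the per-harmonic dB magnitudes can be approximated by reading the magnitude spectrum at multiples of f0; frame length, sample rate and harmonic count below are assumptions:

    import numpy as np

    def harmonic_magnitudes(mag_spectrum, f0, fs, n_harmonics=40, n_fft=1024):
        """Approximate per-harmonic dB magnitudes from the FFT bin nearest k*f0."""
        mags_db = 20 * np.log10(np.abs(mag_spectrum) + 1e-10)
        harm = []
        for k in range(1, n_harmonics + 1):
            fk = k * f0
            if fk >= fs / 2:
                break
            harm.append(mags_db[int(round(fk * n_fft / fs))])
        return np.array(harm)  # log-dB magnitudes at the harmonic frequencies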


12/36

Parametric Model

2 log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

I TAE ⇒ Iterative Cepstral Liftering:

    A_0(k) = log(|X(k)|), V_0 = −∞
    while (A_i − A_0 < Δ):
        A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
        V_i = FFT(lifter(A_i))

Figure: Block diagram - (f0, harmonic magnitudes) → TAE → (CCs, f0)
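The iteration above can be sketched in NumPy roughly as follows; the lifter order, maximum iteration count, stopping threshold and convergence check are assumptions made for illustration and may differ from the authors' implementation.

    import numpy as np

    def true_amplitude_envelope(log_mag, n_ccs, max_iter=100, delta=0.1):
        """Iterative cepstral smoothing (TAE-style) of a symmetric log-magnitude spectrum."""
        A0 = np.asarray(log_mag, dtype=float)
        A = A0.copy()
        V = np.full_like(A0, -np.inf)
        for _ in range(max_iter):
            A = np.maximum(A, V)                 # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
            ceps = np.fft.ifft(A).real           # real cepstrum of the updated curve
            ceps[n_ccs:-n_ccs] = 0.0             # lifter: keep only the low-quefrency part
            V = np.fft.fft(ceps).real            # V_i = FFT(lifter(A_i))
            if np.max(A0 - V) < delta:           # envelope covers the peaks within delta
                break
        return V, ceps[:n_ccs]                   # smooth envelope and its CCs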


13/36

Parametric Model

I No open source implementation was available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4, 5)


14/36

Parametric Model

Figure: TAE spectral envelopes, Magnitude (dB) vs Frequency (kHz), for MIDI 60, 65 and 71

I Spectral envelope shape varies across pitch
1 Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]
2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

15/36

Parametric Model

I No. of CCs (Cepstral Coefficients, K_cc) depends on the sampling rate (Fs) and pitch (f0):

    K_cc ≤ Fs / (2 · f0)

I For our network, we choose the maximum K_cc (lowest pitch) = 91 and zero pad the higher-pitch CC vectors to 91-dimensional vectors
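A small sketch of that bookkeeping; the constant 91 comes from the slide, everything else (names, integer truncation) is an assumed illustration:

    import numpy as np

    MAX_CCS = 91  # K_cc at the lowest pitch of the chosen octave

    def n_ccs(fs, f0):
        """Number of usable cepstral coefficients: K_cc <= Fs / (2 * f0)."""
        return int(fs / (2 * f0))

    def pad_ccs(ccs, target=MAX_CCS):
        """Zero pad a higher-pitch CC vector to the fixed network input size."""
        out = np.zeros(target)
        out[:min(len(ccs), target)] = ccs[:target]
        return out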

16/36

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimize the MSE (Mean Squared Error) between the input and the network reconstruction
• Simple to train and performs good reconstruction (since they minimize MSE)
• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - Inspired by Variational Inference, enforce a prior on the latent space
• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - Same principle as a VAE, however they learn the conditional distribution over an additional conditioning variable


17/36

Generative Models

Why VAE over AE?
Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?
Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control


18/36

Network Architecture

Figure: Block diagram - (CCs, f0) → Encoder → Z → Decoder (CVAE)

I Network input is CCs → MSE represents a perceptually relevant distance, the squared error between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

19/36

Network Architecture

I Main hyperparameters -

1 β - Controls the relative weighting between reconstruction and prior enforcement:

    L ∝ MSE + β · KLD

Figure: CVAE, varying β - MSE (~10⁻⁵ scale) vs latent space dimensionality (0-70), for β = 0.01, 0.1, 1

I Tradeoff between both terms; choose β = 0.1
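For a Gaussian latent with diagonal covariance and a standard-normal prior, this loss can be written as below; the analytic KL term is the usual VAE expression, and the reduction choices are assumptions:

    import torch
    import torch.nn.functional as F

    def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
        """L ∝ MSE + β * KLD for a (C)VAE with a standard-normal prior."""
        mse = F.mse_loss(x_hat, x, reduction="mean")
        # KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
        kld = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
        return mse + beta * kld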


20/36

Network Architecture

I Main hyperparameters -

2 Dimensionality of the latent space - governs the network's reconstruction ability

Figure: CVAE (β = 0.1) vs AE - MSE (~10⁻⁵ scale) vs latent space dimensionality (0-70)

I Steep fall initially, flatter later; choose dimensionality = 32


21/36

Network Architecture

I Network size: [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs, 32 is the latent space dimensionality
I Linear Fully Connected Layers + Leaky ReLU activations
I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³
I Training for 2000 epochs with batch size 512
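Put together with the loss sketch above, a minimal PyTorch model consistent with these sizes might look as follows. The layer widths, learning rate, epochs and batch size are from the slides; the conditioning mechanism (concatenating a scalar f0 to the encoder and decoder inputs) and all other details are assumptions for illustration.

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        """[91 -> 91 -> 32 -> 91 -> 91] CVAE sketch, conditioned on pitch f0."""
        def __init__(self, in_dim=91, hid=91, z_dim=32, cond_dim=1):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(in_dim + cond_dim, hid), nn.LeakyReLU())
            self.mu = nn.Linear(hid, z_dim)
            self.logvar = nn.Linear(hid, z_dim)
            self.dec = nn.Sequential(
                nn.Linear(z_dim + cond_dim, hid), nn.LeakyReLU(),
                nn.Linear(hid, in_dim),
            )

        def forward(self, ccs, f0):
            h = self.enc(torch.cat([ccs, f0], dim=1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
            x_hat = self.dec(torch.cat([z, f0], dim=1))
            return x_hat, mu, logvar

    # Optimizer settings taken from the slides (ADAM, lr = 1e-3); training loop omitted
    model = CVAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)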

22/36

Experiments

I Two kinds of experiments to demonstrate the network's capabilities:

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


23/36

Reconstruction

I Two training contexts -

1 T (✗) is the target, ✓ marks training instances:
   MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:   ✓    ✓    ✓    ✗    ✓    ✓    ✓

2 Octave endpoints:
   MIDI:   60   61   62   63   64   65
   Kept:   ✓    ✗    ✗    ✗    ✗    ✗
   MIDI:   66   67   68   69   70   71
   Kept:   ✗    ✗    ✗    ✗    ✗    ✓

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances


24/36

Reconstruction

Figure: MSE (~10⁻² scale) vs target pitch (MIDI 60-71) for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: 'Conditional' AE (cAE) - (CCs, f0) → AE → (CCs', f0')

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0

26/36

Reconstruction

Figure: MSE (~10⁻² scale) vs target pitch (MIDI 60-71) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network - Z (sampled from the prior) + f0 → Decoder

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
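A sketch of that sampling procedure, assuming the CVAE class sketched earlier; the walk step size and frame count are illustrative assumptions rather than values from the slides. The decoded CC frames would still have to be converted back to harmonic amplitudes and synthesized (e.g. with a sinusoidal model) to obtain audio.

    import torch

    @torch.no_grad()
    def generate_note(model, f0_value, n_frames=200, z_dim=32, step=0.05):
        """Random walk in latent space, decoded frame by frame at a fixed (untrained) pitch."""
        z = torch.randn(1, z_dim)                       # start from the prior
        f0 = torch.full((1, 1), float(f0_value))
        frames = []
        for _ in range(n_frames):
            z = z + step * torch.randn_like(z)          # small coherent steps between frames
            ccs = model.dec(torch.cat([z, f0], dim=1))  # decoded cepstral envelope frame
            frames.append(ccs.squeeze(0))
        return torch.stack(frames)                      # (n_frames, 91) envelope parameters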

28/36

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


29/36

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural
2 We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics
3 Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE
2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with wavenet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model
2 Spectral Model Reconstruction (trained on MIDI 63)
3 Spectral Model Reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to Parametric Model
5 Parametric Reconstruction of input note
6 Input MIDI 63 Note
7 Parametric AE reconstruction of input
8 Parametric CVAE reconstruction of input
9 Input MIDI 65 note (endpoint trained model)
10 Parametric AE reconstruction of input (endpoint trained model)
11 Parametric CVAE reconstruction of input (endpoint trained model)
12 CVAE Generated MIDI 65 Violin note
13 Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note
15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 8: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

336

Generative Models for Audio Synthesis

I Rely on ability of algorithms to extract musically relevantinformation from vast amounts of data

I Autoregressive modeling Generative Adversarial Networks andVariational Autoencoders are some of the proposed generativemodeling methods

I Most methods in literature try to model audio signals directlyin the time or frequency domain

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network Architecture

I Main hyperparameters:

1. β - controls the relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

[Plot: MSE (scale 10⁻⁵) vs. latent space dimensionality, for β = 0.01, 0.1, 1]

Figure: CVAE, varying β

I Tradeoff between both terms; we choose β = 0.1 (a minimal loss sketch follows)
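As a rough illustration of how such a β-weighted loss can be assembled (a sketch under our assumptions: Gaussian encoder outputs mu/logvar and a closed-form KL against a standard-normal prior; not necessarily the exact training code):

    import torch
    import torch.nn.functional as F

    def cvae_loss(x, x_hat, mu, logvar, beta=0.1):
        # Reconstruction term: MSE between input and reconstructed cepstral coefficients
        mse = F.mse_loss(x_hat, x, reduction="mean")
        # KL divergence of N(mu, sigma^2) from the standard-normal prior N(0, I)
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        # L ∝ MSE + β · KLD : β trades reconstruction quality against prior enforcement
        return mse + beta * kld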


2036

Network Architecture

I Main hyperparameters:

2. Dimensionality of the latent space - affects the network's reconstruction ability

[Plot: MSE (scale 10⁻⁵) vs. latent space dimensionality for the CVAE and the AE]

Figure: CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later; we choose dimensionality = 32


2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs; 32 is the latent space dimensionality

I Linear fully connected layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512 (a sketch of such a network follows)
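A minimal sketch of what such a network could look like (an assumption of ours, not the released code; the pitch condition is shown here as a one-dimensional f0 value concatenated to both the encoder and decoder inputs):

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        # Layer sizes follow the slide: [91, 91, 32, 91, 91], with f0 appended as the condition
        def __init__(self, cc_dim=91, hidden_dim=91, latent_dim=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(cc_dim + 1, hidden_dim), nn.LeakyReLU())
            self.mu = nn.Linear(hidden_dim, latent_dim)
            self.logvar = nn.Linear(hidden_dim, latent_dim)
            self.dec = nn.Sequential(
                nn.Linear(latent_dim + 1, hidden_dim), nn.LeakyReLU(),
                nn.Linear(hidden_dim, cc_dim),
            )

        def forward(self, cc, f0):
            h = self.enc(torch.cat([cc, f0], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

    # Training setup as on the slide: ADAM, lr = 1e-3, 2000 epochs, batch size 512
    model = CVAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

Each training step would then apply a β-weighted loss such as the cvae_loss sketch shown earlier.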

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches


2336

Reconstruction

I Two training contexts:

1. Neighbouring pitches: T (✗) is the target, ✓ marks training instances
   MIDI: T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:  ✓    ✓    ✓    ✗    ✓    ✓    ✓

2. Octave endpoints:
   MIDI: 60 61 62 63 64 65 66 67 68 69 70 71
   Kept:  ✓  ✗  ✗  ✗  ✗  ✗  ✗  ✗  ✗  ✗  ✗  ✓

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (see the evaluation sketch below)
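A sketch of how such an evaluation could be computed (our assumption: test_frames holds the CC frames of the held-out target-pitch instances and model is the hypothetical CVAE sketched earlier; the names are illustrative):

    import torch

    def average_reconstruction_mse(model, test_frames, test_f0):
        # test_frames: (N, 91) cepstral-coefficient frames of the target-pitch test instances
        # test_f0:     (N, 1) pitch condition for each frame
        model.eval()
        with torch.no_grad():
            recon, _, _ = model(test_frames, test_f0)
            # Frame-wise squared error on the log magnitude envelope representation,
            # averaged across all frames of all target instances
            return torch.mean((recon - test_frames) ** 2).item()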


2436

Reconstruction

[Plot: reconstruction MSE (scale 10⁻²) vs. target pitch (MIDI 60-72) for the AE and the CVAE]

I Both the AE and the CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Figure: 'Conditional' AE (cAE) - input [CCs, f0] → cAE → reconstructed [CCs′, f0′]]

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0 (a sketch of the cAE input/target construction follows)
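A minimal sketch of how the cAE input/target could be assembled (our assumption: frame-wise CC vectors with the f0 value appended, both reconstructed by a plain autoencoder; the layer sizes are illustrative):

    import torch
    import torch.nn as nn

    def make_cae_pairs(cc, f0):
        # cc: (N, 91) cepstral coefficients, f0: (N, 1) pitch values
        # The cAE input and its reconstruction target are the same appended vector [CCs, f0]
        x = torch.cat([cc, f0], dim=-1)
        return x, x

    # A plain autoencoder over the 92-dimensional appended vector (91 CCs + 1 pitch)
    cae = nn.Sequential(
        nn.Linear(92, 91), nn.LeakyReLU(),
        nn.Linear(91, 32), nn.LeakyReLU(),
        nn.Linear(32, 91), nn.LeakyReLU(),
        nn.Linear(91, 92),
    )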

2636

Reconstruction

[Plot: reconstruction MSE (scale 10⁻²) vs. pitch (MIDI 60-72) for the AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Figure: Sampling from the network - latent Z and the target f0 fed to the Decoder]

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (see the sketch below)
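A sketch of such a random-walk generation loop (our assumptions: small Gaussian steps in the latent space, each step decoded with the desired f0 condition; model is the hypothetical CVAE sketched earlier and the step size is illustrative):

    import torch

    def generate_frames(model, f0_value, n_frames=200, latent_dim=32, step=0.05):
        # Random walk in the latent space: small correlated steps give coherent frame sequences
        model.eval()
        z = torch.randn(1, latent_dim)
        f0 = torch.tensor([[f0_value]])
        frames = []
        with torch.no_grad():
            for _ in range(n_frames):
                z = z + step * torch.randn_like(z)          # small Gaussian step
                cc = model.dec(torch.cat([z, f0], dim=-1))  # decode CCs for the target pitch
                frames.append(cc)
        return torch.cat(frames, dim=0)  # (n_frames, 91) cepstral-coefficient frames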

2836

Generation

I We train on instances across the entire octave sans MIDI 65 and then generate MIDI 65

I On listening, we can see that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15; a sketch of the vibrato pitch contour follows)
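A sketch of how a vibrato note could be generated with this setup (our assumption: a sinusoidally modulated f0 contour fed frame-by-frame to the decoder, reusing the hypothetical model from the earlier sketches; the frame rate, vibrato rate and depth values are illustrative):

    import numpy as np
    import torch

    def vibrato_f0(midi=65, n_frames=200, frame_rate=86.0, rate_hz=5.5, depth_cents=30):
        # Sinusoidal pitch modulation around the note's centre frequency
        t = np.arange(n_frames) / frame_rate
        f_center = 440.0 * 2 ** ((midi - 69) / 12)
        return f_center * 2 ** (depth_cents * np.sin(2 * np.pi * rate_hz * t) / 1200)

    # Decode one spectral-envelope frame per f0 value (model.dec from the CVAE sketch)
    # f0s = torch.tensor(vibrato_f0(), dtype=torch.float32).unsqueeze(1)
    # frames = [model.dec(torch.cat([torch.randn(1, 32), f.view(1, 1)], dim=-1)) for f in f0s]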


2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1. An immediate step will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3. Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


3136

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

3436

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1. Input MIDI 63 to Spectral Model

2. Spectral Model Reconstruction (trained on MIDI 63)

3. Spectral Model Reconstruction (not trained on MIDI 63)

4. Input MIDI 60 note to Parametric Model

5. Parametric Reconstruction of input note

6. Input MIDI 63 Note

7. Parametric AE reconstruction of input

8. Parametric CVAE reconstruction of input

9. Input MIDI 65 note (endpoint-trained model)

10. Parametric AE reconstruction of input (endpoint-trained model)

11. Parametric CVAE reconstruction of input (endpoint-trained model)

12. CVAE Generated MIDI 65 Violin note

13. Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14. CVAE Reconstruction of the MIDI 65 violin note

15. CVAE Generated MIDI 65 Violin note with vibrato


[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 10: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wise analysis-synthesis based reconstruction, which fails to model temporal evolution

I [Engel et al 2017], inspired by WaveNet's [Oord et al 2016] autoregressive modelling capabilities for speech, extended it to musical instrument synthesis

✓ Generation of realistic and creative sounds
✗ Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio, albeit by conditioning the waveform samples on additional parameters like pitch, velocity (loudness) and instrument class

✓ Was able to achieve better control over generation by conditioning

✗ Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesis of a given instrument sound with flexible control over the pitch and loudness dynamics

I Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) being kept constant [Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform or spectrogram has the potential to achieve high quality with less training data

1 [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

I We work with the 'Violin' played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

Why we chose Violin: popular in Indian Music, human voice-like timbre, ability to produce continuous pitch

Figure: Instances per note in the overall dataset (y-axis: MIDI 60-71, x-axis: instances; roughly 26-34 instances per note)

I The average note duration for the chosen octave is about 4.5 s per note

836

Dataset

I We split the data into train (80%) and test (20%) instances across MIDI note labels

I Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances
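A minimal sketch of such a per-label split; the (midi_label, filename) pairing and the random seeding are assumptions for illustration, not the actual pipeline:

import random
from collections import defaultdict

def split_by_midi(instances, train_frac=0.8, seed=0):
    """Split (midi_label, filename) pairs into train/test per MIDI label."""
    random.seed(seed)
    by_label = defaultdict(list)
    for midi, fname in instances:
        by_label[midi].append(fname)
    train, test = [], []
    for midi, files in sorted(by_label.items()):
        random.shuffle(files)
        k = int(train_frac * len(files))
        train += [(midi, f) for f in files[:k]]
        test += [(midi, f) for f in files[k:]]
    return train, test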

936

Non-Parametric Reconstruction

I The frame-wise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

Figure: Sustain portion → FFT → Encoder → Z → Decoder (AE)

I Repeat the above across all input spectral frames, and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984]
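A minimal sketch of this analysis-synthesis loop, assuming a trained frame-wise autoencoder `ae` (a hypothetical callable mapping one magnitude frame to its reconstruction) and an assumed FFT size of 1024; Griffin-Lim is taken from librosa:

import numpy as np
import librosa

def nonparametric_reconstruct(y, ae, n_fft=1024, hop=256):
    """Frame-wise magnitude autoencoding followed by Griffin-Lim inversion (sketch)."""
    S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))   # (1 + n_fft/2, n_frames)
    S_hat = np.stack([ae(frame) for frame in S.T], axis=1)     # reconstruct frame by frame
    return librosa.griffinlim(S_hat, n_iter=60, n_fft=n_fft, hop_length=hop)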

1036

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)

Figure: Reconstruction including MIDI 63 in training (audio example 2); Figure: Reconstruction excluding MIDI 63 (audio example 3)


1136

Parametric Model

1 Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

Figure: Sustain portion → FFT → HpR → (f0, harmonic magnitudes)

I Output of the HpR block ⇒ log-dB magnitudes + harmonics
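The full HpR analysis (peak picking, harmonic tracking, residual subtraction) is available in reference implementations such as sms-tools; the simplified sketch below only illustrates the idea of reading log-dB magnitudes near integer multiples of f0 and is not the actual model used:

import numpy as np

def harmonic_log_magnitudes(mag_frame, f0, fs, n_fft=1024, n_harmonics=40):
    """Simplified stand-in for HpR: sample the magnitude frame at multiples of f0."""
    bins = np.round(np.arange(1, n_harmonics + 1) * f0 * n_fft / fs).astype(int)
    bins = bins[bins < len(mag_frame)]              # discard harmonics above Nyquist
    return 20 * np.log10(mag_frame[bins] + 1e-12)   # log-dB harmonic magnitudes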

1236

Parametric Model

2 log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

I TAE ⇒ iterative cepstral liftering:

A_0(k) = log|X(k)|, V_0 = −∞
while (A_i − A_0 < Δ):
    A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
    V_i = FFT(lifter(A_i))

Figure: (f0, harmonic magnitudes) → TAE → (CCs, f0)
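A rough numpy sketch of this iterative cepstral liftering; the liftering window and the exact stopping rule are assumptions, since the slide only gives the update equations:

import numpy as np

def true_amplitude_envelope(half_spectrum, n_ccs, max_iter=100, delta=0.1):
    """TAE sketch: grow a cepstrally-smoothed envelope until it covers the spectral peaks."""
    A = np.log(np.abs(half_spectrum) + 1e-12)    # A_0(k) = log|X(k)|
    A0 = A.copy()
    V = np.full_like(A, -np.inf)                 # V_0 = -inf
    ceps = np.zeros(2 * (len(A) - 1))
    for _ in range(max_iter):
        A = np.maximum(A, V)                     # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        ceps = np.fft.irfft(A)                   # real cepstrum of the log spectrum
        ceps[n_ccs:-n_ccs] = 0.0                 # lifter: keep the first n_ccs quefrencies
        V = np.fft.rfft(ceps).real               # V_i = FFT(lifter(A_i))
        if np.max(A0 - V) < delta:               # assumed stopping rule: envelope covers peaks
            break
    return V, ceps[:n_ccs]                       # smoothed log envelope and its CCs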

1336

Parametric Model

I No open-source implementation was available for the TAE, so we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I The figure shows a TAE snapshot from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4, 5)


1436

Parametric Model

Figure: TAE spectral envelopes, magnitude (dB, −80 to −20) vs frequency (0-5 kHz), for MIDI 60, 65 and 71

I Spectral envelope shape varies across pitch:

1 Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

1536

Parametric Model

I The number of CCs (cepstral coefficients, Kcc) depends on the sampling rate (Fs) and the pitch (f0):

Kcc ≤ Fs / (2 · f0)

I For our network we choose the maximum Kcc (lowest pitch) = 91, and zero-pad the higher-pitch CCs to 91-dimensional vectors
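For instance, assuming a 48 kHz sampling rate (consistent with Kcc = 91 at the lowest pitch, MIDI 60 ≈ 261.6 Hz):

import numpy as np

def num_ccs(fs, f0):
    return int(fs // (2 * f0))          # Kcc <= Fs / (2 * f0)

def pad_ccs(ccs, target_dim=91):
    out = np.zeros(target_dim)          # zero-pad higher-pitch CCs to a fixed 91-dim vector
    out[:len(ccs)] = ccs[:target_dim]
    return out

print(num_ccs(48000, 261.63))           # -> 91 (MIDI 60)
print(num_ccs(48000, 493.88))           # -> 48 (MIDI 71)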

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (mean squared error) between the input and the network reconstruction

• Simple to train and performs good reconstruction (since they minimize MSE)

• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - inspired by variational inference; enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE, however learns the conditional distribution over an additional conditioning variable

1736

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control

1836

Network Architecture

Figure: (CCs, f0) → Encoder → Z → Decoder → CCs (CVAE)

I The network input is CCs → the MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log-magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test-instance frames

1936

Network Architecture

I Main hyperparameters:

1 β - controls the relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

Figure: MSE (×10^-5) vs latent space dimensionality, for β = 0.01, 0.1, 1 (CVAE, varying β)

I Trade-off between both terms; choose β = 0.1
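A sketch of this weighted objective in PyTorch, assuming the usual diagonal-Gaussian encoder output (mu, logvar) and a standard normal prior:

import torch
import torch.nn.functional as F

def cvae_loss(x, x_recon, mu, logvar, beta=0.1):
    """L ∝ MSE + β·KLD, with β = 0.1 (the value chosen above)."""
    mse = F.mse_loss(x_recon, x, reduction='mean')
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())   # KL(q(z|x, f0) || N(0, I))
    return mse + beta * kld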

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of the latent space - governs the network's reconstruction ability

Figure: MSE (×10^-5) vs latent space dimensionality, CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later; choose dimensionality = 32

2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs; 32 is the latent space dimensionality

I Fully connected linear layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512
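A sketch of such a network in PyTorch; the exact way f0 is fed to the encoder and decoder (here a normalized scalar appended to the inputs) is an assumption:

import torch
import torch.nn as nn

class CVAE(nn.Module):
    """[91, 91, 32, 91, 91] fully connected CVAE with Leaky ReLU activations (sketch)."""
    def __init__(self, in_dim=91, hid_dim=91, z_dim=32, cond_dim=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim + cond_dim, hid_dim), nn.LeakyReLU())
        self.mu, self.logvar = nn.Linear(hid_dim, z_dim), nn.Linear(hid_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + cond_dim, hid_dim), nn.LeakyReLU(),
                                 nn.Linear(hid_dim, in_dim))

    def forward(self, ccs, f0):
        h = self.enc(torch.cat([ccs, f0], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterization trick
        return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)        # ADAM, initial lr = 10^-3
# train for 2000 epochs with batch size 512, minimizing the cvae_loss sketched earlier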

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities:

1 Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - how well the model 'synthesizes' note instances with new, unseen pitches

2336

Reconstruction

I Two training contexts (summarized in the sketch after this list) -

1 T (×) is the target; ✓ marks kept training notes:

MIDI:  T−3  T−2  T−1  T  T+1  T+2  T+3
Kept:   ✓    ✓    ✓   ×   ✓    ✓    ✓

2 Octave endpoints:

MIDI:  60 61 62 63 64 65 66 67 68 69 70 71
Kept:   ✓  ×  ×  ×  ×  ×  ×  ×  ×  ×  ×  ✓

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
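The two contexts can be summarized as below; set membership outside the tables shown is an assumption:

def training_midis(context, target=63):
    if context == "neighbours":   # context 1: leave out the target, keep its +/- 3 neighbours
        return [m for m in range(target - 3, target + 4) if m != target]
    if context == "endpoints":    # context 2: keep only the octave endpoints
        return [60, 71]
    raise ValueError(context)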

2436

Reconstruction

Figure: MSE (×10^-2) vs pitch (MIDI 60-72), AE vs CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above) (audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: 'Conditional' AE (cAE): (f0, CCs) → cAE → (f0′, CCs′)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
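A sketch of this baseline, assuming the same layer sizes as the CVAE sketched earlier; the pitch is simply concatenated to the CC vector and reconstructed along with it:

import torch
import torch.nn as nn

class ConditionalAE(nn.Module):
    """'Conditional' AE (cAE) sketch: reconstructs the appended [f0, CCs] vector."""
    def __init__(self, in_dim=1 + 91, hid_dim=91, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.LeakyReLU(),
                                 nn.Linear(hid_dim, z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, hid_dim), nn.LeakyReLU(),
                                 nn.Linear(hid_dim, in_dim))

    def forward(self, ccs, f0):
        x = torch.cat([f0, ccs], dim=-1)   # append pitch to the input CCs
        return self.dec(self.enc(x))       # network outputs [f0', CCs']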

2636

Reconstruction

Figure: MSE (×10^-2) vs pitch (MIDI 60-72), AE vs CVAE vs cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network: (Z, f0) → Decoder

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
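A sketch of such coherent sampling, assuming a trained decoder such as model.dec from the earlier sketch; the step size of the walk is an assumption:

import torch

@torch.no_grad()
def random_walk_generate(decoder, f0, n_frames=200, z_dim=32, step=0.05):
    """Sample latent points along a small-step Gaussian random walk and decode
    one CC frame per step, all conditioned on the desired pitch f0."""
    z = torch.randn(z_dim)
    f0 = torch.as_tensor([float(f0)])
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn(z_dim)                     # coherent drift through latent space
        frames.append(decoder(torch.cat([z, f0], dim=-1)))    # decode CCs for this frame
    return torch.stack(frames)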

2836

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously

3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate next step is to model the residual of the input audio. This will help make the generated audio sound more natural.

2 We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics.

3 Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music.

I Our contributions:

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE.

2 The TAE method has no known Python implementation, so we plan to make our code open-source for the MIR community to use in research.

3136

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

3436

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 11: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesis of a given instrument sound with flexible control over the pitch and loudness dynamics

I Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) being kept constant [Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform or spectrogram has the potential to achieve high quality with less training data

1 [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

I We work with the 'Violin', played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

Why we chose the Violin: popular in Indian music, human voice-like timbre, ability to produce continuous pitch

Figure: Instances per note in the overall dataset (MIDI 60-71, roughly 26-34 instances per note)

I Average note duration for the chosen octave is about 4.5 s per note

836

Dataset

I We split the data into train (80%) and test (20%) instances across MIDI note labels

I Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

Block diagram: Sustain → FFT → Encoder → Z → Decoder (AE)

I Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984]
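I A minimal sketch of this baseline's resynthesis path is given below, assuming librosa for the STFT and Griffin-Lim; the trained frame-wise autoencoder `ae` and the FFT/hop sizes are placeholders, not the exact settings used here:

    import numpy as np
    import librosa

    def baseline_resynthesis(y, ae, n_fft=1024, hop=256):
        # Magnitude spectrogram of the sustain portion (phase is discarded)
        S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))
        # Pass each spectral frame through the trained frame-wise autoencoder
        S_hat = np.stack([ae.reconstruct(frame) for frame in S.T], axis=1)
        # Invert the magnitude-only spectrogram with Griffin-Lim phase estimation
        return librosa.griffinlim(S_hat, n_iter=60, hop_length=hop, win_length=n_fft)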

1036

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)

Figure: Including MIDI 63 (audio example 2)    Figure: Excluding MIDI 63 (audio example 3)

1136

Parametric Model

1 Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

Block diagram: Sustain → FFT → HpR → (f0, harmonic magnitudes)

I Output of HpR block ⇒ log-dB magnitudes + harmonics
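I A simplified sketch of this harmonic analysis step (residual neglected): sample the frame's log-dB magnitude at integer multiples of f0. A full HpR/SMS implementation would use peak detection and parabolic interpolation; the 48 kHz sampling rate and FFT size are assumptions.

    import numpy as np

    def harmonic_log_magnitudes(mag_frame, f0, fs=48000, n_fft=1024):
        # Harmonics of f0 that lie below Nyquist
        n_harm = int((fs / 2) // f0)
        bins = np.round(np.arange(1, n_harm + 1) * f0 * n_fft / fs).astype(int)
        # log-dB magnitudes at the harmonic bins (eps avoids log of zero)
        return 20.0 * np.log10(mag_frame[bins] + 1e-12)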

1236

Parametric Model

2 log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

I TAE ⇒ Iterative Cepstral Liftering

    A_0(k) = log(|X(k)|),  V_0 = −∞
    while (A_i − A_0 < Δ):
        A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
        V_i = FFT(lifter(A_i))

Block diagram: (f0, magnitudes) → TAE → (CCs, f0)
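I A NumPy sketch of the iteration above; this is our reading of the TAE loop, not a drop-in for [Roebel and Rodet 2005]. `logmag` is the log-dB spectrum (harmonic amplitudes interpolated onto the FFT grid), and the convergence test used here (the envelope covers the spectrum to within Δ) is one common choice of stopping rule.

    import numpy as np

    def true_amplitude_envelope(logmag, Kcc, delta=0.1, max_iter=200):
        A = logmag.copy()                          # A_0(k) = log|X(k)|
        V = np.full_like(A, -np.inf)               # V_0 = -infinity
        n = 2 * (len(A) - 1)                       # length of the symmetric full spectrum
        cep = np.zeros(n)
        for _ in range(max_iter):
            A = np.maximum(A, V)                   # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
            cep = np.fft.irfft(A, n=n)             # real cepstrum of the log spectrum
            cep[Kcc + 1:n - Kcc] = 0.0             # lifter: keep the first Kcc quefrencies
            V = np.fft.rfft(cep).real              # V_i = FFT(lifter(A_i)), the smoothed envelope
            if np.max(logmag - V) < delta:         # stop once V covers the spectrum within delta
                break
        return cep[:Kcc + 1]                       # cepstral coefficients (CCs) fed to the network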

1336

Parametric Model

I No open source implementation available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I Figure shows a TAE snapshot from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4, 5)

1436

Parametric Model

Figure: TAE spectral envelopes (Magnitude in dB vs Frequency in kHz) for MIDI 60, 65 and 71

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch [Slawson 1981, Caetano and Rodet 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No. of CCs (Cepstral Coefficients, Kcc) depends on sampling rate (Fs) and pitch (f0):

    Kcc ≤ Fs / (2 f0)

I For our network we choose the maximum Kcc (lowest pitch) = 91, and zero pad the higher-pitch CC vectors to 91-dimensional vectors
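I For concreteness, the bound works out as below, assuming Fs = 48 kHz (consistent with the 21.3 ms analysis frames); for MIDI 60 (f0 ≈ 261.6 Hz) it gives the 91 coefficients quoted above, and higher pitches are zero padded to the same length:

    import numpy as np

    def num_ccs(f0, fs=48000):
        # Kcc <= Fs / (2 * f0): lower pitches admit more cepstral coefficients
        return int(fs // (2 * f0))

    def pad_ccs(ccs, target=91):
        out = np.zeros(target)
        out[:len(ccs)] = ccs              # zero pad higher-pitch CC vectors to 91 dimensions
        return out

    print(num_ccs(261.63))                # MIDI 60 -> 91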

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimize the MSE (Mean Squared Error) between input and network reconstruction

• Simple to train and performs good reconstruction (since they minimize MSE)

• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - Inspired by Variational Inference, enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - Same principle as a VAE, however learns the conditional distribution over an additional conditioning variable

1736

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control

1836

Network Architecture

Block diagram: (CCs, f0) → Encoder → Z → Decoder → CCs (CVAE)

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

1936

Network Architecture

I Main hyperparameters -

1 β - Controls relative weighting between reconstruction and prior enforcement:

    L ∝ MSE + β·KLD

Figure: CVAE, varying β - MSE (of order 10⁻⁵) vs latent space dimensionality, for β = 0.01, 0.1, 1

I Tradeoff between both terms; choose β = 0.1
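I A sketch of this weighted objective in PyTorch (variable names are illustrative; the exact reduction and weighting used in our experiments may differ):

    import torch
    import torch.nn.functional as F

    def cvae_loss(x, x_hat, mu, logvar, beta=0.1):
        mse = F.mse_loss(x_hat, x)                                        # reconstruction term
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())    # KL(q(z|x, f0) || N(0, I))
        return mse + beta * kld                                           # L ∝ MSE + β·KLD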

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - network's reconstruction ability

Figure: CVAE (β = 0.1) vs AE - MSE (of order 10⁻⁵) vs latent space dimensionality

I Steep fall initially, flatter later. Choose dimensionality = 32

2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512
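I A PyTorch sketch of a CVAE with the layer sizes above; the way the pitch condition is fed in (here, concatenated as a scalar) and the exact split into mean/variance heads are assumptions.

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        def __init__(self, n_cc=91, latent=32, n_cond=1):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(n_cc + n_cond, 91), nn.LeakyReLU())
            self.mu = nn.Linear(91, latent)
            self.logvar = nn.Linear(91, latent)
            self.dec = nn.Sequential(nn.Linear(latent + n_cond, 91), nn.LeakyReLU(),
                                     nn.Linear(91, n_cc))

        def forward(self, ccs, f0):
            h = self.enc(torch.cat([ccs, f0], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

    model = CVAE()
    optim = torch.optim.Adam(model.parameters(), lr=1e-3)          # ADAM, initial lr = 1e-3
    # training loop: 2000 epochs over train-instance frames, batch size 512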

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches

2336

Reconstruction

I Two training contexts -

1 T (✗) is the target, ✓ marks training instances:

    MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
    Kept:   ✓    ✓    ✓    ✗    ✓    ✓    ✓

2 Octave endpoints:

    MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
    Kept:   ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances

2436

Reconstruction

Figure: MSE (of order 10⁻²) vs target pitch (MIDI 60-72) for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above) (audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Block diagram: (CCs, f0) → cAE → (CCs', f0')

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
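I A sketch of this cAE baseline (hidden sizes mirror the CVAE sketch above and are assumptions): the pitch is simply appended to the input and reconstructed along with the CCs, with no prior on the latent code.

    import torch
    import torch.nn as nn

    class ConditionalAE(nn.Module):
        def __init__(self, n_cc=91, latent=32):
            super().__init__()
            d = n_cc + 1                                     # input is [CCs, f0]
            self.enc = nn.Sequential(nn.Linear(d, 91), nn.LeakyReLU(), nn.Linear(91, latent))
            self.dec = nn.Sequential(nn.Linear(latent, 91), nn.LeakyReLU(), nn.Linear(91, d))

        def forward(self, ccs, f0):
            x = torch.cat([ccs, f0], dim=-1)                 # append the conditioning variable
            return self.dec(self.enc(x))                     # reconstruct [CCs', f0']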

2636

Reconstruction

Figure: MSE (of order 10⁻²) vs target pitch (MIDI 60-72) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Block diagram: (Z, f0) → Decoder → CCs

Figure: Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
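I A sketch of this sampling procedure, reusing the decoder from the CVAE sketch earlier; the step size and frame count are illustrative, and the decoded CC frames would still have to be turned back into spectral envelopes, harmonic amplitudes and finally audio.

    import torch

    def generate_note(model, f0, n_frames=200, latent=32, step=0.05):
        z = torch.zeros(1, latent)                           # start at the prior mean
        cond = torch.full((1, 1), f0)                        # desired (possibly unseen) pitch
        frames = []
        with torch.no_grad():
            for _ in range(n_frames):
                z = z + step * torch.randn_like(z)           # random walk -> coherent latent trajectory
                frames.append(model.dec(torch.cat([z, cond], dim=-1)))
        return torch.cat(frames, dim=0)                      # one CC vector per synthesis frame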

2836

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we can see that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)

2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously

3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2 We have not taken into account dynamics. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research

3136

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

3436

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint trained model)

10 Parametric AE reconstruction of input (endpoint trained model)

11 Parametric CVAE reconstruction of input (endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 12: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

[Diagram: input CCs and f0 → Encoder → latent Z → Decoder, forming the CVAE]

I Network input is CCs → the MSE represents a perceptually relevant distance, i.e. the squared error between the input and reconstructed log magnitude spectral envelopes (see the sketch below)

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames
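As a rough illustration of why squared error on the CCs tracks the distance between log-magnitude envelopes, a small NumPy sketch, assuming the CCs are real cepstral coefficients of the TAE envelope; the function names and the even-cosine reconstruction are illustrative assumptions.

```python
import numpy as np

def envelope_from_ccs(ccs, n_bins=513):
    """Log-magnitude envelope on n_bins frequency points from cepstral
    coefficients c_0..c_{K-1}, using V(w) = c0 + 2 * sum_k c_k * cos(k*w)."""
    ccs = np.asarray(ccs, dtype=float)
    w = np.linspace(0.0, np.pi, n_bins)          # normalized frequency grid
    k = np.arange(1, len(ccs))
    return ccs[0] + 2.0 * np.cos(np.outer(w, k)) @ ccs[1:]

def envelope_mse(ccs_in, ccs_out):
    """Squared-error distance between the input and reconstructed envelopes."""
    v_in, v_out = envelope_from_ccs(ccs_in), envelope_from_ccs(ccs_out)
    return float(np.mean((v_in - v_out) ** 2))
```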

1936

Network Architecture

I Main hyperparameters -

1 β - Controls the relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

[Figure: average MSE (×10⁻⁵) vs. latent space dimensionality for β = 0.01, 0.1 and 1]

Figure: CVAE, varying β

I Tradeoff between both terms; choose β = 0.1 (a loss sketch follows below)
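A minimal sketch of this loss in PyTorch, assuming a Gaussian posterior with encoder outputs (mu, logvar); the variable names are illustrative.

```python
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    # Reconstruction term: MSE between input and reconstructed CCs
    mse = F.mse_loss(x_hat, x, reduction="mean")
    # KL divergence of q(z | x, f0) = N(mu, sigma^2) from the prior N(0, I)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld   # beta trades reconstruction off against the prior
```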

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - controls the network's reconstruction ability

[Figure: average MSE (×10⁻⁵) vs. latent space dimensionality for the CVAE (β = 0.1) and the AE]

Figure: CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later; choose dimensionality = 32

2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512 (a model sketch follows below)
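Putting the stated sizes together, a hedged PyTorch sketch of such a network; exactly how f0 is normalized and injected is an assumption, not specified on the slide.

```python
import torch
import torch.nn as nn

class CVAE(nn.Module):
    """[91, 91, 32, 91, 91] fully connected CVAE sketch with pitch conditioning."""
    def __init__(self, cc_dim=91, cond_dim=1, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, 91), nn.LeakyReLU())
        self.mu = nn.Linear(91, latent_dim)
        self.logvar = nn.Linear(91, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, 91), nn.LeakyReLU(),
            nn.Linear(91, cc_dim))

    def forward(self, ccs, f0):
        # ccs: (batch, 91) cepstral coefficients; f0: (batch, 1) conditioning pitch
        h = self.enc(torch.cat([ccs, f0], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_hat = self.dec(torch.cat([z, f0], dim=-1))
        return x_hat, mu, logvar

model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Assumed training loop: 2000 epochs over frames, batch size 512, minimizing a
# loss of the form cvae_loss(x_hat, ccs, mu, logvar, beta=0.1) as sketched earlier.
```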

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches

2336

Reconstruction

I Two training contexts -

1 T (×) is the target, X marks the training instances:
MIDI: T - 3, T - 2, T - 1, T, T + 1, T + 2, T + 3
Kept:  X,    X,    X,   ×,  X,    X,    X

2 Octave endpoints:
MIDI: 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71
Kept:  X,  ×,  ×,  ×,  ×,  ×,  ×,  ×,  ×,  ×,  ×,  X

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (a sketch of the splits follows below)
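The two training/held-out splits can be written down directly; a small Python sketch of the splits (the MSE is then averaged over every frame of every held-out instance).

```python
def neighbour_context(target_midi):
    """Context 1: train on the three neighbouring pitches on either side,
    hold out the target itself."""
    train = [target_midi + d for d in (-3, -2, -1, 1, 2, 3)]
    return train, [target_midi]

def octave_endpoint_context():
    """Context 2: train only on the octave endpoints (MIDI 60 and 71),
    hold out MIDI 61-70."""
    return [60, 71], list(range(61, 71))
```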

2436

Reconstruction

[Figure: average MSE (×10⁻²) vs. target pitch (MIDI 60–72) for the AE and the CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Diagram: (f0, CCs) → cAE → (f0′, CCs′)]

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0 (a sketch follows below)
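A minimal sketch of how such a cAE batch could be assembled (names and shapes are assumptions), which makes the contrast with the CVAE explicit: the pitch is only another input dimension to be reconstructed, and nothing forces the network to use it.

```python
import torch

def cae_batch(ccs, f0):
    """'Conditional' AE baseline: append the pitch to the input CCs and let a
    plain autoencoder reconstruct the appended vector [CCs, f0] -> [CCs', f0'].
    ccs: (batch, 91), f0: (batch, 1)."""
    x = torch.cat([ccs, f0], dim=-1)   # input to the plain AE
    target = x                         # target is the appended input itself
    return x, target
```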

2636

Reconstruction

[Figure: average MSE (×10⁻²) vs. target pitch (MIDI 60–72) for the AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Diagram: latent sample Z and f0 → Decoder]

Figure: Sampling from the network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (see the sketch below)
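A hedged sketch of such a generation loop, assuming the trained CVAE decoder from the earlier sketch; the step size and the exact walk are illustrative, not the precise recipe of [Blaauw and Bonada 2016].

```python
import torch

@torch.no_grad()
def random_walk_generate(decoder, f0_frames, latent_dim=32, step=0.05):
    """Decode one spectral-envelope (CC) frame per f0 value while the latent
    point drifts slowly, so consecutive frames stay coherent."""
    z = torch.randn(latent_dim)                  # start from the prior
    frames = []
    for f0 in f0_frames:                         # e.g. a vibrato pitch contour
        z = z + step * torch.randn(latent_dim)   # small, correlated random move
        frames.append(decoder(torch.cat([z, f0.view(1)])))
    return torch.stack(frames)
```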

2836

Generation

I We train on instances across the entire octave sans MIDI 65 and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously

3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate next step is to model the residual of the input audio. This will help make the generated audio sound more natural

2 We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research

3136

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D. and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A. et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A. and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S. and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K. and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

3436

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H. and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]

X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research


References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.


References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint trained model)

10 Parametric AE reconstruction of input (endpoint trained model)

11 Parametric CVAE reconstruction of input (endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset


Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 15: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0


1336

Parametric Model

I No open source implementation is available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I Figure shows a TAE snapshot from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4, 5)


1436

Parametric Model

[Plot: TAE spectral envelopes, Magnitude (dB, −80 to −20) vs Frequency (0-5 kHz), for MIDI 60, 65, 71]

I Spectral envelope shape varies across pitch

1 Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No. of CCs (cepstral coefficients, Kcc) depends on the sampling rate (Fs) and pitch (f0):

Kcc ≤ Fs / (2 f0)

I For our network we choose the maximum Kcc (lowest pitch) = 91 and zero-pad the higher-pitch CC vectors to 91 dimensions (see the sketch below)
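A small sketch of this bookkeeping, assuming Fs = 48 kHz (which gives Kcc = 91 for MIDI 60) and a standard MIDI-to-Hz conversion:

```python
import numpy as np

def midi_to_hz(m):
    return 440.0 * 2 ** ((m - 69) / 12)

fs = 48000                                   # assumed sampling rate of the recordings
kcc_max = int(fs / (2 * midi_to_hz(60)))     # Kcc <= Fs / (2*f0)  ->  91 for the lowest pitch

# e.g. a dummy CC vector for the highest pitch (MIDI 71), which is shorter ...
cc_high = np.random.randn(int(fs / (2 * midi_to_hz(71))))
# ... zero-padded to the fixed 91-dimensional network input
cc_padded = np.pad(cc_high, (0, kcc_max - len(cc_high)))
```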

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimize the MSE (Mean Squared Error) between input and network reconstruction

• Simple to train and performs good reconstruction (since they minimize MSE)

• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - Inspired by Variational Inference; enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - Same principle as a VAE, however learns the conditional distribution over an additional conditioning variable


1736

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control


1836

Network Architecture

[Block diagram: CCs, f0 → Encoder → Z → Decoder (CVAE)]

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

1936

Network Architecture

I Main hyperparameters -

1 β - Controls the relative weighting between reconstruction and prior enforcement (a loss sketch follows below):

L ∝ MSE + β · KLD

[Plot: MSE vs latent space dimensionality (0-70) for β = 0.01, 0.1, 1]

Figure: CVAE, varying β

I Tradeoff between both terms; choose β = 0.1
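A sketch of the loss above in PyTorch, assuming a Gaussian encoder q(z | x, f0) that outputs mu and logvar and a standard normal prior:

```python
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    """L ∝ MSE + β·KLD, as on the slide.
    x, x_hat   : input and reconstructed CC vectors
    mu, logvar : encoder outputs parameterizing q(z | x, f0)"""
    mse = F.mse_loss(x_hat, x, reduction="mean")
    # KL divergence between N(mu, sigma^2) and the standard normal prior
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld
```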


2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - the network's reconstruction ability

[Plot: MSE vs latent space dimensionality (0-70) for CVAE and AE]

Figure: CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later. Choose dimensionality = 32


2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs; 32 is the latent space dimensionality

I Linear fully connected layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512 (a sketch of this setup follows below)
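A sketch of this setup in PyTorch; the exact placement of the f0 conditioning (here simply concatenated to the encoder and decoder inputs) and the split of the layer sizes between encoder and decoder are our assumptions:

```python
import torch
import torch.nn as nn

class CVAE(nn.Module):
    """Sketch of the [91, 91, 32, 91, 91] CVAE with pitch conditioning."""
    def __init__(self, n_cc=91, n_hidden=91, n_latent=32, n_cond=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_cc + n_cond, n_hidden), nn.LeakyReLU())
        self.mu = nn.Linear(n_hidden, n_latent)
        self.logvar = nn.Linear(n_hidden, n_latent)
        self.dec = nn.Sequential(
            nn.Linear(n_latent + n_cond, n_hidden), nn.LeakyReLU(),
            nn.Linear(n_hidden, n_cc))

    def forward(self, cc, f0):
        h = self.enc(torch.cat([cc, f0], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

model = CVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # ADAM, initial lr = 1e-3
# training loop omitted: 2000 epochs, batch size 512, cvae_loss from the sketch above
```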

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


2336

Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances
  MIDI:  T−3  T−2  T−1   T   T+1  T+2  T+3
  Kept:   X    X    X    ×    X    X    X

2 Octave endpoints
  MIDI:  60  61  62  63  64  65
  Kept:   X   ×   ×   ×   ×   ×
  MIDI:  66  67  68  69  70  71
  Kept:   ×   ×   ×   ×   ×   X

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (a split sketch follows below)
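The two training contexts written out as note sets; a minimal sketch with names of our own choosing:

```python
target = 63
octave = set(range(60, 72))                                    # MIDI 60-71

# context 1: train on the 3 neighbours either side of the target, hold out the target
ctx1_train = {target + d for d in (-3, -2, -1, 1, 2, 3)}
ctx1_test = {target}

# context 2: train only on the octave endpoints, evaluate on the interior notes
ctx2_train = {60, 71}
ctx2_test = octave - ctx2_train
```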


2436

Reconstruction

[Plot: reconstruction MSE (·10^-2) vs target pitch (MIDI 60-72) for AE and CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Block diagram: f0, CCs → cAE → f0′, CCs′]

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0 (a baseline sketch follows below)
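A sketch of this baseline; the layer sizes are assumed to mirror the CVAE, and in practice the pitch value would be normalized before being appended:

```python
import torch
import torch.nn as nn

# 'conditional' AE: f0 is appended to the 91 CCs and the whole 92-dim vector is reconstructed
cae = nn.Sequential(
    nn.Linear(92, 91), nn.LeakyReLU(),      # 91 CCs + 1 pitch value in
    nn.Linear(91, 32), nn.LeakyReLU(),      # 32-dim bottleneck
    nn.Linear(32, 91), nn.LeakyReLU(),
    nn.Linear(91, 92))                      # reconstruct (CCs', f0')

x = torch.cat([torch.randn(8, 91), torch.rand(8, 1)], dim=-1)   # dummy batch of (CCs, f0)
loss = nn.functional.mse_loss(cae(x), x)
```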

2636

Reconstruction

[Plot: reconstruction MSE (·10^-2) vs pitch (MIDI 60-72) for AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Block diagram: Z, f0 → Decoder]

Figure: Sampling from the network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (sketched below)
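A sketch of such a random walk, with a stand-in decoder of the same sizes as the CVAE sketch earlier; the step size, frame count and raw-Hz conditioning value are illustrative assumptions (a trained decoder and a normalized f0 would be used in practice):

```python
import torch
import torch.nn as nn

n_frames, n_latent, step = 400, 32, 0.05
decoder = nn.Sequential(nn.Linear(n_latent + 1, 91), nn.LeakyReLU(), nn.Linear(91, 91))

# coherent random walk: small Gaussian steps so consecutive frames decode smoothly
z = torch.zeros(n_frames, n_latent)
for t in range(1, n_frames):
    z[t] = z[t - 1] + step * torch.randn(n_latent)

f0 = torch.full((n_frames, 1), 349.23)                    # target pitch in Hz (MIDI 65)
with torch.no_grad():
    cc_frames = decoder(torch.cat([z, f0], dim=-1))       # one envelope (CC) vector per frame
```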

2836

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further

1 An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2 We have not taken into account dynamics. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


3136

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

3436

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 16: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

436

Our Nearest Neighbours

I [Sarroff and Casey 2014] first to use autoencoders to performframe-wise reconstruction of short-time magnitude spectra

X Learn perceptually relevant lower dimensionalrepresentations(lsquolatent spacersquo) of audio to help in synthesis

times lsquoGraininessrsquo in the reconstructed sound

I [Roche et al 2018] extended this analysis to try out differentautoencoder architectures

X Helped by the release of NSynth [Engel et al 2017]X Usability of the latent space in audio interpolation

I [Esling et al 2018] regularized the VAE latent space in orderto effect control over perceptual timbre of synthesizedinstruments

X Enabled them to lsquogeneratersquo and lsquointerpolatersquo audio in this space

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research


References

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

Audio examples description

1 Input MIDI 63 to Spectral Model
2 Spectral Model reconstruction (trained on MIDI 63)
3 Spectral Model reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to Parametric Model
5 Parametric reconstruction of input note
6 Input MIDI 63 note
7 Parametric AE reconstruction of input
8 Parametric CVAE reconstruction of input
9 Input MIDI 65 note (endpoint-trained model)
10 Parametric AE reconstruction of input (endpoint-trained model)
11 Parametric CVAE reconstruction of input (endpoint-trained model)
12 CVAE-generated MIDI 65 violin note
13 Similar MIDI 65 violin note from dataset
14 CVAE reconstruction of the MIDI 65 violin note
15 CVAE-generated MIDI 65 violin note with vibrato

Page 18: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative sounds

times Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

[Diagram: (f0, harmonic magnitudes) → TAE → (CCs, f0)]


Parametric Model

I No open-source implementation is available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I Figure shows a TAE snapshot from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4 and 5)


Parametric Model

[Plot: TAE magnitude (dB) vs frequency (kHz) for MIDI 60, 65 and 71]

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch [Slawson 1981, Caetano and Rodet 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes


Parametric Model

I Number of CCs (Cepstral Coefficients, Kcc) depends on the sampling rate (Fs) and pitch (f0)

Kcc ≤ Fs / (2 · f0)

I For our network we choose the maximum Kcc (lowest pitch) = 91, and zero-pad the higher-pitch CCs to 91-dimensional vectors
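For concreteness, a small sketch of this bound and the zero-padding, assuming Fs = 48 kHz (which yields Kcc = 91 at MIDI 60; the actual sampling rate is not stated on this slide):

```python
import numpy as np

fs = 48000                          # sampling rate (assumption)
f0_low, f0_high = 261.63, 493.88    # MIDI 60 and MIDI 71 in Hz

k_max = int(fs / (2 * f0_low))      # Kcc <= Fs / (2*f0) -> 91 at the lowest pitch
k_high = int(fs / (2 * f0_high))    # only 48 coefficients allowed at MIDI 71

ccs = np.random.randn(k_high)                   # stand-in for a TAE cepstral frame
ccs_padded = np.pad(ccs, (0, k_max - k_high))   # zero-pad to the fixed 91-dim input
```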


Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (Mean Squared Error) between input and network reconstruction

• Simple to train and performs good reconstruction (since they minimize MSE)

• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - inspired by Variational Inference, enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE, however it learns the conditional distribution over an additional conditioning variable


Generative Models

Why VAE over AE

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch ⇒ network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control


Network Architecture

[Diagram: (CCs, f0) → Encoder → Z → Decoder (conditioned on f0) → CCs (CVAE)]

I Network input is CCs → MSE represents a perceptually relevant distance in terms of squared error between the input and reconstructed log-magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames


Network Architecture

I Main hyperparameters -

1 β - controls the relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

[Plot: MSE (·10^-5) vs latent space dimensionality, for β = 0.01, 0.1 and 1]

Figure: CVAE, varying β

I Tradeoff between both terms; we choose β = 0.1
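A minimal sketch of the β-weighted objective L ∝ MSE + β·KLD, assuming a Gaussian posterior with parameters (mu, log_var) and a standard-normal prior; written in PyTorch for illustration:

```python
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, log_var, beta=0.1):
    """Reconstruction MSE plus beta-weighted KL divergence between the
    approximate posterior N(mu, sigma^2) and the N(0, I) prior."""
    mse = F.mse_loss(x_hat, x, reduction="mean")
    kld = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return mse + beta * kld
```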


Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - determines the network's reconstruction ability

[Plot: MSE (·10^-5) vs latent space dimensionality, CVAE vs AE]

Figure: CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later; we choose dimensionality = 32


Network Architecture

I Network size: [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512
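A sketch of such a network in PyTorch: the layer sizes follow the [91, 91, 32, 91, 91] description above, while the exact conditioning scheme (f0 concatenated to both the encoder input and the latent code) is an assumption:

```python
import torch
import torch.nn as nn

class CVAE(nn.Module):
    """Sketch of a [91, 91, 32, 91, 91] CVAE with pitch conditioning."""
    def __init__(self, cc_dim=91, latent_dim=32, cond_dim=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, 91), nn.LeakyReLU())
        self.mu = nn.Linear(91, latent_dim)
        self.log_var = nn.Linear(91, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, 91), nn.LeakyReLU(),
            nn.Linear(91, cc_dim),
        )

    def forward(self, ccs, f0):
        h = self.enc(torch.cat([ccs, f0], dim=-1))
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)   # reparameterization
        x_hat = self.dec(torch.cat([z, f0], dim=-1))
        return x_hat, mu, log_var

model = CVAE()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)   # ADAM, initial lr = 10^-3
```

Training would then loop over frame batches (batch size 512, 2000 epochs) with an objective of the form MSE + β·KLD as sketched earlier.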


Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - how well the model 'synthesizes' note instances with new, unseen pitches


Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances:
   MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:   X    X    X    ×    X    X    X

2 Octave endpoints:
   MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
   Kept:   X   ×   ×   ×   ×   ×   ×   ×   ×   ×   ×   X

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances


Reconstruction

[Plot: MSE (·10^-2) vs target pitch (MIDI 60-72), AE vs CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6-8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9-11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Diagram: (CCs, f0) → cAE → (CCs', f0')]

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0


Reconstruction

[Plot: MSE (·10^-2) vs target pitch (MIDI 60-72), AE vs CVAE vs cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE


Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Diagram: Z → Decoder (conditioned on f0)]

Figure: Sampling from the network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
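A sketch of this sampling idea, reusing the CVAE sketch from earlier; the step size, frame count and conditioning format are illustrative assumptions rather than the settings of [Blaauw and Bonada 2016]:

```python
import torch

@torch.no_grad()
def generate_frames(model, f0_value, n_frames=200, step=0.05, latent_dim=32):
    """Sample a coherent latent trajectory with a small-step random walk
    and decode each point conditioned on the desired pitch."""
    z = torch.randn(latent_dim)
    f0 = torch.tensor([float(f0_value)])
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn(latent_dim)     # random-walk step in latent space
        ccs = model.dec(torch.cat([z, f0]))        # decode, conditioned on f0
        frames.append(ccs)
    return torch.stack(frames)                     # frame-wise envelope CCs
```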


Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12-14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)
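To illustrate how such a vibrato note can be rendered from envelope-derived harmonic amplitudes, here is a small additive-synthesis sketch (harmonics of a slowly modulated f0, MIDI 65 ≈ 349 Hz); all parameter values are assumptions, not the settings used for the example above:

```python
import numpy as np

def synth_vibrato(harm_amps_db, f0=349.23, fs=48000, dur=2.0,
                  vib_rate=5.5, vib_depth=0.01):
    """Additive synthesis of a note with vibrato: harmonics of a slowly
    modulated f0, weighted by (fixed) envelope-derived amplitudes in dB."""
    t = np.arange(int(dur * fs)) / fs
    f0_t = f0 * (1 + vib_depth * np.sin(2 * np.pi * vib_rate * t))   # vibrato f0 contour
    phase = 2 * np.pi * np.cumsum(f0_t) / fs                         # instantaneous phase
    amps = 10 ** (np.asarray(harm_amps_db) / 20)
    y = sum(a * np.sin(k * phase) for k, a in enumerate(amps, start=1))
    return y / np.max(np.abs(y))

# Example with a handful of made-up harmonic amplitudes (dB)
y = synth_vibrato([-20, -26, -30, -35, -40])
```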


Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2 We have not taken loudness dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source for the MIR community for research


References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.


References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1 Input MIDI 63 to Spectral Model
2 Spectral Model reconstruction (trained on MIDI 63)
3 Spectral Model reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to Parametric Model
5 Parametric reconstruction of input note
6 Input MIDI 63 note
7 Parametric AE reconstruction of input
8 Parametric CVAE reconstruction of input
9 Input MIDI 65 note (endpoint-trained model)
10 Parametric AE reconstruction of input (endpoint-trained model)
11 Parametric CVAE reconstruction of input (endpoint-trained model)
12 CVAE generated MIDI 65 violin note
13 Similar MIDI 65 violin note from dataset


Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato



3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 20: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wise analysis-synthesis based reconstruction, which fails to model temporal evolution

I [Engel et al 2017], inspired by Wavenet's [Oord et al 2016] autoregressive modeling capabilities for speech, extended it to musical instrument synthesis

X Generation of realistic and creative sounds
× Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio, albeit by conditioning the waveform samples on additional parameters like pitch, velocity (loudness) and instrument class

X Was able to achieve better control over generation by conditioning

× Unable to generalize to untrained pitches


Why Parametric?

I Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesis of a given instrument sound with flexible control over the pitch and loudness dynamics

I Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) being kept constant [Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform or spectrogram has the potential to achieve high quality with less training data

1 [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way



Dataset

I Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

I We work with the 'Violin' played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

Why we chose Violin: popular in Indian Music, human voice-like timbre, ability to produce continuous pitch

Figure: Instances per note in the overall dataset (x-axis: instances, y-axis: MIDI note 60-71; roughly 26-34 instances per note)

I Average note duration for the chosen octave is about 4.5 s per note


Dataset

I We split the data into train (80%) and test (20%) instances across MIDI note labels

I Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances
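
As a rough illustration of this preparation, a minimal sketch is given below; the 48 kHz sampling rate, hop size and random seed are assumptions for illustration, not values taken from the slides.

    import numpy as np

    def split_instances(instances, train_frac=0.8, seed=0):
        # 80/20 split of the note instances for one MIDI label
        idx = np.random.default_rng(seed).permutation(len(instances))
        k = int(train_frac * len(instances))
        return [instances[i] for i in idx[:k]], [instances[i] for i in idx[k:]]

    def frame_signal(y, fs=48000, frame_ms=21.3, hop_ms=10.0):
        # cut a note recording into ~21.3 ms frames for training/evaluation
        frame, hop = int(fs * frame_ms / 1000), int(fs * hop_ms / 1000)
        n = 1 + max(0, (len(y) - frame) // hop)
        return np.stack([y[i * hop:i * hop + frame] for i in range(n)])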


Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

[Block diagram: Sustain portion → FFT → Encoder → Z → Decoder (AE)]

I Repeat above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984]
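
As a rough sketch of this inversion step, using librosa's Griffin-Lim (the FFT size, hop and iteration count here are assumptions for illustration):

    import numpy as np
    import librosa

    def invert_magnitudes(mag_frames, n_fft=1024, hop=256):
        # mag_frames: (n_frames, 1 + n_fft // 2) magnitude spectra from the autoencoder
        S = np.asarray(mag_frames).T          # librosa expects (freq_bins, n_frames)
        return librosa.griffinlim(S, n_iter=60, hop_length=hop, win_length=n_fft)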


Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)

Figure: Including MIDI 63 in training (audio example 2)   Figure: Excluding MIDI 63 from training (audio example 3)


Parametric Model

1 Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

[Block diagram: Sustain portion → FFT → HpR → f0, harmonic Magnitudes]

I Output of HpR block ⇒ log-dB magnitudes + harmonics
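
A minimal sketch of the harmonic part of this analysis (not the actual HpR implementation of [Serra et al 1997]; it assumes the frame's f0 is already known and simply reads the magnitude spectrum at the harmonic frequencies, with an illustrative 48 kHz sampling rate):

    import numpy as np

    def harmonic_log_magnitudes(frame, f0, fs=48000, n_fft=2048):
        # windowed magnitude spectrum of one frame
        spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft))
        harmonics = np.arange(1, int((fs / 2) // f0) + 1) * f0   # f0, 2*f0, ... below Nyquist
        bins = np.round(harmonics * n_fft / fs).astype(int)      # nearest-bin lookup
        return harmonics, 20 * np.log10(spec[bins] + 1e-12)      # log-dB magnitudes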


Parametric Model

2 log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

I TAE ⇒ Iterative Cepstral Liftering (a Python sketch follows below):

A_0(k) = log(|X(k)|), V_0 = −∞
while (A_i − A_0 < ∆):
    A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
    V_i = FFT(lifter(A_i))

[Block diagram: f0, Magnitudes → TAE → CCs, f0]
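
A condensed sketch of the liftering loop is shown below; the stopping test (envelope within delta of the updated spectrum) and the default threshold are our assumptions about a reasonable convergence criterion, not a verbatim reproduction of [Roebel and Rodet 2005].

    import numpy as np

    def true_amplitude_envelope(log_mag, kcc, delta=0.1, max_iter=100):
        # log_mag: log magnitude spectrum of one frame on the full FFT grid (length N)
        A = np.asarray(log_mag, dtype=float)          # A_0(k) = log|X(k)|
        V = np.full_like(A, -np.inf)                  # V_0 = -inf
        for _ in range(max_iter):
            A = np.maximum(A, V)                      # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
            c = np.fft.ifft(A)                        # cepstrum of the updated log spectrum
            c[kcc + 1:len(A) - kcc] = 0               # lifter: keep only the first kcc coefficients
            V = np.real(np.fft.fft(c))                # smoothed envelope V_i
            if np.max(A - V) < delta:                 # envelope now covers the spectral peaks
                break
        return V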


Parametric Model

I No open source implementation available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4 and 5)


Parametric Model

[Figure: TAE spectral envelopes, Magnitude (dB, −80 to −20) vs Frequency (kHz, 0 to 5), for MIDI 60, 65 and 71]

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch [Slawson 1981, Caetano and Rodet 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes


Parametric Model

I No. of CCs (Cepstral Coefficients, Kcc) depends on the sampling rate (Fs) and pitch (f0):

Kcc ≤ Fs / (2 · f0)

I For our network we choose the maximum Kcc (lowest pitch) = 91 and zero-pad the higher-pitch CC vectors to 91 dimensions
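
For example (a tiny helper consistent with the bound above; the 48 kHz sampling rate is assumed only for illustration):

    def num_ccs(f0, fs=48000.0):
        # K_cc <= Fs / (2 * f0); the lowest pitch gives the largest K_cc
        return int(fs / (2.0 * f0))

    def pad_ccs(ccs, target_dim=91):
        # zero-pad higher-pitch CC vectors up to the common 91 dimensions
        return list(ccs) + [0.0] * (target_dim - len(ccs))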


Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimize the MSE (Mean Squared Error) between input and network reconstruction

• Simple to train and performs good reconstruction (since they minimize MSE)
• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - Inspired from Variational Inference, enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - Same principle as a VAE, however learns the conditional distribution over an additional conditioning variable


Generative Models

Why VAE over AE?
Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?
Conditioning on pitch ⇒ Network captures dependencies between the timbre and the pitch ⇒ More accurate envelope generation + Pitch control


Network Architecture

[Block diagram: CCs, f0 → Encoder → Z → Decoder → reconstructed CCs (CVAE)]

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames


Network Architecture

I Main hyperparameters -

1 β - Controls relative weighting between reconstruction and prior enforcement (sketched in code below):

L ∝ MSE + β · KLD

Figure: CVAE, varying β (MSE vs latent space dimensionality, for β = 0.01, 0.1 and 1)

I Tradeoff between both terms; choose β = 0.1
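
In code, this loss could look like the sketch below (PyTorch-style, assuming the encoder outputs a mean mu and log-variance logvar for a Gaussian posterior; the exact scaling conventions may differ from our actual implementation):

    import torch
    import torch.nn.functional as F

    def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
        mse = F.mse_loss(x_hat, x)                                       # reconstruction term
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())   # KL to the unit Gaussian prior
        return mse + beta * kld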


Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - network's reconstruction ability

Figure: CVAE (β = 0.1) vs AE (MSE vs latent space dimensionality)

I Steep fall initially, flatter later. Choose dimensionality = 32


Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512
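
A sketch of what such a network could look like in PyTorch; the way f0 is normalized and concatenated to the encoder input and to the latent code is our assumption, while the layer sizes, activations and optimizer follow the slide:

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        def __init__(self, n_ccs=91, hidden=91, latent=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(n_ccs + 1, hidden), nn.LeakyReLU())
            self.mu = nn.Linear(hidden, latent)
            self.logvar = nn.Linear(hidden, latent)
            self.dec = nn.Sequential(nn.Linear(latent + 1, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, n_ccs))

        def forward(self, ccs, f0):
            h = self.enc(torch.cat([ccs, f0], dim=1))                 # condition the encoder on pitch
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
            return self.dec(torch.cat([z, f0], dim=1)), mu, logvar    # condition the decoder on pitch

    model = CVAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # train for 2000 epochs with batch size 512, minimizing MSE + 0.1 * KLD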


Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


Reconstruction

I Two training contexts:

1 T (×) is the target, X marks training instances:
   MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:   X    X    X    ×    X    X    X

2 Octave endpoints (sketched in code below):
   MIDI:  60  61  62  63  64  65
   Kept:   X   ×   ×   ×   ×   ×
   MIDI:  66  67  68  69  70  71
   Kept:   ×   ×   ×   ×   ×   X

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
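
Under one reading of the tables above, the two held-out training configurations can be written down directly (plain Python, with target MIDI 63 in the first context; a hypothetical helper for illustration):

    def training_midis(context, target=63, octave=range(60, 72)):
        if context == 'neighbours':      # keep the 3 neighbours on either side, exclude the target
            return [m for m in range(target - 3, target + 4) if m != target]
        if context == 'endpoints':       # keep only the octave endpoints, MIDI 60 and 71
            return [min(octave), max(octave)]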


Reconstruction

[Figure: MSE vs target Pitch (MIDI 60-72) for AE and CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above) (audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input as shown below

[Block diagram: (f0, CCs) → cAE → (f0', CCs')]

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
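
The difference in how pitch enters the two models, as a small sketch (hypothetical tensors; 'model' refers to the CVAE sketched earlier):

    import torch

    ccs = torch.randn(512, 91)      # a batch of cepstral-coefficient frames
    f0 = torch.rand(512, 1)         # normalized pitch values

    # cAE: pitch is simply appended and reconstructed along with the CCs
    cae_in = torch.cat([f0, ccs], dim=1)     # 92-dim input, 92-dim reconstruction target

    # CVAE: pitch conditions encoder and decoder; only the 91-dim CCs are reconstructed
    # x_hat, mu, logvar = model(ccs, f0)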


Reconstruction

[Figure: MSE vs target Pitch (MIDI 60-72) for AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE


Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Block diagram: Z, f0 → Decoder]

Figure: Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
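
A sketch of this generation step (a small-step Gaussian random walk in the latent space, decoded with the desired pitch; the step size and number of frames are illustrative assumptions):

    import torch

    def generate_envelopes(decoder, f0_norm, n_frames=200, latent_dim=32, step=0.05):
        z = torch.zeros(1, latent_dim)                   # start at the prior mean
        f0 = torch.full((1, 1), float(f0_norm))
        frames = []
        for _ in range(n_frames):
            z = z + step * torch.randn_like(z)           # coherent random walk in latent space
            with torch.no_grad():
                frames.append(decoder(torch.cat([z, f0], dim=1)))   # decode conditioned on pitch
        return torch.cat(frames, dim=0)                  # (n_frames, 91) envelope CCs per frame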


Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2 We have not taken into account dynamics. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with wavenet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an rnn for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

Audio examples description I

1 Input MIDI 63 to Spectral Model
2 Spectral Model Reconstruction (trained on MIDI 63)
3 Spectral Model Reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to Parametric Model
5 Parametric Reconstruction of input note
6 Input MIDI 63 Note
7 Parametric AE reconstruction of input
8 Parametric CVAE reconstruction of input
9 Input MIDI 65 note (endpoint-trained model)
10 Parametric AE reconstruction of input (endpoint-trained model)
11 Parametric CVAE reconstruction of input (endpoint-trained model)
12 CVAE Generated MIDI 65 Violin note
13 Similar MIDI 65 Violin note from dataset

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note
15 CVAE Generated MIDI 65 Violin note with vibrato

Page 21: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

636

Why Parametric

I Rather than generating new timbres(lsquointerpolatingrsquo acrosssounds) we consider the problem of synthesis of a giveninstrument sound with flexible control over the pitch andloudness dynamics

I Pitch shifting without timbre modification uses a source-filtermodel with the filter(spectral envelope) being kept constant[Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform orspectrogram has the potential to achieve high quality with lesstraining data

1 [Blaauw and Bonada 2016] recognized this in context ofspeech synthesis and used a vocoder representation to train agenerative model achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

17/36

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio).

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between timbre and pitch ⇒ more accurate envelope generation + pitch control.
18/36

Network Architecture

[Figure: CVAE block diagram - CCs with f0 conditioning → Encoder → Z → Decoder (conditioned on f0) → reconstructed CCs]

▶ The network input is CCs → the MSE represents a perceptually relevant distance: the squared error between the input and reconstructed log-magnitude spectral envelopes

▶ We train the network on frames from the train instances

▶ For evaluation, the MSE values reported ahead are the average reconstruction error across all the test-instance frames
19/36

Network Architecture

▶ Main hyperparameters:

1. β - controls the relative weighting between the reconstruction and prior-enforcement terms (a loss sketch follows this slide):

L ∝ MSE + β · KLD

[Figure: CVAE reconstruction MSE vs. latent space dimensionality for β = 0.01, 0.1 and 1 (MSE on the order of 10⁻⁵)]

▶ There is a tradeoff between both terms; we choose β = 0.1

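To make the objective concrete, here is a minimal sketch of the β-weighted loss above, assuming a diagonal-Gaussian posterior and a standard-normal prior; the function name cvae_loss and the averaging convention are our assumptions, not the released code:

import torch
import torch.nn.functional as F

def cvae_loss(recon_cc, target_cc, mu, logvar, beta=0.1):
    # Reconstruction term: MSE between input and reconstructed CC (envelope) frames
    mse = F.mse_loss(recon_cc, target_cc, reduction='mean')
    # KL divergence between the approximate posterior N(mu, sigma^2) and the prior N(0, I)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld      # L ∝ MSE + β · KLD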
20/36

Network Architecture

▶ Main hyperparameters:

2. Dimensionality of the latent space - governs the network's reconstruction ability

[Figure: reconstruction MSE vs. latent space dimensionality for the CVAE (β = 0.1) and the AE (MSE on the order of 10⁻⁵)]

▶ Steep fall initially, flatter later; we choose dimensionality = 32

21/36

Network Architecture

▶ Network size: [91, 91, 32, 91, 91]

▶ 91 is the dimension of the input CCs; 32 is the latent space dimensionality

▶ Linear fully connected layers + Leaky ReLU activations

▶ ADAM [Kingma and Ba, 2014] with initial lr = 10⁻³

▶ Training for 2000 epochs with batch size 512 (a PyTorch sketch of this setup follows below)

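A minimal PyTorch sketch of a CVAE with these layer sizes, assuming the scalar pitch conditioning is simply concatenated to both the encoder and decoder inputs; class and variable names are illustrative, not the released implementation:

import torch
import torch.nn as nn

class CVAE(nn.Module):
    """[91, 91, 32, 91, 91] CVAE over cepstral coefficients (CCs), conditioned on f0."""
    def __init__(self, cc_dim=91, hidden_dim=91, latent_dim=32, cond_dim=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, hidden_dim), nn.LeakyReLU())
        self.enc_mu = nn.Linear(hidden_dim, latent_dim)
        self.enc_logvar = nn.Linear(hidden_dim, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden_dim), nn.LeakyReLU(),
            nn.Linear(hidden_dim, cc_dim))

    def encode(self, cc, f0):
        h = self.enc(torch.cat([cc, f0], dim=-1))
        return self.enc_mu(h), self.enc_logvar(h)

    def reparameterize(self, mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def decode(self, z, f0):
        return self.dec(torch.cat([z, f0], dim=-1))

    def forward(self, cc, f0):
        mu, logvar = self.encode(cc, f0)
        z = self.reparameterize(mu, logvar)
        return self.decode(z, f0), mu, logvar

model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # trained ~2000 epochs, batch size 512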
22/36

Experiments

▶ Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches

23/36

Reconstruction

▶ Two training contexts:

1. T (✗) is the target, ✓ marks the training instances:

   MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:   ✓    ✓    ✓    ✗    ✓    ✓    ✓

2. Octave endpoints:

   MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
   Kept:   ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

▶ In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (a sketch of this evaluation follows below)

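A rough sketch of that evaluation in NumPy, assuming the held-out target instances are available as per-frame CC arrays and the trained model exposes a frame-wise reconstruction function (all names here are illustrative):

import numpy as np

def average_envelope_mse(reconstruct_frame, target_instances, target_f0):
    """Average frame-wise reconstruction MSE over all frames of all held-out target instances.

    reconstruct_frame: callable (cc_frame, f0) -> reconstructed cc_frame
    target_instances:  list of (num_frames, 91) arrays of CCs, one per target-note instance
    target_f0:         pitch of the omitted target note
    """
    errors = []
    for frames in target_instances:
        for cc in frames:
            cc_hat = reconstruct_frame(cc, target_f0)
            errors.append(np.mean((cc - cc_hat) ** 2))   # squared error for this frame
    return float(np.mean(errors))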
24/36

Reconstruction

[Figure: reconstruction MSE vs. target pitch (MIDI 60-72) for the AE and the CVAE (MSE on the order of 10⁻²)]

▶ Both the AE and the CVAE reconstruct the target pitch reasonably well (audio examples 6-8)

▶ The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9-11)

▶ Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

▶ To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below (a small sketch follows this slide)

[Figure: 'Conditional' AE (cAE) - [f0, CCs] → cAE → [f0′, CCs′]]

▶ [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model

▶ The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0

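A small sketch of how such a cAE input frame could be assembled and scored, assuming the pitch is normalized and prepended to the 91-dimensional CC vector; the scaling and the layout of the appended value are our assumptions:

import numpy as np

def make_cae_frame(cc, f0_hz, f0_max=1000.0):
    """Prepend a (normalized) pitch value to the CC frame; the cAE reconstructs both."""
    return np.concatenate(([f0_hz / f0_max], cc))        # shape: (1 + 91,)

def cae_frame_error(cae_reconstruct, cc, f0_hz):
    x = make_cae_frame(cc, f0_hz)
    x_hat = cae_reconstruct(x)                           # cAE maps [f0, CCs] -> [f0', CCs']
    return np.mean((x[1:] - x_hat[1:]) ** 2)             # score only the CC (envelope) part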
26/36

Reconstruction

[Figure: reconstruction MSE vs. target pitch (MIDI 60-72) for the AE, the CVAE and the cAE (MSE on the order of 10⁻²)]

▶ We train the cAE on the octave endpoints and evaluate its performance as shown above

▶ Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

▶ We are interested in 'synthesizing' audio

▶ We examine how well the network can generate an instance of a desired pitch (one the network has not been trained on)

[Figure: sampling from the network - Z → Decoder, conditioned on f0]

▶ We follow the procedure in [Blaauw and Bonada, 2016] - a random walk to sample points coherently from the latent space (a sketch follows below)

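A minimal sketch of that sampling idea, reusing the decoder of the CVAE sketch above, assuming a small-step Gaussian random walk in the latent space and a fixed conditioning pitch; step size and frame count are illustrative:

import torch

@torch.no_grad()
def generate_cc_frames(model, f0_value, num_frames=200, step=0.05, latent_dim=32):
    """Sample CC frames via a latent-space random walk, decoded with a fixed pitch."""
    f0 = torch.tensor([[float(f0_value)]])
    z = torch.randn(1, latent_dim)                 # start from the prior
    frames = []
    for _ in range(num_frames):
        z = z + step * torch.randn_like(z)         # small step -> coherent evolution
        frames.append(model.decode(z, f0))         # decoder conditioned on the pitch
    return torch.cat(frames, dim=0)                # (num_frames, 91) envelope CC frames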
28/36

Generation

▶ We train on instances across the entire octave sans MIDI 65 and then generate MIDI 65

▶ On listening, the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12-14)

▶ For a more realistic synthesis, we generate a violin note with vibrato (audio example 15; a conditioning sketch follows below)

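For the vibrato case only the conditioning changes from frame to frame; a hedged sketch, assuming a sinusoidal pitch deviation around MIDI 65 and a hypothetical frame rate and vibrato depth:

import numpy as np
import torch

@torch.no_grad()
def generate_vibrato_frames(model, midi=65, depth_cents=30.0, rate_hz=6.0,
                            frame_rate=100.0, num_frames=400, latent_dim=32):
    f0_center = 440.0 * 2 ** ((midi - 69) / 12)                   # MIDI 65 ≈ 349.2 Hz
    t = np.arange(num_frames) / frame_rate
    f0_track = f0_center * 2 ** (depth_cents / 1200.0 * np.sin(2 * np.pi * rate_hz * t))
    z = torch.randn(1, latent_dim)
    frames = []
    for f0 in f0_track:
        z = z + 0.05 * torch.randn_like(z)                         # same random walk as above
        frames.append(model.decode(z, torch.tensor([[float(f0)]])))
    return torch.cat(frames, dim=0), f0_track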

29/36

Putting it all together

▶ We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

▶ Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

▶ Pitch conditioning allows us to generate the learnt spectral envelope for a given pitch, thus enabling us to vary the pitch contour continuously

30/36

Putting it all together

▶ Our model is still far from perfect, and we have to improve it further:

1. An immediate step is to model the residual of the input audio; this will help make the generated audio sound more natural

2. We have not taken dynamics into account; a more complete system would capture the spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3. Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

▶ Our contributions:

1. To the best of our knowledge, we have not come across any other work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research (a sketch of the iterative procedure appears below)

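For reference, a minimal NumPy sketch of the iterative cepstral-liftering ('true amplitude envelope') idea from [Roebel and Rodet, 2005] and [IMAI, 1979], assuming a one-sided log-magnitude spectrum as input; this is an illustration of the procedure, not our released code:

import numpy as np

def true_amplitude_envelope(log_mag, n_cc, delta=0.1, max_iter=200):
    """Iteratively grow a cepstrally smoothed envelope until it covers the spectrum.

    log_mag : one-sided log-magnitude spectrum of one frame (length nfft//2 + 1)
    n_cc    : number of cepstral coefficients kept by the lifter (assumed > 1)
    Returns the envelope and the liftered cepstral coefficients.
    """
    A = log_mag.copy()
    V = np.full_like(A, -np.inf)                   # V0 = -inf
    for _ in range(max_iter):
        A = np.maximum(A, V)                       # Ai(k) = max(A_{i-1}(k), V_{i-1}(k))
        cep = np.fft.irfft(A)                      # real cepstrum of the symmetric spectrum
        lift = np.zeros_like(cep)
        lift[:n_cc] = cep[:n_cc]                   # keep the low-quefrency coefficients...
        lift[-(n_cc - 1):] = cep[-(n_cc - 1):]     # ...and their symmetric counterparts
        V = np.fft.rfft(lift).real                 # Vi = FFT(lifter(Ai)): smoothed envelope
        if np.max(A - V) < delta:                  # stop once the envelope covers the peaks
            break
    return V, lift[:n_cc]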
31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with wavenet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient Spectral Envelope Estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an rnn for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model reconstruction (trained on MIDI 63)
3. Spectral Model reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric reconstruction of input note
6. Input MIDI 63 note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE-generated MIDI 65 violin note
13. Similar MIDI 65 violin note from dataset

36/36

Audio examples description II

14. CVAE reconstruction of the MIDI 65 violin note
15. CVAE-generated MIDI 65 violin note with vibrato

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 23: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

536

Our Nearest Neighbours

I Major issue with previous methods was frame-wiseanalysis-synthesis based reconstruction which fails to modeltemporal evolution

I [Engel et al 2017] inspired by Wavenets [Oord et al 2016]autoregressive modeling capablities for speech extended it tomusical instrument synthesis

X Generation of realistic and creative soundstimes Complex network which was difficult to train and sample from

I [Wyse 2018] also autoregressively modelled the audio albeitby conditioning the waveform samples on additional parameterslike pitch velocity(loudness) and instrument class

X Was able to achieve better control over generation byconditioning

times Unable to generalize to untrained pitches

Why Parametric?

I Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesis of a given instrument sound with flexible control over the pitch and loudness dynamics

I Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) being kept constant [Roebel and Rodet 2005]

I A powerful parametric representation over raw waveform or spectrogram has the potential to achieve high quality with less training data

1. [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way

Dataset

I Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

I We work with the 'Violin', played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

Why we chose Violin: popular in Indian music, human voice-like timbre, ability to produce continuous pitch

Figure: Instances per note in the overall dataset (26-34 instances per MIDI note, 60-71)

I Average note duration for the chosen octave is about 4.5 s per note

Dataset

I We split the data into train (80%) and test (20%) instances across MIDI note labels (a small sketch of such a split follows)

I Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances
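As a concrete illustration, a minimal sketch of a per-note train/test split; the (midi_note, filepath) instance format and the fixed random seed are assumptions of this sketch, not details given in the talk.

    # Hedged sketch: per-MIDI-note 80/20 split of note instances.
    import random
    from collections import defaultdict

    def split_instances(instances, train_frac=0.8, seed=0):
        # instances: list of (midi_note, filepath) pairs (assumed format)
        by_note = defaultdict(list)
        for note, path in instances:
            by_note[note].append(path)
        rng = random.Random(seed)
        train, test = [], []
        for note, paths in by_note.items():
            rng.shuffle(paths)
            k = int(train_frac * len(paths))
            train += [(note, p) for p in paths[:k]]
            test += [(note, p) for p in paths[k:]]
        return train, test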

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

Figure: Sustain segment → FFT → Encoder → Z → Decoder (AE)

I Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984] (see the sketch below)
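For reference, a minimal sketch of this analysis-resynthesis loop, assuming librosa and a frame-wise autoencoder `ae` that maps one magnitude frame to its reconstruction; the FFT/hop sizes here are illustrative, not the talk's settings.

    import numpy as np
    import librosa

    def ae_resynthesize(y, ae, n_fft=1024, hop=256):
        S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))   # magnitude spectrogram
        S_hat = np.stack([ae(frame) for frame in S.T], axis=1)     # frame-wise AE pass
        # phase-less inversion of the modified spectrogram [Griffin and Lim 1984]
        return librosa.griffinlim(S_hat, hop_length=hop, n_iter=60)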

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)
Figure: Reconstruction, trained including MIDI 63 (audio example 2)
Figure: Reconstruction, trained excluding MIDI 63 (audio example 3)

Parametric Model

1. Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

Figure: Sustain segment → FFT → HpR → f0 + harmonic magnitudes

I Output of HpR block ⇒ log-dB magnitudes + harmonics
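A toy illustration of the idea: the real system uses the full sinusoidal-model analysis of [Serra et al 1997], whereas this sketch simply peak-picks near integer multiples of f0; the harmonic count and search width are assumptions.

    import numpy as np

    def harmonic_log_magnitudes(mag_frame, f0, sr, n_fft, n_harm=40, search=3):
        mags_db = []
        for k in range(1, n_harm + 1):
            centre = int(round(k * f0 * n_fft / sr))     # bin nearest to k*f0
            if centre + search >= len(mag_frame):
                break
            lo = max(1, centre - search)
            peak = lo + int(np.argmax(mag_frame[lo:centre + search + 1]))
            mags_db.append(20 * np.log10(mag_frame[peak] + 1e-12))  # log-dB magnitude
        return np.array(mags_db)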

Parametric Model

2. log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

I TAE ⇒ Iterative Cepstral Liftering:

    A_0(k) = log(|X(k)|), V_0 = -∞
    while (A_i - A_0 < Δ):
        A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        V_i = FFT(lifter(A_i))

Figure: (f0, magnitudes) → TAE → (CCs, f0)

Parametric Model

I No open-source implementation is available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012] (a simplified sketch follows)

I Figure shows a TAE snapshot from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4, 5)

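A compact sketch of the iterative liftering loop; the stopping rule, lifter symmetry and coefficient count here are simplifications/assumptions, see [Roebel and Rodet 2005, IMAI 1979] for the exact procedure.

    import numpy as np

    def true_amplitude_envelope(log_mag, n_ccs, n_iter=100, tol=1e-4):
        # log_mag: full (symmetric) log-magnitude spectrum of one frame
        A = np.asarray(log_mag, dtype=float).copy()   # A_0(k) = log|X(k)|
        V = np.full_like(A, -np.inf)                  # V_0 = -inf
        for _ in range(n_iter):
            A = np.maximum(A, V)                      # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
            ceps = np.fft.ifft(A).real                # real cepstrum of current estimate
            ceps[n_ccs:-n_ccs] = 0.0                  # lifter: keep low-quefrency CCs
            V = np.fft.fft(ceps).real                 # V_i = FFT(lifter(A_i))
            if np.max(A - V) < tol:                   # stop once envelope hugs the peaks
                break
        return V, ceps[:n_ccs]                        # smooth envelope + its CCs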

Parametric Model

Figure: TAE spectral envelopes for MIDI 60, 65 and 71 (Magnitude (dB) vs. Frequency (kHz))

I Spectral envelope shape varies across pitch:
  1. Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]
  2. Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

Parametric Model

I Number of CCs (Cepstral Coefficients, K_cc) depends on the sampling rate (F_s) and pitch (f_0):

    K_cc ≤ F_s / (2 f_0)

I For our network, we choose the maximum K_cc (lowest pitch) = 91, and zero-pad the higher-pitch CC vectors to 91-dimensional vectors (see the sketch below)
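A small sketch of this bookkeeping; the 48 kHz sampling rate in the example is an assumption used only to reproduce the stated maximum of 91 coefficients.

    import numpy as np

    def num_ccs(fs, f0):
        return int(fs // (2 * f0))            # K_cc <= F_s / (2 * f_0)

    def pad_ccs(ccs, dim=91):
        out = np.zeros(dim)
        out[:min(len(ccs), dim)] = ccs[:dim]  # higher pitches -> fewer CCs -> zero-pad
        return out

    # Assuming a 48 kHz sampling rate, the lowest pitch (MIDI 60, ~261.6 Hz)
    # gives K_cc = 91, which fixes the network's input dimensionality.
    print(num_ccs(48000, 261.63))             # -> 91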

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (Mean Squared Error) between the input and the network reconstruction
  • Simple to train, and performs good reconstruction (since they minimize MSE)
  • Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - inspired by Variational Inference; enforce a prior on the latent space
  • Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE, however learns the conditional distribution over an additional conditioning variable

Generative Models

Why VAE over AE?
Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?
Conditioning on pitch ⇒ network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control

Network Architecture

Figure: (CCs, f0) → Encoder → Z → Decoder (CVAE)

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of the squared error between the input and reconstructed log-magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

Network Architecture

I Main hyperparameters -

1. β - controls the relative weighting between reconstruction and prior enforcement:

    L ∝ MSE + β · KLD

Figure: CVAE, varying β (MSE vs. latent space dimensionality, for β = 0.01, 0.1, 1)

I Tradeoff between both terms; we choose β = 0.1

Network Architecture

I Main hyperparameters -

2. Dimensionality of the latent space - the network's reconstruction ability

Figure: CVAE (β = 0.1) vs. AE (MSE vs. latent space dimensionality)

I Steep fall initially, flatter later; we choose dimensionality = 32

Network Architecture

I Network size: [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs, 32 is the latent space dimensionality
I Linear fully connected layers + Leaky ReLU activations
I ADAM [Kingma and Ba 2014] with initial lr = 10^-3
I Training for 2000 epochs with batch size 512 (a hedged code sketch follows)
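A hedged PyTorch sketch consistent with the sizes above; how f0 is normalized and injected into the encoder/decoder, and the exact weighting of the KLD term, are assumptions of this sketch rather than confirmed details.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CVAE(nn.Module):
        def __init__(self, cc_dim=91, hidden=91, z_dim=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(cc_dim + 1, hidden), nn.LeakyReLU())
            self.mu = nn.Linear(hidden, z_dim)
            self.logvar = nn.Linear(hidden, z_dim)
            self.dec = nn.Sequential(nn.Linear(z_dim + 1, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, cc_dim))

        def forward(self, ccs, f0):                    # f0: shape (batch, 1), normalized
            h = self.enc(torch.cat([ccs, f0], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
            return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

    def cvae_loss(recon, ccs, mu, logvar, beta=0.1):
        mse = F.mse_loss(recon, ccs)                                   # reconstruction term
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) # prior term
        return mse + beta * kld                                        # L ∝ MSE + β·KLD

    model = CVAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # ADAM, initial lr = 1e-3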

Experiments

I Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches

Reconstruction

I Two training contexts -

1. T (✗) is the target, ✓ marks training instances:

   MIDI: T-3  T-2  T-1  T   T+1  T+2  T+3
   Kept: ✓    ✓    ✓    ✗   ✓    ✓    ✓

2. Octave endpoints:

   MIDI: 60  61  62  63  64  65  66  67  68  69  70  71
   Kept: ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances

Reconstruction

Figure: MSE vs. target pitch for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6-8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9-11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as sketched below

Figure: 'Conditional' AE (cAE): (CCs, f0) → cAE → (CCs', f0')

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
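A minimal sketch of this baseline, mirroring the CVAE sketch earlier; the layer sizes are assumed to match, and the original implementation may differ.

    import torch.nn as nn

    class ConditionalAE(nn.Module):
        # 'cAE' baseline: pitch is simply appended to the CC vector and the
        # whole concatenation [CCs, f0] is reconstructed.
        def __init__(self, cc_dim=91, hidden=91, z_dim=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(cc_dim + 1, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, z_dim))
            self.dec = nn.Sequential(nn.Linear(z_dim, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, cc_dim + 1))

        def forward(self, ccs_f0):              # input: concatenated [CCs, f0]
            return self.dec(self.enc(ccs_f0))   # output: reconstructed [CCs', f0']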

Reconstruction

Figure: MSE vs. target pitch for AE, CVAE and cAE (octave-endpoint training)

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network: (Z, f0) → Decoder

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (sketched below)
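In the spirit of that procedure, a hedged sketch that decodes a slow latent random walk at a desired pitch. It reuses `dec` from the CVAE sketch above; the step size, frame count and f0 scaling are assumptions of this sketch.

    import torch

    @torch.no_grad()
    def generate_cc_frames(model, f0_value, n_frames=200, z_dim=32, step=0.05):
        z = torch.randn(1, z_dim)                      # start from the prior
        f0 = torch.full((1, 1), float(f0_value))       # desired (possibly unseen) pitch
        frames = []
        for _ in range(n_frames):
            z = z + step * torch.randn_like(z)         # slow random walk -> coherent frames
            frames.append(model.dec(torch.cat([z, f0], dim=-1)))
        return torch.cat(frames, dim=0)                # (n_frames, 91) CC envelope frames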

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we can see that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12-14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1. An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural
2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics
3. Conducting more formal listening tests involving synthesis of larger pitch movements, or melodic elements from Indian Classical music

I Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE
2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 25: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

636

Why Parametric?

I Rather than generating new timbres ('interpolating' across sounds), we consider the problem of synthesizing a given instrument's sound with flexible control over the pitch and loudness dynamics

I Pitch shifting without timbre modification uses a source-filter model, with the filter (spectral envelope) kept constant [Roebel and Rodet 2005]

I A powerful parametric representation, over a raw waveform or spectrogram, has the potential to achieve high quality with less training data

1 [Blaauw and Bonada 2016] recognized this in the context of speech synthesis and used a vocoder representation to train a generative model, achieving promising results along the way

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments

I We work with the 'Violin' played in mezzo-forte loudness and choose the 4th octave (MIDI 60-71)

Figure: Instances per note (MIDI 60-71) in the overall dataset; roughly 26-34 instances per note

I Average note duration for the chosen octave is about 4.5 s per note


Why we chose the Violin: popular in Indian music, human voice-like timbre, and the ability to produce continuous pitch


836

Dataset

I We split the data into train (80%) and test (20%) instances across MIDI note labels

I Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances

936

Non-Parametric Reconstruction

I The framewise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - training including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

Diagram: Sustain → FFT → Encoder → Z → Decoder → reconstructed magnitude spectrum (AE)

I Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984], as sketched below
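A minimal sketch of this analysis-synthesis loop, assuming librosa for the STFT and Griffin-Lim steps; 'autoencoder', 'n_fft' and 'hop' are placeholder names and values, not necessarily the settings used in this work:

import numpy as np
import librosa

def reconstruct_with_ae(y, sr, autoencoder, n_fft=1024, hop=256):
    # 'autoencoder' is any callable mapping one magnitude-spectrum frame to
    # its reconstruction (e.g. the forward pass of a trained model)
    # Magnitude spectrogram of the (sustain portion of the) note
    S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))
    # Pass each frame (column) through the autoencoder
    S_hat = np.stack([autoencoder(frame) for frame in S.T], axis=1)
    # Invert the magnitude-only spectrogram with Griffin-Lim
    return librosa.griffinlim(S_hat, hop_length=hop, n_fft=n_fft)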

1036

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)

Figure: Reconstruction, trained including MIDI 63 (audio example 2); Figure: Reconstruction, trained excluding MIDI 63 (audio example 3)


1136

Parametric Model

1 Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

Diagram: Sustain → FFT → HpR → f0, harmonic magnitudes

I Output of the HpR block ⇒ log-dB magnitudes + harmonics; a sketch of this per-frame step follows below
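A rough per-frame illustration of what the HpR block outputs (not the actual HpR implementation, which also estimates f0 and separates a residual), assuming the frame's f0 is already known:

import numpy as np

def harmonic_magnitudes_db(mag_frame, f0, sr, n_fft):
    # Harmonics of f0 below the Nyquist frequency
    n_harm = int((sr / 2) // f0)
    harm_freqs = f0 * np.arange(1, n_harm + 1)
    # Nearest FFT bin to each harmonic (a crude stand-in for peak picking)
    bins = np.round(harm_freqs * n_fft / sr).astype(int)
    # Log-dB magnitudes at the harmonic locations
    mags_db = 20 * np.log10(np.maximum(mag_frame[bins], 1e-10))
    return harm_freqs, mags_db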

1236

Parametric Model

2 log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

I TAE ⇒ iterative cepstral liftering:

A_0(k) = log(|X(k)|), V_0 = −∞
while (A_i − A_0 < ∆):
    A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
    V_i = FFT(lifter(A_i))

Diagram: f0, harmonic magnitudes → TAE → CCs, f0
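A Python sketch of this liftering loop; the lifter cutoff kcc, the threshold delta and the exact stopping test follow the usual true-envelope formulation and are assumptions, not necessarily the settings of our implementation:

import numpy as np

def true_amplitude_envelope(log_mag, kcc, delta=0.1, max_iter=100):
    # log_mag: one log-magnitude frame sampled on the rFFT grid
    A = log_mag.copy()
    V = np.full_like(A, -np.inf)
    for _ in range(max_iter):
        A = np.maximum(A, V)              # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        ceps = np.fft.irfft(A)            # real cepstrum of the updated spectrum
        ceps[kcc:-kcc] = 0.0              # lifter: keep only low-quefrency coefficients
        V = np.fft.rfft(ceps).real        # V_i = FFT(lifter(A_i))
        if np.max(A - V) < delta:         # stop once the envelope hugs the spectrum
            break
    return V, ceps[:kcc]                  # smoothed envelope and the CCs fed to the network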

1336

Parametric Model

I No open-source implementation is available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]

I The figure shows a TAE snapshot from [Caetano and Rodet 2012]

I Similar to the results we get (audio examples 4, 5)


1436

Parametric Model

Figure: TAE spectral envelopes (Magnitude in dB vs Frequency in kHz) for MIDI 60, 65 and 71

I Spectral envelope shape varies across pitch

1 Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

1536

Parametric Model

I The number of CCs (cepstral coefficients, K_cc) depends on the sampling rate (F_s) and the pitch (f_0):

K_cc ≤ F_s / (2 f_0)

I For our network we choose the maximum K_cc (lowest pitch) = 91 and zero-pad the higher-pitch CC vectors to 91 dimensions; a worked example follows below
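A small worked example of this bound, assuming a 48 kHz sampling rate (an assumption on our part; the bound scales linearly with F_s):

def kcc_for_midi(midi, fs=48000):
    # Equal-tempered frequency of the MIDI note
    f0 = 440.0 * 2 ** ((midi - 69) / 12)
    return int(fs / (2 * f0))

print(kcc_for_midi(60))   # lowest note of the chosen octave -> 91
print(kcc_for_midi(71))   # highest note -> 48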

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (mean squared error) between the input and the network reconstruction

• Simple to train and perform good reconstruction (since they minimize MSE)

• Not truly generative models, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - inspired by variational inference, enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE, but learn the distribution conditioned on an additional conditioning variable

1736

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between timbre and pitch ⇒ more accurate envelope generation + pitch control

1836

Network Architecture

Diagram: CCs, f0 → Encoder → Z → Decoder → reconstructed CCs (CVAE)

I The network input is CCs → the MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log-magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

1936

Network Architecture

I Main hyperparameters -

1 β - controls the relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

Figure: CVAE reconstruction MSE vs latent space dimensionality for β = 0.01, 0.1 and 1

I Tradeoff between both terms; we choose β = 0.1 (a sketch of this objective follows below)
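A sketch of this objective in PyTorch, assuming a Gaussian approximate posterior with diagonal covariance; the averaging over batch and dimensions is an assumption:

import torch

def cvae_loss(x, x_hat, mu, logvar, beta=0.1):
    # Reconstruction term: MSE between input and reconstructed CC vectors
    mse = torch.mean((x - x_hat) ** 2)
    # KL divergence of N(mu, diag(exp(logvar))) from the standard normal prior
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld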

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of the latent space - governs the network's reconstruction ability

Figure: Reconstruction MSE vs latent space dimensionality, CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later; we choose dimensionality = 32

2136

Network Architecture

I Network size [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear fully connected layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512; a sketch of this architecture is given below
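A PyTorch sketch consistent with the sizes listed above; the way f0 enters the network (here a scalar appended to both the encoder and decoder inputs) is an assumption about the conditioning scheme:

import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, cc_dim=91, cond_dim=1, hidden=91, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, hidden), nn.LeakyReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(
            nn.Linear(latent + cond_dim, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, cc_dim))

    def forward(self, cc, f0):
        h = self.enc(torch.cat([cc, f0], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # initial learning rate as on the slide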

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - how well the model 'synthesizes' note instances with new, unseen pitches

2336

Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances:
MIDI: T-3  T-2  T-1   T   T+1  T+2  T+3
Kept:  X    X    X    ×    X    X    X

2 Octave endpoints:
MIDI: 60 61 62 63 64 65 66 67 68 69 70 71
Kept:  X  ×  ×  ×  ×  ×  ×  ×  ×  ×  ×  X

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances

2436

Reconstruction

Figure: Reconstruction MSE vs target pitch (MIDI 60-72) for AE and CVAE

I Both the AE and the CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Diagram: [CCs, f0] → cAE → [CCs′, f0′]

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0

2636

Reconstruction

Figure: Reconstruction MSE vs target pitch (MIDI 60-72) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate its performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Diagram: Z, f0 → Decoder

Figure: Sampling from the network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (a sketch follows below)
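A sketch of this sampling procedure, reusing the hypothetical CVAE decoder from the architecture sketch above; the step size and number of frames are illustrative choices:

import torch

@torch.no_grad()
def generate_note(model, f0, n_frames=200, latent=32, step=0.05):
    z = torch.zeros(latent)                      # start from the prior mean
    cond = torch.full((1,), float(f0))
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn(latent)       # small coherent step: a random walk in Z
        frames.append(model.dec(torch.cat([z, cond], dim=-1)))
    return torch.stack(frames)                   # CC frames -> envelopes -> additive synthesis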

2836

Generation

I We train on instances across the entire octave sans MIDI 65 and then generate MIDI 65

I On listening, we can hear that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


2936

Putting it all together

I We explored autoencoder frameworks as generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

I Pitch conditioning allows the network to generate the learnt spectral envelope for that pitch, enabling us to vary the pitch contour continuously

3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further

1 An immediate next step will be to model the residual of the input audio; this will help make the generated audio sound more natural

2 We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research

3136

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

3436

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to the Spectral Model

2 Spectral Model reconstruction (trained on MIDI 63)

3 Spectral Model reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to the Parametric Model

5 Parametric reconstruction of the input note

6 Input MIDI 63 note

7 Parametric AE reconstruction of the input

8 Parametric CVAE reconstruction of the input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of the input (endpoint-trained model)

11 Parametric CVAE reconstruction of the input (endpoint-trained model)

12 CVAE-generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from the dataset

3636

Audio examples description II

14 CVAE reconstruction of the MIDI 65 Violin note

15 CVAE-generated MIDI 65 Violin note with vibrato


736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

736

Dataset

I Good-sounds dataset [Romani Picas et al 2015] consisting ofindividual note and scale recordings for 12 different instruments

I We work with the lsquoViolinrsquo played in mezzo-forte loudnessand choose the 4th octave(MIDI 60-71)

Why we chose ViolinPopular in Indian Music Human voice-like timbre

Ability to produce continuous pitch

0 5 10 15 20 25 30 35

606162636465666768697071

282626262626

343030

262626

Instances

MID

I

Figure Instances per note in the overall dataset

I Average note duration for the chosen octave is about 45s pernote

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances


24/36

Reconstruction

Figure: Reconstruction MSE (of order 10^-2) vs. target pitch (MIDI 60-72) for AE and CVAE

• Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)
• The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)
• Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction
• To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below:

   f0, CCs → cAE → f0′, CCs′

Figure: 'Conditional' AE (cAE)

• [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model
• The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0

26/36

Reconstruction

Figure: Reconstruction MSE (of order 10^-2) vs. target pitch (MIDI 60-72) for AE, CVAE and cAE

• We train the cAE on the octave endpoints and evaluate performance as shown above
• Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation
• We are interested in 'synthesizing' audio
• We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

   Z, f0 → Decoder

Figure: Sampling from the Network

• We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (sketch below)
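A small sketch of this sampling procedure, reusing the CVAE sketch from earlier; the step size, number of frames, and the scalar way f0 is fed in are illustrative assumptions rather than the authors' settings.

import torch

z_dim, n_frames, step = 32, 200, 0.05
f0 = torch.full((1, 1), 65.0)                # desired (untrained) MIDI pitch as the conditioning input
z = torch.randn(1, z_dim)                    # start from a draw of the unit-Gaussian prior
frames = []
model.eval()
with torch.no_grad():
    for _ in range(n_frames):
        z = z + step * torch.randn_like(z)                 # small steps keep successive frames coherent
        cc_frame = model.dec(torch.cat([z, f0], dim=-1))   # decode one spectral-envelope frame
        frames.append(cc_frame)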

28/36

Generation
• We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65
• On listening, we can see that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)
• For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


29/36

Putting it all together
• We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones
• Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies
• Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together
• Our model is still far from perfect, however, and we have to improve it further:

1. An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural
2. We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics
3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

• Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE
2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model Reconstruction (trained on MIDI 63)
3. Spectral Model Reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric Reconstruction of input note
6. Input MIDI 63 Note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE Generated MIDI 65 Violin note
13. Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14. CVAE Reconstruction of the MIDI 65 violin note
15. CVAE Generated MIDI 65 Violin note with vibrato


7/36

Dataset
• Good-sounds dataset [Romani Picas et al 2015], consisting of individual note and scale recordings for 12 different instruments
• We work with the 'Violin' played in mezzo-forte loudness, and choose the 4th octave (MIDI 60-71)

Why we chose the Violin: popular in Indian music, human voice-like timbre, and the ability to produce continuous pitch

Figure: Instances per note (MIDI 60-71) in the overall dataset (roughly 26-34 instances per note)

• The average note duration for the chosen octave is about 4.5 s per note

8/36

Dataset
• We split the data into train (80%) and test (20%) instances across MIDI note labels
• Our model is trained with frames (duration 21.3 ms) from the train instances, and system performance is evaluated with frames from the test instances (a sketch of this split follows below)
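A minimal sketch of such a per-pitch split and framing; the 1024-sample frame length (≈ 21.3 ms at an assumed 48 kHz rate), the hop size, and the dictionary layout are assumptions for illustration.

import numpy as np

def split_and_frame(notes_by_midi, frame_len=1024, hop=512, train_frac=0.8):
    """notes_by_midi: dict mapping a MIDI label to a list of mono note recordings (hypothetical layout)."""
    train_frames, test_frames = [], []
    for midi, instances in notes_by_midi.items():
        n_train = int(train_frac * len(instances))        # 80/20 split within each pitch label
        for i, audio in enumerate(instances):
            frames = [audio[s:s + frame_len]
                      for s in range(0, len(audio) - frame_len + 1, hop)]
            (train_frames if i < n_train else test_frames).extend(frames)
    return np.array(train_frames), np.array(test_frames)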

9/36

Non-Parametric Reconstruction
• The frame-wise magnitude spectra reconstruction procedure in [Roche et al 2018] cannot generalize to untrained pitches
• We consider two cases - train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

   Sustain → FFT → Encoder → Z → Decoder (AE)

• Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim 1984] (sketch below)
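A short sketch of the inversion step using librosa's Griffin-Lim; the variable decoded_frames (the AE's reconstructed magnitude spectra) and the STFT parameters are illustrative assumptions.

import numpy as np
import librosa

# decoded_frames: list of reconstructed magnitude spectra, one per analysis frame (hypothetical)
S_hat = np.stack(decoded_frames, axis=1)    # magnitude spectrogram, shape (n_fft//2 + 1, n_frames)
audio = librosa.griffinlim(S_hat, n_iter=60, hop_length=512, win_length=1024)  # iterative phase estimation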


10/36

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)
Figure: Reconstruction, trained including MIDI 63 (audio example 2)
Figure: Reconstruction, trained excluding MIDI 63 (audio example 3)


11/36

Parametric Model
1. Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al 1997] (currently we neglect the residual)

   Sustain → FFT → HpR → f0, Magnitudes

• Output of the HpR block ⇒ log-dB magnitudes + harmonics


12/36

Parametric Model
2. log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet 2005, IMAI 1979]

• TAE ⇒ Iterative Cepstral Liftering:

   A_0(k) = log(|X(k)|),  V_0 = −∞
   while (A_i − A_0 < ∆):
       A_i(k) = max(A_{i−1}(k), V_{i−1}(k))
       V_i = FFT(lifter(A_i))

   f0, Magnitudes → TAE → CCs, f0

(a Python sketch of this iteration follows below)
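A minimal NumPy sketch of this iteration, assuming a single frame's linear magnitude spectrum and a chosen number of cepstral coefficients k_cc; the stopping test and iteration cap are illustrative choices, not the authors' exact settings.

import numpy as np

def true_amplitude_envelope(mag_spectrum, k_cc, n_iter=100, tol=1e-4):
    A0 = np.log(np.maximum(mag_spectrum, 1e-10))   # A_0(k) = log|X(k)|
    A = A0.copy()
    V = np.full_like(A0, -np.inf)                  # V_0 = -inf
    for _ in range(n_iter):
        A = np.maximum(A, V)                       # A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        cep = np.fft.irfft(A)                      # real cepstrum of the current estimate
        cep[k_cc:-k_cc] = 0.0                      # lifter: keep only the first k_cc coefficients
        V = np.fft.rfft(cep).real                  # V_i = FFT(lifter(A_i)), the smooth envelope
        if np.max(A0 - V) < tol:                   # stop once the envelope covers every spectral peak
            break
    return V                                       # smooth log-magnitude envelope (the TAE)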


13/36

Parametric Model
• No open-source implementation is available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet 2005, Caetano and Rodet 2012]
• The figure shows a TAE snapshot from [Caetano and Rodet 2012]
• Similar to the results we get (audio examples 4, 5)


14/36

Parametric Model

Figure: TAE spectral envelopes, magnitude (dB) vs. frequency (kHz), for MIDI 60, 65 and 71

• Spectral envelope shape varies across pitch:
1. Dependence of the envelope on pitch [Slawson 1981, Caetano and Rodet 2012]
2. Variation due to the TAE algorithm
• TAE → smooth function to estimate harmonic amplitudes

15/36

Parametric Model
• The number of CCs (Cepstral Coefficients, K_cc) depends on the sampling rate (Fs) and the pitch (f0):

   K_cc ≤ Fs / (2 f0)

• For our network we choose the maximum K_cc (lowest pitch) = 91, and zero-pad the higher-pitch CC vectors to 91-dimensional vectors (sketch below)
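A small sketch of this bound and the zero-padding; the 48 kHz sampling rate and the use of the MIDI-60 fundamental are assumptions chosen so that the lowest pitch of the octave gives K_cc = 91.

import numpy as np

FS = 48000                                   # assumed sampling rate
f0 = 440.0 * 2 ** ((60 - 69) / 12)           # MIDI 60 ≈ 261.6 Hz, the lowest pitch of the octave
k_cc_max = int(FS / (2 * f0))                # K_cc ≤ Fs / (2 f0)  →  91 for MIDI 60

def pad_ccs(cc, target_dim=91):
    """Zero-pad a higher-pitch CC vector (fewer coefficients) to the fixed network input size."""
    out = np.zeros(target_dim)
    out[:len(cc)] = cc
    return out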

16/36

Generative Models
• Autoencoders [Hinton and Salakhutdinov 2006] - minimize the MSE (Mean Squared Error) between the input and the network reconstruction
  - Simple to train, and perform good reconstruction (since they minimize the MSE)
  - Not truly a generative model, as you cannot generate new data
• Variational Autoencoders [Kingma and Welling 2013] - inspired by Variational Inference; enforce a prior on the latent space
  - Truly generative, as we can 'generate' new data by sampling from the prior
• Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - same principle as a VAE; however, they learn the conditional distribution over an additional conditioning variable


17/36

Generative Models

Why VAE over AE?
A continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?
Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control


18/36

Network Architecture

   CCs, f0 → Encoder → Z → Decoder → CCs′ (CVAE)

• The network input is CCs → the MSE represents a perceptually relevant distance, in terms of the squared error between the input and reconstructed log magnitude spectral envelopes
• We train the network on frames from the train instances
• For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 28: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

836

Dataset

I We split the data to train(80) and test(20) instances acrossMIDI note labels

I Our model is trained with frames(duration 213ms) from thetrain instances and system performance is evaluated withframes from the test instances

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

18/36

Network Architecture

Figure: CVAE architecture - CCs and f0 → Encoder → Z → Decoder → CCs

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of squared error, between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames

19/36

Network Architecture

I Main hyperparameters -

1 β - Controls the relative weighting between reconstruction and prior enforcement

L ∝ MSE + β · KLD

Figure: CVAE reconstruction MSE vs. latent space dimensionality for β = 0.01, 0.1 and 1

I Tradeoff between both terms; choose β = 0.1
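For illustration, a minimal PyTorch sketch of this β-weighted objective, assuming a diagonal Gaussian posterior whose encoder outputs mu and logvar (function and variable names are ours, not from the talk):

import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    # Reconstruction term: MSE between input and reconstructed CC vectors
    mse = F.mse_loss(x_hat, x, reduction="mean")
    # Prior enforcement term: KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
    kld = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return mse + beta * kld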


20/36

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - affects the network's reconstruction ability

Figure: Reconstruction MSE vs. latent space dimensionality, CVAE (β = 0.1) vs. AE

I Steep fall initially, flatter later. Choose dimensionality = 32


21/36

Network Architecture

I Network size: [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba, 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512
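A minimal PyTorch sketch of such a network, under the assumption that f0 is appended to both the encoder input and the latent code (the exact conditioning scheme is not specified on the slide, so treat it as illustrative):

import torch
import torch.nn as nn

class CVAE(nn.Module):
    # 91-dim CC input, 91-dim hidden layers, 32-dim latent space,
    # Linear + Leaky ReLU layers, pitch (f0) appended as conditioning.
    def __init__(self, cc_dim=91, hidden=91, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(cc_dim + 1, hidden), nn.LeakyReLU())
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Sequential(
            nn.Linear(latent + 1, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, cc_dim),
        )

    def forward(self, cc, f0):
        h = self.enc(torch.cat([cc, f0], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z from N(mu, sigma^2)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        x_hat = self.dec(torch.cat([z, f0], dim=1))
        return x_hat, mu, logvar

# Example optimizer setup matching the slide:
# model = CVAE(); opt = torch.optim.Adam(model.parameters(), lr=1e-3)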

22/36

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


23/36

Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances
  MIDI: T-3  T-2  T-1   T   T+1  T+2  T+3
  Kept:  X    X    X    ×    X    X    X

2 Octave endpoints
  MIDI: 60  61  62  63  64  65
  Kept:  X   ×   ×   ×   ×   ×
  MIDI: 66  67  68  69  70  71
  Kept:  ×   ×   ×   ×   ×   X

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances


24/36

Reconstruction

Figure: Reconstruction MSE vs. pitch (MIDI 60-72) for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: 'Conditional' AE (cAE) - [f0, CCs] → cAE → [f0', CCs']

I [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
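A one-line sketch of this baseline's input construction (names are ours): the pitch is simply concatenated to the CC vector and a plain autoencoder reconstructs the concatenation:

import torch

def cae_input(ccs, f0):
    # Append (normalized) f0 to the 91-dim CC vector; a plain AE then
    # autoencodes this 92-dim vector, reconstructing both CCs' and f0'.
    return torch.cat([ccs, f0], dim=1)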

26/36

Reconstruction

Figure: Reconstruction MSE vs. pitch (MIDI 60-72) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network - Z → Decoder (conditioned on f0)

I We follow the procedure in [Blaauw and Bonada, 2016] - a random walk to sample points coherently from the latent space
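A sketch of this sampling idea, reusing the CVAE sketched earlier; the step size and frame count are assumptions for illustration:

import torch

@torch.no_grad()
def random_walk_generate(model, f0, n_frames=200, step=0.05, latent=32):
    # Coherent latent sampling via a Gaussian random walk:
    # small correlated steps in Z, decoded with the desired f0 conditioning.
    z = torch.zeros(1, latent)
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn_like(z)          # small random step
        cc = model.dec(torch.cat([z, f0], dim=1))   # decode envelope CCs
        frames.append(cc)
    return torch.cat(frames, dim=0)                 # (n_frames, 91)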

28/36

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we can see that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


29/36

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further

1 An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2 We have not taken into account dynamics. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint trained model)

10 Parametric AE reconstruction of input (endpoint trained model)

11 Parametric CVAE reconstruction of input (endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato


9/36

Non-Parametric Reconstruction

I The framewise magnitude spectra reconstruction procedure in [Roche et al., 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63, along with its 3 neighbouring pitches on either side

Figure: Non-parametric pipeline - Sustain segment → FFT → Encoder → Z → Decoder (AE)

I Repeat the above across all input spectral frames and invert the obtained spectrogram using Griffin-Lim [Griffin and Lim, 1984]
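For reference, a hedged sketch of this inversion step using librosa's Griffin-Lim implementation; the parameter values are illustrative assumptions, not the ones used in the talk:

import numpy as np
import librosa

def invert_spectrogram(S_hat, n_fft=1024, hop_length=256, n_iter=60):
    # S_hat: reconstructed magnitude spectrogram, shape (1 + n_fft//2, n_frames),
    # e.g. stacked from the per-frame decoder outputs.
    # Phase reconstruction with Griffin-Lim [Griffin and Lim, 1984].
    return librosa.griffinlim(S_hat, n_iter=n_iter,
                              hop_length=hop_length, n_fft=n_fft)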


10/36

Non-Parametric Reconstruction

Figure: Input MIDI 63 (audio example 1)

Figure: Including MIDI 63 (audio example 2)   Figure: Excluding MIDI 63 (audio example 3)


11/36

Parametric Model

1 Frame-wise magnitude spectrum → harmonic representation using the Harmonic plus Residual (HpR) model [Serra et al., 1997] (currently we neglect the residual)

Figure: Sustain segment → FFT → HpR → f0, harmonic Magnitudes

I Output of the HpR block ⇒ log-dB magnitudes + harmonics
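As a rough illustration of the harmonic part of such an analysis (a full HpR model with peak tracking and a residual is more involved), one could read the log magnitudes near integer multiples of f0 from a windowed frame; all parameter values below are assumptions:

import numpy as np

def harmonic_magnitudes(frame, fs, f0, n_harmonics=40, n_fft=2048):
    # Magnitude spectrum of one windowed frame
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft))
    mags_db = []
    for k in range(1, n_harmonics + 1):
        bin_k = int(round(k * f0 * n_fft / fs))
        if bin_k >= len(spec):
            break
        # take the strongest bin in a small neighbourhood around k*f0
        lo, hi = max(bin_k - 2, 0), min(bin_k + 3, len(spec))
        mags_db.append(20 * np.log10(np.max(spec[lo:hi]) + 1e-12))
    return np.array(mags_db)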


12/36

Parametric Model

2 log-dB magnitudes + harmonics → TAE algorithm [Roebel and Rodet, 2005, Imai, 1979]

I TAE ⇒ Iterative Cepstral Liftering

    A_0(k) = log(|X(k)|), V_0 = -∞
    while (A_i - A_0 < ∆):
        A_i(k) = max(A_{i-1}(k), V_{i-1}(k))
        V_i = FFT(lifter(A_i))

Figure: f0, Magnitudes → TAE → CCs, f0
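A minimal NumPy sketch of this iterative liftering; the stopping criterion and parameter values are assumptions for illustration:

import numpy as np

def true_amplitude_envelope(log_mag, n_ccs=91, n_iter=100, delta=1e-4):
    # log_mag: one frame's log magnitude spectrum (real, length M)
    A = np.copy(log_mag)
    V = np.full_like(A, -np.inf)
    for _ in range(n_iter):
        A = np.maximum(A, V)                      # clip the spectrum from below
        ceps = np.fft.irfft(A)                    # real cepstrum of current A_i
        ceps[n_ccs:len(ceps) - n_ccs + 1] = 0.0   # lifter: keep low quefrencies
        V = np.fft.rfft(ceps).real                # smooth envelope V_i
        if np.max(log_mag - V) < delta:           # envelope covers the peaks
            break
    return V, ceps[:n_ccs]                        # envelope and its CCs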


13/36

Parametric Model

I No open source implementation is available for the TAE, thus we implemented it following the procedure highlighted in [Roebel and Rodet, 2005, Caetano and Rodet, 2012]

I Figure shows a TAE snapshot from [Caetano and Rodet, 2012]

I Similar to the results we get (audio examples 4, 5)


14/36

Parametric Model

Figure: TAE spectral envelopes (Magnitude in dB vs. Frequency in kHz) for MIDI pitches 60, 65 and 71

I Spectral envelope shape varies across pitch

1 Dependence of the envelope on pitch [Slawson, 1981, Caetano and Rodet, 2012]

2 Variation due to the TAE algorithm

I TAE → smooth function to estimate harmonic amplitudes

15/36

Parametric Model

I Number of CCs (Cepstral Coefficients, Kcc) depends on the sampling rate (Fs) and pitch (f0):

K_cc ≤ Fs / (2 f0)

I For our network, we choose the maximum Kcc (lowest pitch) = 91 and zero pad the higher pitch Kcc to 91-dimensional vectors
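A small sketch of this truncation and zero padding step (function name is ours):

import numpy as np

def cc_vector(ccs, fs, f0, max_ccs=91):
    # Keep at most Kcc = Fs / (2 * f0) cepstral coefficients and zero pad
    # to a fixed max_ccs-dimensional vector (91, set by the lowest pitch).
    k_cc = int(fs // (2 * f0))
    trimmed = ccs[:min(k_cc, max_ccs)]
    return np.pad(trimmed, (0, max_ccs - len(trimmed)))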


936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable


Generative Models

Why VAE over AE

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control


Network Architecture

Figure: CVAE block diagram: (CCs, f0) → Encoder → Z → Decoder

I Network input is CCs → the MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames


Network Architecture

I Main hyperparameters -

1 β - Controls the relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

Figure: CVAE reconstruction MSE vs latent space dimensionality, for β = 0.01, 0.1 and 1

I Tradeoff between both terms; choose β = 0.1
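A minimal PyTorch sketch of this loss, assuming a diagonal-Gaussian posterior with parameters mu and logvar (the function name cvae_loss is hypothetical):

import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    # L ∝ MSE + β·KLD: reconstruction error on the CC frames plus the β-weighted
    # KL divergence of N(mu, exp(logvar)) from the unit-Gaussian prior
    mse = F.mse_loss(x_hat, x)
    kld = -0.5 * torch.mean(1.0 + logvar - mu.pow(2) - logvar.exp())
    return mse + beta * kld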


Network Architecture

I Main hyperparameters -

2 Dimensionality of the latent space - governs the network's reconstruction ability

Figure: Reconstruction MSE vs latent space dimensionality, CVAE (β = 0.1) vs AE

I Steep fall initially, flatter later; choose dimensionality = 32


Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512
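A minimal PyTorch sketch of a CVAE with this layer sizing is given below; the way the f0 condition is injected (concatenated to both the encoder and decoder inputs), the f0 normalization and the dummy data are assumptions for illustration, not the exact training code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    # Encoder 91 -> 91 -> (mu, logvar) of size 32, decoder 32 -> 91 -> 91,
    # with the (assumed 1-dim, normalized) f0 condition appended to both inputs.
    def __init__(self, x_dim=91, h_dim=91, z_dim=32, c_dim=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim + c_dim, h_dim), nn.LeakyReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + c_dim, h_dim), nn.LeakyReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x, c):
        h = self.enc(torch.cat([x, c], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        return self.dec(torch.cat([z, c], dim=-1)), mu, logvar

model = CVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(512, 91)      # one batch of 91-dim CC frames (dummy data)
c = torch.rand(512, 1)        # normalized f0 condition per frame (dummy data)
opt.zero_grad()
x_hat, mu, logvar = model(x, c)
kld = -0.5 * torch.mean(1.0 + logvar - mu.pow(2) - logvar.exp())
loss = F.mse_loss(x_hat, x) + 0.1 * kld   # same form as the cvae_loss sketched earlier
loss.backward()
opt.step()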


Experiments

I Two kinds of experiments to demonstrate the network's capabilities:

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances:

   MIDI:  T−3  T−2  T−1   T   T+1  T+2  T+3
   Kept:   X    X    X    ×    X    X    X

2 Octave endpoints:

   MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
   Kept:   X   ×   ×   ×   ×   ×   ×   ×   ×   ×   ×   X

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
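A small sketch of this evaluation metric (names and shapes are assumptions; the frames are the 91-dim CC vectors of the held-out target notes):

import numpy as np

def eval_mse(target_frames, recon_frames):
    # Average frame-wise squared error between original and reconstructed
    # spectral-envelope (CC) frames, pooled over all target-instance frames.
    t = np.asarray(target_frames, dtype=float)   # shape (n_frames, 91)
    r = np.asarray(recon_frames, dtype=float)
    return float(np.mean((t - r) ** 2))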


Reconstruction

Figure: Reconstruction MSE vs target pitch (MIDI 60-72) for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: 'Conditional' AE (cAE): (CCs, f0) → cAE → (CCs', f0')

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
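A tiny sketch of this input/target construction (the names are hypothetical, and the pitch value is assumed to be normalized before appending):

import numpy as np

def cae_pair(ccs, f0):
    # Append the pitch to the 91-dim CC vector; the cAE is trained to reconstruct
    # this 92-dim appended vector, yielding (CCs', f0') at the output.
    x = np.concatenate([np.asarray(ccs, dtype=float), [float(f0)]])
    return x, x   # the autoencoder target is the input itself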


Reconstruction

Figure: Reconstruction MSE vs target pitch (MIDI 60-72) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE


Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network: Z → Decoder, conditioned on f0

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
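A minimal sketch of such a latent random walk, assuming a trained decoder like the CVAE sketched earlier; the step size, frame count and f0 normalization are illustrative assumptions.

import torch

def random_walk_generate(decode, f0, n_frames=200, z_dim=32, step=0.05):
    # Take small Gaussian steps in the latent space and decode each point,
    # conditioned on the desired pitch, to get a coherent sequence of CC frames.
    z = torch.zeros(1, z_dim)                      # start at the prior mean
    c = torch.full((1, 1), float(f0))
    frames = []
    with torch.no_grad():
        for _ in range(n_frames):
            z = z + step * torch.randn(1, z_dim)
            frames.append(decode(z, c).squeeze(0))
    return torch.stack(frames)                     # (n_frames, 91) envelope frames to synthesize from

# e.g., with the CVAE sketch above:
# frames = random_walk_generate(lambda z, c: model.dec(torch.cat([z, c], dim=-1)), f0=0.5)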


Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model their inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate step will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2 We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


References

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato


936

Non-Parametric Reconstruction

I Framewise magnitude spectra reconstruction procedure in[Roche et al 2018] cannot generalize to untrained pitches

I We consider two cases - Train including and excluding MIDI 63along with its 3 neighbouring pitches on either side

Sustain

FFT

En

cod

er

Z

Dec

od

er

AE

I Repeat above across all input spectral frames and invertobtained spectrogram using Griffin-Lim [Griffin and Lim 1984]

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 32: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1036

Non-Parametric Reconstruction

Figure Input MIDI 63 1 1

Figure Including MIDI 63 2 2 Figure Excluding MIDI 63 3 3

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton1)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton3)ocgs[i]state=false

1728

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton5)ocgs[i]state=false

1728

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

16/36

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimize the MSE (Mean Squared Error) between the input and the network reconstruction

• Simple to train and gives good reconstruction (since they minimize MSE)

• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - Inspired by Variational Inference; enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - Same principle as a VAE, but learns the conditional distribution over an additional conditioning variable

17/36

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between timbre and pitch ⇒ more accurate envelope generation + pitch control

18/36

Network Architecture

[Figure: CVAE block diagram - CCs and f0 → Encoder → Z → Decoder (with f0) → reconstructed CCs]

I Network input is CCs → the MSE represents a perceptually relevant distance: the squared error between the input and reconstructed log-magnitude spectral envelopes

I We train the network on frames from the training instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all test-instance frames

19/36

Network Architecture

I Main hyperparameters -

1. β - Controls the relative weighting between reconstruction and prior enforcement:

    L ∝ MSE + β · KLD

[Figure: CVAE, varying β - MSE (2-8 ·10⁻⁵) vs. latent space dimensionality (0-70) for β = 0.01, 0.1, 1]

I Tradeoff between both terms; we choose β = 0.1
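A minimal PyTorch-style sketch of this loss, assuming a standard-normal prior on the latent code and an encoder that outputs the mean and log-variance of q(z | x, f0); the function and argument names are illustrative:

    import torch
    import torch.nn.functional as F

    def cvae_loss(x, x_hat, mu, logvar, beta=0.1):
        """L ∝ MSE + β·KLD for a CVAE with prior N(0, I) on z."""
        mse = F.mse_loss(x_hat, x, reduction="mean")                     # reconstruction term
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())   # KL(q(z|x,f0) || N(0,I))
        return mse + beta * kld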

20/36

Network Architecture

I Main hyperparameters -

2. Dimensionality of the latent space - the network's reconstruction ability

[Figure: CVAE (β = 0.1) vs. AE - MSE (2-8 ·10⁻⁵) vs. latent space dimensionality (0-70)]

I Steep fall initially, flatter later; we choose dimensionality = 32

21/36

Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs; 32 is the latent space dimensionality

I Linear fully connected layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512
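A minimal PyTorch sketch consistent with these sizes is shown below. How exactly f0 is injected (here it is appended, as a single value, to both the encoder and decoder inputs) is an assumption for illustration, not a statement of the exact implementation.

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        """Sketch of a [91, 91, 32, 91, 91] CVAE over CC vectors, conditioned on f0."""

        def __init__(self, in_dim=91, hid_dim=91, z_dim=32, cond_dim=1):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(in_dim + cond_dim, hid_dim), nn.LeakyReLU())
            self.mu = nn.Linear(hid_dim, z_dim)
            self.logvar = nn.Linear(hid_dim, z_dim)
            self.dec = nn.Sequential(
                nn.Linear(z_dim + cond_dim, hid_dim), nn.LeakyReLU(),
                nn.Linear(hid_dim, in_dim),
            )

        def forward(self, x, f0):
            h = self.enc(torch.cat([x, f0], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization trick
            x_hat = self.dec(torch.cat([z, f0], dim=-1))
            return x_hat, mu, logvar

    model = CVAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # ADAM with initial lr = 10^-3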

22/36

Experiments

I Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2. Generation - How well the model 'synthesizes' note instances with new, unseen pitches

23/36

Reconstruction

I Two training contexts (✓ = kept in training, ✗ = omitted) -

1. T is the target; its neighbours are training instances:

    MIDI | T−3 | T−2 | T−1 |  T  | T+1 | T+2 | T+3
    Kept |  ✓  |  ✓  |  ✓  |  ✗  |  ✓  |  ✓  |  ✓

2. Octave endpoints:

    MIDI | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71
    Kept |  ✓ |  ✗ |  ✗ |  ✗ |  ✗ |  ✗ |  ✗ |  ✗ |  ✗ |  ✗ |  ✗ |  ✓

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances

24/36

Reconstruction

[Figure: MSE (1-5 ·10⁻²) vs. target pitch (MIDI 60-72) for AE and CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Figure: 'Conditional' AE (cAE) - (f0, CCs) → cAE → (f0′, CCs′)]

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
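A minimal sketch of this baseline, reusing the layer sizes from the CVAE sketch above; appending the pitch simply grows the autoencoder's input and output by one dimension, and training minimizes the MSE over the appended vector. Names and sizes are illustrative.

    import torch
    import torch.nn as nn

    class ConditionalAE(nn.Module):
        """'cAE' baseline: a plain autoencoder that reconstructs [CCs, f0]."""

        def __init__(self, cc_dim=91, hid_dim=91, z_dim=32):
            super().__init__()
            in_dim = cc_dim + 1                      # CC vector with the pitch appended
            self.enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.LeakyReLU(),
                                     nn.Linear(hid_dim, z_dim))
            self.dec = nn.Sequential(nn.Linear(z_dim, hid_dim), nn.LeakyReLU(),
                                     nn.Linear(hid_dim, in_dim))

        def forward(self, ccs, f0):
            x = torch.cat([ccs, f0], dim=-1)         # append pitch to the input
            x_hat = self.dec(self.enc(x))            # reconstruct the appended input
            return x_hat, x                          # train with MSE(x_hat, x)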

26/36

Reconstruction

[Figure: MSE (1-5 ·10⁻²) vs. target pitch (MIDI 60-72) for AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Figure: Sampling from the network - Z → Decoder, conditioned on f0]

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
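One way such a random walk can be realized is sketched below, assuming the trained decoder from the CVAE sketch above; the step size controls how quickly the sampled envelope drifts from frame to frame and is an illustrative parameter, not a value taken from [Blaauw and Bonada 2016].

    import torch

    @torch.no_grad()
    def generate_note(decoder, f0, n_frames, z_dim=32, step=0.05):
        """Generate frame-wise CC vectors by walking coherently through latent space.

        decoder : trained CVAE decoder mapping [z, f0] -> CC vector
        f0      : conditioning pitch, tensor of shape (1, 1)
        """
        z = torch.randn(1, z_dim)                  # start from a draw of the prior
        frames = []
        for _ in range(n_frames):
            z = z + step * torch.randn(1, z_dim)   # small steps keep consecutive frames coherent
            frames.append(decoder(torch.cat([z, f0], dim=-1)))
        return torch.cat(frames, dim=0)            # (n_frames, 91) cepstral coefficients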

28/36

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we can hear that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


29/36

Putting it all together

I We explored autoencoder frameworks as generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for a given pitch, thus enabling us to vary the pitch contour continuously

30/36

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1. An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2. We have not taken dynamics into account; a more complete system will capture spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3. Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research

31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with wavenet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model Reconstruction (trained on MIDI 63)
3. Spectral Model Reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric Reconstruction of input note
6. Input MIDI 63 Note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE Generated MIDI 65 Violin note
13. Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14. CVAE Reconstruction of the MIDI 65 violin note
15. CVAE Generated MIDI 65 Violin note with vibrato

Page 33: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 34: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

16/36

Generative Models

• Autoencoders [Hinton and Salakhutdinov, 2006] - minimize the MSE (mean squared error) between the input and the network reconstruction
  - Simple to train and gives good reconstruction (since they minimize MSE)
  - Not truly a generative model, as you cannot generate new data
• Variational Autoencoders [Kingma and Welling, 2013] - inspired by variational inference; enforce a prior on the latent space
  - Truly generative, as we can 'generate' new data by sampling from the prior
• Conditional Variational Autoencoders [Doersch, 2016; Sohn et al., 2015] - same principle as a VAE, but learn the conditional distribution given an additional conditioning variable

17/36

Generative Models

Why VAE over AE?
• Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?
• Conditioning on pitch ⇒ the network captures dependencies between the timbre and the pitch ⇒ more accurate envelope generation + pitch control

18/36

Network Architecture

[Block diagram: CCs + f0 → Encoder → Z; Z + f0 → Decoder → CCs (CVAE)]

• The network input is CCs → the MSE represents a perceptually relevant distance: the squared error between the input and reconstructed log-magnitude spectral envelopes
• We train the network on frames from the training instances
• For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames
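
A minimal PyTorch sketch of a pitch-conditioned VAE over 91-dimensional CC frames; the layer sizes roughly follow the slide, but details such as how f0 is injected (here a single normalized scalar concatenated to the encoder input and to the latent code) are assumptions:

    # Sketch of the CVAE over CC frames, with f0 as the conditioning variable.
    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        def __init__(self, cc_dim=91, latent_dim=32, cond_dim=1):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(cc_dim + cond_dim, 91), nn.LeakyReLU())
            self.to_mu = nn.Linear(91, latent_dim)
            self.to_logvar = nn.Linear(91, latent_dim)
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim + cond_dim, 91), nn.LeakyReLU(),
                nn.Linear(91, cc_dim),
            )

        def reparameterize(self, mu, logvar):
            # z = mu + sigma * eps, eps ~ N(0, I)
            return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

        def forward(self, ccs, f0):
            h = self.encoder(torch.cat([ccs, f0], dim=-1))
            mu, logvar = self.to_mu(h), self.to_logvar(h)
            z = self.reparameterize(mu, logvar)
            recon = self.decoder(torch.cat([z, f0], dim=-1))
            return recon, mu, logvar

    # toy usage: a batch of 8 frames with normalized pitch values
    model = CVAE()
    recon, mu, logvar = model(torch.randn(8, 91), torch.rand(8, 1))
    print(recon.shape, mu.shape)            # torch.Size([8, 91]) torch.Size([8, 32])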

19/36

Network Architecture

• Main hyperparameters:
  1. β - controls the relative weighting between reconstruction and prior enforcement:

      L ∝ MSE + β · KLD

Figure: CVAE, varying β (MSE, ×10⁻⁵, vs. latent space dimensionality, for β = 0.01, 0.1, 1)

• Tradeoff between both terms; we choose β = 0.1
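
The objective above, written out for a diagonal-Gaussian posterior against a standard-normal prior (the closed-form KL term is an assumption about the exact formulation used):

    # Reconstruction MSE plus a beta-weighted KL divergence to N(0, I).
    import torch
    import torch.nn.functional as F

    def cvae_loss(recon, target, mu, logvar, beta=0.1):
        mse = F.mse_loss(recon, target)
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return mse + beta * kld

    # toy numbers: with mu = 0 and logvar = 0 the KL term vanishes
    print(cvae_loss(torch.randn(8, 91), torch.randn(8, 91),
                    torch.zeros(8, 32), torch.zeros(8, 32)).item())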

20/36

Network Architecture

• Main hyperparameters:
  2. Dimensionality of the latent space - the network's reconstruction ability

Figure: CVAE (β = 0.1) vs. AE (MSE, ×10⁻⁵, vs. latent space dimensionality)

• Steep fall initially, flatter later; we choose dimensionality = 32

21/36

Network Architecture

• Network size: [91, 91, 32, 91, 91]
  - 91 is the dimension of the input CCs, 32 is the latent space dimensionality
• Linear fully connected layers + Leaky ReLU activations
• ADAM [Kingma and Ba, 2014] with initial lr = 10⁻³
• Training for 2000 epochs with batch size 512
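
A compact, self-contained sketch of this training setup (Adam with lr = 1e-3, 2000 epochs, batch size 512, β = 0.1); the model is collapsed into two small Sequential blocks and the dataset is random placeholder data, so everything except the listed hyperparameters is an assumption:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(nn.Linear(92, 91), nn.LeakyReLU(), nn.Linear(91, 64))  # -> (mu, logvar)
    decoder = nn.Sequential(nn.Linear(33, 91), nn.LeakyReLU(), nn.Linear(91, 91))
    optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

    ccs, f0 = torch.randn(4096, 91), torch.rand(4096, 1)   # placeholder CC frames + pitch
    num_epochs, batch_size, beta = 2000, 512, 0.1

    for epoch in range(num_epochs):
        perm = torch.randperm(len(ccs))
        for i in range(0, len(ccs), batch_size):
            idx = perm[i:i + batch_size]
            x, c = ccs[idx], f0[idx]
            mu, logvar = encoder(torch.cat([x, c], dim=-1)).chunk(2, dim=-1)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            recon = decoder(torch.cat([z, c], dim=-1))
            kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
            loss = F.mse_loss(recon, x) + beta * kld
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()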

22/36

Experiments

• Two kinds of experiments to demonstrate the network's capabilities:
  1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch
  2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches

23/36

Reconstruction

• Two training contexts -
  1. T (×) is the target pitch, ✓ marks the training instances:

     MIDI   T−3  T−2  T−1   T   T+1  T+2  T+3
     Kept    ✓    ✓    ✓    ×    ✓    ✓    ✓

  2. Octave endpoints:

     MIDI   60  61  62  63  64  65  66  67  68  69  70  71
     Kept    ✓   ×   ×   ×   ×   ×   ×   ×   ×   ×   ×   ✓

• In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances, as sketched below
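
A small sketch of that evaluation metric, with the array shapes assumed:

    # Average frame-wise MSE between true and reconstructed log-magnitude
    # spectral envelopes, pooled over all frames of all target test instances.
    import numpy as np

    def envelope_mse(true_frames, recon_frames):
        # both arrays: (num_frames, envelope_dim)
        return np.mean((true_frames - recon_frames) ** 2)

    rng = np.random.default_rng(1)
    true_frames = rng.normal(size=(1000, 91))
    recon_frames = true_frames + 0.01 * rng.normal(size=(1000, 91))
    print(envelope_mse(true_frames, recon_frames))   # about 1e-4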

24/36

Reconstruction

[Figure: MSE (×10⁻²) vs. target pitch (MIDI 60-72) for AE and CVAE]

• Both the AE and the CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)
• The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above) (audio examples 9, 10, 11)
• Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

• To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: the 'conditional' AE (cAE) - (f0, CCs) → cAE → (f0′, CCs′)

• [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model
• The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
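
A small sketch of this cAE baseline, contrasting it with the conditioned model earlier: the pitch is simply appended to the CC vector and the appended vector is reconstructed (the layer sizes mirror the earlier sketch and are assumptions):

    import torch
    import torch.nn as nn

    cae = nn.Sequential(
        nn.Linear(92, 91), nn.LeakyReLU(), nn.Linear(91, 32),   # encoder over [CCs, f0]
        nn.Linear(32, 91), nn.LeakyReLU(), nn.Linear(91, 92),   # decoder back to [CCs', f0']
    )

    ccs, f0 = torch.randn(8, 91), torch.rand(8, 1)
    x = torch.cat([ccs, f0], dim=-1)                 # input: CCs with pitch appended
    recon = cae(x)                                   # target is the same appended vector
    print(nn.functional.mse_loss(recon, x).item())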

26/36

Reconstruction

[Figure: MSE (×10⁻²) vs. target pitch (MIDI 60-72) for AE, CVAE and cAE]

• We train the cAE on the octave endpoints and evaluate performance as shown above
• Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

• We are interested in 'synthesizing' audio
• We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: sampling from the network - Z (from the prior) + f0 → Decoder

• We follow the procedure in [Blaauw and Bonada, 2016] - a random walk to sample points coherently from the latent space
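
A sketch of that generation step as described above: a small Gaussian random walk through the latent space, decoded frame by frame at the desired (unseen) pitch; the untrained decoder stand-in, the step size and the pitch encoding are assumptions:

    import torch
    import torch.nn as nn

    decoder = nn.Sequential(nn.Linear(33, 91), nn.LeakyReLU(), nn.Linear(91, 91))

    def random_walk_latents(num_frames, latent_dim=32, step=0.05):
        z = torch.randn(latent_dim)                  # start from the prior
        path = []
        for _ in range(num_frames):
            z = z + step * torch.randn(latent_dim)   # small, coherent steps
            path.append(z)
        return torch.stack(path)

    f0_target = torch.full((200, 1), 0.5)            # desired pitch, normalized (assumed)
    z_path = random_walk_latents(200)
    cc_frames = decoder(torch.cat([z_path, f0_target], dim=-1))
    print(cc_frames.shape)                           # (200, 91): one envelope per frame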

28/36

Generation

• We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65
• On listening, the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)
• For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)


29/36

Putting it all together

• We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones
• Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies
• Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously

30/36

Putting it all together

• Our model is still far from perfect, however, and we have to improve it further:
  1. An immediate step is to model the residual of the input audio; this will help make the generated audio sound more natural
  2. We have not taken dynamics into account; a more complete system will capture spectral envelope (timbral) dependencies on both pitch and loudness dynamics
  3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music
• Our contributions:
  1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE
  2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research

31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model Reconstruction (trained on MIDI 63)
3. Spectral Model Reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric Reconstruction of input note
6. Input MIDI 63 Note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint trained model)
10. Parametric AE reconstruction of input (endpoint trained model)
11. Parametric CVAE reconstruction of input (endpoint trained model)
12. CVAE Generated MIDI 65 Violin note
13. Similar MIDI 65 Violin note from dataset

Page 35: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1136

Parametric Model

1 Frame-wise magnitude spectrum rarr harmonic representationusing Harmonic plus Residual(HpR) model[Serra et al 1997](currently we neglect the residual)

Sustain

FFTHpR

f0Magnitudes

I Output of HpR block =rArr log-dB magnitudes + harmonics

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 36: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable


Generative Models

Why VAE over AE?

A continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between timbre and pitch ⇒ more accurate envelope generation + pitch control


Network Architecture

[Block diagram: CVAE - (CCs, f0) → Encoder → Z → Decoder → CCs, with f0 as the conditioning variable]

▶ The network input is CCs → the MSE represents a perceptually relevant distance, the squared error between the input and reconstructed log-magnitude spectral envelopes

▶ We train the network on frames from the training instances

▶ For evaluation, the MSE values reported ahead are the average reconstruction error across all frames of the test instances


Network Architecture

▶ Main hyperparameters:

1. β - controls the relative weighting between reconstruction and prior enforcement

L ∝ MSE + β · KLD

[Plot: MSE vs. latent space dimensionality for β = 0.01, 0.1, 1]

Figure: CVAE, varying β

▶ Trade-off between both terms; we choose β = 0.1
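A minimal PyTorch-style sketch of this β-weighted loss, assuming the encoder outputs a per-dimension mean and log-variance (names are illustrative):

```python
import torch
import torch.nn.functional as F

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    """Reconstruction + beta-weighted KL divergence to the unit-Gaussian prior."""
    mse = F.mse_loss(x_hat, x, reduction="mean")                    # envelope reconstruction error
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL(q(z|x,f0) || N(0, I))
    return mse + beta * kld
```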


Network Architecture

▶ Main hyperparameters:

2. Dimensionality of the latent space - governs the network's reconstruction ability

[Plot: MSE vs. latent space dimensionality for the CVAE and the AE]

Figure: CVAE (β = 0.1) vs. AE

▶ Steep fall initially, flatter later; we choose dimensionality = 32


Network Architecture

▶ Network size: [91, 91, 32, 91, 91]

▶ 91 is the dimension of the input CCs; 32 is the latent space dimensionality

▶ Linear fully connected layers + Leaky ReLU activations

▶ ADAM [Kingma and Ba 2014] with initial lr = 10^−3

▶ Training for 2000 epochs with batch size 512 (see the sketch below)
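A compact PyTorch sketch consistent with these sizes; the exact layer arrangement and how f0 is injected are assumptions for illustration, not the exact training code:

```python
import torch
import torch.nn as nn

class CVAE(nn.Module):
    """Sketch: 91-d CC frames conditioned on f0, 32-d latent space."""
    def __init__(self, x_dim=91, h_dim=91, z_dim=32, c_dim=1):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim + c_dim, h_dim), nn.LeakyReLU())
        self.mu, self.logvar = nn.Linear(h_dim, z_dim), nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + c_dim, h_dim), nn.LeakyReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x, f0):
        h = self.enc(torch.cat([x, f0], dim=-1))               # encode CCs with the pitch condition
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        x_hat = self.dec(torch.cat([z, f0], dim=-1))           # decode conditioned on pitch
        return x_hat, mu, logvar

# Training configuration from the slides (data loading omitted):
# model = CVAE(); opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# for 2000 epochs, over batches of 512 frames:
#     x_hat, mu, logvar = model(ccs, f0)
#     loss = cvae_loss(x_hat, ccs, mu, logvar, beta=0.1)
```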


Experiments

▶ Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches


Reconstruction

▶ Two training contexts:

1. T (✗) is the target; ✓ marks the training instances:

   MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:   ✓    ✓    ✓    ✗    ✓    ✓    ✓

2. Octave endpoints:

   MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
   Kept:   ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

▶ In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (see the evaluation sketch below)
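A small sketch of how such a pitch-held-out split and the frame-wise MSE could be computed; the data structures and names here are hypothetical:

```python
import numpy as np

def split_by_pitch(frames_by_midi, held_out):
    """frames_by_midi: dict {midi_pitch: array of CC frames}; held_out: set of target MIDI pitches."""
    train = [f for m, fr in frames_by_midi.items() if m not in held_out for f in fr]
    test = [f for m, fr in frames_by_midi.items() if m in held_out for f in fr]
    return np.array(train), np.array(test)

def framewise_mse(reconstruct, test_frames):
    """Average squared envelope error across all frames of the target instances."""
    recon = reconstruct(test_frames)                 # reconstructed CC frames from the model
    return float(np.mean((recon - test_frames) ** 2))
```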


Reconstruction

[Plot: MSE (·10^−2) vs. target pitch (MIDI 60-72) for the AE and the CVAE]

▶ Both the AE and the CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

▶ The CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

▶ Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


Reconstruction

▶ To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

[Block diagram: (CCs, f0) → cAE → (CCs′, f0′)]

Figure: 'Conditional' AE (cAE)

▶ [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

▶ The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
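For illustration, the cAE input/target could be formed as below (a sketch; the pitch scaling is an assumption):

```python
import numpy as np

def cae_frame(ccs, f0, f0_scale=1000.0):
    """Append a (scaled) pitch value to the 91-d CC frame; the cAE autoencodes this
    92-d vector, so the reconstruction target includes f0 as well."""
    return np.concatenate([ccs, [f0 / f0_scale]])   # 92-d input = output target
```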


Reconstruction

[Plot: MSE (·10^−2) vs. target pitch (MIDI 60-72) for the AE, CVAE, and cAE]

▶ We train the cAE on the octave endpoints and evaluate performance as shown above

▶ Appending f0 does not improve performance; it is in fact worse than the AE


Generation

▶ We are interested in 'synthesizing' audio

▶ We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Block diagram: Z + f0 → Decoder → CCs]

Figure: Sampling from the network

▶ We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (a sketch follows below)
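A minimal sketch of such coherent sampling, reusing the decoder from the CVAE sketch above; the step size is an illustrative choice:

```python
import torch

@torch.no_grad()
def random_walk_generate(decoder, f0, n_frames, z_dim=32, step=0.05):
    """Generate a sequence of envelope frames by a slow random walk in latent space,
    decoding each point with the desired (possibly unseen) pitch as the condition."""
    z = torch.zeros(z_dim)
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn(z_dim)                     # small coherent latent step
        x_hat = decoder(torch.cat([z, torch.tensor([f0])]))   # CC frame for this step
        frames.append(x_hat)
    return torch.stack(frames)
```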


Generation

▶ We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

▶ On listening, we can see that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

▶ For a more realistic synthesis we generate a violin note with vibrato (audio example 15); a resynthesis sketch follows below
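To connect generated envelopes back to sound, here is a hedged sketch of harmonic resynthesis from an envelope and an f0 contour with vibrato; envelope_db is a hypothetical vectorized envelope function, and this only illustrates the parametric resynthesis idea:

```python
import numpy as np

def resynthesize(envelope_db, f0_contour, fs=48000, hop=256):
    """Additive resynthesis: sample the (dB) spectral envelope at the harmonics of a
    time-varying f0 and sum sinusoids. envelope_db maps frequency (Hz, array) -> dB."""
    n = len(f0_contour) * hop
    f0 = np.repeat(f0_contour, hop)                    # sample-rate f0 track (with vibrato)
    phase = 2 * np.pi * np.cumsum(f0) / fs             # running phase of the fundamental
    out = np.zeros(n)
    n_harm = int(fs / 2 / np.max(f0))                  # harmonics kept below Nyquist
    for h in range(1, n_harm + 1):
        amp = 10 ** (envelope_db(h * f0) / 20.0)       # harmonic amplitude from the envelope
        out += amp * np.sin(h * phase)
    return out / np.max(np.abs(out))                   # simple peak normalization

# Example vibrato contour (6 Hz, +/- 1% depth) around MIDI 65 (~349.2 Hz):
# f0_contour = 349.2 * (1 + 0.01 * np.sin(2 * np.pi * 6 * np.arange(n_frames) * 256 / 48000))
```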


Putting it all together

▶ We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

▶ Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

▶ Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


Putting it all together

▶ Our model is still far from perfect, however, and we have to improve it further:

1. An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2. We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3. Conducting more formal listening tests involving synthesis of larger pitch movements, or melodic elements from Indian Classical music

▶ Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.


References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1. Input MIDI 63 to the Spectral Model

2. Spectral Model reconstruction (trained on MIDI 63)

3. Spectral Model reconstruction (not trained on MIDI 63)

4. Input MIDI 60 note to the Parametric Model

5. Parametric reconstruction of the input note

6. Input MIDI 63 note

7. Parametric AE reconstruction of the input

8. Parametric CVAE reconstruction of the input

9. Input MIDI 65 note (endpoint-trained model)

10. Parametric AE reconstruction of the input (endpoint-trained model)

11. Parametric CVAE reconstruction of the input (endpoint-trained model)

12. CVAE-generated MIDI 65 violin note

13. Similar MIDI 65 violin note from the dataset


Audio examples description II

14. CVAE reconstruction of the MIDI 65 violin note

15. CVAE-generated MIDI 65 violin note with vibrato

Page 37: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 38: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1236

Parametric Model

2 log-dB magnitudes + harmonics rarr TAE algorithm[Roebel and Rodet 2005 IMAI 1979]

I TAE =rArr Iterative Cepstral Liftering

A0(k) = log(|X (k)|)V0 = minusinfinwhile(Ai minus A0 lt ∆)Ai (k) = max(Aiminus1(k)Viminus1(k))

Vi = FFT (lifter(Ai ))

f0Magnitudes

TAE

CCs

f0

1336

Parametric Model

I No open source implementation available for the TAE thus weimplemented it following procedure highlighted in[Roebel and Rodet 2005 Caetano and Rodet 2012]

I Figure shows a TAE snap from [Caetano and Rodet 2012]

I Similar to results we get 1 4 2 5

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton6)ocgs[i]state=false

3216

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton7)ocgs[i]state=false

2256

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable


1736

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between timbre and pitch ⇒ more accurate envelope generation + pitch control


1836

Network Architecture

Figure: CVAE block diagram - CCs and f0 → Encoder → Z → Decoder

I Network input is CCs → MSE represents a perceptually relevant distance: the squared error between the input and reconstructed log magnitude spectral envelopes (a one-line justification follows below)

I We train the network on frames from the train instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames
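As an aside (added here for clarity, not from the slides): squared error on CCs is a meaningful envelope distance because of the standard cepstral-distance identity. Under the real-cepstrum convention, the log magnitude envelope is the cosine expansion of the coefficients, and Parseval's relation ties the two squared errors together:

\log|H(\omega)| = c_0 + 2\sum_{k\ge 1} c_k \cos(k\omega)

\frac{1}{2\pi}\int_{-\pi}^{\pi}\big(\log|H_1(\omega)|-\log|H_2(\omega)|\big)^2\,d\omega = (c_{0,1}-c_{0,2})^2 + 2\sum_{k\ge 1}(c_{k,1}-c_{k,2})^2

So, up to the factor of 2 on the higher coefficients, MSE in the CC domain is MSE between the log magnitude envelopes.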


1936

Network Architecture

I Main hyperparameters -

1 β - Controls relative weighting between reconstruction and prior enforcement:

L ∝ MSE + β · KLD

Figure: CVAE, varying β - reconstruction MSE (·10⁻⁵) vs latent space dimensionality (0-70), for β = 0.01, 0.1, 1

I Tradeoff between both terms; choose β = 0.1 (a minimal sketch of this loss follows below)
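As a concrete illustration (not the exact training code), a minimal PyTorch sketch of this weighted objective, assuming a diagonal-Gaussian posterior with parameters mu and logvar and a mean reduction over the batch:

import torch

def cvae_loss(x_hat, x, mu, logvar, beta=0.1):
    # Reconstruction term: MSE between input and reconstructed CC vectors
    mse = torch.mean((x_hat - x) ** 2)
    # KL divergence of N(mu, diag(exp(logvar))) from the standard normal prior
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    # L ∝ MSE + β · KLD
    return mse + beta * kld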


2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - affects the network's reconstruction ability

Figure: CVAE (β = 0.1) vs AE - reconstruction MSE (·10⁻⁵) vs latent space dimensionality (0-70)

I Steep fall initially, flatter later; choose dimensionality = 32


2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs; 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512 (a sketch of this setup follows below)
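A minimal PyTorch sketch of such a network. The way f0 enters (appended as one scalar to both the encoder input and the latent code) and the single hidden layer on each side are assumptions made for illustration; only the sizes [91, 91, 32, 91, 91], the Leaky ReLU activations and the Adam settings come from the slide:

import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, cc_dim=91, hidden_dim=91, latent_dim=32, cond_dim=1):
        super().__init__()
        # Encoder: [CCs, f0] -> hidden -> (mu, logvar)
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, hidden_dim), nn.LeakyReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder: [z, f0] -> hidden -> reconstructed CCs
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden_dim), nn.LeakyReLU(),
            nn.Linear(hidden_dim, cc_dim),
        )

    def forward(self, cc, f0):
        h = self.enc(torch.cat([cc, f0], dim=-1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        cc_hat = self.dec(torch.cat([z, f0], dim=-1))
        return cc_hat, mu, logvar

model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # ADAM, initial lr = 1e-3
# training would then loop for 2000 epochs over batches of 512 frames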

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


2336

Reconstruction

I Two training contexts -

1 Neighbouring pitches - T (✗) is the target, ✓ marks training instances:
  MIDI: T-3  T-2  T-1   T   T+1  T+2  T+3
  Kept:  ✓    ✓    ✓    ✗    ✓    ✓    ✓

2 Octave endpoints - only MIDI 60 and 71 are kept for training (see the sketch after this list):
  MIDI: 60  61  62  63  64  65  66  67  68  69  70  71
  Kept:  ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
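For concreteness, a small sketch of how these two splits could be built, assuming the dataset is available as (midi_pitch, cc_frame) pairs; the variable names (frames, T) are illustrative:

def split_by_pitch(frames, kept_pitches):
    # frames: iterable of (midi_pitch, cc_frame); keep only kept_pitches for training
    train = [(p, f) for p, f in frames if p in kept_pitches]
    test = [(p, f) for p, f in frames if p not in kept_pitches]
    return train, test

# Context 1: target pitch T omitted, its neighbours kept
T = 63
train_1, test_1 = split_by_pitch(frames, {T - 3, T - 2, T - 1, T + 1, T + 2, T + 3})

# Context 2: octave endpoints only
train_2, test_2 = split_by_pitch(frames, {60, 71})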


2436

Reconstruction

Figure: Reconstruction MSE (·10⁻²) vs target pitch (MIDI 60-72) for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: 'Conditional' AE (cAE) - appended input [CCs, f0] → cAE → reconstructed [CCs', f0']

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0 (a sketch of the appended input follows below)
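A minimal sketch of the appended input/target construction, assuming f0 enters as one extra scalar dimension (so the cAE is a plain autoencoder trained with MSE on the appended vectors); the names are illustrative:

import torch

def make_cae_pair(cc, f0):
    # Append f0 to the CC vector; the cAE's input and target are both this appended vector
    x = torch.cat([cc, f0], dim=-1)
    return x, x

# training then minimizes MSE(cae(x), x), exactly like the plain AE but on the appended vectors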

2636

Reconstruction

Figure: Reconstruction MSE (·10⁻²) vs target pitch (MIDI 60-72) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network - Z and f0 → Decoder

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (sketched below)
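One way such coherent sampling could look, using the CVAE decoder sketched earlier: small Gaussian steps in latent space, decoded frame by frame at a fixed conditioning pitch. The starting point, step size and frame count are illustrative assumptions, not the exact settings of [Blaauw and Bonada 2016]:

import torch

@torch.no_grad()
def random_walk_generate(model, f0_value, n_frames=200, latent_dim=32, step=0.05):
    # Decode a slowly drifting latent trajectory at a fixed conditioning pitch
    z = torch.zeros(1, latent_dim)                      # start at the prior mean
    f0 = torch.tensor([[f0_value]], dtype=torch.float32)
    frames = []
    for _ in range(n_frames):
        z = z + step * torch.randn_like(z)              # small coherent step in latent space
        cc_hat = model.dec(torch.cat([z, f0], dim=-1))  # decoder conditioned on f0
        frames.append(cc_hat)
    return torch.cat(frames, dim=0)                     # sequence of spectral envelope CC frames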

2836

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15; a simple f0-contour sketch follows below)
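One simple way the vibrato could be realized, assumed here to be a sinusoidal modulation of the conditioning pitch contour around MIDI 65 (the rate, depth and frame-rate values are illustrative):

import numpy as np

def vibrato_f0(midi=65, rate_hz=6.0, depth_semitones=0.3, n_frames=200, frame_rate=172.0):
    # Per-frame f0 contour (in Hz) with sinusoidal vibrato around a centre MIDI pitch
    t = np.arange(n_frames) / frame_rate
    midi_contour = midi + depth_semitones * np.sin(2 * np.pi * rate_hz * t)
    return 440.0 * 2.0 ** ((midi_contour - 69) / 12)  # MIDI -> Hz

Each frame's f0 value would then condition the decoder while the latent point follows the same random walk as above.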


2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


3036

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2 We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


3136

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain. Cote interne IRCAM: Roebel05b.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

3436

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 40: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1436

Parametric Model

0 1 2 3 4 5minus80

minus60

minus40

minus20

Frequency(kHz)Magnitude(dB)

606571

I Spectral envelope shape varies across pitch

1 Dependence of envelope on pitch[Slawson 1981 Caetano and Rodet 2012]

2 Variation due the TAE algorithm

I TAE rarr smooth function to estimate harmonic amplitudes

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32


21/36

Network Architecture

I Network size - [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10−3

I Training for 2000 epochs with batch size 512
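Putting these hyperparameters together, a hedged PyTorch sketch of the [91, 91, 32, 91, 91] CVAE; exactly how f0 is appended to the encoder and decoder inputs is an assumption made here only for illustration:

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        def __init__(self, cc_dim=91, latent_dim=32, cond_dim=1):
            super().__init__()
            # Encoder: [CCs, f0] -> 91 -> (mu, logvar) of size 32
            self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, 91), nn.LeakyReLU())
            self.enc_mu = nn.Linear(91, latent_dim)
            self.enc_logvar = nn.Linear(91, latent_dim)
            # Decoder: [Z, f0] -> 91 -> reconstructed CCs
            self.dec = nn.Sequential(nn.Linear(latent_dim + cond_dim, 91), nn.LeakyReLU(),
                                     nn.Linear(91, cc_dim))

        def forward(self, x, f0):
            # x: (batch, 91) CC frames, f0: (batch, 1) conditioning pitch
            h = self.enc(torch.cat([x, f0], dim=-1))
            mu, logvar = self.enc_mu(h), self.enc_logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
            return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

    model = CVAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial lr = 10^-3, as above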

22/36

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


23/36

Reconstruction

I Two training contexts (sketched in code below) -

1 T (×) is the target, X marks training instances
MIDI: T−3  T−2  T−1   T   T+1  T+2  T+3
Kept:  X    X    X    ×    X    X    X

2 Octave endpoints
MIDI: 60  61  62  63  64  65  66  67  68  69  70  71
Kept:  X   ×   ×   ×   ×   ×   ×   ×   ×   ×   ×   X

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
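A small sketch of how the two training contexts could be set up (the MIDI numbers come from the tables above; the actual data loading is assumed to exist elsewhere):

    # 1. Neighbouring-pitch context: hold out one target T, train on the pitches around it
    T = 63
    train_neighbour = [p for p in range(T - 3, T + 4) if p != T]

    # 2. Octave-endpoints context: train only on MIDI 60 and 71, hold out everything between
    octave = list(range(60, 72))
    train_endpoints = [60, 71]
    targets_endpoints = [p for p in octave if p not in train_endpoints]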


24/36

Reconstruction

[Plot: MSE (·10−2) vs target pitch (MIDI 60-72) for the AE and the CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6-8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9-11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


25/36

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input as shown below

[Diagram: f0 and CCs enter the cAE, which reconstructs f0′ and CCs′]
Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
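A minimal sketch of such a cAE baseline (layer sizes here simply mirror our CVAE sketch and are assumptions; the point is only that f0 is appended to the input and reconstructed along with the CCs):

    import torch
    import torch.nn as nn

    class CAE(nn.Module):
        def __init__(self, cc_dim=91, latent_dim=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(cc_dim + 1, 91), nn.LeakyReLU(),
                                     nn.Linear(91, latent_dim))
            self.dec = nn.Sequential(nn.Linear(latent_dim, 91), nn.LeakyReLU(),
                                     nn.Linear(91, cc_dim + 1))

        def forward(self, ccs, f0):
            x = torch.cat([ccs, f0], dim=-1)   # appended input [CCs, f0]
            return self.dec(self.enc(x))       # network reconstructs [CCs', f0']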

26/36

Reconstruction

[Plot: MSE (·10−2) vs target pitch (MIDI 60-72) for the AE, CVAE and cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Diagram: a sampled Z together with f0 is fed to the Decoder]
Figure: Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
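A minimal sketch of this sampling procedure, reusing the CVAE sketch from earlier (the step size and number of frames are illustrative assumptions):

    import torch

    def generate_frames(model, f0, n_frames=200, step=0.05, latent_dim=32):
        frames = []
        z = torch.zeros(1, latent_dim)                 # start at the prior mean
        with torch.no_grad():
            for _ in range(n_frames):
                z = z + step * torch.randn_like(z)     # small step: coherent random walk
                frames.append(model.dec(torch.cat([z, f0], dim=-1)))
        return torch.cat(frames, dim=0)                # frame-wise CCs to synthesize from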

28/36

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12-14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)
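As an illustration of driving the decoder with a continuously varying pitch, a sketch of a vibrato f0 contour for MIDI 65 (the vibrato rate, depth and frame rate are assumptions, not the values used in our experiments):

    import numpy as np

    frame_rate = 100                                     # envelope frames per second (assumed)
    t = np.arange(int(2.0 * frame_rate)) / frame_rate    # 2-second note
    f0_midi = 65 + 0.3 * np.sin(2 * np.pi * 5.5 * t)     # ~5.5 Hz vibrato, +/- 0.3 semitones
    f0_hz = 440.0 * 2 ** ((f0_midi - 69) / 12)           # per-frame conditioning pitch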


29/36

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together
I Our model is still far from perfect, however, and we have to improve it further

1 An immediate thing to do will be to model the residual of the input audio. This will help in making the generated audio sound more natural

2 We have not taken dynamics into account. A more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with wavenet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient Spectral Envelope Estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical signal processing, pages 91–122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model
2 Spectral Model Reconstruction (trained on MIDI 63)
3 Spectral Model Reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to Parametric Model
5 Parametric Reconstruction of input note
6 Input MIDI 63 Note
7 Parametric AE reconstruction of input
8 Parametric CVAE reconstruction of input
9 Input MIDI 65 note (endpoint trained model)
10 Parametric AE reconstruction of input (endpoint trained model)
11 Parametric CVAE reconstruction of input (endpoint trained model)
12 CVAE Generated MIDI 65 Violin note
13 Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note
15 CVAE Generated MIDI 65 Violin note with vibrato

Page 41: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1536

Parametric Model

I No of CCs(Cepstral CoefficientsKcc) depends on Samplingrate(Fs)pitch(f0)

Kcc leFs2f0

I For our network we choose the maximum Kcc(lowest pitch) =91 and zero pad the high pitch Kcc to 91 dimension vectors

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 42: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance; it is in fact worse than the plain AE

27/36

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network — a latent vector Z and the conditioning pitch f0 are fed to the decoder

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
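A minimal sketch of such coherent sampling, assuming a trained decoder callable decoder(z, f0) that maps one latent frame plus a pitch value to a frame of CCs; the step size and contour handling are assumptions, not the exact procedure of [Blaauw and Bonada 2016]:

import torch

def sample_note(decoder, f0_contour, latent_dim=32, step=0.05):
    # random walk in the latent space: consecutive points stay close,
    # so the decoded spectral envelopes evolve smoothly from frame to frame
    frames = []
    z = torch.randn(latent_dim)                      # start from the prior N(0, I)
    for f0 in f0_contour:                            # one latent point per frame
        z = z + step * torch.randn(latent_dim)
        cc_frame = decoder(z.unsqueeze(0), torch.tensor([[f0]]))
        frames.append(cc_frame)
    return torch.cat(frames, dim=0)                  # (num_frames, num_ccs)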

28/36

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)
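A minimal sketch of the vibrato conditioning contour, assuming frame-wise pitch control of the decoder; the vibrato rate and depth values are assumptions chosen for illustration:

import numpy as np

def vibrato_contour(midi=65, num_frames=400, frame_rate=200.0,
                    rate_hz=5.0, depth_semitones=0.3):
    # slow sinusoidal modulation of the note's pitch in the MIDI (semitone) domain
    t = np.arange(num_frames) / frame_rate
    midi_contour = midi + depth_semitones * np.sin(2 * np.pi * rate_hz * t)
    return 440.0 * 2.0 ** ((midi_contour - 69.0) / 12.0)   # convert MIDI to Hz

Each value of the returned f0 contour conditions the decoder for one frame, which is how the pitch contour can be varied continuously.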


29/36

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


30/36

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1. An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2. We have not taken dynamics into account. A more complete system will capture spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3. Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1. To the best of our knowledge, ours is the first work to use a parametric model for musical tones in the neural synthesis framework, especially exploiting the conditioning function of the CVAE

2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model Reconstruction (trained on MIDI 63)
3. Spectral Model Reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric Reconstruction of input note
6. Input MIDI 63 Note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE Generated MIDI 65 Violin note
13. Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14. CVAE Reconstruction of the MIDI 65 violin note
15. CVAE Generated MIDI 65 Violin note with vibrato

Page 43: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 44: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimize the MSE (Mean Squared Error) between input and network reconstruction

• Simple to train, and perform good reconstruction (since they minimize MSE)

• Not truly a generative model, as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] - Inspired by Variational Inference; enforce a prior on the latent space

• Truly generative, as we can 'generate' new data by sampling from the prior

I Conditional Variational Autoencoders [Doersch 2016, Sohn et al 2015] - Same principle as a VAE, however it learns the conditional distribution over an additional conditioning variable


1736

Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ Network captures dependencies between the timbre and the pitch ⇒ More accurate envelope generation + Pitch control


1836

Network Architecture

Figure: CVAE block diagram. The input CCs, together with the conditioning pitch f0, pass through the Encoder to the latent variable Z; the Decoder maps Z (with f0) back to CCs.

I Network input is CCs → MSE represents a perceptually relevant distance, i.e. the squared error between the input and reconstructed log magnitude spectral envelopes (see the sketch below)

I We train the network on frames from the training instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames
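To make the CC/envelope link concrete, here is a minimal sketch using a plain truncated real cepstrum (the exact parameterization used in this work may differ). By Parseval's relation, the squared error between two CC vectors corresponds, up to a constant, to the squared error between the corresponding smooth log magnitude envelopes.

import numpy as np

def envelope_to_ccs(log_mag_envelope, n_ccs=91):
    # Real cepstrum of the log magnitude envelope; keep the low quefrencies.
    cepstrum = np.fft.irfft(log_mag_envelope)
    return cepstrum[:n_ccs]

def ccs_to_envelope(ccs, n_fft=1024):
    # Rebuild a smooth log magnitude envelope from the CCs (symmetric cepstrum).
    cepstrum = np.zeros(n_fft)
    cepstrum[:len(ccs)] = ccs
    cepstrum[-(len(ccs) - 1):] = ccs[1:][::-1]
    return np.fft.rfft(cepstrum).real

So an MSE between input and reconstructed CC vectors behaves like an MSE between the corresponding log envelopes, which is what makes it perceptually meaningful here.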


1936

Network Architecture

I Main hyperparameters -

1 β - Controls relative weighting between reconstruction and prior enforcement

L ∝ MSE + β · KLD

Figure: CVAE, varying β. MSE (order 10^-5) vs. latent space dimensionality for β = 0.01, 0.1, 1.

I Tradeoff between both terms; choose β = 0.1 (a sketch of this loss follows below)
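A minimal PyTorch-style sketch of this β-weighted objective, assuming a diagonal Gaussian posterior and a standard normal prior; the tensor names are illustrative.

import torch
import torch.nn.functional as F

def cvae_loss(cc_hat, cc, mu, log_var, beta=0.1):
    # Reconstruction term: MSE between input and reconstructed CCs.
    mse = F.mse_loss(cc_hat, cc)
    # KL divergence between q(z | x, f0) = N(mu, sigma^2) and the prior N(0, I).
    kld = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return mse + beta * kld  # L ∝ MSE + β · KLD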


2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - governs the network's reconstruction ability

Figure: CVAE (β = 0.1) vs. AE. MSE (order 10^-5) vs. latent space dimensionality.

I Steep fall initially, flatter later; choose dimensionality = 32


2136

Network Architecture

I Network size: [91, 91, 32, 91, 91]
I 91 is the dimension of the input CCs; 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10^-3

I Training for 2000 epochs with batch size 512 (a sketch of this setup follows below)
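A minimal PyTorch sketch consistent with these choices; the exact way f0 is normalized, concatenated and batched is an assumption for illustration, not the released code.

import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, cc_dim=91, hidden=91, latent=32, cond_dim=1):
        super().__init__()
        # Encoder: [91 (+ f0)] -> 91 -> (mu, log_var), each of dimension 32
        self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, hidden), nn.LeakyReLU())
        self.mu = nn.Linear(hidden, latent)
        self.log_var = nn.Linear(hidden, latent)
        # Decoder: [32 (+ f0)] -> 91 -> 91
        self.dec = nn.Sequential(nn.Linear(latent + cond_dim, hidden), nn.LeakyReLU(),
                                 nn.Linear(hidden, cc_dim))

    def forward(self, cc, f0):
        h = self.enc(torch.cat([cc, f0], dim=-1))
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization trick
        return self.dec(torch.cat([z, f0], dim=-1)), mu, log_var

model = CVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# For each of the 2000 epochs, iterate over minibatches of 512 frames:
#   cc_hat, mu, log_var = model(cc, f0)
#   loss = cvae_loss(cc_hat, cc, mu, log_var, beta=0.1)   # loss as sketched earlier
#   optimizer.zero_grad(); loss.backward(); optimizer.step()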

2236

Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


2336

Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances:
MIDI: T-3  T-2  T-1   T   T+1  T+2  T+3
Kept:  X    X    X    ×    X    X    X

2 Octave endpoints:
MIDI: 60  61  62  63  64  65  66  67  68  69  70  71
Kept:  X   ×   ×   ×   ×   ×   ×   ×   ×   ×   ×   X

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (a sketch of this evaluation follows below)
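As an illustration of this protocol for the octave-endpoint case, a small sketch of the split and the frame-wise MSE evaluation; frames_by_midi and model.reconstruct are hypothetical stand-ins for the data loading and the trained network.

import numpy as np

train_midi = {60, 71}          # octave endpoints kept for training
test_midi = range(61, 71)      # omitted pitches used as targets

def average_frame_mse(model, frames_by_midi):
    # frames_by_midi: dict mapping MIDI pitch -> array (n_frames, 91) of input CCs
    errors = []
    for midi in test_midi:
        cc = frames_by_midi[midi]
        cc_hat = model.reconstruct(cc, f0=midi)   # hypothetical helper
        errors.append(np.mean((cc_hat - cc) ** 2, axis=1))
    return float(np.mean(np.concatenate(errors)))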


2436

Reconstruction

Figure: MSE (order 10^-2) vs. target pitch (MIDI 60 to 72) for AE and CVAE.

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input as shown below

Figure: 'Conditional' AE (cAE). The pitch f0 is appended to the input CCs, and the cAE reconstructs the appended vector (f0', CCs').

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0 (a sketch of this input construction follows below)
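A minimal sketch of this input construction (names are illustrative): the pitch is simply appended to the CC vector, and the autoencoder is trained to reproduce the concatenation.

import torch

def cae_batch(cc, f0):
    # cc: (batch, 91) cepstral coefficients, f0: (batch, 1) pitch value
    x = torch.cat([cc, f0], dim=-1)   # 92-dimensional appended input
    return x, x                       # the cAE target is the appended input itself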

2636

Reconstruction

Figure: MSE (order 10^-2) vs. target pitch (MIDI 60 to 72) for AE, CVAE and cAE.

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE

2736

Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network. A latent vector Z and the conditioning pitch f0 are fed to the Decoder.

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space (see the sketch below)
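A minimal sketch of such coherent sampling, reusing the decoder sketched earlier; the step size and the units of the pitch contour are assumptions for illustration.

import torch

def random_walk_decode(model, f0_contour, latent_dim=32, step=0.05):
    # f0_contour: tensor of shape (n_frames, 1), the desired pitch per frame
    z = torch.randn(1, latent_dim)               # start from the prior
    frames = []
    with torch.no_grad():
        for f0 in f0_contour:
            z = z + step * torch.randn_like(z)   # small Gaussian step: coherent trajectory
            cc_hat = model.dec(torch.cat([z, f0.view(1, 1)], dim=-1))
            frames.append(cc_hat)
    return torch.cat(frames, dim=0)              # frame-wise envelope (CC) sequence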

2836

Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we can hear that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis we generate a violin note with vibrato (audio example 15; a sketch of the vibrato pitch contour follows below)
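One simple way to obtain the frame-wise pitch contour for such a vibrato note is a slow sinusoidal modulation around the target pitch; the rate, depth and hop size below are arbitrary illustration values.

import numpy as np

def vibrato_contour(midi=65, n_frames=400, hop_s=0.005, rate_hz=5.0, depth_semitones=0.3):
    t = np.arange(n_frames) * hop_s
    midi_contour = midi + depth_semitones * np.sin(2 * np.pi * rate_hz * t)
    return 440.0 * 2.0 ** ((midi_contour - 69) / 12)   # frame-wise f0 in Hz

This contour (in whatever units the conditioning variable actually uses) can then be passed frame by frame to the decoder, exactly as in the random-walk sketch above.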


2936

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


3036

Putting it all together
I Our model is still far from perfect, however, and we have to improve it further

1 An immediate thing to do will be to model the residual of the input audio; this will help in making the generated audio sound more natural

2 We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music


3136

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

3436

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 46: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 47: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1636

Generative Models

I Autoencoders [Hinton and Salakhutdinov 2006] - Minimizethe MSE(Mean Squared Error) between input and networkreconstruction

bull Simple to train and performs good reconstruction(since theyminimize MSE)

bull Not truly a generative model as you cannot generate new data

I Variational Autoencoders [Kingma and Welling 2013] -Inspired from Variational Inference enforce a prior on thelatent space

bull Truly generative as we can lsquogeneratersquo new data by samplingfrom the prior

I Conditional Variational Autoencoders[Doersch 2016 Sohn et al 2015] - Same principle as a VAEhowever learns the conditional distribution over an additionalconditioning variable

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32


Network Architecture

I Network size: [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512
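
I A minimal sketch of such a network in PyTorch; feeding f0 to both the encoder and the decoder by concatenation is an assumption about the conditioning wiring:

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        # Layer sizes follow the slide: [91, 91, 32, 91, 91]
        def __init__(self, cc_dim=91, hidden=91, latent=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(cc_dim + 1, hidden), nn.LeakyReLU())
            self.mu = nn.Linear(hidden, latent)
            self.logvar = nn.Linear(hidden, latent)
            self.dec = nn.Sequential(nn.Linear(latent + 1, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, cc_dim))

        def forward(self, cc, f0):
            # cc: (batch, 91) CC frames, f0: (batch, 1) conditioning pitch
            h = self.enc(torch.cat([cc, f0], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
            return self.dec(torch.cat([z, f0], dim=-1)), mu, logvar

    model = CVAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # ADAM, initial lr = 10⁻³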


Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances:
  MIDI:  T−3  T−2  T−1   T   T+1  T+2  T+3
  Kept:   X    X    X    ×    X    X    X

2 Octave endpoints:
  MIDI:  60  61  62  63  64  65
  Kept:   X   ×   ×   ×   ×   ×
  MIDI:  66  67  68  69  70  71
  Kept:   ×   ×   ×   ×   ×   X

I In each of the above cases we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
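
I An illustrative evaluation sketch (variable names are assumptions); the error is computed on the CC vectors, which the slides use as a stand-in for the squared error between log magnitude envelopes:

    import numpy as np

    def envelope_mse(target_frames, reconstructed_frames):
        # Stack the frame-wise CC vectors of all target instances and average
        # the squared error over every frame of every instance.
        t = np.concatenate(target_frames, axis=0)  # shape: (total_frames, 91)
        r = np.concatenate(reconstructed_frames, axis=0)
        return np.mean((t - r) ** 2)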


Reconstruction

[Plot: MSE (×10⁻²) vs. target pitch (MIDI 60–72), AE vs. CVAE]

I Both AE and CVAE reasonably reconstruct the target pitch

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately


Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input as shown below

[Diagram: CCs, f0 → cAE → CCs′, f0′]

Figure: 'Conditional' AE (cAE)

I [Wyse 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
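
I A minimal sketch of the cAE input/target construction (names are illustrative):

    import torch

    def cae_batch(cc, f0):
        # Append the pitch to the CC vector and train a plain autoencoder to
        # reconstruct the appended vector: [CCs, f0] -> [CCs', f0']
        x = torch.cat([cc, f0], dim=-1)  # 91 CCs + 1 pitch value = 92-dim input
        return x, x                      # target equals input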


Reconstruction

[Plot: MSE (×10⁻²) vs. target pitch (MIDI 60–72), AE vs. CVAE vs. cAE]

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE


Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

[Diagram: Z, f0 → Decoder]

Figure: Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - a random walk to sample points coherently from the latent space
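
I A sketch of this sampling, assuming the CVAE decoder sketched earlier; the step size is an assumed value rather than the one used in [Blaauw and Bonada 2016]:

    import torch

    @torch.no_grad()
    def generate(model, f0_frames, latent_dim=32, step=0.01):
        # f0_frames: 1-D tensor with one (normalized) pitch value per frame
        z = torch.zeros(1, latent_dim)
        frames = []
        for f0 in f0_frames:
            z = z + step * torch.randn(1, latent_dim)  # coherent random walk in Z
            cc = model.dec(torch.cat([z, f0.view(1, 1)], dim=-1))
            frames.append(cc)
        return torch.cat(frames, dim=0)  # frame-wise CCs, to be turned into envelopes/audio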


Generation

I We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

I On listening, we find that the generated audio still lacks the soft, noisy sound of the violin bowing


I For a more realistic synthesis, we generate a violin note with vibrato
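
I One way to build the frame-wise pitch input for such a vibrato note; the rate, depth and hop values below are illustrative assumptions:

    import numpy as np

    def vibrato_f0(midi=65, dur=2.0, rate=6.0, depth_cents=30, hop=0.005):
        t = np.arange(0, dur, hop)                  # one value per analysis frame
        f_center = 440.0 * 2 ** ((midi - 69) / 12)  # MIDI 65 ≈ 349.2 Hz
        cents = depth_cents * np.sin(2 * np.pi * rate * t)
        return f_center * 2 ** (cents / 1200)       # f0 contour fed to the decoder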


Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', relying on the network to model their inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously


Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2 We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al. 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al. 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al. 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al. 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al. 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al. 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.


References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al. 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint trained model)

10 Parametric AE reconstruction of input (endpoint trained model)

11 Parametric CVAE reconstruction of input (endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset


Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

Page 48: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 49: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.


References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset


Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato


Generative Models

Why VAE over AE?

Continuous latent space from which we can sample points (and synthesize the corresponding audio)

Why CVAE over VAE?

Conditioning on pitch ⇒ the network captures dependencies between timbre and pitch ⇒ more accurate envelope generation + pitch control


Network Architecture

Figure: CVAE block diagram - input CCs and pitch f0 → Encoder → latent Z → Decoder (conditioned on f0) → reconstructed CCs

I Network input is CCs → MSE represents a perceptually relevant distance, in terms of squared error between the input and reconstructed log magnitude spectral envelopes

I We train the network on frames from the training instances

I For evaluation, the MSE values reported ahead are the average reconstruction error across all the test instance frames
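For reference, a small NumPy sketch of how a log-magnitude envelope is recovered from real cepstral coefficients, which is why squared error on the CC vector tracks squared error between log envelopes (the array sizes are assumptions):

    import numpy as np

    def envelope_from_ccs(ccs, n_freq=513):
        # log|H(w)| = c0 + 2 * sum_k c_k * cos(k * w) for a real cepstrum,
        # so an MSE on the CC vector mirrors an MSE between log envelopes.
        w = np.linspace(0.0, np.pi, n_freq)
        k = np.arange(1, len(ccs))
        return ccs[0] + 2.0 * np.cos(np.outer(w, k)) @ ccs[1:]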

Network Architecture

I Main hyperparameters -

1 β - Controls relative weighting between reconstruction and prior enforcement

L ∝ MSE + β · KLD

Figure: CVAE reconstruction MSE (×10⁻⁵) vs. latent space dimensionality, for β = 0.01, 0.1 and 1

I Tradeoff between both terms; choose β = 0.1
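A minimal PyTorch sketch of this β-weighted objective (the batch reduction and variable names are assumptions; the slide only specifies L ∝ MSE + β · KLD):

    import torch
    import torch.nn.functional as F

    def cvae_loss(cc_in, cc_out, mu, logvar, beta=0.1):
        # Reconstruction term: MSE between input and reconstructed CC frames.
        mse = F.mse_loss(cc_out, cc_in)
        # KL divergence of the approximate posterior N(mu, sigma^2) from N(0, I).
        kld = -0.5 * torch.mean(torch.sum(1.0 + logvar - mu.pow(2) - logvar.exp(), dim=1))
        return mse + beta * kld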


Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - governs the network's reconstruction ability

Figure: Reconstruction MSE (×10⁻⁵) vs. latent space dimensionality, CVAE (β = 0.1) vs. AE

I Steep fall initially, flatter later; choose dimensionality = 32


Network Architecture

I Network size [91, 91, 32, 91, 91]

I 91 is the dimension of the input CCs, 32 is the latent space dimensionality

I Linear fully connected layers + Leaky ReLU activations

I ADAM [Kingma and Ba, 2014] with initial lr = 10⁻³

I Training for 2000 epochs with batch size 512
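A minimal PyTorch sketch matching the sizes above; how f0 is appended to the encoder and decoder inputs is our assumption about the conditioning, not a verbatim copy of the implementation:

    import torch
    import torch.nn as nn

    class CVAE(nn.Module):
        def __init__(self, cc_dim=91, hidden=91, latent=32, cond_dim=1):
            super().__init__()
            # Encoder: [CCs, f0] -> 91 -> (mu, logvar) of the 32-d latent
            self.enc = nn.Sequential(nn.Linear(cc_dim + cond_dim, hidden), nn.LeakyReLU())
            self.mu = nn.Linear(hidden, latent)
            self.logvar = nn.Linear(hidden, latent)
            # Decoder: [z, f0] -> 91 -> reconstructed CCs
            self.dec = nn.Sequential(
                nn.Linear(latent + cond_dim, hidden), nn.LeakyReLU(),
                nn.Linear(hidden, cc_dim),
            )

        def forward(self, cc, f0):
            h = self.enc(torch.cat([cc, f0], dim=1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
            return self.dec(torch.cat([z, f0], dim=1)), mu, logvar

    model = CVAE()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # 2000 epochs, batch size 512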


Experiments

I Two kinds of experiments to demonstrate the network's capabilities

1 Reconstruction - Omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch

2 Generation - How well the model 'synthesizes' note instances with new, unseen pitches


Reconstruction

I Two training contexts -

1 T (×) is the target, X marks training instances:
   MIDI:  T-3  T-2  T-1   T   T+1  T+2  T+3
   Kept:   X    X    X    ×    X    X    X

2 Octave endpoints:
   MIDI:  60  61  62  63  64  65
   Kept:   X   ×   ×   ×   ×   ×
   MIDI:  66  67  68  69  70  71
   Kept:   ×   ×   ×   ×   ×   X

I In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances
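A small sketch of how these two splits might be set up; the (pitch, frames) data layout and the helper name are assumptions, only the kept/omitted pitches come from the tables above:

    def split_by_pitch(notes, context="neighbourhood", target=63):
        # notes: list of (midi_pitch, cc_frames) pairs for one instrument
        if context == "neighbourhood":
            train_pitches = {target + d for d in (-3, -2, -1, 1, 2, 3)}
        else:                                   # "endpoints": keep only MIDI 60 and 71
            train_pitches = {60, 71}
        train = [n for n in notes if n[0] in train_pitches]
        test = [n for n in notes if 60 <= n[0] <= 71 and n[0] not in train_pitches]
        return train, test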


Reconstruction

Figure: Reconstruction MSE (×10⁻²) vs. target pitch (MIDI 60-72) for AE and CVAE

I Both AE and CVAE reasonably reconstruct the target pitch (audio examples 6, 7, 8)

I CVAE produces better reconstruction, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9, 10, 11)

I Conditioning helps to capture the pitch dependency of the spectral envelope more accurately
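A sketch of the evaluation just described, assuming the CVAE sketch from earlier and (pitch, frames) test data; the average is over all frames of all target instances:

    import torch

    def average_reconstruction_mse(model, test_notes):
        errors = []
        model.eval()
        with torch.no_grad():
            for pitch, frames in test_notes:
                cc = torch.as_tensor(frames, dtype=torch.float32)   # (n_frames, 91)
                f0 = torch.full((cc.shape[0], 1), float(pitch))     # per-frame condition
                cc_hat, _, _ = model(cc, f0)
                errors.append(torch.mean((cc_hat - cc) ** 2).item())
        return sum(errors) / len(errors)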


Reconstruction

I To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input as shown below

Figure: 'Conditional' AE (cAE) - the input [CCs, f0] is autoencoded to [CCs', f0']

I [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model

I The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0
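A minimal sketch of this cAE baseline; the layer sizes mirror the earlier network and the class name is an assumption:

    import torch
    import torch.nn as nn

    class CAE(nn.Module):
        # Plain autoencoder over [CCs, f0]: pitch is just one more input dimension
        # to reconstruct, not an explicit conditioning variable as in the CVAE.
        def __init__(self, cc_dim=91, hidden=91, latent=32):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(cc_dim + 1, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, latent))
            self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.LeakyReLU(),
                                     nn.Linear(hidden, cc_dim + 1))

        def forward(self, cc, f0):
            x = torch.cat([cc, f0], dim=1)   # append pitch to the input CCs
            return self.dec(self.enc(x))     # reconstruct [CCs', f0']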


Reconstruction

Figure: Reconstruction MSE (×10⁻²) vs. target pitch (MIDI 60-72) for AE, CVAE and cAE

I We train the cAE on the octave endpoints and evaluate performance as shown above

I Appending f0 does not improve performance; it is in fact worse than the AE


Generation

I We are interested in 'synthesizing' audio

I We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: Sampling from the network - a latent point Z is decoded, conditioned on the desired f0

I We follow the procedure in [Blaauw and Bonada, 2016] - a random walk to sample points coherently from the latent space
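A sketch of this sampling procedure with the CVAE sketch from earlier; the walk length and step size are illustrative assumptions:

    import torch

    def generate_note(model, midi_pitch, n_frames=200, latent_dim=32, step=0.05):
        # Successive latent points differ by small Gaussian steps, so consecutive
        # decoded envelopes vary smoothly over the note.
        z = torch.zeros(1, latent_dim)
        f0 = torch.full((1, 1), float(midi_pitch))
        frames = []
        model.eval()
        with torch.no_grad():
            for _ in range(n_frames):
                z = z + step * torch.randn_like(z)           # coherent random walk
                cc = model.dec(torch.cat([z, f0], dim=1))    # decode with pitch condition
                frames.append(cc.squeeze(0))
        return torch.stack(frames)                           # (n_frames, 91) CC frames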


Generation

I We train on instances across the entire octave sans MIDI 65 and then generate MIDI 65

I On listening, we can hear that the generated audio still lacks the soft noisy sound of the violin bowing (audio examples 12, 13, 14)

I For a more realistic synthesis, we generate a violin note with vibrato (audio example 15)
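Because the decoder is conditioned on f0 frame by frame, a vibrato note can be generated by feeding a modulated pitch contour to the same sampling loop; the rate and depth below are illustrative values, not necessarily those used for the example:

    import numpy as np

    def vibrato_pitch_contour(midi_pitch=65, n_frames=200, frame_rate=100.0,
                              rate_hz=6.0, depth_semitones=0.3):
        # Frame-wise pitch contour (in MIDI) with sinusoidal vibrato; each value is
        # passed to the decoder as the per-frame conditioning variable.
        t = np.arange(n_frames) / frame_rate
        return midi_pitch + depth_semitones * np.sin(2.0 * np.pi * rate_hz * t)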


Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 51: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1736

Generative Models

Why VAE over AE

Continuous latent space from which we can sample points(andsynthesize the corresponding audio)

Why CVAE over VAE

Conditioning on pitch =rArr Network captures dependenciesbetween the timbre and the pitch =rArr More accurate envelopegeneration + Pitch control

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 52: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1836

Network Architecture

CCs

f0

En

cod

er

Z

Dec

od

er

CVAE

I Network input is CCs rarr MSE represents perceptually relevantdistance in terms of squared error between the input andreconstructed log magnitude spectral envelopes

I We train the network on frames from the train instances

I For evaluation MSE values calculated ahead is the averagereconstruction error across all the test instance frames

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model
2 Spectral Model reconstruction (trained on MIDI 63)
3 Spectral Model reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to Parametric Model
5 Parametric reconstruction of input note
6 Input MIDI 63 note
7 Parametric AE reconstruction of input
8 Parametric CVAE reconstruction of input
9 Input MIDI 65 note (endpoint-trained model)
10 Parametric AE reconstruction of input (endpoint-trained model)
11 Parametric CVAE reconstruction of input (endpoint-trained model)
12 CVAE-generated MIDI 65 violin note
13 Similar MIDI 65 violin note from dataset

36/36

Audio examples description II

14 CVAE reconstruction of the MIDI 65 violin note
15 CVAE-generated MIDI 65 violin note with vibrato

Page 53: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 54: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

L prop MSE + βKLD

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 55: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

1936

Network ArchitectureI Main hyperparameters -

1 β - Controls relative weighting between reconstruction andprior enforcement

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

β = 001β = 01β = 1

Figure CVAE varying β

I Tradeoff between both terms choose β = 01

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

29/36

Putting it all together

I We explored autoencoder frameworks in generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model the inter-dependencies (a rough sketch of the envelope representation follows after this slide)

I Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously
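As a rough sketch of the parametric (spectral envelope) representation behind these claims: the CCs can be obtained by low-quefrency cepstral liftering of a frame's log-magnitude spectrum. The full TAE method of [IMAI 1979] and [Roebel and Rodet 2005] iterates a similar step so that the envelope hugs the harmonic peaks; the version below is only a one-pass approximation, with the 91-coefficient count taken from the network input size quoted in the deck:

```python
import numpy as np

def cepstral_envelope(frame, n_coeffs=91, n_fft=1024):
    """One-pass cepstral approximation of a frame's spectral envelope.

    Returns the truncated cepstral coefficients (the CCs fed to the network)
    and the smooth log-magnitude envelope they encode.
    """
    spectrum = np.fft.fft(frame * np.hanning(len(frame)), n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-9)
    cepstrum = np.fft.ifft(log_mag).real          # real cepstrum (symmetric)
    ccs = cepstrum[:n_coeffs]
    lifter = np.zeros(n_fft)
    lifter[:n_coeffs] = 1.0                       # keep the low quefrencies...
    lifter[-(n_coeffs - 1):] = 1.0                # ...and their mirrored copies
    log_envelope = np.fft.fft(cepstrum * lifter).real[: n_fft // 2 + 1]
    return ccs, log_envelope
```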


30/36

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate step will be to model the residual of the input audio; this will help in making the generated audio sound more natural

2 We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions:

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research


31/36

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068-1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model reconstruction (trained on MIDI 63)

3 Spectral Model reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric reconstruction of input note

6 Input MIDI 63 note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE-generated MIDI 65 violin note

13 Similar MIDI 65 violin note from dataset

36/36

Audio examples description II

14 CVAE reconstruction of the MIDI 65 violin note

15 CVAE-generated MIDI 65 violin note with vibrato

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 57: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 58: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2036

Network Architecture

I Main hyperparameters -

2 Dimensionality of latent space - networks reconstruction ability

0 10 20 30 40 50 60 70

2

4

6

8middot10minus5

Latent space dimensionality

MSE

CVAEAE

Figure CVAE(β = 01) vs AE

I Steep fall initially flatter later Choose dimensionality = 32

2136

Network Architecture

I Network size [91 91 32 91 91]I 91 is the dimension of input CCs 32 is latent space

dimensionality

I Linear Fully Connected Layers + Leaky ReLU activations

I ADAM [Kingma and Ba 2014] with initial lr = 10minus3

I Training for 2000 epochs with batch size 512

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

Putting it all together

I Our model is still far from perfect, however, and we have to improve it further:

1 An immediate next step is to model the residual of the input audio; this will help make the generated audio sound more natural

2 We have not taken dynamics into account; a more complete system will involve capturing spectral envelope (timbral) dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

I Our contributions

1 To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE

2 The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research

References I

[Blaauw and Bonada 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.


References II

[Griffin and Lim 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.


References III

[Roche et al 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.


References IV

[Slawson 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.


Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint trained model)

10 Parametric AE reconstruction of input (endpoint trained model)

11 Parametric CVAE reconstruction of input (endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset


Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato


[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 62: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

22/36

Experiments

▶ Two kinds of experiments to demonstrate the network's capabilities:

1. Reconstruction - omit pitch instances during training and see how well the model reconstructs notes of the omitted target pitch
2. Generation - how well the model 'synthesizes' note instances with new, unseen pitches

23/36

Reconstruction

▶ Two training contexts:

1. T (✗) is the target pitch, ✓ marks the training instances:

   MIDI:  T−3  T−2  T−1   T   T+1  T+2  T+3
   Kept:   ✓    ✓    ✓    ✗    ✓    ✓    ✓

2. Octave endpoints (only the two endpoint pitches are kept for training):

   MIDI:  60  61  62  63  64  65  66  67  68  69  70  71
   Kept:   ✓   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✗   ✓

▶ In each of the above cases, we compute the MSE as the frame-wise spectral envelope match across all frames of all the target instances (a minimal sketch of this computation is given below)
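As an illustration of this evaluation, the sketch below averages the frame-wise squared envelope error over all frames of every target-pitch instance. It is a minimal sketch assuming each note is stored as a (frames × coefficients) array of spectral-envelope (cepstral) frames; the reconstruction callable is a placeholder for the trained AE/CVAE, not the thesis code.

```python
import numpy as np

def frame_wise_mse(true_frames: np.ndarray, pred_frames: np.ndarray) -> np.ndarray:
    """Per-frame squared error between two (n_frames, n_coeffs) envelope arrays."""
    return ((true_frames - pred_frames) ** 2).mean(axis=1)

def target_pitch_mse(target_notes, reconstruct):
    """MSE across all frames of all target-pitch instances.

    target_notes: list of (n_frames, n_coeffs) arrays for the held-out pitch.
    reconstruct:  placeholder callable applying the trained AE/CVAE frame-wise.
    """
    per_frame_errors = [frame_wise_mse(note, reconstruct(note)) for note in target_notes]
    # pool every frame of every target note before averaging, so longer notes
    # contribute proportionally more frames to the reported MSE
    return float(np.concatenate(per_frame_errors).mean())
```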

24/36

Reconstruction

Figure: reconstruction MSE (·10⁻², y-axis) versus target pitch (MIDI 60–72, x-axis) for the AE and the CVAE.

▶ Both the AE and the CVAE reasonably reconstruct the target pitch (audio examples 6–8)

▶ The CVAE produces better reconstructions, especially when the target pitch is far from the pitches available in the training data (plot above; audio examples 9–11)

▶ Conditioning helps to capture the pitch dependency of the spectral envelope more accurately

25/36

Reconstruction

▶ To emulate the effect of pitch conditioning with an AE, we train the AE by appending the pitch to the input CCs and reconstructing this appended input, as shown below

Figure: 'Conditional' AE (cAE), where the input (f0, CCs) is encoded and reconstructed as (f0′, CCs′).

▶ [Wyse, 2018] followed a similar approach of appending the conditional variables to the input of his model

▶ The 'cAE' is comparable to our proposed CVAE in that the network might potentially learn something from f0 (a hedged sketch of such a cAE follows below)
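For concreteness, a minimal sketch of such a cAE is given below (assuming PyTorch, which the slides do not specify): the normalized pitch is simply concatenated to the frame's cepstral coefficients and a plain autoencoder reconstructs the concatenated vector. Layer sizes, the cepstral order, and the pitch normalization are illustrative assumptions, not the actual settings.

```python
import torch
import torch.nn as nn

class ConditionalAE(nn.Module):
    """Plain AE over the appended input [f0, CCs]; it outputs [f0', CCs']."""

    def __init__(self, n_ccs: int = 60, latent_dim: int = 32):
        super().__init__()
        d_in = n_ccs + 1  # cepstral coefficients plus the appended pitch value
        self.encoder = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, d_in))

    def forward(self, ccs: torch.Tensor, f0_midi: torch.Tensor) -> torch.Tensor:
        f0 = (f0_midi - 60.0) / 12.0               # crude per-octave normalization
        x = torch.cat([f0.unsqueeze(-1), ccs], dim=-1)
        return self.decoder(self.encoder(x))       # reconstructs the appended input
```

Training then simply minimizes the reconstruction error of the appended vector, exactly as for the unconditioned AE.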

26/36

Reconstruction

Figure: reconstruction MSE (·10⁻²) versus pitch (MIDI 60–72) for the AE, CVAE, and cAE.

▶ We train the cAE on the octave endpoints and evaluate performance as shown above

▶ Appending f0 does not improve performance; it is in fact worse than the AE

27/36

Generation

▶ We are interested in 'synthesizing' audio

▶ We see how well the network can generate an instance of a desired pitch (which the network has not been trained on)

Figure: sampling from the network: a latent vector Z and the pitch f0 are fed to the decoder.

▶ We follow the procedure in [Blaauw and Bonada, 2016]: a random walk to sample points coherently from the latent space (a minimal sketch follows below)
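A minimal sketch of such a random walk is given below; it assumes the trained decoder maps one latent vector plus a pitch value to one frame of the spectral envelope, and the step size and latent dimensionality are illustrative guesses rather than values from the thesis.

```python
import numpy as np

def random_walk_latents(n_frames: int, latent_dim: int = 32,
                        step: float = 0.05, seed: int = 0) -> np.ndarray:
    """Temporally coherent latent trajectory: start at N(0, I) and take small
    Gaussian steps, so consecutive frames decode to smoothly varying envelopes."""
    rng = np.random.default_rng(seed)
    z = np.empty((n_frames, latent_dim))
    z[0] = rng.standard_normal(latent_dim)
    for t in range(1, n_frames):
        z[t] = z[t - 1] + step * rng.standard_normal(latent_dim)
    return z

# usage with a (placeholder) trained decoder that is conditioned on pitch:
# trajectory = random_walk_latents(n_frames=200)
# envelope_frames = [decoder(z, f0_midi=65) for z in trajectory]
```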

28/36

Generation

▶ We train on instances across the entire octave sans MIDI 65, and then generate MIDI 65

▶ On listening, we find that the generated audio still lacks the soft, noisy sound of the violin bowing (audio examples 12–14)

▶ For a more realistic synthesis, we generate a violin note with vibrato (audio example 15); a sketch of such a vibrato pitch contour follows below
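One way to realize such a vibrato note is to vary the conditioning pitch from frame to frame with a slow sinusoid around the target MIDI note; the rate and depth below are plausible placeholder values, not the ones used for the audio example.

```python
import numpy as np

def vibrato_contour(midi_pitch: float = 65.0, n_frames: int = 200,
                    frame_rate_hz: float = 100.0, vib_rate_hz: float = 5.0,
                    depth_semitones: float = 0.3) -> np.ndarray:
    """Frame-wise MIDI pitch contour with sinusoidal vibrato."""
    t = np.arange(n_frames) / frame_rate_hz
    return midi_pitch + depth_semitones * np.sin(2.0 * np.pi * vib_rate_hz * t)

# each value of the contour conditions the decoder for one frame, so the generated
# spectral envelopes (and the resynthesized harmonics) follow the vibrato movement
```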

29/36

Putting it all together

▶ We explored autoencoder frameworks in generative modeling for audio synthesis of instrumental tones

▶ Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model their inter-dependencies

▶ Pitch conditioning allows us to generate the learnt spectral envelope for that pitch, thus enabling us to vary the pitch contour continuously (a sketch of such a conditioned decoder follows below)
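To make the conditioning mechanism concrete, the sketch below shows a pitch-conditioned decoder of the kind a CVAE uses: the pitch is concatenated to the latent code, so the same latent point decodes to the envelope learnt for whichever pitch is supplied. This is a hedged PyTorch sketch with illustrative layer sizes, not the exact architecture from the thesis.

```python
import torch
import torch.nn as nn

class PitchConditionedDecoder(nn.Module):
    """CVAE-style decoder: latent code z concatenated with the pitch condition."""

    def __init__(self, latent_dim: int = 32, n_ccs: int = 60):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + 1, 256), nn.ReLU(),
                                 nn.Linear(256, n_ccs))

    def forward(self, z: torch.Tensor, f0_midi: torch.Tensor) -> torch.Tensor:
        f0 = (f0_midi - 60.0) / 12.0                 # normalized pitch condition
        return self.net(torch.cat([z, f0.unsqueeze(-1)], dim=-1))

# sweeping f0_midi frame by frame while holding or slowly varying z is what lets
# the pitch contour be varied continuously at synthesis time
```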

30/36

Putting it all together

▶ Our model is still far from perfect, however, and we have to improve it further:

1. An immediate step will be to model the residual of the input audio; this will help make the generated audio sound more natural
2. We have not taken dynamics into account; a more complete system will capture the spectral envelope (timbral) dependencies on both pitch and loudness dynamics
3. Conducting more formal listening tests, involving synthesis of larger pitch movements or melodic elements from Indian Classical music

▶ Our contributions:

1. To the best of our knowledge, we have not come across any work using a parametric model for musical tones in the neural synthesis framework, especially one exploiting the conditioning function of the CVAE
2. The TAE method has no known Python implementation, so we plan to make our code open-source to the MIR community for research (a rough sketch of the TAE iteration is given below)
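Since the parametric front end rests on the true amplitude envelope (TAE) of [IMAI, 1979] and [Roebel and Rodet, 2005], a rough numpy sketch of that iterative cepstral-smoothing idea follows; it is only an assumed reading of the published algorithm (cepstral order, tolerance, and iteration cap are illustrative), not the code we will release.

```python
import numpy as np

def true_amplitude_envelope(log_mag: np.ndarray, n_ceps: int = 60,
                            tol: float = 1e-2, max_iter: int = 200) -> np.ndarray:
    """Iterative cepstral smoothing ('true envelope') of one log-magnitude frame.

    log_mag: log-magnitude spectrum over all FFT bins of a single frame.
    n_ceps:  order of the cepstral (liftering) low-pass used for smoothing.
    """
    def cepstral_smooth(spectrum: np.ndarray) -> np.ndarray:
        ceps = np.fft.ifft(spectrum).real       # real cepstrum of the log spectrum
        lifter = np.zeros_like(ceps)
        lifter[:n_ceps] = 1.0                   # keep low quefrencies ...
        lifter[-(n_ceps - 1):] = 1.0            # ... and their symmetric mirror
        return np.fft.fft(ceps * lifter).real

    env = cepstral_smooth(log_mag)
    for _ in range(max_iter):
        target = np.maximum(log_mag, env)       # push the envelope up onto the peaks
        env = cepstral_smooth(target)
        if np.all(log_mag <= env + tol):        # stop once it lies above the spectrum
            break
    return env
```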

31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[IMAI, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1. Input MIDI 63 to Spectral Model
2. Spectral Model reconstruction (trained on MIDI 63)
3. Spectral Model reconstruction (not trained on MIDI 63)
4. Input MIDI 60 note to Parametric Model
5. Parametric reconstruction of input note
6. Input MIDI 63 note
7. Parametric AE reconstruction of input
8. Parametric CVAE reconstruction of input
9. Input MIDI 65 note (endpoint-trained model)
10. Parametric AE reconstruction of input (endpoint-trained model)
11. Parametric CVAE reconstruction of input (endpoint-trained model)
12. CVAE-generated MIDI 65 violin note
13. Similar MIDI 65 violin note from dataset

36/36

Audio examples description II

14. CVAE reconstruction of the MIDI 65 violin note
15. CVAE-generated MIDI 65 violin note with vibrato

Page 63: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2236

Experiments

I Two kinds of experiments to demonstrate networks capabilities

1 Reconstruction - Omit pitch instances during training and seehow well model reconstructs notes of omitted target pitch

2 Generation - How well model lsquosynthesizesrsquo note instances withnew unseen pitches

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 64: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 note to the Spectral Model
2 Spectral Model reconstruction (trained on MIDI 63)
3 Spectral Model reconstruction (not trained on MIDI 63)
4 Input MIDI 60 note to the Parametric Model
5 Parametric reconstruction of the input note
6 Input MIDI 63 note
7 Parametric AE reconstruction of the input
8 Parametric CVAE reconstruction of the input
9 Input MIDI 65 note (endpoint-trained model)
10 Parametric AE reconstruction of the input (endpoint-trained model)
11 Parametric CVAE reconstruction of the input (endpoint-trained model)
12 CVAE-generated MIDI 65 violin note
13 Similar MIDI 65 violin note from the dataset

36/36

Audio examples description II

14 CVAE reconstruction of the MIDI 65 violin note
15 CVAE-generated MIDI 65 violin note with vibrato

Page 65: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 66: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 67: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2336

Reconstruction

I Two training contexts -

1 T(times) is target X is training instancesMIDI T - 3 T - 2 T - 1 T T + 1 T + 2 T + 3Kept X X X times X X X

2 Octave endpointsMIDI 60 61 62 63 64 65Kept X times times times times timesMIDI 66 67 68 69 70 71Kept times times times times times X

I In each of the above cases we compute the MSE as theframe-wise spectral envelope match across all frames of all thetarget instances

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 68: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2436

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAE

I Both AE and CVAE reasonably reconstruct the target pitch1 6 2 7 3 8

I CVAE produces better reconstruction especially when thetarget pitch is far from the pitches available in the trainingdata(plot above) 4 9 5 10 6 11

I Conditioning helps to capture the pitch dependency of thespectral envelope more accurately

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton8)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton9)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton10)ocgs[i]state=false

1704

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton11)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton12)ocgs[i]state=false

1536

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton13)ocgs[i]state=false

1536

2536

Reconstruction

I To emulate the effect of pitch conditioning with an AE wetrain the AE by appending the pitch to the input CCs andreconstructing this appended input as shown below

f0

CCscAE

fprime

0

CCsprime

Figure lsquoConditionalrsquo AE(cAE)

I [Wyse 2018] followed a similar approach of appending theconditional variables to the input of his model

I the lsquocAErsquo is comparable to our proposed CVAE in that thenetwork might potentially learn something from f0

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

31/36

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770-1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137-140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068-1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

32/36

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504-507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217-228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

33/36

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient Spectral Envelope Estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30-35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91-122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132-141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483-3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato


2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 70: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2636

Reconstruction

60 62 64 66 68 70 72

1

2

3

4

5middot10minus2

Pitch

MSE

AECVAEcAE

I We train the cAE on the octave endpoints and evaluateperformance as shown above

I Appending f0 does not improve performance It is infact worsethan AE

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 71: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2736

Generation

I We are interested in lsquosynthesizingrsquo audio

I We see how well the network can generate an instance of adesired pitch(which the network has not been trained on)

Z

Dec

od

er

f0

Figure Sampling from the Network

I We follow the procedure in [Blaauw and Bonada 2016] - arandom walk to sample points coherently from the latent space

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 72: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2836

Generation

I We train on instances across the entire octave sans MIDI 65and then generate MIDI 65

I On listening we can see that the generated audio still lacks thesoft noisy sound of the violin bowing

1 12 2 13 3 14

I For a more realistic synthesis we generate a violin note withvibrato 4 15

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton14)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton15)ocgs[i]state=false

2256

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton16)ocgs[i]state=false

2184

var ocgs=hostgetOCGs(hostpageNum)for(var i=0iltocgslengthi++)if(ocgs[i]name==MediaPlayButton17)ocgs[i]state=false

2184

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

2936

Putting it all together

I We explored autoencoder frameworks as generative models for audio synthesis of instrumental tones

I Our parametric representation decouples 'timbre' and 'pitch', thus relying on the network to model their inter-dependencies

I Pitch conditioning allows us to generate the learnt spectral envelope for a given pitch, thus enabling us to vary the pitch contour continuously (a minimal conditioning sketch follows below)
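As an illustration of how such pitch conditioning can be wired up, here is a small hypothetical PyTorch sketch; the class name, layer sizes and the normalized-MIDI conditioning input are assumptions made for this summary, not the exact thesis architecture. The decoder receives the latent code concatenated with a pitch value, so decoding the same latent code while sweeping the pitch input yields a continuously varying envelope.

import torch
import torch.nn as nn

class PitchConditionedDecoder(nn.Module):
    # CVAE-style decoder: latent code z is concatenated with a pitch condition.
    def __init__(self, latent_dim=32, cond_dim=1, n_env=60):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, n_env),  # spectral-envelope (e.g. cepstral) coefficients
        )

    def forward(self, z, pitch):
        # pitch: normalized MIDI value, shape (batch, 1)
        return self.net(torch.cat([z, pitch], dim=-1))

decoder = PitchConditionedDecoder()
z = torch.randn(1, 32)                 # one sample from the latent prior
for midi in (60.0, 62.5, 65.0):        # continuous pitch sweep
    envelope = decoder(z, torch.tensor([[midi / 127.0]]))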

3136

References I

[Blaauw and Bonada, 2016] Blaauw, M. and Bonada, J. (2016). Modeling and transforming speech using variational autoencoders. In Interspeech, pages 1770–1774.

[Caetano and Rodet, 2012] Caetano, M. and Rodet, X. (2012). A source-filter model for musical instrument sound transformation. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 137–140. IEEE.

[Doersch, 2016] Doersch, C. (2016). Tutorial on variational autoencoders. arXiv preprint arXiv:1606.05908.

[Engel et al., 2017] Engel, J., Resnick, C., Roberts, A., Dieleman, S., Norouzi, M., Eck, D., and Simonyan, K. (2017). Neural audio synthesis of musical notes with WaveNet autoencoders. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 1068–1077. JMLR.org.

[Esling et al., 2018] Esling, P., Bitton, A., et al. (2018). Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics. arXiv preprint arXiv:1805.08501.

3236

References II

[Griffin and Lim, 1984] Griffin, D. and Lim, J. (1984). Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236–243.

[Hinton and Salakhutdinov, 2006] Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

[Imai, 1979] Imai, S. (1979). Spectral envelope extraction by improved cepstrum. IEICE, 62:217–228.

[Kingma and Ba, 2014] Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[Kingma and Welling, 2013] Kingma, D. P. and Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

[Oord et al., 2016] Oord, A. v. d., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., and Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.

3336

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

3436

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 74: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 75: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

2936

Putting it all together

I We explored autoencoder frameworks in generative models foraudio synthesis of instrumental tones

I Our parametric representation decouples lsquotimbrersquo and lsquopitchrsquothus relying on the network to model the inter-dependencies

I Pitch conditioning allows to generate the learnt spectralenvelope for that pitch thus enabling us to vary the pitchcontour continuously

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 76: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al 2018] Roche F Hueber T Limier S and Girin L (2018)Autoencoders for music sound modeling a comparison of linear shallow deeprecurrent and variational modelsarXiv preprint arXiv180604096

[Roebel and Rodet 2005] Roebel A and Rodet X (2005)Efficient Spectral Envelope Estimation and its application to pitch shifting andenvelope preservationIn International Conference on Digital Audio Effects pages 30ndash35 Madrid Spaincote interne IRCAM Roebel05b

[Romani Picas et al 2015] Romani Picas O Parra Rodriguez H Dabiri DTokuda H Hariya W Oishi K and Serra X (2015)A real-time system for measuring sound goodness in instrumental soundsIn Audio Engineering Society Convention 138 Audio Engineering Society

[Sarroff and Casey 2014] Sarroff A M and Casey M A (2014)Musical audio synthesis using autoencoding neural netsIn ICMC

[Serra et al 1997] Serra X et al (1997)Musical sound modeling with sinusoids plus noiseMusical signal processing pages 91ndash122

3436

References IV

[Slawson 1981] Slawson W (1981)The color of sound a theoretical study in musical timbreMusic Theory Spectrum 3132ndash141

[Sohn et al 2015] Sohn K Lee H and Yan X (2015)Learning structured output representation using deep conditional generative models

In Advances in neural information processing systems pages 3483ndash3491

[Wyse 2018] Wyse L (2018)Real-valued parametric conditioning of an rnn for interactive sound synthesisarXiv preprint arXiv180510808

3536

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction(trained on MIDI63)

3 Spectral Model Reconstruction(not trained on MIDI63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note(endpoint trained model)

10 Parametric AE reconstruction of input(endpoint trained model)

11 Parametric CVAE reconstruction of input(endpoint trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

3636

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato

  • Notes
      1. fdrm17
      2. fdrm16
      3. fdrm15
      4. fdrm14
      5. fdrm13
      6. fdrm12
      7. fdrm11
      8. fdrm10
      9. fdrm9
      10. fdrm8
      11. fdrm7
      12. fdrm6
      13. fdrm5
      14. fdrm3
      15. fdrm1
Page 77: A Variational Parametric Model for Audio Synthesiskrishnasubramani/data/ddp_stage… · analysis-synthesis based reconstruction which fails to model temporal evolution I[Engel et

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3036

Putting it all togetherI Our model is still far from perfect however and we have to

improve it further

1 An immediate thing to do will be to model the residual of theinput audio This will help in making the generated audio soundmore natural

2 We have not taken into account dynamics A more completesystem will involve capturing spectral envelope(timbral)dependencies on both pitch and loudness dynamics

3 Conducting more formal listening tests involving synthesis oflarger pitch movements or melodic elements from IndianClassical music

I Our contributions

1 To the best of our knowledge we have not come across anywork using a parametric model for musical tones in the neuralsynthesis framework especially exploiting the conditioningfunction of the CVAE

2 The TAE method has no known PYTHON implementation sowe plan to make our code open-source to the MIR communityfor research

3136

References I

[Blaauw and Bonada 2016] Blaauw M and Bonada J (2016)Modeling and transforming speech using variational autoencodersIn Interspeech pages 1770ndash1774

[Caetano and Rodet 2012] Caetano M and Rodet X (2012)A source-filter model for musical instrument sound transformationIn 2012 IEEE International Conference on Acoustics Speech and Signal Processing(ICASSP) pages 137ndash140 IEEE

[Doersch 2016] Doersch C (2016)Tutorial on variational autoencodersarXiv preprint arXiv160605908

[Engel et al 2017] Engel J Resnick C Roberts A Dieleman S Norouzi MEck D and Simonyan K (2017)Neural audio synthesis of musical notes with wavenet autoencodersIn Proceedings of the 34th International Conference on Machine Learning-Volume70 pages 1068ndash1077 JMLR org

[Esling et al 2018] Esling P Bitton A et al (2018)Generative timbre spaces regularizing variational auto-encoders with perceptualmetricsarXiv preprint arXiv180508501

3236

References II

[Griffin and Lim 1984] Griffin D and Lim J (1984)Signal estimation from modified short-time fourier transformIEEE Transactions on Acoustics Speech and Signal Processing 32(2)236ndash243

[Hinton and Salakhutdinov 2006] Hinton G E and Salakhutdinov R R (2006)Reducing the dimensionality of data with neural networksscience 313(5786)504ndash507

[IMAI 1979] IMAI S (1979)Spectral envelope extraction by improved cepstrumIEICE 62217ndash228

[Kingma and Ba 2014] Kingma D P and Ba J (2014)Adam A method for stochastic optimizationarXiv preprint arXiv14126980

[Kingma and Welling 2013] Kingma D P and Welling M (2013)Auto-encoding variational bayesarXiv preprint arXiv13126114

[Oord et al 2016] Oord A v d Dieleman S Zen H Simonyan K Vinyals OGraves A Kalchbrenner N Senior A and Kavukcuoglu K (2016)Wavenet A generative model for raw audioarXiv preprint arXiv160903499

3336

References III

[Roche et al., 2018] Roche, F., Hueber, T., Limier, S., and Girin, L. (2018). Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models. arXiv preprint arXiv:1806.04096.

[Roebel and Rodet, 2005] Roebel, A. and Rodet, X. (2005). Efficient spectral envelope estimation and its application to pitch shifting and envelope preservation. In International Conference on Digital Audio Effects, pages 30–35, Madrid, Spain.

[Romani Picas et al., 2015] Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138. Audio Engineering Society.

[Sarroff and Casey, 2014] Sarroff, A. M. and Casey, M. A. (2014). Musical audio synthesis using autoencoding neural nets. In ICMC.

[Serra et al., 1997] Serra, X. et al. (1997). Musical sound modeling with sinusoids plus noise. Musical Signal Processing, pages 91–122.

34/36

References IV

[Slawson, 1981] Slawson, W. (1981). The color of sound: a theoretical study in musical timbre. Music Theory Spectrum, 3:132–141.

[Sohn et al., 2015] Sohn, K., Lee, H., and Yan, X. (2015). Learning structured output representation using deep conditional generative models. In Advances in Neural Information Processing Systems, pages 3483–3491.

[Wyse, 2018] Wyse, L. (2018). Real-valued parametric conditioning of an RNN for interactive sound synthesis. arXiv preprint arXiv:1805.10808.

35/36

Audio examples description I

1 Input MIDI 63 to Spectral Model

2 Spectral Model Reconstruction (trained on MIDI 63)

3 Spectral Model Reconstruction (not trained on MIDI 63)

4 Input MIDI 60 note to Parametric Model

5 Parametric Reconstruction of input note

6 Input MIDI 63 Note

7 Parametric AE reconstruction of input

8 Parametric CVAE reconstruction of input

9 Input MIDI 65 note (endpoint-trained model)

10 Parametric AE reconstruction of input (endpoint-trained model)

11 Parametric CVAE reconstruction of input (endpoint-trained model)

12 CVAE Generated MIDI 65 Violin note

13 Similar MIDI 65 Violin note from dataset

36/36

Audio examples description II

14 CVAE Reconstruction of the MIDI 65 violin note

15 CVAE Generated MIDI 65 Violin note with vibrato
