Synthesis & evaluation of prosodically exaggerated utterances: A preliminary study
Kyuchul Yoon
Division of English, Kyungnam University
Spring 2008 Joint Conference of KSPS & KASS
Contents
• Synthesis & evaluation of human utterances with exaggerated prosody
• Synthesis of exaggerated prosody
  – Useful for native utterances
  – The definition of prosody “exaggeration”
  – The algorithm
• Evaluation of exaggerated prosody
  – Useful for evaluating learner utterances
  – The algorithm & an experiment
Teaching & evaluating prosody
• Teaching language prosody
  – The need for “exaggeration” of native utterances
  – How to define “exaggeration”
• Evaluating language prosody
  – Given the native version of an utterance, evaluate the learner’s utterances with atypical prosody
  – How to measure the differences between the native and learner utterances
Exaggerating native prosody
• Exaggeration of the F0 contour
  – One way would be to make the pitch peaks/valleys higher/lower
• Exaggeration of the intensity contour
  – One way would be to manipulate the intensity contour of the pitch peaks/valleys
• Exaggeration of the segmental durations
  – One way would be to manipulate the segmental durations of the pitch peaks/valleys
Exaggerating native prosody (F0)
[Figure: The fundamental frequency (F0) contour of the utterance “Marianna!”]
Exaggerating native prosody (Intensity)
[Figure: The intensity contour of the utterance “Marianna!”]
Exaggerating native prosody (Duration)
[Figure: The segmental durations of the utterance “Marianna!” before and after the exaggeration]
Algorithm: prosody exaggeration
• Definition of prosody exaggeration
  – F0 contour
    • Make pitch peaks/valleys higher/lower in Hz values
  – Intensity contour
    • Make pitch peaks higher in dB values
  – Segmental durations
    • Make pitch peaks longer in time values
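The F0 part of this definition can be sketched in Python as follows. This is a minimal illustration, not the Praat script used in the study: the function name `exaggerate_f0`, the fixed `shift_hz` offset, and the three-point test for local peaks and valleys are all assumptions made for the example.

```python
from typing import List


def exaggerate_f0(f0_hz: List[float], shift_hz: float = 20.0) -> List[float]:
    """Raise local pitch peaks and lower local pitch valleys.

    f0_hz    : F0 samples in Hz (hypothetical input, one value per frame)
    shift_hz : amount added to peaks and subtracted from valleys
               (an assumed parameter, not a value from the slides)
    """
    out = list(f0_hz)
    # The first and last samples have only one neighbour, so they are left as-is.
    for i in range(1, len(f0_hz) - 1):
        prev, cur, nxt = f0_hz[i - 1], f0_hz[i], f0_hz[i + 1]
        if cur > prev and cur > nxt:      # local peak: make it higher
            out[i] = cur + shift_hz
        elif cur < prev and cur < nxt:    # local valley: make it lower
            out[i] = cur - shift_hz
    return out


contour = [180.0, 220.0, 200.0, 150.0, 170.0, 160.0]
exaggerated = exaggerate_f0(contour, shift_hz=20.0)
```

The same peak-finding step could, under the same assumptions, drive the intensity and duration changes by raising dB values or lengthening segments at the detected peaks.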
Algorithm: prosody exaggeration (F0)
Algorithm: prosody exaggeration (Intensity)
Algorithm: prosody exaggeration (Durations)
How the Praat script works
How the Praat script works
[Figure panels: F0; Intensity; Durations]
How the Praat script works
[Figure panel labels: Original; F0; Durations; Intensity]
Evaluating learner prosody
• Assumes the existence of the native version
• Evaluates the learner versions
• Evaluation of the F0 & intensity contours
  – Is preceded by duration manipulation:
    • The durations of the matching segments of the two utterances are made identical [3]
  – Is preceded by F0/intensity normalization & F0 smoothing
    • The mean difference is added to or subtracted from the learner utterance
  – Is followed by pitch/intensity point-to-point comparison
• Evaluation of segmental durations
  – Done without any duration manipulation; segment-to-segment comparison
• Evaluation measure: Euclidean distance metric
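The normalization and point-to-point comparison steps can be sketched in Python as below. This is an illustration under stated assumptions rather than the study's implementation: it assumes the two contours have already been time-aligned by the duration manipulation and sampled at the same points, it omits F0 smoothing, and the names `normalize_to_native` and `euclidean_distance` are hypothetical.

```python
import math
from typing import List, Sequence


def normalize_to_native(native: Sequence[float], learner: Sequence[float]) -> List[float]:
    """Shift the learner contour by the difference between the two means.

    After the shift both contours have the same mean, so the comparison
    reflects contour shape rather than overall pitch or loudness level.
    """
    mean_diff = sum(native) / len(native) - sum(learner) / len(learner)
    return [value + mean_diff for value in learner]


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Point-to-point Euclidean distance between two equal-length contours."""
    if len(p) != len(q):
        raise ValueError("contours must have the same number of points")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


native_f0 = [200.0, 220.0, 180.0]      # made-up values in Hz
learner_same_shape = [150.0, 170.0, 130.0]
learner_flatter = [150.0, 160.0, 140.0]

d_same = euclidean_distance(native_f0, normalize_to_native(native_f0, learner_same_shape))
d_flat = euclidean_distance(native_f0, normalize_to_native(native_f0, learner_flatter))
```

With these made-up values, a learner contour that differs from the native one only by a constant offset yields a distance of zero after normalization, while the flatter contour yields a positive distance.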
Algorithm: prosody evaluation
Before & after duration manipulation
[Figure panels: native; learner (before); learner (after)]
Algorithm: prosody evaluation
F0 point-to-point comparison between native and learner
[Figure panels: native; learner (after)]
Algorithm: prosody evaluation
Intensity point-to-point comparison between native and learner
[Figure panels: native; learner (after)]
Algorithm: prosody evaluation
Duration segment-to-segment comparison between native and learner
[Figure panels: native; learner (before)]
Euclidean distance metric as the evaluation measure: for $P = (p_1, p_2, p_3, \ldots, p_n)$ and $Q = (q_1, q_2, q_3, \ldots, q_n)$ in Euclidean n-space,
$$d(P, Q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
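For the duration measure, P and Q would hold the segment durations of the native and learner utterances. The sketch below is illustrative only: the boundary times are made up, and the helper names `segment_durations` and `duration_distance` are hypothetical. It relies on `math.dist` from the Python standard library (Python 3.8 or later).

```python
import math
from typing import List, Sequence


def segment_durations(boundaries: Sequence[float]) -> List[float]:
    """Convert segment boundary times (in seconds) into segment durations."""
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


def duration_distance(native_bounds: Sequence[float], learner_bounds: Sequence[float]) -> float:
    """Segment-to-segment Euclidean distance between two sets of durations.

    Both utterances must contain the same number of matching segments.
    """
    p = segment_durations(native_bounds)
    q = segment_durations(learner_bounds)
    if len(p) != len(q):
        raise ValueError("utterances must have the same number of segments")
    return math.dist(p, q)


# Three segments each; boundary times are invented for the example.
native_bounds = [0.0, 0.5, 1.0, 2.0]
learner_bounds = [0.0, 0.5, 1.5, 2.0]
d_duration = duration_distance(native_bounds, learner_bounds)
```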
A pilot experiment
[Figure panels: native; learner (after)]
The Euclidean distance should be at a minimum
A pilot experiment
[Figure panels: native; learner (after)]
• F0: -100 Hz to +100 Hz at 10 Hz intervals (21 stimuli)
• Intensity: -25 dB to +25 dB at 5 dB intervals (11 stimuli)
• Duration: 0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00 times the original (8 stimuli)
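The stimulus grid can be reproduced with a few lines of Python to check the counts. The variable names are illustrative, and the sketch only lists the manipulation values; it does not synthesize any audio.

```python
# F0 shifts in Hz: -100 to +100 in steps of 10
f0_shifts_hz = list(range(-100, 101, 10))

# Intensity shifts in dB: -25 to +25 in steps of 5
intensity_shifts_db = list(range(-25, 26, 5))

# Duration scaling factors relative to the original utterance
duration_factors = [0.25, 0.50, 0.75, 1.00, 1.50, 2.00, 2.50, 3.00]

counts = (len(f0_shifts_hz), len(intensity_shifts_db), len(duration_factors))
```

Each series contains the unmodified case (a shift of 0 or a factor of 1.00).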
Results & Conclusion
• Prosody exaggeration
  – Can be a tool for teaching language prosody
  – Can be used to test measures for evaluating prosody
• Limitations of the current prosody evaluation
  – Native utterances should exist to yield measures
    • TTS systems with advanced prosody models could be helpful
  – “Weights” of the three separate measures (F0/intensity/duration) need to be determined
    • Experiments with human evaluators could provide the weights
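One possible way to combine the three measures once weights are available is a weighted sum, sketched below. The function `combined_score` and every weight value here are placeholders chosen for illustration; the slides leave the weights undetermined. Because the three distances are in different units (Hz, dB, seconds), some rescaling would likely be needed before such a sum is meaningful.

```python
from typing import Tuple


def combined_score(
    f0_dist: float,
    intensity_dist: float,
    duration_dist: float,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted sum of the three prosody distances.

    The default equal weights are an arbitrary placeholder, not a result.
    """
    w_f0, w_int, w_dur = weights
    return w_f0 * f0_dist + w_int * intensity_dist + w_dur * duration_dist


# Made-up distances and made-up weights, for illustration only.
score = combined_score(3.0, 6.0, 9.0, weights=(0.5, 0.25, 0.25))
```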
References
[1] Boersma, Paul. 2001. Praat, a system for doing phonetics by computer. Glot International 5(9/10). pp.341-345.
[2] Moulines, E. & F. Charpentier. 1990. Pitch synchronous waveform processing techniques for text-to-speech synthesis using diphones. Speech Communication 9. pp.453-467.
[3] Yoon, K. 2007. Imposing native speakers' prosody on non-native speakers' utterances: The technique of cloning prosody. Journal of the Modern British & American Language & Literature 25(4). pp.197-215.