Automatic Grapheme-Phoneme Conversion for Spoken British English Corpora
C. AURAN, C. BOUZON & D.J. HIRST
Laboratoire Parole et Langage, CNRS UMR6057
Université de Provence
Summary
1. The Aix-MARSEC Project
• Building Aix-MARSEC
• Availability of the database
• Methodology
2. Grapheme-Phoneme Conversion and Alignment
• The Aix-MARSEC Methodology
• Integration into PCE
3. Conclusion and Perspectives
The Aix-MARSEC Project
Building Aix-MARSEC
An evolution from the SEC and MARSEC corpora
SEC (Spoken English Corpus)
• 55,000 words, 339 min. and 18 sec.
• BBC 1980s recordings
• 11 speaking styles
• 53 speakers (17 female and 36 male)
• Orthographic transcription
• Syntactic tagging and parsing
• Prosodic annotation: 14 tonetic stress marks (TSM)
MARSEC (Machine Readable SEC)
• Alignment of words and tone groups with the signal
• Conversion of all the TSM to ASCII characters
Aix-MARSEC
• Automatic grapheme-to-phoneme conversion
• Automatic phoneme level alignment
• Automatic intonation annotation using the Momel-Intsint methodology
• 8 annotation levels aligned: phonemes, syllable constituents, syllables, words, feet and rhythmic units, tone groups, Intsint coding
• Tagging and parsing alignment under way
The Aix-MARSEC Project
Availability of the database
• Online version:
  - Annotation files (TextGrids)
  - Phoneme data tables
  - Perl and Praat scripts
  www.lpl.univ-aix.fr/~EPGA/
• CD-Rom version:
  - Annotation files (TextGrids)
  - Phoneme data tables
  - Perl and Praat scripts
  - Sound files (.wav format)
The Aix-MARSEC Project
Methodology
[Flowchart: Orthographic transcription → G2P conversion → Raw phonemic transcription → Elision prediction → Optimised phonemic transcription → Automatic alignment → Aligned phonemic transcription → SC (syllable constituent) annotation, Syllable annotation, Word annotation; further levels shown: TSM annotation, Rhythmic annotation]
Grapheme-Phoneme Conversion and Alignment
G2P Conversion and Alignment
The Aix-MARSEC Methodology
[Flowchart: Orthographic transcription → G2P conversion → Raw phonemic transcription → Elision prediction → Optimised phonemic transcription → Automatic alignment → Aligned phonemic transcription → SC annotation, Syllable annotation, Word annotation]
G2P Conversion and Alignment
The Aix-MARSEC Methodology
[Flowchart: Orthographic transcription → G2P conversion → Raw phonemic transcription]
G2P Conversion and Alignment
The Aix-MARSEC Methodology
G2P Conversion: General principles
• Dictionary-based method (4 dictionaries used)
• Specific processing for numbers, abbreviations, etc.
• Syntagmatic effects (linking r, definite article)
Raw transcription
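The syntagmatic effects listed above could be handled roughly as follows. This is only a sketch under stated assumptions: the function name, the SAMPA vowel inventory and the /DI/ pre-vocalic form of "the" are illustrative choices, not taken from the Aix-MARSEC scripts.

```python
# SAMPA vowel symbols for British English (assumed inventory, for illustration).
VOWELS = {"I", "e", "{", "Q", "V", "U", "@", "i:", "A:", "O:", "u:", "3:",
          "eI", "aI", "OI", "@U", "aU", "I@", "e@", "U@"}

def apply_syntagmatic_effects(words):
    """words: list of (spelling, phonemes) pairs in citation form.

    Returns one phoneme list per word, adjusted for the following word.
    Hypothetical helper, not part of the original tool chain.
    """
    result = []
    for i, (spelling, phonemes) in enumerate(words):
        phonemes = list(phonemes)
        next_is_vowel = (i + 1 < len(words)
                         and words[i + 1][1][0] in VOWELS)
        if spelling.lower() == "the":
            # Definite article: /D@/ before consonants, /DI/ before vowels.
            phonemes = ["D", "I"] if next_is_vowel else ["D", "@"]
        elif (next_is_vowel
              and spelling.lower().endswith(("r", "re"))
              and phonemes[-1] in VOWELS):
            # Linking r: an orthographic final r surfaces before a vowel.
            phonemes.append("r")
        result.append(phonemes)
    return result
```

For instance, "far away" would come out as /f A: r/ + /@ w eI/, while "far" before a consonant keeps its citation form.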
G2P Conversion and Alignment
The Aix-MARSEC Methodology
G2P Conversion: The 4 dictionaries
• Primary pronunciation dictionary (Advanced Learner’s Dictionary, Oxford University Press; 71,000 entries)
• Complementary dictionary (700 entries)
• “Problematic forms” dictionary (hesitations, partial words, …; 26 entries)
• “Reduced forms” dictionary (75 entries)
G2P Conversion and Alignment
The Aix-MARSEC Methodology
G2P Conversion: Specific issues
• Abbreviations
• Numbers
• Sequences of numbers and capitals (Post Codes)
• Genitives and Contractions
• 3rd person and plural forms
• Preterite and past participle forms
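Inflected forms missing from a dictionary can be derived from the stem with the standard English suffix rules. The sketch below illustrates those rules on SAMPA phoneme lists; the function names and phoneme sets are assumptions made for the example, and the slides do not say how the original scripts handle these cases.

```python
# Voiceless consonants and sibilants in SAMPA (illustrative sets).
VOICELESS = {"p", "t", "k", "f", "T", "s", "S", "tS"}
SIBILANTS = {"s", "z", "S", "Z", "tS", "dZ"}

def add_s_suffix(stem):
    """Plural, 3rd person singular or genitive: /Iz/, /s/ or /z/."""
    last = stem[-1]
    if last in SIBILANTS:          # buses, watches
        return stem + ["I", "z"]
    if last in VOICELESS:          # cats
        return stem + ["s"]
    return stem + ["z"]            # dogs, plays

def add_ed_suffix(stem):
    """Preterite or past participle: /Id/, /t/ or /d/."""
    last = stem[-1]
    if last in {"t", "d"}:         # wanted, needed
        return stem + ["I", "d"]
    if last in VOICELESS:          # walked
        return stem + ["t"]
    return stem + ["d"]            # played
```

The sibilant and /t/, /d/ checks come first because those phonemes would otherwise fall into the voiceless or voiced branches.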
G2P Conversion and Alignment
The Aix-MARSEC Methodology
[Flowchart: Orthographic transcription → G2P conversion → Raw phonemic transcription → Elision prediction → Optimised phonemic transcription]
G2P Conversion and Alignment
The Aix-MARSEC Methodology
Elision Prediction: General principles
• Raw transcription ↔ citation forms
• Continuous speech ↔ specific phenomena (elisions, epenthesis, metathesis, etc.)
G2P Conversion and Alignment
The Aix-MARSEC Methodology
Elision prediction: Constraints
- Intonation constraints (TSM)
- Temporal constraints:
  Minimal threshold: 5 ms
  Thresholds for specific phonemes (Klatt, 1979): /t/, /d/ = 55 ms; /@/ = 55 ms; /T/ = 110 ms
  Lengthening “z” factor: z < 0 → elision; z ≥ 0 → no elision
- Phonotactic constraints (rules)
G2P Conversion and Alignment
Elision prediction: Rules

| Principles | Phonemes | Contexts | Constraints | Examples |
|---|---|---|---|---|
| 0 | | | <5 ms | |
| 1 | d | and | TSM | and then |
| 2 | h | he('s/ll/d) him his her | TSM | in her case |
| 3 | t d | {[t][d]} # {[t][d]} | Th.¹ - except '-ed' | I've got to |
| 4 | t d | C1 + {[t][d]} # C2 – {[h][j]} | Th. | mustn't lose |
| 5 | p k | nasal + {[p][k]} (#) C – {[r][l][j]} | | glimpse |
| 6 | l | [O:] + [l] (#) C | | always |
| 7 | T | C + [T] (#) [s] | Th. | twelfths |
| 8 | ptk bdg | [s\| z] + {[p\| b][t\| d][k\| g]} (#) [s\| z] | | tourists |
| 9 | @ | [@] + {[l][r]} (#) + reduced vowel {[I][@]} | Th. - */rl/ | camera |
| 10 | @ | # [k@n] ('syll (syll [0…n])) # | TSM - Th. | confront |
| 11 | @ | {[k][p]} + [@] + [n] # | Th. | open |

¹ Th.: duration threshold
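As an illustration of how one phonotactic rule might be checked, the sketch below looks for the context of rule 3 (a word-final /t/ or /d/ directly followed by a word-initial /t/ or /d/). The function name and data layout are hypothetical, and the duration threshold and the '-ed' exception attached to that rule are deliberately left out.

```python
def rule3_candidates(words):
    """words: list of SAMPA phoneme lists, one per word.

    Returns (word_index, phoneme_index) pairs marking word-final /t/ or /d/
    that precede a word-initial /t/ or /d/. Context check only: the duration
    threshold and the '-ed' exception are not applied in this sketch.
    """
    stops = {"t", "d"}
    candidates = []
    for i in range(len(words) - 1):
        current, following = words[i], words[i + 1]
        if current and following and current[-1] in stops and following[0] in stops:
            candidates.append((i, len(current) - 1))
    return candidates

# "I've got to": the final /t/ of "got" is an elision candidate.
print(rule3_candidates([["aI", "v"], ["g", "Q", "t"], ["t", "@"]]))
```

Each candidate would then still have to pass the temporal and intonation constraints before the phoneme is actually removed from the transcription.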
G2P Conversion and Alignment
Elision prediction: Evaluation
MEASURES
RECALL 50.51 % (half of all elisions are correctly predicted)
PRECISION 74.44 % (¾ of predicted elisions are correct)
SILENCE 49.49 %
NOISE 25.56 %
F-MEASURE 60.18 % (global quality of the algorithm)
4077 elided phonemes out of 199,770 in the corpus (≈ 2 %)
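The reported measures are linked by simple relations. The sketch below assumes the usual definitions (silence = 100 − recall, noise = 100 − precision, F-measure = harmonic mean of precision and recall), which reproduce the figures on the slide; the function name is illustrative.

```python
def derived_measures(recall_pct, precision_pct):
    """Derive silence, noise and F-measure (all in %) from recall and precision."""
    silence = 100.0 - recall_pct        # share of true elisions that were missed
    noise = 100.0 - precision_pct       # share of predicted elisions that are wrong
    f_measure = (2 * recall_pct * precision_pct) / (recall_pct + precision_pct)
    return {"silence": silence, "noise": noise, "f_measure": f_measure}

measures = derived_measures(50.51, 74.44)
print({name: round(value, 2) for name, value in measures.items()})
```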
G2P Conversion and Alignment
The Aix-MARSEC Methodology
[Flowchart: Orthographic transcription → G2P conversion → Raw phonemic transcription → Elision prediction → Optimised phonemic transcription → Automatic alignment → Aligned phonemic transcription]
G2P Conversion and Alignment
Alignment: General principles
HMM- and Viterbi-based alignment by Christophe Lévy (LIA, France)
- HMM trained on the TIMIT corpus of American English
- Gaussian Mixture Model (8 components & diagonal covariance matrices estimated through the Expectation-Maximisation algorithm optimising the Maximum-Likelihood criterion)
- 12 MFCC (filter bank analysis) plus energy, extended with delta and delta-delta coefficients: (12 + 1) × 3 = 39-coefficient vector per speech frame
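The principle of Viterbi forced alignment can be sketched in a few lines. This is a toy version under strong assumptions: one state per phoneme, no transition probabilities, and frame log-likelihoods supplied directly instead of being computed by GMMs over MFCC vectors. It is not the LIA aligner.

```python
def forced_align(loglik):
    """loglik[t][k]: log-likelihood of frame t under the k-th phoneme of the
    transcription. Returns the phoneme index assigned to each frame, such that
    phonemes appear in order and each one covers at least one frame.
    Assumes there are at least as many frames as phonemes.
    """
    n_frames, n_phones = len(loglik), len(loglik[0])
    neg_inf = float("-inf")
    score = [[neg_inf] * n_phones for _ in range(n_frames)]
    back = [[0] * n_phones for _ in range(n_frames)]
    score[0][0] = loglik[0][0]                  # alignment starts in the first phoneme
    for t in range(1, n_frames):
        for k in range(n_phones):
            stay = score[t - 1][k]              # remain in the same phoneme
            move = score[t - 1][k - 1] if k > 0 else neg_inf   # enter the next one
            if move > stay:
                best, back[t][k] = move, k - 1
            else:
                best, back[t][k] = stay, k
            if best > neg_inf:
                score[t][k] = best + loglik[t][k]
    path = [n_phones - 1]                       # alignment ends in the last phoneme
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path
```

Phoneme boundaries are the frames where the returned index changes; multiplying the frame position by the frame shift gives a time in milliseconds.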
G2P Conversion and Alignment
Alignment: Evaluation
Absolute mean error: 22 ms
Mean error: -6.29 ms
Kurtosis: 8.15 (narrow distribution)
Skewness: -0.94 (left bias)
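Statistics of this kind can be computed from the signed differences between automatic and reference boundaries. The sketch below uses population moments and non-excess kurtosis (a normal distribution gives 3); whether the reported kurtosis follows that convention is not stated on the slide, so this is an assumption.

```python
from statistics import fmean

def error_stats(errors_ms):
    """errors_ms: signed boundary errors (automatic minus reference), in ms."""
    mean = fmean(errors_ms)
    abs_mean = fmean(abs(e) for e in errors_ms)
    # Central moments of order 2, 3 and 4 (population versions).
    m2 = fmean((e - mean) ** 2 for e in errors_ms)
    m3 = fmean((e - mean) ** 3 for e in errors_ms)
    m4 = fmean((e - mean) ** 4 for e in errors_ms)
    return {
        "mean": mean,
        "abs_mean": abs_mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }
```

A negative mean together with negative skewness corresponds to the "left bias" noted above: under this sign convention, automatic boundaries tend to fall before the reference ones.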
G2P Conversion and Alignment
Alignment: Evaluation

| Acceptance threshold | Optimised transcription |
|---|---|
| 64 ms | 93.25 % |
| 32 ms | 82.02 % |
| 20 ms | 68.37 % |
| 16 ms | 59.97 % |
| 15 ms | 57.40 % |
| 10 ms | 42.43 % |
| 5 ms | 23.72 % |
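A table like this one can be obtained by counting, for each acceptance threshold, the boundaries whose absolute error does not exceed it. The sketch below assumes an inclusive comparison (≤), which the slides do not specify, and runs on made-up error values.

```python
def acceptance_rate(errors_ms, threshold_ms):
    """Percentage of boundaries whose absolute error is within the threshold."""
    within = sum(1 for e in errors_ms if abs(e) <= threshold_ms)
    return 100.0 * within / len(errors_ms)

# Hypothetical signed errors in ms, for illustration only.
errors = [-30, -10, 4, 12, 70]
for threshold in (64, 20, 5):
    print(threshold, "ms:", acceptance_rate(errors, threshold), "%")
```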
Integration into PCE
Integration: Motivations
Phoneme level alignment
Double focus:
• Segmental phenomena: formant charts
• Prosodic phenomena: tonal alignment
For phoneticians and phonologists
Integration into PCE
Integration: 2 possible policies
• Direct integration: Exact Aix-MARSEC methodology
Requires word level manual alignment
• Alternative integration: Adaptation of the Aix-MARSEC methodology
Optional elisions predicted on the basis of phonotactic rules only + decision during the alignment phase
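One way to read the alternative policy is that the phonotactic rules mark some phonemes as optional, and the aligner then chooses among the resulting pronunciation variants. The sketch below only generates the variants; the function name and the idea of enumerating every combination are assumptions, and the choice during alignment is not shown.

```python
from itertools import product

def pronunciation_variants(phonemes, optional):
    """phonemes: SAMPA phoneme list; optional: indices that may be elided.

    Returns every variant obtained by keeping or dropping each optional phoneme.
    """
    indices = sorted(optional)
    variants = []
    for keep_flags in product([True, False], repeat=len(indices)):
        dropped = {i for i, keep in zip(indices, keep_flags) if not keep}
        variants.append([p for i, p in enumerate(phonemes) if i not in dropped])
    return variants

# "tourists": the /t/ between the two sibilants is marked optional.
print(pronunciation_variants(["t", "U@", "r", "I", "s", "t", "s"], {5}))
```

With n optional phonemes this yields 2^n variants, so in practice the candidates would probably need to be limited per word or per tone group.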
Conclusions and Perspectives
• An easily extensible, fully automatic methodology
• Diverse types of phonological/phonetic and segmental/prosodic exploitation (formant charts; temporal, intonational and metrical studies; …)
• Full interactivity with other ProZEd modules (Momel-Intsint, …)
• Realistic integration into PCE (2 options)
Well… This time it’s for good!!
Presentation available from
www.lpl.univ-aix.fr/~EPGA/
14 ASCII prosodic annotation symbols:
_ low level
~ high level
< step-down
> step-up
/’ (high) rise-fall
‘/ high fall-rise
\ high fall
/ high rise
, low rise
‘ low fall
,\ (low rise-fall – not used)
\, low fall-rise
* stressed but unaccented
| minor intonation unit boundary
|| major intonation unit boundary
(Roach, 1994)
Reduced forms processing
Creation of a reduced forms dictionary based on O’Connor (1967) and Faure (1975)
Reduction constraint: TSM absence
Aim: improving G2P conversion
Example:
• TSM: ‘/and → converted into /{nd/
• No TSM: and → converted into /@nd/
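The reduction constraint can be expressed as a simple lookup. In this sketch the two dictionaries hold only the "and" entry from the example, and the function name and the `has_tsm` flag are hypothetical.

```python
# Toy dictionaries containing only the entry from the example above.
FULL_FORMS = {"and": "{nd"}
REDUCED_FORMS = {"and": "@nd"}

def transcribe(word, has_tsm):
    """Return the reduced form when the word carries no tonetic stress mark
    and a reduced form exists; otherwise return the full form (or None)."""
    key = word.lower()
    if not has_tsm and key in REDUCED_FORMS:
        return REDUCED_FORMS[key]
    return FULL_FORMS.get(key)

print(transcribe("and", has_tsm=True))   # full form
print(transcribe("and", has_tsm=False))  # reduced form
```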