Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011
description
Transcript of Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011
![Page 1: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/1.jpg)
CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lecture 36–syllabification and transliteration)
Pushpak BhattacharyyaCSE Dept., IIT Bombay
11th April, 2011
![Page 2: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/2.jpg)
Access from one language to another
Language L1
Translation
Language L2
TransliterationGandhiji गांधीजीBoy
लड़का
![Page 3: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/3.jpg)
Cross Lingual Query
Query Translati
on
Search Engine
English Docume
nt
Translate Docume
nt
गांधी के जन्मस्थान
Birthplace of Gandhi
Output
![Page 4: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/4.jpg)
Sound based transliteration Transliteration
Script Script
Transliteration using pronunciation Convert script to sound Convert sound to script
Similar to interlingua based MT
MeaningSentence in source Language
Sentence in target Language
(गांधीजी)(Gandhiji)
![Page 5: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/5.jpg)
Two approaches to transliteration
Syllabification
Rule Based
Machine Learning Based
Word SyllabifiedApple ap ple
Fundamental fun da men tal
Use GIZA++ Moses on parallel corpus
![Page 6: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/6.jpg)
Phoneme-based approach
Word inSource language
Pronunciationin
Source language
Word inTarget language
PronunciationIn
target language
P( ps | ws)
P ( pt | ps )
P ( wt | pt )
Note: Phoneme is the smallest linguistically distinctive unit of sound.
P(wt)
Wt* = argmax (P (wt). P (wt | pt) . P (pt | ps) . P (ps | ws) )
![Page 7: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/7.jpg)
Basics of syllables“Syllable is a unit of spoken language consisting of a single uninterrupted sound formed generally by a Vowel and preceded or followed by one or more consonants.”
Vowels are the heart of a syllable (Most Sonorous Element) (svayam raajate iti svaraH)
Consonants act as sounds attached to vowels.
![Page 8: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/8.jpg)
Syllable structure A syllable consists of 3 major parts:-
Onset (C) Nucleus (V) Coda (C)
Vowels sit in the Nucleus of a syllable Consonants may get attached as
Onset or Coda. Basic structure - CV
![Page 9: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/9.jpg)
Possible syllable structures The Nucleus is
always present Onset and Coda
may be absent Possible
structures V CV VC CVC
![Page 10: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/10.jpg)
Introduction to sonority theory“The Sonority of a sound is its loudness
relative to other sounds with the same length, stress and speech.”
Some sounds are more sonorous Words in a language can be divided into
syllables Sonority theory distinguishes syllables on
the basis of sounds.
![Page 11: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/11.jpg)
Sonority hierarchy Defined on the basis of amount of
sound associated The sonority hierarchy is as follows:-
Vowels (a, e, i, o, u) Liquids (y, r, l, v) Nasals (n, m) Fricatives (s, z, f,…..sh, th etc.) Affricates (ch, j) Stops (b, d, g, p, t, k)
![Page 12: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/12.jpg)
Sonority scale Obstruents can
be further classified into:- Fricatives Affricates Stops
![Page 13: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/13.jpg)
Sonority theory & syllables“A Syllable is a cluster of sonority, defined by a sonority peak acting as a structural magnet to the surrounding lower sonority elements.”
Represented as waves of sonority or Sonority Profile of that syllable Nucleus
Onset Coda
![Page 14: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/14.jpg)
Sonority sequencing principle
“The Sonority Profile of a syllable must rise until its Peak(Nucleus), and then fall.”
Peak (Nucleus)
Onset Coda
![Page 15: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/15.jpg)
Maximal onset principle“The Intervocalic consonants are maximally
assigned to the Onsets of syllables in conformity with Universal and Language-Specific Conditions.”
Determines underlying syllable division
Example DIPLOMA
DIP LO MA & DI PLO MA
![Page 16: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/16.jpg)
Syllable Structure: a more detailed look
Count of no. of syllables in a word is roughly/intuitively the no. of vocalic segments in a word.
Thus, presence of a vowel is an obligatory element in the structure of a syllable. This vowel is called “nucleus”.
Basic Configuration: (C)V(C). Part of syllable preceding the nucleus is called the
onset. Elements coming after the nucleus are called the
coda. Nucleus and coda together are referred to as the
rhyme.
S ≡ Syllable, O ≡ OnsetR ≡ Rhyme, N ≡ NucleusCo ≡ Coda
![Page 17: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/17.jpg)
Syllable Structure: Examples
‘word’
‘sprint’
![Page 18: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/18.jpg)
Syllable Structure: Examples
‘may’
‘opt’
‘air’
No Coda.
No Onset.
No Coda, No Onset.
![Page 19: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/19.jpg)
Syllable Structure Open Syllable: ends in vowel Closed syllable: ends in consonant or consonant
cluster
Light Syllable: A syllable which is open and ends in a short vowel
General Description – CV. Example, ‘air’.
Heavy Syllable: Closed syllables or syllables ending in diphthong
Example: ‘opt’ Example, ‘may’
![Page 20: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/20.jpg)
Syllabification: Determining Syllable Boundaries
Given a string of syllables (word), what is the coda of one and the onset of another?
In a sequence such as VCV, where V is any vowel and C is any consonant, is the medial C the coda of the first syllable (VC.V) or the onset of the second syllable (V.CV)?
To determine the correct groupings, there are some rules, two of them being the most important and significant:
Maximal Onset Principle, Sonority Hierarchy
![Page 21: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/21.jpg)
(This and statistical syllabification and application to Transliteration: DDP work
by Ankit Aggrwal, 2009)
Rule Based Syllabification
![Page 22: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/22.jpg)
Phonotactics Guides construction of onsets and codas
Restricts combination of phonemes.
In general, rules operate around the sonority hierarchy. E.g., Fricative /s/ is lower on the sonority hierarchy than the lateral /l/, so the
combination /sl/ is permitted in onsets and /ls/ is permitted in codas. Opposite is not allowed.
Thus, ‘slips’ and ‘pulse’ are possible English words. ‘lsips’ and ‘pusl’ are not possible.
![Page 23: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/23.jpg)
Constraints on Onsets1
One-consonant: Only /ŋ/ cannot occur in syllable-initial position. Two-consonant: We refer to the scale of sonority.
Sequence ‘rn’ is ruled out since there is a decrease of sonority. Minimal Sonority Distance: Distance in sonority between the first and the
second element in the onset must be of at least 2 degrees. Thus, only a limited no. of possible two-consonant clusters.
Three-consonant: Restricted to licensed two-consonant onsets preceded by /s/. Also, /s/ can only be followed by a voiceless sound. Therefore, only /spl/, /spr/, /str/, /skr/, /spj/, /stj/, /skj/, /skw/,
/skl/, /smj/ will be allowed. (splinter, spray, strong etc.) While /sbl/, /sbr/, /sdr/, /sgr/, /sθr/ will be ruled out.
![Page 24: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/24.jpg)
Constraints on Onsets2
Possible 2-consonat clusters in an Onset
![Page 25: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/25.jpg)
Constraints on Coda1
![Page 26: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/26.jpg)
Constraints on Coda2
![Page 27: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/27.jpg)
Other Constraints Nucleus: The following can occur as nucleus:
All vowel sounds (monophthongs as well as diphthongs).
Syllabic: Both the onset and the coda are optional (as seen previously). /j/ at the end of an onset (/pj/, /bj/, /tj/, /dj/, /kj/, /fj/, /vj/, /θj/, /sj/,
/zj/, /hj/, /mj/, /nj/, /lj/, /spj/, /stj/, /skj/) must be followed by /uɪ/ or /ʊə/.
Long vowels and diphthongs are not followed by /ŋ/. /ʊ/ is rare in syllable-initial position. Stop + /w/ before /uɪ, ʊ, ʌ, aʊ/ are excluded.
![Page 28: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/28.jpg)
Implementation1
1. Identify the first nucleus (single or multiple consecutive vowels) in the word.
2. Parse all the preceding consonants as the onset of the first syllable.
3. Find the next nucleus in the word. If unsuccessful in finding another nucleus, we'll simply parse the consonants to the right of the current nucleus as the coda of the first syllable, else we will move to the next step. Now the consonant cluster between these two nuclei have to be divided in two parts, one as coda of 1st syllable and the other onset of 2nd syllable.
4. If the no. of consonants in the cluster is one, it will be the onset of the second nucleus as per the Maximal Onset Principle and Constrains on Onset.
5. If it is two, check whether both of these can go to the onset of the second syllable as per the allowable onsets.
![Page 29: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/29.jpg)
Implementation2
6. If not , the first consonant will be the coda of the first syllable and the second consonant will be the onset of the second syllable.
7. If the no. of consonants in the cluster is three, check if they can be the onset of the second syllable, if not, check for the last two, if not, parse only the last consonant as the onset of the second syllable.
8. If the no. of consonants in the cluster is more than three, except the last three consonants, parse all the consonants as the coda of the first syllable. With the remaining 3 consonants, apply step 7.
9. Continue the procedure for the rest of the string.
![Page 30: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/30.jpg)
Accuracy Accuracy = (no. of words correctly syllabified x 100) %
no. of words
10,000 easy words were chosen to establish proof of concept 91 words were found to be incorrectly syllabified. Accuracy: 99.09%. High precision, low recall (a characteristic of knowledge based AI)
![Page 31: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/31.jpg)
Statistical syllabification
![Page 32: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/32.jpg)
Statistical Machine Syllabification
Data Sources of data
1. Election Commission of India (ECI) Name List : This web source provides native Indian names written in both English and Hindi.
2. Delhi University (DU) Student List : This web sources provides native Indian names written in English only. These names were manually transliterated for the purposes of training data.
3. Indian Institute of Technology, Bombay (IITB) Student List: The Academic Office of IITB provided this data of students who graduated in the year 2007.
4. Named Entities Workshop (NEWS) 2009 English-Hindi parallel names : A list of paired names between English and Hindi of size 11k is provided.
We start with 8,000 randomly chosen names from the ECI Name list and syllabify them manually.
![Page 33: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/33.jpg)
Statistical Machine Syllabification
Effect of Training Format Syllable-separated Format
Syllable-marked Format
Future Work
Source Targets u d a k a r su da karc h h a g a n chha ganj i t e s h ji teshn a r a y a n na ra yans h i v shivm a d h a v ma dhavm o h a m m a d mo ham madj a y a n t e e d e v i ja yan tee de vi
Source Targets u d a k a r s u _ d a _ k a rc h h a g a n c h h a _ g a nj i t e s h j i _ t e s hn a r a y a n n a _ r a _ y a ns h i v s h i vm a d h a v m a _ d h a vm o h a m m a d m o _ h a m _ m a dj a y a n t e e d e v i j a _ y a n _ t e e _ d e _ v i
![Page 34: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/34.jpg)
Statistical Machine Syllabification
Top-1 Accuracy: 71.8% (Syllable-separated); 80.5% (Syllable-marked)
Top-5 Accuracy: 83.4% (Syllable-separated); 90.4% (Syllable-marked)
60%65%70%75%80%85%90%95%
100%
1 2 3 4 5
Cum
ulati
ve A
ccur
acy
Accuracy Level
Syllable-separated Syllable-marked
![Page 35: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/35.jpg)
Statistical Machine Syllabification
Effect of Data Size 4 different experiments were performed:
1. 8k: Names from the ECI Name list.2. 12k: An additional 4k names were syllabified to increase the data size.3. 18k: IITB Student List and the DU Student List was included and syllabified.4. 23k: Some more names from ECI Name List and DU Student List were
syllabified and this data acts as the final data for us.
93.8%97.5% 98.3% 98.5% 98.6%
70.0%
75.0%
80.0%
85.0%
90.0%
95.0%
100.0%
1 2 3 4 5
Cum
ulati
ve A
ccur
acy
Accuracy Level
8k 12k 18k 23k
![Page 36: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/36.jpg)
Statistical Machine Syllabification
Effect of Language Model n-gram Order
85.0%
87.0%
89.0%
91.0%
93.0%
95.0%
97.0%
99.0%
1 2 3 4 5
Cum
ulati
ve A
ccur
acy
Accuracy Level
3-gram 4-gram 5-gram 6-gram 7-gram
![Page 37: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/37.jpg)
Effect of Syllabification on Transliteration
![Page 38: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/38.jpg)
Transliteration using syllabification
Moses
Moses
Syllabified data
(source–side)
Automatic Transliterator
Automatic Syllabifier
Syllabified data
(parallel)
Input String
Syllabified String
Transliterated String
Learnt syllabifier
Learnt transliterator
![Page 39: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/39.jpg)
Statistical Machine Transliteration
Top-1 Accuracy: 60.1% (Syllable-separated); 50.2% (Syllable-marked) Top-5 Accuracy: 87.2% (Syllable-separated); 79.3% (Syllable-marked)
45%50%55%60%65%70%75%80%85%90%95%
100%
1 2 3 4 5 6
Cum
ulati
ve A
ccur
acy
Accuracy Level
Syllable-separated Syllable-marked
![Page 40: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/40.jpg)
The Top-5 Accuracy is just 63.0% which is much lower than required.
No syllabification: Baseline Performance
Top-n CorrectCorrect%age
Cumulative%age
1 1,868 41.5% 41.5%2 520 11.6% 53.1%3 246 5.5% 58.5%4 119 2.6% 61.2%5 81 1.8% 63.0%
Below 5 1,666 37.0% 100.0%4,500
40
Additional Additional
![Page 41: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/41.jpg)
Manual, i.e., ideal syllabification: skyline Performance
Top-1 Accuracy: 71.8% (Syllable-separated); 80.5% (Syllable-marked)
Top-5 Accuracy: 87.4% (Syllable-separated); 90.4% (Syllable-marked)
60%65%70%75%80%85%90%95%
100%
1 2 3 4 5
Cum
ulati
ve A
ccur
acy
Accuracy Level
Syllable-separated Syllable-marked
41
![Page 42: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/42.jpg)
Obsevations Transliteration likely to be helped
by syllabification Rule based syllabification: high
precision low recall Statistical syllabification:
reasonable accuracy at top-5
![Page 43: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/43.jpg)
References1. Nasreen AbdulJaleel and Leah S. Larkey. Statistical transliteration for english-
arabic cross language information retrieval. In Conference on Information and Knowledge Management, pages 139–146, 2003.
2. Ann K. Farmer Andrian Akmajian, Richard M. Demers and Robert M. Harnish. Linguistics: An Introduction to Language and Communication. MIT Press, 5th edition, 2001.
3. Association for Computer Linguistics. Collapsed Consonant and Vowel Models: New Approaches for English-Persian Transliteration and Back-Transliteration, 2007.
4. Slaven Bilac and Hozumi Tanaka. Direct combination of spelling and pronunciation information for robust back-transliteration. In Conferences on Computational Linguistics and Intelligent Text Processing, pages 413–424, 2005.
5. Ian Lane Bing Zhao, Nguyen Bach and Stephan Vogel. A log-linear block transliteration model based on bi-stream hmms. HLT/NAACL-2007, 2007.
![Page 44: Pushpak Bhattacharyya CSE Dept., IIT Bombay 11 th April, 2011](https://reader036.fdocuments.in/reader036/viewer/2022081512/568166b3550346895ddab6d1/html5/thumbnails/44.jpg)
References6. H. L. Jin and K. F. Wong. A Chinese dictionary construction algorithm for
information retrieval. In ACM Transactions on Asian Language Information Processing, pages 281–296, December 2002.
7. K. Knight and J. Graehl. Machine transliteration. In Computational Linguistics, pages 24(4):599–612, Dec. 1998.
8. Lee-Feng Chien Long Jiang, Ming Zhou and Chen Niu. Named entity translation with web mining and transliteration. In International Joint Conference on Artificial Intelligence (IJCAL-07), pages 1629–1634, 2007.
9. Dan Mateescu. English Phonetics and Phonological Theory. 2003.10. Della Pietra P. Brown and R. Mercer. The mathematics of statistical machine
translation: Parameter estimation. In Computational Linguistics, page 19(2):263U˝ 311, 1990.
11. Ganapathiraju Madhavi Prahallad Lavanya and Prahallad Kishore. A simple approach for building transliteration editors for Indian languages. Zhejiang University SCIENCE-2005, 2005.