IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC...

5
IJCST VOL. 3, ISSUE 1, JAN. - MARCH 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print) www.ijcst.com 286 INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND T ECHNOLOGY Abstract Machine Translation (MT) is a sub-field of Artificial Intelligence (AI), which translates the text from one language known as source language into the text of another language known as target language. This paper gives a survey of the work done on various Indian machine translation systems either developed or under the development. Some systems are of general domain, but most of the systems have their own particular domains like parliamentary documents translation, news readings, children stories, web based information retrieval etc. Keywords MT, Computational Linguistics, Morphology I. Introduction Indian is the largest democratic country in the world and there are more than 30 languages and approximately 2000 dialects used for the communication by the Indian peoples and out of these languages Hindi and English are taken as language for official work and there are 22 scheduled languages used by the different states for their administrative work and communication purposes. These 22 languages includes Assames, Bengali, Bodo, Dogri, Gujrati, Hindi, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Kannada, Kashmiri, Konkani, Maithili, Santali, Sindhi, Tamil, Telugu, Urdu. Because of different culture and multilingual environment in India there is a big requirement for inter-language translation for the transfer of information and sharing of the ideas. Peoples of different states perform their work in their respective regional languages whereas the work at the Union Government offices is performed in English language which is assumed to be one of the most speaking languages in the world or Hindi Language. So to synchronize between state government and the central / Union government there is a need for translation from regional languages to English language and vice versa. In India because of different culture there is different news papers published locally as well as globally. From the above discussion it is clear that there is large scope of translation of text from English to Indian Languages and vice versa. The initial work on Indian Machine Translation (in the beginning of 90’s) was performed at various locations by different persons like IIT Kanpur, Computer and Information Science department of Hyderabad, NCST Mumbai, CDAC Pune, department of IT, Ministry of Communication and IT Government of India. In the mid 90’s and late 90’s some more machine translation projects also started at IIT Bombay, IIT Hyderabad, department of computer science and Engineering Jadavpur University, Kolkata, JNU New Delhi etc. The next part of the paper gives a brief introduction of the various machine translation works done so far although there are some advancement is going on some projects so the latest information may be taken from the respective websites. The next section is divided into five sub-sections A to E for the different languages. II. Indian Machine Translation Systems The development of the Indian Machine Translation system can be divided into different categories. The scope of this paper is restricted to Hindi, Punjabi, Sanskrit, Bengali and English Language as a source language. A. Machine Translation for Hindi Language as a Source Language 1. ANUSAARAKA A project named “ANUSAARAKAA” for machine translation from one Indian Language to another Language in 1995. It has been used for translation from Telugu, Kannada, Bengali, Punjabi and Marathi to Hindi language translation and vice versa. The ANUSAARAKA system is a language accessor cum Machine Translation system that works on principles of Paninian Grammar (PG)”. The system provides both the robustness incase of failure and no loss of information while translating the text. The output of the system follows the grammar of the source language. The approach for the translation in this system is divided in two parts: 1) The ANUSAARAKA system which is based on language knowledge 2) the domain specific knowledge based on world knowledge, statistical knowledge etc. It was started at IIT Kanpur and now shifted to IIIT Hyderabad Currently ANUSAARAKA is working for Telugu, Kannada, Marathi, Bengali, and Punjabi to Hindi language translation and in near future reverse translation will also be feasible. The output can be post-edited if there is any grammatical error occurs during machine translation. The efficiency of the system was not available [2-4]. 2. The MANTRA (Machine Assisted Translation Tool) A machine translating system named “MANTRA” which translates the text from English to Hindi language with a precise domain in Office order, administrative work texts etc. in 1999. The basis of this system was the Tree Adjoining Grammar (TAG) formalism from the University of Pennsylvania. It uses Lexicalized Tree Adjoining Grammar (LTAG) for representing the English and the Hindi Language. It uses the TAG for parsing as well as Generation purposes. Now this system is also used in the finance, agriculture, health care, information technology, education and the general purpose activities of the government domains. The system named ‘MANTRA-RAJYSABHA’ is developed for the RAJYASABHA purposes. Currently the work for the language pairs English-Bengali, English-Telugu, English-Gujarati, Hindi- English, Hindi- Marathi, Hindi-Bengali is also going on. In 2008 the application area of MANTRA was extended to the education and the banking sectors also [5-6]. 3. Anubharti-II Technology A system with an approach for machine aided translation having the combination of example-based and corpus based approaches and some elementary grammatical analysis. In ANUBHARTI the traditional EBMT approach has been modified to reduce the requirement of a large example base. ANUBHARTI-II in 2004 uses Hindi as a source language for translation to other Indian language [8]. 4. Hinglish Machine Translation System (2004) A MT system which translates pure (standard) Hindi to pure English in 2004. It had been implemented by incorporating additional level to AnglaBHarti-II and AnuBharti-II The system Survey of Indian Machine Translation Systems 1 Sitender, 2 Seema Bawa 1,2 Dept. of CSE, Thapar University, Patiala, Punjab, India

Transcript of IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC...

Page 1: IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC Pune, ... ISSN : 0976-8491 (Online) | ISSN : 2229-4333 ... D. Machine Translation System for

IJCST Vol. 3, ISSue 1, Jan. - MarCh 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

w w w . i j c s t . c o m286 InternatIonal Journal of Computer SCIenCe and teChnology

AbstractMachine Translation (MT) is a sub-field of Artificial Intelligence (AI), which translates the text from one language known as source language into the text of another language known as target language. This paper gives a survey of the work done on various Indian machine translation systems either developed or under the development. Some systems are of general domain, but most of the systems have their own particular domains like parliamentary documents translation, news readings, children stories, web based information retrieval etc.

KeywordsMT, Computational Linguistics, Morphology

I. IntroductionIndian is the largest democratic country in the world and there are more than 30 languages and approximately 2000 dialects used for the communication by the Indian peoples and out of these languages Hindi and English are taken as language for official work and there are 22 scheduled languages used by the different states for their administrative work and communication purposes. These 22 languages includes Assames, Bengali, Bodo, Dogri, Gujrati, Hindi, Malayalam, Manipuri, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Kannada, Kashmiri, Konkani, Maithili, Santali, Sindhi, Tamil, Telugu, Urdu. Because of different culture and multilingual environment in India there is a big requirement for inter-language translation for the transfer of information and sharing of the ideas. Peoples of different states perform their work in their respective regional languages whereas the work at the Union Government offices is performed in English language which is assumed to be one of the most speaking languages in the world or Hindi Language. So to synchronize between state government and the central / Union government there is a need for translation from regional languages to English language and vice versa. In India because of different culture there is different news papers published locally as well as globally. From the above discussion it is clear that there is large scope of translation of text from English to Indian Languages and vice versa. The initial work on Indian Machine Translation (in the beginning of 90’s) was performed at various locations by different persons like IIT Kanpur, Computer and Information Science department of Hyderabad, NCST Mumbai, CDAC Pune, department of IT, Ministry of Communication and IT Government of India. In the mid 90’s and late 90’s some more machine translation projects also started at IIT Bombay, IIT Hyderabad, department of computer science and Engineering Jadavpur University, Kolkata, JNU New Delhi etc. The next part of the paper gives a brief introduction of the various machine translation works done so far although there are some advancement is going on some projects so the latest information may be taken from the respective websites. The next section is divided into five sub-sections A to E for the different languages.

II. Indian Machine Translation SystemsThe development of the Indian Machine Translation system can be divided into different categories. The scope of this paper is restricted to Hindi, Punjabi, Sanskrit, Bengali and English

Language as a source language.

A. Machine Translation for Hindi Language as a Source Language

1. ANUSAARAKAA project named “ANUSAARAKAA” for machine translation from one Indian Language to another Language in 1995. It has been used for translation from Telugu, Kannada, Bengali, Punjabi and Marathi to Hindi language translation and vice versa. The ANUSAARAKA system is a language accessor cum Machine Translation system that works on principles of Paninian Grammar (PG)”. The system provides both the robustness incase of failure and no loss of information while translating the text. The output of the system follows the grammar of the source language. The approach for the translation in this system is divided in two parts: 1) The ANUSAARAKA system which is based on language knowledge 2) the domain specific knowledge based on world knowledge, statistical knowledge etc. It was started at IIT Kanpur and now shifted to IIIT Hyderabad Currently ANUSAARAKA is working for Telugu, Kannada, Marathi, Bengali, and Punjabi to Hindi language translation and in near future reverse translation will also be feasible. The output can be post-edited if there is any grammatical error occurs during machine translation. The efficiency of the system was not available [2-4].

2. The MANTRA (Machine Assisted Translation Tool)A machine translating system named “MANTRA” which translates the text from English to Hindi language with a precise domain in Office order, administrative work texts etc. in 1999. The basis of this system was the Tree Adjoining Grammar (TAG) formalism from the University of Pennsylvania. It uses Lexicalized Tree Adjoining Grammar (LTAG) for representing the English and the Hindi Language. It uses the TAG for parsing as well as Generation purposes. Now this system is also used in the finance, agriculture, health care, information technology, education and the general purpose activities of the government domains. The system named ‘MANTRA-RAJYSABHA’ is developed for the RAJYASABHA purposes. Currently the work for the language pairs English-Bengali, English-Telugu, English-Gujarati, Hindi-English, Hindi- Marathi, Hindi-Bengali is also going on. In 2008 the application area of MANTRA was extended to the education and the banking sectors also [5-6].

3. Anubharti-II TechnologyA system with an approach for machine aided translation having the combination of example-based and corpus based approaches and some elementary grammatical analysis. In ANUBHARTI the traditional EBMT approach has been modified to reduce the requirement of a large example base. ANUBHARTI-II in 2004 uses Hindi as a source language for translation to other Indian language [8].

4. Hinglish Machine Translation System (2004)A MT system which translates pure (standard) Hindi to pure English in 2004. It had been implemented by incorporating additional level to AnglaBHarti-II and AnuBharti-II The system

Survey of Indian Machine Translation Systems 1Sitender, 2Seema Bawa

1,2Dept. of CSE, Thapar University, Patiala, Punjab, India

Page 2: IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC Pune, ... ISSN : 0976-8491 (Online) | ISSN : 2229-4333 ... D. Machine Translation System for

IJCST Vol. 3, ISSue 1, Jan. - MarCh 2012ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

w w w . i j c s t . c o m InternatIonal Journal of Computer SCIenCe and teChnology 287

gives satisfactory results in more than 90% cases [15].

5. SamparkIn 2009 a machine translation system among Indian Languages was proposed by the Consortium of Institutions (IIT Hyderabad, University of Hyderabad, C-DAC Noida, Anna University, KBC Chennai, IIT Kharagpur, IISc Bangalore, IIIT Allahabad, Tamil University, and Jadavpur University). Currently released systems are {Punjabi, Urdu, Tamil, Marathi} to Hindi and Telugu-Hindi and Telugu to Tamil machine translation system [28].

6. Hindi-Punjabi MT SystemVishal Goyal and Dr. G. S. Josan proposed a machine translation system mainly from Hindi- Punjabi in 2009 at Punjabi University Patiala. It is based on direct word-to-word translation and reported 95% accuracy [30].

7. Hindi Generation from InterlinguaProf. Pushpak Bhattacharyya and Prof. Om. P. Damani reported a work on Hindi generation (Hindi Deconverter) from UNL graphs with the satisfactory results. The linguistic concerns have been clearly separated from the computational tasks which results in the possibility of the generation of the other languages also [31].

8. Web Based Hindi to Punjabi Machine TranslatorIn 2010 Vishal Goyal and Prof. G.S. Lehal proposed a machine translator from Hindi to Punjabi. The methodology used for the translation was direct translation and later on improving the language learning modules for the enhancement of the quality of the system. The accuracy of the translation is approximately 95%.The web tool developed has many application areas like over internet for the particular community of peoples, for the newspapers and sending e-mails from Hindi to Punjabi or vice versa [33].

B. Machine Translation for Punjabi Language as a Source Language

1. Punjabi to Hindi Machine Translation SystemIn 2007 a system based on direct word-word translation for machine translation between Punjabi as source language and Hindi as target Language was proposed .The system has reported 92.8% accuracy [24].

2. Punjabi to Hindi Machine Translation SystemIn 2008 a machine translation system from Punjabi to Hindi was proposed. The approach used for the translation was direct translation along with a mixture of rule based and statistical approach and the use of language corpus for each language. The results of the system were motivating and the accuracy rate of the translation was 90.67% [29].

3. Sahmukhi to Gurmukhi Machine Transliteration System for Punjabi LanguageIn 2008 a transliteration system between the two scripts of Punjabi Language was developed. The methodology used for the transliteration comprises of the pre-processing, rule-based transliteration and post-processing modules. The performance of the system was evaluated manually on the poetry, article, story and the average performance was 91.37% [26].

C. Machine Translation for Sanskrit Language as a Source Language

1. Sanskrit-Hindi AnusaarkaIn 2009 a language accessor cum machine translation system for Sanskrit –Hindi language pair by following the Anusaarka approach was proposed and it allows the user to access the source language text and give the rough output in the target language. The translation mechanism was transparent to the end user [27].

2. Sanskrit Karka Analyzer for Machine TranslationIn 2007 [23], a Sanskrit Karka analyzer was designed by Sudhir K Mishra for making a translation tool for Sanskrit language at JNU Delhi. The methodology adopted was Rule Based MTS (RBMTS). But the system was not able to resolve the ambiguities and also not able to work on Sandhi and Samasa for Sanskrit language.

3. Constrained Based Parser for Sanskrit LanguageIn 2010, Dr. Amba Kulkarni, Sheetal Pokar, Devanand Shukl designed A Constrained Based Parser for Sanskrit Language at University of Hyderabad in the department of Sanskrit Studies. Based on the designing principles obtained from the generative grammars the parser was modeled for finding the directed tress, from the graph with the nodes as words and edges showing the relations between the words. To rule out the non-solutions they used Mimamsa constraints of Akanksa and Sannidhi to give the priority the solutions. The current system allows only the limited and simple sentences to be parsed [32].ANGLABHARTI: A machine translation system at IIT Kanpur in 1991 was developed which translates from English to Indian Languages. The concept of pseudo- Interlingua has been used for the development of the machine aided translation system named “ANGLABHARTI “in 1991. This concept exploits the commonality in the Indian Languages. ANLGABHARTI is based on Rule Based Translation System (RBTS) with context free grammar structure of the English language as a source language and produces a pseudo Interlingua code which is applicable to a Group of Indian Languages. The movement rules are obtained by the corpus analysis and the target constituents are obtained from this analysis. The aim of the ANGLABHARTI was to provide a translation system in which 90% work will be done by the machine and the 10% post editing work will be done by the human [1].

D. Machine Translation System for English-Indian Language

1. Anglabharti-IIA new system for machine translation named ANGLABHARTI-II in 2004 was proposed that addresses many of the shortcomings of the earlier architecture (ANLGABHARTI-I). It uses a Generalized Example-Base (GEB) for hybridization besides a Raw Example-Base (REB). During the development phase, when it was found that the modification in the rule-base was difficult and might result in unpredictable results, the example-base is grown interactively by augmenting it. Provisions were made for pre-editing and paraphrasing. The efficiency of the system was improved as compared to ANGLABHARTI-I [8].

2. MATRAA system for translation in 2004 from English to Hindi was developed and named it MATRA. It was a fully automatic system for indicative English–Hindi MT of general purpose text, but the

Page 3: IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC Pune, ... ISSN : 0976-8491 (Online) | ISSN : 2229-4333 ... D. Machine Translation System for

IJCST Vol. 3, ISSue 1, Jan. - MarCh 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

w w w . i j c s t . c o m288 InternatIonal Journal of Computer SCIenCe and teChnology

system has been applied mainly in the domains of the news. A hybrid approach system Vaakya has been developed at NCST (C-DAC) Bombay [9]. The 65% of translation done by this system initially was in acceptable state, but it needs to handle the phrasal verbs, interrogative and imperative mood sets.

3. Shiva and ShaktiIn 2004 two MT systems were being developed jointly by IIIT Hyderabad, IIS Bangalore and Carnegie Mellon University USA for translation from English to Hindi. The Shiva [10-11], system uses example based approach and the Shakti [11-12], system uses rule-based approach with statistical approach for machine translation. The Shakti system is working for three target languages Hindi, Marathi and Telugu.

4. UNL Based English-Hindi Machine TranslationIn 2003, at IIT [13], Bombay a machine translation system translating from English to Hindi using UNL as Interlingua was developed. The UNL is an International project of the United Nations University, with an aim to create an interlingua for all major human languages. Hindi to UNL and UNL to Hindi is also available. Dr. Bhattacharya is working on MT system from English to Marathi and Bengali using UNL Although this type of translation gives a new area of translation specially for Indian Languages and it shows various concepts of the language divergence between English and Hindi Language but still it has some limitations like the system needs more database of the UW’s so as gather more world knowledge and the Hindi Generator of this system needs to be tested by the UNL expressions generated by the other language en-converters.

5. ANUVAADAK MT SystemSuper Info soft Pvt. Ltd. Delhi has developed a general purpose English-Hindi machine translation tool ANUVAADAK 10.0. For specific domains it has inbuilt dictionaries. In Windows family it runs on any operating system [16].

6. IBM-English-Hindi Machine Translation SystemIBM India Research Lab in 2006 [17-22], proposed to develop a MT based on Example Based approach and later on shifted to the Statistical approach for machine translation from English to Indian Languages.

7. English to {Hindi, Kannada, and Tamil} and Kannada to Tamil language pair EBMTIn 2006 [17-21], a MT system based on bilingual dictionary of sentence, phrases, words and phonetic dictionary was developed. Each dictionary contains parallel corpora for the language pair based on EBMT. EBMT has a set of 75000 commonly used sentences from English and translated into the target Indian Languages.

8. Google TranslateIn 2007, Franz-Josef Och applied the statistical machine translation approach for Google Translate from English to other Languages and vice-versa. Hindi and Urdu are the only Indian Languages present among the 57 Languages for which Google Translate provides translation. Accuracy of the system is good enough to understand the sentence after translation [25].

E. Machine Translation System for Bengali Language as a Source or Target Language

1. VAASAANUBAADA A Bilingual BengaliAssamese automatic machine translation system for translating the news texts by using the Example –Based Machine Translation (EBST) technique in 2002 was developed. In this the translation is done at sentence level. Some preprocessing and post processing work has to be done for the translation. The longer sentences were fragmented at punctuation, which gives high quality translations. Backtracking is used when exact match does not occur at the sentence level, which results in further fragmentation of the sentence [7].

2. ANUBAD Hybrid Machine Translation SystemIn 2004, a MT system for translating English news headlines to Bengali at Jadavpur University Kolkata [12], was developed and in 2005 S.G. Kumar developed EB-ANUBAD [14], system for translating English to Bengali language and it shows 98% correct results although the output was in English language and this was used to understand the Bengali language. The approach taken for the translation was the hybrid approach of rule based and transfer based with a parser for both morphological as well as for the Lexical parsing of the text. A brief description of the various machine translation systems is given below in the table.

Table 1: Indian Machine Translation Systems

Sr. No.

Name of the system

Languages for Translation

Approaches Used Domain YEAR

1 ANGLABHARTI-1 (IIT K) ENG-IL Pseudo-

interlingua General 1991

2 ANUSAARAKA(U H) IL-IL PG General 1195

3 MANTRA(C-DAC- P)

ENG-IL4 HINDI-EMB TAG Administration,

office orders 1999

4 VAASAANUBAADA(A U)

BENGALI-ASSAMESE EBMT NEWS 2002

5 ANGLABHARTI-II(IIT-K) ENGLISH-IL GEBMT General 2004

6 ANUBHARTI-II(IIT-K) HINDI-IL GEBMT General 2004

7 MATRACDAC-M

ENGLISH-HINDI

Transfer based General 2004

8 SHIVA & SHAKTI(IIIT-H, IIS-B) ENG- IL3 EBMT &

RBMT General 2004

9 UNL MTS(IIT-B) ENG-HINDI Interlingua General 2003

10 ANUBAD (J U) ENG-BENGALI

RBMT AND SMT NEWS 2004

11 HINGLISH (IIT-K) HINDI-ENG Pseudo interlingua General 2004

12ANUVAADAK (SUPER INFOSOFT)

ENG-IL Not-Available Not-Available

13 PUNJABI-HINDI(P U)

PUNJABI-HINDI

Direct word to word

General 2007

14SAMPARK(IIIT-H, CDAC-N,IIT-KGP, ANNA-U

IL-ILIL5 CPG Not-Available 2009

15 IBM MTS ENG- HINDI EBMT & SMT Not Available 2006

Page 4: IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC Pune, ... ISSN : 0976-8491 (Online) | ISSN : 2229-4333 ... D. Machine Translation System for

IJCST Vol. 3, ISSue 1, Jan. - MarCh 2012ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

w w w . i j c s t . c o m InternatIonal Journal of Computer SCIenCe and teChnology 289

IL4 Hindi, Bengali, Telugu, GujaratiIIT-B Indian Institute of Technology Bombay IIT-KGP Indian Institute of Technology KharagpurIIT-K Indian Institute of Technology Kanpur UH University of HyderabadPU Punjabi UniversityCDAC-N CDAC NoidaIL Indian LanguageJU Jadavpur UniversityANNA-U Anna UniversityIIS-B Indian Institute of Sciences BangaloreEBMT Example Based Machine TranslationENG EnglishSMT Statistical Machine Translation IL3 Hindi, Marathi, TeluguIL5 Punjabi, Urdu, Tamil, Marathi, HindiCPG Computational Paninian GrammarEMB ENG, MARATHI, BENGALICDAC-P CDAC Pune

III. ConclusionMachine translation now became the most encouraging field for the researchers in India. Lots of research projects are going on machine translation for the Indian languages and between English to Indian translation and vice versa. Now the time has arrived to do machine translation not only between English to Indian languages / Indian Languages to English, but needs to develop the machine translation for the languages like Indian to Chinese, Korean, Thai, Japanese etc. and vice versa so that the peoples of this region could understand the culture as well enhance their businesses to these countries with an ease of communication in their regional languages. Now the Indian Government is also sponsoring the various projects for removing the language barriers and developing the multilingual translators.

References[1] R.M.K. Sinha, K Sivaraman, Aditi Agrwal, Renu Jain, Rakesh

Srivastava, Ajai Jain,"ANGLABHARTI: A Multilingual Machine Aided Translation from English to Indian Languages, System", Man and Cybernetics. Intelligent Systems for 21ist Century, IEEE International Conference, pp. 1609, Vol-2, Oct 22-25, 1995.

[2] Akshar Bharati, Chaitanya Vineet, P. Amba Kulkarni, Rajeev Sangal, Anusaaraka,"Machine translation in Stages", A Quarterly in Artificial Intelligence, Vol. 10, No. 3, NCST, Mumbai, pp. 22-25, (July 1997).

[3] V.N. Narayana, Anusaraka,"A Device to Overcome the Language Barrier in India", Ph.D. thesis, Dept. of Computer Sc. and Engg., I.I.T. Kanpur, 1994.

[4] Sanjay Kumar Dwivedi, Pramod Premdas Sukhadeve, "Machine Translation System in Indian Perspective", Journal of Computer Science 6(10, pp. 1082-1087, 2010 Science Publications, 2010.

[5] Hemant Darbari,"Computer-assisted translation system – an Indian perspective", Machine Translation Summit VII, 13th-17th September Kent Ridge Digital Labs, Singapore. pp. 80-85, 1999.

[6] S.Bandyopadhyay,"Use of Machine Translation in India", AAMT J., 10, pp. 22-25, 2004.

[7] Vijayanand Kommaluri, Sirajul Islam Choudhury, Pranab Ratna,"VAASAANUBAADA-Automatic Machine Tran-slation of Bilingual Bengali-Assamese News Texts", Langu-age Engineering Conference. Hyderabad, India. 2002. [Online] Available: http://www.portal.acm.org/citation.cfm?id=788716]

[8] R.M.K. Sinha,"An Engineering Perspective of Machine Translation", AnglaBharti-II and AnuBharti-II Architectures. In proceedings of International Symposium on Machine Translation, NLP and Translation Support System (iSTRANS- 2004). November 17-19. Tata McGraw Hill, New Delhi. pp. 134-38, 2004.

[9] R Ananthakrishnan, M Kavitha, J Hegde Jayprasad, Chandra Shekhar, Ritesh Shah, Sawani Bade, M Sasikumar, "MaTra: A Practical Approach to Fully- Automatic Indicative English-Hindi Machine Translation", In the proceedings of MSPIL-06, 2006.

[10] R. Moona Bharati, P. Reddy, B. Sankar, D.M. Sharma, R. Sangal,"Machine Translation: The Shakti Approach. Pre-Conference Tutorial", ICON-2003. [Online] Availale: http://www.ebmt.serc.iisc.ernet.in/mt/login.html, [Online] Available: http://www.gdit.iiit.net/~mt/shakti

[11] Sivaji Bandyopadhyay,"Use of Machine Translation in India", AAMT Journal, 36. pp. 25-31, 2004.

[12] S. Bandyopadhyay,"ANUBAAD - The Translator from English to Indian Languages", In proceedings of the VIIth State Science and Technology Congress. Calcutta. India. pp. 43-51, 2004.

[13] Shachi Dave, Jigneshu Parikh, Pushpek Bhattacharyya, "Interlingua based English-Hindi Machine Translation and Language Divergence", [Online] Available: http://www.cse.iitb.ac.in/~pb/papers/eng-hindi-mt.ps

[14] Saha Gautam Kumar,"The EB-ANUBAD translator- A Hybride Scheme", Journal of Zhejjang University Science, pp. 1047-1050, 2005

[15] R. M. K. Sinha, Anil Thakur,"Machine Translation of bi-lingual Hindi-English (Hinglish) text in proceedings of the tenth Machine Translation Summit", MT Summit X, Phuket, Thailand, September 13-15. pp.149-156, 2005.

[16] English to Hindi Translator,"ANUVAADAK, Supertech Software and Hardware Pvt. Ltd.", [Online] Available: http://www.mysmartschool.com/pls/portal/portal.MSSStatic.ProductAnuvaadak

[17] D. Gupta, N. Chatterjee,"Identification of Divergence for English to Hindi EBMT", In proceedings of MT SUMMIT IX. New Orleans, Louisiana, USA. pp. 157-162, 2003.

[18] Sivaji Bandyopadhyay,"Teaching MT–An Indian Perspec-tive", In proceedings of the 6th EAMT Workshop on Teaching Machine Translation. Manchester. UK. pp. 13-22, 2002.

[19] Shachi Dave, Jignashu Parikh, Pushpak Bhattacharyya, "Interlingua-based English- Hindi Machine Translation and Language Divergence", Journal of Machine Translation, 16 (4). pp. 251-304, 2001.

[20] Sivaji Bandyopadhyay,"State and Role of Machine Translation in India", Machine Translation Review, 11. pp. 25-27, 2000.

[21] B. K.Murthy, W. R.Deshpande,"Language technology in India: past, present and future", In proceedings of MLIT Symposium 3. GII/GIS for Equal Language Opportunity. Vietnam. October 6-7. pp. 134-137, 1998.

Page 5: IJCST Vo l . 3, ISS ue 1, Jan . - M ISSN : 0976-8491 ... · PDF file... NCST Mumbai, CDAC Pune, ... ISSN : 0976-8491 (Online) | ISSN : 2229-4333 ... D. Machine Translation System for

IJCST Vol. 3, ISSue 1, Jan. - MarCh 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

w w w . i j c s t . c o m290 InternatIonal Journal of Computer SCIenCe and teChnology

[22] Raghavendra Udupa, Tanveer A. Faruquie,"An English-Hindi Statistical Machine Translation System", in proceedings of First International Joint Conference, Hainan Island, China, March 22-24, pp. 254-262, 2004.

[23] Sudhir K. Mishra,"Sanskrit Karaka Analyzer for Machine Translation", a Ph. D. Thesis, SCSS JNU New Delhi, 2007.

[24] G.S.Josan, G. S. Lehal,"A Punjabi to Hindi Machine Translation System", Coling 2008: Companion volume: Posters and Demonstrations, Manchester, UK, pp. 157-160, 2008.

[25] Och, Franz Josef, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, Association for Computational Linguistics, pp. 858-867, June 2007, [Online] Available: http://www.translate.google.com.,http://translate.google.com/about/intl/en_ALL/

[26] Tejinder Singh Saini, Gurpreet Singh, Lehal,"Shahmukhi to Gurmukhi Transliteration System", A Corpus based Approach, Advances in Natural Language Processing and Applications Research in Computing Science 33, pp. 151-162, 2008.

[27] Akashar Bharti, Amba Kulkarni,"Anusaarka: An Accessor cum Machine Translator", Department of Sanskrit Studies, University of Hyderabad, Hyderabad, 2009.

[28] Website source [Online] Available: http://www.sampark.iiit.ac.in, [Online] Available: http://www.tdil-dc.in/index.php?option=com _vertical & parentid=74

[29] G. S. Josan, G. S. Lehal,“A Punjabi to Hindi machine Translation System", Coling 2008: Companion volume: Posters and Demonstrations, Manchester, UK, pp. 157-160, 2008.

[30] Visahl Goyal, Language in India, Ph.D. Thesis on, "Development of a Hindi to Punjabi Machine Translation System" [Online] Available: http://www.languageinindia.com 599 10: 10 October 2010.

[31] Smriti Singh, Mrugank Dalal, Vishal Vachhani, Pushpek Bhattacharyya, Om. P. Damani,"Hindi Generation from Interlingua", MT Summit – Hindi -Generation-2007 . [Online] Available: http://www.cse.iitb.ac.in/~pb/papers/MTSummit-Hindi-Generation-2007.pdf.

[32] EnConverter Specification Version 2.1, UNU/IAS/UNL Centre, Tokyo 150-8304, Japan, 2000

[33] Vishal Goyal, Gurpreet Singh Lehal,"Web Based Hindi to Punjabi Machine Translation System", journal of emerging technologies in web intelligence, Vol. 2, May 2010.

Sitender received his B.E. Degree in Information Technology from M.D. University, Rohtak, India, in 2004, the M.Tech. Degree in Computer Engineering from M.D. University, Rohtak, India, in 2008. He was a lecturer, assistant professor with Department of Computer Science and Engineering and Information Technology, N.C.College of Engineering, Israna (Panipat), Haryana India in 2004-2006, 2008-2012. At

present, He is pursuing his Ph.D. degree from Thapar University, Patiala (India) in Machine Translation Field.