SAnskrit and NLP@Kanchi

  • 7/28/2019 SAnskrit and NLP@Kanchi

    1/38

    Sanskrit and Natural Language Processing

    Dr. Srinivasa Varakhedi
    Center for Advanced Studies and Research in Shabdabodha and NLP
    RASHTRIYA SANSKRIT VIDYAPEETHA DEEMED UNIVERSITY
    Tirupati (A.P.)

    2/38

    Dream of a bee..

    [Sanskrit verse in Devanagari; text not recoverable from the extraction]

    3/38

    Present situation of Sanskrit

    Sanskrit colleges are like a 'zoo'!
    No Govt. support unless we are productive
    Humanities and Languages are being neglected
    How far will this support continue?
    A great tradition of learning is being lost
    No scope for novel research

    4/38

    Innovation is the key

    Sanskrit Shastras are competent enough to enter the science world
    Move out of Humanities and get merged with science
    Analogy: Maths, Psychology, Logic.
    We must find a practical approach for these Sanskrit sciences.

    5/38

    We have lost 80%

    Meemamsa - No practical approach!

    Nyaya - No use in modern dialectics?

    Vyakarana - No application??

    What to do ?

    6/38

    Relevance of Sanskrit Shastras in Modern Technology

    Fortunately, these shastras are found relevant in today's technology
      Computing ideas in Panini
      Text processing principles in Meemamsa
      Formal languages in Nyaya
    We lack the technology and the application area
    Story of Babbage!!!

    7/38

    Message of Acharya Shankara Bhagavatpada

    avidyayaa mrtyum tiirtvaa..
    vidyayaa amrtamashnute.. - Ishavasya Upanishad

    Sri Shankara Bhagavatpada comments on this: avidyaa = karma; vidyaa = knowledge

    8/38

    Opportunity

    Emerging information technology has provided a great opportunity to survive
    [Devanagari text not recoverable from the extraction]
    Solve a major contemporary problem like machine translation (MT), basing the solution on the shastras
    Get new openings for Sanskritists
    Open a new avenue for research

    9/38

    Know How

    Ultimate aim: finding an appropriate place for Sanskrit Shastras
    Method: solutions to contemporary problems, adopting modern technology
    Resource needed: adequate manpower, who act as a bridge between modern scientists and technologists on one side and Sanskrit scholars on the other.

    10/38

    Change the scenario

    Technology

    Western Theories

    INDIAN THEORIES

    11/38

    Opportunities missed

    Industrial revolution
      We missed this with some hasty decisions
    IT revolution
      Indians are serving at the level of coding, not at the level of design!
    Knowledge revolution
      We should take advantage of this

    12/38

    Need of the hour

    We need
      to understand how technology works
      to understand the contemporary problems
    Then we will be able to give solutions in the light of the shastras and show the relevance of Indian theories

    13/38

    14/38

    Complexity of the problem

    Different goals: the two disciplines, Technology and Shastras, developed in different contexts
    Paradigm difference: modern scholars are accustomed to visual teaching methods; traditional Pandits, on the other hand, prefer the oral tradition
    Language barrier: neither side understands the other's language!
    The tuning in of the dialogue will take time

    15/38

    Who would bell the cat?

    It needs a long interaction between technologists and traditional Sanskrit scholars
    Technical institutions are always ready for such activities
    Not much interest is seen in Sanskrit institutions
    It is we Sanskritists who should bell the cat

    16/38

    A long process, like the extraction of ghee from milk

    No miracle happens in the initial stage
    It's a big challenge; one or two persons are not enough
    We need hundreds of dedicated persons to achieve a small goal
    A person can climb a small hill; a team can climb Everest

    17/38

    Identifying the problem

    Analogy: Brahman in the Upanishads
    What is Brahman?
      We can NOT show it, as it is imperceivable.
      We can NOT describe it, as it is beyond words.
    Hence, we can direct you towards it by way of negating what we know.
    [Devanagari text not recoverable from the extraction]

    18/38

    Possible areas

    Machine Translation
    Speech Processing
    Summary Extraction from huge texts
    Indo Wordnet as a base for IL-wordnets
    Developing Tools for IL Researchers
    Knowledge Representation schemes

    19/38

    Machine Translation

    English to Indian Languages
      Word sense disambiguation
      Karaka & Syntax Relation
      Word-grouping
      Idiomatic Expression
      Shabdasutra
    MT among Indian Languages
      Bi-language Electronic Dictionaries
      Karaka & Vibhakti Relation

    20/38

    Major MT systems: India

    Angla-Bharati, IIT Kanpur
    Shakti, IIIT Hyderabad
    Mantra, CDAC Pune
    SaHiT (Sanskrit Hindi Translator), CSS, JNU

    Anusaaraka (RSV, HCU, IIIT)

    21/38

    Major MT systems: Outside India

    UNITRAN
    BabelFish AltaVista (Systran)
    ATR (bimodal, Japan)
    JANUS (bimodal, US-Germany)
    SLT (SRI, Cambridge)
    VERBMOBIL (Germany)
    DIPLOMAT (Carnegie-Mellon)

    Get a 125-page directory of available MT systems at http://ourworld.compuserve.com/homepages/WJHutchins/Compendium-11.pdf

    22/38

    Summary Extraction

    Meemamsa principles applied to extract the summary of a text
    Upakramaadi Tatparya Lingas are used to extract the summary of a text at the Indian Institute of Science, Bangalore, under our consultancy.

    23/38

    Wordnet / Concept-net based on NN ontology

    Wordnet is an electronic lexical reference resource system designed on the basis of semantic relations of words
      Synonymy {Griha, nivaasa, ...}
      Hypernymy {Amra, vriksha, vanaspati}
      Antonymy {Shreemaan, akinchana}
      Meronymy {nAsika, mukha, shariira, ...}
      Gradation {Shushka, -tara, -tama}
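
    The relations listed on this slide can be stored as simple word-to-word links. The sketch below is only an illustration, not the Indo Wordnet data model: it assumes that the hypernymy and meronymy examples are ordered chains (mango, tree, plant; nose, face, body), and the names HYPERNYM_OF, PART_OF and follow_chain are hypothetical.

```python
# Minimal sketch of wordnet-style relations as plain dictionaries.
# Assumption: the slide's example sets are ordered chains, so each word
# points to the next, more general (or larger) item.

HYPERNYM_OF = {          # word -> its more general term
    "amra": "vriksha",       # mango -> tree
    "vriksha": "vanaspati",  # tree -> plant
}

PART_OF = {              # part -> whole (meronymy)
    "nasika": "mukha",       # nose -> face
    "mukha": "shariira",     # face -> body
}


def follow_chain(word, relation):
    """Return the word followed by everything reachable through `relation`."""
    chain = [word]
    seen = {word}
    while chain[-1] in relation:
        nxt = relation[chain[-1]]
        if nxt in seen:      # guard against accidental cycles in the data
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


if __name__ == "__main__":
    print(follow_chain("amra", HYPERNYM_OF))
    print(follow_chain("nasika", PART_OF))
```

    Keeping each relation in its own table means a lookup such as "is a mango a kind of plant?" reduces to walking one chain.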

    24/38

    Knowledge Engineering: Representation

    For data representation, several database management systems are available.
    For representing and retrieving useful information, there are various worked-out methodologies.
    Finally, Knowledge Representation needs special treatment, where Indian knowledge systems can be applied.

    25/38

    Knowledge and its importance in AI

    AI researchers are interested in building intelligent systems
    Web technologies are looking forward to the semantic web instead of the syntactic web
    Knowledge is more valuable than data and information
      Data: a simple DoB.
      Info: the age calculated from it.
      Knowledge: the judgment about suitability for the job at hand, etc. This requires a lot of inputs from various K-sources.
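
    The DoB example can be made concrete in a few lines. This is a minimal sketch: the date of birth is the data, the computed age is the information, and the suitability check stands in for knowledge. The age limits and the skill flag are invented for illustration; a real judgment would draw on many more knowledge sources.

```python
from datetime import date


def age_on(dob: date, today: date) -> int:
    """Information: whole years elapsed between the date of birth and `today`."""
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not yet occurred.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def suitable_for_job(age: int, has_required_skill: bool,
                     min_age: int = 21, max_age: int = 60) -> bool:
    """Knowledge (toy version): combine the age with other inputs.

    The limits and the single skill flag are hypothetical placeholders.
    """
    return min_age <= age <= max_age and has_required_skill


if __name__ == "__main__":
    dob = date(1990, 7, 28)                   # data
    age = age_on(dob, date(2019, 7, 28))      # information
    print(age, suitable_for_job(age, has_required_skill=True))  # knowledge
```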

    26/38

    Computational Linguistics and Panini's Grammar

    The structure of Paninian Grammar is nothing but a computer program - Babbage!
    It has captured the base of universal principles of all languages
    CL requires formal rules for analysis and generation of language
    Slowly, Chomsky and others are turning towards Panini

    27/38

    The System of Panini

    Phonetic component
      Phonemes
      pratyahara
    Rule base
      Vidhi (operations)
      Samjna
      paribhasha (metarules)
      adhikara (headings)
      atidesha (extension)
      niyama (restriction)
    Lexicon
      Dhatupaatha
      Ganapaatha
    Lists
      Affixes
      Rule-specific items
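
    A pratyahara names a class of sounds by its first member and a closing marker in the Shiva Sutras. The sketch below shows how such an abbreviation can be expanded mechanically. The Shiva Sutras are not reproduced on the slide; only the first four (the vowels) are included here, in IAST transliteration, and the function name expand_pratyahara is hypothetical.

```python
# First four Shiva Sutras (vowels only): each entry is (sounds, closing marker).
# Assumption: IAST spelling; the remaining ten sutras are omitted for brevity.
SHIVA_SUTRAS = [
    (["a", "i", "u"], "ṇ"),
    (["ṛ", "ḷ"], "k"),
    (["e", "o"], "ṅ"),
    (["ai", "au"], "c"),
]


def expand_pratyahara(start, marker):
    """List every sound from `start` up to the sutra that ends in `marker`.

    Markers themselves are never part of the result.
    """
    sounds = []
    collecting = False
    for members, end_marker in SHIVA_SUTRAS:
        for sound in members:
            if sound == start:
                collecting = True
            if collecting:
                sounds.append(sound)
        if collecting and end_marker == marker:
            return sounds
    raise ValueError(f"no pratyahara {start!r}..{marker!r} in this table")


if __name__ == "__main__":
    print(expand_pratyahara("a", "c"))   # the class 'ac'
    print(expand_pratyahara("i", "k"))   # the class 'ik'
```

    With this table, 'ac' expands to all nine vowels and 'ik' to i, u, ṛ, ḷ, which is the sense in which the grammar's sound classes behave like compact range expressions.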

    28/38

    Paninian Model for Sentence Analysis

    Action: central theme
    Karakas: syntactico-semantic roles
    Visheshana-Visheshyabhava
    Concept of anabhihite in switching to a different voice
    Vivakshaa: intention of the speaker
    Form and meaning

    29/38

    Navya Nyaya -> AI?

    Classify Nyaya into five parts:
      1. Ontology
      2. Epistemology
      3. Technical Language
      4. Semantics
      5. Art of debate and fallacies

    30/38

    Ontology

    Includes
      Categories: Substance, Quality, etc.
      Relations: SamavAya, SvarUpa
      Universals: types or classes
    Ontology helps various areas like NLP, K-Repr and K-Engg, and especially the cognitive sciences.

    31/38

    Epistemology

    Deals with
      Cognitive process
      Cognitive structure
    It helps to solve the problems of the cognitive sciences and K-repr.

    32/38

    Technical Language

    NNL is a restricted language that has both features: the power of mechanism of artificial languages and the power of expression of natural languages.
    The basic ideas behind this language will be helpful in Knowledge Representation.

    33/38

    Semantics

    The way of analysing semantics shown by the Navya Naiyayikas has been found crucially helpful in NLP and Machine Translation
      E.g. classification of words: rUdha, yoga
      Syntactical analysis
      Power of definitions
      KR & NN

    34/38

    Semantics in MT

    Lexicography
      Word/concept nets based on NN ontology
    Classification of padas (words)
      Rudha: a word that has convention, i.e. names
      Yougik: a word that has etymological meaning, e.g. cook, driver
      Yoga-rudha: a word that has etymology as well as convention, e.g. CD-driver
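
    The three-way classification of padas depends on two yes/no properties: whether the word's meaning follows from its etymology and whether it is fixed by convention. A minimal sketch of that decision is given below; the enum and function names are hypothetical, and deciding the two flags for a real word is the hard lexicographic work that this sketch does not attempt.

```python
from enum import Enum


class WordClass(Enum):
    RUDHA = "rudha"            # meaning fixed by convention only (e.g. names)
    YOUGIK = "yougik"          # meaning derived from etymology only
    YOGA_RUDHA = "yoga-rudha"  # etymology narrowed by convention


def classify_pada(has_etymology: bool, has_convention: bool) -> WordClass:
    """Map the two properties from the slide onto the three word classes."""
    if has_etymology and has_convention:
        return WordClass.YOGA_RUDHA
    if has_etymology:
        return WordClass.YOUGIK
    if has_convention:
        return WordClass.RUDHA
    raise ValueError("a pada needs an etymological or a conventional meaning")


if __name__ == "__main__":
    # 'driver' (one who drives) is etymological; 'CD-driver' adds a convention.
    print(classify_pada(has_etymology=True, has_convention=False))
    print(classify_pada(has_etymology=True, has_convention=True))
```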

    35/38

    WSD using different techniques

    Definitions of Karaka relations without any overlap
      Kartrtvam = kriyAnukUlakritimattvam
      Karmatvam = para-samaveta-kriyA-janya-phala-Ashrayatvam
    Going: Rama and Forest
      Who is going where?
      The result, contact, is possible in Rama too.
      To avoid such overlap, this definition is useful
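
    The overlap problem in the Rama and forest example can be sketched in code. The sketch uses a deliberately simplified reading of the two definitions: the karta is the participant with effort conducive to the action, and the karma is the locus of the result of an action that inheres in something else. The Participant fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    has_effort: bool     # kriti: effort conducive to the action
    hosts_action: bool   # the action (kriyA) inheres in this participant
    hosts_result: bool   # the result (phala), here contact, resides in it


def is_karta(p: Participant) -> bool:
    return p.has_effort


def is_karma_naive(p: Participant) -> bool:
    # Overlapping definition: any locus of the result counts as karma.
    return p.hosts_result


def is_karma(p: Participant) -> bool:
    # Refined definition: the result must come from an action
    # that inheres in something other than this participant.
    return p.hosts_result and not p.hosts_action


def assign_roles(participants):
    roles = {}
    for p in participants:
        if is_karta(p):
            roles[p.name] = "karta"
        elif is_karma(p):
            roles[p.name] = "karma"
    return roles


# "Rama goes to the forest": contact arises in both Rama and the forest.
rama = Participant("Rama", has_effort=True, hosts_action=True, hosts_result=True)
forest = Participant("forest", has_effort=False, hosts_action=False, hosts_result=True)

if __name__ == "__main__":
    print(is_karma_naive(rama), is_karma(rama))
    print(assign_roles([rama, forest]))
```

    Under the naive test Rama would also qualify as karma, because contact resides in him too; the extra condition on where the action inheres removes that overlap.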

    36/38

    Refinement of karaka Relations

    Classification of Karma
      Karma: reachable, understandable, and so on.
    Analysis of root semantics
      Leave: He left the place / left from the place
    Analysis of expectancy (AkAnkshA)
      Rats killed cats

    37/38

    "To"-infinitive relation

    I stand up to speak
    I want to speak
    He goes to London to study law
    He wants to study law in London
    To walk in the mornings is good for health

    38/38

    Special thanks to

    The authorities of Sri Chandrashekharendra Sarasvati Vishvamahavidyalaya, Kanchipuram

    Namaste!