Critic SPL

    FORMAL LANGUAGES: ORIGINS AND DIRECTIONS

    S.A. GREIBACH

    Department of System Science
    University of California, Los Angeles, CA 90024

    1. Introduction

    My purpose is to survey the origins of the theory of formal languages and automata through 1964 and to indicate some of the main directions which the study of the subject has taken since then. In the discussion of origins, I shall concentrate on those I know best: the work on mathematical linguistics and automatic translation of the 50s and early 60s, which led me into the field. I had hoped to trace the developments since 1964 through the papers presented at the IEEE Symposia on the Foundations of Computer Science (initially, Switching Circuit Theory and Logical Design; later, Switching and Automata Theory) and the companion, parvenu, ACM Symposia on the Theory of Computing. This good intention was impossible to sustain, particularly as I moved from the exuberant 60s into the grim 70s. Long gone are the days in which abstracts of one's summer research and the theses of one's best students automatically appeared at the next Conference! Still I shall give pride of place to such contributions where possible.

    There are at least five (not altogether distinct) sources for the ideas developed in formal language theory. They are:

      • logic and recursive function theory
      • switching circuit theory [...]
      • modeling of biological systems, [...] developmental systems and [...]
      • mathematical and computational linguistics
      • computer programming and the [...] of ALGOL and other Problem Oriented Languages

    Recursive function theory is surveyed elsewhere in these Proceedings so I shall mention relevant ideas only in passing. The emphasis of this paper is on formal languages, so I shall describe only the related parts of the development of finite automata theory, skimping or skipping purely machine or system oriented topics.

    Phrase structure grammars were originally described by Chomsky [1959a] as a formalization of the Immediate Constituent (IC) analysis used by linguists in describing the morphology and syntax of natural languages. The work on machine translation at various institutions used various theories as bases for automatic syntactic analysis and appropriate programming techniques were devised. At the same time, similar methods of syntax specification and programming techniques were used in the definition and implementation of Problem Oriented Programming Languages. In the early 60s, all of these various threads were brought together as it was recognized, and then formally proved, that all these models defined the same class of languages, namely, the family of context-free (CF) phrase structure languages.

    † This paper was supported in part by the National Science Foundation under Grants NSF MCS-78-04725 and NSF MCS-79-01439.

    The rest of this paper is divided into three main sections chronologically: prior to 1956; 1956 to 1964; 1965 to the present. The year 1956 saw the publication of the tentative definition of phrase structure grammar [Chomsky, 1956], though the Chomsky hierarchy of languages and machines did not appear in print until 1959 [Chomsky, 1959]. The year 1964 is a watershed: the Chomsky hierarchy of languages, grammars and machines was completed. After that, formal language theory as a discipline diverged from mathematical or computational linguistics. Most (though not all) of the subsequent developments were inspired by ideas related to computer rather than natural languages and to computing and programming. Thus, from approximately 1965 on, one can regard formal language theory as a branch (originally the main one) of theoretical computer science.

    Past 1964, time and space allow me to touch on only a few themes and my references become a very sparse subset of the available literature. I shall, not unnaturally, dwell most on the directions in which my own research went.

    2. Before Chomsky: 1936-1955

    As every computer science student should know, the Chomsky hierarchy in its standard formulation consists of four classes of languages, each class properly containing the next: recursively enumerable, context-sensitive, context-free, regular. These classes can be defined by placing an increasingly strict set of restrictions on general rewriting systems or equivalently by considering the languages accepted by nondeterministic machines with an increasingly restricted type of data structure (or working tape): Turing machine, linear bounded automaton, pushdown store automaton, finite state automaton. The formal proof of equivalence of grammars and machines was completed in 1964, but some of the definitions are much older.
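
    The "increasingly strict set of restrictions" on rewriting rules can be made concrete with a small sketch. It is an illustration in modern textbook terms, not code from any of the papers surveyed, and it rests on simplifying assumptions: symbols are single characters, uppercase letters are nonterminals, type 3 is taken in right-linear form, type 1 in noncontracting form, and the usual technicality about empty right-hand sides is ignored. The function name chomsky_type is hypothetical.

```python
def chomsky_type(productions):
    """Return the most restrictive type (3, 2, 1 or 0) met by every rule.

    productions: list of (lhs, rhs) strings; uppercase letters are
    nonterminals (an assumption of this sketch).
    """
    def is_nonterminal(symbol):
        return symbol.isupper()

    def right_linear(lhs, rhs):
        # A -> w or A -> wB, where w contains terminals only.
        if not (len(lhs) == 1 and is_nonterminal(lhs)):
            return False
        body = rhs[:-1] if rhs and is_nonterminal(rhs[-1]) else rhs
        return not any(is_nonterminal(c) for c in body)

    def context_free(lhs, rhs):
        # A single nonterminal on the left, anything on the right.
        return len(lhs) == 1 and is_nonterminal(lhs)

    def noncontracting(lhs, rhs):
        # The right side is never shorter than the left side.
        return len(lhs) <= len(rhs)

    for level, allowed in ((3, right_linear), (2, context_free), (1, noncontracting)):
        if all(allowed(lhs, rhs) for lhs, rhs in productions):
            return level
    return 0  # unrestricted rewriting system


print(chomsky_type([("S", "aS"), ("S", "b")]))      # right-linear rules
print(chomsky_type([("S", "aSb"), ("S", "ab")]))    # context-free rules
print(chomsky_type([("S", "abc"), ("S", "aSBc"), ("cB", "Bc"), ("bB", "bb")]))
print(chomsky_type([("AB", "a")]))                  # a contracting rule
```

    The check is purely syntactic: it classifies the grammar as written, not the language it generates, which may belong to a smaller class.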

    CH1471-2/79/0000-0066$00.75 © 1979 IEEE

    The types of machines and grammars used to define the class of recursively enumerable languages were defined and studied in the 30s and 40s by Turing [1936] and Post [1943, 1947], respectively. Many of the notions I survey here are now presented in quite different formulations from those in the original papers. By contrast, Turing's definition of a "Turing machine", its justification and the construction of the first universal Turing machine, are little different from the notation we use now. The halting problem is not explicitly discussed; on the other hand, the tabular description of machines (frequently with added documentation or comments!) is more convenient than some I have seen in recent books. "No attempt has yet been made to show that the "computable" numbers include all numbers which would naturally be regarded as computable. All arguments that can be given are bound to be, fundamentally, appeals to intuition and, for this reason, rather unsatisfactory mathematically. The real question at issue is, 'What are the possible processes that can be carried out in computing a number?'" [Turing, 1936, p 249].†
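
    To give a feel for a tabular machine description, here is a minimal sketch of a simulator driven by a table of quintuples (state, scanned symbol, written symbol, move, next state). It uses a modern dictionary encoding rather than Turing's own notation; the names run_turing_machine and INCREMENT, the blank symbol "_" and the step limit are all assumptions made for this illustration.

```python
def run_turing_machine(table, tape, state="q0", blank="_", max_steps=1000):
    """Run a machine given as {(state, symbol): (write, move, next_state)}.

    The machine halts when no row applies to the current state and symbol.
    Returns the final state and the tape contents with outer blanks removed.
    """
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, blank)
        if (state, symbol) not in table:
            break                   # no applicable row: halt
        write, move, state = table[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    else:
        raise RuntimeError("step limit reached without halting")
    if not cells:
        return state, ""
    low, high = min(cells), max(cells)
    contents = "".join(cells.get(i, blank) for i in range(low, high + 1))
    return state, contents.strip(blank)


# Unary successor, with a comment on each row of the table.
INCREMENT = {
    ("q0", "1"): ("1", "R", "q0"),    # skip over the existing strokes
    ("q0", "_"): ("1", "R", "halt"),  # append one stroke, then stop
}

print(run_turing_machine(INCREMENT, "111"))  # ('halt', '1111')
```

    The step limit is only a convenience of the sketch; whether an arbitrary table ever halts cannot be decided in general.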

    The generation of sets by rewriting systems, called production systems, was formalized by Post [1943] who observed that the sets so generated are recursively enumerable sets and vice versa. The particular type of production system used by Chomsky was defined and termed "semi-Thue" by Post [1947] who gives the natural mapping from Turing machines (Post uses quadruples rather than Turing's quintuples) to semi-Thue systems. In 1936, Post gave, independently, an informal description of computation quite similar to Turing's and declared that proving enough formulations equivalent to recursiveness would change Church's thesis "not so much to a definition or to an axiom but to a natural law" [Post, 1936, p 105].
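
    A semi-Thue system, in the usual modern reading, is a finite set of rules u -> v that may be applied to any occurrence of u inside a string. The following sketch searches breadth-first for a derivation of one string from another. The function name derivable and the length bound max_len are assumptions of this illustration; the bound is what makes the search finite, since derivability in general is undecidable.

```python
from collections import deque


def derivable(rules, start, target, max_len=12):
    """Breadth-first search over strings reachable from start.

    rules: list of (lhs, rhs) pairs; a step replaces one occurrence of
    lhs by rhs. Strings longer than max_len are discarded, so a False
    answer only means "not derivable within the bound".
    """
    seen = {start}
    queue = deque([start])
    while queue:
        word = queue.popleft()
        if word == target:
            return True
        for lhs, rhs in rules:
            position = word.find(lhs)
            while position != -1:
                successor = word[:position] + rhs + word[position + len(lhs):]
                if len(successor) <= max_len and successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
                position = word.find(lhs, position + 1)
    return False


RULES = [("S", "aSb"), ("S", "ab")]
print(derivable(RULES, "S", "aaabbb"))  # True
print(derivable(RULES, "S", "aab"))     # False
```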

    Thus, the grammars (or generating systems) and machines at the "top" of the Chomsky hierarchy were known, and known to be equivalent, by 1947. Another paper by Post from this period later provided the key tool used in proving problems in formal language theory undecidable: the Correspondence Problem [Post, 1946]. (It is rumored that this paper was inspired by World War II experience in cryptography!)
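
    For readers who have not met the Correspondence Problem: given a list of pairs of strings, one asks for a nonempty sequence of indices such that concatenating the first components gives the same string as concatenating the second components. The sketch below is a brute-force search up to a fixed sequence length; the name pcp_solution and the bound max_len are assumptions of this illustration. Failure within the bound proves nothing, which is the point of the undecidability result.

```python
from itertools import product


def pcp_solution(pairs, max_len=6):
    """Return the first index sequence (shortest first) whose top and
    bottom concatenations agree, or None if none exists up to max_len."""
    for length in range(1, max_len + 1):
        for sequence in product(range(len(pairs)), repeat=length):
            top = "".join(pairs[i][0] for i in sequence)
            bottom = "".join(pairs[i][1] for i in sequence)
            if top == bottom:
                return list(sequence)
    return None


# A small textbook-style instance; indices are 0-based.
PAIRS = [("a", "baa"), ("ab", "aa"), ("bba", "bb")]
print(pcp_solution(PAIRS))           # [2, 1, 2, 0]: both sides spell bbaabbbaa
print(pcp_solution([("ab", "a")]))   # None: the top is always longer
```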

    † According to Professor I. J. Good, the first "Turing machine" was a data-processor, nicknamed the Bombe, used by ULTRA during World War II to crack the Enigma machine [Lewin, ULTRA Goes to War: The Secret Story, Hutchinson, 1978, p 58]. So Turing machines are very practical!

    At the "bottom" of the hierarchy is the family of regular languages first defined by Kleene [1951 and 1956]. Kleene's formulation is based on the paper of McCulloch and Pitts [1943] on the logical analysis of nervous activity in which certain assumptions regarding nerve nets were formulated. In Kleene's paper, finite automata are represented by nerve nets. Instead of words or strings, the paper discusses tables of input patterns tagged as initial or noninitial. A table represents $a_n \ldots a_1$ with $a_n$ at $t = 1$ if it is initial and, if noninitial, represents $a_n \ldots a_1$ with the location of $a_n$ in time unspecified. An event is represented by a set of tables. The concatenation EF of events E and F is the set of tables [...] where X is a table in E which is not initial and Y is in F (so in current notation [...]). The class of regular events is defined as the closure of the unit events and the empty set under union, concatenation and an operation we can call E*F. The correspondence between regular events and finite automata is established.
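
    In present-day terms the closure operations just named can be tried out on small sets of strings. The sketch below is an illustration in modern notation only: it works on strings read left to right rather than on Kleene's tables, it uses the now-standard unary star as a stand-in for the iteration operation, and it truncates every language at a length bound so that the sets stay finite. The function names and the bound max_len are assumptions of the sketch.

```python
def union(E, F):
    """All strings belonging to either language."""
    return E | F


def concat(E, F, max_len):
    """All strings xy with x in E and y in F, up to the length bound."""
    return {x + y for x in E for y in F if len(x + y) <= max_len}


def star(E, max_len):
    """Zero or more concatenated strings of E, up to the length bound."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend only the strings discovered in the previous round.
        frontier = concat(frontier, E, max_len) - result
        result |= frontier
    return result


# The language written a*b today, truncated at length 3.
print(sorted(concat(star({"a"}, 3), {"b"}, 3)))  # ['aab', 'ab', 'b']
```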

    The context-free phrase structure grammar is one attempt to formalize the notion of Immediate Constituent (IC) analysis, an approach to the description of the morphology and syntax of natural language advocated in the 40s and early 50s by American linguists such as Bloch, Harris and Wells. Bloch explains, "In analyzing a given sentence, we first isolate the immediate constituents of the sentence as a whole, then the constituents of each constituent, and so on to the ultimate constituents -- at every step choosing our constituents in such a way that the total number of different constructions will remain as small as possible" [Bloch, 1946]. And again, "When a word contains three or more morphemes, it is usually necessary to analyze it into two and only two IMMEDIATE CONSTITUENTS, either or both of which may be susceptible of further analysis" [Bloch and Trager, 1942].

    Various attempts were made at formalizing this procedure. In "From Morpheme to Utte