Lexical Analyser Parser

  • 8/2/2019 Lexical Analyser Parser

    1/37

    LEXICAL ANALYZER AND PARSER

    COMPILER

    A compiler is a program that takes a program written in a source language and translates it into an equivalent program in a target language.

    source program -> COMPILER -> target program
                         |
                         v
                   error messages

    Source program: normally a program written in a high-level programming language.

    Target program: normally the equivalent program in machine code (relocatable object file).

    PHASES OF A COMPILER

    Source Program -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Intermediate Code Generator -> Code Optimizer -> Code Generator -> Target Program

    Each phase transforms the source program from one representation into another representation.

    They communicate with error handlers.

    They communicate with the symbol table.

    LEXICAL ANALYZER

    INTRODUCTION

    A lexical analyzer breaks an input stream of characters into tokens. Programs that perform lexical analysis are called lexical analyzers or lexers.

    A lexer consists of a scanner and a tokenizer.

    Writing lexical analyzers by hand can be a tedious process, so software tools have been developed to ease this task.

    Perhaps the best known such utility is Lex. Lex is a lexical analyzer generator for the UNIX operating system, targeted to the C programming language.

    ROLE OF THE LEXICAL ANALYZER

    INTRODUCING BASIC TERMINOLOGY

    What are the major terms for lexical analysis?

    TOKEN

    A classification for a common set of strings

    Examples Include , , etc.

    PATTERN

    The rules which characterize the set of strings for a token

    Recall File and OS Wildcards ([A-Z]*.*)

    LEXEME

    Actual sequence of characters that matches a pattern and is classified by a token

    Identifiers: x, count, name, etc.
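
    The three terms above can be tied together in a small C sketch. The names TokenKind, matches_identifier, matches_number, and classify_lexeme are hypothetical and chosen only for illustration, and the two patterns (identifiers and unsigned decimal numbers) are assumptions, not a complete language definition. Each pattern is a rule, the string passed in is the lexeme, and the returned value is the token class.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token classes (hypothetical names chosen for this sketch). */
typedef enum { TOK_IDENTIFIER, TOK_NUMBER, TOK_UNKNOWN } TokenKind;

/* Pattern for identifiers: a letter or underscore, followed by
   letters, digits, or underscores. */
static int matches_identifier(const char *s) {
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_')) return 0;
    for (size_t i = 1; s[i] != '\0'; i++)
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_')) return 0;
    return 1;
}

/* Pattern for numbers: one or more decimal digits. */
static int matches_number(const char *s) {
    if (s[0] == '\0') return 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        if (!isdigit((unsigned char)s[i])) return 0;
    return 1;
}

/* Classify a lexeme by testing it against each pattern in turn. */
TokenKind classify_lexeme(const char *lexeme) {
    assert(lexeme != NULL);
    if (matches_identifier(lexeme)) return TOK_IDENTIFIER;
    if (matches_number(lexeme)) return TOK_NUMBER;
    return TOK_UNKNOWN;
}
```

    Under these assumed patterns, the lexemes x, count, and name would all be classified as TOK_IDENTIFIER, while a string such as 9lives matches neither pattern.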

    The input program as you see it.

    main ()

    {

    int i, sum;

    sum = 0;

    for (i=1; i

    LEXICAL ANALYZER RESPONSIBILITIES

    Lexical analyzer [Scanner]

    Scan input

    Remove white space, tabs, and newline characters

    Remove comments

    Manufacture tokens

    Generate lexical errors

    Pass token to parser
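
    The responsibilities listed above can be sketched as a small hand-written scanner in C. This is a minimal illustration under stated assumptions, not the lexer that Lex would generate: the names TokKind, Tok, and next_token are hypothetical, only C-style block comments are recognized, and every punctuation character is treated as its own one-character token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { T_IDENT, T_NUMBER, T_PUNCT, T_ERROR, T_EOF } TokKind;

typedef struct {
    TokKind kind;
    const char *start;  /* first character of the lexeme inside the input */
    size_t length;      /* number of characters in the lexeme */
} Tok;

/* Return the next token at or after *cursor and advance *cursor past it.
   The caller (for example a parser) calls this repeatedly until T_EOF. */
Tok next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Remove white space and block comments before the next lexeme. */
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (p[0] == '/' && p[1] == '*') {
            const char *q = p + 2;
            while (*q != '\0' && !(q[0] == '*' && q[1] == '/')) q++;
            if (*q == '\0') {   /* unterminated comment: lexical error */
                Tok err = { T_ERROR, p, (size_t)(q - p) };
                *cursor = q;
                return err;
            }
            p = q + 2;          /* continue after the closing delimiter */
            continue;
        }
        break;
    }

    Tok t = { T_EOF, p, 0 };
    if (*p == '\0') { *cursor = p; return t; }

    /* Manufacture a token from the longest run that fits one pattern. */
    const char *q = p;
    if (isalpha((unsigned char)*q) || *q == '_') {
        while (isalnum((unsigned char)*q) || *q == '_') q++;
        t.kind = T_IDENT;
    } else if (isdigit((unsigned char)*q)) {
        while (isdigit((unsigned char)*q)) q++;
        t.kind = T_NUMBER;
    } else if (ispunct((unsigned char)*q)) {
        q++;
        t.kind = T_PUNCT;
    } else {
        q++;
        t.kind = T_ERROR;       /* character that fits no pattern */
    }
    t.length = (size_t)(q - p);
    *cursor = q;
    return t;
}
```

    A parser would keep a cursor into the source text and call next_token(&cursor) each time it needs another token, stopping when the kind is T_EOF or handling T_ERROR as a lexical error.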

    LEX INTRODUCTION

    Lex is one of the compiler-writing tools. It is used to generate a lexical analyzer (scanner) from a description of the tokens of the programming language to be implemented.

    Lex takes a specially formatted specification file containing the details of a lexical analyzer. The tool then creates a C source file for the associated table-driven lexer.

    LEX SPECIFICATION

    The input to Lex is a text file containing regular expressions, along with the actions to be taken by the generated scanner when each regular expression is matched.

    The output is a file that contains C source code defining the procedure yylex(), which implements a DFA corresponding to the regular expressions given in the input file.

    The output file is usually called lex.yy.c or lexyy.c. When compiled and linked with the main program, it acts as a scanner (lexical analyzer) that recognizes the tokens specified by the regular expressions in the input file.
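
    To make the idea of a table-driven DFA concrete, here is a hand-built C sketch. It is not the code Lex emits: the state names, character classes, table contents, and the function dfa_longest_match are all assumptions made for illustration. The DFA recognizes two patterns, identifiers and unsigned integers, and returns the length of the longest prefix that reaches an accepting state.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* DFA states and input character classes (hypothetical names). */
enum { S_START, S_IDENT, S_NUMBER, S_DEAD, NUM_STATES };
enum { C_LETTER, C_DIGIT, C_OTHER, NUM_CLASSES };

/* transition[state][class] gives the next state. */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* S_START  */ { S_IDENT, S_NUMBER, S_DEAD },
    /* S_IDENT  */ { S_IDENT, S_IDENT,  S_DEAD },
    /* S_NUMBER */ { S_DEAD,  S_NUMBER, S_DEAD },
    /* S_DEAD   */ { S_DEAD,  S_DEAD,   S_DEAD },
};

/* accepting[state] is nonzero when the state recognizes a token. */
static const int accepting[NUM_STATES] = { 0, 1, 1, 0 };

/* Map a character to its class; underscore counts as a letter here. */
static int char_class(int c) {
    if (isalpha(c) || c == '_') return C_LETTER;
    if (isdigit(c)) return C_DIGIT;
    return C_OTHER;
}

/* Run the DFA over s. Returns the length of the longest accepted prefix
   (0 if none) and stores the accepting state reached in *final_state,
   or S_DEAD when no prefix is accepted. */
size_t dfa_longest_match(const char *s, int *final_state) {
    assert(s != NULL && final_state != NULL);
    int state = S_START;
    size_t best = 0;
    *final_state = S_DEAD;
    for (size_t i = 0; s[i] != '\0'; i++) {
        state = transition[state][char_class((unsigned char)s[i])];
        if (state == S_DEAD) break;          /* no further match possible */
        if (accepting[state]) {              /* remember the longest match */
            best = i + 1;
            *final_state = state;
        }
    }
    return best;
}
```

    The scanning loop never inspects the patterns themselves; all pattern knowledge lives in the tables. A generator can therefore keep the driver code fixed and only produce different tables for different specifications.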

    LEX SPECIFICATIONS

    A Lex input file consists of three parts: a collection of definitions, a collection of rules, and a collection of user subroutines. These three sections are separated by double-percent directives ("%%").

    A proper Lex specification has the following format.

    {definition}

    %%

    {rules}

    %%

    {user subroutines}

    The definitions and the user subroutines are often omitted. The second %% is optional, but the first is required to mark the beginning of the rules.

    MAIN FEATURES

    Simple implementation.

    Fast lexical analysis.

    Efficient resource utilization.

    Portable.

    APPLICATIONS AND FUTURE WORK

    Text Editing

    Text Processing

    Pattern Matching

    File Searching

    PARSER

    PARSING

    Parsing (syntactic analysis) is the process of analyzing a sequence of tokens to determine their grammatical structure with respect to a given (more or less) formal grammar.
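
    As a small illustration of checking input against a formal grammar, here is a recursive-descent sketch in C. Recursive descent is a hand-written technique swapped in for readability; it is not the table-driven LALR method that a generator such as Yacc uses. The grammar (expr, term, factor over integers, +, -, *, and parentheses) and the names Parser and parse_expression are assumptions for this example, and for brevity the code reads characters directly instead of taking tokens from a separate lexer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct {
    const char *pos;  /* next unread character */
    int ok;           /* cleared on the first syntax error */
} Parser;

static long parse_expr(Parser *ps);

static void skip_spaces(Parser *ps) {
    while (isspace((unsigned char)*ps->pos)) ps->pos++;
}

/* factor -> NUMBER | '(' expr ')' */
static long parse_factor(Parser *ps) {
    skip_spaces(ps);
    if (isdigit((unsigned char)*ps->pos)) {
        long v = 0;
        while (isdigit((unsigned char)*ps->pos))
            v = v * 10 + (*ps->pos++ - '0');
        return v;
    }
    if (*ps->pos == '(') {
        ps->pos++;
        long v = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->pos == ')') ps->pos++;
        else ps->ok = 0;          /* missing closing parenthesis */
        return v;
    }
    ps->ok = 0;                   /* neither a number nor '(' */
    return 0;
}

/* term -> factor ( '*' factor )* */
static long parse_term(Parser *ps) {
    long v = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        if (!ps->ok || *ps->pos != '*') return v;
        ps->pos++;
        v *= parse_factor(ps);
    }
}

/* expr -> term ( ('+' | '-') term )* */
static long parse_expr(Parser *ps) {
    long v = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        if (!ps->ok || (*ps->pos != '+' && *ps->pos != '-')) return v;
        char op = *ps->pos++;
        long rhs = parse_term(ps);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Returns 1 and stores the value if the whole input matches the grammar,
   otherwise returns 0 and leaves *value untouched. */
int parse_expression(const char *input, long *value) {
    assert(input != NULL && value != NULL);
    Parser ps = { input, 1 };
    long v = parse_expr(&ps);
    skip_spaces(&ps);
    if (!ps.ok || *ps.pos != '\0') return 0;
    *value = v;
    return 1;
}
```

    Each grammar rule maps to one function, so the call structure mirrors the grammatical structure of the input. Inputs such as an unbalanced parenthesis or a trailing operator are rejected because no rule can consume them.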

    YACC SPECIFICATION

    Yacc (Yet Another Compiler-Compiler) is a parser generator: a program that takes as its input a specification of the syntax of a programming language and produces as its output a parse procedure for that language, named yyparse().

    The notation used for preparing this specification is a context-free grammar (CFG).

    The input to Yacc is a specification file, usually with a .y suffix, containing the grammar rules that specify the structure of the language to be implemented. The output is C source code for the parser, usually in a file named y.tab.c or ytab.c.

    FORMAT OF SPECIFICATION FILE

    { definition }

    %%

    { rules }

    %%

    { programs }

    The definition section contains information about tokens, data types, and grammar rules. It also includes any C code that must go directly into the beginning of the output file.

    CREDITS

    Credits goes out to

    A special thanks goes out to

    THANK YOU!