Compiler Construction Project 2 Christian Mann November 19, 2012

Compiler Construction Project 2 - University of Tulsa (personal.utulsa.edu/~christian-mann/report2.pdf)

Compiler Construction

Project 2

Christian Mann

November 19, 2012


Contents

Introduction

Methodology
    Grammar Transformations
    Parse Table

Implementation
    Preliminaries
    Ambiguity
    Nullable Productions
    Removal of Left Recursion
    Left Factoring
    First and Follow Sets
    Parse Table Construction
        Dangling Else Ambiguity
    Recursive Descent Parser
    Synchronization Mode

Discussion and Conclusions
    Testing

Grammar Transformations
    Initial
    Left Recursion Removed
    Left Factoring Performed

Sample Inputs and Outputs
    Minimal Example
        Pascal Code
        Parse Tree
    Blank File (Invalid)
        Pascal Code
        Listing File

Program Listings
    C code (generated)
        parser.c
    Python code
        massage.py
        firstfollow.py
        table.py

Introduction

This is the second phase of the front end of a compiler for a limited subset of the Pascal language. This phase involves the implementation of the syntax analyzer. The analyzer employs recursive descent parsing based on an LL(1) grammar and processes the tokens obtained by invoking the lexical analyzer developed in Project 1.

In addition to identifying and reporting syntax errors (via the listing file) and suggesting the appropriate corrections, the syntax analyzer also implements error recovery using synchronizing tokens.

Methodology

Grammar Transformations

To build an LL(1) recursive descent parser, it is necessary to first convert the grammar to LL(1) form. This is accomplished by performing the following grammar transformations:

1. Removal of ambiguity

2. Elimination of nullable productions

3. Elimination of immediate and deep left recursion

4. Left factoring

All of these processes will be explained later in the report.

Parse Table

After transforming the grammar to LL(1) form, it is possible to compute the First and Follow sets for each nonterminal, so that the parse table can be constructed. Given the parse table, any input string can be parsed in linear time, and the parser has the viable-prefix property: it reports an error at the first token that cannot continue a valid program.

Implementation

Preliminaries

To start, the grammar was transcribed into a digital form, in the form of a Python data structure. The essential data structure is a list of (α, β) tuples, where α is a string and β is a list of strings.

Throughout this section, V will refer to the set of variables, T will refer to the set of terminals, S will refer to the start (top-level) variable, and P will refer to the set of productions.
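As an illustration, a fragment of the grammar in this (α, β) form might look as follows. This is only a sketch: the helper names variables and terminals are hypothetical, and representing ε as an empty list is an assumption rather than something the report states.

```python
# Hypothetical fragment of the grammar as a list of (alpha, beta) tuples.
# Assumption: an empty beta list would stand for an epsilon production.
productions = [
    ("sign", ["-"]),
    ("sign", ["+"]),
    ("term", ["factor"]),
    ("term", ["term", "mulop", "factor"]),
    ("factor", ["id"]),
    ("factor", ["num"]),
    ("factor", ["not", "factor"]),
]


def variables(prods):
    """V: every symbol that appears on a left-hand side."""
    return {alpha for alpha, _ in prods}


def terminals(prods):
    """T: every right-hand-side symbol that is not a variable."""
    v = variables(prods)
    return {sym for _, beta in prods for sym in beta if sym not in v}


print(sorted(variables(productions)))
print(sorted(terminals(productions)))
```

With this representation V and T never need to be stored separately; both can be derived from P whenever a transformation changes the grammar.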

Ambiguity

The language we were given was not inherently ambiguous, but the grammar was ambiguous, exhibiting the dangling else ambiguity:

statement
    → if expression then statement else statement
    → if expression then statement

Since this was the only ambiguity in the grammar, we simply chose to deal with it after construction of the parse table.


Nullable Productions

This transformation is only necessary if the grammar exhibits left recursion, immediate or deep. To perform this transformation, it is first necessary to identify the variables that are nullable (i.e. Vε = {v ∈ V : v ⇒* ε}). This is accomplished with a flood-fill algorithm: a variable is marked nullable once one of its productions has a right-hand side made up only of nullable variables (or is ε), and the marking is repeated until no new variable is added.

Once the set of nullable variables has been identified, the algorithm proceeds:

1. P = P − {(α, β) : β = ε}

2. For each (α, β) ∈ P: if any variable in β is in Vε, add a new rule to P with that variable removed at that position.

3. Repeat these steps until the grammar does not change on an iteration.
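The steps above can be sketched in Python on the (α, β) representation. This is a minimal sketch, not the code from massage.py: the function names are hypothetical, ε is assumed to be an empty list, and the sketch assumes that a shortened rule is never added when it would itself be an ε-production.

```python
def nullable_variables(prods):
    """Fixed-point ("flood-fill") computation of the nullable variables."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for alpha, beta in prods:
            # An empty beta is epsilon; all() over an empty list is True.
            if alpha not in nullable and all(s in nullable for s in beta):
                nullable.add(alpha)
                changed = True
    return nullable


def eliminate_epsilon(prods):
    """Remove epsilon productions, adding the shortened variants they imply."""
    nullable = nullable_variables(prods)
    # Step 1: drop the epsilon productions themselves.
    result = [(a, list(b)) for a, b in prods if b]
    changed = True
    while changed:
        changed = False
        for alpha, beta in list(result):
            for i, sym in enumerate(beta):
                if sym in nullable:
                    shorter = beta[:i] + beta[i + 1:]
                    # Assumption: never re-introduce an epsilon production.
                    if shorter and (alpha, shorter) not in result:
                        result.append((alpha, shorter))
                        changed = True
    return result


example = [("S", ["a", "A", "b"]), ("A", []), ("A", ["c"])]
print(nullable_variables(example))
print(eliminate_epsilon(example))
```

The nullable set is computed once, before any production is deleted, so the information that a variable used to derive ε is not lost when its ε-production is dropped.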

Removal of Left Recursion

v ∈ V exhibits immediate left recursion if:

v → vγ

v ∈ V exhibits deep left recursion if:

v ⇒* wσ ⇒* vγ

The grammar does not exhibit deep left recursion, so we will not discuss its algorithm for removal.

Elimination of immediate left recursion follows the form:

v → vα | β

becomes

v → βv′

v′ → αv′ | ε

This rule is applied for each variable.

This is fairly straightforward to implement; the only tricky bit is creating v′ from v while maintaining uniqueness. This is achieved by repeatedly appending a “shiv” (such as an underscore) to the variable name until a new name (one that is not contained in V) is obtained.
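A possible Python sketch of this transformation, including the shiv-based naming, is shown below. The function names are hypothetical and ε is assumed to be an empty list; note that the new variable also needs the alternative v′ → ε, which is what terminates the repetition.

```python
def fresh_name(base, taken, shiv="_"):
    """Append the shiv until the name collides with nothing in `taken`."""
    name = base + shiv
    while name in taken:
        name += shiv
    return name


def remove_immediate_left_recursion(prods):
    names = {a for a, _ in prods} | {s for _, b in prods for s in b}
    order = list(dict.fromkeys(a for a, _ in prods))  # first-seen order
    result = []
    for v in order:
        bodies = [b for a, b in prods if a == v]
        alphas = [b[1:] for b in bodies if b and b[0] == v]  # v -> v alpha
        betas = [b for b in bodies if not b or b[0] != v]    # v -> beta
        if not alphas:
            result.extend((v, b) for b in bodies)
            continue
        v_prime = fresh_name(v, names)
        names.add(v_prime)
        result.extend((v, beta + [v_prime]) for beta in betas)
        result.extend((v_prime, alpha + [v_prime]) for alpha in alphas)
        result.append((v_prime, []))  # v' -> epsilon
    return result


term = [("term", ["term", "mulop", "factor"]), ("term", ["factor"])]
print(remove_immediate_left_recursion(term))
```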

Left Factoring

Left factoring is the following substitution:

v → αβ | αγ

becomes

v → αv′

v′ → β | γ

This substitution is performed as many times as possible.

This algorithm is also fairly straightforward. To keep the resulting grammar small, it is helpful to always choose the longest α possible.
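One way to sketch this substitution in Python is given below. It is an illustration under assumptions rather than the report's implementation: the function names are hypothetical, new variables are named by appending "1" (an arbitrary choice of shiv), and the prefix α is chosen as the longest prefix shared by any two alternatives of a variable.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol lists."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor_once(prods):
    """Apply one substitution v -> alpha v'; return (new_prods, changed)."""
    names = {a for a, _ in prods} | {s for _, b in prods for s in b}
    for v in dict.fromkeys(a for a, _ in prods):
        bodies = [b for a, b in prods if a == v]
        best = []  # longest prefix shared by at least two alternatives
        for i in range(len(bodies)):
            for j in range(i + 1, len(bodies)):
                p = common_prefix(bodies[i], bodies[j])
                if len(p) > len(best):
                    best = p
        if not best:
            continue
        v_prime = v + "1"  # assumption: "1" is used as the shiv here
        while v_prime in names:
            v_prime += "1"
        n = len(best)
        result, emitted = [], False
        for a, b in prods:
            if a == v and b[:n] == best:
                if not emitted:  # keep v -> alpha v' at the first match
                    result.append((v, best + [v_prime]))
                    emitted = True
            else:
                result.append((a, b))
        tails = [(v_prime, b[n:]) for b in bodies if b[:n] == best]
        return result + tails, True
    return prods, False


def left_factor(prods):
    changed = True
    while changed:
        prods, changed = left_factor_once(prods)
    return prods


stmt = [
    ("statement", ["if", "e", "then", "s"]),
    ("statement", ["if", "e", "then", "s", "else", "s"]),
    ("statement", ["while", "e", "do", "s"]),
]
print(left_factor(stmt))
```

Applied to the two if alternatives, the sketch produces an ε alternative and an else alternative for the new variable, which is the same shape as statement′ in the factored grammar.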


First and Follow Sets

Let $ denote the end-of-file token.

First(α) = {a ∈ T : α ⇒* aγ} ∪ {ε : α ⇒* ε}

follow(v) = {a ∈ T : S ⇒* σvaγ} ∪ {$ : S ⇒* γv}

The First sets are computed, once again, by repeating the following steps until no changes occur:

1. x ∈ T ⟹ First(x) = {x}

2. x → ε ⟹ ε ∈ First(x)

3. v → y1γ ⟹ First(y1) − {ε} ⊆ First(v)

4. v → y1y2γ ∧ ε ∈ First(y1) ⟹ First(y2) − {ε} ⊆ First(v)

5. The above step can be extended along the whole right-hand side, for as long as every preceding symbol can derive ε.

6. v → y1y2...yn ∧ (∀yi)(ε ∈ First(yi)) ⟹ ε ∈ First(v)

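Computation of the follow sets goes:

1. $ ∈ follow(S)

2. A → αBy1γ ⟹ First(y1) − {ε} ⊆ follow(B)

3. A → αBy1y2γ ∧ ε ∈ First(y1) ⟹ First(y2) − {ε} ⊆ follow(B)

4. The above step can be extended along the rest of the right-hand side, for as long as every preceding symbol can derive ε.

5. A → αBy1y2...yn ∧ (∀yi)(ε ∈ First(yi)) ⟹ follow(A) ⊆ follow(B) (this includes the case n = 0, where B ends the right-hand side)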

Again, the steps are repeated until no changes are made.
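Both fixed-point computations can be sketched in Python as follows. This is not the code of firstfollow.py; the function names and the constants EPS and EOF are hypothetical, and ε-productions are assumed to be stored as empty lists.

```python
EPS = "ε"
EOF = "$"


def first_of_string(symbols, first):
    """First set of a string of symbols, given the per-symbol First sets."""
    out = set()
    for sym in symbols:
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)  # every symbol (or the empty string) can derive epsilon
    return out


def compute_first(prods):
    variables = {a for a, _ in prods}
    symbols = variables | {s for _, b in prods for s in b}
    first = {s: (set() if s in variables else {s}) for s in symbols}
    changed = True
    while changed:
        changed = False
        for alpha, beta in prods:
            new = first_of_string(beta, first)
            if not new <= first[alpha]:
                first[alpha] |= new
                changed = True
    return first


def compute_follow(prods, start, first):
    variables = {a for a, _ in prods}
    follow = {v: set() for v in variables}
    follow[start].add(EOF)
    changed = True
    while changed:
        changed = False
        for alpha, beta in prods:
            for i, sym in enumerate(beta):
                if sym not in variables:
                    continue
                rest = first_of_string(beta[i + 1:], first)
                new = rest - {EPS}
                if EPS in rest:  # everything after sym can vanish
                    new |= follow[alpha]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


grammar = [
    ("term", ["factor", "term_"]),
    ("term_", ["mulop", "factor", "term_"]),
    ("term_", []),
    ("factor", ["id"]),
    ("factor", ["num"]),
]
first = compute_first(grammar)
follow = compute_follow(grammar, "term", first)
print(first["term_"], follow["factor"])
```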

Parse Table Construction

The parse table is a two-dimensional table M, indexed on variables (V) and terminals (T). Given the First and follow sets, the construction of the parse table is as follows:

1. A → β ∧ c ∈ First(β) − {ε} ⟹ M[A][c] = β

2. A → β ∧ ε ∈ First(β) ∧ c ∈ follow(A) ⟹ M[A][c] = β (in particular, A → ε gives M[A][c] = ε)

For an LL(1) grammar, these rules never assign two different productions to the same entry. All entries left empty are syntax errors.

Dangling Else Ambiguity

Our grammar is ambiguous, as described above. This is exhibited in the parse table: there are two valid definitions for M[statement′][else]:

M [statement′][else] = ε

M [statement′][else] = else statement

We chose to simply special-case this ambiguity by selecting the second definition for this project. As a consequence, each else clause binds to the closest (latest) unmatched if clause.
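The table construction and the special case above can be sketched as follows. This is an illustration, not the code of table.py: build_table is a hypothetical name, the First and follow sets for the small if/else fragment are written out by hand, and the variable statement1 stands in for statement′.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """First set of a string of symbols, given the per-symbol First sets."""
    out = set()
    for sym in symbols:
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)
    return out


def build_table(prods, first, follow):
    """Return (M, conflicts); M[A][c] is a right-hand side (list of symbols)."""
    table = {a: {} for a, _ in prods}
    conflicts = []
    for alpha, beta in prods:
        f = first_of_string(beta, first)
        lookahead = f - {EPS}
        if EPS in f:
            lookahead |= follow[alpha]
        for c in lookahead:
            existing = table[alpha].get(c)
            if existing is None:
                table[alpha][c] = beta
            elif existing != beta:
                # Two productions compete for one entry: not LL(1) here.
                conflicts.append((alpha, c, existing, beta))
    return table, conflicts


prods = [
    ("statement", ["if", "expr", "then", "statement", "statement1"]),
    ("statement", ["other"]),
    ("statement1", []),
    ("statement1", ["else", "statement"]),
]
# Hand-written sets for this fragment (terminals map to themselves).
first = {t: {t} for t in ("if", "expr", "then", "else", "other")}
first.update({"statement": {"if", "other"}, "statement1": {"else", EPS}})
follow = {"statement": {"$", "else"}, "statement1": {"$", "else"}}

table, conflicts = build_table(prods, first, follow)
print(conflicts)
# Special case: prefer the else production, so else binds to the nearest if.
table["statement1"]["else"] = ["else", "statement"]
```

Recording conflicts instead of silently overwriting an entry makes the single expected conflict visible, so that any additional one would be noticed rather than masked.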


Recursive Descent Parser

Construction of the recursive descent parser is simple, given the infrastructure of Project 1 (the lexical analyzer and tokenizer). A global variable currTerm holds the next terminal to be parsed, i.e. the next one in the stream.

Parsing a nonterminal is quite easy. To parse a nonterminal A, for instance, look up M[A][currTerm], and parse (or match) each item in the list in turn. If this entry is empty (not the empty string, but rather a spot that is undefined), then the parser enters synchronization mode.

To match a terminal type, the type is compared with the type of the current terminal. If they match, then a new terminal is obtained. If they do not match, then the parser enters synchronization mode (more on that later).

The function consume(NonTerminal nt) implements this functionality using a large switch-case statement on nt, in which each case contains a switch-case statement on currTerm. The parse table is embedded in the file itself; there is no exterior storage mechanism. This will be helpful in future projects, which will require code insertions in between these steps.

For now, deduplication is performed on the inner level, using C's case fallthrough: lookahead terminals that select the same production share one case body. This reduces the amount of generated code by a large factor.
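The following Python sketch shows how one row of the parse table could be turned into such a case block, grouping lookahead terminals by the production they select. It is a guess at the shape of the generator, not the actual script: emit_case and the sample row are hypothetical, and the output only imitates the style of the parser.c listing.

```python
def emit_case(nt, row, variables):
    """Emit the C case block for one parse-table row.

    row maps a lookahead terminal to a right-hand side (list of symbols);
    terminals that select the same right-hand side share one body.
    """
    groups = {}
    for term in sorted(row):
        groups.setdefault(tuple(row[term]), []).append(term)

    label = f"nt_{nt}_synch"
    expected = ", ".join(f"T_{t.upper()}" for t in sorted(row))
    out = [f"case NT_{nt.upper()}:", "    switch (currTerm.type) {"]
    for body, terms in groups.items():
        # Consecutive case labels fall through to a single shared body.
        out += [f"    case T_{t.upper()}:" for t in terms]
        for sym in body:
            if sym in variables:
                out.append(f"        consume(NT_{sym.upper()});")
            else:
                out.append(
                    f"        if (!match(T_{sym.upper()}, nt)) goto {label};"
                )
        out.append("        break;")
    out += [
        "    default:",
        f"        synerr((int[]){{{expected}}}, {len(row)}, currTerm);",
        f"    {label}:",
        "        synch(nt);",
        "        break;",
        "    }",
        "    break;",
    ]
    return "\n".join(out)


row = {
    "addop": [],
    "mulop": [],
    "lbrack": ["lbrack", "expression", "rbrack"],
}
code = emit_case("factor1", row, {"expression"})
print(code)
```

In this sample, the two lookaheads that select the ε production collapse into one body, which is where the reduction in generated code comes from.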

Synchronization Mode

When the parser encounters an error, such as an undefined parse table entry or a denied match request, it attempts to recover after printing an appropriate error message to the screen.

The parser employs panic-mode recovery, a simple technique that bails out of the current nonterminal and skips input until it finds a safe symbol at which to resume parsing. We place all of the symbols in follow(A) into synch(A), in addition to $.
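A compact table-driven model of this behaviour is sketched below in Python. It mirrors the roles of consume, match and synch but is not a translation of parser.c: the Parser class, the tiny statement-list grammar, its table and its follow sets are all hypothetical, and tokens are reduced to bare terminal names.

```python
EOF = "$"


class Parser:
    def __init__(self, table, follow, variables, tokens):
        self.table, self.follow, self.variables = table, follow, variables
        self.tokens = list(tokens) + [EOF]
        self.pos = 0
        self.errors = []

    @property
    def curr(self):
        return self.tokens[self.pos]

    def advance(self):
        if self.curr != EOF:
            self.pos += 1

    def synch(self, nt):
        """Panic mode: skip input up to a token in follow(nt) or $."""
        synch_set = self.follow[nt] | {EOF}
        while self.curr not in synch_set:
            self.advance()

    def match(self, terminal):
        if self.curr == terminal:
            self.advance()
            return True
        self.errors.append(f"unexpected {self.curr}; expected {terminal}")
        return False

    def consume(self, nt):
        body = self.table[nt].get(self.curr)
        if body is None:  # undefined parse-table entry
            expected = ", ".join(sorted(self.table[nt]))
            self.errors.append(f"unexpected {self.curr}; expected: {expected}")
            self.synch(nt)
            return
        for sym in body:
            if sym in self.variables:
                self.consume(sym)
            elif not self.match(sym):
                self.synch(nt)  # bail out of the current nonterminal
                return


VARIABLES = {"stmt_list", "stmt_list_", "stmt"}
TABLE = {
    "stmt_list": {"id": ["stmt", "stmt_list_"]},
    "stmt_list_": {";": [";", "stmt", "stmt_list_"], EOF: []},
    "stmt": {"id": ["id", ":=", "num"]},
}
FOLLOW = {"stmt_list": {EOF}, "stmt_list_": {EOF}, "stmt": {";", EOF}}

bad = Parser(TABLE, FOLLOW, VARIABLES, ["id", ":=", ";", "id", ":=", "num"])
bad.consume("stmt_list")
print(bad.errors)
```

In the erroneous input the first assignment is missing its right-hand side; the parser reports one error, resumes at the semicolon (a member of follow(stmt)), and parses the second assignment normally.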

Discussion and Conclusions

Testing

Unit testing was very important to ensure the correctness of the program. To that end, programs were written that rigorously tested each production on edge and corner cases, for both valid and erroneous inputs. In addition, many sample Pascal programs were written and the program as a whole was verified to work correctly.

References

[1] Aho, Sethi, and Ullman. Compilers: Principles, Techniques, and Tools. 1st ed. Pearson Education, Inc: 2006. Print.


Grammar Transformations

Initial

arguments
    → ε
    → ( parameter list )

compound statement
    → begin optional statements end

declarations
    → ε
    → declarations var id : type ;

expression
    → simple expression
    → simple expression relop simple expression

expression list
    → expression
    → expression list , expression

factor
    → id
    → id [ expression ]
    → ( expression )
    → not factor
    → num

identifier list
    → identifier list , id
    → id

optional statements
    → ε
    → statement list

parameter list
    → parameter list ; id : type
    → id : type

procedure statement
    → call id
    → call id ( expression list )

program
    → program id ( identifier list ) ; declarations subprogram declarations compound statement .

sign
    → -
    → +

simple expression
    → sign term
    → simple expression addop term
    → term

standard type
    → integer
    → real

statement
    → compound statement
    → procedure statement
    → variable := expression
    → if expression then statement
    → if expression then statement else statement
    → while expression do statement

statement list
    → statement
    → statement list ; statement

subprogram declaration
    → subprogram head declarations compound statement
    → subprogram head declarations subprogram declarations compound statement

subprogram declarations
    → ε
    → subprogram declarations subprogram declaration ;

subprogram head
    → procedure id arguments ;

term
    → factor
    → term mulop factor

type
    → standard type
    → array [ num .. num ] of standard type

variable
    → id
    → id [ expression ]

Left Recursion Removed

arguments
    → ( parameter list )

compound statement
    → begin optional statements end
    → begin end

declarations
    → var id : type ; declarations′

declarations′
    → ε
    → var id : type ; declarations′

expression
    → simple expression
    → simple expression relop simple expression

expression list
    → expression expression list′

expression list′
    → ε
    → , expression expression list′

factor
    → id
    → id [ expression ]
    → ( expression )
    → not factor
    → num

identifier list
    → id identifier list′

identifier list′
    → ε
    → , id identifier list′

optional statements
    → statement list

parameter list
    → id : type parameter list′

parameter list′
    → ε
    → ; id : type parameter list′

procedure statement
    → call id
    → call id ( expression list )

program
    → program id ( identifier list ) ; compound statement .
    → program id ( identifier list ) ; declarations compound statement .
    → program id ( identifier list ) ; declarations subprogram declarations compound statement .
    → program id ( identifier list ) ; subprogram declarations compound statement .

sign
    → -
    → +

simple expression
    → sign term simple expression′
    → term simple expression′

simple expression′
    → ε
    → addop term simple expression′

standard type
    → integer
    → real

statement
    → compound statement
    → procedure statement
    → variable := expression
    → if expression then statement
    → if expression then statement else statement
    → while expression do statement

statement list
    → statement statement list′

statement list′
    → ε
    → ; statement statement list′

subprogram declaration
    → subprogram head compound statement
    → subprogram head declarations compound statement
    → subprogram head declarations subprogram declarations compound statement
    → subprogram head subprogram declarations compound statement

subprogram declarations
    → subprogram declaration ; subprogram declarations′

subprogram declarations′
    → ε
    → subprogram declaration ; subprogram declarations′

subprogram head
    → procedure id arguments ;
    → procedure id ;

term
    → factor term′

term′
    → ε
    → mulop factor term′

type
    → standard type
    → array [ num .. num ] of standard type

variable
    → id
    → id [ expression ]

Left Factoring Performed

arguments
    → ( parameter list )

compound statement
    → begin compound statement′

compound statement′
    → optional statements end
    → end

declarations
    → var id : type ; declarations′

declarations′
    → ε
    → var id : type ; declarations′

expression
    → simple expression expression′

expression′
    → ε
    → relop simple expression

expression list
    → expression expression list′

expression list′
    → ε
    → , expression expression list′

factor
    → id factor′
    → ( expression )
    → not factor
    → num

factor′
    → ε
    → [ expression ]

identifier list
    → id identifier list′

identifier list′
    → ε
    → , id identifier list′

optional statements
    → statement list

parameter list
    → id : type parameter list′

parameter list′
    → ε
    → ; id : type parameter list′

procedure statement
    → call id procedure statement′

procedure statement′
    → ε
    → ( expression list )

program
    → program id ( identifier list ) ; program′′

program′
    → compound statement .
    → subprogram declarations compound statement .

program′′
    → compound statement .
    → declarations program′
    → subprogram declarations compound statement .

sign
    → -
    → +

simple expression
    → sign term simple expression′
    → term simple expression′

simple expression′
    → ε
    → addop term simple expression′

standard type
    → integer
    → real

statement
    → compound statement
    → procedure statement
    → variable := expression
    → if expression then statement statement′
    → while expression do statement

statement′
    → ε
    → else statement

statement list
    → statement statement list′

statement list′
    → ε
    → ; statement statement list′

subprogram declaration
    → subprogram head subprogram declaration′′

subprogram declarations
    → subprogram declaration ; subprogram declarations′

subprogram declarations′
    → ε
    → subprogram declaration ; subprogram declarations′

subprogram declaration′
    → compound statement
    → subprogram declarations compound statement

subprogram declaration′′
    → compound statement
    → declarations subprogram declaration′
    → subprogram declarations compound statement

subprogram head
    → procedure id subprogram head′

subprogram head′
    → arguments ;
    → ;

term
    → factor term′

term′
    → ε
    → mulop factor term′

type
    → standard type
    → array [ num .. num ] of standard type

variable
    → id variable′

variable′
    → ε
    → [ expression ]


Sample Inputs and Outputs

Minimal Example

Pascal Code

program anemic(input);
begin
end.

Parse Tree

NT_PROGRAM
  T_PROGRAM
    "program"
  T_ID
    "anemic"
  T_LPAREN
    "("
  NT_IDENTIFIER_LIST
    T_ID
      "input"
    NT_IDENTIFIER_LIST_
  T_RPAREN
    ")"
  T_SEMICOLON
    ";"
  NT_PROGRAM11
    NT_COMPOUND_STATEMENT
      T_BEGIN
        "begin"
      NT_COMPOUND_STATEMENT1
        T_END
          "end"
    T_PERIOD
      "."

Blank File (Invalid)

Pascal Code

(empty file)

Listing File

1. SYNERR, column 2: Unexpected T_EOF; expected: T_PROGRAM


Program Listings

C code (generated)

parser.c

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include "types.h"
#include "machines.h"
#include "parser.h"

/* Current nesting depth, used to indent the parse-tree output. */
int depth = 0;

/* X-macro: map each NonTerminal value to its name as a string. */
#define X(a) case a: return #a;
char* ntToString(NonTerminal nt) {
    switch (nt) {
        NONTERMS
    }
    return "";
}
#undef X

void parse(void);
void consume(NonTerminal);
int match(int, NonTerminal);
void synch(NonTerminal);
void synerr(int*, int, Terminal);
void lexerr(Terminal);
Terminal nextTerminal(void);

Terminal currTerm;
FILE *fSrc, *fTree, *fList;
char sLine[90] = {0};
char *psLine;
int cLine;
int cColumn;

void parse() {
    NonTerminal top_nonterminal = NT_PROGRAM;
    currTerm = nextTerminal();
    consume(top_nonterminal);
}

/* Parse nonterminal nt: the outer switch selects the parse-table row,
   the inner switch selects the production from the current terminal. */
void consume(NonTerminal nt) {
    for (int i = 0; i < depth; i++) fprintf(fTree, " ");
    fprintf(fTree, "%s\n", ntToString(nt));
    depth++;
    switch (nt) {
    case NT_ARGUMENTS:
        switch (currTerm.type) {
        case T_LPAREN:
            if (!match(T_LPAREN, nt)) goto nt_arguments_synch;
            consume(NT_PARAMETER_LIST);
            if (!match(T_RPAREN, nt)) goto nt_arguments_synch;
            break;
        default:
            synerr((int[]){T_LPAREN}, 1, currTerm);
        nt_arguments_synch:
            synch(nt);
            break;
        }
        break;
    case NT_COMPOUND_STATEMENT:
        switch (currTerm.type) {
        case T_BEGIN:
            if (!match(T_BEGIN, nt)) goto nt_compound_statement_synch;
            consume(NT_COMPOUND_STATEMENT1);
            break;
        default:
            synerr((int[]){T_BEGIN}, 1, currTerm);
        nt_compound_statement_synch:
            synch(nt);
            break;
        }
        break;
    case NT_COMPOUND_STATEMENT1:
        switch (currTerm.type) {
        case T_BEGIN:
        case T_CALL:
        case T_ID:
        case T_IF:
        case T_WHILE:
            consume(NT_OPTIONAL_STATEMENTS);
            if (!match(T_END, nt)) goto nt_compound_statement1_synch;
            break;
        case T_END:
            if (!match(T_END, nt)) goto nt_compound_statement1_synch;
            break;
        default:
            synerr((int[]){T_ID, T_IF, T_WHILE, T_BEGIN, T_CALL, T_END}, 6,
                   currTerm);
        nt_compound_statement1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_DECLARATIONS:
        switch (currTerm.type) {
        case T_VAR:
            if (!match(T_VAR, nt)) goto nt_declarations_synch;
            if (!match(T_ID, nt)) goto nt_declarations_synch;
            if (!match(T_COLON, nt)) goto nt_declarations_synch;
            consume(NT_TYPE);
            if (!match(T_SEMICOLON, nt)) goto nt_declarations_synch;
            consume(NT_DECLARATIONS_);
            break;
        default:
            synerr((int[]){T_VAR}, 1, currTerm);
        nt_declarations_synch:
            synch(nt);
            break;
        }
        break;
    case NT_DECLARATIONS_:
        switch (currTerm.type) {
        case T_BEGIN:
        case T_PROCEDURE:
        case T_SEMICOLON:
            break;
        case T_VAR:
            if (!match(T_VAR, nt)) goto nt_declarations__synch;
            if (!match(T_ID, nt)) goto nt_declarations__synch;
            if (!match(T_COLON, nt)) goto nt_declarations__synch;
            consume(NT_TYPE);
            if (!match(T_SEMICOLON, nt)) goto nt_declarations__synch;
            consume(NT_DECLARATIONS_);
            break;
        default:
            synerr((int[]){T_BEGIN, T_VAR, T_PROCEDURE, T_SEMICOLON}, 4, currTerm);
        nt_declarations__synch:
            synch(nt);
            break;
        }
        break;
    case NT_EXPRESSION:
        switch (currTerm.type) {
        case T_ID:
        case T_LPAREN:
        case T_MINUS:
        case T_NOT:
        case T_NUM:
        case T_PLUS:
            consume(NT_SIMPLE_EXPRESSION);
            consume(NT_EXPRESSION1);
            break;
        default:
            synerr((int[]){T_ID, T_LPAREN, T_NUM, T_PLUS, T_MINUS, T_NOT}, 6,
                   currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_EXPRESSION1:
        switch (currTerm.type) {
        case T_COMMA:
        case T_DO:
        case T_ELSE:
        case T_END:
        case T_RBRACK:
        case T_RPAREN:
        case T_SEMICOLON:
        case T_THEN:
            break;
        case T_RELOP:
            if (!match(T_RELOP, nt)) goto nt_expression1_synch;
            consume(NT_SIMPLE_EXPRESSION);
            break;
        default:
            synerr((int[]){T_THEN, T_ELSE, T_COMMA, T_SEMICOLON, T_RBRACK, T_DO,
                           T_RPAREN, T_RELOP, T_END}, 9, currTerm);
        nt_expression1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_EXPRESSION_LIST:
        switch (currTerm.type) {
        case T_ID:
        case T_LPAREN:
        case T_MINUS:
        case T_NOT:
        case T_NUM:
        case T_PLUS:
            consume(NT_EXPRESSION);
            consume(NT_EXPRESSION_LIST_);
            break;
        default:
            synerr((int[]){T_ID, T_LPAREN, T_NUM, T_PLUS, T_MINUS, T_NOT}, 6,
                   currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_EXPRESSION_LIST_:
        switch (currTerm.type) {
        case T_COMMA:
            if (!match(T_COMMA, nt)) goto nt_expression_list__synch;
            consume(NT_EXPRESSION);
            consume(NT_EXPRESSION_LIST_);
            break;
        case T_RPAREN:
            break;
        default:
            synerr((int[]){T_COMMA, T_RPAREN}, 2, currTerm);
        nt_expression_list__synch:
            synch(nt);
            break;
        }
        break;
    case NT_FACTOR:
        switch (currTerm.type) {
        case T_ID:
            if (!match(T_ID, nt)) goto nt_factor_synch;
            consume(NT_FACTOR1);
            break;
        case T_LPAREN:
            if (!match(T_LPAREN, nt)) goto nt_factor_synch;
            consume(NT_EXPRESSION);
            if (!match(T_RPAREN, nt)) goto nt_factor_synch;
            break;
        case T_NOT:
            if (!match(T_NOT, nt)) goto nt_factor_synch;
            consume(NT_FACTOR);
            break;
        case T_NUM:
            if (!match(T_NUM, nt)) goto nt_factor_synch;
            break;
        default:
            synerr((int[]){T_ID, T_NUM, T_LPAREN, T_NOT}, 4, currTerm);
        nt_factor_synch:
            synch(nt);
            break;
        }
        break;
    case NT_FACTOR1:
        switch (currTerm.type) {
        case T_ADDOP:
        case T_COMMA:
        case T_DO:
        case T_ELSE:
        case T_END:
        case T_MULOP:
        case T_RBRACK:
        case T_RELOP:
        case T_RPAREN:
        case T_SEMICOLON:
        case T_THEN:
            break;
        case T_LBRACK:
            if (!match(T_LBRACK, nt)) goto nt_factor1_synch;
            consume(NT_EXPRESSION);
            if (!match(T_RBRACK, nt)) goto nt_factor1_synch;
            break;
        default:
            synerr((int[]){T_THEN, T_ELSE, T_COMMA, T_SEMICOLON, T_LBRACK, T_ADDOP,
                           T_DO, T_RPAREN, T_RELOP, T_MULOP, T_RBRACK, T_END}, 12, currTerm);
        nt_factor1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_IDENTIFIER_LIST:
        switch (currTerm.type) {
        case T_ID:
            if (!match(T_ID, nt)) goto nt_identifier_list_synch;
            consume(NT_IDENTIFIER_LIST_);
            break;
        default:
            synerr((int[]){T_ID}, 1, currTerm);
        nt_identifier_list_synch:
            synch(nt);
            break;
        }
        break;
    case NT_IDENTIFIER_LIST_:
        switch (currTerm.type) {
        case T_COMMA:
            if (!match(T_COMMA, nt)) goto nt_identifier_list__synch;
            if (!match(T_ID, nt)) goto nt_identifier_list__synch;
            consume(NT_IDENTIFIER_LIST_);
            break;
        case T_RPAREN:
            break;
        default:
            synerr((int[]){T_COMMA, T_RPAREN}, 2, currTerm);
        nt_identifier_list__synch:
            synch(nt);
            break;
        }
        break;
    case NT_OPTIONAL_STATEMENTS:
        switch (currTerm.type) {
        case T_BEGIN:
        case T_CALL:
        case T_ID:
        case T_IF:
        case T_WHILE:
            consume(NT_STATEMENT_LIST);
            break;
        default:
            synerr((int[]){T_ID, T_BEGIN, T_IF, T_WHILE, T_CALL}, 5, currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_PARAMETER_LIST:
        switch (currTerm.type) {
        case T_ID:
            if (!match(T_ID, nt)) goto nt_parameter_list_synch;
            if (!match(T_COLON, nt)) goto nt_parameter_list_synch;
            consume(NT_TYPE);
            consume(NT_PARAMETER_LIST_);
            break;
        default:
            synerr((int[]){T_ID}, 1, currTerm);
        nt_parameter_list_synch:
            synch(nt);
            break;
        }
        break;
    case NT_PARAMETER_LIST_:
        switch (currTerm.type) {
        case T_RPAREN:
            break;
        case T_SEMICOLON:
            if (!match(T_SEMICOLON, nt)) goto nt_parameter_list__synch;
            if (!match(T_ID, nt)) goto nt_parameter_list__synch;
            if (!match(T_COLON, nt)) goto nt_parameter_list__synch;
            consume(NT_TYPE);
            consume(NT_PARAMETER_LIST_);
            break;
        default:
            synerr((int[]){T_RPAREN, T_SEMICOLON}, 2, currTerm);
        nt_parameter_list__synch:
            synch(nt);
            break;
        }
        break;
    case NT_PROCEDURE_STATEMENT:
        switch (currTerm.type) {
        case T_CALL:
            if (!match(T_CALL, nt)) goto nt_procedure_statement_synch;
            if (!match(T_ID, nt)) goto nt_procedure_statement_synch;
            consume(NT_PROCEDURE_STATEMENT1);
            break;
        default:
            synerr((int[]){T_CALL}, 1, currTerm);
        nt_procedure_statement_synch:
            synch(nt);
            break;
        }
        break;
    case NT_PROCEDURE_STATEMENT1:
        switch (currTerm.type) {
        case T_ELSE:
        case T_END:
        case T_SEMICOLON:
            break;
        case T_LPAREN:
            if (!match(T_LPAREN, nt)) goto nt_procedure_statement1_synch;
            consume(NT_EXPRESSION_LIST);
            if (!match(T_RPAREN, nt)) goto nt_procedure_statement1_synch;
            break;
        default:
            synerr((int[]){T_ELSE, T_LPAREN, T_SEMICOLON, T_END}, 4, currTerm);
        nt_procedure_statement1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_PROGRAM:
        switch (currTerm.type) {
        case T_PROGRAM:
            if (!match(T_PROGRAM, nt)) goto nt_program_synch;
            if (!match(T_ID, nt)) goto nt_program_synch;
            if (!match(T_LPAREN, nt)) goto nt_program_synch;
            consume(NT_IDENTIFIER_LIST);
            if (!match(T_RPAREN, nt)) goto nt_program_synch;
            if (!match(T_SEMICOLON, nt)) goto nt_program_synch;
            consume(NT_PROGRAM11);
            break;
        default:
            synerr((int[]){T_PROGRAM}, 1, currTerm);
        nt_program_synch:
            synch(nt);
            break;
        }
        break;
    case NT_PROGRAM1:
        switch (currTerm.type) {
        case T_BEGIN:
            consume(NT_COMPOUND_STATEMENT);
            if (!match(T_PERIOD, nt)) goto nt_program1_synch;
            break;
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_DECLARATIONS);
            consume(NT_COMPOUND_STATEMENT);
            if (!match(T_PERIOD, nt)) goto nt_program1_synch;
            break;
        default:
            synerr((int[]){T_BEGIN, T_PROCEDURE}, 2, currTerm);
        nt_program1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_PROGRAM11:
        switch (currTerm.type) {
        case T_BEGIN:
            consume(NT_COMPOUND_STATEMENT);
            if (!match(T_PERIOD, nt)) goto nt_program11_synch;
            break;
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_DECLARATIONS);
            consume(NT_COMPOUND_STATEMENT);
            if (!match(T_PERIOD, nt)) goto nt_program11_synch;
            break;
        case T_VAR:
            consume(NT_DECLARATIONS);
            consume(NT_PROGRAM1);
            break;
        default:
            synerr((int[]){T_BEGIN, T_VAR, T_PROCEDURE}, 3, currTerm);
        nt_program11_synch:
            synch(nt);
            break;
        }
        break;
    case NT_SIGN:
423 switch(currTerm.type) {

424 case T_MINUS:

425 if(! match(T_MINUS , nt)) goto nt_sign_synch;

426 break;

427 case T_PLUS:

428 if(! match(T_PLUS , nt)) goto nt_sign_synch;

429 break;

430 default:

431 synerr ((int[]){T_PLUS , T_MINUS}, 2, currTerm);

432 nt_sign_synch:

433 synch(nt);

434 break;

435 }

436 break;

437 case NT_SIMPLE_EXPRESSION:

438 switch(currTerm.type) {

439 case T_ID:

440 case T_LPAREN:

441 case T_NOT:

442 case T_NUM:

443 consume(NT_TERM);

444 consume(NT_SIMPLE_EXPRESSION_);

445 break;

446 case T_MINUS:

447 case T_PLUS:

448 consume(NT_SIGN);

449 consume(NT_TERM);

450 consume(NT_SIMPLE_EXPRESSION_);

451 break;

452 default:

453 synerr ((int[]){T_ID , T_LPAREN , T_NUM , T_PLUS , T_MINUS , T_NOT}, 6,

currTerm);

454 synch(nt);

455 break;

456 }

457 break;

458 case NT_SIMPLE_EXPRESSION_:

459 switch(currTerm.type) {

460 case T_ADDOP:

461 if(! match(T_ADDOP , nt)) goto nt_simple_expression__synch;

462 consume(NT_TERM);

463 consume(NT_SIMPLE_EXPRESSION_);

464 break;

465 case T_COMMA:

466 case T_DO:

467 case T_ELSE:

468 case T_END:

469 case T_RBRACK:

470 case T_RELOP:

471 case T_RPAREN:

472 case T_SEMICOLON:

473 case T_THEN:

474 break;

475 default:

476 synerr ((int[]){T_THEN , T_ELSE , T_COMMA , T_SEMICOLON , T_ADDOP , T_DO ,

T_RPAREN , T_RELOP , T_RBRACK , T_END}, 10, currTerm);

477 nt_simple_expression__synch:

478 synch(nt);

479 break;

480 }

481 break;

482 case NT_STANDARD_TYPE:

        switch(currTerm.type) {
        case T_INTEGER:
            if(!match(T_INTEGER, nt)) goto nt_standard_type_synch;
            break;
        case T_REAL:
            if(!match(T_REAL, nt)) goto nt_standard_type_synch;
            break;
        default:
            synerr((int[]){T_INTEGER, T_REAL}, 2, currTerm);
        nt_standard_type_synch:
            synch(nt);
            break;
        }
        break;
    case NT_STATEMENT:
        switch(currTerm.type) {
        case T_BEGIN:
            consume(NT_COMPOUND_STATEMENT);
            break;
        case T_CALL:
            consume(NT_PROCEDURE_STATEMENT);
            break;
        case T_ID:
            consume(NT_VARIABLE);
            if(!match(T_ASSIGNOP, nt)) goto nt_statement_synch;
            consume(NT_EXPRESSION);
            break;
        case T_IF:
            if(!match(T_IF, nt)) goto nt_statement_synch;
            consume(NT_EXPRESSION);
            if(!match(T_THEN, nt)) goto nt_statement_synch;
            consume(NT_STATEMENT);
            consume(NT_STATEMENT1);
            break;
        case T_WHILE:
            if(!match(T_WHILE, nt)) goto nt_statement_synch;
            consume(NT_EXPRESSION);
            if(!match(T_DO, nt)) goto nt_statement_synch;
            consume(NT_STATEMENT);
            break;
        default:
            synerr((int[]){T_ID, T_BEGIN, T_IF, T_WHILE, T_CALL}, 5, currTerm);
        nt_statement_synch:
            synch(nt);
            break;
        }
        break;
    case NT_STATEMENT1:
        switch(currTerm.type) {
        case T_ELSE:
            if(!match(T_ELSE, nt)) goto nt_statement1_synch;
            consume(NT_STATEMENT);
            break;
        case T_END:
        case T_SEMICOLON:
            break;
        default:
            synerr((int[]){T_ELSE, T_SEMICOLON, T_END}, 3, currTerm);
        nt_statement1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_STATEMENT_LIST:
        switch(currTerm.type) {
        case T_BEGIN:
        case T_CALL:
        case T_ID:
        case T_IF:
        case T_WHILE:
            consume(NT_STATEMENT);
            consume(NT_STATEMENT_LIST_);
            break;
        default:
            synerr((int[]){T_ID, T_BEGIN, T_IF, T_WHILE, T_CALL}, 5, currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_STATEMENT_LIST_:
        switch(currTerm.type) {
        case T_END:
            break;
        case T_SEMICOLON:
            if(!match(T_SEMICOLON, nt)) goto nt_statement_list__synch;
            consume(NT_STATEMENT);
            consume(NT_STATEMENT_LIST_);
            break;
        default:
            synerr((int[]){T_SEMICOLON, T_END}, 2, currTerm);
        nt_statement_list__synch:
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_DECLARATION:
        switch(currTerm.type) {
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_HEAD);
            consume(NT_SUBPROGRAM_DECLARATION11);
            break;
        default:
            synerr((int[]){T_PROCEDURE}, 1, currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_DECLARATION1:
        switch(currTerm.type) {
        case T_BEGIN:
            consume(NT_COMPOUND_STATEMENT);
            break;
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_DECLARATIONS);
            consume(NT_COMPOUND_STATEMENT);
            break;
        default:
            synerr((int[]){T_BEGIN, T_PROCEDURE}, 2, currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_DECLARATION11:
        switch(currTerm.type) {
        case T_BEGIN:
            consume(NT_COMPOUND_STATEMENT);
            break;
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_DECLARATIONS);
            consume(NT_COMPOUND_STATEMENT);
            break;
        case T_VAR:
            consume(NT_DECLARATIONS);
            consume(NT_SUBPROGRAM_DECLARATION1);
            break;
        default:
            synerr((int[]){T_BEGIN, T_VAR, T_PROCEDURE}, 3, currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_DECLARATIONS:
        switch(currTerm.type) {
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_DECLARATION);
            if(!match(T_SEMICOLON, nt)) goto nt_subprogram_declarations_synch;
            consume(NT_SUBPROGRAM_DECLARATIONS_);
            break;
        default:
            synerr((int[]){T_PROCEDURE}, 1, currTerm);
        nt_subprogram_declarations_synch:
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_DECLARATIONS_:
        switch(currTerm.type) {
        case T_BEGIN:
        case T_SEMICOLON:
            break;
        case T_PROCEDURE:
            consume(NT_SUBPROGRAM_DECLARATION);
            if(!match(T_SEMICOLON, nt)) goto nt_subprogram_declarations__synch;
            consume(NT_SUBPROGRAM_DECLARATIONS_);
            break;
        default:
            synerr((int[]){T_BEGIN, T_SEMICOLON, T_PROCEDURE}, 3, currTerm);
        nt_subprogram_declarations__synch:
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_HEAD:
        switch(currTerm.type) {
        case T_PROCEDURE:
            if(!match(T_PROCEDURE, nt)) goto nt_subprogram_head_synch;
            if(!match(T_ID, nt)) goto nt_subprogram_head_synch;
            consume(NT_SUBPROGRAM_HEAD1);
            break;
        default:
            synerr((int[]){T_PROCEDURE}, 1, currTerm);
        nt_subprogram_head_synch:
            synch(nt);
            break;
        }
        break;
    case NT_SUBPROGRAM_HEAD1:
        switch(currTerm.type) {
        case T_LPAREN:
            consume(NT_ARGUMENTS);
            if(!match(T_SEMICOLON, nt)) goto nt_subprogram_head1_synch;
            break;
        case T_SEMICOLON:
            if(!match(T_SEMICOLON, nt)) goto nt_subprogram_head1_synch;
            break;
        default:
            synerr((int[]){T_LPAREN, T_SEMICOLON}, 2, currTerm);
        nt_subprogram_head1_synch:
            synch(nt);
            break;
        }
        break;
    case NT_TERM:
        switch(currTerm.type) {
        case T_ID:
        case T_LPAREN:
        case T_NOT:
        case T_NUM:
            consume(NT_FACTOR);
            consume(NT_TERM_);
            break;
        default:
            synerr((int[]){T_ID, T_NUM, T_LPAREN, T_NOT}, 4, currTerm);
            synch(nt);
            break;
        }
        break;
    case NT_TERM_:
        switch(currTerm.type) {
        case T_ADDOP:
        case T_COMMA:
        case T_DO:
        case T_ELSE:
        case T_END:
        case T_RBRACK:
        case T_RELOP:
        case T_RPAREN:
        case T_SEMICOLON:
        case T_THEN:
            break;
        case T_MULOP:
            if(!match(T_MULOP, nt)) goto nt_term__synch;
            consume(NT_FACTOR);
            consume(NT_TERM_);
            break;
        default:
            synerr((int[]){T_THEN, T_ELSE, T_COMMA, T_SEMICOLON, T_ADDOP, T_DO,
                   T_RPAREN, T_MULOP, T_RELOP, T_RBRACK, T_END}, 11, currTerm);
        nt_term__synch:
            synch(nt);
            break;
        }
        break;
    case NT_TYPE:
        switch(currTerm.type) {
        case T_ARRAY:
            if(!match(T_ARRAY, nt)) goto nt_type_synch;
            if(!match(T_LBRACK, nt)) goto nt_type_synch;
            if(!match(T_NUM, nt)) goto nt_type_synch;
            if(!match(T_DOUBLEPERIOD, nt)) goto nt_type_synch;
            if(!match(T_NUM, nt)) goto nt_type_synch;
            if(!match(T_RBRACK, nt)) goto nt_type_synch;
            if(!match(T_OF, nt)) goto nt_type_synch;
            consume(NT_STANDARD_TYPE);
            break;
        case T_INTEGER:
        case T_REAL:
            consume(NT_STANDARD_TYPE);
            break;
        default:
            synerr((int[]){T_ARRAY, T_INTEGER, T_REAL}, 3, currTerm);
        nt_type_synch:
            synch(nt);
            break;
        }
        break;
    case NT_VARIABLE:
        switch(currTerm.type) {
        case T_ID:
            if(!match(T_ID, nt)) goto nt_variable_synch;
            consume(NT_VARIABLE1);
            break;
        default:
            synerr((int[]){T_ID}, 1, currTerm);
        nt_variable_synch:
            synch(nt);
            break;
        }
        break;
    case NT_VARIABLE1:
        switch(currTerm.type) {
        case T_ASSIGNOP:
            break;
        case T_LBRACK:
            if(!match(T_LBRACK, nt)) goto nt_variable1_synch;
            consume(NT_EXPRESSION);
            if(!match(T_RBRACK, nt)) goto nt_variable1_synch;
            break;
        default:
            synerr((int[]){T_LBRACK, T_ASSIGNOP}, 2, currTerm);
        nt_variable1_synch:
            synch(nt);
            break;
        }
        break;

    }
    depth--;
}

void synch(NonTerminal nt) {
    int *synchSet;
    int len;
    switch(nt) {
    case NT_PROCEDURE_STATEMENT:
    case NT_STATEMENT:
    case NT_STATEMENT1:
    case NT_PROCEDURE_STATEMENT1:
        synchSet = (int[]){T_EOF, T_END, T_ELSE, T_SEMICOLON}; len = 4;
        break;
    case NT_SUBPROGRAM_DECLARATION:
    case NT_SUBPROGRAM_DECLARATION11:
    case NT_ARGUMENTS:
    case NT_SUBPROGRAM_DECLARATION1:
        synchSet = (int[]){T_EOF, T_SEMICOLON}; len = 2;
        break;
    case NT_PARAMETER_LIST:
    case NT_IDENTIFIER_LIST:
    case NT_EXPRESSION_LIST_:
    case NT_PARAMETER_LIST_:
    case NT_IDENTIFIER_LIST_:
    case NT_EXPRESSION_LIST:
        synchSet = (int[]){T_EOF, T_RPAREN}; len = 2;
        break;
    case NT_PROGRAM:
    case NT_PROGRAM1:
    case NT_PROGRAM11:
        synchSet = (int[]){T_EOF}; len = 1;
        break;
    case NT_SIMPLE_EXPRESSION_:
    case NT_SIMPLE_EXPRESSION:
        synchSet = (int[]){T_THEN, T_EOF, T_COMMA, T_SEMICOLON, T_RBRACK, T_DO,
                           T_ELSE, T_RPAREN, T_RELOP, T_END}; len = 10;
        break;
    case NT_SIGN:
        synchSet = (int[]){T_ID, T_NUM, T_EOF, T_LPAREN, T_NOT}; len = 5;
        break;
    case NT_SUBPROGRAM_HEAD:
    case NT_SUBPROGRAM_HEAD1:
        synchSet = (int[]){T_BEGIN, T_EOF, T_VAR, T_SEMICOLON, T_PROCEDURE}; len = 5;
        break;
    case NT_TERM:
    case NT_TERM_:
        synchSet = (int[]){T_THEN, T_EOF, T_COMMA, T_SEMICOLON, T_ADDOP, T_RBRACK,
                           T_DO, T_ELSE, T_RPAREN, T_RELOP, T_END}; len = 11;
        break;
    case NT_TYPE:
    case NT_STANDARD_TYPE:
        synchSet = (int[]){T_EOF, T_SEMICOLON, T_RPAREN}; len = 3;
        break;
    case NT_OPTIONAL_STATEMENTS:
    case NT_STATEMENT_LIST_:
    case NT_STATEMENT_LIST:
        synchSet = (int[]){T_EOF, T_END}; len = 2;
        break;
    case NT_DECLARATIONS:
    case NT_DECLARATIONS_:
        synchSet = (int[]){T_BEGIN, T_EOF, T_SEMICOLON, T_PROCEDURE}; len = 4;
        break;
    case NT_SUBPROGRAM_DECLARATIONS:
    case NT_SUBPROGRAM_DECLARATIONS_:
        synchSet = (int[]){T_BEGIN, T_EOF, T_SEMICOLON}; len = 3;
        break;
    case NT_COMPOUND_STATEMENT:
    case NT_COMPOUND_STATEMENT1:
        synchSet = (int[]){T_PERIOD, T_EOF, T_END, T_ELSE, T_SEMICOLON}; len = 5;
        break;
    case NT_VARIABLE1:
    case NT_VARIABLE:
        synchSet = (int[]){T_EOF, T_ASSIGNOP}; len = 2;
        break;
    case NT_EXPRESSION:
    case NT_EXPRESSION1:
        synchSet = (int[]){T_THEN, T_EOF, T_COMMA, T_SEMICOLON, T_RBRACK, T_DO,
                           T_ELSE, T_RPAREN, T_END}; len = 9;
        break;
    case NT_FACTOR1:
    case NT_FACTOR:
        synchSet = (int[]){T_THEN, T_EOF, T_COMMA, T_SEMICOLON, T_ADDOP, T_DO,
                           T_RPAREN, T_ELSE, T_RELOP, T_MULOP, T_RBRACK, T_END}; len = 12;
        break;

    }
    while(true) {
        for(int i = 0; i < len; i++) {
            if(currTerm.type == synchSet[i]) {
                char* lexPrint = currTerm.lexeme;
                if(currTerm.type == T_EOF) {
                    fprintf(stderr, "Reached end-of-file; terminating\n");
                    exit(1);
                } else {
                    fprintf(stderr, "Synchronized on term \"%s\" (%s)\n", lexPrint,
                            convertConstantToString(currTerm.type));
                    return;
                }
            }
        }
        currTerm = nextTerminal();
    }
}

int match(int termtype, NonTerminal nt) {
    if(currTerm.type == termtype) {
        for(int i = 0; i < depth; i++) fprintf(fTree, " ");
        fprintf(fTree, "%s\n", convertConstantToString(currTerm.type));
        for(int i = 0; i < depth+1; i++) fprintf(fTree, " ");
        fprintf(fTree, "\"%s\"\n", currTerm.lexeme);
        currTerm = nextTerminal();
        return true;
    }
    synerr((int[]){termtype}, 1, currTerm);
    return false;
}

void synerr(int *expected, int expLen, Terminal encountered) {
    char *str = " one of";
    if(expLen == 1) str = "";
    fprintf(fList, "SYNERR, column %d: Unexpected %s; expected%s: ", cColumn,
            convertConstantToString(encountered.type), str);
    for(int i = 0; i < expLen - 1; i++) {
        fprintf(fList, "%s, ", convertConstantToString(expected[i]));
    }
    fprintf(fList, "%s\n", convertConstantToString(expected[expLen-1]));
    fprintf(stderr, "SYNERR, line %d, column %d: Unexpected %s; expected%s: ",
            cLine, cColumn, convertConstantToString(encountered.type), str);
    for(int i = 0; i < expLen - 1; i++) {
        fprintf(stderr, "%s, ", convertConstantToString(expected[i]));
    }
    fprintf(stderr, "%s\n", convertConstantToString(expected[expLen-1]));
}

int main(int argc, char** argv) {
    if(argc < 2) exit(2);
    char *sfSrc = argv[1];
    fSrc = fopen(sfSrc, "r");
    /* output names: overwrite the last four characters of the source name
       (its extension) with ".tree" and ".lst" respectively */
    char sfTree[80];
    strcpy(sfTree, sfSrc);
    strcpy(sfTree + strlen(sfSrc) - 4, ".tree");
    fTree = fopen(sfTree, "w");
    char sfList[80];
    strcpy(sfList, sfSrc);
    strcpy(sfList + strlen(sfSrc) - 4, ".lst");
    fList = fopen(sfList, "w");

    cLine = 1;
    cColumn = 0;
    machinesInit("data/reserved-words.txt");
    parse();
    fclose(fSrc);
    fclose(fTree);
    fclose(fList);
    return 0;
}

Terminal nextTerminal() {
    if(!psLine) psLine = sLine;
    if(!*psLine) {
        /* current line exhausted: read the next one and echo it to the listing */
        fgets(sLine, sizeof(sLine), fSrc);
        fprintf(fList, "%d. %s", cLine, sLine);
        if(feof(fSrc)) {
            sLine[0] = EOF;
            sLine[1] = 0;
        }
        psLine = sLine;
        cLine++;
        cColumn = 1;
    }
    MachineResult res = identifyToken(psLine);
    cColumn += res.newString - psLine;
    psLine = res.newString;
    if(res.type == T_WS) {
        return nextTerminal();
    }
    if(res.type == T_LEXERR) {
        lexerr(res);
        return nextTerminal();
    }
    return res;
}

void lexerr(Terminal res) {
    fprintf(fList, "%s, column %d\n", convertConstantToString(res.error), cColumn);
    fprintf(stderr, "%s, line %d, column %d\n", convertConstantToString(res.error),
            cLine, cColumn);
}
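The panic-mode recovery performed by synch above can be summarized in a few lines. The following is only a minimal sketch in Python 3, not the project's code: the function name synchronize, the token list, and the string token names are hypothetical stand-ins for the C globals currTerm and nextTerminal(). It assumes, as the C code does, that reaching T_EOF ends recovery (the C version exits; this sketch returns None instead).

```python
def synchronize(tokens, pos, synch_set):
    """Skip tokens starting at index pos until one belongs to synch_set.

    Returns the index of the synchronizing token, or None if end-of-file
    is reached first (the C parser terminates at that point).
    """
    while pos < len(tokens):
        if tokens[pos] == "T_EOF":
            return None
        if tokens[pos] in synch_set:
            return pos
        pos += 1
    return None


# Hypothetical token stream: an error inside a statement, followed by ';'
tokens = ["T_ID", "T_NUM", "T_SEMICOLON", "T_END", "T_EOF"]
resume_at = synchronize(tokens, 0, {"T_END", "T_ELSE", "T_SEMICOLON"})
gave_up = synchronize(["T_ID", "T_EOF"], 0, {"T_END"})
```

Here resume_at indexes the first token from the statement synch set, which is where parsing would resume.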

Python code

massage.py

from rules import *
import copy
import itertools

LEFT_RECURSION_SHIV = '_'
LEFT_FACTORING_SHIV = '_'

def remove_epsilon_productions((V, T, S, P)):
    # identify set of nullable variables
    Ve = set()
    while True:
        changed = False
        for (lhs, rhs) in P:
            if rhs == [""] or all(w in Ve for w in rhs):
                if lhs not in Ve:
                    changed = True
                    Ve.add(lhs)
        if not changed:
            break
    grammar_produces_e = (S in Ve)

    # use set of nullable variables
    new_P = [(lhs, rhs) for (lhs, rhs) in copy.deepcopy(P) if rhs != [""]]
    while True:
        old = copy.deepcopy(new_P)
        for (lhs, rhs) in new_P:
            for i in range(len(rhs)):
                if rhs[i] in Ve:
                    new_rule = (lhs, rhs[:i] + rhs[i+1:])
                    if new_rule not in new_P:
                        new_P.append(new_rule)
        if old == new_P:
            break

    return (V, T, S, new_P)

def eliminate_left_recursion((V, T, S, P)):
    if contains_epsilon_productions((V, T, S, P)):
        #print "contains ep prods"
        (V, T, S, P) = remove_epsilon_productions((V, T, S, P))

    new_P = copy.copy(P)
    new_V = copy.copy(V)
    for v in V:
        rhss = [r for (l, r) in P if l == v]
        if any(r[0] == v for r in rhss):
            # remove original rules
            for r in rhss:
                new_P.remove((v, r))
            # create new rules
            alpha = [r[1:] for r in rhss if r[0] == v]
            beta = [r for r in rhss if r[0] and r[0] != v]
            vprime = v
            while vprime in V + new_V:
                vprime += LEFT_RECURSION_SHIV
            new_V.append(vprime)

            for b in beta:
                new_P.append((v, b + [vprime]))

            for a in alpha:
                new_P.append((vprime, a + [vprime]))
            new_P.append((vprime, ['']))

    return (new_V, T, S, new_P)

def perform_left_factoring((V, T, S, P)):

    def common_starting_subseq(s, t):
        subseq = []
        for (e1, e2) in zip(s, t):
            if e1 == e2:
                subseq.append(e1)
            else:
                return subseq
        return subseq

    def longest_common_start(seqs):
        if len(seqs) < 2:
            return []
        else:
            return max([common_starting_subseq(s1, s2) for (s1, s2) in
                        itertools.combinations(seqs, 2)], key=len)

    def startsWith(seq, subseq):
        for (e1, e2) in zip(seq, subseq):
            if e1 != e2:
                return False
        return True

    def obtainConflict(rules, var):
        start = longest_common_start([r for (l,r) in rules if l == var])
        if start:
            return start, [r for (l,r) in rules if l == var and startsWith(r, start)]
        else:
            return False

    new_P = copy.copy(P)
    new_V = copy.copy(V)

    for var in new_V:
        conflict = obtainConflict(new_P, var)
        while conflict:
            varprime = var
            while varprime in new_V:
                varprime += LEFT_FACTORING_SHIV
            new_V.append(varprime)
            alpha, rhss = conflict

            new_P.append((var, alpha + [varprime]))
            for rhs in rhss:
                new_P.remove((var, rhs))
                beta = rhs[len(alpha):]
                if beta == []:
                    beta = [''] # hotfix
                new_P.append((varprime, beta))

            conflict = obtainConflict(new_P, var)

    return (new_V, T, S, new_P)


def contains_epsilon_productions((V, T, S, P)):
    for (l, r) in P:
        if r == [""] and l != S:
            return True
    return False

def contains_immediate_left_recursion((V, T, S, P)):
    for (l, r) in P:
        if r[0] == l:
            return True
    return False

if __name__ == '__main__':
    G = (V, T, S, P)
    (Vh, Th, Sh, Ph) = H = eliminate_left_recursion(G)
    print '\n'.join('%s %s' % (l,r) for (l,r) in Ph)
    print
    (Vj, Tj, Sj, Pj) = J = perform_left_factoring(H)
    print '\n'.join('%s %s' % (l,r) for (l,r) in Pj)
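To illustrate the transformation that eliminate_left_recursion applies, here is a minimal Python 3 sketch for a single nonterminal. It is not part of massage.py: the function name, the dictionary return shape, and the tiny term grammar are assumptions made for the example. It rewrites A -> A alpha | beta as A -> beta A_ and A_ -> alpha A_ | epsilon, representing epsilon as [''] in the same way the listing does.

```python
def eliminate_immediate_left_recursion(nonterminal, rhss, suffix="_"):
    """Remove immediate left recursion from the rules of one nonterminal.

    rhss is a list of right-hand sides, each a list of symbols.
    Returns a dict mapping each resulting nonterminal to its new rules.
    """
    # alpha: tails of the left-recursive rules; beta: all other rules
    alphas = [rhs[1:] for rhs in rhss if rhs and rhs[0] == nonterminal]
    betas = [rhs for rhs in rhss if not rhs or rhs[0] != nonterminal]
    if not alphas:
        return {nonterminal: rhss}  # nothing to do
    prime = nonterminal + suffix
    return {
        nonterminal: [beta + [prime] for beta in betas],
        prime: [alpha + [prime] for alpha in alphas] + [[""]],
    }


# Hypothetical input: term -> term mulop factor | factor
rules = eliminate_immediate_left_recursion(
    "term", [["term", "mulop", "factor"], ["factor"]])
```

For this input, rules maps term to factor term_ and term_ to mulop factor term_ or epsilon, which has the same shape as the NT_TERM and NT_TERM_ cases in the generated parser.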

firstfollow.py

import sys
from massage import *
from collections import defaultdict
import pprint
import copy

G = (V, T, S, P)
H = (Vh, Th, Sh, Ph) = perform_left_factoring(eliminate_left_recursion(G))
#H = (Vh, Th, Sh, Ph) = (V, T, S, P)
#print Vh, Th

if __name__ == '__main__':
    print '\n'.join(sorted("%s -> %s" % (l,r) for (l,r) in Ph))
    print

def firsts():
    fir = defaultdict(set)
    for t in Th: fir[t] = set([t])
    while True:
        old = copy.deepcopy(fir)
        for (l,r) in Ph:
            if r == [''] or r == []:
                fir[l].add('')
            else:
                if '' not in fir[r[0]]:
                    fir[l].update(fir[r[0]])
                else:
                    i = 0
                    while i < len(r) and ('' in fir[r[i]] or r[i] in Th):
                        fir[l].update(fir[r[i]] - set(['']))
                        i += 1
                    if all('' in fir[ri] for ri in r):
                        fir[l].add('')
        if old == fir:
            break
    return fir

def follows():
    fir = firsts()
    fol = defaultdict(set)
    fol[Sh].add('T_EOF')
    while True:
        old = copy.deepcopy(fol)
        for (l,r) in Ph:
            for i in range(len(r)):
                B = r[i]
                if B in Vh and i+1 < len(r):
                    if r[i+1] in Th:
                        fol[B].add(r[i+1])
                    else:
                        j = i+1
                        while True:
                            if '-d' in sys.argv: print "Adding first(%s) to follow(%s)" % (r[j], B)
                            fol[B].update(fir[r[j]] - set(['']))
                            j += 1
                            if j >= len(r):
                                if '-d' in sys.argv: print "Adding follow(%s) to follow(%s)" % (l, B)
                                fol[B].update(fol[l])
                                break
                            if '' not in fir[r[j-1]]:
                                break

                elif B in Vh and i == len(r)-1:
                    if '-d' in sys.argv: print "Adding follow(%s) to follow(%s)" % (l, B)
                    fol[B].update(fol[l])
        if old == fol:
            break
    return fol

if __name__ == '__main__':
    print '\n'.join("first[%s] = %s" % (v, ', '.join(map(repr, s)))
                    for (v, s) in sorted(firsts().items()) if v in Vh)
    print
    print '\n'.join("follow[%s] = %s" % (v, ', '.join(map(repr, s)))
                    for (v, s) in sorted(follows().items()) if v in Vh)
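For comparison, the textbook fixed-point computation of FIRST sets can be sketched as follows. This is a minimal Python 3 illustration rather than a port of firsts(): the function name first_sets, the EPS constant, and the small expression grammar are assumptions for the example, and the loop structure deliberately follows the standard formulation instead of the listing above.

```python
from collections import defaultdict

EPS = ""  # epsilon, written '' as in the listings


def first_sets(terminals, productions):
    """Compute FIRST for every grammar symbol by iterating to a fixed point.

    productions is a list of (lhs, rhs) pairs; rhs is a list of symbols,
    with [EPS] denoting an epsilon production.
    """
    first = defaultdict(set)
    for t in terminals:
        first[t].add(t)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            before = len(first[lhs])
            nullable = True
            for sym in rhs:
                if sym == EPS:
                    continue
                first[lhs] |= first[sym] - {EPS}
                if EPS not in first[sym]:
                    nullable = False  # stop at the first non-nullable symbol
                    break
            if nullable:
                first[lhs].add(EPS)
            if len(first[lhs]) != before:
                changed = True
    return first


# Hypothetical grammar: E -> T E_ ; E_ -> + T E_ | eps ; T -> id | ( E )
productions = [
    ("E", ["T", "E_"]),
    ("E_", ["+", "T", "E_"]),
    ("E_", [EPS]),
    ("T", ["id"]),
    ("T", ["(", "E", ")"]),
]
first = first_sets({"+", "id", "(", ")"}, productions)
```

Sets only ever grow, so comparing sizes before and after each production is enough to detect that a pass changed nothing.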

table.py

from firstfollow import *
from massage import *

Vh, Th = sorted(Vh), sorted(Th)
H = (Vh, Th, Sh, Ph)
fir = firsts()
fol = follows()

# construct table
pt = defaultdict(lambda: defaultdict(list))

def first(w):
    if w == []:
        return set([''])
    if w[0] in Th:
        return set([w[0]])
    if w[0] in Vh:
        if '' in fir[w[0]]:
            return (fir[w[0]] - set([''])) | first(w[1:])
        else:
            return fir[w[0]]

for (l,r) in Ph:
    for symb in first(r):
        if symb != '':
            pt[l][symb].append(r)
        else:
            for symb in fol[l]:
                if l == 'NT_STATEMENT'+LEFT_FACTORING_SHIV and symb == 'T_ELSE':
                    # special-case: dangling else ambiguity
                    pass
                else:
                    pt[l][symb].append(r)

# report any cell that holds more than one production (an LL(1) conflict)
for v in Vh:
    for t in Th:
        if pt[v] and pt[v][t]:
            if len(pt[v][t]) > 1:
                sys.stderr.write('WTF')
                sys.stderr.write(' '.join([v, t, repr(pt[v][t])]))
                sys.exit()
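The table construction and the dangling-else special case can be seen on a two-rule toy grammar. The sketch below is a Python 3 illustration under stated assumptions, not the project's code: build_table, conflicts, the symbol names, and the hand-written FIRST and FOLLOW sets are all hypothetical. It fills each cell M[A][a] from FIRST of the right-hand side, falling back to FOLLOW(A) when the right-hand side can derive epsilon, and then lists the cells that received more than one production.

```python
from collections import defaultdict

EPS = ""  # epsilon, written '' as in the listings


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given FIRST of each symbol."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)  # every symbol was nullable
    return result


def build_table(productions, first, follow):
    """Build an LL(1) table mapping [nonterminal][token] to productions."""
    table = defaultdict(lambda: defaultdict(list))
    for lhs, rhs in productions:
        lookaheads = first_of_string(rhs, first)
        if EPS in lookaheads:
            lookaheads = (lookaheads - {EPS}) | follow[lhs]
        for tok in lookaheads:
            table[lhs][tok].append(rhs)
    return table


def conflicts(table):
    """Return the sorted (nonterminal, token) cells with several entries."""
    return sorted((nt, tok) for nt, row in table.items()
                  for tok, rhss in row.items() if len(rhss) > 1)


# Toy dangling-else grammar: S -> if S S1 | other ; S1 -> else S | eps
productions = [("S", ["if", "S", "S1"]), ("S", ["other"]),
               ("S1", ["else", "S"]), ("S1", [EPS])]
first = {"if": {"if"}, "else": {"else"}, "other": {"other"},
         "S": {"if", "other"}, "S1": {"else", EPS}}
follow = {"S": {"$", "else"}, "S1": {"$", "else"}}
table = build_table(productions, first, follow)
clashes = conflicts(table)
```

The only clash in this toy grammar is the S1 row under else, which is the cell that table.py resolves by skipping the epsilon production when the lookahead is T_ELSE.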
