Compiler Questions

CS 375, Compilers: Example Questions for Midterm Exam

Addressing:

Consider the following declarations:

type complex = record re, im: integer end;
     person = record name: alfa; age: integer; location: complex; salary: real end;

var people: array [ 7..14, (austin, dallas, houston)] of person;

Assuming alfa is 8 bytes, integer is 4 bytes, and real is 8 bytes, how much storage is occupied by the array people?

Calculate the effective address of the expression: people[10,dallas].location.im

Show how it was derived and give the aref form.
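One way to check the arithmetic for this kind of question is to compute the layout mechanically. The sketch below is not the course's official answer. It assumes that fields are laid out in declaration order, that each field is aligned to its own size (so the 8-byte real salary starts on an 8-byte boundary), that records are padded to a multiple of 8 bytes, and that the array is stored in row-major order. All function names are hypothetical.

```c
#include <assert.h>

/* Round offset up to the next multiple of align (align > 0). */
long align_up(long offset, long align) {
    assert(align > 0);
    return (offset + align - 1) / align * align;
}

/* Assumed layout of person: name (8 bytes) at 0, age (4 bytes) at 8,
   then location (two 4-byte integers), then salary (8 bytes). */
long person_location_offset(void) { return align_up(8 + 4, 4); }

long person_salary_offset(void) {
    return align_up(person_location_offset() + 8, 8);
}

/* Record size, padded to a multiple of 8 (an assumption). */
long person_size(void) { return align_up(person_salary_offset() + 8, 8); }

/* Total bytes for array [7..14, (austin, dallas, houston)] of person. */
long people_size(void) { return (14 - 7 + 1) * 3 * person_size(); }

/* Byte offset of people[i, city].location.im from the start of people.
   city is the ordinal value: austin = 0, dallas = 1, houston = 2. */
long people_im_offset(long i, long city) {
    long row_major_index = (i - 7) * 3 + city;
    /* im follows re, which occupies the first 4 bytes of location */
    return row_major_index * person_size() + person_location_offset() + 4;
}
```

Under these assumptions person_size() is 32, people_size() is 768, and people_im_offset(10, 1) is 336. Without padding the record would be 28 bytes and every figure changes accordingly, so an answer should state the layout rule it uses.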

Other Questions:

Show how an operator precedence parser would parse the string: A - (B / C - D) / E + F

Show the contents of the stacks at each step; produce a tree as output.

Give one advantage and one disadvantage of hashing as a method of symbol table organization.

Consider the regular expression (a | b)*b+b* . What is the simplest regular expression that denotes the same language?

Give the allowable form of productions for a Regular grammar.

Consider the following grammar:

S --> a S
S --> S b
S --> b

o What kind of grammar is this?
o What kind of language does it denote?
o Is there a simpler kind of grammar that denotes the same language? If so, give the grammar; if not, explain why not.

Briefly and clearly define the following terms: ... 8 terms chosen from the vocabulary list on the study guide.

CS 375, Compilers: Example Questions for Final Exam

Addressing:

Consider the following declarations:

type complex = record re, im: real end;
     item = record name: alfa;
                   age: integer;
                   location: complex;
                   displayed: boolean;
                   color: array[(red,green,blue)] of integer;
                   link: ^item;
                   cost: real
            end;

var items: array [ 8..12, (low, med, hi), 7..11] of item;

Assuming alfa is 8 bytes, integer is 4 bytes, and real is 8 bytes, how much storage is occupied by the array items?

Show the intermediate code for the expression: items[10, hi, 8].link^.color[blue]

Show how it was derived and give the aref form.

Other Questions:

A robot moves on a square grid. The robot can go forward (f), turn left (l), or turn right (r). Give a grammar to describe the language of all sequences of moves that leave the robot pointing in the same direction as when it started.

What kind of grammar is the above (in the Chomsky hierarchy)?

Describe a kind of local optimization.

Describe what it means for a subexpression to be (a) available, (b) busy, (c) killed.

Describe sources of extra run-time overhead in (a) time and (b) space in an object-oriented language.

Draw boxes around the following code to show the basic blocks:

n := k*2;
if n < j
   then write('less')
   else begin
          k := k - 1;
          write('more')
        end;
writeln(k);

Number the blocks and draw a flow graph; give the matrix form of the flow graph and its transitive closure.

Give an advantage and a disadvantage for (a) call by reference (b) call by value.

What are the most important things to optimize in a scientific program? Why?

Give three examples of computer architecture innovations that require compiler innovations to be effective.

Briefly and clearly define the following terms: ... 20 terms chosen from the vocabulary list on the study guide.

Some sample exam 2 questions:

1. BRIEFLY define the following terms and give an example of how each term is used: (4 each, no more than six on the midterm)

o SLR Parsing
o Canonical LR(1) Table
o LR(0) Items
o LALR Parsing
o Closure of a Set of LR(0) Items
o Syntax-directed translation
o Abstract syntax tree
o Static scoping
o Dynamic scoping
o Declaration of a name
o Use of a name
o Undeclared name
o Multiply declared name
o Symbol table
o Overloading
o Parameter passing
o Call by value
o Call by reference
o Call by value-result
o Base/immutable type
o Type constructor
o Type coercion
o Overloading
o Type polymorphism
o Name equivalence for types
o Structural equivalence for types
o Interpreter
o Activation/call of a function
o Lifetime of a function
o Activation tree
o Activation record
o Alignment (to words)

2. Describe the relationship between a production and an item in an LR(0) grammar. How does this relate to the notion of the stack in an LR(0) parser? Give an algorithm for constructing the closure of a set of LR(0) items with respect to a particular grammar. (10)

3. For the following grammar, construct the set of LR(0) states to recognize viable prefixes of this language. Then fill out an SLR parse table for this grammar and indicate whether the grammar is ambiguous. (15)

S -> A $
A -> -- A B | ++ A B | id B
B -> -- B | ++ B | .

4. What is the difference between SLR and LR(1) parsing? What changes when constructing an LR table versus an SLR table? How does an LALR parser differ from SLR and LR(1) parsers? (10)

5. In bison, how do you (1) resolve conflicts related to operator precedence, (2) resolve conflicts due to operator associativity, and (3) attach a syntax-directed translation action to a production? (10)

6. Below is a grammar for understanding simple arithmetic expressions:

E -> E * E
  -> E + E
  -> ( E )
  -> int

Assuming that precedence and associativity have been handled, what translation actions would you add to the grammar to get it to print out the input expression in a postfix notation (e.g., (3 + 5) * 4 would print out as 3 5 + 4 *). Now add actions to print out the expression in a prefix notation (* + 3 5 4). (15)
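The idea behind the postfix translation (emit each operand as it is recognized, emit each operator after its right operand) can be sketched outside bison. The sketch below is a hand-written recursive-descent translator, not bison actions, and it handles precedence by layering the expression grammar into expr, term, and factor levels, which is an assumption about how "precedence and associativity have been handled". The function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *p;   /* cursor into the input expression */
static char *out;       /* cursor into the output buffer */
static char *out_end;   /* last usable position, reserved for the NUL */

static void emit(char c) { if (out < out_end) *out++ = c; }
static void skip(void) { while (*p == ' ') p++; }
static void expr(void);

/* factor -> int | ( expr )    action: emit the int once it is recognized */
static void factor(void) {
    skip();
    if (*p == '(') {
        p++;
        expr();
        skip();
        if (*p == ')') p++;
    } else {
        while (isdigit((unsigned char)*p)) emit(*p++);
        emit(' ');
    }
}

/* term -> factor { * factor }    action: emit '*' after the right operand */
static void term(void) {
    factor();
    for (skip(); *p == '*'; skip()) { p++; factor(); emit('*'); emit(' '); }
}

/* expr -> term { + term }    action: emit '+' after the right operand */
static void expr(void) {
    term();
    for (skip(); *p == '+'; skip()) { p++; term(); emit('+'); emit(' '); }
}

/* Translate src to space-separated postfix in buf (cap bytes, cap >= 1). */
void to_postfix(const char *src, char *buf, size_t cap) {
    assert(src != NULL && buf != NULL && cap >= 1);
    p = src; out = buf; out_end = buf + cap - 1;
    expr();
    if (out > buf && out[-1] == ' ') out--;   /* drop the trailing space */
    *out = '\0';
}
```

Calling to_postfix("(3 + 5) * 4", buf, sizeof buf) leaves "3 5 + 4 *" in buf. A prefix version cannot simply move the emit calls to the front, because the operator is not known until after the left operand has been read; building a tree first and printing it in preorder is the usual workaround.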

7. Discuss how the semantic stack works for an LL parser and how an AST can emerge from that stack. Show the actions you would attach to the grammar in the previous question to build up an AST. Once you have added the actions, show any further adjustments you would need to make to the grammar to make it LL (1). (Assume * has higher precedence than + and both are left associative.) (15)

8. Describe static and dynamic scoping and how they differ. Give an example of a program that has different output assuming static versus dynamic scoping. (10)

9. Most programming languages have scoping rules with respect to variables. Why? Give an example of a situation that requires scoping rules to resolve (and where scoping is useful). Would it be possible to avoid having scoping rules (and if so, what are the advantages and disadvantages of your approach)? (15)

10. Define parameter passing and the parameter passing methods call by value and call by reference. Give an example of a program that works differently under the two different mechanisms. (15)

11. What are the two main jobs of a symbol table? What types of errors does it catch in performing these jobs? (10)

12. What are the main operations for a symbol table? Discuss the data structures associated with a symbol table maintained as a list of hashtables and how the operations of a symbol table are implemented in that case. Give an example of what your symbol table would look like for a sample program. (15)

13. Answer the previous question but where the symbol table is maintained as a hash table of lists. (15)

14. What is a type system? Who defines it and how is it used in creating a compiler? (10)

15. Give three examples of tasks that are typically performed in a type checker. What types of errors are recognized in a type checker? (10)

16. What are three of the constructors available for constructing types? What operations are typically associated with these constructors and how are they used in type checking? (10)

17. What is the difference between static and dynamic type checking? What is the difference between name and structural equivalence? What is a cyclic type? Give examples to demonstrate your answers. (15)

18. Show an outline of how a piece of code might work to type check a divide operator that can apply to two operands of type int or float but does not apply to values of type bool. How would your code change if coercion from bools to ints and ints to floats is allowed (but not bools to floats)? (15)
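A possible outline for the divide check is sketched below. The Type enumeration and both function names are hypothetical, and the strict version assumes that the two operands must have the same type when no coercion exists (the question does not say whether int / float is legal without coercion).

```c
typedef enum { TY_BOOL, TY_INT, TY_FLOAT, TY_ERROR } Type;

/* No coercion: both operands must be int, or both must be float. */
Type check_div_strict(Type left, Type right) {
    if (left == TY_INT && right == TY_INT) return TY_INT;
    if (left == TY_FLOAT && right == TY_FLOAT) return TY_FLOAT;
    return TY_ERROR;
}

/* One-step coercions only: bool -> int and int -> float.
   bool -> float is deliberately not allowed. */
static int can_coerce(Type from, Type to) {
    return from == to
        || (from == TY_BOOL && to == TY_INT)
        || (from == TY_INT && to == TY_FLOAT);
}

/* With coercion: try the two legal operand types, narrowest first. */
Type check_div_coerce(Type left, Type right) {
    if (left == TY_ERROR || right == TY_ERROR) return TY_ERROR;
    if (can_coerce(left, TY_INT) && can_coerce(right, TY_INT)) return TY_INT;
    if (can_coerce(left, TY_FLOAT) && can_coerce(right, TY_FLOAT)) return TY_FLOAT;
    return TY_ERROR;
}
```

With coercion, bool / int checks as int and int / float checks as float, while bool / float is still rejected because it would need the disallowed bool-to-float step.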

19. How would you type check a for statement from C++? What symbol table issues are introduced by a for statement? (15)

20. What does it mean for a name to be overloaded? What sections of your compiler must be adapted to allow for overloading? (15)

21. What are the advantages and disadvantages of an interpreter? Describe two different types of interpreters (as discussed in class). (15)

22. What are some of the key questions that must be answered in order to implement an interpreter that works by first parsing an entire file and then executing the resulting AST (you may indicate other steps of compilation you think should be included before interpretation occurs)? (15)

23. How is memory generally laid out for a program? What are the various parts of the memory associated with the program used for? (10)

24. What are the components of an activation record? How is an AR used? (10)

25. Data can be allocated statically, on the stack, or in the heap. What types of variables are the different sections of the program used for? (10)

CSCI 310 Spring 2004, Final Exam Review Questions

1. Regular expressions

a. Give a regular expression for an identifier in MiniJava.

b. Give a regular expression for comments in MiniJava.

2. Suppose we have token A matching regular expression "abc", token B matching regular expression "abc*" in our grammar. On input abcc, which token is matched? Why?

3. What is an ambiguous grammar? Why are they bad when used to design or implement programming languages?

4. Define left-recursive and right-recursive as they pertain to grammars.

5. Consider this excerpt of our MiniJava grammar:

E -> E op E
E -> E [E]
E -> id | integerLiteral
op -> + | - | *

a. Construct the First and Follow sets for this grammar.

b. Can this grammar be parsed by an LL(0) parser? If not, rewrite the grammar so that it can be parsed by an LL(0) parser.

c. Construct the state diagram and shift/reduce table for parsing the above grammar with an LR(0) parser.

d. Can this grammar be parsed by an LR(0) parser? Can it be parsed by an SLR parser? If not, rewrite the grammar so that it can be parsed by an SLR parser.

10. A common syntax error is a misspelled reserved word. How would you modify your scanner and/or parser to specifically catch misspellings of reserved words? As long as you catch them, could you just go ahead and continue compiling (and in fact output an executable)? (If so, explain how; if not, explain why not.)

11. Our compiler operates in several passes -- parser, symbol table construction, type checker, IRT construction, code generation. This is convenient, but is it necessary? Explain any features of MiniJava that require multiple passes.

12. The Token class generated by sablecc includes fields with the line number and position for each token. Assuming you do not build an abstract syntax tree, but instead pass the entire parse tree onto the later passes of the compiler, explain how you could build a Visitor class that goes through the parse tree and assigns a line number to each expression and statement node in the parse tree.

13. Explain the issues involved in typechecking

x = this.foo(1, 3, false);

(i.e. describe all the things you must check for correctness.)

15. Explain the use of the static link in the Activation Record.

16. Suppose your language could have functions as a return type -- i.e. a method could return a function as a return value. Describe the issues involved in implementing this in a compiler.

17. Translate: x = a + 5 into an IR Tree.

18. Suppose you did not implement any operator precedence in your grammar. How will this impact your implementation of IR Trees?

19. Data I use in my research is given to me in a form that is easy for humans to read and write, but not very convenient for the processing I do on it. Therefore, these data files get translated into a different format. The human readable format looks like the following:

SUBJECT ID : S10

CRITERIA 2304 : operations
CATEGORY 4240 : does not apply
CARDS: 4 6 7 10 13 14 15 16 18 19 21 23 26
CATEGORY 4241 : objects in the operation
CARDS: 5 11 17 22 25
CATEGORY 4242 : coding features to change the object
CARDS: 8 9 12 20 24
CATEGORY 4243 : names of the operation
CARDS: 1 2 3

CRITERIA 2305 : things i learned in computer science
CATEGORY 4244 : CS I
CARDS: 1 3 5 7 8 9 10 13 17 20
CATEGORY 4245 : CS II
CARDS: 2 11 12 14 16 18 19 21 24 25 26
CATEGORY 4246 : Data Structures
CARDS: 4 22
CATEGORY 4247 : haven't learned
CARDS: 6 15 23

CRITERIA 2306 : places i usually define things, make things, create
CATEGORY 4248 : does not apply
CARDS: 4 6 7 10 15 23
CATEGORY 4249 : defined outside of method
CARDS: 2 5 11 14 17 18 19 25
CATEGORY 4250 : made or constructed inside method
CARDS: 1 3 8 9 12 13 16 20 21 22 24 26
*
SUBJECT ID : S09

CRITERIA 2300 : things at same level in terms of abstraction when you write a program
CATEGORY 4227 : synonyms for function
CARDS: 1 2 3
CATEGORY 4228 : broad terms looking at program from high level
CARDS: 5 7 15
CATEGORY 4229 : things you might see within a method
CARDS: 8 9 12 16 19 20 21 24
CATEGORY 4230 : things you would see in a class
CARDS: 11 17 18 25
CATEGORY 4231 : things that don't fit in the others
CARDS: 4 6 10 13 14 22 23 26

Each set of data for a subject begins with the keyword SUBJECT ID, a ':', and then some string that starts with S or E and is followed by integers. The data is then grouped by CRITERIA, with several CATEGORYs in each criteria, each having a set of CARDS. The data for a particular subject ends either when we hit an asterisk, in which case it will be followed by another subject, or when we hit the end-of-file, in which case there is no more data.

The format we want to put the data in is:

4240,S10,2304,does not apply,00010110010011110110101001
4241,S10,2304,objects in the operation,00001000001000001000010010
4242,S10,2304,coding features to change the object,00000001100100000001000100
4243,S10,2304,names of the operation,11100000000000000000000000
4244,S10,2305,CS I,10101011110010001001000000
4245,S10,2305,CS II,01000000001101010110100111
4246,S10,2305,Data Structures,00010000000000000000010000
4247,S10,2305,haven't learned,00000100000000100000001000
4248,S10,2306,does not apply,00010110010000100000001000
4249,S10,2306,defined outside of method,01001000001001001110000010
4250,S10,2306,made or constructed inside method,10100001100110010001110101

In this comma-separated format, each line holds the category number, the subject id, the criteria number, the category name, and a 26-length vector of 0's and 1's indicating which cards are in the category described by this line.
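The last field is the only one that needs real conversion: the list of card numbers on a CARDS line becomes a fixed-width string of flags. A minimal sketch of that step follows; the function name and the NUM_CARDS constant are hypothetical, and the argument is assumed to be the text after "CARDS:".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_CARDS 26

/* Fill vec (at least NUM_CARDS + 1 bytes) with '0'/'1' flags for the card
   numbers listed in cards, e.g. "1 2 3". Position k (counting from 1) is
   '1' when card k is listed. Numbers outside 1..NUM_CARDS are ignored. */
void cards_to_vector(const char *cards, char *vec) {
    assert(cards != NULL && vec != NULL);
    memset(vec, '0', NUM_CARDS);
    vec[NUM_CARDS] = '\0';
    const char *s = cards;
    for (;;) {
        char *end;
        long n = strtol(s, &end, 10);   /* skips leading blanks itself */
        if (end == s) break;            /* no further number was found */
        if (n >= 1 && n <= NUM_CARDS) vec[n - 1] = '1';
        s = end;
    }
}
```

For the CARDS line "1 2 3" of category 4243 this produces 11100000000000000000000000, which matches the sample output line for that category.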

1. Give a grammar to describe the human-readable format.

2. Assuming you have a parser that can handle your grammar and produces code and a visitor structure like sablecc, how would you write a visitor to convert the parsed data into the comma-separated format?

63. Rewrite the trees in these two files to remove ESEQs: http://www.cs.xu.edu/csci310/05s/tct12.out and http://www.cs.xu.edu/csci310/05s/tct3.out

Compilers G22.2130, Spring 2003

Sample Exam Questions

1.a. Construct a nondeterministic finite automaton (NFA) for each of the following regular expressions:

i. (a|b)*
ii. (a*|b*)*
iii. ((|a)b*)*

b. Construct a deterministic finite automaton (DFA) from each NFA constructed above.

2.a. Construct the set of LR(0) items for the following grammar (for regular expressions):

R -> R '|' R | RR | R* | (R) | a | b

Note that the first bar, '|', is a symbol in the alphabet, not the separator meta-symbol. The quote symbol is not in the alphabet, though.

b. Construct the SLR parse table for the above grammar. Resolve any shift-reduce conflicts according to the following precedence rules:

* has the highest precedence and is left-associative.
Concatenation has the second highest precedence and is left associative.
'|' has the lowest precedence and is left associative.

3. Describe the advantages and disadvantages of generating intermediate code (such as quads) vs. generating machine code directly from an AST.

4. Write the code for generating three-address intermediate code for a for-loop in C of the form

for(e1; e2; e3) stmt

Assume that the code for generating expressions and statements has already been written.
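One common shape for the generated code is: initialization, a test label, the condition, a conditional jump to the exit, the body, the increment, and a jump back to the test. The sketch below assembles that shape from code strings that are assumed to have been produced already by the expression and statement generators. The function names, the L<n> label format, and the "ifFalse ... goto" spelling are all assumptions, and break/continue are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static int next_label = 0;
static int new_label(void) { return next_label++; }

/* Emit three-address code for:  for (e1; e2; e3) stmt
   e1_code, e2_code, e3_code, stmt_code: already-generated code, each ending
   in a newline. e2_place names the temporary that holds the value of e2.
   Writes at most cap bytes to out; returns what snprintf returns. */
int gen_for(char *out, size_t cap,
            const char *e1_code, const char *e2_code, const char *e2_place,
            const char *e3_code, const char *stmt_code) {
    assert(out != NULL && cap > 0);
    int test = new_label();
    int done = new_label();
    return snprintf(out, cap,
                    "%s"                       /* e1: run once */
                    "L%d:\n"                   /* top of loop: test */
                    "%s"                       /* evaluate e2 */
                    "ifFalse %s goto L%d\n"    /* leave when e2 is false */
                    "%s"                       /* loop body */
                    "%s"                       /* e3: increment */
                    "goto L%d\n"               /* back to the test */
                    "L%d:\n",                  /* exit label */
                    e1_code, test, e2_code, e2_place, done,
                    stmt_code, e3_code, test, done);
}
```

Supporting continue would need a third label placed just before the e3 code, since continue in a C for-loop must still run the increment.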

5.a. Construct a register interference graph for the program shown in figure 9.13 on page 544 of the dragon book.

b. What is the minimum number of physical registers needed in order to avoid spilling?

6. Given the following Pascal program,

Program Bar;
  Procedure Top;
    var z: integer;
    Procedure A(x: integer);
    begin
      writeln(x+z);
    end;
    Procedure B(y: integer);
      Procedure C(Procedure Q(z:integer))
      begin
        Q(y);
      end;
    begin
      C(A);
    end
  begin
    z := 7;
    B(6);
  end
begin
  Top;
end

draw the stack, including all the elements (local variables, parameters, static link, dynamic link) of each activation record after Q is called in the body of C.

29. Define the terms basic block, flow graph, local optimization, global optimization, interprocedural optimization, and peephole optimization.

CSCI 4627, Spring 2003

Answers to practice questions for exam 1

1. Give a regular expression using flex notation that describes the set of strings that start with a, end with c and have zero or more digits between them.

a[0-9]*c

2. Write a program using flex that copies the standard input to the standard output, except that it replaces each sequence of consecutive blanks and tabs by a single blank.

%{
#include <stdio.h>
%}
%%
[ \t]+   {printf(" ");}
"\n"     {printf("\n");}
.        {printf("%s", yytext);}
%%
int main()
{
  yylex();
  return 0;
}

16. Show that the following grammar is ambiguous.

S -> S c S

S -> d

It suffices to show two different parse trees for dcdcd.

        S                  S
       /|\                /|\
      / | \              / | \
     S  c  S            S  c  S
    /|\    |            |    /|\
   / | \   d            d   / | \
  S  c  S                  S  c  S
  |     |                  |     |
  d     d                  d     d

27. Show a parse tree and leftmost derivation for string cccdd with respect to grammar

S -> c S d

S -> c

S => cSd

=> ccSdd

=> cccdd

      S
     /|\
    / | \
   c  S  d
     /|\
    / | \
   c  S  d
      |
      c

39. What are the First and Follow sets for the following grammar? The start nonterminal is E.

E -> A

E -> L

A -> n

A -> i

L -> ( S )

S -> E , S

S -> E

First(E) = {'n', 'i', '(' }

First(A) = {'n', 'i' }

First(L) = {'(' }

First(S) = {'n', 'i', '(' }

Follow(E) = {',', ')', $ }

Follow(A) = {',', ')', $ }

Follow(L) = {',', ')', $ }

Follow(S) = {')' }

41. Left factor the grammar of the preceding question and then write a recursive descent parser for it. The parser should only read a string and tell whether it is in this language.

The tokens of this language are characters. Presume a lexical analyzer is available, and that statement match(c) will check the current lookahead token to see whether it is character c. If so, it will put the next token into variable lookahead. If not, it will print "no" and stop the program. Statement INIT_LEXER initializes the lexer, setting lookahead to the first token.

E -> A

E -> L

A -> n

A -> i

L -> ( S )

S -> E S'

S'-> , S

S'-> (empty)

void E(), A(), L(), S(), Sprime();

void E() { if(lookahead == '(') L(); else A(); }

void A() { if(lookahead == 'n') match('n'); else match('i'); }

void L() { match('('); S(); match(')'); }

void S() { E(); Sprime(); }

void Sprime() { if(lookahead == ',') { match(','); S(); } }

int main() { INIT_LEXER; E(); printf("yes"); return 0; }

42. When a (top-down) LL(1) parser needs to decide whether to use a production, how much of the sequence of tokens generated by the right-hand side of the production can the parser see? When a (bottom-up) SLR(1) parser needs to make the same decision, how much does it see?

The LL(1) parser sees just the first token of the sequence of tokens generated by the right-hand side of the production. The SLR(1) parser sees the entire right hand side plus one more token after that.

Midterm #1: Sample Questions (v1.2)

This page includes questions by the instructor (John Boyland) and will include questions that students in CompSci 654 wrote as part of Homework #4 (edited and made anonymous). At least one of the (appropriate) questions will appear on the first midterm. The actual midterm will have four questions, although some may consist of a set of short-answer/review questions.

Review Questions

Any of the review questions from Chapters 1 and 2 may appear on the midterm.

Definitions and Short Answers

If one is designing and implementing a new language, what steps are needed?

What is the difference between a phase of a compiler and a pass of a compiler?

Briefly describe why one may wish to split a compiler into modules corresponding to the phases.

Give advantages and disadvantages to using tools such as flex and bison to implement parts of a compiler.

Describe what the ``front-end'' and ``back-end'' of a compiler are. Why are they distinguished?

What is a checked dynamic semantic error?

What is maximal munch? When does it come into play? What should you do about it?

Scanners need to return a series of tokens, whereas a DFA just returns Y or N. How does one implement a scanner using a DFA? What problems can you foresee in the implementation?

What is the difference between a deterministic and a nondeterministic automaton?

Why is it important that the scanner not frequently look at characters multiple times?

List four ways in which a lexical error can be handled. What method does the Cool compiler use?

What advantages do context-free grammars have over regular expressions for describing programming languages?

What is a derivation?

What is a sentential form?

What is the yield of a parse tree?

Compare and contrast top-down and bottom-up parsing in terms of complexity, class of grammars accepted, contents of the parse stack, handling of tokens and type of derivation obtained.

What are the advantages of bottom-up parsing over top-down parsing?

What are FIRST and FOLLOW sets used for in SLL parsing? Which set is used for SLR parsing?

The simplest way to handle an error is to print a message and then stop. Why should one do more? Describe at least three different ways to handle errors, pointing out advantages and disadvantages.

What is a shift/reduce conflict? a reduce/reduce conflict?

What is an ambiguous grammar? Give an example.

What connection is there between ambiguous grammars, inherently ambiguous languages and parsing conflicts?

Why don't we ever get shift/shift conflicts in LR parsing?

What is the difference between a terminator and a separator in a list of items?

Algorithms

Regular Expressions

Construct a regular expression that matches the language of all strings of ones and zeroes which contain exactly 3 zeroes.

Automata for regular expressions

Give the NFA construction for the following regular expression. Then find a DFA for the same language: (0*1+)*0*

Same as above for (aa|bb)*.

Construct an automaton for the following regular expression for Ada integers. A nondeterministic automaton is OK.

[0-9](_?[0-9])*(#[0-9A-Fa-f](_?[0-9A-Fa-f])*#)?

Regular expressions for automata

Give a regular expression for each of the following automata, where the set of states is {A,O}; A is the start state and O is the only final state. The transitions are written as triples:

(from,chars,to)

The automaton can go from state from to state to when the input is any character in the set chars ([ ] denotes a space character):

1. First Automaton
   (A,0,A)
   (A,1,O)
2. Second Automaton (where "e" means epsilon)
   (A,0,A)
   (A,e,O)
   (O,e,A)
   (O,1,O)

Constructing parse trees

Given the following grammar

a -> t + t
t -> f * f | f
f -> ID | INT_CONST

construct a parse tree for 3*5+7. (What additions would be necessary to this grammar to let it parse 3*5*7? Is the grammar ambiguous now?)

LR parsing tables

Given the following grammar, give the LR(0) parse tables (the characteristic finite-state machine (CFSM)):

S -> E
E -> E '+' E
E -> 'a'

List all the LR items associated with the following grammar. Then construct the CFSM.

S -> expr
expr -> expr '*' primary | primary
primary -> '7'

Synthesis

Floating point numbers

It has been proposed to add floating point numbers to Cool. The proposed form for floating point constants is that a single dot occurs anywhere within a non-empty string of digits (including at beginning or end) and that immediately following may appear an exponent of the following form: an ``e'' either upper or lower case, then an optional plus or minus sign and then a nonempty string of digits. Write a regular expression suitable for flex that matches exactly this set of floating-point constants. You may use flex definitions.

Cool ``if'' expressions

It has been proposed to make the else part of a conditional optional for Cool. Write the modified context-free grammar production(s). Do we get the ``dangling else'' problem? Explain!

Give a LR(1) context-free grammar for Cool match expressions.

Arrays in Cool

Write the extra productions to the syntax that would be required to use a nicer syntax for arrays in Cool. Make sure your syntax can handle expressions such as

a[1] := a[i] + 1

What new tree node constructors would we need? Give their C prototypes, and give the bison actions for your new syntactic rules.

Analysis

Regular Expressions

The following regular expression was proposed for C comments:

"/*" [*]* [^*]* [*]* "*/"

but it didn't match the following comment:

/* Here Y=2*x+1; */

and it ate up this entire line as a single comment:

/**/ y = "Hello, world\n"; /**/

Explain why the first comment isn't matched and the second ``comment'' was matched. Then fix the problem.

Some languages (such as Cool) have line comments; others (such as C) use bracketed comments (see above); C++ has both kinds. Compare and contrast the benefits and problems of each kind. In your answer touch on the following issues:

The various purposes of comments. Whether it is clear what things are comments and what things are not. Whether usage is likely to lead to errors. How easy they are to implement.

Does there exist a regular expression which describes non-empty strings containing the same number of a's and b's and no other characters? What about a context-free grammar?

Finite State Automata

Describe the following DFA in both plain (short!) English and as a regular expression. There are four states {A,B,C,D}; A is the start state and D is the only final state:

(A,1,A)(A,0,B)(B,1,C)(B,0,B)(C,0,D)(C,1,A)(D,0,B)(D,1,C)
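A table-driven simulator is a convenient way to experiment with a DFA like this one before describing it in words. The sketch below encodes exactly the eight transition triples listed above; the function name dfa_accepts is hypothetical.

```c
#include <assert.h>

enum { A, B, C, D, NUM_STATES };

/* delta[state][symbol] for symbol 0 or 1, copied from the triples above. */
static const int delta[NUM_STATES][2] = {
    /* A */ { B, A },   /* (A,0,B) (A,1,A) */
    /* B */ { B, C },   /* (B,0,B) (B,1,C) */
    /* C */ { D, A },   /* (C,0,D) (C,1,A) */
    /* D */ { B, C },   /* (D,0,B) (D,1,C) */
};

/* Return 1 if the DFA accepts the string of '0'/'1' characters, else 0. */
int dfa_accepts(const char *input) {
    int state = A;                        /* A is the start state */
    for (; *input != '\0'; input++) {
        assert(*input == '0' || *input == '1');
        state = delta[state][*input - '0'];
    }
    return state == D;                    /* D is the only final state */
}
```

Feeding it a handful of short strings and noting which ones end in state D is a quick way to form a hypothesis about the language before writing the regular expression.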

Associativity

Given the following grammar

expr -> term | expr '+' term
term -> factor | term '*' factor
factor -> INT | '(' expr ')'

is + left or right associative? What about *? Does * have higher or lower precedence than +? Rewrite the grammar so that it accepts the same language, but the associativity of + is reversed.

Ambiguity

In language Drool, we have two operators: the dispatch operator ., and the assignment operator =, which is right-associative. The former has a higher precedence than the latter. Give a non-ambiguous CFG so that expressions such as the following are parsed correctly:

z = x = obj1.f1().f2()

Given the following context-free grammar

stmt : if_then | if_then_else | other_stmts
if_then : IF condition THEN stmt
if_then_else : IF condition THEN stmt ELSE stmt

You may assume that nonterminals other_stmts and condition have already been defined.

1. Show that this context-free grammar is ambiguous.
2. Rewrite the grammar so that it is not ambiguous, but still accepts the same language.

Prove that the following grammar is ambiguous. Write a grammar that accepts the same language that is not ambiguous. Is the language inherently ambiguous? Is it regular?

S -> z S | Z
Z -> | Z z

LL Parsing

Given the following grammar:

stmtlist -> stmtlist stmt | stmt
stmt -> noun iverb adv | noun tverb noun
noun -> COMPILERS | I | YOU
iverb -> SLEEP | WORK
adv -> HARD | WELL
tverb -> LOVE | HATE | SEE

Is it LL(1)? LL(k) for any k? If so, give predict sets. If not, how can it be corrected to be LL(k)? LL(1)?

LR Parsing

Given the context-free grammar

expr1 : expr2 ':' | expr1 ';' ;
expr2 : expr2 '%' ID | ID ;

Is this grammar LL(1)? LR(1)?

Given the following grammar fragment for Cool:

expr : ... | IF '(' expr ')' expr ELSE expr | expr '+' expr | ... ;

When the Cool grammar is processed by bison we get a shift-reduce conflict for '+':

state 146

    expr -> IF '(' expr ')' expr ELSE expr .   (rule 35)
    expr -> expr . '+' expr

    '+'         shift, and go to state 92

    '+'         [reduce using rule 35 (expr)]
    $default    reduce using rule 35 (expr)

Give a sample Cool expression that will lead the parser into such a state when parsing. In this case, what is the correct action for the parser to take?

Describe the ``dangling else'' problem of Pascal and C/C++. Is the problem we see here an instance of the ``dangling else'' problem?

If we don't correct the problem, how will bison resolve the conflict? Will this achieve the correct result for all inputs?

How can the problem be fixed if we do not rely on bison's default conflict resolution?


P -> {D ; C}

D -> d; D| d

C -> c; C | c

a) Is the grammar LL(1)? Explain your answer.

b) Is the grammar SLR(1)? Explain your answer.

c) Is the grammar LR(1)? Explain your answer.

d) Is the grammar LALR? Explain your answer.

As for my answers I actually got no for them all... so I'm thinking I did something wrong

Here is my explanation.

a) It is not LL(1) because it is not left factored.

b) It is not SLR, because of the transition diagram

item 2 ( which is... )

D-> d . ; D

D-> d .

We need to consult the follow set, Follow(D) = ;

Therefore this is not SLR

c) It is not LR(1) because of...

item 1

P-> {D.;C} , $

D-> .d;D , ;

D-> .d , ;

item 2

D-> d.; D , ;

D-> d. , ;

item 3

D-> d; . D , ;

D-> .d;D , ;

D-> .d , ;

Since item 2 goes to item 3 on ;, and the lookahead token of "D-> d." (in item 2) is also ;, this causes a shift/reduce conflict, therefore this grammar is not LR(1)

d) This grammar is not LALR because it is not LR(1)

Thanks for your help!
