Grammar Variation in Compiler Design


Page 1: Grammar Variation in Compiler Design

Grammar Variation in Compiler Design

Carl Wu

Page 2: Grammar Variation in Compiler Design

Three topics

• Syntax Grammar vs. AST

• Component(?)-based grammar

• Aspect-oriented grammar

Page 3: Grammar Variation in Compiler Design

Grammar vs. AST (I)

How to automatically generate a tree from a grammar?

Page 4: Grammar Variation in Compiler Design

Grammar vs. AST (I)

Stmt ::= Block
       | “if” Exp “then” Stmt
       | IdUse “:=” Exp

Page 5: Grammar Variation in Compiler Design

Grammar vs. AST (I)

Stmt ::= Block | “if” Exp “then” Stmt | IdUse “:=” Exp

JastAdd Specification (Tree)

abstract Stmt;
BlockStmt : Stmt ::= Block;
IfStmt : Stmt ::= Exp Stmt;
AssignStmt : Stmt ::= IdUse Exp;

Page 6: Grammar Variation in Compiler Design

Grammar vs. AST (I)

Restricted CFG Definition

A ::= B C D √ => aggregation

A ::= B | C | D √ => inheritance

A ::= B C | D ×
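The restriction above can be checked mechanically. The following is a minimal sketch, assuming production bodies are given as plain strings with `|` separating alternatives and whitespace separating symbols; the class `RestrictedCfg` and its `classify` method are hypothetical names, not part of any tool mentioned in the slides.

```java
// Hypothetical helper: classifies a production body under the restricted-CFG rule.
final class RestrictedCfg {
    enum Kind { AGGREGATION, INHERITANCE, REJECTED }

    // body is the right-hand side of a production, e.g. "B C D" or "B | C | D".
    static Kind classify(String body) {
        String[] alternatives = body.trim().split("\\s*\\|\\s*");
        if (alternatives.length == 1) {
            return Kind.AGGREGATION;       // pure sequence: A has children B, C, D
        }
        for (String alt : alternatives) {
            if (alt.trim().split("\\s+").length != 1) {
                return Kind.REJECTED;      // an alternative that is itself a sequence
            }
        }
        return Kind.INHERITANCE;           // pure choice: B, C, D become subclasses of A
    }
}
```

A rejected production such as `A ::= B C | D` can usually be brought into the restricted form by naming the sequence, for example `A ::= BC | D` with `BC ::= B C`.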

Page 7: Grammar Variation in Compiler Design

Grammar vs. AST (I)

RCFG Specification

Stmt :: Block | IfStmt | AssignStmt

IfStmt :: “if” Exp “then” Stmt

AssignStmt :: IdUse “:=” Exp

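[Class diagram: Stmt is the superclass of Block, IfStmt, and AssignStmt; IfStmt has children Exp and Stmt; AssignStmt has children IdUse and Exp.]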
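As one possible answer to "how to automatically generate a tree from a grammar", the sketch below turns RCFG rules into tree declarations written in the same notation as the JastAdd specification on the earlier slide. It is only an illustration under stated assumptions: rules are strings of the form `Lhs :: body`, terminals are written in ASCII double quotes and contain neither `|` nor whitespace, and `TreeSpecGenerator` is a hypothetical name rather than an existing tool.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generator: RCFG rules in, tree declarations out.
final class TreeSpecGenerator {
    static List<String> generate(List<String> rules) {
        Map<String, String> superOf = new LinkedHashMap<>();    // subclass -> superclass
        Map<String, String> childrenOf = new LinkedHashMap<>(); // class -> child symbols
        List<String> out = new ArrayList<>();

        for (String rule : rules) {
            String[] parts = rule.split("::", 2);
            String lhs = parts[0].trim();
            String body = parts[1].trim();
            if (body.contains("|")) {
                // Pure choice: the left-hand side becomes an abstract superclass.
                out.add("abstract " + lhs + ";");
                for (String alt : body.split("\\|")) {
                    superOf.put(alt.trim(), lhs);
                }
            } else {
                // Pure sequence: keep nonterminals as children, drop quoted terminals.
                StringBuilder children = new StringBuilder();
                for (String sym : body.split("\\s+")) {
                    if (!sym.startsWith("\"")) {
                        if (children.length() > 0) children.append(' ');
                        children.append(sym);
                    }
                }
                childrenOf.put(lhs, children.toString());
            }
        }
        for (Map.Entry<String, String> e : childrenOf.entrySet()) {
            String sup = superOf.get(e.getKey());
            String head = (sup == null) ? e.getKey() : e.getKey() + " : " + sup;
            out.add(head + " ::= " + e.getValue() + ";");
        }
        return out;
    }
}
```

Fed the three RCFG rules from this slide, the sketch yields `abstract Stmt;`, `IfStmt : Stmt ::= Exp Stmt;` and `AssignStmt : Stmt ::= IdUse Exp;`. Block gets no declaration because the slide gives no rule for it.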

Page 8: Grammar Variation in Compiler Design

Grammar vs. AST (II)

Parse tree vs. IR tree

Page 9: Grammar Variation in Compiler Design

Grammar vs. AST (II)

• In an IDE, there are multiple visitors for the same source code (>12 !).

• Different requirements for the tree structure:
– Syntax vs. semantics
– Immutable vs. transformable (optimization)
– Parse tree vs. IR tree

Page 10: Grammar Variation in Compiler Design

Grammar vs. AST (II)

• Generate two tree structures from the same grammar!

• One immutable, strongly typed, concrete parse tree – Read only!

• One transformable, untyped, abstract IR tree – Read and write!

Page 11: Grammar Variation in Compiler Design

Grammar vs. AST (II)

IfStmt :: “if” Exp “then” Stmt

class ASTNode {
    // IR tree: untyped, writable child array
    protected ASTNode[] children;
}

class IfStmt extends ASTNode {
    // Parse tree: typed, final fields (read only)
    protected final Token token_if;
    protected final Exp exp;
    protected final Token token_then;
    protected final Stmt stmt;

    IfStmt(Token token_if, Exp exp, Token token_then, Stmt stmt) {
        // parse tree construction
        this.token_if = token_if;
        this.exp = exp;
        this.token_then = token_then;
        this.stmt = stmt;
        // IR tree construction (Exp and Stmt are assumed to extend ASTNode)
        children = new ASTNode[] { exp, stmt };
    }
}
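To make the read-only versus read-and-write distinction concrete, here is a self-contained sketch of the same idea. The stub classes `Token`, `Exp` and `Stmt`, the accessors `getExp`, `getChild` and `setChild`, and the choice to let `IfStmt` extend `Stmt` are all assumptions added for illustration; the slide shows only the fields and the constructor.

```java
class Token {
    final String text;
    Token(String text) { this.text = text; }
}

class ASTNode {
    // IR view: untyped and writable.
    protected ASTNode[] children = new ASTNode[0];
    ASTNode getChild(int i) { return children[i]; }
    void setChild(int i, ASTNode node) { children[i] = node; }
}

class Exp extends ASTNode {
    final String text;
    Exp(String text) { this.text = text; }
}

class Stmt extends ASTNode { }

class IfStmt extends Stmt {
    // Parse-tree view: typed, final, exposed through getters only.
    private final Token tokenIf;
    private final Exp exp;
    private final Token tokenThen;
    private final Stmt stmt;

    IfStmt(Token tokenIf, Exp exp, Token tokenThen, Stmt stmt) {
        this.tokenIf = tokenIf;
        this.exp = exp;
        this.tokenThen = tokenThen;
        this.stmt = stmt;
        this.children = new ASTNode[] { exp, stmt };  // IR tree construction
    }

    Exp getExp() { return exp; }
    Stmt getStmt() { return stmt; }
}

class DualTreeDemo {
    public static void main(String[] args) {
        IfStmt node = new IfStmt(new Token("if"), new Exp("x"),
                                 new Token("then"), new Stmt());
        // A transformation pass rewrites the IR view only.
        node.setChild(0, new Exp("true"));
        System.out.println(node.getExp().text);              // prints "x"
        System.out.println(((Exp) node.getChild(0)).text);   // prints "true"
    }
}
```

After the rewrite the two views disagree on purpose: tools that need the source as written read the typed fields, while optimization passes work on `children`.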

Page 12: Grammar Variation in Compiler Design

Component(?)-based grammar

Page 13: Grammar Variation in Compiler Design

Component vs. module

• What is the difference between a component and a module?

• What is a modularized grammar?

• What is an ideal component-based grammar?

Page 14: Grammar Variation in Compiler Design

Component vs. module

[Diagram comparing two setups. Modularized grammar: Grammar Modules → Grammar → Parser. Component-based grammar: Grammar Components → Parsers.]

Page 15: Grammar Variation in Compiler Design

Benefits

• Benefits of a modularized grammar
– Easy to read, write, and change
– Eliminates naming conflicts

• Additional benefits of a component-based grammar
– Each component can be designed, developed, and tested individually.
– A change to one component does not require recompiling all the other components.
– Different types of grammars/parsing algorithms can be used for different components, e.g., one component can be LL, another LALR.

Page 16: Grammar Variation in Compiler Design

Difficulty in designing component-based grammar

• No clear guards between two components.
– Switch control to a new parser or stay in the same one?
– Suitable for embedded languages, e.g., JScript in HTML
– Not suitable for an integral language, e.g., Java

• Too much coupling between two components.
– A grammar may reuse not just a component as a whole, but also its internal productions and symbols.
– Not applicable to LR parsers: once the table is built, you can’t reuse the internal productions (there is no way to jump into a table).
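The embedded-language case, where guards are clear, can be sketched as follows. This is a toy illustration only: `ParserComponent` and `HostParser` are hypothetical names, the guards are assumed to be fixed strings such as `<script>` and `</script>`, and each component returns a plain string label in place of a real subtree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component interface: each component parses only its own fragments.
interface ParserComponent {
    String parse(String fragment);
}

// Hands control to the embedded component between an opening and a closing guard.
final class HostParser {
    private final String openGuard;
    private final String closeGuard;
    private final ParserComponent host;
    private final ParserComponent embedded;

    HostParser(String openGuard, String closeGuard,
               ParserComponent host, ParserComponent embedded) {
        this.openGuard = openGuard;
        this.closeGuard = closeGuard;
        this.host = host;
        this.embedded = embedded;
    }

    List<String> parse(String input) {
        List<String> result = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int open = input.indexOf(openGuard, pos);
            if (open < 0) {
                result.add(host.parse(input.substring(pos)));   // no more guards
                break;
            }
            if (open > pos) {
                result.add(host.parse(input.substring(pos, open)));
            }
            int bodyStart = open + openGuard.length();
            int close = input.indexOf(closeGuard, bodyStart);
            if (close < 0) {
                throw new IllegalArgumentException("unterminated embedded fragment");
            }
            // Switch control: the embedded component sees only the guarded text.
            result.add(embedded.parse(input.substring(bodyStart, close)));
            pos = close + closeGuard.length();
        }
        return result;
    }
}
```

The sketch works only because the guards are unambiguous strings. An integral language such as Java has no such markers between, say, statements and expressions, which is the difficulty the slide points at.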

Page 17: Grammar Variation in Compiler Design

Ideal vs. reality

[Figure: two diagrams, ideal and reality, over the same Java grammar components: JavaClass, Interface, Object_type, Statement, Expression, Type, Binary_expr, Unary_expr, Primary, Array.]

Page 18: Grammar Variation in Compiler Design

Suggestions?

Page 19: Grammar Variation in Compiler Design

Aspect-oriented grammar

Page 20: Grammar Variation in Compiler Design

Aspect-oriented grammar

• Join-point: grammar patterns that crosscut multiple productions

• Punctuation, identifiers, modifiers…

Page 21: Grammar Variation in Compiler Design

Example

• “;” appears 25 times in one of the Java grammars

• “.” appears 74 times in one of the Cobol grammars

• Every one of them should be carefully placed!

Page 22: Grammar Variation in Compiler Design

<Sentence> ::= <Accept Stm> '.'
             | <Add Stm> '.'
             | <Add Stm Ex> <End-Add Opt> '.'
             | <Call Stm> '.'
             | <Call Stm Ex> <End-Call Opt> '.'
             | <Close Stm> '.'
             | <Compute Stm> '.'
             | <Compute Stm Ex> <End-Compute Opt> '.'
             | <Display Stm> '.'
             | <Divide Stm> '.'
             | <Divide Stm Ex> <End-Divide Opt> '.'
             | <Evaluate Stm> <End-Evaluate Opt> '.'
             | <If Stm> <End-If Opt> '.'
             | <Move Stm> '.'
             | <Move Stm Ex> <End-Move Opt> '.'
             | <Multiply Stm> '.'
             | <Multiply Stm Ex> <End-Multiply Opt> '.'
             | <Open Stm> '.'
             | <Perform Stm> '.'
             | <Perform Stm Ex> <End-Perform Opt> '.'
             | <Read Stm> '.'
             | <Read Stm Ex> <End-Read Opt> '.'
             | <Release Stm> '.'
             | <Rewrite Stm> '.'
             | <Rewrite Stm Ex> <End-Rewrite Opt> '.'
             | <Set Stm> '.'
             | <Start Stm> '.'
             | <Start Stm Ex> <End-Start Opt> '.'
             | <String Stm> '.'
             | <String Stm Ex> <End-String Opt> '.'
             | <Subtract Stm> '.'
             | <Subtract Stm Ex> <End-Substract Opt> '.'
             | <Write Stm> '.'
             | <Write Stm Ex> <End-Write Opt> '.'
             | <Unstring Stm> '.'
             | <Unstring Stm Ex> <End-Unstring Opt> '.'
             | <Misc Stm> '.'

pointcut PreDot(): <Sentence>;

after PreDot(): '.';
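One way to read the `after` advice is as a rewrite that appends the terminal to every alternative of the production selected by the pointcut. The sketch below assumes that reading and a string representation with `|` between alternatives; `AfterAdvice.weaveAfter` is a hypothetical helper, not an existing weaver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical weaver for a single "after" advice.
final class AfterAdvice {
    // Appends `terminal` to every alternative of one production body.
    static String weaveAfter(String body, String terminal) {
        List<String> woven = new ArrayList<>();
        for (String alt : body.split("\\|")) {
            woven.add(alt.trim() + " " + terminal);
        }
        return String.join(" | ", woven);
    }
}
```

With this, the base production can list the statements without any dots, and `weaveAfter(body, "'.'")` puts one back on each alternative. An equivalent reading appends a single `'.'` after the whole `<Sentence>` choice; which one a weaver should emit is a design decision the slide leaves open.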

Page 23: Grammar Variation in Compiler Design

Another example

pointcut Content(): … …

before Content(): “(”;

after Content(): “)”;

Guarantee they match!
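The matching guarantee follows from declaring the opening and closing terminals in one place and weaving them together. Here is a minimal sketch under assumptions: the grammar is a map from nonterminal to its list of alternatives, the pointcut is simply a nonterminal name, and `GrammarAspect` plus the nonterminals in the usage note are hypothetical, since the slide elides the actual pointcut.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aspect: one pointcut plus a before/after pair that is woven as a unit.
final class GrammarAspect {
    private final String pointcut;  // nonterminal whose alternatives are advised
    private final String before;
    private final String after;

    GrammarAspect(String pointcut, String before, String after) {
        this.pointcut = pointcut;
        this.before = before;
        this.after = after;
    }

    // Returns the result grammar; the base grammar is left untouched.
    Map<String, List<String>> weave(Map<String, List<String>> base) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> rule : base.entrySet()) {
            List<String> alternatives = new ArrayList<>();
            for (String alt : rule.getValue()) {
                alternatives.add(rule.getKey().equals(pointcut)
                        ? before + " " + alt + " " + after   // both sides or neither
                        : alt);
            }
            result.put(rule.getKey(), alternatives);
        }
        return result;
    }
}
```

For example, weaving `new GrammarAspect("Content", "\"(\"", "\")\"")` into a base grammar with a hypothetical rule `Content -> Exp | ArgList` gives `"(" Exp ")"` and `"(" ArgList ")"`. Because both terminals are added in the same step, no alternative can end up with only one of them.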

Page 24: Grammar Variation in Compiler Design

Grammar weaving

[Diagram: Base Grammar + Grammar Aspect → Result grammar → Parser]

Page 25: Grammar Variation in Compiler Design

What do you think?