编译器设计07

这一节还是继续深入语法的研究。主要讲述了循环语句和条件判断语句,错误同步和恢复,Pascal的语法检查器,for循环语句和If条件语句的的解析。
展开查看详情

1.CS 153: Concepts of Compiler Design September 11 Class Meeting Department of Computer Science San Jose State University Fall 2018 Instructor: Ron Mak www.cs.sjsu.edu/~mak 1

2.2 Quick Review of the Framework FROM: TO: Next topic: Parsing control statements, and generating parse trees. Chapter 7

3.3 Pascal Control Statements Looping statements REPEAT UNTIL WHILE DO FOR TO FOR DOWNTO Conditional statements IF THEN IF THEN ELSE CASE

4.4 Statement Syntax Diagram

5.5 Pascal Statement Parsers New statement parser subclasses. RepeatStatementParser WhileStatementParser ForStatementParser IfStatementParser CaseStatementParser Each parse() method builds a parse subtree and returns the root node.

6.6 REPEAT Statement Example: Keep looping until the boolean expression becomes true . Execute the loop at least once. Use LOOP and TEST nodes for source language independence. Exit the loop when the test expression evaluates to true . REPEAT j := i ; k := i UNTIL i <= j

7.7 Syntax Error Handling Recall that syntax error handling in the front end is a three-step process . Detect the error. Flag the error. Recover from the error. Good syntax error handling is important!

8.8 Options for Error Recovery Stop after the first error. No error recovery at all. Easiest for the compiler writer, annoying for the programmer. Worse case: The compiler crashes or hangs. Become hopelessly lost. Attempt to continue parsing the rest of the source program. Spew out lots of irrelevant and meaningless error messages. No error recovery here, either … … but the compiler writer doesn ’ t admit it!

9.9 Options for Error Recovery , cont’d Skip tokens after the erroneous token until … The parser finds a token it recognizes, and It can safely resume syntax checking the rest of the source program.

10.10 Parser Synchronization Skipping tokens to reach a safe, recognizable place to resume parsing is known as synchronizing . “ Resynchronize the parser ” after an error. Good error recovery with top-down parsers can be more art than science. How many tokens should the parser skip? Skipping too many (the rest of the program?) can be considered “ panic mode ” recovery. For this class, we ’ ll take a rather simplistic approach to synchronization.

11.11 public Token synchronize ( EnumSet syncSet ) throws Exception { Token token = currentToken (); if ( ! syncSet.contains ( token.getType ()) ) { errorHandler.flag (token, UNEXPECTED_TOKEN, this); do { token = nextToken (); } while (!(token instanceof EofToken ) && ! syncSet.contains ( token.getType ()) ); } return token; } Method synchronize() The synchronize() method of class PascalParserTD . Pass it an enumeration set of “ good ” token types. The method skips tokens until it finds one that is in the set . Flag the first bad token. Recover by skipping tokens not in the synchronization set. Resume parsing at this token! (It ’ s the first token after the error that is in the synchronization set. wci.frontend.pascal.PascalParserTD

12.Parse a REPEAT Statement 12 public ICodeNode parse (Token token)     throws Exception {     ICodeNode statementNode = null;     switch (( PascalTokenType ) token.getType ()) { ...         case REPEAT : {             RepeatStatementParser repeatParser =                 new RepeatStatementParser (this);             statementNode = repeatParser.parse (token) ;             break;         } ...     }     ...     return statementNode ; } wci.frontend.pascal.parsers.StatementParser

13.public ICodeNode parse (Token token)     throws Exception {     token = nextToken ();  // consume the REPEAT     // Create the LOOP and TEST nodes.     ICodeNode loopNode = ICodeFactory.createICodeNode (LOOP);     ICodeNode testNode = ICodeFactory.createICodeNode (TEST);     // Parse the statement list terminated by the UNTIL token.     // The LOOP node is the parent of the statement subtrees.     StatementParser statementParser = new StatementParser (this);     statementParser.parseList (token, loopNode , UNTIL, MISSING_UNTIL);     token = currentToken ();     // Parse the expression.     // The TEST node adopts the expression subtree as its only child.     // The LOOP node adopts the TEST node.     ExpressionParser expressionParser = new ExpressionParser (this);     testNode.addChild ( expressionParser.parse (token) );     loopNode.addChild ( testNode );     return loopNode ; } Parse a REPEAT Statement , cont’d 13 wci.frontend.pascal.parsers.RepeatStatementParser

14.Parse a REPEAT Statement , cont’d 14 protected void parseList (Token token, ICodeNode parentNode ,                          PascalTokenType terminator,                          PascalErrorCode errorCode )     throws Exception {     // Synchronization set for the terminator.     EnumSet < PascalTokenType > terminatorSet = STMT_START_SET.clone ();     terminatorSet.add (terminator);     // Loop to parse each statement until the END token     // or the end of the source file.     while (!(token instanceof EofToken ) &&            ( token.getType () != terminator)) {         // Parse a statement.  The parent node adopts the statement node.         ICodeNode statementNode = parse(token);         parentNode.addChild ( statementNode );         token = currentToken ();         TokenType tokenType = token.getType (); // Synchronization set for starting a statement. protected static final EnumSet < PascalTokenType > STMT_START_SET =     EnumSet.of (BEGIN, CASE, FOR, PascalTokenType.IF , REPEAT, WHILE,                IDENTIFIER, SEMICOLON); wci.frontend.pascal.parsers.StatementParser

15.Parse a REPEAT Statement , cont’d 15         // Look for the semicolon between statements.         if ( tokenType == SEMICOLON) {             token = nextToken ();  // consume the ;         }         // If at the start of the next statement, then missing a semicolon.         else if ( STMT_START_SET.contains ( tokenType )) {             errorHandler.flag (token, MISSING_SEMICOLON, this);         }         // Synchronize at the start of the next statement         // or at the terminator.         token = synchronize( terminatorSet );     }     // Look for the terminator token.     if ( token.getType () == terminator) {         token = nextToken ();  // consume the terminator token     }     else {         errorHandler.flag (token, errorCode , this);     } } wci.frontend.pascal.parsers.StatementParser

16.16 Pascal Syntax Checker II: REPEAT Demo (Chapter 7) java - classpath classes Pascal compile - i repeat.txt java - classpath classes Pascal compile – i repeaterrors.txt

17.17 WHILE Statement Example: Exit the loop when the test expression evaluates to false . How can we eliminate the NOT node? WHILE i > j DO k := I

18.18 Class WhileStatementParser // Synchronization set for DO. private static final EnumSet < PascalTokenType > DO_SET = StatementParser.STMT_START_SET.clone (); static { DO_SET.add (DO); DO_SET.addAll ( StatementParser.STMT_FOLLOW_SET ); } // Synchronization set for starting a statement. protected static final EnumSet < PascalTokenType > STMT_START_SET = EnumSet.of (BEGIN, CASE, FOR, PascalTokenType.IF , REPEAT, WHILE, IDENTIFIER, SEMICOLON); // Synchronization set for following a statement. protected static final EnumSet < PascalTokenType > STMT_FOLLOW_SET = EnumSet.of (SEMICOLON, END, ELSE, UNTIL, DOT); DO_SET contains all the tokens that can start a statement or follow a statement, plus the DO token. wci.frontend.pascal.parsers.StatementParser wci.frontend.pascal.parsers.WhileStatementParser

19.public ICodeNode parse (Token token) throws Exception { token = nextToken (); // consume the WHILE ICodeNode loopNode = ICodeFactory.createICodeNode (LOOP); ICodeNode testNode = ICodeFactory.createICodeNode (TEST); ICodeNode notNode = ICodeFactory.createICodeNode ( ICodeNodeTypeImpl.NOT ); loopNode.addChild ( testNode ); testNode.addChild ( notNode ); ExpressionParser expressionParser = new ExpressionParser (this); notNode.addChild ( expressionParser.parse (token)); token = synchronize(DO_SET); if ( token.getType () == DO) { token = nextToken (); // consume the DO } else { errorHandler.flag (token, MISSING_DO, this); } StatementParser statementParser = new StatementParser (this); loopNode.addChild ( statementParser.parse (token) ); return loopNode ; } 19 Class WhileStatementParser , cont ’ d Synchronize the parser here! If the current token is not DO , then skip tokens until we find a token that is in DO_SET . We ’ re in this method because the parser has already seen WHILE . wci.frontend.pascal.parsers.WhileStatementParser

20.20 Pascal Syntax Checker II: WHILE We can recover (better) from syntax errors. Demo. java - classpath classes Pascal compile - i while.txt java - classpath classes Pascal compile - i whileerrors.txt

21.21 FOR Statement Example: Initial assignment. Node type GT for TO and LT for DOWNTO . DO statement. Increment/decrement: Node type ADD for TO and SUBTRACT for DOWNTO . FOR k := j TO 5 DO n := k

22.22 Pascal Syntax Checker II: FOR Demo . java - classpath classes Pascal compile - i for.txt java - classpath classes Pascal compile - i forerrors.txt

23.23 IF Statement Example: Third child only if there is an ELSE . IF ( i = j) THEN t := 200 ELSE f := -200;

24.24 The “ Dangling ” ELSE Consider: Which THEN does the ELSE pair with? Is it: IF i = 3 THEN IF j = 2 THEN t := 500 ELSE f := -500 Or is it: IF i = 3 THEN IF j = 2 THEN t := 500 ELSE f := -500 IF i = 3 THEN IF j = 2 THEN t := 500 ELSE f := -500

25.25 The “ Dangling ” ELSE , cont’d According to Pascal syntax, the nested IF statement is the THEN statement of the outer IF statement IF i = 3 THEN IF j = 2 THEN t := 500 ELSE f := -500 Therefore, the ELSE pairs with the closest (i.e., the second) THEN .

26.26 Scanner and Parser Rules of Thumb Scanner At any point in the source file, extract the longest possible token . Example: <<= is one shift-left-assign token Not a shift-left token followed by an assign token Parser At any point in the source file, parse the longest possible statement . Example: IF i = 3 THEN IF j = 2 THEN t := 500 ELSE f := -500

27.27 Pascal Syntax Checker II: IF Demo. java - classpath classes Pascal compile - i if.txt java - classpath classes Pascal compile - i iftest.txt

28.Assignment #3: WHEN Statement Add a new WHEN statement to Pascal! Example: Semantically similar to: 28 WHEN i = 1 => f := 10; i = 2 => f := 20; i = 3 => f := 30; i = 4 => f := 40; OTHERWISE => f := -1 END IF i = 1 THEN f := 10 ELSE IF i = 2 THEN f := 20 ELSE IF i = 3 THEN t := 30 ELSE IF i = 4 THEN f := 40 ELSE f := -1;

29.Assignment #3 , cont’d Draw a syntax diagram to describe the grammar of the WHEN statement. Start with Chapter 8’s source code. Do the work in two stages: Modify the frontend parser to parse the WHEN statement and build an appropriate parse tree. Modify the backend executor to execute the parse tree. 29