中间代码生成

在编译计算机语言的过程中我们希望输出的结果能够更接近实际的机器,容易操作,目标中立以及可以理解,本章节就主要介绍了为达到这些,我们中间代码的生成需要如何一步一步的进行编译以及在代码生成的过程中需要注意的问题。
展开查看详情

1.Chap 8: Intermedicate Code Generation Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Unit 2155 Storrs, CT 06269-3155 steve@engr.uconn.edu http://www.engr.uconn.edu/~steve (860) 486 - 4818 Material for course thanks to: Laurent Michel Aggelos Kiayias Robert LeBarre

2.Overview Goal: Generate a Machine Independent Intermediate Form that is Suitable for Optimization and Portability We’ll Focus on Revisit AST and DAGs with Attribute Grammars Introduce and Elaborate on Three Address Coding Review 3AC for: Declarations Assignments and Boolean Expressions Case Statements and Procedure Calls Review Prof. Michel’s Approach to Intermediate Code Generation Concluding Remarks/Looking Ahead

3.Motivation What we have so far... A Parser tree With all the program information Known to be correct Well-typed Nothing missing No ambiguities What we need... Something “Executable” Closer to An operations schedule Actual machine level of abstraction

4.What We Want A Representation that Is closer to actual machine Is easy to manipulate Is target neutral (hardware independent) Can be interpreted

5.Recall ASTs and DAGs Intermediate Forms of a Program: ASTs: Abstract Syntax Trees (below left) DAGs: Directed Acyclic Graphs (below right) What is the Expression? assign a + * * b c uminus b c uminus assign a + * b c uminus

6.Attribute Grammar Attribute Grammar for Generating an AST as a side Effect of the Translation Process:

7.Representing Attribute Grammars Two Different Forms: Linked Data Structure Multi-Dimensional Array

8.Objective Directly Generate Code From AST or DAG as a Side Effect of Parsing Process (Attribute Grammar) Consider Code Below: Each is Referred to as “3 Address Coding (3AC)” since there are at Most 3 Addresses per Statement One for Result and At Most 2 for Operands

9.The Intermediate Code Generation Machine A Machine with Infinite number of temporaries (think registers) Simple instructions 3-operands Branching Calls with simple calling convention Simple code structure Array of instructions Labels to define targets of branches.

10.Temporaries The machine has an infinite number of temporaries Call them t 0 , t 1 , t 2 , .... Temporaries can hold values of any type The type of the temporary is derived from the generation Temporaries go out of scope with the function they are in

11.What is Three-Address Coding? A simple type of instruction 3 / 2 Operands x,y,z Each operand could be A literal A variable A temporary Example x := y op z x + y * z t 0 := y * z t 1 := x + t 0 x := op z

12.Types of Three Address Statements Assignment Statements of Form: X := Y op Z op is a Binary Arithmetic or Logical Operation Assignment Instructions of Form: X := op Y op is Unary Operation such as Unary Minus, Logical Negative, Shift/Conversion Operations Copy Statements of Form: X := Y where value of Y assigned to X Unconditional Jump of Form: goto L which goes to a three address statement labeled with L

13.Types of Three Address Statements Conditional Jumps of Form: if x relop y goto L with relop as relational operators and the goto executed if the x relop y is true Parameter Operations of Form: param a (a parameter of function) call p, n (call function p with n parameters) return y (return value y from function – optional) param a param b param c call p, 3

14.Types of Three Address Statements Indexed Assignments of Form: X := Y[i] (Set X to i-th memory location of Y) X[i] := Y (Set i-th memory location of X to Y) Note the limit of 3 Addresses (X, Y, i) Cannot do: x[i] := y[j]; (4 addresses!) Address and Pointer Assignments of Form: X := & Y (X set to the Address of Y) X := * Y (X set to the contents pointed to by Y) * X := Y (Contents of X set to Value of Y)

15.Attribute Grammar for Assignments Concepts: Need to Introduce Temporary Variables as Necessary do Decompose Assignment Statement Every Generated Line of Code Must have at Most 3 Addresses! Three Attributes S.Code : Three Address Code for Non-Term S E.Place : The Name that Holds Value of E E.Code : Sequence of 3AC for Expression E Function: newtemp: Generate a New temp Variable

16.Attribute Grammar How would Grammar work for: A := - B X := Y + Z W := B*C + D

17.Three Address Code for While Loops Grammar Rule: S  while E do S 1 Conceptually: Check Expression E If Not True, Skip to First Statement after S 1 Execute S 1 Goto: Check Expression E Function: newlabel : Generates a new label for goto/jump Generate labels for Check Expression and 1 st Statement after S 1

18.Attribute Grammar for While

19.Declarations Recall Runtime Environment in Chapter 7 One Critical Issue was Memory Layout (Static, Stack, Heap) Stack Utilized during Procedure/Function Calls to Allocate Space for P/F Variables This now Includes Temporaries for 3AC We need to Track Name Type (Int, real, boolean, etc.) Offset (with respect to some relative address) Function enter (name, type, offset) creates symbol table entry offset global initially 0

20.Attribute Grammar for Memory Layout Note: Does not include Temporaries (Need to also Allocate Memory for them as well) How could this be supported?

21.Handling Nested Procedure/Functions Recall that in Pascal, Procedures and Functions can be Defined within Other Procedures/Functions Grammar: P  D ; S D  D ; D | id: T | proc id; D ; S In this Situation, Symbol Table Can’t be Monolithic Structure but Must Support Nested Declarations Notations: M, N: Marker Non Terminals so that All S. Actions Occur at End of RHS of Rule mktable (create ST), enter (ST entry) addwidth – size of symbol table enterproc – Add a Procedure Name to ST

22.Nested Symbol Table Recall Quicksort from Chapter 7

23.Corresponding Attribute Grammar

24.From Concatenation to Generation Up to Now, Attribute Grammars have focused on: Concatenation to Put Together a Long String Resulting String Output at “End” However, this is not as Feasible from an Implementation Perspective Instead, we Would Like to Generate 3AC Statements as they Occur Introduce emit Function At Relevant Stages of Attribute Grammar, output the entire 3AC Statement into a File (Stream) Assumes Names of Symbols (IDs) have been Inserted into Symbol Table with Types by Combination of Lexical Analysis and Declarations

25.Revised Attribute Grammar with Emit

26.Attribute Grammar for Boolean Expressions

27.Generated Code Consider: a < b or c < d and e < f How Does Attribute Grammar Generate Code? Do RM Derivation and Create Parse Tree 100: if a < b goto 103 101: t1 := 0 102: goto 104 103: t1 := 1 104: if c < d goto 107 105: t2 := 0 106: goto 108 107: t2 := 1 108: if e < f goto 111 109: t3 := 0 110: goto 112 III: t3 := 1 112: t4 := t2 and t3 113: t5 := t1 or t4

28.Code Generation - Conceptual For If-Then, If-Then-Else, and While-Do

29.Code Generation - Conceptual For If-Then, If-Then-Else, and While-Do