探索类型和对等

The computer can not understand the high-level language directly, only can understand the machine language directly, so it is necessary to translate the high-level language into machine language, the computer can run the program written in the high-level language. There are two ways of Translation: one is compiling, the other is interpretation. Compiled language programs written before execution, the program has been compiled, for example, exe file, to be executed in the future without having to translate again. The result of compilation can be used directly (exe file). Since the translation is only done once, it is not necessary to translate when executing, so the program execution efficiency of compiled language is high. Interpretive language programs are translated at the time of execut
展开查看详情

1.Chapter 6 Exploring Types and Equivalence Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Box U-255 Storrs, CT 06269-3255 Steven.Demurjian@uconn.edu http://www.engr.uconn.edu/~steve (860) 486–4818 (Office) (860) 486-3719 (CSE Office)

2.Review a Number of Other PPTs Org of Programming Languages-Cheng (Fall 2004) Sebesta chapter 6 condensed –easier to follow http://www.cse.msu.edu/~cse452/Fall2004/Lectures/06-data-types-school.ppt Type Systems and Structures- Bermúdez https://www.cise.ufl.edu/class/cop5556sp16/Lecture_22 .ppt Steve’s Types and Type Checking Type Systems – Doupé https://adamdoupe.com/teaching/classes/cse340-principles-of-programming-languages-f15/slides/05-Type-Systems.pptx

3.CSE 452: Programming Languages Data Types http://www.cse.msu.edu/~cse452/Fall2004/Lectures/06-data-types-school.ppt

4.Where are we? High-level Programming Languages Logic Functional Imperative Concepts specification (syntax, semantics) variables (binding, scoping, types, …) statements (control, selection, assignment,…) Implementation compilation (lexical & syntax analysis) Assembly Language Machine Language Object Oriented You are here

5.Terminology Strong typing:language prevents you from applying an operation to data on which it is not appropriate. Static typing: compiler can do all the checking at compile time. Examples: Common Lisp is strongly typed, but not statically typed. Ada is statically typed. Pascal is almost statically typed. Java is strongly typed, with a non-trivial mix of things that can be checked statically and things that have to be checked dynamically.

6.Type System Has rules for : Type equivalence (when are the types of two values the same?) Type compatibility (when can a value of type A be used in a context that expects type B?) Type inference (what is the type of an expression, given the types of the operands?)

7.Type compatability/equivalence Compatability: tells you what you can do More useful concept of the two Erroneously used interchangeably Equivalence: What are important differences between type declarations? Format does not matter: struct { int a, b; } Same as struct { struct{ int a, b; AND int a; } int b;}

8.Equivalence: two approaches Two types: name and structural equivalence Name Equivalence: based on declarations More commonly used in current practice Strict name equivalence: Types are equivalent if refer to same declaration Loose name equivalence: Types are equivalent if they refer to same outermost constructor (refer to same declaration after factoring out any type aliases) Structural Equivalence: based on meaning/semantics behind the declarations. Simple comparison of type descritpions Substitute out all names; Expand all the way to built-in types

9.Data Types A data type defines a collection of data objects, and a set of predefined operations on the objects type: integer operations: +, -, *, /, %, ^ Evolution of Data Types Early days: all programming problems had to be modeled using only a few data types FORTRAN I (1957) provides INTEGER, REAL, arrays Current practice: Users can define abstract data types (representation + operations)

10.Character Types Characters are stored in computers as numeric codings Traditionally use 8-bit code ASCII, which uses 0 to 127 to code 128 different characters ISO 8859-1 also use 8-bit character code, but allows 256 different characters Used by Ada 16-bit character set named Unicode Includes Cyrillic alphabet used in Serbia, and Thai digits First 128 characters are identical to ASCII used by Java and C#

11.Character String Types Values consist of sequences of characters Design issues: Is it a primitive type or just a special kind of character array? Is the length of objects static or dynamic? Operations: Assignment Comparison (=, >, etc.) Catenation Substring reference Pattern matching Examples: Pascal Not primitive; assignment and comparison only Fortran 90 Somewhat primitive; operations include assignment, comparison, catenation, substring reference, and pattern matching

12.Character Strings Examples Ada N := N1 & N2 (catenation) N(2..4) (substring reference) C and C++ Not primitive; use char arrays and a library of functions that provide operations SNOBOL4 (a string manipulation language) Primitive; many operations, including elaborate pattern matching Perl and JavaScript Patterns are defined in terms of regular expressions; a very powerful facility Java String class (not arrays of char); Objects are immutable StringBuffer is a class for changeable string objects

13.Character Strings String Length Static – FORTRAN 77, Ada, COBOL e.g. (FORTRAN 90) CHARACTER (LEN = 15) NAME; Limited Dynamic Length – C and C++ actual length is indicated by a null character Dynamic – SNOBOL4, Perl, JavaScript Evaluation (of character string types) Aid to writability As a primitive type with static length, they are inexpensive to provide Dynamic length is nice, but is it worth the expense? Implementation

14.Ordinal Data Types Range of possible values can be easily associated with the set of positive integers Enumeration types user enumerates all the possible values, which are symbolic constants enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun}; Design Issue: Should a symbolic constant be allowed to be in more than one type definition? Type checking Are enumerated types coerced to integer? Are any other types coerced to an enumerated type?

15.Enumeration Data Types Examples Pascal cannot reuse constants; can be used for array subscripts, for variables, case selectors; can be compared Ada constants can be reused (overloaded literals); disambiguate with context or type_name’(one of them) (e.g, Integer’Last) C and C++ enumeration values are coerced into integers when put in integer context Java does not include an enumeration type, but provides the Enumeration interface can implement them as classes class colors { public final int red = 0; public final int blue = 1; }

16.Enumeration Data Types Examples Pascal cannot reuse constants; can be used for array subscripts, for variables, case selectors; can be compared Ada constants can be reused (overloaded literals); disambiguate with context or type_name’(one of them) (e.g, Integer’Last) C and C++ enumeration values are coerced into integers when put in integer context Java does not include an enumeration type, but provides the Enumeration interface can implement them as classes class colors { public final int red = 0; public final int blue = 1; }

17.Arrays Indexing is a mapping from indices to elements map(array_name, index_value_list)  an element Index Syntax FORTRAN, PL/I, Ada use parentheses: A(3) most other languages use brackets: A[3] Subscript Types: FORTRAN, C - integer only Pascal - any ordinal type (integer, boolean, char, enum) Ada - integer or enum (includes boolean and char) Java - integer types only

18.Arrays Five Categories of Arrays (based on subscript binding and binding to storage) Static Fixed stack dynamic Stack dynamic Fixed Heap dynamic Heap dynamic

19.Arrays Static range of subscripts and storage bindings are static e.g. FORTRAN 77, some arrays in Ada Arrays declared in C and C++ functions that include the static modifier are static Advantage: execution efficiency (no allocation or deallocation) Fixed stack dynamic range of subscripts is statically bound, but storage is bound at elaboration time Elaboration time: when execution reaches the code to which the declaration is attached Most Java locals, and C locals that are not static Advantage: space efficiency

20.Arrays Stack-dynamic range and storage are dynamic, but fixed from then on for the variable’s lifetime e.g. Ada declare blocks declare STUFF : array (1..N) of FLOAT; begin ... end; Advantage: flexibility - size need not be known until array is about to be used

21.Arrays Fixed Heap dynamic Binding of subscript ranges and storage are dynamic, but are both fixed after storage is allocated Binding done when user program requests them, rather than at elaboration time and storage is allocated on the heap, rather than the stack In Java, all arrays are objects (heap-dynamic) C# also provides fixed heap-dynamic arrays

22.Arrays Heap-dynamic subscript range and storage bindings are dynamic and not fixed e.g. (FORTRAN 90) INTEGER, ALLOCATABLE, ARRAY (:,:) :: MAT (Declares MAT to be a dynamic 2-dim array) ALLOCATE (MAT (10, NUMBER_OF_COLS)) (Allocates MAT to have 10 rows and NUMBER_OF_COLS columns) DEALLOCATE MAT (Deallocates MAT’s storage) Perl and JavaScript support heap-dynamic arrays arrays grow whenever assignments are made to elements beyond the last current element Arrays are shrunk by assigning them to empty array Perl: @myArray = ( );

23.Arrays Number of subscripts (dimensions) FORTRAN I allowed up to three FORTRAN 77 allows up to seven Others - no limit Array Initialization Usually just a list of values that are put in the array in the order in which the array elements are stored in memory Examples: FORTRAN - uses the DATA statement Integer List(3) Data List /0, 5, 5/ C and C++ - put the values in braces; let compiler count them int stuff [] = {2, 4, 6, 8}; Ada - positions for the values can be specified SCORE : array (1..14, 1..2) := (1 => (24, 10), 2 => (10, 7), 3 =>(12, 30), others => (0, 0)); Pascal does not allow array initialization

24.Arrays: Operations Ada Assignment; RHS can be an aggregate constant or an array name Catenation between single-dimensioned arrays FORTRAN 95 Includes a number of array operations called elementals because they are operations between pairs of array elements E.g., add (+) operator between two arrays results in an array of the sums of element pairs of the two arrays Slices A slice is some substructure of an array FORTRAN 90 INTEGER MAT (1 : 4, 1 : 4) MAT(1 : 4, 1) - the first column MAT(2, 1 : 4) - the second row Ada - single-dimensioned arrays only LIST(4..10)

25.Arrays Implementation of Arrays Access function maps subscript expressions to an address in the array Single-dimensioned array address(list[k]) = address(list[lower_bound]) + (k-1)*element_size = (address[lower_bound] – element_size) + (k * element_size) Multi-dimensional arrays Row major order: 3, 4, 7, 6, 2, 5, 1, 3, 8 Column major order 3, 6, 1, 4, 2, 3, 7, 5, 8 4 7 2 5 1 3 8

26.Associative Arrays An unordered collection of data elements that are indexed by an equal number of values called keys also known as hashes Design Issues: What is the form of references to elements? Is the size static or dynamic?

27.Associative Arrays Structure and Operations in Perl Names begin with % Literals are delimited by parentheses %hi_temps = ("Monday" => 77, "Tuesday" => 79,…); Subscripting is done using braces and keys e.g., $hi_temps{"Wednesday"} = 83; Elements can be removed with delete e.g., delete $hi_temps{"Tuesday"};

28.Records A (possibly heterogeneous) aggregate of data elements in which the individual elements are identified by names Design Issues: What is the form of references? What unit operations are defined?

29.Records Record Definition Syntax COBOL uses level numbers to show nested records; others use recursive definitions COBOL 01 EMPLOYEE-RECORD. 02 EMPLOYEE-NAME. 05 FIRST PICTURE IS X(20). 05 MIDDLE PICTURE IS X(10). 05 LAST PICTURE IS X(20). 02 HOURLY-RATE PICTURE IS 99V99. Level numbers (01,02,05) indicate their relative values in the hierarchical structure of the record PICTURE clause show the formats of the field storage locations X(20): 20 alphanumeric characters 99V99: four decimal digits with decimal point in the middle