CPS 506 Comparative Programming Languages Type Systems, Semantics and Data Types.
Type Systems
• A completely defined language has a defined syntax, semantics, and type system
• Type: a set of values and a set of operations on those values
  – int
    • Values = Z (the integers)
    • Operations = {+, -, *, /, mod}
  – Boolean
    • Values = {true, false}
    • Operations = {AND, OR, NOT, XOR}
Type Systems
• Type System
  – A system of types and their associated variables and objects in a program
  – Formalizes the definition of data types and their usage in a programming language
  – A bridge between syntax and semantics
    • Types checked at compile time: part of syntax analysis
    • Types checked at run time: part of semantics
Type Systems (con’t)
• Statically Typed: each variable is associated with a single type for its whole lifetime
  – The declaration can be explicit or implicit
  – Examples: C and Java
  – Type rules are defined on the abstract syntax (static semantics)
Type Systems (con’t)
• Dynamically Typed: the type of a variable can change at run time
  – Examples: LISP, JavaScript, PHP
  – JavaScript example:
      List = [10.2, 3.5]
      …
      List = 47
  – Less reliable, difficult to debug
  – More flexible
  – Fast compilation
  – Slow execution (type checking at run time)
Type Systems (con’t)
• Type Error: an operation on a variable that is not well defined at run time
  – Example: union in C
      union flexType {
        int i;
        float f;
      };
      union flexType u;
      float x;
      …
      u.i = 10;
      x = u.f;
      …
  – Another example in C?
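The effect of the union example can be imitated in Python. This is only a sketch: it assumes a 32-bit int and an IEEE 754 single-precision float stored in little-endian order, and the helper name `reinterpret_int_as_float` is made up for illustration.

```python
import struct

def reinterpret_int_as_float(i: int) -> float:
    """Return the float whose 4-byte pattern equals that of the 32-bit int i."""
    raw = struct.pack("<i", i)          # the bytes written through u.i
    return struct.unpack("<f", raw)[0]  # the same bytes read back through u.f

x = reinterpret_int_as_float(10)
print(x)  # a tiny positive number, not 10.0
```

The bits are unchanged, but they are interpreted under the wrong type, which is exactly what makes `x = u.f` a type error.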
Type Systems (con’t)
• Strongly Typed: all type errors are detected, at compile time or at run time, before the faulty operation executes
  – More reliable
  – Example: Java is nearly strongly typed, but C is not
    • C accepts x+1 regardless of the type of x
  – Coercion (implicit type conversion) rules affect strong typing
• Weak typing example
    x = 2;
    y = "5";
    print x+y
  – Visual Basic: 7
  – JavaScript: "25"
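For contrast, the same expression in Python is rejected at run time, because no coercion rule applies between int and str. The sketch below shows the error and the two explicit conversions that reproduce each of the results quoted above.

```python
x = 2
y = "5"

try:
    result = x + y          # no coercion rule applies, so this is rejected
except TypeError as err:
    result = None
    print("type error detected at run time:", err)

# The programmer must state the intended conversion explicitly.
as_number = x + int(y)      # numeric addition, like the Visual Basic result
as_text = str(x) + y        # string concatenation, like the JavaScript result
print(as_number, as_text)
```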
Type Systems (con’t)
• Type Safe: a language whose programs cannot have type errors
  – Strongly typed -> type safe
  – Examples: Java, Haskell, and ML
Type Binding
• Binding: the process of associating an attribute (name, location, value, or type) with an object
• Example
  – int i;
    Identifier i is bound to the integer type and to a location chosen by the compiler
  – i = 10;
    Identifier i is bound to the value 10, or the value 10 is bound to a location
Type Binding (con’t)
• Binding time
  – Language definition time
    • Java: integers are bound to int, and real numbers are bound to float
  – Language implementation time
    • Binding real values to the IEEE 754 standard
  – Program writing time
    • Declaration of variables
  – Compile/load time
    • Binding static objects to the stack or to fixed memory
    • Executable code is assigned to a memory block
  – Run time
    • Values are bound to variables
Type Binding (con’t)
• Early Binding
  – An element is bound to a property as early as possible
  – The earlier the binding, the more efficient the language
• Late Binding
  – Delay binding until the last possible time
  – The later the binding, the more flexible the language
  – Supports overloading and overriding in object-oriented languages
  – C++ example?
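The slide asks for a C++ example; as a rough stand-in, here is a Python sketch of late binding through method overriding. The class and function names (`Shape`, `Square`, `describe`) are invented for illustration.

```python
class Shape:
    def area(self) -> float:
        return 0.0

class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:          # overrides Shape.area
        return self.side * self.side

def describe(shape: Shape) -> str:
    # Which area() runs is decided here, at call time, from the object's class.
    return f"{type(shape).__name__} with area {shape.area()}"

print(describe(Shape()))
print(describe(Square(3.0)))
```

The call `shape.area()` is written once, yet it is bound to a different method body for each object, which is the flexibility that late binding buys.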
Type Checking
• Type checking is the activity of ensuring that the operands of an operator are of compatible types
• A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type
• If all type bindings are static, nearly all type checking can be static
• If type bindings are dynamic, type checking must be dynamic
Type Conversion
• A narrowing conversion converts an object to a type that cannot include all of the values of the original type, e.g. float to int
• A widening conversion converts an object to a type that can include at least approximations to all of the values of the original type, e.g. int to float
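A small Python sketch of both directions. It assumes Python's float is an IEEE 754 double with a 53-bit significand, which holds on common platforms.

```python
# Narrowing: float -> int discards the fractional part.
narrowed = int(3.99)

# Widening: int -> float keeps the magnitude but may only approximate the value.
big = 2**53 + 1               # needs 54 significant bits
widened = float(big)          # a double has a 53-bit significand

print(narrowed, big, int(widened))
```

The widened value is close to `big` but not equal to it, which is why the definition speaks of "approximations".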
Type Conversion (con’t)
• Implicit type conversion (coercion)
  – Decreases the ability to detect type errors
  – In most languages, all numeric types are coerced in expressions, using widening conversions
  – Ada has no implicit conversion
Type Conversion (con’t)
  – C
      double d;
      long l;
      int i;
      …
      d = i;
      l = i;
      if (d == l)
        d = 2 * l;
  – Java
      int x;
      double d;
      x = 5;
      d = x + 2;
Type Conversion (con’t)
• Explicit type conversion (casting)
  – General form: ( type-name ) cast-expression
    • C
        double d = 3.14;
        int i = (int) d;
    • Java
        boolean t = true;
        byte b = (byte) (t ? 1 : 0);
    • Ada (similar to a function call)
        3 * Integer(2.0)
        2.0 + Float(2)
Semantic Domains
• Semantic Domain
  – A set with well-defined properties and operations
  – Environment
    • A set of pairs <variable, location>
  – Memory
    • A set of pairs <location, value>
• State
  – Product of the environment and its memory
      σ = { <Var1, Val1>, <Var2, Val2>, …, <Varn, Valn> }
Semantic Domains (con’t)
• Three ways to define the meaning of a program
  – Operational Semantics
    • The program is interpreted as a set of sequences of computational steps
    • A set of execution rules
        Premise -> Conclusion
        σ(x) => 4 and σ(y) => 2 -> σ(x+y) => 6
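The idea of "a sequence of computational steps" can be sketched as a tiny step function over expressions. The representation is an assumption made for illustration: an expression is an int, a variable name, or a tuple `('+', left, right)`, and the state σ is a dict.

```python
def step(expr, state):
    """Perform one reduction step on expr under the given state."""
    if isinstance(expr, str):
        return state[expr]                      # variable lookup: sigma(x) => value
    op, left, right = expr
    if not isinstance(left, int):
        return (op, step(left, state), right)   # reduce the left operand first
    if not isinstance(right, int):
        return (op, left, step(right, state))   # then the right operand
    return left + right                         # both operands are values: add them

def run(expr, state):
    """Return the whole sequence of intermediate expressions."""
    trace = [expr]
    while not isinstance(expr, int):
        expr = step(expr, state)
        trace.append(expr)
    return trace

trace = run(("+", "x", "y"), {"x": 4, "y": 2})
print(trace)
```

The last element of the trace is the conclusion of the rule above; the earlier elements are the steps that justify it.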
Semantic Domains (con’t)
• Three ways to define the meaning of a program
  – Operational Semantics (con’t)
    • Usage
      – Language manuals and textbooks
      – Teaching programming languages
    • Structural: defines program behavior in terms of the behavior of its parts
    • Natural: defines program behavior in terms of its overall effect, not its individual steps
Semantic Domains (con’t)
  – Axiomatic Semantics
    • The program does what it is supposed to do
    • Agreement between the program's result and its specification
    • Formal verification of a program using logic expressions (assertions)
    • Hoare triple
        {Pre-condition} s {Post-condition}
    • Example
        {a = 2} b = a; {b = 2}
    • Weakest pre-condition
        {?} a = b+1; {a > 1}
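One way to approach the weakest pre-condition question is to substitute the right-hand side of the assignment into the post-condition, giving b + 1 > 1, which simplifies to b > 0. The sketch below only checks that simplification on a sample of integer states; it is a sanity check, not a proof, and the function names are illustrative.

```python
def postcondition(a: int) -> bool:
    return a > 1

def weakest_precondition(b: int) -> bool:
    # Substitute the right-hand side b + 1 for a in the postcondition.
    return postcondition(b + 1)

# Compare with the simplified form b > 0 on a sample of integer states.
for b in range(-50, 51):
    assert weakest_precondition(b) == (b > 0)
print("wp(a = b+1, a > 1) agrees with b > 0 on the sampled states")
```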
Semantic Domains (con’t)
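  – Axiomatic Semantics (con’t)
    • Axioms
      – Rule of Consequence
        $$\frac{\{P\}\ a\ \{Q\},\quad P' \Rightarrow P,\quad Q \Rightarrow Q'}{\{P'\}\ a\ \{Q'\}}$$
      – Rule of Conjunction
        $$\frac{\{P\}\ a\ \{Q\},\quad \{P\}\ a\ \{R\}}{\{P\}\ a\ \{Q \wedge R\}}$$
      – Rule of Assignment (s : b = a), where Q[a\b] is Q with b replaced by a
        $$\frac{true}{\{Q[a \backslash b]\}\ s\ \{Q\}}$$
      – Rule of Sequence
        $$\frac{\{P\}\ s_1\ \{R\},\quad \{R\}\ s_2\ \{Q\}}{\{P\}\ s_1\, s_2\ \{Q\}}$$
      – Rule of Conditions (s : if c then a else b)
        $$\frac{\{P \wedge c\}\ a\ \{Q\},\quad \{P \wedge \neg c\}\ b\ \{Q\}}{\{P\}\ s\ \{Q\}}$$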
Semantic Domains (con’t)
  – Axiomatic Semantics (con’t)
    • Axioms
      – Rule of Loops (s : while c do b end), where I is the loop invariant
        $$\frac{\{I \wedge c\}\ b\ \{I\}}{\{I\}\ s\ \{I \wedge \neg c\}}$$
      – The loop invariant is true before the loop, at the bottom of the loop in each iteration, and when the loop terminates
      – Find the loop invariant to prove the correctness of the loop
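The three places where the invariant must hold can be made visible with assertions. The loop below is a made-up example (multiplication by repeated addition), with invariant `acc == a * i`; it is a sketch of the technique, not a verified proof.

```python
def multiply(a: int, n: int) -> int:
    """Compute a * n for n >= 0 by repeated addition, checking the invariant."""
    i, acc = 0, 0
    assert acc == a * i                  # invariant holds before the loop
    while i < n:                         # loop condition c
        acc += a
        i += 1
        assert acc == a * i              # invariant holds at the bottom of each iteration
    # On exit: invariant and not c, i.e. acc == a * i and i >= n
    assert acc == a * i and not (i < n)
    return acc

print(multiply(7, 6))
```

For n >= 0 the loop exits with i == n, so the invariant together with the negated condition gives acc == a * n.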
Semantic Domains (con’t)
  – Denotational Semantics
    • Defines the meaning of a statement as a state-transforming mathematical function
    • A state of a program gives the current values of the active objects
    • Example
      – Denotational semantics of integer arithmetic expressions
        » Production rules:
            Number ::= N D | D
            Digit ::= 0 | 1 | … | 9
            Expression ::= E1 + E2 | E1 – E2 | E1 * E2 | E1 / E2 | (E) | N
Semantic Domains (con’t)
  – Denotational Semantics (con’t)
    – Semantic domain:
        Integer = { …, -1, 0, 1, … }
    – Semantic functions:
        Value: Number => Number
        Digit: Digit => Number
        Expr: Expression => Integer
    – Auxiliary functions:
        plus: Number + Number => Number
        …
    – Semantic equations:
        Expr[[E1+E2]] = plus(Expr[[E1]], Expr[[E2]])
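The semantic functions can be written down almost literally as Python functions. In this sketch a numeral is a string of digits and a compound expression is a tuple `('+', E1, E2)`; both representations, and the equation used for `value` (Value[[N D]] = 10 * Value[[N]] + Digit[[D]]), are assumptions, since the slide elides the remaining equations.

```python
def digit(d: str) -> int:
    """Digit[[d]]: map a digit symbol to the number it denotes."""
    return "0123456789".index(d)

def value(numeral: str) -> int:
    """Value[[N D]] = 10 * Value[[N]] + Digit[[D]];  Value[[D]] = Digit[[D]]."""
    if len(numeral) == 1:
        return digit(numeral)
    return 10 * value(numeral[:-1]) + digit(numeral[-1])

def plus(m: int, n: int) -> int:
    return m + n

def expr(e) -> int:
    """Expr[[E]]: e is a numeral string or a tuple ('+', E1, E2)."""
    if isinstance(e, str):
        return value(e)
    op, e1, e2 = e
    if op == "+":
        return plus(expr(e1), expr(e2))
    raise ValueError(f"operator {op!r} is not covered by this sketch")

print(expr(("+", "12", ("+", "3", "4"))))
```

Each function maps a piece of syntax to a mathematical value, and the meaning of a compound expression is built only from the meanings of its parts.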
Data Types
• Elements of a data type
  – Set of possible values
  – Set of operations
  – Internal representation
  – External representation
• Type information
  – Implicit
    • 5 is implicitly an integer
    • I is implicitly an integer in Fortran
  – Explicit
    • Using a variable or function declaration
Data Types (con’t)
• Data type classifications
  – Built-in
    • Included in the language definition
  – Primitive
  – Composite
  – Recursive
  – User-defined
    • Data types defined by users
    • Declared and defined before usage
Primitive Data Types
• Unstructured and indivisible entities
• Integer, Real, Boolean, Char
• Depend on the language's application domain
  – COBOL: fixed-length strings and fixed-point numbers
  – SNOBOL: strings of varying length
  – Scheme: integer, rational, real, complex
Primitive Data Types (con’t)
• Example
  – C
    • int, float, char
  – Java
    • int, float, char, boolean
  – Pascal
    • Integer, Char, Real, Longint
  – ML
    • bool, real, int, word, char
  – Scheme
    • integer?, real?, boolean?, char?
Primitive Data Types (con’t)
• Integer
  – Almost always an exact reflection of the hardware, so the mapping is trivial
  – There may be as many as eight different integer types in a language
  – Java’s signed integer sizes: byte, short, int, long
Primitive Data Types (con’t)
• Float
  – Models real numbers, but only as approximations
  – Languages for scientific use support at least two floating-point types (e.g., float and double; sometimes more)
  – Usually exactly like the hardware, but not always
  – IEEE Floating-Point Standard 754
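A short Python sketch of what "only as approximations" means in practice, assuming Python floats are IEEE 754 doubles:

```python
import math

total = 0.1 + 0.2
print(total == 0.3)               # exact comparison fails
print(math.isclose(total, 0.3))   # comparison within a tolerance succeeds
```

Neither 0.1 nor 0.2 has an exact binary representation, so the sum differs from the stored value of 0.3 in the last bits.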
Primitive Data Types (con’t)
• Complex
  – Some languages support a complex type, e.g., C99, Fortran, and Python
  – Each value consists of two floats, the real part and the imaginary part
  – Literal form (in Python):
      (7 + 3j), where 7 is the real part and 3 is the imaginary part
Primitive Data Types (con’t)
• Decimal
  – For business applications (money)
    • Essential to COBOL
    • C# offers a decimal data type
  – Stores a fixed number of decimal digits, in coded form (BCD, Binary-Coded Decimal)
  – Advantage: accuracy
  – Disadvantages: limited range, wastes memory
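The accuracy advantage can be illustrated with Python's standard `decimal` module. This is an illustration of exact decimal arithmetic only; it makes no claim about how the module stores digits internally (it is not presented here as BCD).

```python
from decimal import Decimal

price = Decimal("0.10")
tax = Decimal("0.20")
print(price + tax)                        # 0.30
print(price + tax == Decimal("0.30"))     # the sum is exact
```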
Primitive Data Types (con’t)
• Boolean
  – Simplest of all
  – Range of values: two elements, one for “true” and one for “false”
  – Could be implemented as bits, but often as bytes
Primitive Data Types (con’t)
• Character
  – Stored as numeric codings
  – Most commonly used coding: ASCII
  – An alternative, 16-bit coding: Unicode (UCS-2) (Universal Character Set)
    • Includes characters from most natural languages
    • Originally used in Java
    • C# and JavaScript also support Unicode
  – 32-bit Unicode (UCS-4)
    • Supported by Fortran, starting with 2003
Composite Data Types
• Structured or compound types
• Array, String, Enumeration, Pointer, Record, List, Function
• Homogeneous, like Array
• Heterogeneous, like Record
• Fixed size, like Array
• Dynamic size, like Linked List
• Inside the core language or in a separate library
Composite Data Types (con’t)
• Example
  – C
    • Array ([]), Pointer (*), Struct, enum
  – Java
    • String, Array
  – Pascal
    • Record, Array, Pointer (^)
Composite Data Types (con’t)
• String
  – C and C++
    • Not primitive
    • Use char arrays and a library of functions that provide operations
  – SNOBOL4 (a string manipulation language)
    • Primitive
    • Many operations, including elaborate pattern matching
  – Fortran and Python
    • Primitive type with assignment and several operations
  – Java
    • Primitive via the String class
  – Perl, JavaScript, Ruby, and PHP
    • Provide built-in pattern matching, using regular expressions
Composite Data Types (con’t)
• String length options
  – Static: COBOL, Java’s String class
  – Limited dynamic length: C and C++
    • In these languages, a special character marks the end of a string’s characters, rather than the length being maintained
  – Dynamic (no maximum): SNOBOL4, Perl, JavaScript
  – Ada supports all three string length options
Composite Data Types (con’t)
• String Implementation
  – Static length: compile-time descriptor
  – Limited dynamic length: may need a run-time descriptor for length (but not in C and C++)
  – Dynamic length: needs a run-time descriptor; allocation/de-allocation is the biggest implementation problem
Composite Data Types (con’t)
• Enumeration
  – All possible values, which are named constants, are provided in the definition
  – C# example
      enum days {mon, tue, wed, thu, fri, sat, sun};
  – Design issues
    • Is an enumeration constant allowed to appear in more than one type definition, and if so, how is the type of an occurrence of that constant checked?
    • Are enumeration values coerced to integer?
    • Is any other type coerced to an enumeration type?
Composite Data Types (con’t)
• Enumeration (con’t)
  – Aid to readability, e.g. no need to code a color as a number
      enum Colors {Red, Blue, Green, Yellow};
  – Aid to reliability, e.g. the compiler can check:
    • Operations (don’t allow colors to be added)
    • No enumeration variable can be assigned a value outside its defined range
    • Ada, C#, and Java 5.0 provide better support for enumeration than C++ because enumeration type variables in these languages are not coerced into integer types
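Python's `enum.Enum` behaves in the same spirit, although the checks happen at run time rather than in a compiler. The sketch below mirrors the two reliability checks listed above; the helper function names are made up for illustration.

```python
from enum import Enum

class Colors(Enum):
    RED = 1
    BLUE = 2
    GREEN = 3
    YELLOW = 4

def try_add():
    try:
        return Colors.RED + Colors.BLUE    # arithmetic on members is not defined
    except TypeError:
        return "rejected"

def try_out_of_range():
    try:
        return Colors(7)                   # 7 is not the value of any member
    except ValueError:
        return "rejected"

print(try_add(), try_out_of_range())
```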
Composite Data Types (con’t)
• Sub-range Types
  – An ordered contiguous subsequence of an ordinal type
    • Example: 12..18 is a sub-range of the integer type
  – Ada’s design
      type Days is (mon, tue, wed, thu, fri, sat, sun);
      subtype Weekdays is Days range mon..fri;
      subtype Index is Integer range 1..100;
      Day1: Days;
      Day2: Weekdays;
      Day2 := Day1;
Composite Data Types (con’t)
• Enumeration and Sub-range implementation
  – Enumeration types are implemented as integers
  – Sub-range types are implemented like the parent types, with code inserted (by the compiler) to restrict assignments to sub-range variables
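The inserted check can be pictured as follows. `Subrange` is a hypothetical helper written for this sketch; in a language such as Ada the equivalent test is generated by the compiler rather than written by hand.

```python
class Subrange:
    """An integer variable restricted to low..high (hypothetical helper)."""

    def __init__(self, low: int, high: int, value: int) -> None:
        self.low, self.high = low, high
        self.assign(value)

    def assign(self, value: int) -> None:
        # This test plays the role of the check a compiler would insert.
        if not (self.low <= value <= self.high):
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        self.value = value

index = Subrange(1, 100, 50)
index.assign(100)        # accepted: inside 1..100
```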
Composite Data Types (con’t)
• Array
  – An array is an aggregate of homogeneous data elements in which an individual element is identified by its position in the aggregate, relative to the first element
  – A heterogeneous array is one in which the elements need not be of the same type
    • Supported by Perl, Python, JavaScript, and Ruby
Composite Data Types (con’t)
• Array Index Type
  – FORTRAN, C: integer only
  – Ada: integer or enumeration (includes Boolean and char)
  – Java: integer types only
  – Index range checking
    • C, C++, Perl, and Fortran do not specify range checking
    • Java, ML, C# specify range checking
    • In Ada, the default is to require range checking, but it can be turned off
Composite Data Types (con’t)
• Array Initialization
  – C-based languages
      int list [] = {1, 3, 5, 7};
      char *names [] = {"Mike", "Fred", "Mary Lou"};
  – Ada
      List : array (1..5) of Integer := (1 => 17, 3 => 34, others => 0);
  – Python list comprehensions
      list = [x ** 2 for x in range(12) if x % 3 == 0]
    puts [0, 9, 36, 81] in list
Composite Data Types (con’t)
• Array Operations
  – APL provides the most powerful array processing operations for vectors and matrices, as well as unary operators (for example, to reverse column elements)
  – Ada allows array assignment and also concatenation
  – Python supports array assignment, but assignments are only reference changes. Python also supports array concatenation and element membership operations
Composite Data Types (con’t)
• Array Operations (con’t)
  – Ruby also provides array concatenation
  – Fortran provides elemental operations, so called because they operate between pairs of array elements
  – For example, the + operator between two arrays results in an array of the sums of the element pairs of the two arrays
Composite Data Types (con’t)
• Rectangular and Jagged Arrays
  – A rectangular array is a multi-dimensioned array in which all of the rows have the same number of elements and all columns have the same number of elements
  – A jagged matrix has rows with varying numbers of elements
    • Possible when multi-dimensioned arrays actually appear as arrays of arrays
  – C, C++, and Java support jagged arrays
  – Fortran, Ada, and C# support rectangular arrays (C# also supports jagged arrays)
Composite Data Types (con’t)
• Slices
  – A slice is some substructure of an array; nothing more than a referencing mechanism
  – Slices are only useful in languages that have array operations
  – Fortran 95
      Integer, Dimension (10) :: Vector
      Integer, Dimension (3, 3) :: Mat
      Integer, Dimension (3, 3, 4) :: Cube
    Vector (3:6) is a four-element array
  – Ruby supports slices with the slice method
      list.slice(2, 2) returns the third and fourth elements of list
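Both slice examples have close Python counterparts. The sketch assumes a ten-element list holding 1..10 so that positions and values are easy to compare; note that Python indexes from 0 and excludes the upper bound.

```python
vector = list(range(1, 11))     # ten elements holding 1..10

# Fortran's Vector(3:6) is 1-based and inclusive; the 0-based Python slice is [2:6].
fortran_like = vector[2:6]

# Ruby's list.slice(2, 2) means "start at index 2, take 2 elements".
start, length = 2, 2
ruby_like = vector[start:start + length]

print(fortran_like, ruby_like)
```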
Composite Data Types (con’t)
• Array Access
  – The access function maps subscript expressions to an address in the array
  – Access function for single-dimensioned arrays:
      address(list[k]) = address(list[lower_bound]) + ((k - lower_bound) * element_size)
  – Two common ways to store multi-dimensioned arrays:
    • Row major order (by rows), used in most languages
    • Column major order (by columns), used in Fortran
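The access function is plain arithmetic, so it can be sketched directly. The two-dimensional helpers assume 0-based subscripts and return element offsets rather than byte addresses; their names are illustrative.

```python
def address_1d(base: int, k: int, lower_bound: int, element_size: int) -> int:
    """address(list[k]) = address(list[lower_bound]) + (k - lower_bound) * element_size"""
    return base + (k - lower_bound) * element_size

def offset_row_major(i: int, j: int, n_cols: int) -> int:
    """Element offset of (i, j), 0-based, when rows are stored one after another."""
    return i * n_cols + j

def offset_column_major(i: int, j: int, n_rows: int) -> int:
    """Element offset of (i, j), 0-based, when columns are stored one after another."""
    return j * n_rows + i

print(address_1d(base=1000, k=5, lower_bound=1, element_size=4))
print(offset_row_major(1, 2, n_cols=4), offset_column_major(1, 2, n_rows=3))
```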
Composite Data Types (con’t)
• Record
  – A record is a possibly heterogeneous aggregate of data elements in which the individual elements are identified by names
  – COBOL uses level numbers to show nested records; others use recursive definition
      01 EMP-REC.
         02 EMP-NAME.
            05 FIRST PIC X(20).
            05 MID PIC X(10).
            05 LAST PIC X(20).
         02 HOURLY-RATE PIC 99V99.
Composite Data Types (con’t)
• Record (con’t)
  – Ada
      type Emp_Rec_Type is record
        First: String (1..20);
        Mid: String (1..10);
        Last: String (1..20);
        Hourly_Rate: Float;
      end record;
      Emp_Rec: Emp_Rec_Type;
Composite Data Types (con’t)
• Record (con’t)
  – Pascal
      MonthType = (Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec);
      DateType = record
        Month : MonthType;
        Day : 1..31;
        Year : 1900..2000;
      end;
Composite Data Types (con’t)
• Record (con’t)
  – C
      struct student_type {
        char name[20];
        int ID;
      };
Composite Data Types (con’t)
• Record (con’t)
  – Java: before Java 16 there is no record type; a record is defined using a class.
      class Person {
        String name;
        int id_number;
        Date birthday;
        int age;
      }
Composite Data Types (con’t)
• Pointer and Reference Types
  – A pointer type variable has a range of values that consists of memory addresses and a special value, nil
  – Provide the power of indirect addressing
  – Provide a way to manage dynamic memory
  – A pointer can be used to access a location in the area where storage is dynamically created (usually called a heap)
Composite Data Types (con’t)
• Pointer Design Issues
  – What are the scope and lifetime of a pointer variable?
  – Are pointers restricted as to the type of value to which they can point?
  – Are pointers used for dynamic storage management, indirect addressing, or both?
  – Should the language support pointer types, reference types, or both?
Composite Data Types (con’t)
• Pointer Operations
  – Two fundamental operations: assignment and dereferencing
  – Assignment is used to set a pointer variable’s value to some useful address
  – Dereferencing yields the value stored at the location represented by the pointer’s value
    • Dereferencing can be explicit or implicit
    • C++ uses an explicit operation via *
        j = *ptr
      sets j to the value located at ptr
Composite Data Types (con’t)
• Pointer Illustration
  – The assignment operation j = *ptr
Composite Data Types (con’t)
• Pointer Problems
  – Dangling pointers (dangerous)
    • A pointer points to a heap-dynamic variable that has been de-allocated
  – Lost heap-dynamic variable
    • An allocated heap-dynamic variable that is no longer accessible to the user program (often called garbage)
      – Pointer p1 is set to point to a newly created heap-dynamic variable
      – Pointer p1 is later set to point to another newly created heap-dynamic variable
      – The process of losing heap-dynamic variables is called memory leakage
Composite Data Types (con’t)
• Pointer Problems (con’t)
  – Ada
    • Some dangling pointers are disallowed because dynamic objects can be automatically de-allocated at the end of the pointer's type scope
  – C, C++
    • Extremely flexible but must be used with care
    • Pointers can point at any variable regardless of when or where it was allocated
    • Used for dynamic storage management and addressing
Composite Data Types (con’t)
• Pointer Problems (con’t)
  – C, C++
    • Pointer arithmetic is possible
    • Explicit dereferencing and address-of operators
    • Domain type need not be fixed (void *)
      – void * can point to any type and can be type checked, but cannot be de-referenced
Composite Data Types (con’t)
• Pointer Arithmetic in C, C++
    float stuff[100];
    float *p;
    p = stuff;
  – *(p+5) is equivalent to stuff[5] and p[5]
  – *(p+i) is equivalent to stuff[i] and p[i]
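The equivalence can be mimicked from Python with the standard `ctypes` module, which exposes raw addresses of C-compatible arrays. This is a sketch of the address computation only; the variable names follow the C fragment above.

```python
import ctypes

stuff = (ctypes.c_float * 100)()      # float stuff[100];
stuff[5] = 2.5

base = ctypes.addressof(stuff)        # p = stuff;
# p + 5 advances the address by 5 elements, i.e. 5 * sizeof(float) bytes.
address_of_fifth = base + 5 * ctypes.sizeof(ctypes.c_float)
value = ctypes.c_float.from_address(address_of_fifth).value   # *(p + 5)

print(value)
```

The scaling by `sizeof(float)` is what the C compiler does implicitly when it evaluates `p + 5`.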
Composite Data Types (con’t)
• Reference Types
  – C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters
    • Advantages of both pass-by-reference and pass-by-value
  – Java extends C++’s reference variables and allows them to replace pointers entirely
    • References are references to objects, rather than being addresses
  – C# includes both the references of Java and the pointers of C++
Composite Data Types (con’t)
• Heap Management
  – A very complex run-time process
  – Single-size cells vs. variable-size cells
  – Two approaches to reclaim garbage
    • Reference counters (eager approach): reclamation is gradual
    • Mark-sweep (lazy approach): reclamation occurs when the list of available space becomes empty
Composite Data Types (con’t)
• Heap Management (con’t)
  – Reference counters
    • Maintain a counter in every cell that stores the number of pointers currently pointing at the cell
    • Disadvantages: space required, execution time required, complications for cells connected circularly
    • Advantage: it is intrinsically incremental, so significant delays in the application execution are avoided
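A toy model makes both the eager reclamation and the circular-cell complication concrete. The `Cell` class, the `free_list`, and the helper functions are all invented for this sketch; real run-time systems differ in many details.

```python
class Cell:
    """A heap cell with an explicit reference counter (toy model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.count = 0
        self.points_to = None

free_list = []

def add_ref(cell: Cell) -> None:
    cell.count += 1

def drop_ref(cell: Cell) -> None:
    cell.count -= 1
    if cell.count == 0:                  # eager: reclaim as soon as unreachable
        free_list.append(cell.name)
        if cell.points_to is not None:
            drop_ref(cell.points_to)

# A chain a -> b is reclaimed completely once the program drops a.
a, b = Cell("a"), Cell("b")
add_ref(a)                               # a program variable points at a
a.points_to = b
add_ref(b)
drop_ref(a)

# A cycle c -> d -> c is never reclaimed: both counters stay at 1.
c, d = Cell("c"), Cell("d")
add_ref(c)                               # a program variable points at c
c.points_to = d
add_ref(d)
d.points_to = c
add_ref(c)
drop_ref(c)                              # the program variable goes away

print(free_list, c.count, d.count)
```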
Composite Data Types (con’t)
• Heap Management (con’t)
  – Mark-Sweep
    • The run-time system allocates storage cells as requested and disconnects pointers from cells as necessary; mark-sweep then begins
    • Every heap cell has an extra bit used by the collection algorithm
    • All cells are initially set to garbage
    • All pointers are traced into the heap, and reachable cells are marked as not garbage
    • All garbage cells are returned to the list of available cells
    • Disadvantages: in its original form, it was done too infrequently. When done, it caused significant delays in application execution. Contemporary mark-sweep algorithms avoid this by doing it more often; this is called incremental mark-sweep
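The steps above can be sketched on a toy heap. Here the heap is a dict that maps each cell name to the names it points to, and `roots` stands for the program's pointers into the heap; both are assumptions made for illustration.

```python
def mark_sweep(heap: dict, roots: list) -> list:
    """heap maps a cell name to the names it points to; returns the reclaimed cells."""
    marked = {name: False for name in heap}   # all cells initially set to garbage

    # Mark phase: trace every pointer from the roots into the heap.
    stack = list(roots)
    while stack:
        name = stack.pop()
        if not marked[name]:
            marked[name] = True
            stack.extend(heap[name])

    # Sweep phase: return every unmarked cell to the list of available cells.
    return sorted(name for name, reachable in marked.items() if not reachable)

heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
print(mark_sweep(heap, roots=["a"]))
```

Unlike reference counting, the unreachable cycle between c and d is reclaimed, because reachability is decided by tracing from the roots.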
Recursive Data Types
• Recursive or circular data types
• Type composed from objects of the same type
• Example
  – Linked list in C and Pascal
  – ML
      datatype intlist = nil | cons of int * intlist
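A rough Python counterpart of the ML `intlist` declaration, assuming `None` stands in for `nil` and a hypothetical `Cons` dataclass stands in for `cons`:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Cons:
    head: int
    tail: "IntList"            # the type refers to itself through this field

IntList = Optional[Cons]       # None plays the role of ML's nil

def length(lst: IntList) -> int:
    return 0 if lst is None else 1 + length(lst.tail)

def total(lst: IntList) -> int:
    return 0 if lst is None else lst.head + total(lst.tail)

numbers = Cons(5, Cons(10, None))
print(length(numbers), total(numbers))
```

Functions over a recursive type naturally follow its shape: one case for the empty list and one recursive case for a cell.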
Exercises
1. Determine which of the following programming languages are statically typed and which are not (explain by example):
   – Ada
   – Perl
   – Python
   – Haskell
   – Prolog
   – Fortran
   – Ruby
Exercises
2. Give another example of a type error in C.
3. Show two examples of early and late binding in a language.
4. Is there any programming language which does not allow implicit type conversion, say int to float?
5. Which types of coercion are not safe?
6. Compute the weakest pre-condition of
     {?} a = b * -1; {a > 10}
Exercises
2. Using an example, show the rule of consequence in axiomatic semantics.
   $$\frac{\{P\}\ a\ \{Q\},\quad P' \Rightarrow P,\quad Q \Rightarrow Q'}{\{P'\}\ a\ \{Q'\}}$$
3. Find the loop invariant of the following while loop.
     i = 1;
     s = 0;
     while (i <= 10) {
       s = s + i;
       i = i + 1;
     }
Exercises
7. Which programming languages, other than Ada and the different versions of C, support pointers?
8. What are the rules of call-by-value and call-by-reference in Pascal? Give examples.
9. Name two programming languages which have automatic garbage collection. What are the negative and positive effects of this operation in a language?