Programming Languages Pragmatics Data Types. Simple types integer floating point binary-coded decimal...


Page 1: Programming Languages Pragmatics Data Types. Simple types integer floating point binary-coded decimal character boolean user-defined types usually in hardware.

Programming Languages Pragmatics

Data Types

Page 2: Simple types

- not composed of other types
- hardware or software implemented
- usually in hardware: integer, floating point, binary-coded decimal
- usually in software: character, boolean, user-defined types

Page 3: Integer

- 2’s complement
- unsigned
- operations exact within range
- range depends on size of virtual cell
  - typical size: 1, 2, 4, 8 bytes
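The dependence of range on cell size can be sketched in C. This is a minimal illustration, not part of the slides; the function names `unsigned_max`, `signed_max` and `signed_min` are made up for the example, and it only handles the four typical sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of an unsigned cell that is `bytes` wide (1, 2, 4 or 8). */
uint64_t unsigned_max(unsigned bytes) {
    assert(bytes == 1 || bytes == 2 || bytes == 4 || bytes == 8);
    return bytes == 8 ? UINT64_MAX : (UINT64_C(1) << (8 * bytes)) - 1;
}

/* Largest value of a 2's complement cell: 2^(n-1) - 1 for n bits. */
int64_t signed_max(unsigned bytes) {
    return (int64_t)(unsigned_max(bytes) >> 1);
}

/* Smallest value of a 2's complement cell: -2^(n-1). */
int64_t signed_min(unsigned bytes) {
    return -signed_max(bytes) - 1;
}
```

For a 1-byte cell this gives 0..255 unsigned and -128..127 in 2's complement: one more negative value than positive.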

Page 4: Floating Point (FP)

- approximate real numbers, but not dense, not even “equally sparse”
- languages may support at least two FP types: float and double
- may follow the IEEE 754 floating-point standard (Java)
- representations and operations are approximate
- range and precision depend on size of virtual cell (usually 4 or 8 bytes)

8-byte layout: sign (1 bit), exponent (11 bits), mantissa (52 bits)

See the detailed explanation of floating point representation in the following video: http://www.youtube.com/watch?v=t-8fMtUNX1A
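As a rough sketch of the 1 + 11 + 52 bit layout, the fields of a double can be pulled apart in C. The names `fp_fields` and `split_double` are hypothetical, and the code assumes the platform stores `double` as an 8-byte IEEE 754 value (common, but not guaranteed by C).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of an 8-byte IEEE 754 value: 1 + 11 + 52 bits. */
struct fp_fields {
    unsigned sign;       /* 1 bit */
    unsigned exponent;   /* 11 bits, stored with a bias of 1023 */
    uint64_t mantissa;   /* 52 bits, the leading 1 is implicit */
};

struct fp_fields split_double(double d) {
    uint64_t bits;
    struct fp_fields f;
    assert(sizeof bits == sizeof d);    /* assumption: 8-byte double */
    memcpy(&bits, &d, sizeof bits);     /* copy the raw bit pattern */
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

Under that assumption, 1.0 splits into sign 0, exponent 1023 (the bias) and mantissa 0.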

Page 5: Binary Coded Decimal

- ‘exact’ decimal arithmetic, space costly
- decimal digits in 4 bit code
- range and precision depend on size of virtual cell – 2 digits per byte

Example: the digits 5 9 0 5 1 8 7 8 occupy 4 bytes (two 4-bit codes per byte), with a defined decimal point
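Packing two decimal digits per byte might look like the following in C. This is a sketch under assumptions: the helper names `bcd_pack` and `bcd_digit` are invented here, and the position of the decimal point is taken to be fixed by convention rather than stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack an even number of decimal digits (0..9) two per byte, high nibble first. */
void bcd_pack(const uint8_t *digits, size_t ndigits, uint8_t *out) {
    assert(ndigits % 2 == 0);
    for (size_t i = 0; i < ndigits; i += 2) {
        assert(digits[i] <= 9 && digits[i + 1] <= 9);
        out[i / 2] = (uint8_t)((digits[i] << 4) | digits[i + 1]);
    }
}

/* Read digit number `i` (0 = most significant) back out of packed BCD. */
uint8_t bcd_digit(const uint8_t *packed, size_t i) {
    uint8_t byte = packed[i / 2];
    return (i % 2 == 0) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
}
```

Packing the slide's digits 5 9 0 5 1 8 7 8 yields the four bytes 0x59 0x05 0x18 0x78.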

Page 6: Character

- ASCII – 128 character set – 1 byte
- Unicode – originally a 2 byte extension; current encodings (UTF-8, UTF-16, UTF-32) use 1 to 4 bytes per character
- usually coded as unsigned integer

Page 7: Boolean

- 1 bit is sufficient but... no bit-wise addressability in hardware
  - store in a byte – space inefficient
  - store 8 per byte – execution inefficient
- C: 0 = false, non-zero = true
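The "8 per byte" option can be sketched in C as below; the shift and mask on every access are the execution cost the slide refers to. The names `bit_set` and `bit_get` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set boolean number `i` in a packed array that stores 8 booleans per byte. */
void bit_set(uint8_t *bits, size_t i, int value) {
    uint8_t mask = (uint8_t)(1u << (i % 8));
    assert(bits != NULL);
    if (value)
        bits[i / 8] |= mask;              /* any non-zero value counts as true */
    else
        bits[i / 8] &= (uint8_t)~mask;
}

/* Read boolean number `i`; the shift and mask are the extra work per access. */
int bit_get(const uint8_t *bits, size_t i) {
    assert(bits != NULL);
    return (bits[i / 8] >> (i % 8)) & 1;
}
```

Storing one boolean per byte instead would replace both functions with a plain array access, at eight times the space.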

Page 8: User-defined types

- implemented (like character and boolean usually are) as a coding of unsigned integer
- enumerated type (Pascal example):

    type suit = (club, diamond, heart, spade);
    var lead: suit;
    lead := heart;

- internally represented as { 0, 1, 2, 3 }
- operations:

Page 9: User-defined types

- implemented as a restricted range of integer
- subrange type (Ada example):

    subtype CENTURY20 is INTEGER range 1900..1999;
    BIRTHYEAR: CENTURY20;
    BIRTHYEAR := 1981;

Page 10: User-defined types

Type compatibility issues:
- can two enumerated types contain the same constant?
- can defined types be coerced with integer, with each other?

Page 11: When Should Two Types Be Considered Equivalent?

Type equivalence: two principal forms
- Structural equivalence
  - two types are equivalent if they consist of the same components
- Name equivalence
  - every type declaration defines a new type, so two types are the same only if they have the same name
  - more popular in more modern languages

Page 12: Example

    typedef struct {
        int a;
        int b;
    } Point;

    typedef struct {
        int a;
        int b;
    } Pair;

    Point x;
    Pair y;
    x = y;    /* Legal? */

- Java uses name equivalence
- ML is more-or-less structural
- C is a hybrid (structural except for structs)

Page 13: Memory management intro

- The parser creates a symbol table of identifiers, including variables
- Some information (name plus more) is bound at this time, and more as the program is compiled, by storage in the symbol table

e.g. int x;  -->  symbol table entry (name, type, address): x, int, offset

Page 14: Strings

- First use: output formatting only
- Quasi-primitive type in most languages (not just arrays of character)
  - operations: initialization, substring, catenation, comparison
- The length problem: fixed or varying?
- No standard string model

Page 15: Strings - examples

C:

    char *s = "abc";
    int len = strlen(s);

- array of char with terminator: a b c 0
- extended syntax
- library of functions

Java:

    String s = "abc" + x;
    s = s.substring(0, 2);

- fixed length array
- extended syntax
- class with 70 methods

Page 16: Strings - representations

- fixed length and content (static)
- fixed length and varying content (FORTRAN)
- varying length and content by reallocation (Java String)
- varying length and content by extension (Java StringBuffer)
- varying length and content (C)

In symbol table:
- Static str: Length, Address
- Dynamic str: MaxLength, CurrLength, Address
- char*: Address
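The dynamic string descriptor (MaxLength, CurrLength, Address) and "varying length by extension" can be sketched in C roughly as follows. The struct and function names are hypothetical, and the doubling growth policy is an assumption chosen for the example, not something the slides specify.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Dynamic string descriptor: maximum length, current length, address. */
struct dyn_str {
    size_t max_len;
    size_t cur_len;
    char *addr;
};

/* Start with an empty string that has no storage yet. */
struct dyn_str dyn_str_new(void) {
    struct dyn_str s = {0, 0, NULL};
    return s;
}

/* Append `text`, extending the buffer (by doubling) only when it is full.
   Returns 0 on success, -1 if memory could not be obtained. */
int dyn_str_append(struct dyn_str *s, const char *text) {
    size_t n = strlen(text);
    assert(s != NULL);
    if (s->cur_len + n > s->max_len) {
        size_t new_max = s->max_len ? s->max_len : 8;
        while (new_max < s->cur_len + n)
            new_max *= 2;
        char *p = realloc(s->addr, new_max);
        if (p == NULL)
            return -1;
        s->addr = p;
        s->max_len = new_max;
    }
    memcpy(s->addr + s->cur_len, text, n);   /* no terminator: length is stored */
    s->cur_len += n;
    return 0;
}
```

Because the current length lives in the descriptor, no terminating 0 is needed, unlike the C `char*` representation, which stores only the address.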

Page 17: Compound (1) Arrays

- collection of elements of one type
- access to individual elements is computed at execution time by position, O(1) or O(dim)

Page 18: Arrays – design decisions

indexing:
- dimensions – limit? recursive?
- types – int, other, user defined?
- first index: 0, 1, variable
- range checking – no (C), yes (Java)
- syntax for subscript operator: (), []?

Page 19: Arrays – design decisions

binding times:
- type, index type
- index range (i.e. array size), space
  - static
  - fixed stack-dynamic
  - stack-dynamic
  - heap-dynamic
- initial values of elements
  - at storage allocation? e.g. int[] x = {1,2,3};

Page 20: Arrays – operations

- on elements – based on type
- on entire array as variables
  - vector and matrix operations, e.g. APL
  - subarray (~ substring)
  - subarray dimensions (slices)

Page 21: Arrays – storage

Array descriptor <array>:
- element type, size
- index type
- index lower bound
- index upper bound
- address

[Diagram: the address refers to the element storage, which runs from the lower bound to the upper bound]

Page 22: Arrays – element access

Array descriptor <array>: element type, size; index type; index lower bound; index upper bound; address

address of a[i] = address + (i - lower bound) * size
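The access formula address + (i - lower bound) * size translates almost directly into C. The descriptor struct below is a simplified assumption (it omits the type fields), and the names `array_desc` and `element_address` are made up for illustration. The range check is added to show where a language like Java would test the bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified array descriptor with the fields needed for element access. */
struct array_desc {
    size_t elem_size;   /* size of one element in bytes */
    long lower;         /* index lower bound */
    long upper;         /* index upper bound */
    char *address;      /* address of the first element */
};

/* address of a[i] = address + (i - lower bound) * size, with a range check. */
void *element_address(const struct array_desc *a, long i) {
    assert(a != NULL);
    if (i < a->lower || i > a->upper)
        return NULL;                       /* index out of range */
    return a->address + (size_t)(i - a->lower) * a->elem_size;
}
```

The computation is one subtraction, one multiplication and one addition regardless of the array length, which is why access by position is O(1).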

Page 23: Arrays - multidimensional

- contiguous or not
- row major, column major order
- computed location of element
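For a contiguous two-dimensional array with 0-based indices, the computed location differs only in which index is scaled. A minimal sketch, with hypothetical function names, returning the offset in elements from the start of storage:

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of a[i][j] in a rows x cols array stored row by row. */
size_t row_major_offset(size_t i, size_t j, size_t rows, size_t cols) {
    assert(i < rows && j < cols);
    return i * cols + j;
}

/* The same element when the array is stored column by column. */
size_t col_major_offset(size_t i, size_t j, size_t rows, size_t cols) {
    assert(i < rows && j < cols);
    return j * rows + i;
}
```

In a 3 x 4 array, element [1][2] sits at offset 6 in row major order and at offset 7 in column major order.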

Page 24: Jagged arrays

- Implemented as arrays of arrays

[Diagram: one outer <array> descriptor of size 4 whose address refers to four further <array> descriptors of sizes 3, 7, 4 and 5; every descriptor holds index type, index lower bound, index upper bound and address]

Page 25: (2) Associative Arrays - maps

- values accessed by keys, not indices
- no order of elements
- automatic growth of capacity
- operations: add/set, get, remove
- fast search for individual data
- slower for batch processing than array
- Java classes; Perl data structure

Page 26: Associative Arrays - implementation

- hash tables based on key value
- most operations ‘near O(1)’
- expanding capacity may be O(n)

For a Java class that combines features of array and associative array, see LinkedHashMap

Page 27: (3) Records

- multiple elements of any type
- elements accessed by field name
- design issues:
  - hierarchical definition (records within records)
  - syntax of naming
  - scopes for elliptical (incomplete) reference to fields

Page 28: Records - implementation

    type course = record
        dept : array[1..4] of char;
        code : integer;
    end;

Record descriptor <record> course:
- dept: array [1..4] of char, offset 0
- code: integer, offset 4
- address

[Diagram: the record's storage holds the characters C O S C followed by the integer 3127; the dept field is itself described by an <array> descriptor (element type, size; index type; index lower bound; index upper bound; address)]
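Field access by offset can be sketched with a C counterpart of the `course` record. The struct and the helper `read_code` are illustrative assumptions; the exact offset of `code` depends on the compiler's alignment rules, so the code asks for it with `offsetof` instead of hard-coding 4.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* C counterpart of the Pascal record `course`. */
struct course {
    char dept[4];   /* array [1..4] of char, at offset 0 */
    int code;       /* integer field, placed after dept */
};

/* Read the `code` field the way compiled code does: base address + offset. */
int read_code(const struct course *c) {
    int value;
    assert(c != NULL);
    memcpy(&value, (const char *)c + offsetof(struct course, code), sizeof value);
    return value;
}
```

Since field offsets are known at compile time, `c->code` costs no more than an array access with a constant index.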

Page 29: (4) Pascal variant records (unions)

    type coord = (polar, cart);
         point = record
             case rep : coord of
                 polar: (radians : boolean;
                         radius : real;
                         angle : real);
                 cart:  (x : real;
                         y : real);
         end;

Note:
- varying space requirements
- discriminant field (rep) is optional
- type checking loopholes: Ada has a similar variant record but closed these loopholes

Page 30: Other unions

- Fortran EQUIVALENCE
- C union
  - not inside records
  - no type checking
- unions do not cause type coercion - data is reinterpreted

Sebesta’s C example:

    union flexType {
        int intE1;
        float floatE1;
    };
    union flexType ell;
    float x;
    ell.intE1 = 27;
    x = ell.floatE1;
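The difference between reinterpretation and coercion can be made concrete in C. This sketch uses its own hypothetical names (`flex`, `reinterpret_as_float`, `coerce_to_float`) and assumes a 4-byte IEEE 754 `float`; reading a union member other than the one last written is permitted in C but not in C++.

```c
#include <assert.h>
#include <stdint.h>

union flex {
    int32_t as_int;
    float as_float;
};

/* Store an integer, then read the same bytes back as a float (no conversion). */
float reinterpret_as_float(int32_t n) {
    union flex u;
    assert(sizeof(float) == sizeof(int32_t));   /* assumption: 4-byte float */
    u.as_int = n;
    return u.as_float;
}

/* For contrast: a cast coerces, producing the float nearest to n. */
float coerce_to_float(int32_t n) {
    return (float)n;
}
```

Coercing 27 gives 27.0, while reinterpreting the bit pattern of 27 gives a tiny value close to zero, because the integer's bits land in the low end of the float's mantissa.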

Page 31: (5) Sets (Pascal)

- defined on one (discrete) base type
- implementation imposes maximum size (set of integer is not possible)

    type day = (M, Tu, W, Th, F, Sa, Su);
         dayset = set of day;
    var work, wknd : dayset;
        today : day;

    today := F;
    work := [M, Tu, W, Th, F];
    wknd := [Sa, Su, F];
    if today in work * wknd then ...

Bit-vector representation (one bit per day, M to Su):
- work        = 1 1 1 1 1 0 0
- wknd        = 0 0 0 0 1 1 1
- work * wknd = 0 0 0 0 1 0 0
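A set over a small discrete base type maps naturally onto a bit vector, which is also why the implementation limits the set size. The following C sketch uses invented names (`dayset`, `set_of`, `set_union`, `set_intersect`, `set_contains`) and assumes the base type has no more members than an `unsigned` has bits.

```c
#include <assert.h>

enum day { M, Tu, W, Th, F, Sa, Su };   /* ordinals 0..6 */
typedef unsigned dayset;                 /* one bit per member of `day` */

/* Build the singleton set {d}. */
dayset set_of(enum day d) {
    assert(d >= M && d <= Su);
    return 1u << d;
}

/* Union, intersection and membership become |, & and a single bit test. */
dayset set_union(dayset a, dayset b)     { return a | b; }
dayset set_intersect(dayset a, dayset b) { return a & b; }
int set_contains(dayset s, enum day d)   { return (s >> d) & 1u; }
```

Each set operation is a single machine instruction on the whole vector, independent of how many members the sets contain.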

Page 32: (6) Pointers and references

- references are dereferenced pointers (whatever that means)
- primary purpose: dynamic memory access
- secondary purpose: indirect addressing as in machine instructions

Page 33: Pointers (and references)

- a pointer is a data type that stores an address in the format of the machine (usually 4 bytes, or 8 bytes on 64-bit machines) or a “null”
- a pointer must be dereferenced to get the data at the address it contains
- a reference is a pointer data type that is automatically dereferenced

Page 34: Dereferencing example

In C++:

    double x, y;
    Point p(0.0, 0.0);
    Point *pref;
    pref = &p;
    x = p.X;          // field access
    y = (*pref).Y;    // dereferencing, then field access

In Java:

    Point2D.Double p;
    p = new Point2D.Double(0.0, 0.0);
    double xCoord = p.x;    // dereferencing and field access combined

Page 35: Pointers hold addresses

Indirect addressing

In C: pointer to statically allocated memory

    int a, b;
    int *iptr, *jptr;
    a = 100;
    iptr = &a;
    jptr = iptr;
    b = *jptr;

Pointer into an array:

    int x, y, arr[4];
    int *iptr;
    iptr = arr;
    arr[2] = 33;
    x = iptr[2];
    y = *(iptr + 2);

Security loophole…

Page 36: Pointer arithmetic

Arithmetic operations on addresses

    int x;
    int *iptr;
    iptr = &x;
    for (;;) {
        << process loc (*iptr) >>
        iptr++;
    }

Scan through memory starting at x

Page 37: Basic dynamic memory management model

- Heap (memory) manager keeps list of available memory cells
- “Allocate” operation transfers cell from list in heap to program
- “Deallocate” transfers cell from program back to list in heap
- Tradeoffs of fixed or variable sized cells
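With fixed-size cells, the list of available cells can be threaded through the free cells themselves. The C sketch below is one possible arrangement, not the slides' own design; the names `cell`, `cell_heap`, `heap_init`, `cell_allocate` and `cell_deallocate` and the heap size of 4 are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CELLS 4

/* A fixed-size cell; while free, its storage holds the free-list link. */
union cell {
    union cell *next_free;
    double payload;
};

union cell cell_heap[HEAP_CELLS];
union cell *free_list;

/* Thread every cell onto the list of available cells. */
void heap_init(void) {
    free_list = NULL;
    for (size_t i = 0; i < HEAP_CELLS; i++) {
        cell_heap[i].next_free = free_list;
        free_list = &cell_heap[i];
    }
}

/* "Allocate": transfer one cell from the list to the program. */
union cell *cell_allocate(void) {
    union cell *c = free_list;
    if (c != NULL)
        free_list = c->next_free;
    return c;                    /* NULL when the heap is exhausted */
}

/* "Deallocate": transfer the cell from the program back to the list. */
void cell_deallocate(union cell *c) {
    assert(c != NULL);
    assert(c >= cell_heap && c < cell_heap + HEAP_CELLS);
    c->next_free = free_list;
    free_list = c;
}
```

Both operations are constant time because every cell fits every request; variable-sized cells give up that property in exchange for less wasted space.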

Page 38: Problems with pointers and dynamic memory: 1

Dangling reference: pointer points to de-allocated memory

    Point *q;
    Point *p = new Point(0,0);
    q = p;
    delete p;
    // q is dangling - reference to q should cause
    // an error - ‘tombstones’ will do error check

Page 39: Problems with pointers and dynamic memory: 2

Memory leakage: memory cell with no reference to it

    Point *p = new Point(0,0);
    p = new Point(3,4);
    // memory containing Point(0,0) object
    // is inaccessible - counting references will help

Page 40: Cause of reference problems

- Multiple references to a memory cell
- Deallocation of memory cells

Where is responsibility?
- automatic deallocation (garbage collection)
  OR
- user responsibility (explicit ‘delete’ – C++)

Page 41: User management of memory

- Dangling references can be detected as errors but not prevented
- Memory leakage is a continuing problem – can you think of a way to find stranded memory?

    int *p =*q = 6;
    p = null;

[Diagram: before the assignment, p and q both refer to the cell holding 6; after p = null, only q refers to it]

Page 42: Garbage Collection

1. Reference counting: ongoing, “eager”
   - memory cells returned to heap as soon as all references removed
2. Garbage collection: occasional, “lazy”
   - let unreferenced memory cells ‘leak’ till heap is nearly empty, then collect them

Page 43: Reference counting

- When an item is no longer referenced it may be deleted
- Need to keep count of references
- When p is set to null, nothing refers to Association in example
- Does this technique always work? No!

Illustration from http://www.brpreiss.com/books/opus5/html/page422.html

Page 44: Why Does This Fail?

What’s wrong with this?

Illustration from http://www.brpreiss.com/books/opus5/html/page423.html
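A well-known way for reference counting to fail is a cycle: two cells that refer to each other never reach a count of zero, even when the program can no longer reach either. The C sketch below is an illustration under that assumption; the linked illustrations are not reproduced here, and the names `node`, `node_new`, `node_retain`, `node_release`, `node_link` and the counter `live_nodes` are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* A heap cell with a reference count and one outgoing reference. */
struct node {
    int refs;
    struct node *next;
};

int live_nodes = 0;          /* cells not yet returned to the heap */

struct node *node_new(void) {
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->refs = 1;             /* the creator holds the first reference */
    n->next = NULL;
    live_nodes++;
    return n;
}

void node_retain(struct node *n) { if (n) n->refs++; }

/* Drop one reference; free the cell as soon as the count reaches zero. */
void node_release(struct node *n) {
    if (n == NULL || --n->refs > 0)
        return;
    node_release(n->next);   /* a dead cell no longer refers to its successor */
    free(n);
    live_nodes--;
}

/* Make `from` refer to `to`, keeping both counts correct. */
void node_link(struct node *from, struct node *to) {
    node_retain(to);
    node_release(from->next);
    from->next = to;
}
```

If `a` links to `b` and the program releases both, the counts fall to zero and both cells are freed. If `c` and `d` link to each other and the program releases both, each still has a count of 1 from the other, so neither is ever freed.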

Page 45: Garbage Collection (mark-sweep)

1. All cells in memory marked inaccessible (f)
2. Follow all references in program and mark cells accessible (t)
3. Return inaccessible cells to heap

[Diagram: each cell carries an ‘accessible’ marker, shown as f or t]

Classic problem: effect on program performance
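The three mark-sweep steps can be sketched over a toy heap in C. Everything here is an assumption made for the example: cells are array slots, references are slot indices, each cell has at most two references, and the names `gc_cell`, `gc_heap`, `mark` and `collect` are invented.

```c
#include <assert.h>
#include <stddef.h>

#define NCELLS 6
#define NONE   (-1)

/* A heap cell: an 'accessible' marker and up to two references to other cells. */
struct gc_cell {
    int in_use;
    int marked;
    int ref[2];      /* indices of referenced cells, or NONE */
};

struct gc_cell gc_heap[NCELLS];

/* Step 2 helper: mark a cell and everything reachable from it. */
void mark(int i) {
    if (i == NONE)
        return;
    assert(i >= 0 && i < NCELLS);
    if (gc_heap[i].marked)
        return;                            /* already visited: stops on cycles */
    gc_heap[i].marked = 1;
    mark(gc_heap[i].ref[0]);
    mark(gc_heap[i].ref[1]);
}

/* Run one collection from the program's references (roots).
   Returns the number of cells given back to the heap. */
int collect(const int *roots, size_t nroots) {
    int freed = 0;
    for (int i = 0; i < NCELLS; i++)       /* step 1: mark all inaccessible */
        gc_heap[i].marked = 0;
    for (size_t r = 0; r < nroots; r++)    /* step 2: mark what is reachable */
        mark(roots[r]);
    for (int i = 0; i < NCELLS; i++) {     /* step 3: reclaim unmarked cells */
        if (gc_heap[i].in_use && !gc_heap[i].marked) {
            gc_heap[i].in_use = 0;
            freed++;
        }
    }
    return freed;
}
```

Unlike reference counting, this reclaims cells that refer only to each other, since nothing reachable from the program marks them. The cost is that every collection visits all cells, which is the performance problem the slide mentions.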

Page 46: A sloppy Java example

from Main (Data Structures)

    public class ObjectStack
    {
        private Object[] data;
        private int manyItems;
        ....
        public Object pop()
        {
            if (manyItems == 0)
                throw new EmptyStackException();
            return data[--manyItems]; // leaves reference in data
        }
    }

Page 47: Managing heap of variable-sized cells

- Necessary for objects with different space requirements
- Problem: tracking cell size
- Problem: heap defragmentation
  - keep blocks list in size order?
  - keep blocks list in sequence order?