Introduction to Computer Science I Topic 2: Structured Data Types Data abstraction

Post on 22-Feb-2016


Telecooperation/RBG

Technische Universität Darmstadt

Copyrighted material; for TUD student use only

Introduction to Computer Science I
Topic 2: Structured Data Types
Data abstraction

Prof. Dr. Max Mühlhäuser
Dr. Guido Rößling

Dr. G. Rößling, Prof. Dr. M. Mühlhäuser, RBG / Telecooperation


Structures

• The input/output of a function is seldom an atomic value (number, boolean, symbol), but frequently a data object with many different attributes.
  – E.g. a CD: title and price
  – We need mechanisms to combine such pieces of data into compound data
• One of these mechanisms is the structure
  – A structure definition has the following form:

    (define-struct s (field1 … fieldn))

  – For example:

    (define-struct point (x y))


Structure definitions

    (define-struct s (field1 … fieldn))

    (define-struct point (x y))

This definition creates a series of procedures:
• make-s
  – a constructor procedure, which takes n arguments and returns a structure value
  – e.g. (define p (make-point 3 4)) creates a new point
• s?
  – a predicate procedure, which returns true for a value that was generated by make-s and false for every other value
  – e.g. (point? p) → true
• s-field
  – for every field a selector, which takes a structure as an argument and extracts the value of the field
  – e.g. (point-y p) → 4
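The three kinds of generated procedures can be tried out directly. The following is a minimal sketch that uses only the point structure and the name p from this slide; the comments give the values the expressions evaluate to.

```scheme
;; A point has an x and a y coordinate.
(define-struct point (x y))

;; Constructor: build a point from two numbers.
(define p (make-point 3 4))

;; Predicate: recognizes exactly the values built by make-point.
(point? p)    ; true
(point? 42)   ; false

;; Selectors: extract the individual fields.
(point-x p)   ; 3
(point-y p)   ; 4
```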


Design of procedures for compound data

• When do we need structures?
  – If the description of an object consists of many different pieces of information
• How does our design recipe change?
  – Data analysis: Search the problem statement for descriptions of relevant objects, then generate corresponding data types; describe the contract of the data type
  – The definition of a contract can use the newly defined type names, e.g. ;; grant-qualified: Student -> bool
  – Template: Header + Body, which contains all possible selectors
  – Implementation of the body: Design an expression that uses primitive operations, other functions, selector expressions and the variables


Example

;; Data Analysis & Definitions:
(define-struct student (last first teacher))
;; A student is a structure: (make-student l f t)
;; where l, f, and t are symbols.

;; Contract: subst-teacher : student symbol -> student
;; Purpose: to create a student structure with a new
;; teacher name if the teacher's name matches 'Fritz

;; Examples:
;; (subst-teacher (make-student 'Find 'Matthew 'Fritz) 'Elise)
;; = (make-student 'Find 'Matthew 'Elise)
;; (subst-teacher (make-student 'Smith 'John 'Bill) 'Elise)
;; = (make-student 'Smith 'John 'Bill)


Example (continued)

;; Template:
;; (define (subst-teacher a-student a-teacher)
;;   ... (student-last a-student) ...
;;   ... (student-first a-student) ...
;;   ... (student-teacher a-student) ...)

;; Definition:
(define (subst-teacher a-student a-teacher)
  (cond
    [(symbol=? (student-teacher a-student) 'Fritz)
     (make-student (student-last a-student)
                   (student-first a-student)
                   a-teacher)]
    [else a-student]))

;; Test 1:
(subst-teacher (make-student 'Find 'Matthew 'Fritz) 'Elise)
;; expected value:
(make-student 'Find 'Matthew 'Elise)

;; Test 2:
(subst-teacher (make-student 'Smith 'John 'Bill) 'Elise)
;; expected value:
(make-student 'Smith 'John 'Bill)


The meaning of structures in the substitution model (1/2)

• How does define-struct work in the substitution model?

    (define-struct c (f1 … fn))

• This structure definition produces the following operations:
  – make-c : a constructor
  – c-f1 … c-fn : a series of selectors
  – c? : a predicate


The meaning of structures in the substitution model

• We proceed as for every combination
  – Evaluation of the operator and the operands
  – The value of (make-c v1 … vn) is (make-c v1 … vn)
    • In this way, constructors are self-evaluating!
  – The evaluation of (c-fi v) is
    • vi, if v = (make-c v1 … vi … vn)
    • an error in all other cases
  – The evaluation of (c? v) is
    • true, if v = (make-c v1 … vn)
    • false, otherwise
• Try it with the DrScheme Stepper!
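As an illustration of these rules, here is a hand-written trace for the point structure from the earlier slides. It is only a sketch of what the Stepper would show; the intermediate steps are written as comments, and the expressions below them can be evaluated to check the final values.

```scheme
(define-struct point (x y))

;; Step-by-step evaluation in the substitution model:
;;
;;   (point-y (make-point (+ 1 2) 4))
;; = (point-y (make-point 3 4))     ; the operand is evaluated first
;; = 4                              ; the selector picks the second field
;;
;;   (point? (make-point 3 4))
;; = true                           ; the value was built by make-point
;;
;;   (point? 5)
;; = false                          ; any other value

(point-y (make-point (+ 1 2) 4))
(point? (make-point 3 4))
(point? 5)
```

Applying a selector such as point-y to a value that is not a point, for example a number, signals an error, as stated in the rules above.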


Data abstraction

• For procedures we have
  – primitive expressions (+, -, and, or, …)
  – means of combination (procedure implementation)
  – means of abstraction (procedural abstraction)
• We have the same for data:
  – primitive data (numbers, boolean values, symbols)
  – compound data (e.g. structures)
  – data abstraction


Why do we need data abstraction?

• Example: Implementation of an operation for adding rational numbers
  – Rational numbers are composed of a numerator and a denominator, e.g. 1/2 or 7/9.
  – The addition of two rational numbers produces two results: the resulting numerator and the resulting denominator.
  – But a procedure can only return one value.
  – That’s why we would need two procedures: one returns the resulting numerator, the other the resulting denominator.
  – We would have to remember which numerator belongs to which denominator.
• Data abstraction is a method that combines several data objects so that they can be used as a single object. How this combination works is hidden by the abstraction.


Why do we need data abstraction?

• The new data objects are abstract data:
  – They are used without making any assumptions about how they are implemented.
• Data abstraction helps to...
  – elevate the conceptual level at which programs are designed,
  – increase the modularity of designs and
  – enhance the expressive power of a programming language.
• A concrete data representation is defined independently of the programs using the data.
• The interface between the representation and a program using the abstract data is a set of procedures, called selectors and constructors.


Language extensions for handling abstract data

• Constructor: a procedure that creates instances of abstract data from data that is passed to it
• Selector: a procedure that returns a data item that is in an abstract data object
• The component data item returned might be
  – the value of an internal variable
  – or it might be computed.
• Constructors/selectors generated by define-struct are a special case
  – The component data returned by these selectors is one of the values that were passed during the constructor call (never computed)
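To make the difference between a stored and a computed component concrete, here is a small sketch. The temp structure and the procedure temp-fahrenheit are hypothetical names invented for this illustration; they do not appear elsewhere in the lecture.

```scheme
;; Hypothetical example: a temperature stored internally in degrees Celsius.
(define-struct temp (celsius))

;; Stored component: temp-celsius is generated by define-struct and
;; returns exactly the value that was passed to make-temp.

;; Computed component: a hand-written selector that derives the
;; Fahrenheit value on demand instead of storing it.
;; temp-fahrenheit : temp -> number
(define (temp-fahrenheit t)
  (+ 32 (* 9/5 (temp-celsius t))))

(define boiling (make-temp 100))
(temp-celsius boiling)      ; 100
(temp-fahrenheit boiling)   ; 212
```

A user of the abstract data cannot tell from the outside which of the two selectors is stored and which is computed.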


Example: Rational Numbers

• Mathematically represented by a pair of integers: 1/2, 7/9, 56/874, 78/23, etc.
• Constructor: (make-rat numerator denominator)
• Selectors: (numer rn), (denom rn)
• That's all a user needs to know!
  – But it’s not quite enough for the programmer and DrScheme, as we have not defined the “rat” structure; this will follow in a couple of slides.


User-defined Operation for Rational Numbers

Multiplication of x = nx/dx and y = ny/dy

(nx/dx) * (ny/dy) = (nx*ny) / (dx*dy)

;; mul-rat: rat rat -> rat
;; Multiplies two rational numbers
;; Example: (mul-rat (make-rat 1 2) (make-rat 2 3))
;;        = (make-rat 2 6)
(define (mul-rat x y)
  (make-rat (* (numer x) (numer y))
            (* (denom x) (denom y))))


Another Operation on Rational Numbers

Addition of x = nx/dx and y = ny/dy

nx/dx + ny/dy = (nx*dy + ny*dx) / (dx*dy)

;; add-rat: rat rat -> rat
(define (add-rat x y)
  (make-rat (+ (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))

Subtraction and division are defined similarly to addition and multiplication.
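One possible way to write these two operations is sketched below. The names sub-rat and div-rat are assumptions (the slides do not name them). So that the sketch runs on its own, make-rat, numer and denom are defined here via an xy structure, which is the representation this lecture introduces a few slides later.

```scheme
;; Representation of rational numbers as a pair (numerator, denominator).
(define-struct xy (x y))
(define (make-rat n d) (make-xy n d))
(define (numer r) (xy-x r))
(define (denom r) (xy-y r))

;; sub-rat: rat rat -> rat   (assumed name)
;; nx/dx - ny/dy = (nx*dy - ny*dx) / (dx*dy)
(define (sub-rat x y)
  (make-rat (- (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))

;; div-rat: rat rat -> rat   (assumed name)
;; (nx/dx) / (ny/dy) = (nx*dy) / (dx*ny), assuming ny is not 0
(define (div-rat x y)
  (make-rat (* (numer x) (denom y))
            (* (denom x) (numer y))))

;; Examples (results are not reduced):
;; (sub-rat (make-rat 1 2) (make-rat 1 3)) has numerator 1 and denominator 6
;; (div-rat (make-rat 1 2) (make-rat 3 4)) has numerator 4 and denominator 6
```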


A test

Equality:

nx/dx = ny/dy  iff  nx*dy = ny*dx

("iff" means: if and only if)

;; equal-rat?: rat rat -> bool
(define (equal-rat? x y)
  (= (* (numer x) (denom y))
     (* (numer y) (denom x))))


An output operation

To output rational numbers in a convenient form, we define an output procedure using data abstraction.

;; print-rat: rat -> string
(define (print-rat x)
  (string-append (number->string (numer x))
                 "/"
                 (number->string (denom x))))

This is your first example with string manipulation!
• string-append puts several strings together.
• number->string turns a number into a string.
• This is not possible using symbols.


Below the abstract data

• We implemented the operations add-rat, mul-rat and equal-rat? using make-rat, denom, numer.
  – Without implementing make-rat, denom, numer!
  – Even without knowing how they will be implemented…
• We still need to define make-rat, denom, numer.
  – Therefore, we have to glue together numerator and denominator.
  – To achieve this, we create a Scheme structure for storing pairs:

    (define-struct xy (x y))


Representing rational numbers

(define (make-rat n d) (make-xy n d))

(define (numer r) (xy-x r))

(define (denom r) (xy-y r))

• We can define the constructor and the selectors with the assistance of the xy structure.


Using operations on rational numbers

(define one-third (make-rat 1 3))

(define four-fifths (make-rat 4 5))

(print-rat one-third)
"1/3"

(print-rat (mul-rat one-third four-fifths))
"4/15"

(print-rat (add-rat four-fifths four-fifths))
"40/25"


Levels of abstraction

• Programs are built up as layers of language extensions.
• Each layer is a level of abstraction.
• Each abstraction hides some implementation details.
• There are four levels of abstraction in our rational numbers example.

  Programs that use rational numbers
      Rational numbers in the problem domain
  add-rat  mul-rat  equal-rat? …
      Rational numbers as numerators and denominators
  make-rat  numer  denom
      Rational numbers as structures
  make-xy  xy-x  xy-y
      Whatever way structures are implemented


Bottom level

• Level of pairs
• Procedures make-xy, xy-x and xy-y are already constructed by the interpreter due to the structure definition.
• The actual implementation of structures is hidden.

      Rational numbers as structures
  make-xy  xy-x  xy-y
      Whatever way structures are implemented


Second level

• Level of rational numbers as data objects
• Procedures make-rat, numer and denom are defined at this level.
• The actual implementation of rational numbers is hidden at this level.

      Rational numbers as numerators and denominators
  make-rat  numer  denom
      Rational numbers as structures


Third level

• Level of service procedures on rational numbers
• Procedures add-rat, mul-rat, equal-rat?, etc. are defined at this level.
• The implementation of these procedures is hidden at this level.

      Rational numbers in the problem domain
  add-rat  mul-rat  equal-rat? …
      Rational numbers as numerators and denominators


Top level

• Program level
• Rational numbers are used in calculations as if they were ordinary numbers.

  Programs that use rational numbers
      Rational numbers in the problem domain


Abstraction barriers

• Each level is designed to hide implementation details from higher-level procedures.

• These levels act as abstraction barriers.


Advantages of data abstraction

• Programs can be designed one level of abstraction at a time.
• Thereby data abstraction supports top-down design.
  – We can gradually figure out data representations and how to implement the constructors, selectors and service procedures that we need, one level at a time.

• We do not have to be aware of implementation details below the level at which we are programming.

• An implementation can be changed later without changing procedures written at higher levels.


Change data representation

• A few slides ago we saw:

    (print-rat (add-rat four-fifths four-fifths))
    "40/25"

• Our rational numbers are not always in reduced form.
• We decide that rational numbers should be represented in reduced form.
  – 40/25 and 8/5 are the same number.
  – Thanks to data abstraction, our service procedures do not care in which form the number is represented.
  – Procedures like add-rat or equal-rat? work correctly in either case.


Change data representation

• We can change the constructor...

    (define (make-rat n d)
      (make-xy (/ n (gcd n d))
               (/ d (gcd n d))))

• ...or the selectors.

    (define (numer x)
      (/ (xy-x x) (gcd (xy-x x) (xy-y x))))

    (define (denom x)
      (/ (xy-y x) (gcd (xy-x x) (xy-y x))))

gcd is a built-in procedure that produces the greatest common divisor!
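The effect of the change can be checked with the running example. The sketch below uses the constructor variant and repeats add-rat and print-rat from the earlier slides unchanged, so that it runs on its own.

```scheme
(define-struct xy (x y))

;; Constructor that reduces to lowest terms (constructor variant).
(define (make-rat n d)
  (make-xy (/ n (gcd n d))
           (/ d (gcd n d))))
(define (numer r) (xy-x r))
(define (denom r) (xy-y r))

;; Service procedures: unchanged from the earlier slides.
(define (add-rat x y)
  (make-rat (+ (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))

(define (print-rat x)
  (string-append (number->string (numer x))
                 "/"
                 (number->string (denom x))))

(define four-fifths (make-rat 4 5))
(print-rat (add-rat four-fifths four-fifths))   ; "8/5"
```

Only make-rat was touched; add-rat and print-rat now produce reduced results without any modification. Note that this sketch does not handle the input 0/0, for which the greatest common divisor is 0, and does not normalize the sign of the denominator.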


Designing Procedures for Mixed Data

• Up to this point, our procedures have handled only one type of data
  – Numbers
  – Booleans
  – Symbols
  – Types of special structures
• But we often want procedures to operate on different types of data

• We will also learn how to protect procedures from wrong use


Example

• We have (define-struct point (x y)) for points

• Many points are on the x-axis
• In this case we want to represent these points just by a number
  – A = (make-point 6 6), B = (make-point 1 2), C = 1, D = 2, E = 3


Designing Procedures for Mixed Data

• To document our representation of points, we make the following informal definition of data types:

    ;; a pixel-2 is either
    ;; 1. a number
    ;; 2. a point-structure

• Now the contract, the description and the header of a procedure distance-to-0 are easy:

    ;; distance-to-0 : pixel-2 -> number
    ;; to compute the distance of a-pixel to the origin
    (define (distance-to-0 a-pixel) ...)

• How can we differentiate between the data types?
  – With the help of the predicates: number?, point? etc.


Designing Procedures for Mixed Data

• Base structure: Procedure body with a cond-expression that analyzes the type of the input

    (define (distance-to-0 a-pixel)
      (cond
        [(number? a-pixel) ...]
        [(point? a-pixel) ...]))

• We know that in the second case the input is composed of two coordinates …

    (define (distance-to-0 a-pixel)
      (cond
        [(number? a-pixel) ...]
        [(point? a-pixel)
         … (point-x a-pixel) … (point-y a-pixel) …]))


Designing Procedures for Mixed Data

Now it is easy to complete the function…

(define (distance-to-0 a-pixel)
  (cond
    [(number? a-pixel) a-pixel]
    [(point? a-pixel)
     (sqrt (+ (sqr (point-x a-pixel))
              (sqr (point-y a-pixel))))]))

Built-in procedures:
• (sqr x) : x squared
• (sqrt x) : square root of x
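A short usage sketch for both variants of a pixel-2 follows. It repeats the definitions so that it runs on its own, and it assumes that points on the x-axis are given as non-negative numbers, since the number branch returns its input unchanged.

```scheme
(define-struct point (x y))

;; distance-to-0 : pixel-2 -> number
(define (distance-to-0 a-pixel)
  (cond
    [(number? a-pixel) a-pixel]
    [(point? a-pixel)
     (sqrt (+ (sqr (point-x a-pixel))
              (sqr (point-y a-pixel))))]))

(distance-to-0 3)                  ; 3  (a point on the x-axis)
(distance-to-0 (make-point 3 4))   ; 5  (square root of 9 + 16)
```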


Designing Procedures for Mixed Data

• Another example: graphical objects
  – Variants: squares, circles, …
  – Procedures: calculating the perimeter, drawing, …

;; A shape is either
;; a circle structure:
;;   (make-circle p s)
;; where p is a point describing the center
;; and s is a number describing the radius; or
;; a square structure:
;;   (make-square nw s)
;; where nw is the north-west corner point
;; and s is a number describing the side length.

(define-struct circle (center radius))
(define-struct square (nw length))

;; Examples:
;; (make-circle (make-point 5 9) 87)
;; (make-square (make-point 20 5) 5)


Designing Procedures for Mixed Data

• Compute the perimeter:
  – Using our design recipe…

    ;; perimeter : shape -> number
    ;; to compute the perimeter of a-shape
    (define (perimeter a-shape)
      (cond
        [(square? a-shape) ...]
        [(circle? a-shape) ...]))

  – Using our design recipe… (continued)

    ;; perimeter : shape -> number
    ;; to compute the perimeter of a-shape
    (define (perimeter a-shape)
      (cond
        [(square? a-shape)
         ... (square-nw a-shape) ... (square-length a-shape) ...]
        [(circle? a-shape)
         ... (circle-center a-shape) ... (circle-radius a-shape) ...]))


Designing Procedures for Mixed Data

• Compute the perimeter:
  – Final result

    ;; perimeter : shape -> number
    ;; to compute the perimeter of a-shape
    (define (perimeter a-shape)
      (cond
        [(square? a-shape) (* (square-length a-shape) 4)]
        [(circle? a-shape) (* (* 2 (circle-radius a-shape)) pi)]))


Program design and heterogeneous data

• The data analysis gets more important
  – Which classes of objects exist and what are their attributes?
  – Which meaningful groupings of classes are there?
    • So-called “subclass creation”
  – Yields a hierarchy of data definitions in general
• Templates
  – Step 1: cond-expression that analyzes the types of data inside a group
  – Step 2: add selectors accordingly for every branch
  – Alternative: call a procedure specific to the respective data type (e.g. procedures perimeter-circle, perimeter-square)
• Program body
  – Combine the available information in every branch of the case distinction, depending on the purpose
  – Alternative: implement procedures specific to data types one by one using the normal design recipe
• For an overview of the new design process, see HTDP Fig. 18
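The alternative mentioned under "Templates" and "Program body" could look as follows for the shape example. This is a sketch: the names perimeter-circle and perimeter-square come from the slide, while their bodies are simply the two branches of the earlier perimeter definition moved into separate procedures.

```scheme
(define-struct point (x y))
(define-struct circle (center radius))
(define-struct square (nw length))

;; perimeter-circle : circle -> number
(define (perimeter-circle a-circle)
  (* (* 2 (circle-radius a-circle)) pi))

;; perimeter-square : square -> number
(define (perimeter-square a-square)
  (* (square-length a-square) 4))

;; perimeter : shape -> number
;; analyzes the variant and delegates to the type-specific procedure
(define (perimeter a-shape)
  (cond
    [(square? a-shape) (perimeter-square a-shape)]
    [(circle? a-shape) (perimeter-circle a-shape)]))

(perimeter (make-square (make-point 20 5) 5))   ; 20
(perimeter (make-circle (make-point 5 9) 87))   ; approximately 546.6
```

Each type-specific procedure can be designed and tested on its own with the normal design recipe; perimeter itself only contains the case distinction.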


Get your data structures correct first, and the rest of the program will write itself.

David Jones