
Fondamenti di Informatica (Fundamentals of Computer Science)

Introduction to the course and to computer science

Prof. Emiliano Casalicchio

The teacher

- Emiliano Casalicchio
  - [email protected]
- Consulting hours
  - When: Monday 16:00-18:00
  - Where: room D1-17 (1st floor), “Ingegneria dell’Informazione” building

Textbook and teaching material

- Textbook
  - Engineering Computation with MATLAB®, Second Edition, David M. Smith
- The MATLAB® tool
  - http://www.mathworks.com
  - Student edition available for $89
- More material
  - Slides presented and suggested readings

Slides don’t replace the textbook. They are complementary tools.

Both are essential to succeed!

Authoritative information

- Official source
  - information and teaching material
    http://www.ce.uniroma2.it/courses/FOI/
    - friends
    - forums
    - other

18 March 2011 · Emiliano Casalicchio (C)

Teaching material will be predominantly in English

Introduction to Computer Science and Media Computation

- Learning Objectives

Definitions

- Several definitions
  1. the “science of computers”
     - aka: Computer Science
     - “Computer science is no more about computers than astronomy is about telescopes.”
       E. Dijkstra (Dutch computer scientist, 1930-2002; Turing Award in 1972)
  2. the “science of information”, or
  3. “the science of information representation and (automatic) processing”
     - aka: Informatics

Definitions

- “the science of information representation and (automatic) processing”
- what is information?
  - why are we interested in its representation and (automatic) processing?
- actually, we are interested in “solving problems”
  - more precisely, in systematic ways of solving problems
  - Computer Science is concerned with this

A problem to be solved

[figure: the box-and-door problem]

- what do we need to solve this problem “abstractly”?
  - a representation of the involved entities
  - a procedure to be followed (based on the adopted representation)
- representation
  - a way of expressing the relevant information for the problem at hand
- procedure
  - a step-by-step process to be followed to get the solution
  - a “recipe”

Representation + Process

- “the science of information representation and (automatic) processing”
  - now it should be a bit clearer what we are talking about
  - it remains to discuss more explicitly the meaning of “automatic”
    - later …

the box-and-door problem again …

- box representation
  - face A: height(A), width(A)
    - e.g.: height(A) = 220 cm, width(A) = 70 cm
  - face B: height(B), width(B)
    - e.g.: height(B) = 220 cm, width(B) = 110 cm
  - face C: height(C), width(C)
    - e.g.: height(C) = 70 cm, width(C) = 110 cm
- door representation
  - height(door), width(door)
    - e.g.: height(door) = 210 cm, width(door) = 80 cm

the box-and-door problem again …

- procedure (“recipe”)
  - check: height(A) < height(door) AND width(A) < width(door)
    - if true, OK; else, check face B
  - check: height(B) < height(door) AND width(B) < width(door)
    - if true, OK; else, check face C
  - check: height(C) < height(door) AND width(C) < width(door)
    - if true, OK; else FAILURE
- OK?
  - hint: try with the numbers given in the previous slide …
- a new procedure (“new recipe”)
  - check: height(A) < height(door) AND width(A) < width(door)
    - if true, OK; else, “rotate” face A
  - check: width(A) < height(door) AND height(A) < width(door)
    - if true, OK; else, check face B
  - …
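As a rough illustration, the first recipe (check faces A, B and C in order, with no rotation) can be sketched in Python. The representation as (height, width) pairs and the function names are assumptions made for this sketch; the numbers are the example values from the slides. Following the hint, the recipe reports FAILURE on those numbers.

```python
# Representation (an assumption of this sketch): each face and the door
# is a (height, width) pair in cm. Values are the slide's example numbers.
FACES = {"A": (220, 70), "B": (220, 110), "C": (70, 110)}
DOOR = (210, 80)


def fits_as_is(face, door):
    """Check: height(face) < height(door) AND width(face) < width(door)."""
    return face[0] < door[0] and face[1] < door[1]


def first_recipe(faces, door):
    """Check faces A, B, C in order, without rotating them."""
    for name in ("A", "B", "C"):
        if fits_as_is(faces[name], door):
            return "OK (face " + name + ")"
    return "FAILURE"


print(first_recipe(FACES, DOOR))  # prints FAILURE
```

The recipe fails here even though the box can pass (face C, rotated), which is what motivates the new recipe with rotation.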

“general” procedure

- the procedure
  - check: height(A) < height(door) AND width(A) < width(door)
    - if true, OK; else, “rotate” face A
  - check: width(A) < height(door) AND height(A) < width(door)
    - if true, OK; else, check face B
  - …
- does this procedure work only for this box and this door?
- we are interested in general procedures
  - “parametric” procedure
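A “parametric” version of the recipe with rotation might look like the sketch below. The function names and the dictionary representation are hypothetical choices; the point is only that the box and the door are parameters, so the same recipe works for any box and any door.

```python
def fits(height, width, door_height, door_width):
    """Check: height < height(door) AND width < width(door)."""
    return height < door_height and width < door_width


def box_fits_through_door(faces, door):
    """Return the first face that fits (rotated if needed), else None.

    faces: dict mapping a face name to a (height, width) pair
    door:  (height, width) pair
    """
    door_height, door_width = door
    for name, (height, width) in faces.items():
        if fits(height, width, door_height, door_width):
            return name
        # "rotate" the face: swap the roles of height and width
        if fits(width, height, door_height, door_width):
            return name + " (rotated)"
    return None


# The example numbers from the slides (cm)
faces = {"A": (220, 70), "B": (220, 110), "C": (70, 110)}
print(box_fits_through_door(faces, (210, 80)))  # prints C (rotated)
```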

“automatic” processing

- “the science of information representation and (automatic) processing”
- representation
  - box representation
    - face A: height(A), width(A)
    - face B: height(B), width(B)
    - face C: height(C), width(C)
  - door representation
    - height(door), width(door)
- procedure
  - check: height(A) < height(door) AND width(A) < width(door)
    - if true, OK; else, “rotate” face A
  - check: width(A) < height(door) AND height(A) < width(door)
    - if true, OK; else, check face B
  - …
- what do we mean by “automatic”?

a language to write “recipes”

[figure: legend of the flowchart symbols]

- START: recipe processing begins
- END: recipe processing ends
- input / output
- a recipe step
- condition (true / false): selection of alternative paths
- a “collateral” recipe
- X : a named “container”
- X ← expr : put a value in a “container”

an example of “recipe”

[flowchart, written out as steps]

Start
read m and n
x ← m ; y ← n
if x ≠ 0 AND y ≠ 0 is false:
    write: “trivial answer or impossible to calculate”
    End
if true, repeat:
    r ← remainder of x/y
    if r = 0 is true:
        write: “the answer is” y
        End
    if false:
        x ← y
        y ← r

- “try” this recipe with different pairs of non-negative integer numbers
  - are you able to get the “final” answer?
- what does it mean?
  - is it necessary to know that, to process this recipe?
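To “try” the recipe mechanically, the flowchart can be transcribed step by step into Python. This is a sketch under the assumption that m and n are non-negative integers; the function name `recipe` is made up for the example, and the variable names x, y, r follow the flowchart.

```python
def recipe(m, n):
    """Follow the flowchart literally; m and n are non-negative integers."""
    x, y = m, n
    if x != 0 and y != 0:
        while True:
            r = x % y          # r <- remainder of x/y
            if r == 0:
                return "the answer is " + str(y)
            x = y
            y = r
    return "trivial answer or impossible to calculate"


print(recipe(12, 18))  # prints: the answer is 6
print(recipe(0, 5))    # prints: trivial answer or impossible to calculate
```

Note that the machine (here, the Python interpreter) can process the recipe without knowing what the answer means.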

a mathematical problem

- determine z = GCD(i, j), with i, j ∈ N
- D(i) =def the set of the integer divisors of i
- D(i) = { k | i = k·q, k ∈ N+, q ∈ N }, where N+ = N - {0}
- z = GCD(i, j) = max( D(i) ∩ D(j) )

[diagram: i, j → GCD → z]
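The definition can be turned directly into a (slow) computation, which is a useful contrast with a step-by-step recipe. The sketch below assumes i and j are both positive, so that D(i) and D(j) are finite sets; the function names are illustrative only.

```python
def divisors(i):
    """D(i) for i > 0: every k in N+ such that i = k*q for some q in N."""
    return {k for k in range(1, i + 1) if i % k == 0}


def gcd_by_definition(i, j):
    """z = max(D(i) ∩ D(j)), assuming i > 0 and j > 0."""
    return max(divisors(i) & divisors(j))


print(sorted(divisors(12)))       # prints [1, 2, 3, 4, 6, 12]
print(gcd_by_definition(12, 18))  # prints 6
```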

matching problems with recipes

[diagram: the problem “i, j → GCD → z” matched with the flowchart recipe of the previous slides (read m and n; …; write: “the answer is” y)]

BTW: this “recipe” to solve the GCD problem is known as “Euclid’s algorithm” (~300 B.C.)

Selecting Baseball Cards – The Problem

- For example, say you have a big collection of baseball cards and you want to find the names of the 10 “qualified” players with the highest lifetime batting averages.
- To qualify, the players must have been in the league at least 5 years, had at least 100 plate appearances per year, and made fewer than 10 errors per year.
- The cards contain all the relevant information for each player. You just have to organize the cards to solve the problem.

Selecting Baseball Cards – The Steps

Clearly there are a number of steps between the stack of cards and the solution. In no particular order these are:

a. Write down the names of the players from some cards

b. Sort the stack of cards by the lifetime batting average

c. Select all players from the stack with 5 years or more in the league

d. Select all players from the stack with fewer than 10 errors per year

e. Select all players from the stack with at least 100 plate appearances per year

f. Keep the first 10 players from the stack

Selecting Baseball Cards – The recipes

The solution might be:

c. Select all players from the stack with 5 years or more in the league
d. Select all players from the stack with fewer than 10 errors per year
e. Select all players from the stack with at least 100 plate appearances per year
   (steps c, d and e can be done in any order)
b. Sort the stack of cards by the lifetime batting average
f. Keep the first 10 players from the stack
a. Write down the names of the players from these cards
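One possible way to express this recipe in code is sketched below. The `Card` record and its field names are hypothetical (the slides only say that the cards contain all the relevant information), and the plate-appearance test follows the problem statement (“at least 100”).

```python
from dataclasses import dataclass


@dataclass
class Card:
    # Hypothetical fields: the slides do not specify how a card is represented
    name: str
    years_in_league: int
    plate_appearances_per_year: float
    errors_per_year: float
    batting_average: float


def top_qualified(cards, how_many=10):
    stack = [c for c in cards if c.years_in_league >= 5]                # step c
    stack = [c for c in stack if c.errors_per_year < 10]                # step d
    stack = [c for c in stack if c.plate_appearances_per_year >= 100]   # step e
    stack.sort(key=lambda c: c.batting_average, reverse=True)           # step b
    stack = stack[:how_many]                                            # step f
    return [c.name for c in stack]                                      # step a


cards = [
    Card("P1", 6, 150, 3, 0.310),
    Card("P2", 2, 200, 1, 0.350),   # fails step c (only 2 years)
    Card("P3", 8, 120, 12, 0.330),  # fails step d (12 errors per year)
    Card("P4", 9, 400, 5, 0.325),
]
print(top_qualified(cards))  # prints ['P4', 'P1']
```

Steps c, d and e are independent filters, which is why their order does not matter, while sorting must come before keeping the first 10.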

Computer Science / Informatics is about …

- “recipes” to solve problems
  - the methodological facet
    - problem solving and information management
- “machines” to process recipes
  - the technological facet
    - (presently) electronic computing devices and systems that use them
- both have a long history
  - methodologies (Euclid ~300 B.C., …, al-Khwarizmi ~800 A.D., …, Hilbert, Gödel, Turing ~1900 A.D., …)
  - machines (abacus ? B.C., …, Babbage engine ~1800 A.D., …, ENIAC ~1940 A.D., …)

the actual “birthdate” of computer science

- when technological advances made it realistic to build machines able to process recipes millions (and more …) of times faster than humans
  - middle of the past century …
- the MACHINE (computer) does the recipe
  - however hard, tedious and complex it is
    - Crank through a million genomes?
    - Find one person on a campus of 30,000?
    - Process a million dots on the screen or a billion sound samples?
    - No problem!
  - BTW: That’s media computation
- a word of caution: we do not (yet?) have recipes and machines to solve every problem
  - intractable problems
  - non-computable problems

Computer science is concerned with the study of recipes

- Computer scientists study…
  - How the recipes are written
    - algorithms, software engineering
  - The “units” used in the recipes
    - data structures, databases
  - What recipes can be written for
    - systems, intelligent systems, theory
  - How well the recipes work
    - human-computer interfaces

Specialized Recipes

- computer scientists can also specialize in special kinds of recipes
  - recipes that create pictures, sounds, movies, animations (graphics, computer music)
  - like other people specialize in crepes or barbecue ...
- still others look at emergent properties of computer “recipes”
  - What happens when lots of recipes talk to one another (networking, non-linear systems) …
- our focus will be on recipes to solve engineering problems
  - computing a liquid level, measuring a solid object, controlling robot arm motion, encryption, processing images
- despite specialization, they share several core concepts with other C.S. fields
  - playing a particular game, we will learn general rules …

core concept: information representation

- “recipes” work on abstractions (representations)
  - … and we know what they mean
- “machines” execute recipes to manipulate representations
  - without knowing what they mean

What computers understand

- quotation from previous slide:
  - “machines” execute recipes to manipulate representations
    - without knowing what they mean
- It’s not really multimedia at all.
  - It’s unimedia
    - (said Nicholas Negroponte, founder of MIT Media Lab)
  - Everything is 0’s and 1’s
- Computers are exceedingly stupid
  - The only data they understand is 0’s and 1’s
  - They can only do the most simple things with those 0’s and 1’s
    - Move this value here
    - Add, multiply, subtract, divide these values
    - Compare these values, and if one is less than the other, follow this step rather than that one.
  - Done fast enough, those simple things can be amazing.

How a computer works

- just an outline …
- The part that does the adding and comparing is the Central Processing Unit (CPU).
- The CPU talks to the memory
  - Think of it as a sequence of millions of mailboxes, each one byte in size, each of which has a numeric address
- The hard disk provides 10 times or more the storage of the memory, but is millions of times slower
- The display is the monitor or LCD (or whatever)

Let’s take a step back to the 19th century

- Babbage’s difference engine
- Designed in 1847-1849 (Difference Engine No. 2)
- Built in 1991 (Science Museum, London)

Colossus: built during World War II

- designed by Tommy Flowers for Max Newman’s section at Bletchley Park, to crack the Lorenz cipher (not the Enigma code)

The Von Neumann architecture

- Conceived around 1945

ENIAC

Suggested reading

http://en.wikipedia.org/wiki/John_von_Neumann


Central Processing Unit (CPU)

- is in charge of executing the operations of a recipe
  - the recipe is stored in memory
- Execution cycle
  1. Fetch an operation (instruction) from memory
     - each operation is coded according to predefined “rules”
  2. Decode operation
  3. Execute operation
- Each CPU is characterized by its own operation codes (machine language):
  0100 0000 0000 1000
  0100 0000 0000 1001
  0000 0000 0000 1000
  ...
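The fetch / decode / execute cycle can be imitated with a toy simulator. Everything about the instruction format below is an assumption made for the sketch (16-bit words, 4-bit operation code, 12-bit address, four invented operations); it is not the machine language shown on the slide nor that of any real CPU.

```python
# Hypothetical 16-bit instruction format (an assumption, not a real CPU):
# the top 4 bits are the operation code, the low 12 bits are a memory address.
HALT, LOAD, ADD, STORE = 0b0000, 0b0001, 0b0010, 0b0011


def encode(opcode, address=0):
    """Pack an operation code and an address into one 16-bit word."""
    return (opcode << 12) | address


def run(memory):
    """Execute the program stored in memory, starting at address 0."""
    accumulator = 0
    program_counter = 0
    while True:
        instruction = memory[program_counter]   # 1. fetch
        program_counter += 1
        opcode = instruction >> 12              # 2. decode
        address = instruction & 0x0FFF
        if opcode == LOAD:                      # 3. execute
            accumulator = memory[address]
        elif opcode == ADD:
            accumulator += memory[address]
        elif opcode == STORE:
            memory[address] = accumulator
        elif opcode == HALT:
            return memory
        else:
            raise ValueError("unknown operation code")


program = [
    encode(LOAD, 8), encode(ADD, 9), encode(STORE, 10), encode(HALT),
    0, 0, 0, 0,   # unused cells 4..7
    2, 3, 0,      # data: cell 8 = 2, cell 9 = 3, cell 10 = result
]
run(program)
print(program[10])  # prints 5
```

Both the recipe (cells 0 to 3) and its data (cells 8 to 10) live in the same memory, as stated above.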

Memory

- Main memory: central memory
  - stores data and operations of running programs
    - … in binary format
  - volatile
  - “random access” (RAM)
    - constant access time
    - SRAM, DRAM, etc.
  - fast (~10-100 ns), expensive
  - “limited” capacity (up to a few gigabytes)
  - hierarchical structure
    - 1st, 2nd level cache, …
- Secondary memory: hard disk, CD, etc.
  - non-volatile
  - large capacity (several hundred gigabytes)
  - slow (~ms and more), cheap

Central Memory

- Consisting of cells (locations), each of them consisting in turn of a fixed number of binary elements
  - each binary element can store (represent) only two values: 0 or 1
    - binary digit -> bit
  - usually: one cell = 1 byte (8 bits)
- Each cell is associated with an address in the range [0, 1, …, M-1]
  - M: memory dimension
  - main memory can be seen as a “vector” of bytes
- CPU reads/writes cell content by specifying the cell address
  - read: to fetch the content of a memory cell
  - write: to modify the content of a memory cell
  - m-bit address ⇒ address space 2^m
    - not necessarily: M = 2^m

[figure: Memory as a column of 8-bit cells, from byte 0, byte 1, … up to byte M-2, byte M-1]
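The “vector of bytes” view maps naturally onto a Python `bytearray`. In the sketch below the values of m and M are arbitrary choices for illustration, and `read` / `write` are hypothetical helper names.

```python
m = 16                      # address width in bits (arbitrary for this sketch)
ADDRESS_SPACE = 2 ** m      # number of distinct addresses: 65536
M = 40000                   # installed cells; not necessarily equal to 2**m

memory = bytearray(M)       # M cells of one byte each, all initially 0


def write(address, value):
    """Modify the content of a memory cell (value must be in 0..255)."""
    if not 0 <= address < M:
        raise IndexError("address outside [0, M-1]")
    memory[address] = value


def read(address):
    """Fetch the content of a memory cell."""
    if not 0 <= address < M:
        raise IndexError("address outside [0, M-1]")
    return memory[address]


write(3, 65)
print(read(3))         # prints 65
print(ADDRESS_SPACE)   # prints 65536
```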

Layers of abstraction

- usually, we do not have this “raw” view of a computer
- high level operations
  - mathematical functions (log, sine, …)
  - text processing
  - ...
- high level memory
  - set of cells identified by a name
    - user defined
  - high level cell content
    - text, image, set of …, …
  - secondary memory organized as a set of named files
- obtained through stratified layers of software

Interactions between Hardware and Software

[figure: layers, top to bottom: Application Software, System Software, Hardware]

Key Concept: Encodings

- We can interpret the 0’s and 1’s in computer memory any way we want.
  - We can treat them as numbers.
  - We can encode information in those numbers
- Even the notion that the computer understands numbers is an interpretation
  - We encode the voltages on wires as 0’s and 1’s, eight of these defining a byte
    - Which we can, in turn, interpret as a decimal number
- BTW: why do we interpret this string of 0’s and 1’s as 74?
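The bit string from the slide's figure is not reproduced in this transcript; assuming it is the 8-bit pattern 01001010 (the binary form of 74), the positional interpretation can be sketched as follows. Each bit contributes its value times a power of two: 64 + 8 + 2 = 74.

```python
def bits_to_decimal(bits):
    """Interpret a string of 0's and 1's as an unsigned binary number."""
    value = 0
    # The rightmost bit has position 0 and weight 2**0
    for position, bit in enumerate(reversed(bits)):
        value += int(bit) * 2 ** position
    return value


print(bits_to_decimal("01001010"))  # prints 74
```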

Layer the encodings as deep as you want

- ASCII encoding for characters
  - “A” coded as 65
  - “B” coded as 66
  - …
  - If there’s a byte with a 65 in it, and we decide that it’s a character, POOF! It’s an “A”!
- We can string lots of these numbers together to make usable text
  - “77, 97, 114, 107” stands for “Mark”
  - “60, 97, 32, 104, 114, 101, 102, 61” stands for “<a href=” (HTML)
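These examples can be checked with Python's built-in `chr` and `ord`, which map between numbers and characters (for values below 128 they agree with ASCII). The helper names `decode` and `encode` are just illustrative.

```python
def decode(numbers):
    """Interpret each number as a character code and join the characters."""
    return "".join(chr(n) for n in numbers)


def encode(text):
    """Return the list of character codes for a piece of text."""
    return [ord(ch) for ch in text]


print(decode([77, 97, 114, 107]))                      # prints Mark
print(decode([60, 97, 32, 104, 114, 101, 102, 61]))    # prints <a href=
print(encode("AB"))                                    # prints [65, 66]
```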

Layered encodings

- A number is just a number
- If you have to treat it as a letter, there’s a piece of software that does it
  - For example, that associates 65 with the graphical representation for “A”
- If you have to treat it as part of an HTML document, there’s a piece of software that does it
  - That understands that “<A HREF=” is the beginning of a link
- That part that knows HTML communicates with the part that knows that 65 is an “A”

Multimedia is unimedia

- But that same byte with a 65 in it might be interpreted as…
  - A very small piece of sound (e.g., 1/44100-th of a second)
  - The amount of redness in a single dot in a larger picture
  - The amount of redness in a single dot in a larger picture which is a single frame in a full-length motion picture
- We use software to manage all these layers
  - How do you decide what a number should mean, and how you should organize your numbers to represent all the data you want?
    - That’s data structures
- If that sounds like a lot of data, it is
  - To represent all the dots on your screen probably takes more than 3,145,728 bytes
  - Each second of sound on a CD takes 44,100 samples (176,400 bytes at 16 bits per sample, in stereo)
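The sizes quoted above can be reproduced with a little arithmetic. The screen resolution and bytes per dot are assumptions (1024 × 768 dots at 4 bytes each happens to give exactly 3,145,728). For CD audio, 44,100 is the number of samples per second; with the standard 2 bytes per sample and 2 channels, that is 176,400 bytes per second.

```python
# Screen: assumed 1024 x 768 dots, 4 bytes per dot (an assumption of this sketch)
width, height, bytes_per_dot = 1024, 768, 4
screen_bytes = width * height * bytes_per_dot

# CD audio: 44,100 samples per second, 16 bits (2 bytes) per sample, 2 channels
samples_per_second, bytes_per_sample, channels = 44100, 2, 2
cd_bytes_per_second = samples_per_second * bytes_per_sample * channels

print(screen_bytes)          # prints 3145728
print(cd_bytes_per_second)   # prints 176400
```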

Why digitize media?

- We work with digital encoding of media
  - digitization
- Digitizing media is encoding media into numbers
  - Real media is analogue (continuous).
  - To digitize it, we break it into parts where we can’t perceive the parts.
- By converting them into digital format, we can more easily manipulate them, store them, transmit them without error, etc.
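As a small illustration of breaking a continuous signal into parts, the sketch below samples a sine wave at regular instants and stores each sample as a number from 0 to 255 (one byte). The function name, the choice of a pure tone, and the 8-bit resolution are assumptions made for the example.

```python
import math


def digitize(frequency_hz, duration_s, sample_rate=44100):
    """Sample a pure tone and quantize each sample to an integer in 0..255."""
    count = int(duration_s * sample_rate)
    samples = []
    for i in range(count):
        t = i / sample_rate                                # time of sample i
        level = math.sin(2 * math.pi * frequency_hz * t)   # -1.0 .. 1.0
        samples.append(int((level + 1.0) / 2.0 * 255))     # map to 0..255
    return samples


samples = digitize(440, 0.001)   # one millisecond of a 440 Hz tone
print(len(samples))              # prints 44
print(samples[:4])
```

Once the sound is a list of numbers, it can be stored, copied and manipulated like any other data.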

How can it work to digitize media?

- Why does it work that we can break media into pieces and we don’t perceive the breaks?
- We can only do it because human perception is limited.
  - We don’t see the dots in the pictures, or the gaps in the sounds.
- We can make this happen because we know about physics (science of the physical world) and psychophysics (psychology of how we perceive the physical world)

“talking” with computers

- We need a language to exchange information with computers
  - data, recipes, …
- Different programming languages are different ways (encodings) to tell computers the same things

Programming languages and layers of abstraction

- Different languages at different layers
- machine language
  - Sequence of binary instructions, directly executable by the CPU
    0100 0000 0000 1000
    0100 0000 0000 1001
    0000 0000 0000 1000
- Assembler
  - Instructions in 1-to-1 correspondence with binary instructions, but expressed with symbolic (human understandable) names
    LOAD X
    ADD Y
    STORE Z
- high level language
  - Machine independent. Data abstraction
    def fun():
        a = 0
        print(a + 5)

translation from layer to layer

Program in a high level language:

    def swap(v, k):
        temp = v[k]
        v[k] = v[k+1]
        v[k+1] = temp

compiler ↓

Program in assembler language (MIPS):

    swap: muli $2, $5, 4
          add  $2, $4, $2
          lw   $15, 0($2)
          lw   $16, 4($2)
          sw   $16, 0($2)
          sw   $15, 4($2)
          jr   $31

assembler ↓

Program in binary machine language (MIPS):

    00000000101000010000000000011000
    00000000100011100001100000100001
    10001100011000100000000000000000
    10001100111100100000000000000100
    10101100111100100000000000000000
    10101100011000100000000000000100
    00000011111000000000000000001000

Programming language

- Each programming language is characterized by:
  1. syntax
  2. semantics
- A natural language sentence can be syntactically correct, but with no meaning at all!
  - the grass reads the house
- The same holds for sentences in a programming language.
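A rough Python analogue of “the grass reads the house” is an expression that parses correctly but to which the language cannot give a meaning. In the sketch below, “meaning” is simplified to “can be evaluated without a type error”, which is only one narrow notion of semantics; the helper names are hypothetical, while `ast.parse` and `eval` are standard Python.

```python
import ast


def is_syntactically_correct(source):
    """True if the text obeys Python's grammar (syntax only)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def has_meaning(expression):
    """True if a syntactically correct expression can actually be evaluated."""
    try:
        eval(expression)
        return True
    except TypeError:
        return False


print(is_syntactically_correct('"the grass" + 5'))  # prints True
print(has_meaning('"the grass" + 5'))               # prints False
print(is_syntactically_correct("x = = 5"))          # prints False
```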

Why should you need to study computer science? or “recipes”?

- To understand better the “recipe-way” of thinking
  - It’s influencing everything, from computational science to bioinformatics
  - Eventually, it’s going to become part of everyone’s notion of a liberal education
  - That’s the process argument
  - BTW, to work with and manage computer scientists
- AND … to communicate!
  - Writers, marketers, producers communicate through computation
- We’ll somehow take these in opposite order