Introduction to programming languages

CSC8304 – Computing Environments for Bioinformatics - Lecture 6

Objectives

Concepts of programming

Programming languages

Development of computer programs

Why computer programs?

Problems:

• Arranging the text of a letter

• Collecting and maintaining data about customers

• Calculating the best investment portfolio

• Taking a photo with your mobile phone

• Synchronising the components of a car engine

Computer programs aim to solve such problems, which all involve electronically stored and processed data

Solving problems

Problem description

Data collection

Problem analysis (including data analysis)

Designing a solution

Implementing the solution

Algorithms

Algorithm = systematic processing of actual or virtual data

Specification of input and output data

Specification of methods of data processing

E.g. Euclid’s greatest common divisor algorithm:

• a, b two positive integers – what is their gcd?

• x = a, y = b

• If x > y then n = x, d = y otherwise n = y, d = x

• Divide n by d to get quotient q and remainder r, so that n = q * d + r; then x = d, y = r

• If y = 0 then gcd = x, otherwise repeat from the comparison of x and y
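The steps of Euclid's algorithm can be sketched in Python roughly as follows. This is a minimal illustration, not part of the original slides: the function name gcd_by_steps is made up for this example, and the loop makes explicit the assumption that the steps repeat until the remainder is 0.

```python
def gcd_by_steps(a: int, b: int) -> int:
    """Greatest common divisor of two positive integers (hypothetical helper)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    x, y = a, b
    while y != 0:
        # Order the pair: n is the larger value, d the smaller one.
        n, d = (x, y) if x > y else (y, x)
        # Division with remainder: n = q * d + r, with 0 <= r < d.
        q, r = divmod(n, d)
        # Continue with the divisor and the remainder.
        x, y = d, r
    return x


print(gcd_by_steps(48, 18))
```

Because the remainder r is always smaller than d, the values shrink on every pass and the loop is guaranteed to stop.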

Early computers

Binary data entry – punch-cards

Machine language: e.g. MOV A,B; LLR; etc.

Difficult to program – easy to make errors

Constants and variables

Constant: a fixed value, e.g. 5

Named constant: a fixed value with a name, e.g. a=5

Variable x – a placeholder for a value (e.g. number, text)

‘:=’ – assignment of a value to a variable: the contents of the variable with the given name take the specified value

x := x + 1 makes sense: the right-hand side is evaluated first and the result is stored back in x

• x := 5, x := x+1, now the value of x is 6

Other variables: s := ‘Hello!’, y := (2, ‘apples’, ‘table’)
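As a small illustration, the same assignments might look like this in Python, which writes assignment as = rather than :=. The choice of a tuple for the compound value y is an assumption made for this sketch.

```python
x = 5                        # x := 5
x = x + 1                    # the right-hand side (5 + 1) is evaluated first, then stored in x
s = "Hello!"                 # a text (string) value
y = (2, "apples", "table")   # a compound value, here stored as a tuple

print(x)  # prints 6
```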

Data types

Data is stored in variables with names

Variable: name + type + contents

Type determines what kind of contents the variable may have: e.g. integer, floating-point real, string, combination of other data types

E.g.

• int x, x := 5 is allowed, x := 5.1 is not allowed

• string s, s := ‘hello kids’ is allowed, s := 3 is not allowed

Type definition for combined types:

• addr = record (int nr, string st, string ct, string pc)

• addr a, a := (5, ‘Hyde’, ‘York’, ‘YO2 4RH’)
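A rough Python counterpart of the addr record is sketched below using a dataclass. Note the assumptions: Python does not enforce type annotations at run time, so the helper is_valid_int_value (a name invented for this example) only imitates the "x := 5.1 is not allowed" rule that a statically typed language would check for you.

```python
from dataclasses import dataclass


@dataclass
class Addr:
    """Combined type: house number, street, city, postcode."""
    nr: int
    st: str
    ct: str
    pc: str


def is_valid_int_value(value: object) -> bool:
    """Return True only for genuine integers (bool is excluded on purpose)."""
    return isinstance(value, int) and not isinstance(value, bool)


a = Addr(5, "Hyde", "York", "YO2 4RH")
print(a.ct)
print(is_valid_int_value(5), is_valid_int_value(5.1))
```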

Operators

Operators: +, -, concatenate, <=

• a:=5+3, s:=concatenate(‘hot’, ‘dog’), a<=5

Each type has a range of operators that can be applied to variables of that type

Operator overloading: some operators may apply in different ways to data of different types

In the case of subtypes, e.g. integer as a subtype of real, additional operators may apply to the subtype – e.g. integer division
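To make these points concrete, here is a short Python sketch. In Python the + operator is overloaded: it adds numbers and concatenates strings. The variable names are chosen for this example only.

```python
a = 5 + 3               # + on integers: addition
s = "hot" + "dog"       # + on strings: concatenation (operator overloading)
is_small = a <= 5       # comparison operator, gives a truth value

real_quotient = 7 / 2   # division on reals gives 3.5
int_quotient = 7 // 2   # integer division gives 3
remainder = 7 % 2       # remainder of the integer division gives 1

print(a, s, is_small, real_quotient, int_quotient, remainder)
```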

Early programming languages

Fortran, Cobol

Easier to program than machine code

Introduced flow control structures

Flow control – 1 (conditions)

If-then-else

Branching depending on a condition

If <condition> then <Tblock> else <Fblock>

E.g.

• If x=5 then a:=2 else a:=1

• If (signal, left) then (turn, left) else (turn, right)
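The two examples could be written in Python along these lines. The function names choose_a and turn are hypothetical, and representing the signal as a string is an assumption of this sketch.

```python
def choose_a(x: int) -> int:
    """If x = 5 then a := 2 else a := 1."""
    if x == 5:
        a = 2
    else:
        a = 1
    return a


def turn(signal: str) -> str:
    """Turn left when the signal says left, otherwise turn right."""
    if signal == "left":
        return "left"
    else:
        return "right"


print(choose_a(5), choose_a(7), turn("left"), turn("none"))
```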

Flow control – 2 (loops)

for – fixed length cycling

for <init statement>, <increment statement>, <condition statement>, <execution statement>

E.g.

• for {i:=1,a:=1}, i:=i+1, i<=100, do a:=a*i;

while, repeat – variable length cycling

while <condition statement>, <execution statement>

repeat <execution statement>, <condition statement>

E.g.

• while i<100, do a:=a*i, i:=i+1

• repeat a:=a*i, i:=i+1, until i=100
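The three loop forms can be compared in a small Python sketch. Two assumptions are made here: the loop bounds are aligned so that all three versions compute the same product 1 * 2 * ... * n, and the repeat-until form is imitated with while True plus break, because Python has no repeat statement. The function names are invented for this example.

```python
def product_for(n: int) -> int:
    """Fixed-length cycling: the number of passes is known in advance."""
    a = 1
    for i in range(1, n + 1):
        a = a * i
    return a


def product_while(n: int) -> int:
    """The condition is tested before each pass, so the body may run zero times."""
    a, i = 1, 1
    while i <= n:
        a = a * i
        i = i + 1
    return a


def product_repeat(n: int) -> int:
    """The condition is tested after each pass, so the body runs at least once."""
    a, i = 1, 1
    while True:
        a = a * i
        i = i + 1
        if i > n:   # plays the role of "until i > n"
            break
    return a


print(product_for(5), product_while(5), product_repeat(5))
```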

Structured programming

Structured programming was introduced in the late 1960s – early 1970s

Pascal, C

Flow control is packaged into procedures, data are separated between program structures

Result: better understanding, better design, better programs with fewer errors

Procedures and functions

Procedures: blocks of programs containing flow control structures with a set of specified input data and a set of specified output data

Functions: similar to procedures, but generate a single output value (i.e. they behave like mathematical functions)

Procedures are called with a set of actual values for their formal input variables and a set of variables specified for their formal output variables

E.g.

• procedure Draw (int x,y,z,w)

• procedure Prediction (int x,y,z; var int a,b)

• int function Length (string s)

• Length(‘hello’)

• Draw(10,10,50,50)
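A possible Python rendering of Length and Prediction is sketched below. Python has no var output parameters, so the two outputs a and b are returned as a tuple instead; that substitution, and the placeholder arithmetic inside prediction, are assumptions made purely for illustration.

```python
from typing import Tuple


def length(s: str) -> int:
    """Function: produces a single output value."""
    return len(s)


def prediction(x: int, y: int, z: int) -> Tuple[int, int]:
    """Procedure with two outputs, returned together as a tuple.

    The arithmetic is a placeholder, not a real prediction method.
    """
    a = x + y + z
    b = max(x, y, z)
    return a, b


total, largest = prediction(1, 2, 3)   # actual values in, two result variables out
print(length("hello"), total, largest)
```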

Recursive procedures

Recursive procedure: a procedure that calls itself

Data separation

E.g.

• Procedure Gcd (int a,b; var int g)
    int x,y,r,q,n,d;
    x:=a; y:=b;
    if x>y then {n:=x; d:=y} else {n:=y; d:=x};
    q:=n div d; r:=n – q*d;
    x:=d; y:=r;
    if y=0 then g:=x else Gcd(x,y,g);
  end;
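The Gcd procedure might be sketched in Python as follows. The result is returned rather than written to a var parameter, and the name gcd_recursive is chosen for this example. Each call works on its own local n, d and r, which is the data separation the slide refers to.

```python
def gcd_recursive(a: int, b: int) -> int:
    """Greatest common divisor of two positive integers, computed recursively."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    # n is the larger value, d the smaller one.
    n, d = (a, b) if a > b else (b, a)
    r = n % d                     # remainder of n divided by d
    if r == 0:
        return d                  # d divides n exactly, so d is the gcd
    return gcd_recursive(d, r)    # the procedure calls itself on a smaller pair


print(gcd_recursive(48, 18))
```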

Object oriented programming

Object-oriented programming emerged in the 1970s and became a mainstream programming paradigm in the late 1980s – early 1990s

Aims:

• Better description of real world problems

• Better software design

• Increased reliability of large software systems

Smalltalk, Delphi, C++, C#, Java

Classes and objects – 1

Class: encapsulation of data and data manipulation, such that interference with the outside is the minimum necessary

Class: attributes and methods – some visible from the outside, most visible only inside

E.g.

• Class Square
    int llx,lly,dx,color
    Create
    Destroy
    Draw
    FillDraw

Square S, S.Create – an object is an instance of a class
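A minimal Python version of the Square class could look like this. Several choices are assumptions of the sketch: __init__ plays the role of Create, Destroy is left to Python's automatic memory management, and the drawing methods return descriptive text instead of producing graphics.

```python
class Square:
    """Encapsulates the data of a square together with the operations on it."""

    def __init__(self, llx: int, lly: int, dx: int, color: int) -> None:
        # Lower-left corner, side length and colour code.
        self.llx = llx
        self.lly = lly
        self.dx = dx
        self.color = color

    def draw(self) -> str:
        return f"outline of square at ({self.llx}, {self.lly}) with side {self.dx}"

    def fill_draw(self) -> str:
        return f"{self.draw()}, filled with colour {self.color}"


s = Square(10, 10, 50, 3)   # s is an object: an instance of the class Square
print(s.fill_draw())
```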

Classes and objects – 2

Classes can be defined as derivatives of other classes – inheritance

Derived classes inherit attributes and methods from the parent class and may add further attributes and methods to these or may change the definition of some inherited ones

E.g. Class Rectangle (Square)

int dy (new attribute)

(int llx,lly,dx,color – inherited)

Draw (redefined)

FillDraw (redefined)

Rotate (new method)

(Create, Destroy – inherited)
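The inheritance example can be sketched in Python as below. A simplified Square is repeated so that the block runs on its own; the text returned by the methods and the behaviour of rotate (swapping the two side lengths) are assumptions made for illustration.

```python
class Square:
    def __init__(self, llx: int, lly: int, dx: int, color: int) -> None:
        self.llx, self.lly, self.dx, self.color = llx, lly, dx, color

    def draw(self) -> str:
        return f"square {self.dx}x{self.dx}"

    def fill_draw(self) -> str:
        return f"filled {self.draw()}"


class Rectangle(Square):
    """Derived class: inherits llx, lly, dx and color, and adds dy."""

    def __init__(self, llx: int, lly: int, dx: int, dy: int, color: int) -> None:
        super().__init__(llx, lly, dx, color)   # reuse the parent's set-up
        self.dy = dy                            # new attribute

    def draw(self) -> str:                      # redefined
        return f"rectangle {self.dx}x{self.dy}"

    def fill_draw(self) -> str:                 # redefined
        return f"filled {self.draw()} in colour {self.color}"

    def rotate(self) -> None:                   # new method
        self.dx, self.dy = self.dy, self.dx


r = Rectangle(0, 0, 4, 2, 1)
print(r.draw())
r.rotate()
print(r.draw())
```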

Flow control with exceptions

Objects are instances of classes and many objects exist simultaneously, which leads to concurrent execution of objects

Objects interact by sending messages – i.e. invoking those methods of other objects that are visible from the outside

Flow control: try – catch – throw

Exception: a signal that execution went wrong for some reason

E.g.

try
    R.Draw;
    return(‘OK’);
catch (exception e)
    throw GraphicsExceptionFault;
    return(‘Error’);
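In Python the same pattern uses try, except and raise. The sketch below is hedged in several ways: GraphicsFault, draw and safe_draw are hypothetical names, and draw fails only when given an empty name so that both paths can be exercised. Unlike the pseudocode, the handler returns 'Error' instead of throwing again, because a statement placed after a throw would never be reached.

```python
class GraphicsFault(Exception):
    """Hypothetical exception type for drawing problems."""


def draw(shape_name: str) -> None:
    """Pretend to draw a shape; raise (throw) when there is nothing to draw."""
    if not shape_name:
        raise GraphicsFault("nothing to draw")


def safe_draw(shape_name: str) -> str:
    try:
        draw(shape_name)
        return "OK"
    except GraphicsFault:
        # Control jumps here as soon as draw raises the exception.
        return "Error"


print(safe_draw("R"), safe_draw(""))
```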

Functional programming

Everything is written as a function, the program is a combination of functions

LISP

Applied in AI (Artificial Intelligence)
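The idea of building a program as a combination of functions can be hinted at in Python, even though Python is not a purely functional language like LISP. The helper compose and the small functions below are invented for this sketch.

```python
from functools import reduce


def square(x: int) -> int:
    return x * x


def increment(x: int) -> int:
    return x + 1


def compose(f, g):
    """Return a new function that applies g first and then f."""
    return lambda x: f(g(x))


square_then_increment = compose(increment, square)

squares = list(map(square, [1, 2, 3]))               # apply a function to every item
total = reduce(lambda acc, v: acc + v, squares, 0)   # combine the items into one value

print(square_then_increment(3), squares, total)
```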

Declarative programming

Instructions are not necessarily specified directly

What is wanted is declared, but how to get it is not specified

Prolog – logic programming used in AI

SQL – database language

Declarative programming is closer to natural language than imperative programming (describing how to do things – e.g. C, C++, Java), but it may imply much longer execution time
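The contrast between the two styles can be illustrated with Python's standard sqlite3 module. The table name person, its columns and the sample rows are made up for this sketch. The imperative version spells out how to find the names, while the SQL query only states what is wanted and leaves the search strategy to the database engine.

```python
import sqlite3

rows = [("Ann", 34), ("Bob", 19), ("Cy", 52)]

# Imperative style: describe step by step HOW to obtain the result.
adults_imperative = []
for name, age in rows:
    if age >= 21:
        adults_imperative.append(name)
adults_imperative.sort()

# Declarative style: declare WHAT is wanted and let the engine decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
query = "SELECT name FROM person WHERE age >= 21 ORDER BY name"
adults_declarative = [row[0] for row in conn.execute(query)]
conn.close()

print(adults_imperative, adults_declarative)
```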

Compilation vs. interpretation

Compilation: the program is translated into a sequence of machine codes that can be executed directly by the processor. The whole program is translated (compiled) at once, and the compiled program is executed only after translation has finished. Tools that do this are called compilers

Interpretation: the program is processed one instruction/declaration at a time. Each one is translated into a short piece of machine code and executed before the next one is read, so the program is translated (interpreted) as it is executed, and at any time only a small part of it exists as machine code. Tools that do this are called interpreters

Compilers usually generate faster-running programs, while interpreters leave more room for interactive use of programs

Interpreted or compiled?

BASIC

C/C++

Java

R

Matlab

Perl

Reusable software

Developing software takes a long time – it is desirable to re-use existing software to solve parts of new problems

Re-use is facilitated by documentation – a description of what is written in the program and why

Early programming languages offered little support for re-use

Object oriented programming languages provide extensive support for re-use

Component-based programming

Component-based programming is the current major trend in software development

New software is built by combining existing components in novel ways – this relies heavily on re-use of existing software

E.g. classes or objects can be purchased or used as service providers, so most of the software does not have to be written from scratch – for example handling of a printer or reading standard file formats (like XML)

Software development

Problem analysis

Data analysis

Design

Development and integration

Prototype

Testing

Use and maintenance

Software development: problem analysis

What is the problem that needs a software solution?

E.g.

• Management of databases in a uniform manner

• Visualisation of complex scientific data

Identification of users

Collection of information and data about user needs and requirements

Analysis of collected information and data

Software development: data & design

Collection and analysis of relevant data

Analysis of data formats – needs and requirements

Design the relevant information flow

Design data structures supporting the information flow

Design processing of the data

Software development: integration & implementation

Development of software components implementing the design

Acquiring existing components based on design requirements, and analysis of features of existing components

Integration of existing components, and writing of the integration software and any other components that cannot be bought off the shelf

Software development: prototype & testing

Development of a small-scale prototype to test functionalities

Testing of components of the software system – test scenarios, use cases

Elimination and correction of faults and errors

Software development: use and maintenance

Installation and training of users

Deployment of the software

Maintenance

Updates and patches

Summary

Algorithms

History of programming languages: machine code; early languages: Fortran, Cobol; structured programming: Pascal, C; object oriented programming: C++, C#, Java; functional programming: Lisp; declarative programming: SQL

Constants, variables, data types

Flow control structures: if-then-else, for, while, repeat

Procedures and functions

Classes: encapsulation, inheritance

Compilers and Interpreters

Software development process

Q & A

Is it true that Java is a declarative language?

Is it true that only variables of the same type can be compared by comparison operators?

Can we use the ‘for’ flow control mechanism to execute the same set of operations 10 or 20 times depending on the value of some processed data?

Is it true that a class is an instance of an object?

Can we use the try-catch-throw flow control in concurrent environments, with many objects executed at the same time?

Can we develop a prototype of a piece of software before meeting the users to collect user requirements?
