CSC8304 – Computing Environments for Bioinformatics - Lecture 6
Introduction to programming languages
Objectives
Concepts of programming
Programming languages
Development of computer programs
Why computer programs?
Problems:
• Arranging the text of a letter
• Collecting and maintaining data about customers
• Calculating the best investment portfolio
• Taking a photo with your mobile phone
• Synchronising the components of a car engine
Computer programs aim to solve such problems, which involve electronically stored and processed data
Solving problems
Problem description
Data collection
Problem analysis (including data analysis)
Designing a solution
Implementing the solution
Algorithms
Algorithm = systematic processing of actual or virtual data
Specification of input and output data
Specification of methods of data processing
E.g. Euclid’s greatest common divisor (gcd) algorithm:
• a, b are two positive integers; what is their gcd?
• x = a, y = b
• If x > y then n = x, d = y, otherwise n = y, d = x
• Divide n by d to get the quotient q and remainder r, so that n = q * d + r; then set x = d, y = r
• If y = 0 then gcd = x, otherwise repeat from the comparison step
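The steps of Euclid's algorithm can be sketched in Python as a loop. This is a minimal illustration, not part of the original slides; the function name gcd and the input check are assumptions made for the example.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two positive integers (Euclid's algorithm)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    # n is the larger value (dividend), d the smaller one (divisor).
    n, d = (a, b) if a > b else (b, a)
    while d != 0:
        # n = q * d + r; only the remainder r is needed for the next step.
        r = n % d
        n, d = d, r
    return n


print(gcd(48, 18))  # prints 6
```

The loop replaces the "repeat until the remainder is 0" step, so each pass works on a strictly smaller pair of numbers.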
Early computers
Binary data entry – punch-cards
Machine language: e.g. MOV A,B; LLR; etc.
Difficult to program – easy to make errors
Constants and variables
Constant: a fixed value, e.g. 5
Named constant: a fixed value with a name, e.g. a=5
Variable x: a placeholder for a value (e.g. number, text)
‘:=’ assigns a value to a variable: the contents of the variable with the given name take the specified value
This makes x := x+1 meaningful:
• x := 5, x := x+1, now the value of x is 6
Other variables: s := ‘Hello!’, y := (2, ‘apples’, ‘table’)
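As a rough Python counterpart of the assignments above (Python writes assignment as = rather than :=, and has no enforced constants, so the upper-case name A is only a naming convention assumed here):

```python
A = 5                        # named constant by convention only

x = 5
x = x + 1                    # read the old value of x, add 1, store it back
s = "Hello!"                 # a text (string) value
y = (2, "apples", "table")   # a value combining several items

print(x)  # prints 6
```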
Data types
Data is stored in variables with names
Variable: name + type + contents
Type determines what kind of contents the variable may have: e.g. integer, floating point real, string, combination of other data types
E.g.
• int x, x := 5 is allowed, x := 5.1 is not allowed
• string s, s := ‘hello kids’ is allowed, s := 3 is not allowed
Type definition for combined types:
• addr = record (int nr, string st, string ct, string pc)
• addr a, a := (5, ‘Hyde’, ‘York’, ‘YO2 4RH’)
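A record type like addr can be approximated in Python with a dataclass. This is only a sketch: Python does not reject x := 5.1 for an integer variable on its own, so the helper check_int below is a hypothetical function added to make the type rule visible at run time.

```python
from dataclasses import dataclass


@dataclass
class Addr:
    nr: int   # house number
    st: str   # street
    ct: str   # city
    pc: str   # postcode


def check_int(value) -> int:
    """Return value unchanged if it is an int, otherwise raise TypeError."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


x = check_int(5)                       # allowed
a = Addr(5, "Hyde", "York", "YO2 4RH")
print(a.ct)  # prints York
```

Calling check_int(5.1) raises TypeError, which mirrors the "not allowed" case for a typed integer variable.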
Operators
Operators: +, -, concatenate, <=:
• a:=5+3, s:=concatenate(‘hot’, ‘dog’), a<=5
Each type has a range of operators that can be applied to variables of that type
Operator overloading: some operators apply in different ways to data of different types
In case of subtypes, e.g. integer as a subtype of real, additional operators may apply to the subtype, e.g. integer division
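In Python the same ideas look roughly as follows. The + operator is overloaded for numbers and strings, and integer division has its own operator; the variable names are chosen only for this illustration.

```python
a = 5 + 3              # + on integers: addition
s = "hot" + "dog"      # + on strings: concatenation (operator overloading)
is_small = a <= 5      # comparison produces a truth value

real_quotient = 7 / 2  # real division
int_quotient = 7 // 2  # integer division, defined for whole numbers

print(a, s, is_small, real_quotient, int_quotient)
```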
Early programming languages
Fortran, Cobol
Better than machine code
Introduce flow control
Flow control – 1 (conditions)
If-then-else
Branching depending on condition
If <condition> then <Tblock> else <Fblock>
E.g.
• If x=5 then a:=2 else a:=1
• If (signal, left) then (turn, left) else (turn, right)
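The two if-then-else examples can be written in Python as below. The function names choose and turn, and the use of strings for the signal, are assumptions made for this sketch.

```python
def choose(x: int) -> int:
    """Return 2 if x equals 5, otherwise 1."""
    if x == 5:
        a = 2   # the "true" block
    else:
        a = 1   # the "false" block
    return a


def turn(signal: str) -> str:
    """Turn left when the signal says left, otherwise turn right."""
    if signal == "left":
        return "turn left"
    else:
        return "turn right"


print(choose(5), choose(7))  # prints 2 1
```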
Flow control – 2 (loops)
for – fixed length cycling
for <init statement>, <increment statement>, <condition statement>, <execution statement>
E.g.
• for {i:=1,a:=1}, i:=i+1, i<=100, do a:=a*i;
while, repeat – variable length cycling
while <condition statement>, <execution statement>
repeat <execution statement>, <condition statement>
E.g.
• while i<100, do a:=a*i, i:=i+1
• repeat a:=a*i, i:=i+1, until i=100
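The three loop forms can be compared in Python by computing the same product 1*2*...*n with each. This is a sketch with assumed function names; Python has no repeat-until statement, so it is emulated with a loop that tests its condition at the end, and the upper bound n is a parameter rather than the fixed 100 used above.

```python
def product_for(n: int) -> int:
    a = 1
    for i in range(1, n + 1):   # fixed number of passes: i = 1, 2, ..., n
        a = a * i
    return a


def product_while(n: int) -> int:
    a, i = 1, 1
    while i <= n:               # condition is tested before each pass
        a = a * i
        i = i + 1
    return a


def product_repeat(n: int) -> int:
    a, i = 1, 1
    while True:                 # body always runs at least once
        a = a * i
        i = i + 1
        if i > n:               # condition is tested after each pass ("until")
            break
    return a


print(product_for(5), product_while(5), product_repeat(5))  # prints 120 120 120
```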
Structured programming
Structured programming was introduced in the late 60’s – early 70’s
Pascal, C
Flow control is packaged into procedures, data are separated between program structures
Result: better understanding, better design, better programs with fewer errors
Procedures and functions
Procedures: blocks of programs containing flow control structures with a set of specified input data and a set of specified output data
Functions: similar to procedures, but they generate a single output value (i.e. they behave like mathematical functions)
Procedures are called with a set of actual values for their formal input variables and a set of variables that receive their formal output variables
E.g.
• procedure Draw (int x,y,z,w)
• procedure Prediction (int x,y,z; var int a,b)
• int function Length (string s)
• Length(‘hello’)
• Draw(10,10,50,50)
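In Python both procedures and functions are written with def. The sketch below assumes illustrative bodies, since the slides give only the signatures: length wraps the built-in len, and prediction returns its two outputs as a pair because Python has no var output parameters.

```python
from typing import Tuple


def length(s: str) -> int:
    """Function: one input, one output value."""
    return len(s)


def prediction(x: int, y: int, z: int) -> Tuple[int, int]:
    """Procedure with two outputs, returned together as a tuple.

    The computation is a made-up placeholder for illustration.
    """
    a = x + y + z
    b = max(x, y, z)
    return a, b


print(length("hello"))   # prints 5
a, b = prediction(1, 2, 3)
print(a, b)              # prints 6 3
```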
Recursive procedures
Recursive procedure: a procedure that calls itself
Data separation
E.g.
• Procedure Gcd (int a,b; var int g)
    int x,y,r,q,n,d;
    x:=a; y:=b;
    if x>y then {n:=x; d:=y} else {n:=y; d:=x};
    q:=n div d; r:=n - q*d;
    x:=d; y:=r;
    if y=0 then g:=x else Gcd(x,y,g);
  end;
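A Python version of the self-calling Gcd procedure might look like this. It is a sketch that returns the result instead of using an output variable, and it assumes both inputs are positive integers.

```python
def gcd(a: int, b: int) -> int:
    """Recursive greatest common divisor of two positive integers."""
    # Order the inputs: n is the dividend, d the divisor.
    n, d = (a, b) if a > b else (b, a)
    q = n // d           # integer quotient
    r = n - q * d        # remainder
    if r == 0:
        return d         # d divides n exactly, so d is the gcd
    return gcd(d, r)     # the procedure calls itself on a smaller pair


print(gcd(48, 18))  # prints 6
```

Each call has its own copies of n, d, q and r, which is the data separation that makes the recursion safe.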
Object oriented programming
Object oriented programming emerged in the 70s and became the mainstream programming paradigm in the late 80s – early 90s
Aims:
• Better description of real world problems
• Better software design
• Increased reliability of large software systems
Smalltalk, Delphi, C++, C#, Java
Classes and objects – 1
Class: encapsulation of data and data manipulation, such that interference with the outside is the minimum necessary
Class: attributes and methods – some visible from the outside, most visible only inside
E.g.
• Class Square
    int llx,lly,dx,color
    Create
    Destroy
    Draw
    FillDraw
Square S, S.Create – an object is an instance of a class
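The Square class could be sketched in Python as follows. Create corresponds to the constructor __init__; Destroy has no direct counterpart because Python reclaims unused objects automatically. The method bodies return descriptive strings instead of drawing, which is an assumption made to keep the example self-contained.

```python
class Square:
    def __init__(self, llx: int, lly: int, dx: int, color: int):
        # Attributes: lower-left corner, side length, colour code.
        self.llx = llx
        self.lly = lly
        self.dx = dx
        self.color = color

    def draw(self) -> str:
        return f"outline of square at ({self.llx}, {self.lly}), side {self.dx}"

    def fill_draw(self) -> str:
        return f"filled square at ({self.llx}, {self.lly}), side {self.dx}, color {self.color}"


s = Square(10, 10, 50, 3)   # s is an object: an instance of the class Square
print(s.draw())
```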
Classes and objects – 2
Classes can be defined as derivatives of other classes – inheritance
Derived classes inherit attributes and methods from the parent class and may add further attributes and methods to these or may change the definition of some inherited ones
E.g. Class Rectangle (Square)
    int dy (new attribute)
    (int llx,lly,dx,color – inherited)
    Draw (redefined)
    FillDraw (redefined)
    Rotate (new method)
    (Create, Destroy – inherited)
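Inheritance can be sketched in Python with a reduced Square and a Rectangle derived from it. The method bodies are placeholders assumed for illustration, and rotate is taken here to mean swapping width and height.

```python
class Square:
    def __init__(self, llx: int, lly: int, dx: int, color: int):
        self.llx, self.lly, self.dx, self.color = llx, lly, dx, color

    def draw(self) -> str:
        return f"square {self.dx}x{self.dx} at ({self.llx}, {self.lly})"


class Rectangle(Square):
    def __init__(self, llx: int, lly: int, dx: int, dy: int, color: int):
        super().__init__(llx, lly, dx, color)   # inherited attributes
        self.dy = dy                            # new attribute

    def draw(self) -> str:                      # redefined method
        return f"rectangle {self.dx}x{self.dy} at ({self.llx}, {self.lly})"

    def rotate(self) -> None:                   # new method
        self.dx, self.dy = self.dy, self.dx


r = Rectangle(0, 0, 4, 2, 1)
print(r.draw())   # prints rectangle 4x2 at (0, 0)
r.rotate()
print(r.draw())   # prints rectangle 2x4 at (0, 0)
```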
Flow control with exceptions
Objects are instances of classes and many objects exist simultaneously: concurrent execution of objects
Objects interact by sending messages – i.e. invoking those of their methods which are visible from the outside
Flow control: try – catch – throw
Exception: incorrect execution for some reason
E.g.
try
    R.Draw;
    return(‘OK’);
catch (exception e)
    throw GraphicsExceptionFault;
    return(‘Error’);
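In Python the same flow control is spelled try, except and raise. The sketch below uses hypothetical classes GoodShape and BrokenShape to stand in for R. It separates the two reactions shown above: render re-throws the failure as a GraphicsExceptionFault, and safe_render catches that and reports 'Error'.

```python
class GraphicsExceptionFault(Exception):
    """Raised when drawing fails (name taken from the slide example)."""


class GoodShape:
    def draw(self) -> None:
        pass                                   # drawing succeeds


class BrokenShape:
    def draw(self) -> None:
        raise RuntimeError("device not available")


def render(shape) -> str:
    try:
        shape.draw()
        return "OK"
    except RuntimeError as e:
        # Throw a more specific exception to the caller.
        raise GraphicsExceptionFault("drawing failed") from e


def safe_render(shape) -> str:
    try:
        return render(shape)
    except GraphicsExceptionFault:
        return "Error"


print(safe_render(GoodShape()), safe_render(BrokenShape()))  # prints OK Error
```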
Functional programming
Everything is written as a function, the program is a combination of functions
LISP
Applied in AI (Artificial Intelligence)
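The idea of a program as a combination of functions can be hinted at in Python, which is not a functional language like LISP but supports the style. The helper compose and the small functions below are assumptions made for this sketch.

```python
from functools import reduce


def compose(f, g):
    """Return a new function that applies g first, then f."""
    return lambda x: f(g(x))


def square(x):
    return x * x


def increment(x):
    return x + 1


square_then_increment = compose(increment, square)
print(square_then_increment(3))   # prints 10

# Sum of squares built only from function applications, no explicit loop.
total = reduce(lambda acc, v: acc + v, map(square, [1, 2, 3]), 0)
print(total)                      # prints 14
```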
Declarative programming
Instructions are not necessarily specified directly
What is wanted is declared, but how to get it is not specified
Prolog – logic programming used in AI
SQL – database language
Declarative programming is closer to natural language than imperative programming (describing how to do things – e.g. C, C++, Java), but it may imply much longer execution time
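The contrast between the two styles can be sketched with Python's standard sqlite3 module. The table person and its rows are made-up sample data. The imperative version spells out how to find the names; the SQL query only declares which names are wanted.

```python
import sqlite3

rows = [("Ann", 34), ("Bob", 19), ("Cy", 52)]

# Imperative: describe step by step how to build the result.
adults_imperative = []
for name, age in rows:
    if age >= 21:
        adults_imperative.append(name)
adults_imperative.sort()

# Declarative: state what is wanted and let the database decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", rows)
cursor = conn.execute("SELECT name FROM person WHERE age >= 21 ORDER BY name")
adults_declarative = [row[0] for row in cursor]
conn.close()

print(adults_imperative, adults_declarative)
```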
Compilation vs. interpretation
Compilation: the program is translated into a sequence of machine code instructions that can be executed directly by the processor. The whole program is translated (compiled) at once, and the compiled program is executed only after translation has finished. This is done by compilers
Interpretation: the program is processed by taking instructions/declarations one by one. Each one is translated into a short piece of machine code that is executed, then the next instruction/declaration is interpreted. The program is translated (interpreted) as it is executed, and at any time only a small part is translated into machine code. This is done by interpreters
Compilers usually generate faster running programs, while interpreters leave more room for interactive use of programs
Interpreted or compiled?
BASIC
C/C++
Java
R
Matlab
Perl
Reusable software
Developing software takes a long time, so it is desirable to re-use existing software to solve parts of new problems
Re-use is facilitated by documentation: a description of what is written in the program and why
Early programming languages offered little support for re-use
Object oriented programming languages provide much more support for re-use
Component-based programming
Component-based programming is the current major trend in software development
New software is built by combining existing components in novel ways – relies very much on re-use of existing software
E.g. classes or objects can be purchased or used as service providers, so most of the software does not have to be written from scratch – for example handling of a printer or reading standard file formats (like XML)
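Reading XML is a convenient illustration of re-use: instead of writing a parser, a program can rely on an existing component. The sketch below uses Python's standard xml.etree.ElementTree module; the XML text and the function name gene_names are made-up examples.

```python
import xml.etree.ElementTree as ET

XML_TEXT = "<genes><gene id='g1'>BRCA1</gene><gene id='g2'>TP53</gene></genes>"


def gene_names(xml_text: str) -> list:
    """Return the text of every <gene> element directly under the root."""
    root = ET.fromstring(xml_text)      # parsing is done by the re-used component
    return [gene.text for gene in root.findall("gene")]


print(gene_names(XML_TEXT))  # prints ['BRCA1', 'TP53']
```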
Software development
Problem analysis
Data analysis
Design
Development and integration
Prototype
Testing
Use and maintenance
Software development: problem analysis
What is the problem that needs the software solution?
E.g.
• Management of databases in a uniform manner
• Visualisation of complex scientific data
Identification of users
Collection of information and data about user needs and requirements
Analysis of collected information and data
Software development: data & design
Collection and analysis of relevant data
Analysis of data formats – needs and requirements
Design the relevant information flow
Design data structures supporting the information flow
Design processing of the data
Software development: integration & implementation
Development of software components implementing the design
Acquiring existing components based on design requirements, and analysis of features of existing components
Integration of existing components and writing of integration software and possible other components that cannot be bought-in off-the-shelf
Software development: prototype & testing
Development of a small-scale prototype to test functionalities
Testing of components of the software system – test scenarios, use cases
Elimination and correction of faults and errors
Software development: use and maintenance
Installation and training of users
Deployment of the software
Maintenance
Updates and patches
Summary
Algorithms
History of programming languages: machine code; early languages: Fortran, Cobol; structured programming: Pascal, C; object oriented programming: C++, C#, Java; functional programming: Lisp; declarative programming: SQL
Constants, variables, data types
Flow control structures: if-then-else, for, while, repeat
Procedures and functions
Classes: encapsulation, inheritance
Compilers and interpreters
Software development process
Q & A
Is it true that Java is a declarative language?
Is it true that only variables of the same type can be compared by comparison operators?
Can we use the ‘for’ flow control mechanism to execute the same set of operations 10 or 20 times depending on the value of some processed data?
Is it true that a class is an instance of an object?
Can we use the try-catch-throw flow control in concurrent environments, with many objects executed at the same time?
Can we develop a prototype of a piece of software before meeting the users to collect user requirements?