Computer Programming for Biologists, Class 3, Nov 13th, 2014, Karsten Hokamp

Page 1: Computer Programming for Biologists

Class 3

Nov 13th, 2014

Karsten Hokamp

http://bioinf.gen.tcd.ie/GE3M25

Page 2: Overview

quiz, recap

operator shortcuts

arrays

control structures

comparisons

project

Page 3: Recap

Scalar variables:

$name = value;

right-to-left assignment

value: a number or a string of characters

default variable: $_

special character: \n (newline)

Variables and special characters are evaluated within double quotes, e.g.:

$text = "$input\n";

Within single quotes there is no evaluation:

$text = '$input\n';
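The difference between double and single quotes described above can be checked with a short sketch. The value 'ACGT' is only an assumed example input, and `use strict` with `my` declarations is added here as a precaution, not something the slide requires.

```perl
use strict;
use warnings;

my $input = 'ACGT';              # hypothetical example value

my $double = "$input\n";         # $input and \n are evaluated
my $single = '$input\n';         # taken literally, no evaluation

print $double;                   # prints ACGT followed by a newline
print $single, "\n";             # prints the literal text $input\n
print length($double), "\n";     # 5: four letters plus the newline
print length($single), "\n";     # 8: eight literal characters
```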

Page 4: Addition to scalars

Variable name delimiter:

$text; # same as ${text};

useful if text follows directly after a variable name, e.g.:

$item = 'apple';

print "I bought 4 $items"; # wrong: Perl looks for a variable called $items

print "I bought 4 ${item}s"; # correct

print "I bought 4 " . $item . "s"; # correct
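A minimal sketch of the two correct forms, assuming the example value 'apple'. With `use strict` in place (an addition for this sketch), the wrong form "$items" would stop the program at compile time, because no variable of that name has been declared.

```perl
use strict;
use warnings;

my $item = 'apple';              # hypothetical example value

# Curly brackets mark where the variable name ends
my $with_braces = "I bought 4 ${item}s";

# Concatenation achieves the same without interpolation
my $with_concat = 'I bought 4 ' . $item . 's';

print "$with_braces\n";          # I bought 4 apples
print "$with_concat\n";          # I bought 4 apples
```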

Page 5: Recap

Functions for scalars:

. (concatenation)

.= (extension)

lc, uc, lcfirst, ucfirst (capitalisation)

chop, chomp (removing characters)

length

reverse

substr (extracting parts of a string)

tr (transliteration)
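The functions listed above can be tried on a short DNA string. This is only a sketch: the sequence and all variable names are made up for illustration.

```perl
use strict;
use warnings;

my $seq = "acgtta\n";                 # hypothetical input line

chomp $seq;                           # remove the trailing newline: 'acgtta'
my $len   = length $seq;              # 6
my $upper = uc $seq;                  # 'ACGTTA'
my $first = ucfirst $seq;             # 'Acgtta'
my $codon = substr $seq, 0, 3;        # first three characters: 'acg'
my $rev   = reverse $seq;             # scalar context reverses the string: 'attgca'

(my $comp = $seq) =~ tr/acgt/tgca/;   # copy, then swap each base: 'tgcaat'

my $label = 'seq: ' . $seq;           # concatenation
$label .= ' (6 bp)';                  # extension of the existing string

print "$label\n$upper\n$rev\n$comp\n";
```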

Page 6: Shortcuts

Operator shortcuts

increase value by one:

$counter = $counter + 1;

same as

$counter++;

same as

$counter += 1;

other shortcuts:

$num += 10;

$num -= 5;

$num1 *= $num2;

$num1 /= $num2;

$num1 **= 3;

$first_name .= $last;

An operator combined with '=' applies the operation to the variable on the left and stores the result in it.
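A small sketch that follows one counter and one number through the shortcuts above. The starting values and the name are arbitrary choices for the example.

```perl
use strict;
use warnings;

my $counter = 0;
$counter = $counter + 1;   # 1
$counter++;                # 2
$counter += 1;             # 3

my $num = 20;
$num += 10;                # 30
$num -= 5;                 # 25
$num *= 2;                 # 50
$num /= 5;                 # 10
$num **= 3;                # 1000

my $name = 'Rosalind';
$name .= ' Franklin';      # 'Rosalind Franklin'

print "$counter $num $name\n";
```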

Page 7: Operators

Practical session:

Go to http://bioinf.gen.tcd.ie/GE3M25/class3 and try the 'Operators' exercises

Page 8: Variables

Page 9: Variables

Array examples:

@letters = ('a', 'b', 'c', 'd', 'e', 'f');

@numbers = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

@names = ('Karsten', 'Devin', 'Ken');

@dates = ('19/01/2007', '26/01/2007');

@block = ($start, $middle, $end);

left of the '=': array variable

right of the '=': ordered list

Page 10: Array Construction

@letters = ('a', 'b', 'c', 'd', 'e', 'f');

@letters = ('a'..'f'); # same, using the range operator (..)

@numbers = (1..10); # only works ascending

use variables as range limits:

$start = 1; $end = 10;

@numbers = ($start .. $end);

combine arrays:

@hex = (@numbers, @letters); # (1..10, 'a'..'f')
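The range operator and the combination of arrays can be sketched as follows. The array @empty is an added illustration of why a descending range does not work: it yields an empty list, so `reverse` is applied to an ascending range instead.

```perl
use strict;
use warnings;

my @letters = ('a'..'f');             # a b c d e f
my @numbers = (1..10);                # 1 to 10

my @empty      = (10..1);             # descending range gives an empty list
my @descending = reverse 1..10;       # 10 down to 1

my ($start, $end) = (1, 10);
my @from_vars = ($start .. $end);     # same as (1..10)

my @hex = (@numbers, @letters);       # one flat list of 16 elements

print scalar(@hex), " elements: @hex\n";
```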

Page 11: Array Access

Each element is an individual scalar identified by its index (counting starts at 0)

$numbers[0]; # array name: numbers, index: 0

Note: @ changes to $

Use an expression as index:

$i = 0;

print $numbers[$i]; # prints first element

print $numbers[$i + 1]; # prints second element
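As a sketch of element access, the following reads and changes single elements of an example array. The values are arbitrary.

```perl
use strict;
use warnings;

my @numbers = (1..10);

my $i      = 0;
my $first  = $numbers[$i];        # index 0: first element, 1
my $second = $numbers[$i + 1];    # index 1: second element, 2

$numbers[2] = 30;                 # a single element can be assigned like any scalar

print "$first $second $numbers[2]\n";   # 1 2 30
```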

Page 12: Special indices and length

length of an array:

@numbers = (1..10);

$length = scalar @numbers;

last index: $#numbers (one less than length)

$numbers[$#numbers] refers to last element

negative index:

$numbers[-1] refers to last element

$numbers[-2] refers to second last element
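The relation between length, last index and negative indices can be verified with a brief sketch (variable names are illustrative):

```perl
use strict;
use warnings;

my @numbers = (1..10);

my $length     = scalar @numbers;        # 10 elements
my $last_index = $#numbers;              # 9, one less than the length

my $last_a      = $numbers[$#numbers];   # 10
my $last_b      = $numbers[-1];          # 10, same element
my $second_last = $numbers[-2];          # 9

print "$length $last_index $last_a $last_b $second_last\n";   # 10 9 10 10 9
```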

Page 13: Scalar and List Context

The context determines how a list is treated:

@bases = ('a', 'c', 'g', 't'); # list

@bases2 = reverse @bases; # list

$bases = @bases; # scalar

print @bases; # list

print scalar @bases; # scalar

print ''.@bases.''; # scalar
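A sketch of how the same array behaves in list and in scalar context. The line with `my ($first)` is an added illustration: brackets on the left-hand side create list context, so the first element is assigned rather than the count.

```perl
use strict;
use warnings;

my @bases = ('a', 'c', 'g', 't');

my @reversed = reverse @bases;     # list context: t g c a
my $count    = @bases;             # scalar context: number of elements, 4
my ($first)  = @bases;             # list assignment: first element, 'a'
my $text     = 'Bases: ' . @bases; # concatenation forces scalar context: 'Bases: 4'

print "@reversed\n";               # t g c a
print "$count $first\n";           # 4 a
print "$text\n";                   # Bases: 4
```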

Page 14: Arrays

Practical session:

Go to http://bioinf.gen.tcd.ie/GE3M25/class3 and try the 'Arrays' exercises

Page 15: Basic built-in functions for arrays

Adding and removing elements:

• push, pop (apply to end of array):

push @numbers, 11;

$last = pop @numbers;

• shift, unshift (apply to start of array):

unshift @numbers, 0;

$first = shift @numbers;
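The four functions can be followed step by step in this sketch, which starts from the array (1..10) used on earlier slides:

```perl
use strict;
use warnings;

my @numbers = (1..10);

push @numbers, 11;              # add at the end: 1 .. 11
my $last = pop @numbers;        # remove from the end: returns 11

unshift @numbers, 0;            # add at the start: 0 .. 10
my $first = shift @numbers;     # remove from the start: returns 0

print "$first $last @numbers\n";   # 0 11 1 2 3 4 5 6 7 8 9 10
```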

Page 16: Basic built-in functions for arrays

Adding an element: <cmd> @array, add-on;

Removing an element: removed = <cmd> @array;

Example array: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

at the start of the array (before 0): unshift adds, shift removes

at the end of the array (after 9): push adds, pop removes

Special variable @ARGV: command line arguments

$next_arg = shift @ARGV;

$next_arg = shift; # same: outside a subroutine, shift works on @ARGV by default
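A sketch of reading command line arguments with shift. The script name, the file name 'input.fasta' and the width 60 are hypothetical; @ARGV is filled by hand here only so that the example runs without real arguments.

```perl
use strict;
use warnings;

# Assumption: the script would normally be started as
#   perl script.pl input.fasta 60
# The arguments are simulated here for the demonstration.
@ARGV = ('input.fasta', 60);

my $file  = shift @ARGV;    # first argument
my $width = shift;          # outside a subroutine, shift defaults to @ARGV

print "file: $file, width: $width\n";
print "arguments left: ", scalar(@ARGV), "\n";
```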

Page 17: More built-in functions for arrays

You will practise these in class!

• join (joins a list into a string)

• scalar (returns the length of an array)

• sort (sorts a list)

• reverse (reverses a list)

• splice (removing or adding slices)

example: $sequence = join '-', ('a'..'f');

$sequence: 'a-b-c-d-e-f'
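The functions above can be sketched on two small example arrays. One detail added here that is not on the slide: sort compares elements as strings by default, so numbers need the numeric comparison block shown below.

```perl
use strict;
use warnings;

my @bases = ('g', 'a', 't', 'c');

my @sorted     = sort @bases;              # a c g t
my @descending = reverse sort @bases;      # t g c a
my $joined     = join '-', @sorted;        # 'a-c-g-t'

my @as_strings = sort (10, 9, 100);                # string order: 10 100 9
my @as_numbers = sort { $a <=> $b } (10, 9, 100);  # numeric order: 9 10 100

my @letters = ('a'..'f');
my @removed = splice @letters, 1, 2;       # remove 2 elements from index 1: b c
splice @letters, 1, 0, 'x', 'y';           # insert at index 1 without removing

print "$joined\n@as_numbers\n@removed\n@letters\n";
```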

Page 18: Arrays

Practical session:

Go to http://bioinf.gen.tcd.ie/GE3M25/class3 and try the 'Arrays' exercises

Page 19: Control Structures

Page 20: Loops

cycle through the elements of a list

apply the same processing steps each time

General form:

foreach element (list) {
    statements
}

Example:

foreach $letter ('a'..'z') {
    print "$letter\n";
}

foreach: type of loop

$letter: single element

('a'..'z'): list of characters

(): contains the list

{}: contains the statement(s)
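A minimal sketch of two foreach loops, one over an array and one over a range. The arrays and the running total are made up for the example.

```perl
use strict;
use warnings;

my @bases = ('a', 'c', 'g', 't');
my @upper;

# Same processing step for every element of the array
foreach my $base (@bases) {
    push @upper, uc $base;
}

# Loop over a range and add up the numbers
my $total = 0;
foreach my $n (1..10) {
    $total += $n;
}

print "@upper\n";     # A C G T
print "$total\n";     # 55
```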

Page 21: Loops

cycle while an expression returns something true (e.g. an array or a string that still has content)

Cycle through all elements of an array:

while (@array) {
    $last = pop @array;
}

Cycle through all codons of a sequence:

until (length($seq) < 3) {
    $cod = substr $seq, 0, 3, '';
}

Go through all lines from standard input:

while ($in = <>) {
    $input .= $in;
}
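The codon loop can be sketched with while instead of until by inverting the condition. The sequence is a hypothetical example; the four-argument form of substr returns the first three characters and at the same time replaces them in $seq with an empty string.

```perl
use strict;
use warnings;

my $seq = 'atggcctaaga';     # hypothetical sequence, 11 bases
my @codons;

# Keep going as long as at least one full codon is left
while (length($seq) >= 3) {
    my $codon = substr $seq, 0, 3, '';   # cut the first 3 bases off $seq
    push @codons, $codon;
}

print "@codons\n";           # atg gcc taa
print "left over: $seq\n";   # left over: ga
```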

Page 22: Computer Programming for Biologists Class 3 Nov 13 th, 2014 Karsten Hokamp .

if ($num > 100) {die "$num is too high!";

} elsif ($num < 1) { die "$num is too low!";}

Branching if / else / elsif structure

Pseudo code:

Computer Programming for Biologists

Perl code:

if (expression) {statements;

} else { statements;}

if number is too bigstop with error message

else if number is too small stop with error message

if a condition is truethen do one thing

else do another thing

Page 23: Branching

if / else / elsif structure

test for multiple conditions:

if ($response eq 'y') {
    print "ok!\n";
} elsif ($response eq 'n') {
    print "maybe next time\n";
} elsif ($response eq '') {
    print "please try again: ";
} else {
    print "don't know what you mean\n";
}
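As a sketch of branching, the range check from the previous slide is applied to three example numbers. Messages are collected and printed instead of calling die, so that all three branches can be seen in a single run; the test values are arbitrary.

```perl
use strict;
use warnings;

my @report;

foreach my $num (0, 50, 150) {           # hypothetical test values
    if ($num > 100) {
        push @report, "$num is too high";
    } elsif ($num < 1) {
        push @report, "$num is too low";
    } else {
        push @report, "$num is ok";      # reached only if both tests fail
    }
}

foreach my $line (@report) {
    print "$line\n";
}
```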

Page 24: Comparisons

compare one scalar (or expression) to another

numerical    alphabetical

>, >=        gt, ge

<, <=        lt, le

==           eq

!=           ne

common mistake: $num = 10 is NOT a comparison! (it is an assignment; the comparison is $num == 10)
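The following sketch shows why the choice between numerical and alphabetical operators matters: the same pair of values can compare differently. The values are chosen only for illustration.

```perl
use strict;
use warnings;

my @true_tests;

# '1' and '1.0' are the same number but different strings
push @true_tests, 'numeric ==' if '1' == '1.0';
push @true_tests, 'string eq'  if '1' eq '1.0';    # not added
push @true_tests, 'string ne'  if '1' ne '1.0';

# 10 is larger than 9 as a number, but '10' sorts before '9' as a string
push @true_tests, '10 > 9'     if 10 > 9;
push @true_tests, '10 lt 9'    if '10' lt '9';

# Alphabetical comparison is case-sensitive
push @true_tests, 'ACTG ne actg' if 'ACTG' ne 'actg';

print join(', ', @true_tests), "\n";
```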

Page 25: True and False

false: 0 or empty string ('')

true: value different from 0 (1 by default)

Comparison            Evaluation    Boolean

5 > 2

2 > 5

'a' > 'b'

'ACTG' eq 'actg'

'1' ne '1.0'

2+2 == 8/2

Comparisons are evaluated after arithmetic operations, so 2+2 and 8/2 are calculated first.

Page 26: True and False

false: 0 or empty string ('')

true: value different from 0 (1 by default)

Comparison            Evaluation    Boolean

5 > 2                 1             TRUE

2 > 5                 0             FALSE

'a' > 'b'             0             FALSE

'ACTG' eq 'actg'      0             FALSE

'1' ne '1.0'          1             TRUE

2+2 == 8/2            1             TRUE

Note: a false comparison returns a value that counts as 0 in a number and as '' in a string.

Comparisons are evaluated after arithmetic operations, so 2+2 and 8/2 are calculated first.
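A sketch that sorts a few example values into true and false and stores the results of two comparisons. The strings '0' and '0.0' are added illustrations that go beyond the slide: '0' counts as false, while '0.0' is a non-empty string other than '0' and therefore counts as true.

```perl
use strict;
use warnings;

my (@true, @false);

foreach my $value (0, '', '0', 1, -1, 'actg', '0.0') {
    if ($value) {
        push @true, "'$value'";
    } else {
        push @false, "'$value'";
    }
}

my $yes = (5 > 2);           # 1
my $no  = (2 > 5);           # '' as a string, 0 as a number
my $sum = (2 + 2 == 8 / 2);  # arithmetic first, then the comparison: 1

print "true:  @true\n";      # true:  '1' '-1' 'actg' '0.0'
print "false: @false\n";     # false: '0' '' '0'
print "5 > 2 gives '$yes', 2 > 5 gives '$no'\n";
```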

Page 27: Control Structures

Practical session:

Go to http://bioinf.gen.tcd.ie/GE3M25/class3 and try the 'Controls' exercises

Page 28: Project

Implement the following in a program:

1. Print a welcome message

2. Read input from a file

3. Separate header from sequence

4. Report length of sequence

5. Make sequence all upper case

6. Reformat sequence into 60 bp width

7. Print reverse-complement

8. Provide position numbers at each line

Go to http://bioinf.gen.tcd.ie/GE3M25/class3
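One possible outline of the project steps, using only constructs from this class. It is a sketch rather than the course solution: the FASTA record is hypothetical and hard-coded in @lines in place of reading a file, and the output layout (position, tab, sequence) is an assumption.

```perl
use strict;
use warnings;

# 1. Welcome message
print "Welcome to the sequence formatter!\n";

# 2. Input: hypothetical FASTA record (a real program would read a file)
my @lines = (
    ">seq1 hypothetical example\n",
    ('aacg' x 10) . "\n",       # 40 bases
    ('ggatcc' x 5) . "\n",      # 30 bases
);

# 3. Separate header from sequence
my $header = shift @lines;
chomp $header;
my $seq = '';
foreach my $line (@lines) {
    chomp $line;
    $seq .= $line;
}

# 4. and 5. Length and upper case
my $length = length $seq;
$seq = uc $seq;

# 7. Reverse-complement: reverse the string, then swap the bases
my $revcomp = reverse $seq;
$revcomp =~ tr/ACGT/TGCA/;

# 6. and 8. Lines of 60 bp, each starting with its position number
my @formatted;
my $rest = $seq;
my $pos  = 1;
while (length $rest) {
    my $chunk = substr $rest, 0, 60, '';
    push @formatted, "$pos\t$chunk";
    $pos += length $chunk;
}

print "$header\nLength: $length bp\n";
print "$_\n" foreach @formatted;
print "Reverse-complement:\n$revcomp\n";
```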