1 An Introduction to Perl Part 2 CSC8304 – Computing Environments for Bioinformatics - Lecture 8.


An Introduction to Perl: Part 2

CSC8304 – Computing Environments for Bioinformatics - Lecture 8


Objectives

To introduce the Perl programming language:

• Lists, arrays, hashes

Recommended books:

• SAMS – Teach Yourself Perl in 24 Hours – Clinton Pierce

• Beginning Perl for Bioinformatics – James Tisdall

The best way to learn Perl is to read the books and the numerous tutorials, and to practise.

These notes are not a comprehensive tutorial; reading extra material is essential.


Lists

A list is an ordered collection of scalars. Space for lists is dynamically allocated and removed from the program's memory as required.

Parentheses () are used to construct the list, and commas separate the elements:

    (5, 'apple', $x, 3.14159)

This creates a four-element list containing the number 5, the word apple, the contents of the scalar variable $x, and pi.

If the list contains only simple strings, you can use the qw (quote words) operator to avoid many quotation marks:

    qw(5 apple $x 3.14159)

Note that qw does not interpolate variables, so the third element here is the literal string $x rather than the contents of the variable.
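A minimal sketch of the difference between the two forms above. The value given to $x is made up for illustration, and the use strict / my declarations are additions that these notes do not otherwise use.

```perl
use strict;
use warnings;

# Hypothetical value for $x, chosen only for this illustration.
my $x = 'banana';

# A literal list in parentheses: $x is replaced by its contents.
my @literal = (5, 'apple', $x, 3.14159);

# The same words through qw: no interpolation, so '$x' stays as typed.
my @quoted = qw(5 apple $x 3.14159);

print "literal: $literal[2]\n";   # prints "literal: banana"
print "quoted:  $quoted[2]\n";    # prints "quoted:  $x"
```

Both lists have four elements; they differ only in the third one.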


Arrays

Literal lists are usually used to initialise some other structure. In Perl this can be an array or a hash.

To create an array in Perl you just put something into it. Unlike Java, you don't have to initialise it to a specific size beforehand:

    @boys = qw(Greg Peter Bobby Quentin);

This is an array assignment, using the = sign as the array assignment operator. Array assignments can also involve other arrays or empty lists, e.g.

    @copy  = @original;
    @clean = ();
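The two assignments above can be sketched as follows, to show that a copied array is independent of the original and that assigning an empty list clears an array. The names and contents are invented for illustration; use strict and my are additions not used in these notes.

```perl
use strict;
use warnings;

my @original = qw(Greg Peter Bobby);

# Assigning one array to another copies the elements.
my @copy = @original;

# Changing the copy leaves the original untouched.
$copy[0] = 'Marcia';

# Assigning an empty list removes every element.
my @clean = qw(some old contents);
@clean = ();

print "original: @original\n";                       # prints "original: Greg Peter Bobby"
print "copy:     @copy\n";                           # prints "copy:     Marcia Peter Bobby"
print "clean has ", scalar(@clean), " elements\n";   # prints "clean has 0 elements"
```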


Getting elements from Arrays

Elements in an array can be searched, their values changed, or individual elements removed.

The simplest way to get the contents out of the entire array is to use the array in double quotation marks:

    print "@array";

This prints the elements of @array with a space separating each element.

Individual elements in an array are accessed by an index. As in Java, the index starts at 0 and increases by 1 for each additional element. To access an element use the syntax

    $array[index];

where array is the array name and index is the index of the element you want.


Some examples of using Arrays

    @trees = qw(oak cedar maple apple);
    print $trees[0];       # prints "oak"
    print $trees[3];       # prints "apple"
    $trees[4] = 'pine';

Notice that individual elements of the array are referred to using a $. This is because each one refers to a single scalar value within the array.

Finding the size of the array:

    $size = @array;
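A short sketch that puts these pieces together: indexing, growing the array by assignment, and reading the size by using the array where a scalar is expected. The use strict and my declarations are additions not used in these notes.

```perl
use strict;
use warnings;

my @trees = qw(oak cedar maple apple);

# Assigning to the next free index grows the array.
$trees[4] = 'pine';

# An array in scalar context yields its number of elements.
my $size = @trees;

print "first: $trees[0]\n";    # prints "first: oak"
print "size:  $size\n";        # prints "size:  5"

# In list context the same array yields its elements instead.
my ($first, $second) = @trees;
print "$first, $second\n";     # prints "oak, cedar"
```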


Stepping through an array

This is one way to step through the array:

    @flavours = qw(choc vanilla strawberry mint sherbet);
    for ($index = 0; $index < @flavours; $index++) {
        print "My favourite is $flavours[$index] and ..";
    }
    print "many others.\n";

An easier way in Perl:

    foreach $cone (@flavours) {
        print "I'd like a $cone ice cream please \n";
    }

Index of the last element of an array: $#arrayname. For example, print $#flavours; prints 4, and print $flavours[$#flavours]; prints 'sherbet'.
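It is easy to confuse the last index with the last element, so here is a minimal sketch that prints both. The negative index at the end is standard Perl but is not covered in these notes; use strict and my are likewise additions.

```perl
use strict;
use warnings;

my @flavours = qw(choc vanilla strawberry mint sherbet);

my $last_index = $#flavours;              # index of the last element
my $count      = @flavours;               # number of elements
my $last_item  = $flavours[$#flavours];   # the last element itself
my $also_last  = $flavours[-1];           # negative indices count from the end

print "last index: $last_index\n";   # prints "last index: 4"
print "count:      $count\n";        # prints "count:      5"
print "last item:  $last_item\n";    # prints "last item:  sherbet"
print "also last:  $also_last\n";    # prints "also last:  sherbet"
```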


Converting scalars to arrays

Perl provides a number of functions and operators for converting between these two types. One method is to use the split function to convert a scalar into an array. split takes a pattern and a scalar, and uses the pattern to split the scalar apart. The first argument is the pattern, the second the scalar to split, e.g.

    @words = split(/ /, "the slow brown fox");

@words now contains each of the words the, slow, brown, fox, without the spaces.

If you don't specify a string, the variable $_ is used; this is one of Perl's special reserved variables. If you specify neither a pattern nor a string, $_ is split on whitespace.
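The three ways of calling split described above can be sketched as follows. The comma-separated codon string and the contents of $_ are invented for illustration; use strict and my are additions not used in these notes.

```perl
use strict;
use warnings;

# Pattern and string both given.
my @words = split(/ /, "the slow brown fox");
print scalar(@words), " words, third is $words[2]\n";   # prints "4 words, third is brown"

# Any pattern can act as the separator, here a comma.
my @codons = split(/,/, "ATG,TAA,TGA");
print "$codons[0]\n";   # prints "ATG"

# No arguments at all: $_ is split on whitespace.
$_ = "alpha beta\tgamma";
my @names = split;
print "@names\n";       # prints "alpha beta gamma"
```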


Converting scalars to arrays

The patterns used by split are called regular expressions. Regular expressions are a pattern-matching language that we will discuss a bit later.


Hashes

Hashes are another kind of collective data type. Like arrays, hashes contain a number of scalars. The difference is that hashes access their scalar data by name, not by a numeric subscript as arrays do.

Hash elements have two parts:

• A key – identifies each element of the hash

• A value – the data associated with that key

This relationship is called a key-value pair.

A hash in Perl can contain as many elements as available memory will allow. Hashes are re-sized as elements are added and deleted.

Access to elements in a hash is extremely fast.


Hashes

Example of when we might use a hash:

• If we wanted to store information on licensed drivers, we might use the driver’s license number as the key

• This is unique per driver

• The data associated with each license number, the value, would be the driver's information (license type, address, age, etc.)

• Each driver’s license would represent an element in the hash

• The (license, information) would be the key-value pair

• To search for a particular entry, we look for the unique key first, which is very fast
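The driver's license example might look like this in Perl. This is only a sketch: the license numbers and the details stored against them are invented, and each driver's information is kept as a single string for simplicity.

```perl
use strict;
use warnings;

# Hypothetical license numbers and details, invented for this sketch.
my %drivers = (
    'AB123456' => 'full license, age 34',
    'CD654321' => 'provisional license, age 19',
);

# Look a driver up directly by the unique key.
my $wanted = 'CD654321';
if (exists $drivers{$wanted}) {
    print "$wanted: $drivers{$wanted}\n";   # prints "CD654321: provisional license, age 19"
} else {
    print "$wanted not found\n";
}
```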


Hashes – Putting data in

Hash variables in Perl are indicated by the percent sign (%):

    %Authors;

Individual elements are accessed using the $, just as with arrays, but with the key in curly braces {} rather than square brackets.

Individual hash elements are created by assigning values to them:

    $Authors{'Dune'} = 'Frank Herbert';

Here %Authors is the hash, Dune is the key and Frank Herbert is the value. This assignment creates a relationship in the hash between Dune and Frank Herbert. The value associated with the key, $Authors{'Dune'}, can be treated like any other scalar.


Hashes – Putting data in

To put several values into a hash, you could use a series of assignments from key to value:

    $food{'apple'}  = 'fruit';
    $food{'pear'}   = 'fruit';
    $food{'carrot'} = 'vegetable';

Or you could use a shortcut, listing pairings of keys and values:

    %food = ('apple', 'fruit', 'pear', 'fruit', 'carrot', 'vegetable');

Or, to help keep track of keys and values, use the => operator:

    %food = ('apple' => 'fruit', 'pear' => 'fruit', 'carrot' => 'vegetable');

To be completely lazy: the left-hand side of the => operator is treated as a string when it is a simple word (letters, digits and underscores), so such keys need not even be quoted. Keys containing spaces or other characters still need quotes.
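As a quick check that the three styles build the same hash, here is a minimal sketch. The hash names %one, %two and %three are made up for this example; use strict and my are additions not used in these notes.

```perl
use strict;
use warnings;

# 1. One assignment per key.
my %one;
$one{'apple'}  = 'fruit';
$one{'carrot'} = 'vegetable';

# 2. A flat list of key, value, key, value.
my %two = ('apple', 'fruit', 'carrot', 'vegetable');

# 3. The => operator, with unquoted simple-word keys.
my %three = (apple => 'fruit', carrot => 'vegetable');

foreach my $key (sort keys %one) {
    print "$key: $one{$key} / $two{$key} / $three{$key}\n";
}
# prints:
# apple: fruit / fruit / fruit
# carrot: vegetable / vegetable / vegetable
```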


Hashes – Getting data out

As we have seen, we can retrieve single elements of a hash with a $:

    %movies = ('The Shining' => 'Kubrick', 'Alien' => 'Scott', 'Kill Bill' => 'Tarantino');
    print $movies{'The Shining'};

To access all elements of a hash:

    foreach $film (keys %movies) {
        print "$film was directed by $movies{$film}.\n";
    }

$film contains the value of a hash key, and $movies{$film} retrieves the element of the hash represented by that key.
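The keys function returns the keys in no particular order, so the loop above may print the films in any sequence. A minimal sketch of one way to get a repeatable order is to sort the keys first; the @lines array is an addition for this example.

```perl
use strict;
use warnings;

my %movies = ('The Shining' => 'Kubrick', 'Alien' => 'Scott', 'Kill Bill' => 'Tarantino');

# Sorting the keys gives the same output order on every run.
my @lines;
foreach my $film (sort keys %movies) {
    push @lines, "$film was directed by $movies{$film}.";
}

print "$_\n" foreach @lines;
# prints:
# Alien was directed by Scott.
# Kill Bill was directed by Tarantino.
# The Shining was directed by Kubrick.
```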


Hashes – Getting data out

Perl also provides the values function to retrieve all the values stored in a hash. The values are returned in the same order as the keys function would return the keys:

    @Directors = values %movies;
    @Films = keys %movies;

In the example above, the name of the director contained in $Directors[0] corresponds to the name of the movie stored in $Films[0], and so on.

It is possible to invert a hash, so that all the keys of the original hash become values, and all the values of the original hash become keys:

    %movies = ('The Shining' => 'Kubrick', 'Alien' => 'Scott', 'Kill Bill' => 'Tarantino');
    %byDirector = reverse %movies;

This only works cleanly when the values are unique: if two keys share the same value, only one of them survives in the inverted hash.
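A small sketch of both points: keys and values line up position by position, and inverting a hash with repeated values loses entries. The %food hash and the variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my %movies = ('The Shining' => 'Kubrick', 'Alien' => 'Scott', 'Kill Bill' => 'Tarantino');

my @films     = keys %movies;
my @directors = values %movies;

# Whatever order Perl chose, position i of each array belongs together.
my $matches = 0;
foreach my $i (0 .. $#films) {
    $matches++ if $movies{ $films[$i] } eq $directors[$i];
}
print "$matches of ", scalar(@films), " positions agree\n";   # prints "3 of 3 positions agree"

# Inverting: directors become keys, films become values.
my %by_director = reverse %movies;
print "$by_director{Scott}\n";   # prints "Alien"

# With repeated values, the inverted hash has fewer entries.
my %food    = (apple => 'fruit', pear => 'fruit');
my %by_type = reverse %food;
print scalar(keys %by_type), "\n";   # prints "1"
```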


Hashes and Lists and Arrays

Whenever a hash is used in a list context, Perl unwinds the hash back into a flat list of keys and values. This list can be assigned to an array, just like any other list:

    %movies = ('The Shining' => 'Kubrick', 'Alien' => 'Scott', 'Kill Bill' => 'Tarantino');
    @data = %movies;

In the example above, @data is an array containing six elements. The even-indexed elements (0, 2, 4) are the keys, and the odd-indexed elements (1, 3, 5) are the values. You can perform any operation you require on the array @data, and then reassign the contents to %movies:

    %movies = @data;

You can also copy and combine hashes (beware that keys need to be unique: if the same key appears twice, the later value replaces the earlier one):

    %new_hash = %old_hash;        # copying a hash
    %both = (%first, %second);    # combining two hashes
    %additional = (%both, key1 => 'value1', key2 => 'value2');   # adding two more key-value pairs
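To illustrate the warning about unique keys, here is a minimal sketch in which both hashes contain the key carrot. The hash contents are invented for the example; use strict and my are additions not used in these notes.

```perl
use strict;
use warnings;

my %first  = (apple  => 'fruit',          carrot => 'vegetable');
my %second = (carrot => 'root vegetable', pear   => 'fruit');

# Combining: the hashes flatten into one list of pairs.
# For a repeated key, the pair that comes later in the list wins.
my %both = (%first, %second);

print scalar(keys %both), " keys\n";   # prints "3 keys"
print "$both{carrot}\n";               # prints "root vegetable"

# Round trip through an array: two pairs become four elements and back.
my @data = %first;
my %back = @data;
print scalar(@data), " elements\n";    # prints "4 elements"
print "$back{apple}\n";                # prints "fruit"
```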


Hashes – Special operations

To test whether a key exists in a hash, use the exists function:

    if ( exists $myHash{keyval} ) {
        # etc
    }

To remove a key from a hash, use the delete function:

    delete $myHash{keyval};

To remove all the keys and values from a hash, simply reinitialise the hash to an empty list like this:

    %myHash = ();

The keys function returns a list of all the keys in the hash, and we can use the sort function to order that list:

    foreach ( sort keys %words ) {
        print "$_ $words{$_}\n";
    }
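The four operations above can be sketched in one short script. The word counts in %words are invented for illustration; use strict and my are additions not used in these notes.

```perl
use strict;
use warnings;

# Hypothetical word counts, for illustration only.
my %words = (gene => 3, protein => 1, codon => 2);

# delete removes a single key and its value.
delete $words{protein};

if (exists $words{protein}) {
    print "protein is still there\n";
} else {
    print "protein is gone\n";      # this branch runs
}

# Sorted keys give a predictable order.
foreach (sort keys %words) {
    print "$_ $words{$_}\n";        # prints "codon 2" then "gene 3"
}

# Assigning an empty list removes everything that is left.
%words = ();
print scalar(keys %words), "\n";    # prints "0"
```

Note the difference in scope: delete removes one entry, while assigning () empties the whole hash.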


Useful things to do with Hashes

Many of the interesting things to do with hashes involve array manipulation. See the literature for more examples. One quick example is how to find the unique elements in an array:

    %seen = ();
    foreach (@wordsArray) {
        $seen{$_} = 1;
    }
    @uniquewords = keys %seen;
    print "@uniquewords";

Step by step:

• initialise a temporary hash %seen

• iterate over the array @wordsArray, setting $_ to each word in turn

• create an entry in the hash with the key $_ and dummy value 1; a word that has already been seen simply overwrites its existing entry

• extract all keys from the hash into the array @uniquewords

• print out the contents of @uniquewords
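Because keys returns the words in no particular order, the approach above loses the original word order. The following is a sketch of a variation that keeps the first occurrence of each word in place. It is not the method from these notes; it uses push and next, which are standard Perl but not covered here, and the word list is invented.

```perl
use strict;
use warnings;

# Hypothetical input, for illustration only.
my @wordsArray = qw(the cat saw the other cat);

my %seen;
my @uniquewords;
foreach my $word (@wordsArray) {
    next if exists $seen{$word};   # skip words we have met before
    $seen{$word} = 1;              # remember this word
    push @uniquewords, $word;      # keep it, in order of first appearance
}

print "@uniquewords\n";   # prints "the cat saw other"
```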


Summary

• Lists

• Arrays

• Conversion of scalars to arrays using patterns

• Hashes

• Conversions between hashes, arrays and lists


Q & A – 1

Are these lists equivalent: (5, 'apple', $x, 3.14159) and qw(5 apple $x 3.14159)?

What will be the contents of the array: @clean=(); ?

If @trees=qw(oak cedar maple apple); is it correct to refer to the third element of the array as @trees[3]?

What will be the value of $size=@trees?

What will be the values of $tree in the following context?

    foreach $tree (@trees) {
        print "Select the $tree tree \n";
    }


Q & A – 2

Is it true that $@arrayname gets the last element of the array arrayname ?

What will be the contents of the array @words after @words=split(/ /, "the slow brown fox"); ?

Do we need a key for each value in a hash?

Is it correct to create a hash as follows?

    %food = ('apple' => fruit,
             'pear' => fruit,
             'carrot' => vegetable);


Q & A – 3

Can we invert a hash by using the ‘reverse’ operator ? Does this change keys into values and values into keys ?

What will be the result of %both = (%first, %second); if %first and %second are hashes?

Is it allowed to do @data = %movies; then perform operations on the @data array and then do %movies = @data; ?

Is it true that delete $myHash{keyval}; and %myHash = (); are equivalent ?
