Practical Extraction and Report Language

VI, March 2005

« Perl is a language of getting your job done »

Larry Wall

« There is more than one way to do it »


http://perl.oreilly.com

"Perl is both a programming language and an application on your computer that runs those programs"


Perl history

A few dates:

1969 UNIX was born at Bell Labs.

1970 Brian Kernighan suggested the name "Unix" and the operating system we know today was born.

1972 The programming language C is born at Bell Labs (C is one of Perl's ancestors).

1973 "grep" is introduced by Ken Thompson as an external utility: Global REgular expression Print.

1976 Steven Jobs and Steven Wozniak found Apple Computer (1 April).

1977 The computer language awk is designed by Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan (awk is one of Perl's ancestors).


1987 Perl 1.000 is unleashed upon the world

NAME
perl - Practical Extraction and Report Language

SYNOPSIS
perl [options] filename args

DESCRIPTION
Perl is an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It's also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). It combines (in the author's opinion, anyway) some of the best features of C, sed, awk, and sh, so people familiar with those languages should have little difficulty with it. (Language historians will also note some vestiges of csh, Pascal, and even BASIC-PLUS.) Expression syntax corresponds quite closely to C expression syntax. If you have a problem that would ordinarily use sed or awk or sh, but it exceeds their capabilities or must run a little faster, and you don't want to write the silly thing in C, then perl may be for you. There are also translators to turn your sed and awk scripts into perl scripts. OK, enough hype.


1994 Perl5: last major release (Currently Perl 5.8.6).

1996 Creation of the CPAN repository of modules and documentation (Comprehensive Perl Archive Network).

2005 Perl 5.8.6

Supported Operating Systems: Unix systems / Macintosh (OS 7-9 and X) / Windows / VMS

Perl Features

Perl's database integration interface (DBI) supports third-party databases including Oracle, Sybase, Postgres, MySQL and others.

Perl works with HTML, XML, and other markup languages.

Perl supports Unicode.

Perl is Y2K compliant.

Perl supports both procedural and object-oriented programming.

Perl interfaces with external C/C++ libraries through XS or SWIG.

Perl is extensible. There are over 500 third-party modules available from CPAN.


Perl and the Web

Perl is the most popular web programming language due to its text manipulation capabilities and rapid development cycle.

Perl's CGI.pm module, part of Perl's standard distribution, makes handling HTML forms simple.

Perl can handle encrypted Web data, including e-commerce transactions.

Perl can be embedded into web servers (mod_perl) to speed up processing by as much as 2000%.

Perl's DBI package makes web-database integration easy.


Perl Hello world !

My first program (hello.pl):

#!/usr/local/bin/perl

use strict;
use warnings;

#tell the program to print "Hello world"
print "Hello world" ;

#tell the program to exit
exit ;

The first line of a Perl program is called the "command interpretation" or "shebang" line. It starts with "#!" and tells the computer that this is a Perl program by giving the path of the Perl interpreter.

To find out whether you should use /usr/bin/perl or /usr/local/bin/perl, type "which perl" in your shell:

computerX: vioannid$ which perl
/usr/bin/perl

computerY: vioannid$ which perl
/usr/local/bin/perl


use strict;

A command like use strict is called a pragma. Pragmas are instructions to the Perl interpreter to do something special when it runs your program. "use strict" does two things that make it harder to write bad software:

It makes you declare all your variables, and it makes it harder for Perl to mistake your intentions when you are using subroutines.

ALL STATEMENTS END IN A SEMICOLON ";" (similar to the use of the period "." in the English language)


use warnings;

Comments are good, but the most important tool for writing good Perl is the "warnings" pragma. Turning on warnings will make Perl yelp and complain at a huge variety of things that are almost always sources of bugs in your programs.

Perl normally takes a relaxed attitude toward things that may be problems: it assumes that you know what you're doing, even when you don't…


Comments

All lines starting with "#" are not taken into account in the execution of the program. Good comments are short but instructive. They tell you things that aren't clear from reading the code.

Blank lines or spaces are also not taken into account in the execution of the program. However, they help in the reading of the code.


Print statement:

… prints !

By default, the standard output is the shell window from which the program is executed.


The exit statement:

Tells the computer to exit the program.

Although not explicitly required in Perl, it is definitely common.


(Do not forget to make the file executable: vioannid$ chmod a+x perl_01.pl )

Output:

vioannid$ ./perl_01.pl
Hello worldvioannid$


Perl Hello world !!

Print:

#!/usr/local/bin/perl

use strict;
use warnings;

#play with the print statement

#words separated by newline
print "Hello\nworld\n" ;

#words separated by a tab & a final newline
print "Hello\tworld\n" ;

#usage of the period to cat strings
print "Hello"."world"."\n";

#tell the program to exit
exit ;

Output:

vioannid$ ./perl_02.pl
Hello
world
Hello	world
Helloworld
vioannid$

Important: the end-of-line sequence depends on the operating system:

Unix & all Unix flavors: \n
Mac OS (classic, up to OS 9): \r
Windows: \r\n


Perl variables

Perl has 3 data types: scalars / arrays / hashes

scalars

a single string (of any size, limited only by the available memory), or a number, or a reference to something

Scalar values are always named with '$' (even when referring to a scalar that is part of an array or a hash). The '$' symbol works semantically like the English word "the" in that it indicates a single value is expected.

my $variable_1 = "Hello world !\n"; #note the quotes

my $variable_two = 30; #note the absence of quotes

$marks[4]; # the fifth element of the array "marks" (a single element is written without "my")


arrays (of scalars)

Normal arrays are ordered lists of scalars indexed by number (starting with 0).

Entire arrays are denoted by '@', which works much like the word "these" or "those" does in English, in that it indicates multiple values are expected.

my @numbers = ("One", "Two", "Three", "Four", "Five");

@numbers = (1..5); #same as "@numbers = (1, 2, 3, 4, 5);"

$numbers[0] = "One"; $numbers[1] = "Two"; …

my @anyarray = (6, "hello", @numbers);

An array is declared with "my" only once; later assignments to the array or to its elements are written without it.

index   0     1     2       3      4
value   One   Two   Three   Four   Five
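
The array notation above can be tried out with a short sketch. It reuses the slide's @numbers and @anyarray; the variables $first, $last and $count are names introduced here for illustration only.

```perl
use strict;
use warnings;

my @numbers = ("One", "Two", "Three", "Four", "Five");

# A single element is a scalar, so it is written with '$'.
my $first = $numbers[0];          # index 0 is the first element
my $last  = $numbers[-1];         # negative indices count from the end

# An array placed inside a list is flattened into that list.
my @anyarray = (6, "hello", @numbers);
my $count    = scalar @anyarray;  # 7 elements, not 3

print "first=$first last=$last count=$count\n";
```

Note that @anyarray does not contain a nested array: the five elements of @numbers are copied into it one by one.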


hashes (associative arrays of scalars)

Hashes are unordered collections of scalar values indexed by their associated string key. Entire hashes are denoted by '%'.

my %var = ("a","first","b","3");

my %codon3 = (
    "TTT" => "Phe",
    "TTA" => "Leu",

);

print $codon3{'TTT'};

Key   Value
TTT   Phe
TCT   Ser
TGT   Cys
TAT   Tyr
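
As a minimal sketch of how such a hash is used, the following fills %codon3 with the four key/value pairs from the table and looks them up. The variables $aa, $known and @lines are illustrative names, not part of the original slides.

```perl
use strict;
use warnings;

my %codon3 = (
    "TTT" => "Phe",
    "TCT" => "Ser",
    "TGT" => "Cys",
    "TAT" => "Tyr",
);

# Look up one value by its key.
my $aa = $codon3{"TGT"};

# exists() tests whether a key is present in the hash.
my $known = exists $codon3{"AAA"} ? "yes" : "no";

# Hashes are unordered, so sort the keys for a stable listing.
my @lines = map { "$_ => $codon3{$_}" } sort keys %codon3;
print "$_\n" for @lines;
```

Sorting the keys is only needed when the output order matters; the hash itself keeps no particular order.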


Perl special variables (small extract)

$_ The default input and pattern-searching space.

$& The string matched by the last successful pattern match.

$` The string preceding whatever was matched by the last successful pattern match.

$' The string following whatever was matched by the last successful pattern match.

$! If a system or library call fails, it sets this variable. This means that the value of $! is meaningful only immediately after a failure.

$/ The input record separator, newline by default.

$$ The process number of the Perl running this script.

@ARGV The command-line arguments (separated by spaces by default).

note: $ARGV[0] is the first command-line argument …
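
A few of these special variables can be observed in a small sketch. The strings, the pattern and the file path are made up for the example; the path is simply assumed not to exist so that open() fails.

```perl
use strict;
use warnings;

# $_ is the default variable for foreach and for pattern matches.
my @hits;
foreach ("THIO_RAT", "TRXL2_ARATH") {
    push @hits, $_ if /^THIO/;    # same as: if ($_ =~ /^THIO/)
}

# After a successful match, $`, $& and $' hold the text before,
# inside and after the match.
my ($before, $matched, $after);
if ("Hello world" =~ /o w/) {
    ($before, $matched, $after) = ($`, $&, $');
}

# $! describes why the last system call failed.
my $error = "";
open(my $fh, "<", "/no/such/dir/no_such_file.txt") or $error = "$!";

print "hits=@hits before=$before matched=$matched after=$after\n";
print "open failed: $error\n";
print "pid=$$\n";
```

The value of $! is copied into $error right after the failed call, since later system calls may overwrite it.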


Programs using variables :

#!/usr/local/bin/perl

use strict;
use warnings;

my $name = "John Doe";

print "Hello $name !\n" ;

exit ;


#!/usr/local/bin/perl

use strict;
use warnings;

my $name = $ARGV[0];

print "Hello $name !\n" ;

exit ;


#!/usr/local/bin/perl

use strict;
use warnings;

print "\nEnter your name (then press \"return\" when done):\t";

#get information from the
#terminal window
my $name = <STDIN>;

print "Hello $name !\n" ;

exit ;

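Interpolation & quoting:

Single and double quotes have different meanings: single quotes keep the text exactly as written, while double quotes replace variables such as $price with their value.

…
my $price = '$100';
print "the price is $price";

#this is called interpolation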
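
The difference between the two kinds of quotes can be checked with a short sketch. It reuses $price from the slide; $double, $single and $escaped are names added here for illustration.

```perl
use strict;
use warnings;

my $price = '$100';                       # single quotes: the '$' stays literal

my $double  = "the price is $price";      # double quotes: $price is interpolated
my $single  = 'the price is $price';      # single quotes: no interpolation
my $escaped = "the variable \$price holds $price";   # \$ gives a literal '$'

print "$double\n$single\n$escaped\n";
```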


Program using variables :

#!/usr/local/bin/perl

use strict;
use warnings;

my @names = ("Pedro", "Claire", "Yemima", "Fabien" , "RochPhilippe", "Francisco", "Sandra Yukie",
             "Simona", "Christophe", "Dominique", "Michaela", "Lionel", "Gabriele", "Michael", "Charlotte",
             "Subhash", "Adam", "Sebastian", "Tu", "Sergey", "Olusegun", "Joel", "Uta", "Viviane", "Stanislav",
             "Kyrill", "Petr", "Sebastien");

print "Hello\n @names !\n" ;

exit ;

Some array functions:

sort: sorts all the elements of an array.
reverse: reverses the order of all the elements of an array.
shift, unshift: shift takes (removes) the first element; unshift places an element at the first position of the array.
pop, push: pop takes (removes) the last element; push places an element at the last position of the array.
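
These functions can be sketched on a shortened copy of the names list. The four names are taken from the slide; "Uta" and "Joel" are reused from it as values to add.

```perl
use strict;
use warnings;

my @names = ("Pedro", "Claire", "Yemima", "Fabien");

# sort and reverse return new lists and leave @names unchanged.
my @sorted   = sort @names;
my @reversed = reverse @names;

my $first = shift @names;          # removes and returns the first element
unshift @names, "Uta";             # adds an element at the front

my $last = pop @names;             # removes and returns the last element
push @names, "Joel";               # adds an element at the end

print "@sorted\n@reversed\n@names\n";
```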


Perl statement modifiers

Any simple statement may optionally be followed by a SINGLE modifier, just before the terminating semicolon (or block ending). The possible modifiers are:

if EXPR
unless EXPR
while EXPR
until EXPR
foreach LIST

The same keywords also introduce block statements of the form if (EXPR) { }, which the following examples use.

The EXPR following the modifier is referred to as the "condition". Its truth or falsehood determines how the modifier will behave.

if executes the statement once if and only if the condition is true.

unless is the opposite: it executes the statement if the condition is false (unless the condition is true).

The foreach modifier is an iterator: it executes the statement once for each item in the LIST (with $_ aliased to each item in turn).

while repeats the statement while the condition is true.

until does the opposite: it repeats the statement until the condition is true (or while the condition is false).

The while and until modifiers have the usual "while loop" semantics (conditional evaluated first).
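
The five modifiers can be sketched as follows. The array @log and the counter $count are hypothetical names used only to record what each statement does.

```perl
use strict;
use warnings;

my @log;
my $count = 3;

# if / unless: the statement runs at most once.
push @log, "count is positive" if $count > 0;
push @log, "count is not ten"  unless $count == 10;

# foreach modifier: $_ is aliased to each item in turn.
push @log, "item $_" foreach (1 .. 3);

# while modifier: the condition is tested before each execution.
$count-- while $count > 0;

# until modifier: repeats until the condition becomes true.
$count++ until $count >= 2;

print "$_\n" for @log;
print "final count: $count\n";
```

Each line holds one simple statement followed by a single modifier; combining two modifiers on the same statement is not allowed.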


if / if else / if elsif else

#!/usr/local/bin/perl

use strict;
use warnings;

print "\nEnter your name (then press \"return\" when done):\t";

#get information from the terminal window
my $name = <STDIN>;

#remove trailing "\n" if any
chomp $name;

if ($name eq "Couchepin") {
    print "Hello Mr President !\n" ;
}
else {
    print "Hello $name !\n" ;
}

exit ;


if / if else / if elsif else (name.pl) :

#!/usr/local/bin/perl

use strict;
use warnings;

print "\nEnter your name (then press \"return\" when done):\t";

#get information from the terminal window
my $name = <STDIN>;

#remove trailing "\n" if any
chomp $name;

if ($name eq "Couchepin") {
    print "Hello Mr President !\n" ;
}
elsif ($name eq "Falquet") {
    print "Good day to you Master $name !\n" ;
}
else {
    print "Hello $name !\n" ;
}

exit ;


Perl looping: the for/foreach loop

"Passing an array":

foreach my $element ( @array ) {
    # do something with the element
}

"Passing a hash":

foreach my $key (keys %hash) {
    print "The value of $key is $hash{$key}\n";
}

"Specify 3 EXPR inside the (): initial state, condition and loop expression":

for (my $i = 0; $i <= 10; $i = $i + 1) {
    # execute the contents of the block as long as $i is less than or equal to 10
}


#!/usr/local/bin/perl

use strict;
use warnings;

my @names = ("Pedro", "Claire", "Yemima", "Fabien" , "RochPhilippe", "Francisco", "Sandra Yukie",
             "Simona", "Christophe", "Dominique", "Michaela", "Lionel", "Gabriele", "Michael", "Charlotte",
             "Subhash", "Adam", "Sebastian", "Tu", "Sergey", "Olusegun", "Joel", "Uta", "Viviane", "Stanislav",
             "Kyrill", "Petr", "Sebastien", "Haleh");

foreach my $name (@names) {
    print "Hello $name !\n";
}

exit ;


#!/usr/local/bin/perl

use strict;
use warnings;

my $counter;

for ($counter=1;$counter<=10;$counter++) {
    print "I can count up to $counter !\n";
}

exit ;


Perl looping: the while loop

while ( condition ) {
    #execute the contents of the block
}

ATTENTION: Infinite Loop !!!

while (1) {
    #execute the contents of the block forever !
}

True/False

In Perl some values are considered true:

- a number with a nonzero value
- a string with nonzero length (except the string "0", which is false)
- an array with at least one element
- a hash with at least one key/value pair

For example:

$lang = "Perl"; # <- true

$version = 5.6; # <- true

$zero = 0; # <- false

$empty = ""; # <- false

@states = (); # <- false

%table = (1 => "one"); # <- true
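
The truth rules can be verified with a minimal sketch. The helper truth() is a hypothetical function written for this example; it is not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper for this sketch: reports how Perl treats a scalar.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

my @states = ();
my %table  = (1 => "one");

my %result = (
    perl   => truth("Perl"),                # non-empty string
    zero   => truth(0),                     # the number 0
    empty  => truth(""),                    # the empty string
    string => truth("0"),                   # the string "0" is also false
    zeros  => truth("0.0"),                 # "0.0" is a non-empty string other than "0"
    array  => truth(scalar @states),        # number of elements: 0
    hash   => truth(scalar(keys %table)),   # number of keys: 1
);

print "$_: $result{$_}\n" for sort keys %result;
```

Arrays and hashes are tested through their size here, which is what Perl does when they appear in a condition such as if (@states).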


Perl looping: the while loop

This loop never ends, because $number is never changed inside the block:

#!/usr/local/bin/perl

use strict;
use warnings;

my $number = 1;

while ($number<=10) {
    print "I can count up to $number !";
}

exit ; #really ?

Tip:

To stop a "looping" script press CTRL+C …

Incrementing $number inside the block lets the loop finish:

#!/usr/bin/perl

use strict;
use warnings;

my $number = 1;

while ($number<=10) {
    print "I can count up to $number !";
    $number+=1; #Ha !
}

exit ;


Perl looping: while loop / do until

while loop: "Activity" may never be executed.

do until: "Activity" is executed at least once !

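The difference between the two loop forms can be sketched with a condition that is false from the start. The counters $while_runs and $do_runs are names introduced for this illustration.

```perl
use strict;
use warnings;

my $number = 20;          # "$number <= 10" is false from the start

my $while_runs = 0;
while ($number <= 10) {   # condition checked first: the block is skipped
    $while_runs++;
    $number++;
}

my $do_runs = 0;
do {                      # the block runs once before the condition is checked
    $do_runs++;
    $number++;
} until ($number > 10);

print "while ran $while_runs time(s), do-until ran $do_runs time(s)\n";
```

The while block is never entered, whereas the do block runs once before its condition is evaluated.
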

Perl operators

Arithmetic
+    addition
-    subtraction
*    multiplication
/    division

Numeric comparison
==   equality
!=   inequality
<    less than
>    greater than
<=   less than or equal
>=   greater than or equal

String comparison
eq   equality
ne   inequality
lt   less than
gt   greater than
le   less than or equal
ge   greater than or equal

Why do we have separate numeric and string comparisons?

Because we don't have special variable types, and Perl needs to know whether to sort numerically (where 99 is less than 100) or alphabetically (where 100 comes before 99).
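
The sorting point can be illustrated with a short sketch. The list of values is made up, and the numeric comparison operator <=> used in the sort block is a standard Perl operator that does not appear in the tables above.

```perl
use strict;
use warnings;

my @values = (100, 99, 9, 1000);

my @as_strings = sort @values;                # default sort compares as strings
my @as_numbers = sort { $a <=> $b } @values;  # <=> compares as numbers

print "as strings: @as_strings\n";
print "as numbers: @as_numbers\n";
```

The same values end up in a different order depending on which comparison is used, which is why Perl keeps the two sets of operators apart.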


#!/usr/local/bin/perl

use strict;
use warnings;

my $x = 100;
my $y = 99;

if ($x > $y) {
    print "\"$x\" is numerically greater than \"$y\"\n" ;
}
else {
    print "\"$x\" is numerically smaller than \"$y\"\n" ;
}

if ($x gt $y) {
    print "\"$x\" is alphabetically greater than \"$y\"\n" ;
}
else {
    print "\"$x\" is alphabetically smaller than \"$y\"\n" ;
}

exit ;

Output:
"100" is numerically greater than "99"
"100" is alphabetically smaller than "99"


Boolean logic
&&   and
||   or
!    not

Miscellaneous
=    assignment
.    string concatenation
x    string multiplication
..   range operator (creates a list of numbers)

Many operators can be combined with a "=" as follows:

$a += 1; # same as $a = $a + 1, same as $a++

$a -= 1; # same as $a = $a - 1, same as $a--

$a .= "\n"; # same as $a = $a . "\n";
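
A minimal sketch of these operators in use follows. All variable names and values are illustrative.

```perl
use strict;
use warnings;

my $line  = "-" x 10;                 # string multiplication: ten dashes
my @range = (1 .. 5);                 # range operator: (1, 2, 3, 4, 5)

my $n = 5;
$n += 1;                              # 6
$n -= 2;                              # 4
$n *= 3;                              # 12

my $text = "Hello";
$text .= " world";                    # appends to the string

# Boolean operators combine conditions.
my $both   = ($n > 10 && $text eq "Hello world") ? "yes" : "no";
my $either = (!@range || $n == 12) ? "yes" : "no";

print "$line\n@range\n$n\n$text\n$both $either\n";
```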


Perl functions

Functions in Perl are called subroutines

Functions are useful to avoid typing redundant code over and over.

Functions help in the clarity of scripts.

There are already many available functions in Perl:

http://search.cpan.org/~nwclark/perl-5.8.6/pod/perlfunc.pod

Syntax of Perl subroutines (the arguments passed in the call arrive in the array @_):

sub name {
    my (list of arguments) = @_;
    list of statements to execute
    return some value
}


#!/usr/local/bin/perl

use strict;
use warnings;

my $height = 220;
my $weight = 120;

#to calculate the BFI you need the height in cm and the weight in kg
my $bfi = &cal($height, $weight);
print "$bfi\n";
exit;

sub cal {
    if (@_ != 2) {
        die "&cal should get exactly two arguments!\n" ;
    }
    my ($cm, $kg) = @_ ;
    my $index = ($kg)/(($cm / 100)*($cm / 100));
    return $index;
}

Output:
24.7933884297521

Note on Body Fat Index (BFI):
BFI < 20 => weight is too low
20 < BFI < 25 => weight is correct
BFI > 25 => Oops !


#!/usr/local/bin/perl

use strict;
use warnings;

my @names = ("Pedro", "Claire", "Yemima", "Fabien", "Uta");

foreach (@names) {
    my $size = length($_);
    print "*"x($size+2),"\n";
    print "*$_*\n";
    print "*"x($size+2),"\n";
}

exit ;

Output:
*******
*Pedro*
*******
********
*Claire*
********
********
*Yemima*
********
********
*Fabien*
********
*****
*Uta*
*****

What if you need this "pretty print" more than once ?

my @names1 = ("Pedro", "Claire", "Yemima", "Fabien" ,"Uta");
my @names2 = ("Sandra Yukie", "Simona", "Christophe", "Dominique");
my @names3 = ("Lionel", "Michael", "Charlotte", "Subhash", "Adam");
my @names4 = ("Sebastian", "Tu", "Sergey", "Olusegun", "Joel", "Viviane");
my @names5 = ("Stanislav", "Kyrill", "Petr", "Sebastien", "Haleh");


#!/usr/local/bin/perl

use strict;
use warnings;

my @names1 = ("Pedro", "Claire", "Yemima", "Fabien" ,"Francisco");
my @names2 = ("Sandra Yukie", "Simona", "Christophe", "Dominique", "Michaela");
my @names3 = ("Lionel", "Gabriele", "Michael", "Charlotte", "Subhash", "Adam");
my @names4 = ("Sebastian", "Tu", "Sergey", "Olusegun", "Joel", "Uta", "Viviane");
my @names5 = ("Stanislav", "Kyrill", "Petr", "Sebastien", "Haleh");

&pretty_print(@names1);
&pretty_print(@names2);
&pretty_print(@names3);
&pretty_print(@names4);
&pretty_print(@names5);

exit ;

sub pretty_print {
    foreach (@_) {
        my $size = length($_);
        print '*'x($size+2),"\n";
        print "*$_*\n";
        print '*'x($size+2),"\n";
    }
}


Perl File handles

A "file handle" is a connection between your Perl script and the outside world.

You can open a file for input or output using the open() function.

open(INFILE, "input.txt") or die "Can't open input.txt: $!";
open(OUTFILE, ">output.txt") or die "Can't open output.txt: $!";
open(LOGFILE, ">>logfile") or die "Can't open logfile: $!";

A file name without a prefix is opened for reading, ">" opens it for writing (overwriting any existing content) and ">>" opens it for appending.

Use whatever filehandle name you like, EXCEPT: STDIN, STDOUT, STDERR !

print() can also take an optional first argument specifying which filehandle to print to:

print STDERR "This is your final warning\n";
print OUTFILE $record;
print LOGFILE $logmessage;
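
The three open modes can be tried out in a self-contained sketch. As an assumption of this example, it uses the three-argument form of open() with lexical filehandles ($out, $log, $in) rather than the bareword handles shown above, and it writes to a scratch file created with the core File::Temp module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the sketch does not touch real data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# ">" truncates the file and opens it for writing.
open(my $out, ">", $path) or die "Can't open $path: $!";
print $out "first line\n";
close $out;

# ">>" opens the same file for appending.
open(my $log, ">>", $path) or die "Can't open $path: $!";
print $log "second line\n";
close $log;

# "<" opens the file for reading; list context reads every line.
open(my $in, "<", $path) or die "Can't open $path: $!";
my @lines = <$in>;
close $in;

print "read ", scalar(@lines), " lines\n";
```

Because the second open() appends, the file holds both lines when it is read back.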


Perl special file handles

There are three connections that always exist and are always "open" when your program starts:

STDIN, STDOUT, and STDERR.

Actually, these names are file handles. File handles are variables used to manipulate files.

STDIN reads from standard input, which is usually the keyboard in a normal Perl script (or input from a browser in a CGI script; Cgi-lib.pl reads from this automatically).

STDOUT (Standard Output) and STDERR (Standard Error) by default write to a console (or a browser in CGI).

We have been using the STDOUT file handle without knowing it for every print() statement during this presentation. The print() function uses STDOUT as the default if no other file handle is specified.


You can read from an open filehandle using the "<>" operator.

In scalar context it reads a single line (or a single record) from the filehandle, and in list context it reads the whole file in, assigning each line to an element of the list:

my $line = <INFILE>;
my @lines = <INFILE>;

Reading in the whole file at one time is called slurping. It can be useful but it may be a memory hog. Most text file processing can be done a line at a time with Perl's looping constructs. The "<>" operator is most often seen in a while loop:

while (<INFILE>) { # assigns each line in turn to $_
    print "Just read in this line: $_";
}

When you're done with your filehandles, you should close() them (though Perl will clean up after you if you forget…):

close INFILE;

You can replace the regular record separator "\n" with something else:

$/ = "\/\/\n"; # for a file containing SwissProt entries
$/ = ">"; # for a fasta file
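
Changing the record separator can be sketched on a small FASTA-style text. The two records are made-up data, and the filehandle is opened on a string in memory (possible since Perl 5.8) so that the example needs no file on disk.

```perl
use strict;
use warnings;

# Two short FASTA-style records held in a string (made-up data).
my $fasta = ">seq1\nVKLIESKEAF\nQEALAAAGDK\n>seq2\nITSAEQFLNA\n";

# Open a filehandle on the string instead of a file on disk.
open(my $in, "<", \$fasta) or die "Can't open in-memory file: $!";

my %sequence;
{
    local $/ = ">";                 # one record now ends at each ">"
    while (my $record = <$in>) {
        chomp $record;              # chomp removes a trailing $/, here ">"
        next if $record eq "";      # the text before the first ">" is empty
        my ($id, @lines) = split /\n/, $record;
        $sequence{$id} = join "", @lines;
    }
}
close $in;

print "$_: $sequence{$_}\n" for sort keys %sequence;
```

Using local inside a block restores the default separator once the block ends, so later reads in the same program still work line by line.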


Perl regular expressions

Idea: a powerful way to search for text patterns …

>sw:THIO_RAT/110
VKLIESKEAFQEALAAAGDKLVVVDFSATWCGPCKMIKPFFHSLCDKY ……
>te:CB530525/66168
VKQIESKYAFQEALNSAGEKLVVVDFSATWCGPCKMIKPFFHSLSEKY ……
>tr:Q5R9M3_PONPY/210
VKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEKY ……
>tg:NT039170_956/56151
VKLIESKEAFQEALAAERDKLVMVDFSATWCGPCKMIKPFFHSSCDKY ……
>te:CV502349/88193
VSLITTKESWDQKLAEAKKegKIVIANFSASWCGPCRMISPFYCELKY ……
>sw:TRXL2_ARATH/98174
ITSAEQFLNALKDAGDRLVIVDFYGTWCGSCRAMFPKLCKFGHTAKEH ……
>te:OMY_1368_2/13111
ISSEEQWEEALSGPGLLVIEVYQRWCGPCKAVQNIFRKLRSHTHHTEY ……
>te:CA246724/110160
SKATYDEQWAAhkSSGKLMVIDFSASWCGPCRFIEPAFKELTHTASRF ……
>tr:Q84XR8_CHLRE/68169
ILTADTYHGFLEKNAEKLVVTDFYAVWCGPCKVIAPEIERTLANEMMT ……
>tg:AL772421_11/578
KLVVIEFGASWCEPSRRIAPVFAEYAKKMNKDKNDHDKDGDKDGMKEF ……

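A minimal sketch of the idea, using three of the sequences from this slide. The identifiers are shortened (the trailing position digits are left out) and the pattern /WCG.C/ is an illustrative choice, not one taken from the presentation.

```perl
use strict;
use warnings;

# Three of the sequences from the slide, keyed by shortened identifiers.
my %seqs = (
    "sw:THIO_RAT"    => "VKLIESKEAFQEALAAAGDKLVVVDFSATWCGPCKMIKPFFHSLCDKY",
    "sw:TRXL2_ARATH" => "ITSAEQFLNALKDAGDRLVIVDFYGTWCGSCRAMFPKLCKFGHTAKEH",
    "tg:AL772421_11" => "KLVVIEFGASWCEPSRRIAPVFAEYAKKMNKDKNDHDKDGDKDGMKEF",
);

# Illustrative pattern: "WCG", any one residue, then "C".
my %motif;
foreach my $id (sort keys %seqs) {
    if ($seqs{$id} =~ /(WCG.C)/) {
        $motif{$id} = $1;           # $1 holds the text captured by (...)
    }
}

print "$_ matches with $motif{$_}\n" for sort keys %motif;
```

The "." in the pattern stands for any single character, so the same expression finds slightly different stretches of text in different sequences.
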