4.1 Reading and writing files

4.2 Reading files

Open a file for reading, and link it to a filehandle:

    open(IN, "<EHD.fasta");

And then read lines from the filehandle, exactly as you would from <STDIN>:

    my $line = <IN>;

    my @inputLines = <IN>;
    foreach $line (@inputLines) ...

Every filehandle that is opened should be closed:

    close(IN);

Always check that the open didn’t fail (e.g. if a file by that name doesn’t exist):

    open(IN, "<$file") or die "can't open file $file";
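The steps above (open, check, read, close) can also be sketched with the three-argument form of open and a lexical filehandle, a common alternative to the bareword IN. The helper name read_lines and the file name read_demo.txt are made up for this illustration; adding $! to the die message shows the reason the open failed.

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file, without trailing newlines.
sub read_lines {
    my ($file) = @_;
    # Three-argument open with a lexical filehandle; $! holds the system error text.
    open(my $in, '<', $file) or die "can't open file $file: $!";
    my @lines = <$in>;      # list context: read all remaining lines at once
    close($in);
    chomp @lines;           # chomp works on every element of the array
    return @lines;
}

# Create a small file so the sketch runs on its own (the file name is an assumption).
my $demo = 'read_demo.txt';
open(my $out, '>', $demo) or die "can't create $demo: $!";
print $out "first line\nsecond line\n";
close($out);

foreach my $line (read_lines($demo)) {
    print "$line\n";
}
unlink $demo;
```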

4.3 Writing to files

Open a file for writing, and link it to a filehandle:

    open(OUT, ">EHD.analysis") or die...

NOTE: If a file by that name already exists it will be overwritten!

Or, you can add lines at the end of an existing file (append):

    open(OUT, ">>EHD.analysis") or die...

Print to a file (note that there is no comma after the filehandle):

    print OUT "The mutation is in exon $exonNumber\n";
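The difference between overwriting (">") and appending (">>") can be seen in a small sketch. The helpers write_line and slurp and the file name write_demo.txt are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: write one line to $file using $mode ('>' or '>>').
sub write_line {
    my ($mode, $file, $text) = @_;
    open(my $out, $mode, $file) or die "can't open $file: $!";
    print $out "$text\n";          # no comma between the filehandle and the text
    close($out) or die "can't close $file: $!";
}

# Hypothetical helper: read the whole file back as one string.
sub slurp {
    my ($file) = @_;
    open(my $in, '<', $file) or die "can't open $file: $!";
    my $content = join('', <$in>);
    close($in);
    return $content;
}

my $file = 'write_demo.txt';       # made-up file name
write_line('>',  $file, 'first');  # creates (or overwrites) the file
write_line('>',  $file, 'second'); # overwrites: 'first' is gone
write_line('>>', $file, 'third');  # appends after 'second'
print slurp($file);                # prints "second" and "third" on separate lines
unlink $file;
```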

4.4 File Test Operators

You can ask questions about a file or a directory by its name (no open filehandle is needed):

    if (-e $name) { print "The file $name exists!\n"; }

    -e $name    exists
    -r $name    is readable
    -w $name    is writable by you
    -z $name    has zero size
    -s $name    has non-zero size (returns size)
    -f $name    is a file
    -d $name    is a directory
    -l $name    is a symbolic link
    -T $name    is a text file
    -B $name    is a binary file (opposite of -T)
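A few of these operators combined in one place, as a minimal sketch. The helper name describe and the messages it returns are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: describe a path using file test operators.
sub describe {
    my ($name) = @_;
    return "$name does not exist"   unless -e $name;
    return "$name is a directory"   if -d $name;
    return "$name is an empty file" if -z $name;
    my $size = -s $name;            # -s returns the size in bytes
    return "$name is a file of $size bytes";
}

print describe('.'), "\n";          # the current directory
print describe('some_file_name.txt'), "\n";   # made-up name, result depends on your disk
```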

4.5 Working with paths

    open(IN, '<D:\workspace\Perl\p53.fasta');

• Always use a full path name; it is safer and clearer to read

• Remember to use \\ in double quotes

    open(IN, "<D:\\workspace\\Perl\\$name.fasta");

• (usually) you can also use /

    open(IN, "<D:/workspace/Perl/$name.fasta");
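The quoting rules above can be checked directly, and the core module File::Spec offers one way to avoid typing separators at all. This is only a sketch: the helper fasta_path and the directory names are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build "<dirs>/<name>.fasta" with the separator
# of the operating system the script runs on.
sub fasta_path {
    my ($name, @dirs) = @_;
    return File::Spec->catfile(@dirs, "$name.fasta");
}

# Single quotes keep backslashes as they are; double quotes need \\ for each one.
my $single = 'D:\workspace\Perl\p53.fasta';
my $double = "D:\\workspace\\Perl\\p53.fasta";
print "both spellings give the same string\n" if $single eq $double;

print fasta_path('p53', 'workspace', 'Perl'), "\n";
```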

4.6 Command line parameters

It is common to give parameters on the command line for a program or a script:

    > perl -w findProtein.pl my argument list

They will be stored in the array @ARGV:

    @ARGV contains: ("my","argument","list");

    foreach my $arg (@ARGV) {
        print "$arg\n";
    }

Output:

    my
    argument
    list

4.7 Command line parameters

It is common to give parameters on the command line for a program or a script:

    > perl -w findProtein.pl "my argument list"

They will be stored in the array @ARGV:

    @ARGV contains: ("my argument list");

    foreach my $arg (@ARGV) {
        print "$arg\n";
    }

Output:

    my argument list

4.8 Command line parameters

It is common to give parameters on the command line for a program or a script:

    > perl -w findProtein.pl D:\perl\input.fasta D:\perl\output.txt

They will be stored in the array @ARGV:

    my $inFile = $ARGV[0];
    my $outFile = $ARGV[1];

Or more simply:

    my ($inFile,$outFile) = @ARGV;
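When a script expects a fixed number of parameters, it can help to check @ARGV before using it. The following is a minimal sketch; the helper parse_args and the usage text are assumptions, and the demonstration passes a made-up list where a real script would pass @ARGV.

```perl
use strict;
use warnings;

# Hypothetical helper: expect exactly two parameters, input and output file names.
sub parse_args {
    my @args = @_;
    # An array in numeric context gives its number of elements.
    die "usage: perl findProtein.pl <inFile> <outFile>\n" unless @args == 2;
    my ($inFile, $outFile) = @args;
    return ($inFile, $outFile);
}

# Demonstration with a made-up argument list; a real script would call parse_args(@ARGV).
my ($inFile, $outFile) = parse_args('input.fasta', 'output.txt');
print "reading $inFile, writing $outFile\n";
```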

4.9 Command line parameters in PerlExpress

4.10 Reading files - example

Reminder: the class exercise of 3 days ago.

Input:
    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 24.00,7.00,8.00
    END

Output:
    Yossi 27.6
    Dana 27
    Refael 39

4.11 Reading files: example

    $line = <STDIN>;
    chomp $line;

    # Each loop iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(" ", $line);
        $name = $nameAndNums[0];
        @nums = split(",", $nameAndNums[1]);
        $sum = 0;

        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print "$name $sum\n";

        # Read next line
        $line = <STDIN>;
        chomp $line;
    }

Input:
    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 24.00,7.00,8.00
    END

Output:
    Yossi 27.6
    Dana 27
    Refael 39

4.12 Reading files: example

    my ($inFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";

    $line = <IN>;
    chomp $line;

    # Each loop iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(" ", $line);
        $name = $nameAndNums[0];
        @nums = split(",", $nameAndNums[1]);
        $sum = 0;

        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print "$name $sum\n";

        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);

Input:
    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 24.00,7.00,8.00
    END

Output:
    Yossi 27.6
    Dana 27
    Refael 39

4.13 Reading files: example

    my ($inFileName, $outFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    open(OUT, ">$outFileName") or die "can't open $outFileName";
    $line = <IN>;
    chomp $line;

    # Each loop iteration processes one input line and prints its output
    while ($line ne "END") {
        # Separate name and numbers
        @nameAndNums = split(" ", $line);
        $name = $nameAndNums[0];
        @nums = split(",", $nameAndNums[1]);
        $sum = 0;

        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";

        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);
    close(OUT);

Input:
    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 24.00,7.00,8.00
    END

Output:
    Yossi 27.6
    Dana 27
    Refael 39

4.14 Reading files: example

    my ($inFileName, $outFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    open(OUT, ">$outFileName") or die "can't open $outFileName";
    $line = <IN>;
    chomp $line;

    # Each loop iteration processes one input line and prints its output
    while (defined $line) {
        # Separate name and numbers
        @nameAndNums = split(" ", $line);
        $name = $nameAndNums[0];
        @nums = split(",", $nameAndNums[1]);
        $sum = 0;

        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";

        # Read next line
        $line = <IN>;
        chomp $line;
    }
    close(IN);
    close(OUT);

Input:
    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 24.00,7.00,8.00

Output:
    Yossi 27.6
    Dana 27
    Refael 39

4.15 Reading files: example

    my ($inFileName, $outFileName) = @ARGV;
    open(IN, "<$inFileName") or die "can't open $inFileName";
    open(OUT, ">$outFileName") or die "can't open $outFileName";
    $line = <IN>;

    # Each loop iteration processes one input line and prints its output
    while (defined $line) {
        chomp $line;
        # Separate name and numbers
        @nameAndNums = split(" ", $line);
        $name = $nameAndNums[0];
        @nums = split(",", $nameAndNums[1]);
        $sum = 0;

        # Sum numbers
        foreach $num (@nums) {
            $sum = $sum + $num;
        }
        print OUT "$name $sum\n";

        # Read next line
        $line = <IN>;
    }
    close(IN);
    close(OUT);

Input:
    Yossi 6.10,16.50,5.00
    Dana 21.00,6.00
    Refael 24.00,7.00,8.00

Output:
    Yossi 27.6
    Dana 27
    Refael 39
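The read-test-process pattern of the last example is often written more compactly as while (my $line = <$in>), which reads a line and tests that it is defined in one step. The sketch below is one possible variation, not the version from the slides: the helper sum_line is hypothetical, and an in-memory filehandle stands in for the input file so the code runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "Name n1,n2,..." into "Name total".
sub sum_line {
    my ($line) = @_;
    my ($name, $numList) = split(" ", $line);
    my $sum = 0;
    $sum += $_ foreach split(",", $numList);
    return "$name $sum";
}

# An in-memory filehandle stands in for the input file in this sketch.
my $data = "Yossi 6.10,16.50,5.00\nDana 21.00,6.00\n";
open(my $in, '<', \$data) or die "can't open in-memory data: $!";
while (my $line = <$in>) {   # the loop ends when <$in> returns undef
    chomp $line;
    print sum_line($line), "\n";
}
close($in);
```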

4.16 Class exercise 5

1. Write a script that reads a file containing a Perl script, that is named by the first command-line parameter (from @ARGV). Print out the script without comment lines (lines that begin with #).

2. Now write the results to a file that is named by the second command-line parameter.

3. Now remove all other comments as well (that may not start at the beginning of a line).

4.17 Reading directories

Perl allows easy access to the files in a directory by “globbing”.

The * matches any string of characters. For example, *.fasta matches all file names with the extension ".fasta".

    my @files = <D:\\proteins\\*.fasta>;
    foreach $fileName (@files) {
        open(IN, $fileName) or die "can't open file $fileName";
        foreach $line (<IN>) {
            # do something...
        }
        close(IN);
    }

Note: the “glob” gives a list of the file names in the directory. There are no quotation marks around the pattern inside the < >.

4.18 Reading directories

You can interpolate variables in the glob, as in double-quoted strings:

    @files = <D:\\proteins\\chr$chromosome*.fasta>;

If $chromosome is 4 then we may get these files in @files:

    chr4.fasta
    chr4_NT_003827.fasta
    chr4_NT_007222.fasta
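The same idea can be sketched with glob(), the function form of the < > operator. The helper chromosome_files and the directory glob_demo are made up for this illustration, and forward slashes are used so that no backslash escaping is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: list the .fasta files of one chromosome, sorted by name.
sub chromosome_files {
    my ($dir, $chromosome) = @_;
    # Variables are interpolated into the pattern before the glob is expanded.
    my @files = glob("$dir/chr${chromosome}*.fasta");
    return sort @files;
}

# Build a throwaway directory so the sketch runs anywhere (all names are made up).
my $dir = 'glob_demo';
mkdir $dir or die "can't create $dir: $!";
foreach my $name ('chr4.fasta', 'chr4_NT_003827.fasta', 'chr5.fasta') {
    open(my $out, '>', "$dir/$name") or die "can't create $dir/$name: $!";
    close($out);
}

print "$_\n" foreach chromosome_files($dir, 4);   # the two chr4 files, not chr5.fasta

# Clean up the demonstration files and directory.
unlink glob("$dir/*");
rmdir $dir;
```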

4.19 Manipulating files

Delete a file:

    unlink ("fred.txt") or die "can't delete fred.txt";

Delete all files in a directory whose names match a certain “pattern”:

    unlink <fred\\*.txt> or die "can't delete files in fred";

(Here – all file names that end with “.txt”)

Move/rename files:

    rename ("fred.txt", "friends\\bob.txt") or die "can't move fred.txt";
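A small sketch of rename and unlink working together. The helper touch and all file names are invented for the example; note that unlink returns the number of files it deleted, which is why "or die" works when nothing could be removed.

```perl
use strict;
use warnings;

# Hypothetical helper: create an empty file.
sub touch {
    my ($file) = @_;
    open(my $out, '>', $file) or die "can't create $file: $!";
    close($out);
}

touch('fred_demo.txt');
rename('fred_demo.txt', 'bob_demo.txt') or die "can't rename fred_demo.txt: $!";
print "renamed\n" if -e 'bob_demo.txt' and not -e 'fred_demo.txt';

touch('extra_demo.txt');
# unlink returns the number of files it actually deleted
my $deleted = unlink('bob_demo.txt', 'extra_demo.txt');
print "deleted $deleted files\n";
```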

4.20 Calling system commands

Generally, you can execute any command of the operating system:

    $systemReturn = system("del fred.txt");

Or:

    $systemReturn = system("copy fred.txt george.txt");

When checking the value returned by a system call, usually 0 means no errors:

    if ($systemReturn != 0) { die "can't copy fred.txt"; }
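The return value of system holds more than the exit code: the code itself sits in the high byte, so it is usually extracted with >> 8. The sketch below assumes a hypothetical helper run, and it launches the running Perl interpreter ($^X) instead of an operating-system command so that it works on any platform.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return its exit code (0 means success).
sub run {
    my @command = @_;
    my $status = system(@command);   # list form: arguments are passed without a shell
    die "failed to start @command: $!" if $status == -1;
    return $status >> 8;             # the exit code is in the high byte
}

# $^X is the path of the running perl, so no OS-specific command is needed here.
my $ok  = run($^X, '-e', 'exit 0');
my $bad = run($^X, '-e', 'exit 3');
print "first: $ok, second: $bad\n";  # prints "first: 0, second: 3"
```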

4.21 Class exercise 6

1. Write a script that prints a list of all Perl files (i.e. files with extension “.pl”) in a given directory, that is named by the first command-line parameter.

2. Change the script from class exercise 5.1 so that it will read all Perl files in a given directory, that is named by the first command-line parameter, and print them out to the screen without the comment lines.

3* Change the script so that each script will be written to a file named as the input file with an added extension “.noComments”, e.g. input “class_ex.2.2.pl”, output “class_ex.2.2.pl.noComments”.