Programming and Perl for Bioinformatics, Part I

Date posted: 19-Dec-2015

A Taste of Perl: print a message

perltaste.pl: Greet the entire world.

    #!/usr/bin/perl
    #greet the entire world
    $x = 6e9;
    print "Hello world!\n";
    print "All $x of you!\n";

- #!/usr/bin/perl is the command interpretation header
- #greet the entire world is a comment
- $x = 6e9; is a variable assignment statement
- the two print lines are function calls (output statements)

Basic Syntax and Data Types

- Whitespace between tokens doesn't matter to Perl. One can write all statements on one line.
- All Perl statements end in a semicolon ';' just like C.
- Comments begin with '#' and Perl ignores everything after the # until end of line. Example: #this is a comment
- Perl has three basic data types:
    - scalar
    - array (list)
    - associative array (hash)

Scalars

- Scalar variables begin with '$' followed by an identifier. Example: $this_is_a_scalar;
- An identifier is composed of upper or lower case letters, numbers, and underscore '_'. Identifiers are case sensitive (like all of Perl).

    $progname = "first_perl";
    $numOfStudents = 4;

- '=' sets the content of $progname to be the string "first_perl" and $numOfStudents to be the integer 4.

Scalar Values

Numerical Values
- integer: 5, "3", 0, -307 (the string "3" is converted to a number when used as one)
- floating point: 6.2e9, -4022.33
- hexadecimal/octal: 0xd4f, 0477
- binary: 0b011011

NOTE: Perl converts between integers, floating-point numbers ("double" precision), and numeric strings as needed, so a scalar needs no declared numeric type.
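As a quick check of the literal forms above, the following sketch prints each value in decimal. The `use strict` and `use warnings` lines, the `my` declarations, and the variable names are additions for illustration and do not appear on the slides.

```perl
use strict;
use warnings;

# Each literal is an ordinary number once the program is compiled.
my $hex    = 0xd4f;       # hexadecimal (0x prefix)
my $octal  = 0477;        # octal (leading zero)
my $binary = 0b011011;    # binary (0b prefix)
my $sum    = "3" + 4;     # a numeric string is converted when used in arithmetic

print "hex: $hex, octal: $octal, binary: $binary, sum: $sum\n";
# prints: hex: 3407, octal: 319, binary: 27, sum: 7
```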

Do the Math

Arithmetic operators work pretty much as you would expect:

    4+7
    6*4
    43-27
    256/12
    2/(3-5)

Example

    #!/usr/bin/perl
    print "4+5\n";
    print 4+5 , "\n";
    print "4+5=" , 4+5 , "\n";
    $myNumber = 88;

Note: use commas to separate multiple items in a print statement.

What will be the output?

    4+5
    9
    4+5=9
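A few more operators, as a sketch that goes slightly beyond the slide: `%`, `**`, `int`, and `printf` are standard Perl, but they are not covered in the original text, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $quotient  = 256 / 12;        # division is not truncated
my $whole     = int(256 / 12);   # int() drops the fractional part: 21
my $remainder = 256 % 12;        # modulus: 4
my $power     = 2 ** 10;         # exponentiation: 1024
my $negative  = 2 / (3 - 5);     # parentheses are evaluated first: -1

printf "%.2f\n", $quotient;      # prints: 21.33
print "$whole $remainder $power $negative\n";
# prints: 21 4 1024 -1
```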

Scalar Values

String values

Example:

    $day = "Monday";
    print "Happy Monday!\n";
    print "Happy $day!\n";
    print 'Happy Monday!\n';
    print 'Happy $day!\n';

- Double-quoted: interpolates (replaces a variable name or control character with its value)
- Single-quoted: no interpolation done (as-is)

What will be the output? In statement order:

    Happy Monday!<newline>
    Happy Monday!<newline>
    Happy Monday!\n
    Happy $day!\n
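The difference between the two quoting styles can be sketched as follows. The `${day}` brace form is standard Perl for marking where a variable name ends, though it is not shown on the slide; the variable names are illustrative.

```perl
use strict;
use warnings;

my $day = "Monday";

my $double = "Happy $day!";      # interpolated: Happy Monday!
my $single = 'Happy $day!';      # taken as-is:  Happy $day!
my $plural = "I like ${day}s";   # braces separate the name from the trailing "s"

print "$double\n";   # prints: Happy Monday!
print "$single\n";   # prints: Happy $day!
print "$plural\n";   # prints: I like Mondays
```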

String Manipulation

Concatenation

    $dna1 = "ACTGCGTAGC";
    $dna2 = "CTTGCTAT";

- Juxtapose in a string assignment or print statement:

    $new_dna = "$dna1$dna2";

- Use the concatenation operator '.':

    $new_dna = $dna1 . $dna2;

Substring

    $dna = "ACTGCGTAGC";
    $exon1 = substr($dna,2,5);   # TGCGT

In substr($dna,2,5), the 2 is the starting offset (counted from 0) and the 5 is the length of the substring.
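Some further `substr` calls, sketched under the assumption that the same `$dna` string is used. Omitting the length and using a negative offset are both standard behaviors of Perl's `substr`; the variable names other than `$dna` and `$exon1` are illustrative.

```perl
use strict;
use warnings;

my $dna = "ACTGCGTAGC";

my $exon1 = substr($dna, 2, 5);   # offset 2, length 5:       TGCGT
my $start = substr($dna, 0, 3);   # first three bases:        ACT
my $tail  = substr($dna, 7);      # no length: to the end:    AGC
my $last3 = substr($dna, -3);     # negative offset counts from the end: AGC

print "$exon1 $start $tail $last3\n";
# prints: TGCGT ACT AGC AGC
```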

Substitution

DNA transcription: T → U

Substitution operator s///:

    $dna = "GATTACATACACTGTTCA";
    $rna = $dna;
    $rna =~ s/T/U/g;   # "GAUUACAUACACUGUUCA"

=~ is a binding operator; it tells Perl to examine the contents of $rna for the match pattern.

Ex: Start with $dna = "gaTtACataCACTgttca"; and do the same as above. What will be the output?

Example

transcribe.pl:

    $dna = "gaTtACataCACTgttca";
    $rna = $dna;
    $rna =~ s/T/U/g;
    print "DNA: $dna\n";
    print "RNA: $rna\n";

Does it do what you expect? If not, why not?

Patterns in substitution are case-sensitive! What can we do?
- Convert all letters to upper/lower case (preferred when possible)
- If we want to retain mixed case, use the transliteration/translation operator tr///:

    $rna =~ tr/tT/uU/;   #replace all t by u, all T by U
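The two remedies can be compared side by side. This is a sketch: the `/i` (case-insensitive) modifier and the use of `tr///` as a counter are standard Perl features that the slide does not mention, and the names `$rna_upper`, `$rna_mixed`, and `$gc_count` are illustrative.

```perl
use strict;
use warnings;

my $dna = "gaTtACataCACTgttca";

# Case-insensitive substitution: every t or T becomes an upper-case U.
my $rna_upper = $dna;
$rna_upper =~ s/t/U/gi;

# Transliteration keeps the original case of each base.
my $rna_mixed = $dna;
$rna_mixed =~ tr/tT/uU/;

# With an empty replacement list, tr/// changes nothing and
# returns the number of matching characters.
my $gc_count = ( $dna =~ tr/gcGC// );

print "$rna_upper\n";        # prints: gaUUACaUaCACUgUUca
print "$rna_mixed\n";        # prints: gaUuACauaCACUguuca
print "G+C: $gc_count\n";    # prints: G+C: 6
```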

Case conversion

    $string = "acCGtGcaTGc";

Upper case:

    $dna = uc($string);      # "ACCGTGCATGC"
    or $dna = uc $string;
    or $dna = "\U$string";

Lower case:

    $dna = lc($string);      # "accgtgcatgc"
    or $dna = "\L$string";

Sentence case (ucfirst alone changes only the first character, so lower-case the string first):

    $dna = ucfirst(lc($string));   # "Accgtgcatgc"
    or $dna = "\u\L$string";

Reverse Complement

    5'- A C G T C T A G C . . . . G C A T -3'
    3'- T G C A G A T C G . . . . C G T A -5'

Reverse: reverses a string

    $string = "ACGTCTAGC";
    $string = reverse($string);   # "CGATCTGCA"

Complementation: use the transliteration operator

    $string =~ tr/ACGT/TGCA/;
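Both steps can be combined into one routine. The following is a sketch: `reverse_complement` is a hypothetical helper name, subroutines are not introduced in this part of the slides, and handling lower-case bases is an added assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of a DNA string.
sub reverse_complement {
    my ($seq) = @_;
    my $revcomp = reverse $seq;            # scalar assignment, so the characters are reversed
    $revcomp =~ tr/ACGTacgt/TGCAtgca/;     # complement each base, keeping its case
    return $revcomp;
}

my $strand = "ACGTCTAGC";
my $result = reverse_complement($strand);
print "reverse complement of $strand is $result\n";
# prints: reverse complement of ACGTCTAGC is GCTAGACGT
```

Note that `reverse` only reverses the characters of a string in scalar context; in list context (for example, directly inside a `print` argument list) it reverses the order of the list elements instead.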

More on String Manipulation

String length:

    length($dna)

Index:

    #index STR,SUBSTR,POSITION
    index($strand, $primer, 2)

The third argument (POSITION) is optional.
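A short sketch of how `length` and `index` behave, including the optional start position. The sequence and primer values are made up for illustration; `index` returning -1 when nothing is found is standard Perl behavior.

```perl
use strict;
use warnings;

my $strand = "GATTACAGATTACA";   # illustrative sequence
my $primer = "TTAC";             # illustrative primer

my $size    = length($strand);                        # 14
my $first   = index($strand, $primer);                # first match, offset 2
my $second  = index($strand, $primer, $first + 1);    # search resumes at offset 3, finds 9
my $missing = index($strand, "CCC");                  # no match: -1

print "$size $first $second $missing\n";
# prints: 14 2 9 -1
```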

Flow Control

Conditional Statements

- Parts of code are executed depending on the truth value of a logical statement.
- "Truth" (logical) values in Perl:
    - false = {0, 0.0, 0e0 (as numbers), "", "0", undef}, default ""
    - true = anything else, default 1

    ($a, $b) = (75, 83);
    if ( $a < $b ) {
        $a = $b;
        print "Now a = b!\n";
    }
    if ( $a > $b ) { print "Yes, a > b!\n" }   # Compact
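The truth rules can be probed directly. In this sketch, `truth` is a hypothetical helper and the `? :` conditional operator is standard Perl that the slides have not introduced. It also shows that the strings "0.0" and "0e0" are true, unlike the corresponding numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl evaluates a value in a condition.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

print truth(0),     "\n";   # false
print truth(0.0),   "\n";   # false (numeric zero)
print truth(""),    "\n";   # false
print truth("0"),   "\n";   # false
print truth(undef), "\n";   # false
print truth("0.0"), "\n";   # true: a non-empty string other than "0"
print truth("0e0"), "\n";   # true
print truth(" "),   "\n";   # true
```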

Comparison Operators

    Comparison                  String   Number
    Equality                    eq       ==
    Inequality                  ne       !=
    Greater than                gt       >
    Greater than or equal to    ge       >=
    Less than                   lt       <
    Less than or equal to       le       <=
    Comparison                  cmp      <=>

The first six operators return 1/null; cmp and <=> return -1, 0, or 1.
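Choosing the wrong column of the table changes the answer, as this sketch shows. The values "10" and "9" are chosen for illustration.

```perl
use strict;
use warnings;

my ($x, $y) = ("10", "9");

my $numeric = $x <=> $y;   #  1: as numbers, 10 is greater than 9
my $string  = $x cmp $y;   # -1: as strings, "1" sorts before "9"

print "numeric comparison: $numeric\n";   # prints: numeric comparison: 1
print "string comparison: $string\n";     # prints: string comparison: -1

print "10 > 9 as numbers\n"   if $x > $y;
print "10 lt 9 as strings\n"  if $x lt $y;
```

Both `if` tests succeed, so both of the last two lines are printed.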

Logical Operators

    Operation   Computerese   English version
    AND         &&            and
    OR          ||            or
    NOT         !             not
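A minimal sketch of the three operators in use. The test for an "ORF-like" sequence (length divisible by three, starting with ATG) is an illustrative assumption, as are the variable names.

```perl
use strict;
use warnings;

my $dna = "ATGCGTTAA";

# && is true only when both tests are true.
my $looks_like_orf = ( length($dna) % 3 == 0 ) && ( substr($dna, 0, 3) eq "ATG" );

# || returns its first true operand, which gives a simple default value.
my $label = "" || "unnamed";

# ! inverts a truth value.
my $is_empty = !length($dna);

print "ORF-like\n"               if $looks_like_orf;   # printed
print "label: $label\n";                               # prints: label: unnamed
print "sequence is not empty\n"  if !$is_empty;        # printed
```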

if/else/elsif

Allows for multiple branching/outcomes:

    $a = rand();
    if ( $a < 0.25 ) {
        print "A";
    }
    elsif ( $a < 0.50 ) {
        print "C";
    }
    elsif ( $a < 0.75 ) {
        print "G";
    }
    else {
        print "T";
    }
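The branching above picks one random base; wrapping it in a subroutine and calling it repeatedly builds a random sequence. This is a sketch: `random_base`, the sequence length of 20, and the use of a subroutine and a loop are illustrative additions.

```perl
use strict;
use warnings;

# Hypothetical helper built from the if/elsif/else chain above.
sub random_base {
    my $r = rand();    # uniform random number, 0 <= $r < 1
    if    ( $r < 0.25 ) { return "A"; }
    elsif ( $r < 0.50 ) { return "C"; }
    elsif ( $r < 0.75 ) { return "G"; }
    else                { return "T"; }
}

my $sequence = "";
for ( my $i = 0 ; $i < 20 ; $i++ ) {
    $sequence .= random_base();    # append one base per iteration
}
print "$sequence\n";               # 20 random bases; differs from run to run
```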

Conditional Loops

- while ( statement ) { commands … }
  repeats commands as long as statement is true
- do { commands } while ( statement );
  same as while, except commands are executed at least once
  NOTE the ';' after the while statement!!
- Short-circuiting commands: next and last

    next;   #jumps to end, do next iteration
    last;   #jumps out of the loop completely
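A sketch of `next` and `last` inside a `while` loop. The sequence, the treatment of N as an unknown base to skip, and the treatment of '-' as a stop marker are all illustrative assumptions.

```perl
use strict;
use warnings;

my $dna   = "ACGTNNACGT-ACGT";   # illustrative sequence
my $count = 0;
my $pos   = 0;

while ( $pos < length($dna) ) {
    my $base = substr($dna, $pos, 1);
    $pos++;                        # advance before next, or the loop would never move on
    last if $base eq "-";          # stop scanning at the first gap character
    next if $base eq "N";          # skip unknown bases without counting them
    $count++;
}
print "Counted $count bases\n";    # prints: Counted 8 bases
```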

while

Example:

    while ($alive) {
        if ($needs_nutrients) {
            print "Cell needs nutrients\n";
        }
    }

Any problem?
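One possible answer: nothing inside the loop ever changes `$alive`, so once entered the loop never ends. A sketch of a terminating version follows; the `$hours` counter and the limit of 3 are hypothetical, added only so the condition eventually becomes false.

```perl
use strict;
use warnings;

my $alive           = 1;
my $needs_nutrients = 1;
my $hours           = 0;    # hypothetical counter, not in the original example
my $messages        = 0;

while ($alive) {
    if ($needs_nutrients) {
        print "Cell needs nutrients\n";
        $messages++;
    }
    $hours++;
    $alive = 0 if $hours >= 3;    # the loop condition is updated inside the loop
}
# The message is printed three times, then the loop exits.
```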

for and foreach loops

- Execute a code loop a specified number of times, or for a specified list of values
- for and foreach are identical: use whichever you want

Incremental loop ("C style"):

    for ( $i=0 ; $i < 50 ; $i++ ) {
        $x = $i*$i;
        print "$i squared is $x.\n";
    }

Loop over list ("foreach" loop):

    foreach $name ( "Billy", "Bob", "Edwina" ) {
        print "$name is my friend.\n";
    }
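Because the two keywords are interchangeable, the same count can be written in either style. This sketch counts the A bases in an illustrative sequence; the range operator `..` is standard Perl but does not appear on the slide.

```perl
use strict;
use warnings;

my $dna = "GATTACA";

# C-style loop over string offsets.
my $count_c_style = 0;
for ( my $i = 0 ; $i < length($dna) ; $i++ ) {
    $count_c_style++ if substr($dna, $i, 1) eq "A";
}

# List-style loop over the same offsets, generated by the range operator.
my $count_list_style = 0;
foreach my $i ( 0 .. length($dna) - 1 ) {
    $count_list_style++ if substr($dna, $i, 1) eq "A";
}

print "$count_c_style $count_list_style\n";   # prints: 3 3
```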

Basic Data Types

Perl has three basic data types:
- scalar
- array (list)
- associative array (hash)