Bioperl - lmse.org

Bioperl

What’s Bioperl?

Bioperl is not a new language

It is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications.

Bioperl and Perl

[Diagram with the labels: input, Perl script, Perl interpreter, Perl modules, Bioperl modules, output]

Why Bioperl for bioinformatics?

Perl is good at file manipulation and text processing, which make up a large part of the routine tasks in bioinformatics.

The Perl language, its documentation, and many Perl packages are freely available.

Perl is easy to get started with for writing small and medium-sized programs.

Where to get help

Type perldoc <modulename> in a terminal

Search for a particular module at https://metacpan.org

Bioperl documentation

Object-oriented and Process-oriented (procedural) programming

Process-oriented: Yuan Hao eats chicken

Name object: $name

Food object: $food

Action method: eat

Object-oriented: $name->eat($food)
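
The call $name->eat($food) can be tried out in plain Perl without Bioperl. The following is a minimal sketch, assuming a made-up class called Person with a constructor new and a method eat; none of these names come from Bioperl, they only illustrate how an object, a class, and a method fit together.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class for illustration only; it is not part of Bioperl.
package Person;

# Constructor: stores the name in a hash and blesses it into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} };
    return bless $self, $class;
}

# Method: the object itself arrives as the first argument.
sub eat {
    my ($self, $food) = @_;
    return "$self->{name} eats $food";
}

package main;

my $name = Person->new(name => 'Yuan Hao');   # the object
my $food = 'chicken';                         # the argument

print $name->eat($food), "\n";   # prints "Yuan Hao eats chicken"
```

Bioperl scripts follow the same pattern: a class such as Bio::Seq provides new, and the returned object provides the methods.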

Modularize the program

Platform and Related Software Required

Perl 5.6.1 or higher (version 5.8 or higher is highly recommended)

make (on Mac OS X, this requires installing the Xcode Developer Tools)

Installation On Linux or Mac OS X

Install with cpanminus:
perlbrew install-cpanm
cpanm Bio::Perl

Install from source code:
git clone https://github.com/bioperl/bioperl-live.git
cd bioperl-live
perl Build.PL
./Build test (optional)
./Build install

Installation On Windows

Install MinGW (MinGW is incorporated in Strawberry Perl, but it must be installed through PPM for ActivePerl): ppm install MinGW

Install Module::Build, Test::Harness and Test::Most through CPAN:

1. Type cpan to enter the CPAN shell.
2. At the cpan> prompt, type install CPAN.
3. Quit (by typing 'q') and reload CPAN. You may be asked some configuration questions; accept the defaults.
4. At the cpan> prompt, type o conf prefer_installer MB, then type o conf commit.
5. At the cpan> prompt, type install Module::Build.
6. At the cpan> prompt, type install Test::Harness.
7. At the cpan> prompt, type install Test::Most.

Installation On Windows

Finish install from source code:

1. Go to GitHub and press the Download ZIP button.
2. Extract the archive in the normal way.
3. In a cmd window, cd to the directory you extracted to. E.g. if you extracted to the directory 'bioperl-live', type cd bioperl-live.
4. Type perl Build.PL and answer the questions appropriately.
5. Type perl Build test. All the tests should pass, but if they don't, let us know. Your usage of Bioperl may not be affected by the failure, so you can choose to continue anyway.
6. Type perl Build install to install Bioperl.

Finish install from cpan:

1. At the cpan> prompt, type d /bioperl/ to list the matching distributions, for example: .... Distribution C/CJ/CJFIELDS/BioPerl-1.007001.tar.gz
2. Type install C/CJ/CJFIELDS/BioPerl-1.007001.tar.gz

The following examples show what Bioperl can do

Creating a sequence, and an Object

#!/usr/bin/perl -w

use Bio::Seq;

my $seq_obj = Bio::Seq->new(-seq => 'aaaatgggggggggggccccgtt', -alphabet => 'dna' );

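# In the statement above, $seq_obj is the object, Bio::Seq is the class, new is the method, and -seq and -alphabet are the arguments.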

Creating a sequence, and an Object

#!/usr/bin/perl -w

use Bio::Seq;

my $seq_obj = Bio::Seq->new(-seq => 'aaaatgggggggggggccccgtt', -alphabet => 'dna' );

print $seq_obj->seq . "\n";

More True-to-life example

#!/usr/bin/perl -w

use Bio::Seq;

my $seq_obj = Bio::Seq->new(-seq => "aaaatgggggggggggccccgtt", -display_id => "#12345", -desc => "example 1", -alphabet => "dna" );

print $seq_obj->seq();
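
Once the object carries an ID and a description, the same object can be queried through further methods. The following is a minimal sketch that uses only standard Bio::Seq methods (display_id, desc, length, subseq, revcom); the labels in the print statements are chosen freely for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq;

my $seq_obj = Bio::Seq->new(
    -seq        => 'aaaatgggggggggggccccgtt',
    -display_id => '#12345',
    -desc       => 'example 1',
    -alphabet   => 'dna',
);

# Each piece of information is retrieved through a method call on the object.
print 'ID:          ', $seq_obj->display_id, "\n";
print 'Description: ', $seq_obj->desc, "\n";
print 'Length:      ', $seq_obj->length, "\n";

# subseq takes 1-based, inclusive start and end positions.
print 'First three: ', $seq_obj->subseq(1, 3), "\n";

# revcom returns a new sequence object holding the reverse complement.
print 'Rev. comp.:  ', $seq_obj->revcom->seq, "\n";
```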

Write Sequence to File

#!/usr/bin/perl -w

use Bio::Seq;
use Bio::SeqIO;

my $seq_obj = Bio::Seq->new(-seq => 'aaaatgggggggggggccccgtt', -alphabet => 'dna' );

# Create an object for I/O from the class Bio::SeqIO
my $seqio_obj = Bio::SeqIO->new(-file => '>sequence.fasta', -format => 'fasta' );

Write Sequence to File

#!/usr/bin/perl -w

use Bio::Seq;
use Bio::SeqIO;

my $seq_obj = Bio::Seq->new(-seq => "aaaatgggggggggggccccgtt", -display_id => "#12345", -desc => "example 1", -alphabet => "dna" );

my $seqio_obj = Bio::SeqIO->new(-file => '>sequence.fasta', -format => 'fasta' );

$seqio_obj->write_seq($seq_obj);

Write Sequence to File

#!/usr/bin/perl -w

use Bio::Seq;
use Bio::SeqIO;

my $seq_obj = Bio::Seq->new(-seq => "aaaatgggggggggggccccgtt", -display_id => "#12345", -desc => "example 1", -alphabet => "dna" );

my $seqio_obj = Bio::SeqIO->new(-file => '>sequence.fasta', -format => 'Genbank' );

$seqio_obj->write_seq($seq_obj);

Unified programming: the same code writes a different file format; only the -format argument changes.
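
To make the point concrete, one sequence object can be written in several formats by the same few lines. The following is a sketch under stated assumptions: the display ID example1 and the output names sequence.fasta and sequence.genbank are invented for this illustration, and only the two formats already used in the slides are written.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;

my $seq_obj = Bio::Seq->new(
    -seq        => 'aaaatgggggggggggccccgtt',
    -display_id => 'example1',    # hypothetical ID chosen for this sketch
    -desc       => 'example 1',
    -alphabet   => 'dna',
);

# Hypothetical output names: one file per format, named after the format.
for my $format (qw(fasta genbank)) {
    my $seqio_obj = Bio::SeqIO->new(
        -file   => ">sequence.$format",
        -format => $format,
    );
    $seqio_obj->write_seq($seq_obj);
    $seqio_obj->close;    # flush and close the file before the next format
}
```

Only the value passed to -format differs between the two passes of the loop; the call to write_seq stays the same.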

Retrieving a Sequence from a File

#!/usr/bin/perl -w

use Bio::SeqIO;

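# sequence.fasta was written in GenBank format in the previous example, so it is read with -format => "genbank"
my $seqio_obj = Bio::SeqIO->new(-file => "sequence.fasta", -format => "genbank" );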

my $seq_obj = $seqio_obj->next_seq;

print $seq_obj->seq . "\n";
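
A file often holds more than one sequence, and next_seq returns one record per call until none are left. The following is a minimal sketch of that loop; to keep it self-contained it first writes a small two-record FASTA file, and the file name two_sequences.fasta, the IDs seq1 and seq2, and the short sequences are all made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;

my $file = 'two_sequences.fasta';    # hypothetical file name

# Write two short records so that there is something to read back.
my $out = Bio::SeqIO->new(-file => ">$file", -format => 'fasta');
$out->write_seq(Bio::Seq->new(-seq => 'aaaatggg', -display_id => 'seq1', -alphabet => 'dna'));
$out->write_seq(Bio::Seq->new(-seq => 'ccccgtt',  -display_id => 'seq2', -alphabet => 'dna'));
$out->close;

# Read the records back one at a time; next_seq returns undef at end of file.
my $in = Bio::SeqIO->new(-file => $file, -format => 'fasta');
my @ids;
while (my $seq_obj = $in->next_seq) {
    push @ids, $seq_obj->display_id;
    print $seq_obj->display_id, "\t", $seq_obj->length, "\n";
}
```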

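Multiple Sequence Alignment

Bioperl can work together with other software

#!/usr/bin/perl -w

# Bio::Tools::Run::Alignment::Muscle ships with BioPerl-Run and needs the external muscle program to be installed.
use Bio::Tools::Run::Alignment::Muscle;
use Bio::AlignIO;

my @params = (quiet => 0, maxiters => '100');

my $factory = Bio::Tools::Run::Alignment::Muscle->new(@params);
my $inputfilename = $ARGV[0];    # first argument: file with the unaligned sequences
my $aln = $factory->align($inputfilename);

# Second argument: name of the file that receives the alignment
my $out = Bio::AlignIO->new(-file => ">$ARGV[1]", -format => 'fasta');
$out->write_aln($aln);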

Retrieving a Sequence from a Database

#!/usr/bin/perl -w

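use strict;
use Bio::EnsEMBL::Registry;    # from the Ensembl Perl API, which is installed separately from Bioperl

my $registry = 'Bio::EnsEMBL::Registry';

# Connect to the public Ensembl database server
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# Fetch the genomic region (slice) that covers one gene
my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_gene_stable_id('ENSG00000128573');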

print $slice->seq . "\n";

Application Programming Interface (API)

Bioperl provides various APIs for efficiently extracting user-defined datasets from databases, even if you are not familiar with their data structures.

Obstacles in Learning OOP

Programming in a totally different way

Becoming familiar with different objects, methods, and classes

It is worthwhile!

Thanks