Orthology & Paralogy
Alignment & Assembly
Alastair Kerr Ph.D.
WTCCB Bioinformatics Core [many slides borrowed from various sources]
Overview
Orthology & Paralogy
• Definitions and examples
• Ways to determine an ortholog
• Pre-calculations: resources
Alignment & Assembly
• Differences
• Key programs for each
• Jalview example
Homologs
Have common origins but may or may not have common activity.
Homologous or not? Often decided by an arbitrary similarity threshold applied to an alignment.
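A minimal sketch of the threshold idea, assuming two sequences that are already aligned with `-` as the gap character. The function names and the 30% default cut-off are illustrative assumptions, not values from the slides; the point is only that the call depends on where the threshold is set.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def looks_homologous(aligned_a: str, aligned_b: str, threshold: float = 30.0) -> bool:
    """Apply an arbitrary identity cut-off (hypothetical default of 30%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned pair: 4 identical residues out of 5 ungapped columns.
identity = percent_identity("ACDE-G", "ACDFHG")
print(identity)                                             # 80.0
print(looks_homologous("ACDE-G", "ACDFHG", threshold=90))   # False
```

The same pair passes at one threshold and fails at another, which is why the cut-off should be treated as a working convention rather than a biological boundary.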
Homologs
…have common ancestry, but the way they are related can vary
(i.e. the reasons they have diverged into different sequences can vary)
• Orthologs: homologs produced by speciation. They tend to have similar function.
• Paralogs: homologs produced by gene duplication. They tend to have differing functions.
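Orthologous or paralogous homologs
[Figure: globin gene tree. An early globin gene undergoes gene duplication into an α-chain gene and a β-chain gene, each of which is found in mouse, human and cattle. The α genes are orthologs of one another, as are the β genes; the two cattle genes are paralogs; all are homologs.]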
Orthologs – diverged after speciation – tend to have similar function
Paralogs – diverged after gene duplication – some functional divergence occurs
Therefore, for linking similar genes between species, or performing “annotation transfer”, identify orthologs
True or False?
A1x is the ortholog in species x of A1y?
A1x is a paralog of A2x?
A1x is a paralog of A2y?
Identifying Gene/Protein Relationships from Phylogenies
Orthologs
• Homologs produced by speciation
• Gene phylogeny matches organismal phylogeny
Paralogs
• Homologs produced by gene duplication
• Multiple copies of homologs in a given species
  – or evidence, through phylogenetic analysis, that gene duplication was involved
• Lack of match to organismal phylogeny
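The rules above can be sketched in code, assuming a gene tree whose internal nodes have already been labelled as speciation or duplication events (in practice that labelling comes from reconciling the gene tree with the species tree). A pair of genes is then called orthologous if their last common ancestor node is a speciation, and paralogous if it is a duplication. The `Node` class and the toy globin tree are illustrative assumptions.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass
class Node:
    event: str = "leaf"              # "speciation", "duplication" or "leaf"
    name: Optional[str] = None       # gene name, set on leaves only
    children: Tuple["Node", ...] = ()


def leaves(node: Node) -> List[str]:
    """Names of all genes below this node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


def classify_pairs(node: Node, out: Optional[Dict[FrozenSet[str], str]] = None):
    """Label each gene pair by the event at its last common ancestor node."""
    if out is None:
        out = {}
    label = "orthologs" if node.event == "speciation" else "paralogs"
    for left, right in combinations(node.children, 2):
        for a in leaves(left):
            for b in leaves(right):
                out[frozenset((a, b))] = label
    for child in node.children:
        classify_pairs(child, out)
    return out


# Toy tree: one duplication (alpha/beta), then a human/cattle speciation in each copy.
alpha = Node("speciation", children=(Node(name="human_alpha"), Node(name="cattle_alpha")))
beta = Node("speciation", children=(Node(name="human_beta"), Node(name="cattle_beta")))
tree = Node("duplication", children=(alpha, beta))

relations = classify_pairs(tree)
for pair, label in sorted((sorted(p), l) for p, l in relations.items()):
    print(pair, label)
```

Note that a paralogous pair can span two species (for example human alpha and cattle beta): what matters is the event at the common ancestor, not whether the genes sit in the same genome.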
Gene Orthology: How to detect?
Most common: identify reciprocal best BLAST hits (EGO, COGs, …)
Example problem:
• When comparing human and bovine, for example, the bovine gene dataset is still quite incomplete
• Therefore the current best hit may be a paralog, with the true ortholog not yet sequenced
[Figure: tree with tips labelled cattle, human, cattle, mouse]
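A minimal sketch of the reciprocal best hit idea, assuming the BLAST results have already been parsed into dictionaries that map (query, subject) to a bit score. The gene names and scores below are made up for illustration, and the cattle set is deliberately missing its beta globin to mimic an incomplete dataset.

```python
def best_hits(scores):
    """Best-scoring subject for each query; scores maps (query, subject) -> bit score."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores: human has both globins, cattle set contains only the alpha gene.
human_vs_cattle = {
    ("HBA_human", "HBA_cattle"): 250.0,
    ("HBB_human", "HBA_cattle"): 150.0,
}
cattle_vs_human = {
    ("HBA_cattle", "HBA_human"): 250.0,
    ("HBA_cattle", "HBB_human"): 150.0,
}

one_way = best_hits(human_vs_cattle)
rbh = reciprocal_best_hits(human_vs_cattle, cattle_vs_human)
print(one_way["HBB_human"])   # HBA_cattle: the best hit is a paralog
print(rbh)                    # [('HBA_human', 'HBA_cattle')]
```

The one-way best hit for the human beta gene is the cattle alpha gene, a paralog. Requiring reciprocity rejects that pairing here, but it cannot help when both datasets are missing the true partner.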
2 Forms in 1 Species
[Figure: species tree with + marks showing where the gene forms are found]
Slides from Jonathan Eisen
2 Forms in 1 Species - Gene Loss
Gene duplicated in common ancestor
[Figure: species tree with + marks for the gene forms and two branches labelled "Loss"]
Unusual Distribution Pattern
[Figure: species tree with + marks on the tips where the gene is found]
Unusual Distribution - Gene Loss
Gene present in ancestor
Gene lost here
[Figure: species tree with + marks on the tips where the gene is found]
Unusual Distribution - Evolutionary Rate Variation?
Gene too diverged to be found
[Figure: species tree with + marks on the tips where the gene is found]
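Ortholog guess via synteny
[Figure: gene order A, B, C in one genome and A, ?, C in the other; the unknown gene flanked by A and C is the candidate ortholog of B]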
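The synteny guess can be sketched as follows, assuming gene order is available as a simple list per genome and that orthologs of the flanking genes are already known. The function name, gene names and the "exactly one gene in between" rule are illustrative assumptions; real syntenic block detection is considerably more tolerant of rearrangements.

```python
def guess_by_synteny(ref_order, target_order, known_orthologs, gene):
    """Return the single target gene lying between the orthologs of `gene`'s
    two neighbours, or None if the neighbourhood is not conserved."""
    i = ref_order.index(gene)
    if i == 0 or i == len(ref_order) - 1:
        return None  # a neighbour is needed on both sides
    left = known_orthologs.get(ref_order[i - 1])
    right = known_orthologs.get(ref_order[i + 1])
    if left not in target_order or right not in target_order:
        return None  # flanking orthologs unknown or absent from the target
    # Sorting the two positions lets an inverted block still be recognised.
    lo, hi = sorted((target_order.index(left), target_order.index(right)))
    between = target_order[lo + 1:hi]
    return between[0] if len(between) == 1 else None


# Toy genomes: B sits between A and C; in the other species an
# uncharacterised gene sits between the orthologs of A and C.
species_1 = ["A", "B", "C"]
species_2 = ["A_sp2", "unknown_1", "C_sp2"]
known = {"A": "A_sp2", "C": "C_sp2"}

print(guess_by_synteny(species_1, species_2, known, "B"))   # unknown_1
```

The result is only a guess: conserved neighbours make orthology more plausible, but sequence similarity or a gene tree is still needed to confirm it.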
Syntenic blocks
Alignments and Assemblies
Alignment
• ALL sequences from SAME region
• Therefore can be useless for
  – non-overlapping contigs
  – PCR probes/oligos
• Good for paralogs/orthologs
• Basis for phylogeny
• More dissimilar sequences
Assembly
• Good for near identical sequences
• Read length
  – Short read [Next Gen Sequencing]
  – Long read [Sanger and 3rd Gen sequencing?]
• Reference?
  – De novo
  – Guided [reference sequence]
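To make the assembly side concrete, here is a toy sketch of the core de novo step: joining two reads whose ends overlap exactly. The function names and the minimum overlap of 3 are illustrative assumptions; real assemblers tolerate sequencing errors and use far more efficient data structures.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b
    (at least min_len long), or 0 if there is none."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a: str, b: str, min_len: int = 3):
    """Join two reads into one contig if they overlap, else return None."""
    n = overlap(a, b, min_len)
    return a + b[n:] if n else None


print(merge("ACGTTGCA", "TGCAGGAT"))   # ACGTTGCAGGAT
print(merge("ACGT", "GGCC"))           # None: reads from non-overlapping regions
```

The second call shows why near-identical, overlapping sequence is a precondition: reads from different regions share nothing to join on.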
Ensembl calculations
http://www.ensembl.org
demo
OMA Browser
http://omabrowser.org
demo
Alignment
• Implicit statement: each residue in an aligned sequence is derived from the last common ancestor [LCA]
• Therefore OK to look only at conserved regions, or to mask non-conserved regions
  – Especially for phylogeny
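A minimal sketch of masking non-conserved regions, assuming the alignment is a list of equal-length strings with `-` for gaps. Conservation is scored here as the fraction of sequences sharing the most common residue in a column, and the 0.75 cut-off is an arbitrary illustrative value; dedicated trimming tools use more refined scores.

```python
from collections import Counter


def column_conservation(column) -> float:
    """Fraction of sequences carrying the most common residue (gaps never count)."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def keep_conserved_columns(seqs, min_conservation: float = 0.75):
    """Drop alignment columns whose conservation falls below the cut-off."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = list(zip(*seqs))
    keep = [i for i, col in enumerate(columns)
            if column_conservation(col) >= min_conservation]
    return ["".join(s[i] for i in keep) for s in seqs]


alignment = ["MKT-LV", "MKSALV", "MKTGLV", "MRT-LV"]
masked = keep_conserved_columns(alignment)
print(masked)   # ['MKTLV', 'MKSLV', 'MKTLV', 'MRTLV']
```

Only the gappy fourth column is removed in this toy case; the columns with a single substitution stay because three of four sequences still agree.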
Alignment Tools
Faster but less accurate (some better with gaps)
• Muscle
• ClustalW/X
• MAFFT
Slow but more accurate
• *-Coffee
  – T: original
  – 3D: uses PDB as guide (structural)
  – M: uses multiple methods
• Probcons
Alignment Edit Tools
NEVER use a word processor or Excel to edit alignments…
Jalview (Java Alignment Viewer)
• Good for editing
• DAS capable
[Figure: overview diagram of Jalview capabilities; labels recovered from the slide]
• Data: Sequences, Alignments
• ‘Standard’ formats: FASTA, MSF, CLUSTAL, PILEUP, BLC, PFAM
• Annotation and features: Distributed Annotation System, GFF, Jalview Features, Jalview Annotation
• Trees: Newick
• Structures: PDB
• Multiple Sequence Alignment, Secondary Structure Prediction
• Analysis: Consensus, Conservation & Clustering
• Visualization and figure generation: Clickable HTML, Images, Line Art
Jalview DAS Client Functionality
[Figure: Jalview connected to DAS ANNOTATION SERVERS]
• Query matches ID to Authority
• Map to local reference frame
• Mouse over for feature name, links and scores
• Group features by source
• Type==colour
• Highlight start-end
• Select specific sources
• Filtered list
• Add user defined sources
Assemblers
Many free options: examples below
• Long reads
  – STADEN - staden.sf.net
• Next Gen Sequencing
  – Guided: Bowtie, Novoalign, MAQ
  – De novo: Velvet
• 3rd Generation Sequencing
  – ????
Post Assembly
Correction
• Reads mapping to multiple places
• PCR amplification prior to mapping
Tools and workflows available in our Galaxy platform
demo
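A rough sketch of the two corrections named on the Post Assembly slide, assuming mapped reads have been reduced to simple records of read ID, chromosome, position and strand. The `Hit` record and both function names are illustrative assumptions. Treating reads that share chromosome, position and strand as PCR duplicates is a common heuristic, not necessarily what the Galaxy workflows do.

```python
from collections import Counter, namedtuple

Hit = namedtuple("Hit", "read_id chrom pos strand")


def unique_hits(hits):
    """Keep only reads that map to exactly one place."""
    counts = Counter(h.read_id for h in hits)
    return [h for h in hits if counts[h.read_id] == 1]


def drop_pcr_duplicates(hits):
    """Keep the first read at each (chrom, pos, strand); later ones are
    treated as likely PCR duplicates."""
    seen = set()
    kept = []
    for h in hits:
        key = (h.chrom, h.pos, h.strand)
        if key not in seen:
            seen.add(key)
            kept.append(h)
    return kept


# Toy data: r1 and r2 start at the same place; r4 maps to two chromosomes.
hits = [
    Hit("r1", "chr1", 100, "+"),
    Hit("r2", "chr1", 100, "+"),
    Hit("r3", "chr1", 250, "-"),
    Hit("r4", "chr1", 400, "+"),
    Hit("r4", "chr2", 90, "+"),
]

cleaned = drop_pcr_duplicates(unique_hits(hits))
print([h.read_id for h in cleaned])   # ['r1', 'r3']
```

Both filters discard data, so whether to apply them depends on the experiment: in high-coverage or amplicon data, identical start positions are expected and are not evidence of duplication.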