Molecular evidence for endosymbiosis Perform blastp to investigate sequence similarity among domains...




Molecular evidence for endosymbiosis

• Perform blastp to investigate sequence similarity among domains of life

• Found yeast nuclear genes exhibit more sequence similarity (closer in evolutionary time) with archaeal genes

• Found yeast mitochondrial genes exhibit more sequence similarity with eubacterial genes


t-test and significance

• A t-test determines whether two data sets could come from the same population or whether their means differ significantly

• Calculate the mean and standard deviation of each data set, then derive a pooled (weighted) standard deviation to be used in the t-test

• Compare to t-critical value obtained from t-table or software
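The steps above can be sketched in a few lines of Python. This is a minimal illustration of a two-sample t-test with a pooled standard deviation, assuming equal variances; the function name `pooled_t` and the sample values are hypothetical, not taken from the lecture.

```python
import math
from statistics import mean, stdev

def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for two samples,
    using a pooled standard deviation (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    s1, s2 = stdev(a), stdev(b)  # sample standard deviations
    df = n1 + n2 - 2
    # Weight each variance by its own degrees of freedom.
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = (mean(a) - mean(b)) / (s_pooled * math.sqrt(1 / n1 + 1 / n2))
    return t, df

# Hypothetical measurements, for illustration only.
t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

If |t| is smaller than the t-critical value for the chosen significance level and degrees of freedom (2.306 for df = 8 at a two-tailed 0.05 level), the difference between the means is not considered significant.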


Origins of eukaryotic cells


Martin-Muller hypothesis



Evidence from phylogenetic relationships


Mycobacterium leprae vs. M. tuberculosis

• M. leprae (3.2 Mb) is ~50% coding, whereas M. tuberculosis (4.4 Mb) is ~91% coding

• Comparing genomes using MUMmer:

• http://www.tigr.org/tigr-scripts/CMR2/webmum/mumplot


How MUMmer works:

• Uses suffix trees to create an internal representation of a genome sequence

• Identifies maximal unique matches (MUMs); version 1.0 adds sequence 2 to the suffix tree built for sequence 1, whereas version 2.0 streams sequence 2 against the suffix tree of sequence 1

• Aligns the regions between matches via Smith-Waterman
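To make the idea of a maximal unique match concrete, here is a brute-force sketch. It is not how MUMmer is implemented (MUMmer uses suffix trees; this uses nested loops and is only practical for very short strings), and the names `find_mums`, `count_occurrences`, and `min_len` are hypothetical.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count

def find_mums(a, b, min_len=3):
    """Return (start_in_a, start_in_b, length) for every match that cannot
    be extended in either direction and occurs exactly once in each sequence."""
    mums = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip starts that could be extended to the left (not maximal).
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the sequences agree.
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            match = a[i:i + k]
            if (k >= min_len
                    and count_occurrences(a, match) == 1
                    and count_occurrences(b, match) == 1):
                mums.append((i, j, k))
    return mums

# Toy sequences, for illustration only.
print(find_mums("GATTACA", "CATTAGA"))  # [(1, 1, 4)] -> the shared "ATTA"
```

A suffix tree finds the same matches in time roughly linear in the sequence lengths, which is what makes whole-genome comparison feasible.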


Origin of species

• Mitochondrial DNA and human evolution

• Evolution of pathogens


Phylogeny – data mining by biologists

• Molecular phylogenetics uses clustering techniques to discern relationships among different biological sequences


Why phylogenetics?

• Understand evolutionary history

• Map pathogen strain diversity for vaccines

• Assist in epidemiology (e.g., the dentist HIV transmission case)

• Aid in prediction of function of novel genes

• Biodiversity

• Microbial ecology


Changes can occur


Observing differences in nucleotides

• The simplest measure of distance between two sequences is the number of sites at which the two sequences differ

• If not all sites are equally likely to change, the same site may undergo repeated substitutions

• As time goes by, the number of observed differences between two sequences becomes a less and less accurate estimator of the actual number of substitutions that have occurred
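The simple count of differing sites, expressed as a proportion (often called the p-distance), can be sketched as follows. The function name `p_distance` is illustrative, and the sequences are assumed to be already aligned and of equal length.

```python
def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    return differences / len(seq1)

# Toy sequences: one difference in eight sites.
print(p_distance("ACGTACGT", "ACGTACGA"))  # 0.125
```

Because a site that changes twice still counts as at most one difference, this value underestimates the true number of substitutions as sequences diverge.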


The relationship between time and substitutions is non-linear


Various models have been generated to more accurately estimate distance and evolution

• All use the following framework:

Probability matrix

pAC is the probability that a site that starts as an A has a C at the end of time interval t, etc.

Base composition of sequence: fA = frequency of A
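The matrix itself did not survive in this transcript. A general form consistent with the description above might look like the following; the layout (rows for the starting base, columns for the ending base, in the order A, C, G, T) is an assumption, with $p_{AC}$ and $f_A$ standing for pAC and fA.

```latex
P(t) =
\begin{pmatrix}
p_{AA} & p_{AC} & p_{AG} & p_{AT} \\
p_{CA} & p_{CC} & p_{CG} & p_{CT} \\
p_{GA} & p_{GC} & p_{GG} & p_{GT} \\
p_{TA} & p_{TC} & p_{TG} & p_{TT}
\end{pmatrix},
\qquad
f = (f_A,\; f_C,\; f_G,\; f_T)
```

Each row sums to 1, since a site must end the interval as one of the four bases, and the base frequencies also sum to 1.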


Jukes-Cantor Model

• Distance between any two sequences is given by: d = -(3/4) ln(1 - (4/3)p)

• p is the proportion of nucleotides that are different in the two sequences

• All substitutions are equally probable

– Each off-diagonal position in the matrix = α; each diagonal position = 1 - 3α
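The Jukes-Cantor distance formula above is easy to evaluate directly. A minimal sketch, with the function name `jukes_cantor` chosen here for illustration:

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) ln(1 - (4/3) p),
    where p is the proportion of sites that differ."""
    if not 0 <= p < 0.75:
        # At p >= 0.75 the logarithm is undefined (sequences look random).
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - (4 / 3) * p)

# Illustrative values: the correction grows as p increases.
for p in (0.05, 0.30, 0.60):
    print(p, round(jukes_cantor(p), 4))
```

For small p the corrected distance is close to p itself; for larger p it is noticeably greater, reflecting multiple substitutions at the same site.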


Kimura’s two parameter model

• d = ½ ln[1/(1-2P-Q)] + ¼ ln[1/(1-2Q)]

• P and Q are the proportions of sites that differ between the two sequences due to transitions and transversions, respectively.

• Accounts for transition bias in sequences (transversions are rarer than transitions)
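A sketch of Kimura's two-parameter distance computed from a pair of aligned sequences. The function name `kimura_2p` is illustrative, and the code assumes equal-length sequences containing only uppercase A, C, G, and T (any other differing pair would be counted as a transversion).

```python
import math

# Purine <-> purine (A, G) and pyrimidine <-> pyrimidine (C, T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def kimura_2p(seq1, seq2):
    """Kimura two-parameter distance from aligned sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")
    n = len(seq1)
    pairs = [(x, y) for x, y in zip(seq1, seq2) if x != y]
    P = sum(1 for pair in pairs if pair in TRANSITIONS) / n      # transitions
    Q = sum(1 for pair in pairs if pair not in TRANSITIONS) / n  # transversions
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences too divergent for this correction")
    return 0.5 * math.log(1 / (1 - 2 * P - Q)) + 0.25 * math.log(1 / (1 - 2 * Q))

# Toy sequences: one transition (A->G) and one transversion (A->C) in ten sites.
print(round(kimura_2p("AAAAAAAAAA", "GAAAAAAAAC"), 4))
```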


Evolutionary models


Implementing models and building trees


Rooted vs. unrooted

• Root – ancestor of all taxa considered

• Unrooted – relationship without consideration of ancestry

• Often specify the root with an outgroup

– Outgroup: a distantly related species (e.g., an archaeal species as the outgroup for a tree of mammals)


Tree building

• Get protein/RNA/DNA sequences

• Construct multiple sequence alignment

• Compute pairwise distances (if necessary)

• Build tree – topology and distances

• Estimate reliability

• Visualize


Distance methods

• UPGMA

• Neighbor joining


Unweighted pair-group method using arithmetic averages (UPGMA)

• Assumes a constant rate of gene substitution and evolution

• Clustering algorithm that measures distances between all sequences, merges the closest pair, recalculates that node as an average, then merges the next closest pair, and so on until all sequences are joined

• Usually gives a rooted tree
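The clustering loop described above can be sketched as follows. This is a minimal illustration that returns only the tree topology as nested tuples (no branch lengths); the function name `upgma`, the input format, and the distance values are all hypothetical.

```python
from itertools import combinations

def upgma(names, dist):
    """names: list of taxon labels.
    dist: {(a, b): distance} with one entry per unordered pair of taxa.
    Returns the rooted topology as nested tuples."""
    sizes = {name: 1 for name in names}  # cluster -> number of taxa in it
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(sizes) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the new node is the size-weighted average of the
        # distances to its two children (an average over all member taxa).
        for other in sizes:
            d[frozenset((merged, other))] = (
                na * d[frozenset((a, other))] + nb * d[frozenset((b, other))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))

# Hypothetical pairwise distances, for illustration only.
distances = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
print(upgma(["A", "B", "C", "D"], distances))  # ('D', ('C', ('A', 'B')))
```

The outermost tuple is the root, which is why the method yields a rooted tree without needing an outgroup; that root is only meaningful if the constant-rate assumption holds.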


Testing the reliability of trees

• Interior branch test or Bootstrap analysis

• Bootstrap analysis: resample the alignment (sites drawn with replacement, or deleted), re-draw the tree for each resampled data set, and count how many times the same branching is recovered. Bootstrap values of 70% (or, more strictly, 95%) or greater are normally considered reliable
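The resampling step can be sketched as follows. To keep the example short, "the same branching" is reduced to a stand-in question (is the closest pair of sequences the same as in the original alignment?) rather than a full tree comparison; that simplification, the function names, and the toy alignment are all assumptions made for illustration.

```python
import random
from itertools import combinations

def p_distance(s1, s2):
    """Proportion of aligned sites that differ."""
    return sum(x != y for x, y in zip(s1, s2)) / len(s1)

def closest_pair(alignment):
    """Pair of sequence names with the smallest p-distance."""
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )

def bootstrap_support(alignment, replicates=100, seed=0):
    """Percentage of replicates in which the closest pair matches the original."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    observed = closest_pair(alignment)
    hits = 0
    for _ in range(replicates):
        # Draw alignment columns with replacement, same length as the original.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {
            name: "".join(seq[c] for c in cols) for name, seq in alignment.items()
        }
        if closest_pair(resampled) == observed:
            hits += 1
    return 100.0 * hits / replicates

# Toy alignment: A and B are identical, C differs at every site.
toy = {"A": "ACGTACGT", "B": "ACGTACGT", "C": "CATGCATG"}
print(bootstrap_support(toy))  # 100.0
```

In a real analysis each replicate would be passed through the full tree-building procedure, and support would be tallied separately for every interior branch.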


Homework due on 10/6

• Discovery questions in Chapter 2

• 4, 25-27