Sequence Similarity

Why study sequence similarity?

• Possible indication of common ancestry

• Similarity of structure implies similar biological function – even among apparently distant organisms

• Example context: establishing possible causal relationship between wide use of antibiotics in agriculture and spread of antibiotic resistant bacteria

Antibiotic resistant bacteria

• have evolved rapidly

• can thrive when antibiotics kill non-resistant bugs

• horizontal gene transfer (HGT) can speed development of antibiotic resistance

Source: http://textbookofbacteriology.net/themicrobialworld/bactresanti.html

Figure 3.2: Vertical and horizontal gene transfer

Figure 3.3: How exposure to antibiotics selects for the survival of resistant cells in a population of bacteria

Figure 3.4: A plasmid carrying an antibiotic-resistance gene can be transferred to a new cell by conjugation

• Widespread use of antibiotics means non-resistant strains die, leaving resistant strains to survive and multiply; phenomenon observed in hospitals, care centers, etc.

• Once some bacteria in environment are resistant, HGT can occur & spread resistance faster than would otherwise occur (through mutation)

• Use of antibiotics common in agriculture

• Presence in human pathogens of resistance genes that are highly similar to genes found in bacteria from animals would provide evidence that HGT has occurred

Gene similarity

• Homologues: similar sequences

– homology

– homologous

• Orthologs: a similar gene appears in two different organisms where

– several other such similarities occur

– organisms have common evolutionary ancestry

• Xenologs: similar gene found in organisms that have little else in common – evidence of HGT

Similarity: how close is close?

• Proteins are generally considered homologous if at least 25% of residues are identical

• DNA sequences are generally considered homologous at 70% identity or more

• Threshold level for HGT: 95% identity
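To make the thresholds above concrete, here is a minimal Python sketch that computes percent identity for two already-aligned DNA sequences and compares it with the 70% and 95% cut-offs from this slide. The function names and the choice to count every non-empty alignment column in the denominator are assumptions for illustration; other tools use other denominators.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same base.

    Both inputs must already be aligned (same length, '-' marks a gap).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # empty column, carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def classify_dna(identity: float) -> str:
    """Apply the rule-of-thumb DNA thresholds quoted in the slides."""
    if identity >= 95.0:
        return "possible HGT"
    if identity >= 70.0:
        return "homologous"
    return "below threshold"


if __name__ == "__main__":
    identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
    print(identity, classify_dna(identity))
```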

Establishing homology: alignment

• Match sequences in meaningful way

• Account for differences in sequence length due to indels:

– insertions

– deletions

• Scoring system based on closeness of match
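The idea of aligning with indels and scoring the result can be sketched with a small global alignment (the Needleman-Wunsch technique). This is not how BLAST itself works internally, since BLAST is a fast local-alignment heuristic; it only illustrates how gaps and a scoring system interact. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Needleman-Wunsch global alignment.

    Returns (score, aligned_a, aligned_b); '-' marks an insertion/deletion.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # prefix of a aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # prefix of b aligned against gaps only

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best score for the two prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = global_align("GATTACA", "GATACA")
    print(total)
    print(top)
    print(bottom)
```

With these assumed scores, a single deleted base costs one gap penalty while every other position still counts as a match.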

BLAST: Basic Local Alignment Search Tool

• Versions exist to compare

– protein – protein

• blastp: use when you want to learn about function of protein

– protein – nucleotide

• tblastn: compares a protein query with DNA translated in all six reading frames; used to discover new genes encoding simple proteins

– nucleotide – nucleotide

• blastn: we’ll use this to look for HGT evidence
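The choice among the three programs listed above depends only on what the query is and what the database contains. A small Python sketch of that lookup follows; the helper names are hypothetical, and the alphabet test used to guess the query type is a rough heuristic (a short peptide made only of A, C, G and T would be misread as DNA).

```python
# Program to use for each (query type, database type) pair named in the slides.
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("protein", "nucleotide"): "tblastn",
    ("nucleotide", "nucleotide"): "blastn",
}


def guess_sequence_type(sequence: str) -> str:
    """Guess 'nucleotide' or 'protein' from the letters used (heuristic only)."""
    letters = set(sequence.upper()) - set(" \n\t\r-*")
    if not letters:
        raise ValueError("empty sequence")
    return "nucleotide" if letters <= set("ACGTUN") else "protein"


def choose_program(query: str, database_type: str) -> str:
    """Pick one of the BLAST programs covered in the slides."""
    key = (guess_sequence_type(query), database_type)
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"combination {key} is not covered in these slides")
    return BLAST_PROGRAMS[key]


if __name__ == "__main__":
    print(choose_program("ATGGCGTTAGC", "nucleotide"))
    print(choose_program("MKTAYIAKQR", "nucleotide"))
```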

BLAST servers

• Home server at NCBI

• Other servers available worldwide

– BLAST servers very popular (and busy)

– Japan is sleeping when it’s morning in the USA

– Europe is sleeping when it’s afternoon in the USA

Using blastn

• Start with query sequence – nucleotide sequence you want to investigate

• BLAST compares query with every GenBank sequence

– performs alignment

– reports matches with high degree of similarity

• Point browser to NCBI website

– choose BLAST on home page

– scroll down to Basic BLAST and choose nucleotide

• Paste your query sequence into the query box (raw sequence or FASTA-formatted text)
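If the query lives in a script or a spreadsheet rather than a file, it can help to clean it up into FASTA text before pasting. The sketch below is one way to do that in Python; the function name, the record name `query1` and the 60-character line width are assumptions, not NCBI requirements.

```python
def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as FASTA: a '>' header line, then fixed-width lines."""
    cleaned = "".join(sequence.split()).upper()  # drop spaces and line breaks
    if not cleaned:
        raise ValueError("empty sequence")
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up example sequence, for illustration only.
    print(to_fasta("query1", "atggcgtt agcatcgg\nttaacg", width=10), end="")
```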

• Scroll down to the next box on the page, and select the database to be searched (Nucleotide, in this case)

• Scroll down to the BLAST button and click it

• Then wait …

• Eventually, you’ll see a screen like this:

BLAST results

• Graphical summary

– query sequence at top

– each bar represents portion of another sequence similar to query

• red: most similar – homologous to query

• pink: not as good

• green: borderline

• blue/black: “twilight zone”
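The colors correspond to ranges of the alignment score. The slides do not state the cut-offs, so the values in this Python sketch are an assumption taken from the color key that the NCBI graphic summary usually prints above the bars (below 40, 40-50, 50-80, 80-200, 200 and above); check the legend on your own results page.

```python
def score_color(alignment_score: float) -> str:
    """Map an alignment score to the bar color used in the graphical summary.

    Cut-offs are assumed from the NCBI color key, not taken from the slides.
    """
    if alignment_score >= 200:
        return "red"
    if alignment_score >= 80:
        return "pink"
    if alignment_score >= 50:
        return "green"
    if alignment_score >= 40:
        return "blue"
    return "black"


if __name__ == "__main__":
    for s in (350, 120, 60, 45, 20):
        print(s, score_color(s))
```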

BLAST results: graphics section

BLAST results: description section

• Accession: database entry’s GenBank accession number

• Description: usually identifies organism, some characteristics of sequence

• Scores: based on the quality of the alignment (matches raise the score; mismatches and gaps lower it)

• E-value: statistical significance of score
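The same information can be handled in a script when results are saved as a tab-separated hit table. The Python sketch below assumes the default 12-column layout of BLAST+ tabular output (`-outfmt 6`); the two sample rows and their identifiers (`HYPO_0001`, `HYPO_0002`) are invented for illustration and are not real GenBank entries.

```python
from dataclasses import dataclass
from typing import List

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


@dataclass
class Hit:
    subject_id: str
    percent_identity: float
    e_value: float
    bit_score: float


def parse_hits(text: str) -> List[Hit]:
    """Parse tab-separated hit lines, skipping blank and '#' comment lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = dict(zip(COLUMNS, line.split("\t")))
        hits.append(Hit(
            subject_id=fields["sseqid"],
            percent_identity=float(fields["pident"]),
            e_value=float(fields["evalue"]),  # float() accepts '2e-95'
            bit_score=float(fields["bitscore"]),
        ))
    return hits


# Invented sample rows, for illustration only.
SAMPLE = "\n".join([
    "query1\tHYPO_0001\t98.50\t200\t3\t0\t1\t200\t501\t700\t2e-95\t355",
    "query1\tHYPO_0002\t72.00\t150\t40\t2\t10\t159\t1\t148\t0.5\t30.1",
])

if __name__ == "__main__":
    hits = parse_hits(SAMPLE)
    # Keep hits at or below the 0.001 rule of thumb, best score first.
    significant = sorted((h for h in hits if h.e_value <= 1e-3),
                         key=lambda h: h.bit_score, reverse=True)
    for h in significant:
        print(h.subject_id, h.percent_identity, h.e_value)
```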

E-value

• Estimate of the number of matches with an equal or better score that would be expected by chance in a database of this size

• The lower the e-value, the greater the significance:

– greater similarity between query & target

– greater confidence of homology

– long identical sequences typically show an e-value of 0; as a rule of thumb, anything above 0.001 is considered insignificant

• E-values are written in scientific notation (for example, 3e-45 means 3 × 10^-45)
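For readers who want to see where the number comes from: in the standard statistical model behind BLAST, the E-value is approximately m × n × 2^(-S'), where S' is the bit score, m the query length and n the total database length. The Python sketch below applies that formula directly; real BLAST output uses length-corrected ("effective") values for m and n, so reported E-values will not match this exactly. The function names and the example numbers are assumptions for illustration.

```python
def expected_chance_hits(bit_score: float, query_length: int, database_length: int) -> float:
    """Approximate E-value: E = m * n * 2**(-bit_score)."""
    return query_length * database_length * 2.0 ** (-bit_score)


def is_significant(e_value: float, threshold: float = 1e-3) -> bool:
    """Apply the rule of thumb from the slides: above 0.001 is insignificant."""
    return e_value <= threshold


if __name__ == "__main__":
    # Hypothetical search: 500-base query against a 1e9-base database.
    for score in (30.0, 60.0, 120.0):
        e = expected_chance_hits(score, 500, 10**9)
        print(f"bit score {score:5.1f}  E = {e:.1e}  significant: {is_significant(e)}")
    # Scientific notation as shown in BLAST output parses directly.
    print(float("3e-45") < 1e-3)
```

Each additional bit of score halves the expected number of chance matches, which is why high-scoring hits have such small E-values.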

Alignment section
