16S rRNA gene marker intra-gene variability primer selection size & information content Primer...


Page 1: 16S rRNA gene marker  intra-gene variability  primer selection  size & information content Primer selection, information content, alignment and length.

16S rRNA gene marker

intra-gene variability

primer selection

size & information content

Primer selection, information content, alignment and length

Page 2:

16S rRNA gene marker

Conserved secondary structure

Natural gene amplification

Genealogy reconstruction

Ludwig and Schleifer, 1994. FEMS Microbiol Rev 15: 155-173

http://rna.ucsc.edu/rnacenter/ribosome_images.html

Page 3:

Intra-gene variability

the secondary structure shows differences in the conservation of homologous sites

highly conserved zones give information on deep genealogies (higher resolution for distantly related organisms)

hypervariable zones give information on recent events (higher resolution for close relatives)

Anderson et al., 2008 PLoS ONE, 3: e2836

Stahl and Amann, 1991 John Wiley and Sons
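
One common way to make the contrast between conserved and hypervariable zones concrete is to score each column of a sequence alignment by its Shannon entropy: 0 bits for a fully conserved site, up to 2 bits for a site where all four bases are equally frequent. The sketch below is only an illustration, not the method of the cited works; the toy alignment, the function names and the 1.0-bit cut-off are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [base for base in column if base != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # sum of p * log2(1/p) over the observed bases
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(aligned_seqs):
    """Entropy per column for a list of equal-length aligned sequences."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(column) for column in zip(*aligned_seqs)]


# Toy alignment (invented data): only column 6 varies.
toy_alignment = [
    "GCTGCCTACGA",
    "GCTGCCAACGA",
    "GCTGCCGACGA",
    "GCTGCCCACGA",
]

profile = conservation_profile(toy_alignment)

# Hypothetical cut-off: columns above 1.0 bit are flagged as variable.
variable_columns = [i for i, h in enumerate(profile) if h > 1.0]
print(variable_columns)
```

On real data the input would be a curated 16S rRNA alignment, and runs of low-entropy columns would correspond to the conserved zones that universal primers target.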

Page 4:

Primer selection: universality

Universal primers target highly conserved regions

Universality depends on the known dataset

Different phyla may have differences in the “universal” regions (e.g. EUB 338)

Primers used for rRNA cloning may give biased results

Metagenomics without amplification steps may reveal hidden diversity

Probe        Target               Sequence
EUB338 I     Most Bacteria        GCTGCCTCCCGTAGGAGT
EUB338 II    Planctomycetales     GCAGCCACCCGTAGGTGT
EUB338 III   Verrucomicrobiales   GCTGCCACCCGTAGGTGT

Daims et al. 1999. System Appl Microbiol 22, 434-444
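
To see how little separates the three EUB338 variants, the sequences can be compared position by position. The following is a minimal sketch; the helper name `mismatch_positions` is an assumption for illustration, while the sequences are those of the three probes above (Daims et al. 1999).

```python
PROBES = {
    "EUB338 I": "GCTGCCTCCCGTAGGAGT",
    "EUB338 II": "GCAGCCACCCGTAGGTGT",
    "EUB338 III": "GCTGCCACCCGTAGGTGT",
}


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


reference = PROBES["EUB338 I"]
for name, seq in PROBES.items():
    # Report where each variant departs from EUB338 I.
    print(name, mismatch_positions(reference, seq))
```

EUB338 II differs from EUB338 I at three of the 18 positions and EUB338 III at two, which is enough for a single "universal" probe to miss whole groups.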

Page 5:

Primer selection: size of the amplicon

[Figure: primer map along the 16S rRNA gene. Primer labels with target positions: GM3 (8), 616Valt (8), GM5 / GM5-clamp (341), 518F (518), 518R (518), 907R (907), 945F (945), Bac1055F (1055), GM4 (1492), 630R (1529); one further label extracted as S1505]

ideally the almost complete gene (~1520 nucleotides) should be sequenced

many amplifications skip sequencing the helix 50 (~1490 nucleotides)

many clone libraries are based on just partial amplicons (~900 nucleotides)

The primer pair GM3 (8) – GM4 (1492) is the most widely used
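
The amplicon sizes quoted above follow roughly from the primer positions. As a rough sketch, assuming each position marks one end of the amplified stretch on the same numbering, the expected length is the difference between the two positions plus one. The function name is hypothetical, and the GM3 – 907R pairing is shown only as an illustration of a partial amplicon, not as a pair named in the slides.

```python
# Primer positions as given in the slide.
PRIMER_POSITIONS = {"GM3": 8, "907R": 907, "GM4": 1492}


def approx_amplicon_length(forward_pos, reverse_pos):
    """Approximate amplicon length spanned by a forward and a reverse primer."""
    if reverse_pos <= forward_pos:
        raise ValueError("reverse primer must lie downstream of the forward primer")
    return reverse_pos - forward_pos + 1


# Pair named in the slide: GM3 (8) - GM4 (1492).
almost_complete = approx_amplicon_length(
    PRIMER_POSITIONS["GM3"], PRIMER_POSITIONS["GM4"]
)

# Illustrative pairing only (assumption): GM3 with 907R.
partial = approx_amplicon_length(
    PRIMER_POSITIONS["GM3"], PRIMER_POSITIONS["907R"]
)

print(almost_complete, partial)
```

The two results, 1485 and 900, are close to the ~1490 and ~900 nucleotide figures quoted for almost complete and partial amplicons.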

Page 6:

16S rRNA sequencing has grown exponentially in parallel to the development of sequencing techniques

Yarza et al., Nature Revs. 2014. 12: 635-645

Tamames & Rosselló-Móra 2012 TIM 20:514-516

[Figure: timeline of sequencing techniques: rRNA cataloguing, radioactive Sanger sequencing, non-radioactive Sanger sequencing, reverse transcription sequencing, NGS]

The database is exponentially increasing

99% environmental sequences

1% cultured organisms

3.8 × 10^6 sequences

700,000 per year (last three years)

Sources of sequences and quality

rRNA cataloguing (up to the late 1980s), bad quality

reverse transcription sequencing (up to the late 1990s), bad quality

Sanger methods (radioactive, biotin-labelled, terminal-dye… still in use)

  cloning DNA, good quality

  direct amplification, good quality

  DGGE/TGGE, short sequences, bad quality

NGS, short sequences

  454 technology (now up to 800 nucleotides, mean of 500 nucleotides), moderate quality

  Illumina (now 2 × 250 nucleotides), too short

Page 7:

16S rRNA sequencing has grown exponentially in parallel to the development of sequencing techniques

Quast et al., 2013, Nuc Acid Res. 41: D590-D596

www.arb-silva.de

SILVA release 119 (July 2014)

rate of rejection of about 30% of the existing sequences

short sequences are generally worse than long stretches
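
A rejection rate of this kind comes from applying quality filters to every incoming sequence. The sketch below shows the general idea with two simple checks, minimum length and fraction of ambiguous bases. The thresholds and function names are hypothetical and are not the actual SILVA criteria.

```python
# IUPAC ambiguity codes for nucleotides (anything other than A, C, G, T/U).
AMBIGUOUS = set("NRYKMSWBDHV")


def passes_quality(seq, min_length=300, max_ambiguous_fraction=0.02):
    """Return True if a sequence meets the (hypothetical) length and ambiguity limits."""
    seq = seq.upper()
    if len(seq) < min_length:
        return False
    ambiguous = sum(1 for base in seq if base in AMBIGUOUS)
    return ambiguous / len(seq) <= max_ambiguous_fraction


def rejection_rate(seqs, **limits):
    """Fraction of sequences that fail the quality checks."""
    if not seqs:
        return 0.0
    rejected = sum(1 for seq in seqs if not passes_quality(seq, **limits))
    return rejected / len(seqs)


# Toy input (invented data): one good sequence, one too short, one too ambiguous.
toy_sequences = ["ACGT" * 100, "ACGT" * 50, "ACGN" * 100]
print(rejection_rate(toy_sequences))
```

Real pipelines add further checks, but the principle is the same: each sequence is tested against fixed criteria, and the failures make up the rejected fraction.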

Page 8:

We divided the 16S rRNA gene into 6 regions of 250 nucleotides

- Calculated taxa recovery in each stretch

- Compared with that of the full sequence

Regions V1 & V2

Regions V3 & V4

Regions V5 & V6

Category   Minimum 16S rRNA sequence identity
Species    98.7%
Genus      94.5%
Family     86.5%
Order      82.0%
Class      78.5%
Phylum     75.0%

Yarza et al., Nature Revs. 2014. 12: 635-645
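
The comparison between 250-nucleotide stretches and the full sequence can be sketched in a few lines. This is a simplified illustration, not the analysis of Yarza et al.: it assumes two already aligned, ungapped sequences, reads each threshold as "identity at or above this value is compatible with sharing that rank", and uses invented toy data. All function names are hypothetical.

```python
# Minimum identities per category, as listed in the table above.
RANK_THRESHOLDS = [
    ("species", 98.7),
    ("genus", 94.5),
    ("family", 86.5),
    ("order", 82.0),
    ("class", 78.5),
    ("phylum", 75.0),
]


def percent_identity(a, b):
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def lowest_shared_rank(identity):
    """Most specific category whose minimum identity is met, or None."""
    for rank, minimum in RANK_THRESHOLDS:
        if identity >= minimum:
            return rank
    return None


def windows(seq, size=250):
    """Split a sequence into consecutive stretches of `size` nucleotides."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


# Toy pair (invented): 1500 nt, with 25 mismatches packed into the first stretch.
seq_a = "A" * 1500
seq_b = "C" * 25 + "A" * 1475

full_rank = lowest_shared_rank(percent_identity(seq_a, seq_b))
window_ranks = [
    lowest_shared_rank(percent_identity(wa, wb))
    for wa, wb in zip(windows(seq_a), windows(seq_b))
]
print(full_rank, window_ranks)
```

In this toy case the full sequence places the pair at genus level, the first stretch only at family level, and the other five stretches at species level, so a single region can give either a more divergent or a more similar picture than the complete gene.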

Page 9:

- 77% of the 16S rRNA gene sequences are < 900 bp

- The 5' region (V1-V2) overestimates species

- The remaining regions tend to underestimate all taxa

- With increasing length, results tend to mirror those of the full sequence

Yarza et al., Nature Revs. 2014. 12: 635-645

Page 10:

Size & information content

complete sequences give complete information

partial sequences lose phylogenetic signal

short sequences lose resolution

1500 nucleotides

900 nucleotides

300 nucleotides
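
A quick piece of arithmetic shows why shorter sequences lose resolution. Taking the 98.7% species-level identity threshold of Yarza et al. (2014), the sketch below counts how many mismatches two sequences of a given length may have while staying at or above that threshold. It assumes an ungapped pairwise comparison over the whole length; the function name is hypothetical.

```python
# Species-level minimum identity, in tenths of a percent (98.7%),
# kept as an integer to avoid floating-point rounding.
SPECIES_MIN_IDENTITY_PER_MILLE = 987


def mismatch_budget(length, min_identity_per_mille=SPECIES_MIN_IDENTITY_PER_MILLE):
    """Largest number of mismatches that keeps identity at or above the threshold."""
    return length * (1000 - min_identity_per_mille) // 1000


for length in (1500, 900, 300):
    print(length, mismatch_budget(length))
```

The budget shrinks from 19 mismatches at 1500 nucleotides to 11 at 900 and only 3 at 300, so in a short read a handful of sequencing errors or variable sites is enough to move a sequence across the threshold.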

Page 11:

Primer selection & size of amplicons

selection of primers is important for representative results

the length of the amplified/sequenced gene determines whether the phylogenetic signal is adequate

short sequences may lose resolution