Introduction to NGS

Introduction to NGS. Ana Conesa, Head of Genomics of Gene Expression Lab, Centro de Investigaciones Príncipe Felipe. [email protected] http://bioinfo.cipf.es/aconesa


Introduction to NGS - Ana Conesa - Massive sequencing data analysis workshop - Granada 2011

Transcript of Introduction to NGS

Page 1: Introduction to NGS

Introduction to NGS

Ana Conesa

Head of Genomics of Gene Expression Lab

Centro de Investigaciones Príncipe Felipe

[email protected]

http://bioinfo.cipf.es/aconesa

Page 2: Introduction to NGS

Next Generation Sequencing

NGS has not only brought high speed to genome sequencing and personalized medicine, it has also changed the way we do genome research:

Got a question about genome organization?

SEQUENCE IT!!!!

Page 3: Introduction to NGS

NGS technologies

Cost-effective

Fast

Ultra-high throughput

Cloning-free

Short reads

Page 4: Introduction to NGS

Roche 454 pyrosequencing

Page 5: Introduction to NGS

Roche 454 pyrosequencing

Page 6: Introduction to NGS

Roche 454

Page 7: Introduction to NGS

GS Junior, benchtop

Page 8: Introduction to NGS

Solexa

Page 9: Introduction to NGS

Solexa

Page 10: Introduction to NGS

Solexa-HiSeq

200 Gb/run in 8 days

2x100 bp fragments

2 billion reads per run
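The three HiSeq figures above are consistent with each other, which a few lines of arithmetic can confirm. The sketch below reads the "2 billion reads" as individual 100 bp reads; the 3 Gb human genome size used for the coverage estimate is an assumption and does not come from the slide.

```python
# Figures quoted on the slide for one HiSeq run.
READS_PER_RUN = 2_000_000_000   # 2 billion reads
READ_LENGTH_BP = 100            # 2x100 bp paired-end reads
RUN_DAYS = 8

# Assumption (not on the slide): a haploid human genome of about 3 Gb.
ASSUMED_GENOME_BP = 3_000_000_000


def run_yield_bp(reads: int, read_length_bp: int) -> int:
    """Total number of sequenced bases in one run."""
    return reads * read_length_bp


def mean_coverage(yield_bp: int, genome_bp: int) -> float:
    """Average number of times each genome position is sequenced."""
    return yield_bp / genome_bp


total_bp = run_yield_bp(READS_PER_RUN, READ_LENGTH_BP)
coverage = mean_coverage(total_bp, ASSUMED_GENOME_BP)

print(f"Yield per run: {total_bp / 1e9:.0f} Gb")
print(f"Yield per day: {total_bp / 1e9 / RUN_DAYS:.0f} Gb")
print(f"Mean coverage of the assumed 3 Gb genome: {coverage:.0f}x")
```

Under these assumptions a single run would cover a human-sized genome many times over, which is why several samples are usually spread across the lanes of one run.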

Page 11: Introduction to NGS

Helicos

Page 12: Introduction to NGS

SOLiD

Page 13: Introduction to NGS

SOLiD

* Sequencing output in “color space”

* Needs reference genome to translate to base space.
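Color space can be illustrated with a small sketch. In SOLiD two-base encoding each color describes a pair of adjacent bases, so a read is a known first base followed by a string of colors. The code below uses the standard trick of giving each base a 2-bit code and taking the XOR of neighbours; the function names are illustrative and do not belong to any SOLiD software. It also shows why decoding without a reference is fragile: one wrong color changes every base after it, which is the reason reads are normally aligned to a reference in color space first.

```python
from typing import List

# 2-bit codes for the bases; the color of a dinucleotide is the XOR of its codes.
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_colors(bases: str) -> List[int]:
    """Translate a base sequence into its list of colors (0-3)."""
    return [BASE_TO_CODE[a] ^ BASE_TO_CODE[b] for a, b in zip(bases, bases[1:])]


def decode_colors(first_base: str, colors: List[int]) -> str:
    """Translate colors back to bases, starting from a known first base."""
    code = BASE_TO_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color  # each color tells how to get from one base to the next
        bases.append(CODE_TO_BASE[code])
    return "".join(bases)


colors = encode_colors("ATGGCA")
print(colors)
print(decode_colors("A", colors))

# A single miscalled color (second position) corrupts all downstream bases.
with_error = list(colors)
with_error[1] = 2
print(decode_colors("A", with_error))
```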

Page 14: Introduction to NGS

SOLiD 5500

* Fifth 3-based encoded primer

* Sequencing output in base space

* No reference needed

Page 15: Introduction to NGS

5500 xl-u SOLiD

180 Gb/run (microbeads)

300 Gb/run (nanobeads)

35-75 bp fragments

2.8 - 4.8 billion reads/run

2x6 lanes/run

96 bar-codes

99.99% accuracy

Page 16: Introduction to NGS

Pacific Biosciences

Real time DNA synthesis

Up to 12000 nt??

50 bases/second??

Page 17: Introduction to NGS

Ion Torrent

• $50,000

• $500/sample

• 1 hour/run

• > 200 nt lengths

• Reads H+ released by DNA polymerase

Page 18: Introduction to NGS

Comparison

Roche 454

• Long fragments
• Errors: poly nts
• Low throughput
• Expensive
• De novo sequencing:
• Amplicon sequencing

Solexa

• Short fragments
• Errors: Hexamer bias
• High throughput
• Cheap
• Resequencing:
• ChipSeq
• RNASeq
• MethylSeq

SOLiD

• Short fragments
• Color-space
• High throughput
• Cheap
• Resequencing:
• ChipSeq
• RNASeq
• MethylSeq

Page 19: Introduction to NGS

Applications

De novo sequencing

Resequencing

Exome Sequencing

RNA-seq

Genome annotation

Chip-seq

Methyl-seq

…….

Page 20: Introduction to NGS

Applications

De novo sequencing

Resequencing

Exome Sequencing

RNA-seq

Genome annotation

Chip-seq

Methyl-seq

…….

Page 21: Introduction to NGS

Basic steps NGS data processing

QC and read cleaning
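A common cleaning step is to trim low-quality bases from the 3' end of each read and then discard reads that become too short. The sketch below is a deliberately simple version of that idea and not the algorithm of any particular tool; the thresholds (quality 20, minimum length 30) are assumed values chosen for illustration.

```python
from typing import List, Tuple


def trim_low_quality_tail(
    sequence: str, qualities: List[int], min_quality: int = 20
) -> Tuple[str, List[int]]:
    """Drop bases from the 3' end until one reaches min_quality."""
    end = len(sequence)
    while end > 0 and qualities[end - 1] < min_quality:
        end -= 1
    return sequence[:end], qualities[:end]


def keep_read(sequence: str, min_length: int = 30) -> bool:
    """Decide whether a trimmed read is still long enough to be useful."""
    return len(sequence) >= min_length


# Hypothetical 10-base read whose quality decays towards the 3' end.
read = "ACGTACGTAC"
quals = [35, 36, 34, 30, 31, 28, 25, 12, 8, 2]

trimmed_read, trimmed_quals = trim_low_quality_tail(read, quals)
print(trimmed_read, trimmed_quals)
print(keep_read(trimmed_read, min_length=5))
```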

Page 22: Introduction to NGS

Basic steps NGS data processing

QC and read cleaning

Mapping

Page 23: Introduction to NGS

Basic steps NGS data processing

QC and read cleaning

Mapping

Feature identification

Page 24: Introduction to NGS

Basic steps NGS data processing

QC and read cleaning

Mapping

Feature identification

SNVs

Indels

Rearrang.

RPKM

Splicing

DNA Binding site
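Of the features listed above, RPKM (reads per kilobase of transcript per million mapped reads) has a simple closed form, so a short sketch may help. The gene length and read counts in the example are made-up numbers.

```python
def rpkm(reads_in_gene: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads.

    Dividing by the gene length (in kb) corrects for longer genes collecting
    more reads; dividing by the library size (in millions) corrects for
    sequencing depth.
    """
    return reads_in_gene * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2 kb transcript in a 25 M read library.
example_value = rpkm(reads_in_gene=500, gene_length_bp=2000, total_mapped_reads=25_000_000)
print(example_value)
```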

Page 25: Introduction to NGS

RNA-seq

Elucidate gene models

Quantify gene expression

Page 26: Introduction to NGS

RNA-seq

Elucidate gene models

Page 27: Introduction to NGS

RNA-seq protocol*

total RNA purification

oligodT

RiboZ

mRNA preparation

fragmentation

1st strand synthesis

2nd strand synthesis

RNA

DNA

*Solexa Paired-End

Page 28: Introduction to NGS

RNA-seq protocol (II)


adenylation of 3’ ends

ligate adapters

amplification

SEQUENCING!

library

[Gel figure: library next to a 100 bp ladder; bands labeled 400-200]

Page 29: Introduction to NGS

Strand-specific RNA-seq

Page 30: Introduction to NGS

Strand-specific RNA-seq

Page 31: Introduction to NGS

File formats

fastq: sequence data and qualities

SAM/BAM: mapping data and qualities
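Both formats are plain text underneath and easy to inspect. The sketch below parses one four-line FASTQ record and one SAM alignment line using only the standard library. It assumes Phred+33 quality encoding (older Solexa/Illumina pipelines used an offset of 64, so check your data), and the read name and alignment are invented. Real projects would normally rely on an established parser instead.

```python
from typing import Dict, List, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]


def parse_fastq_record(lines: List[str], offset: int = 33) -> FastqRecord:
    """Parse the four lines of one FASTQ record (header, bases, '+', qualities)."""
    header, sequence, separator, quality_string = (l.rstrip("\n") for l in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality_string):
        raise ValueError("sequence and quality lengths differ")
    # Each quality character encodes a Phred score as ASCII code minus offset.
    return FastqRecord(header[1:], sequence, [ord(c) - offset for c in quality_string])


def error_probability(phred: int) -> float:
    """Phred score Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-phred / 10)


# The 11 mandatory, tab-separated SAM columns.
SAM_FIELDS = ("qname", "flag", "rname", "pos", "mapq", "cigar",
              "rnext", "pnext", "tlen", "seq", "qual")


def parse_sam_line(line: str) -> Dict[str, object]:
    """Split one SAM alignment line and decode two common FLAG bits."""
    record: Dict[str, object] = dict(zip(SAM_FIELDS, line.rstrip("\n").split("\t")[:11]))
    for key in ("flag", "pos", "mapq", "pnext", "tlen"):
        record[key] = int(record[key])
    record["is_unmapped"] = bool(record["flag"] & 0x4)
    record["is_reverse"] = bool(record["flag"] & 0x10)
    return record


fastq = parse_fastq_record(["@read_1\n", "ACGT\n", "+\n", "II5!\n"])
sam = parse_sam_line("read_1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tII5!")
print(fastq)
print(sam["rname"], sam["pos"], sam["is_reverse"])
```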

Page 32: Introduction to NGS

Some Figures

How much does it “cost” (computationally) to sequence a human transcriptome?

1 Solexa run = 8 lanes

25 M reads/lane

2 x 4 GB fastq/lane (PE)

One human transcriptome: 100 million reads

32 GB disk space

Mapping: 24 hours on a 12-core processor with 48 GB RAM and 4 TB disk

SAM (ASCII) / BAM (binary) output: 36 GB / 9 GB
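The numbers on this slide follow from each other, as this back-of-the-envelope sketch shows. All inputs are the slide's figures; the only interpretation added here is that the 32 GB of disk space refers to the fastq files of the lanes needed for one transcriptome.

```python
# Figures taken from the slide.
LANES_PER_RUN = 8
READS_PER_LANE = 25_000_000
FASTQ_GB_PER_LANE = 2 * 4            # paired-end: two files of about 4 GB each
TRANSCRIPTOME_READS = 100_000_000
SAM_GB = 36
BAM_GB = 9

# Lanes needed for one transcriptome (ceiling division).
lanes_needed = -(-TRANSCRIPTOME_READS // READS_PER_LANE)

# Raw fastq volume for those lanes.
fastq_gb = lanes_needed * FASTQ_GB_PER_LANE

# How many transcriptomes fit in one run, and what BAM saves over SAM.
transcriptomes_per_run = LANES_PER_RUN // lanes_needed
bam_compression = SAM_GB / BAM_GB

print(f"Lanes per transcriptome: {lanes_needed}")
print(f"fastq per transcriptome: {fastq_gb} GB")
print(f"Transcriptomes per run: {transcriptomes_per_run}")
print(f"BAM is {bam_compression:.0f}x smaller than SAM")
print(f"fastq + BAM kept on disk: {fastq_gb + BAM_GB} GB")
```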

Page 33: Introduction to NGS

Applications of RNAseq

Qualitative:

* Alternative splicing
* Antisense expression
* Extragenic expression
* Alternative 5’ and 3’ usage
* Detection of fusion transcripts

….

Tools: Tophat/Cufflinks, Scripture, Alexa

Quantitative:

* Differential expression
* Dynamic range of gene expression

….

Tools: edgeR, DESeq, baySeq, NOISeq

Page 34: Introduction to NGS

Advantages of RNAseq?

RNAseq

* Non-targeted transcript detection
* No need for a reference genome
* Strand specificity
* Finds novel splice sites
* Larger dynamic range
* Detects expression and SNVs
* Detects rare transcripts

….

Microarrays

* Restricted to probes on the array
* Needs genome knowledge
* Normally not strand-specific
* Exon arrays are difficult to use
* Smaller dynamic range
* Does not provide sequence information
* Rare transcripts are difficult to detect

….

And… are there any disadvantages?

Page 35: Introduction to NGS

Resequencing

Page 36: Introduction to NGS

Exome Sequencing

DNA (patient)

Gene A Gene B

1. Produce shotgun library

2. Capture exon sequences

3. Wash & Sequence

4. Map against reference genome

5. Determine variants, filter, compare patients

candidate genes

Page 37: Introduction to NGS

Exome capture

Page 38: Introduction to NGS

The principle: comparison of patients

Patient 1

Patient 2

Patient 3

Patient 4

Patient 5

Patient 6

mutation

candidate gene (carries a mutation in all patients)
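The comparison principle amounts to a set intersection: collect the genes that carry a filtered variant in each patient and keep those common to all of them. The sketch below uses invented patient and gene names; a real analysis would first filter out common polymorphisms and would often relax the rule to "mutated in most patients".

```python
from typing import Dict, Set


def candidate_genes(mutated_genes_by_patient: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes that carry a mutation in every patient."""
    gene_sets = list(mutated_genes_by_patient.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical data: genes with a filtered variant in each patient.
patients = {
    "patient_1": {"GENE_A", "GENE_C", "GENE_F"},
    "patient_2": {"GENE_B", "GENE_C"},
    "patient_3": {"GENE_C", "GENE_D", "GENE_F"},
    "patient_4": {"GENE_C", "GENE_E"},
}

candidates = candidate_genes(patients)
print(candidates)
```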

Page 39: Introduction to NGS

ChipSeq

Page 40: Introduction to NGS

MethylSeq

Page 41: Introduction to NGS

MIDseq

Page 42: Introduction to NGS

Census NGS methods

Page 43: Introduction to NGS

Success Stories

Page 44: Introduction to NGS
Page 45: Introduction to NGS
Page 46: Introduction to NGS

Miller syndrome

Page 47: Introduction to NGS
Page 48: Introduction to NGS
Page 49: Introduction to NGS

Species composition of metagenomic DNA extracted from mammoth hair.

Page 50: Introduction to NGS

Conclusions

NGS is revolutionizing how we do genome research

Page 51: Introduction to NGS

Conclusions

NGS is revolutionizing how we do genome research

But it will also revolutionize our lives….

Page 52: Introduction to NGS

Conclusions

NGS is revolutionizing how we do genome research

But it will also revolutionize our lives….

If we manage to process and analyze the data

Page 53: Introduction to NGS

YOUR SUCCESS STORY???

Have a great MDA course!