Introduction to RNA-Seq in GeneSpring NGS Software

Dipa Roy Choudhury, Ph.D.
Strand Scientific Intelligence and Agilent Technologies

Learn more at www.genespring.com



Introduction to RNA-Seq

• In a few years, massively parallel cDNA sequencing, or RNA-seq, has allowed many advances in the characterization and quantification of transcriptomes.

• Rapidly decreasing sequencing costs and massively parallel sequencing technologies have resulted in a dramatic increase in the quantity of data that needs to be analyzed.

• Therefore, the need of the day is a tool that enables the analysis and integration of data produced on multiple platforms and using multiple methods.

• Agilent has designed NGS analysis in GeneSpring with the biologist in mind: someone who is interested in answering REAL biological questions and does not want to become a bioinformatics expert just to do their work.


GeneSpring NGS

Agilent SureSelect Target Enrichment

5. Map Reads to the Ref. Genome

6. Quality Control on Mapping

7. Detect SNPs/InDels or Diff. Spliced Genes

8. Find Biological Relevance for your Results


GeneSpring NGS Provides Downstream Analysis of Next-Gen Sequencing Data

Figure: analysis pipeline.

Primary Analysis: Control Software produces a Data File (Reads + Quality).

Secondary Analysis: ELAND/BIOSCOPE/BWA… produce reads aligned to the genome.

Tertiary Analysis: GeneSpring NGS.

File formats shown: FASTQ, … BAM, …


Questions we can seek to answer using the GeneSpring RNA-Seq Workflow:

What are the differentially expressed genes?

What are the differentially spliced genes?

Are there any SNPs in the transcriptome?

Can we identify gene fusion events?

Can we identify novel genes, novel exons, and novel splice junctions?


Import Data and Annotations


Download Human build HG18 from the Agilent Server

Open the Human “tree”

Select hg18 and Homologene Groups

Click [UPDATE] to start the download

Annotation files can be quite large


Baits and Targets for SureSelect Catalog Kits are Pre-loaded

List of Agilent SureSelect Catalog Kits


Creating a new Experiment

Click the “New Experiment” icon to start a new experiment

Or select “New Experiment” from the Project menu


Create New GeneSpring NGS Experiment

Provide a useful and descriptive name

Be sure to select “NGS” as the Analysis type


Importing Data: Choosing Metadata

New organisms supported on demand

Indicate organism, build and transcript model

Prepackaged annotations available for a variety of organisms

Support for Illumina, Life Technologies, and Roche 454

Support for single end, paired end, and mate pair protocols


Provide Information on Sequencing platform and library layout for SureSelect specific kits

Be sure to select the previously loaded Target region


GeneSpring NGS: Organization of the Windows

Project and Experiment Navigator

Workflow Browser

Region View


Quality Control


Perform QC on reads and filter anomalous reads

View reads by tile/lane and remove reads in anomalous tiles

Base qualities in the same tiles show bad

Perfect match reads and too many mismatch reads in this tile.


Quality Inspection

Open Quality Control Manager

Press Compute to calculate the Library QC metrics


July 11 Page 18

QC Manager feature in SureSelect experiments

Targeted Regions (SureSelect Baits)

Off-target reads can be removed to focus analysis
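The idea of removing off-target reads can be sketched as a simple interval-overlap filter. This is only an illustration, not GeneSpring's implementation: the tuple layout, the half-open coordinates, the function names, and the toy target regions are all assumptions made for the example.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed layout.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def on_target_reads(reads: List[Interval], targets: List[Interval]) -> List[Interval]:
    """Keep only the reads that overlap at least one target region."""
    return [r for r in reads if any(overlaps(r, t) for t in targets)]


# Toy data (hypothetical coordinates).
targets = [("chr1", 100, 200), ("chr2", 500, 600)]
reads = [
    ("chr1", 150, 186),  # inside a target
    ("chr1", 300, 336),  # off-target
    ("chr2", 590, 626),  # partially overlaps a target
    ("chr3", 100, 136),  # chromosome without targets
]
kept = on_target_reads(reads, targets)
print(kept)
```

A linear scan over the targets is enough for a sketch; a real tool would more likely index the target regions (for example with a sorted list or an interval tree).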


Determine the expression values for each gene


Run Quantification to determine raw gene counts

Which reads contribute to a gene’s count?

These reads do NOT contribute

Only reads overlapping exonic regions contribute to the read count

Multiply mapping reads contribute fractionally to the count
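The two counting rules on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not GeneSpring's algorithm: the record layouts and names are hypothetical, and a read with n alignments is assumed to contribute 1/n per alignment, which is one common way to realize "fractional" counting.

```python
from collections import defaultdict


def gene_counts(alignments, exons):
    """Raw gene counts from alignments.

    alignments: list of (read_id, chrom, start, end), one entry per alignment.
    exons:      list of (gene, chrom, start, end).
    Coordinates are half-open. Both layouts are assumptions for this sketch.
    """
    # Number of alignments per read; a read with n alignments gets weight 1/n.
    n_hits = defaultdict(int)
    for read_id, _chrom, _start, _end in alignments:
        n_hits[read_id] += 1

    counts = defaultdict(float)
    for read_id, chrom, start, end in alignments:
        weight = 1.0 / n_hits[read_id]
        # Only alignments overlapping an exon contribute to a gene's count.
        hit_genes = {g for g, c, s, e in exons if c == chrom and start < e and s < end}
        for gene in hit_genes:
            counts[gene] += weight
    return dict(counts)


# Toy data (hypothetical genes and coordinates).
exons = [("GENE_A", "chr1", 100, 200), ("GENE_B", "chr1", 1000, 1100)]
alignments = [
    ("r1", "chr1", 120, 156),    # unique, exonic in GENE_A
    ("r2", "chr1", 400, 436),    # unique, not in any exon: does not contribute
    ("r3", "chr1", 150, 186),    # r3 maps twice: once in GENE_A ...
    ("r3", "chr1", 1010, 1046),  # ... and once in GENE_B
]
counts = gene_counts(alignments, exons)
print(counts)
```

With this toy input, GENE_A receives one full read plus half of the multiply mapping read, and GENE_B receives the other half.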


Quantification

Quantify Genes, Transcripts and Exons

Reverted to the All Aligned Reads list to establish a baseline. Feel free to try a different (filtered) read list

When unchecked, only reads falling completely inside an exon are counted.

New genes and exons are discovered using conservation data (Conservation track in Annotations)


Filter genes by RPKM: Results of Filtering

Profile plot of genes that pass the filter criteria
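For readers unfamiliar with the unit, RPKM is reads per kilobase of exon model per million mapped reads: RPKM = 10^9 × C / (N × L), where C is the gene's read count, N the total number of mapped reads, and L the gene's exonic length in base pairs. A minimal sketch of computing it and applying a filter follows; the function names, the toy counts, and the threshold of 5 are assumptions chosen for illustration, not values taken from the slides.

```python
def rpkm(read_count: float, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def filter_by_rpkm(gene_rpkm: dict, min_rpkm: float) -> dict:
    """Keep genes whose RPKM is at least min_rpkm (threshold is user-chosen)."""
    return {gene: value for gene, value in gene_rpkm.items() if value >= min_rpkm}


# Toy data: gene -> (raw read count, exonic length in bp); hypothetical values.
TOTAL_MAPPED = 10_000_000
raw = {"GENE_A": (500, 2000), "GENE_B": (10, 1000)}

gene_rpkm = {g: rpkm(c, length, TOTAL_MAPPED) for g, (c, length) in raw.items()}
passed = filter_by_rpkm(gene_rpkm, min_rpkm=5.0)
print(gene_rpkm, passed)
```

Dividing by both length and sequencing depth is what makes values comparable between long and short genes and between samples of different depth.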


Handling overlapping genes

Which reads contribute to a gene’s count?

These reads contribute to both genes except for ABI data which is strand specific

Overlapping gene on negative strand

Gene on positive strand
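The overlapping-gene rule can be illustrated with a small strand-aware lookup. This is a sketch under assumptions: the record layouts and names are hypothetical, and a strand-specific read is assumed to carry the same strand as its gene (real protocols may report the opposite strand, which would flip the comparison).

```python
def genes_for_read(read, genes, strand_specific):
    """Return the genes a read is counted toward.

    read:  (chrom, start, end, strand)
    genes: list of (name, chrom, start, end, strand)
    Coordinates are half-open; both layouts are assumptions for this sketch.
    """
    chrom, start, end, strand = read
    hits = []
    for name, g_chrom, g_start, g_end, g_strand in genes:
        if g_chrom != chrom or not (start < g_end and g_start < end):
            continue  # no positional overlap
        if strand_specific and g_strand != strand:
            continue  # strand-specific data: the read's strand must match the gene's
        hits.append(name)
    return hits


# Two hypothetical genes overlapping on opposite strands.
genes = [
    ("GENE_POS", "chr1", 100, 500, "+"),
    ("GENE_NEG", "chr1", 300, 800, "-"),
]
read = ("chr1", 350, 386, "+")

both = genes_for_read(read, genes, strand_specific=False)
only_pos = genes_for_read(read, genes, strand_specific=True)
print(both, only_pos)
```

Without strand information the read in the overlap counts toward both genes; with it, the ambiguity disappears.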


Determine expression values normalized across samples


Scatter Plot between Two Replicates


Detection of differentially expressed genes


Output of Differential Expression Analysis

P-values, corrected p-values and fold changes

Volcano Plot
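To make the three output columns concrete, here is a hedged sketch of how corrected p-values and fold changes are commonly derived. The slides do not say which multiple-testing correction GeneSpring NGS applies; Benjamini-Hochberg is used here purely as a familiar example, and the pseudocount in the fold change is likewise an assumption.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of condition B over condition A; pseudocount avoids log(0)."""
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


# Toy inputs (hypothetical p-values and expression means).
pvals = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(pvals)
lfc = log2_fold_change(3.0, 15.0)
print(adj, lfc)
```

A volcano plot then typically places the log2 fold change on the x-axis and -log10 of the p-value on the y-axis, so genes that are both strongly changed and significant stand out in the upper corners.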


Determine differentially spliced genes


Identifying Differentially Spliced Genes

Compute the proportion of a gene’s count that can be ascribed to a particular transcript.

If the proportion for a particular transcript changes substantially across conditions, the gene is said to be differentially spliced.
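The definition above can be sketched directly. This is an illustration only: the slides do not state how "substantially" is decided, so the 0.2 cutoff, the function names, and the toy counts are assumptions, and a real analysis would use a statistical test across replicates rather than a fixed cutoff.

```python
def transcript_proportions(transcript_counts):
    """Fraction of a gene's count ascribed to each transcript."""
    total = sum(transcript_counts.values())
    if total == 0:
        return {t: 0.0 for t in transcript_counts}
    return {t: c / total for t, c in transcript_counts.items()}


def max_proportion_shift(counts_a, counts_b):
    """Largest absolute change in any transcript's proportion between conditions."""
    pa = transcript_proportions(counts_a)
    pb = transcript_proportions(counts_b)
    transcripts = set(pa) | set(pb)
    return max((abs(pa.get(t, 0.0) - pb.get(t, 0.0)) for t in transcripts), default=0.0)


# Toy transcript-level counts for one gene (hypothetical values).
normal = {"T1": 80, "T2": 20}
tumor = {"T1": 30, "T2": 70}

shift = max_proportion_shift(normal, tumor)
MIN_SHIFT = 0.2  # assumed cutoff for "changes substantially"
differentially_spliced = shift >= MIN_SHIFT
print(shift, differentially_spliced)
```

Note that the gene's total count is the same in both conditions here, so this gene would look unchanged in a gene-level differential expression test even though its isoform usage has flipped.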


The Challenge: Deconvoluting Transcript Read Counts

Which of the 4 transcripts do these reads come from?


Differential Splicing Analysis

View Results in Gene View

Ensure the Splicing Analysis Results Entity list is selected

Click the “Gene View” icon to show Gene View


Differential Splicing Analysis

Gene View

This transcript is expressed less in Tumor

Gene’s RPKM

4 Transcript RPKMs

Transcript RPKM

This transcript is expressed more in Tumor

Possible New Exon


Determine SNPs in the transcriptome


GeneSpring NGS has a built-in SNP calling algorithm

Set filters for SNP statistical significance

Set filters for min number of overlapping reads and min number of overlapping variant reads
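The two read-depth filters can be sketched as a simple predicate over candidate positions. This is not GeneSpring's SNP caller: the record type, the field names, and the thresholds of 10 and 3 are assumptions for illustration, and the statistical-significance filter mentioned on the slide is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate SNP position (hypothetical record layout)."""
    chrom: str
    pos: int
    total_reads: int    # reads overlapping the position
    variant_reads: int  # overlapping reads that carry the variant base


def passes_filters(c: Candidate, min_total_reads: int, min_variant_reads: int) -> bool:
    """Apply the coverage filter and the variant-support filter."""
    return c.total_reads >= min_total_reads and c.variant_reads >= min_variant_reads


# Toy candidates (hypothetical positions and depths).
candidates = [
    Candidate("chr1", 1000, total_reads=40, variant_reads=18),  # well supported
    Candidate("chr1", 2000, total_reads=6, variant_reads=5),    # coverage too low
    Candidate("chr2", 3000, total_reads=50, variant_reads=1),   # too few variant reads
]
kept = [c for c in candidates if passes_filters(c, min_total_reads=10, min_variant_reads=3)]
print([(c.chrom, c.pos) for c in kept])
```

In RNA-Seq data the coverage at a position depends on how strongly the gene is expressed, so the minimum-coverage filter also limits SNP calls to sufficiently expressed regions.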


GeneSpring NGS calls transcript effects for each SNP and allows filtering of SNPs based on these effects

Types of effects predicted

Change in Amino Acid for Non-synonymous SNPs


Viewing SNPs in the Genome Browser

Color-coded indicator for a Homozygous SNP

Known in dbSNP

GeneSpring NGS SNP Call

In a Repeat Region


Determine chimeric transcripts or fusion genes


Identify Fusion Genes

In a K562 Leukemia cell line, GeneSpring NGS confirms the well-known BCR-ABL1 gene fusion.

Filters set on the Genome Browser to show only translocated reads

Several read pairs for the BCR gene on chr22 with mates translocated to the ABL1 gene on chr9

The corresponding paired reads for the ABL1 gene on chr9
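The evidence shown on this slide, read pairs whose two mates align to different chromosomes, can be tallied per gene pair as in the sketch below. It is an illustration rather than GeneSpring's method: the data layouts and function names are hypothetical, and the gene coordinates are made-up toy values, not the real genomic positions of BCR or ABL1.

```python
from collections import Counter


def gene_at(chrom, pos, genes):
    """Name of the first gene containing the position, or None."""
    for name, g_chrom, g_start, g_end in genes:
        if g_chrom == chrom and g_start <= pos < g_end:
            return name
    return None


def fusion_support(pairs, genes):
    """Count read pairs whose mates fall in two different genes on different chromosomes.

    pairs: list of ((chrom1, pos1), (chrom2, pos2)), one entry per read pair.
    """
    support = Counter()
    for (c1, p1), (c2, p2) in pairs:
        if c1 == c2:
            continue  # mates on the same chromosome are ignored in this sketch
        g1, g2 = gene_at(c1, p1, genes), gene_at(c2, p2, genes)
        if g1 and g2 and g1 != g2:
            support[tuple(sorted((g1, g2)))] += 1
    return support


# Made-up coordinates for illustration only.
genes = [("BCR", "chr22", 1000, 2000), ("ABL1", "chr9", 5000, 6000)]
pairs = [
    (("chr22", 1500), ("chr9", 5200)),   # mate translocated to chr9
    (("chr22", 1800), ("chr9", 5900)),   # mate translocated to chr9
    (("chr22", 1200), ("chr22", 1400)),  # ordinary pair within one gene
]
support = fusion_support(pairs, genes)
print(support)
```

Requiring several independent supporting pairs, as the slide's "several read pairs" suggests, helps separate a real fusion from isolated mapping artifacts.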


Detection of novel genes, exons, and splice junctions


Identify Novel Exons and Genes

In a mouse myoblast study, GeneSpring NGS determines a new exon for the FHL3 gene

Read clumps not aligned with a known exon

Novel exon determined by GeneSpring NGS, probably a new transcription start site

Add exon to gene if close to or within the gene, otherwise call it a new gene
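The assignment rule in the last callout can be sketched as a distance check. This is a hedged illustration: the slides do not say how "close to" is defined, so the 1000 bp cutoff, the function name, the gene name, and all coordinates below are assumptions.

```python
def assign_novel_exon(exon, genes, max_distance):
    """Return the gene a novel exon should be added to, or None for a new gene.

    exon:  (chrom, start, end)
    genes: list of (name, chrom, start, end)
    Coordinates are half-open; max_distance is an assumed "close to" cutoff in bp.
    """
    chrom, start, end = exon
    best = None  # (gene name, distance) of the nearest qualifying gene
    for name, g_chrom, g_start, g_end in genes:
        if g_chrom != chrom:
            continue
        if start < g_end and g_start < end:
            distance = 0                # exon lies within or overlaps the gene
        elif end <= g_start:
            distance = g_start - end    # exon is upstream of the gene
        else:
            distance = start - g_end    # exon is downstream of the gene
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (name, distance)
    return best[0] if best else None


# Hypothetical gene and read-clump coordinates.
genes = [("GENE_X", "chr4", 10000, 20000)]
near = assign_novel_exon(("chr4", 9500, 9700), genes, max_distance=1000)
far = assign_novel_exon(("chr4", 50000, 50200), genes, max_distance=1000)
print(near, far)
```

An exon found just outside a gene's annotated boundary, as in the first case, is the situation the slide describes as a probable new transcription start site.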


Identify Novel Splice Junctions

In a brain tissue expression study, GeneSpring NGS determines a new splice junction in the DTX3 gene when considering only RefSeq transcripts; this novel splice junction is corroborated by a UCSC transcript.

Solid lines show spliced reads connecting the 1st and 3rd exons of the RefSeq transcript

The corresponding novel splice junction found by GeneSpring NGS

Indeed, a known UCSC transcript that is not present in RefSeq validates this discovery


Biological Contextualization

Pathway Analysis


Agilent GeneSpring NGS for SureSelect

Display the Results on a Pathway


Questions we can seek to answer using the GeneSpring RNA-Seq Workflow

What are the differentially expressed genes?

What are the differentially spliced genes?

Are there any SNPs in the transcriptome?

Can we identify gene fusion events?

Can we identify novel genes, novel exons, and novel splice junctions?


Summary

• Differential expression and splicing analysis

• Novel gene, exon and alternative splicing discovery

• Gene Fusion Analysis

• SNP & InDel discovery and annotation with dbSNP


Summary in General

• GeneSpring NGS supports both SureSelect RNA-Seq experiments and RNA-Seq experiments that don’t use SureSelect.

• The workflow steps in the GeneSpring NGS application are application specific and change based on whether you are analyzing a DNA-Seq or RNA-Seq experiment.

• It is possible to integrate data produced on multiple platforms and using multiple methods in the same project in GeneSpring.

• Multiple visualization tools are available to query the data.


http://www.AVADIS-NGS.com