Practical Guide to the $1000 Genome (2014)

A Practical Guide to the $1000 Genome

Michael Heltzen, CEO & Co-Founder

Shawn C. Baker, Ph.D., CSO & Co-Founder

The Sequencing Marketplace

Match researchers with sequencing providers

Neutral stance

Unique perspective

Where to start?

How do I communicate it?

Pick 2, but not all 3…

The lab’s side of the problem

Overcapacity…

What is our value proposition?

What is an optimal customer for us?

Buyers don’t know what they want?

How do I price?

Should I generalize or specialize?

Technologies and needs change all the time…

Lack of standards

Why are standards so hard for us as an industry?

How does AllSeq work?

AllSeq connects researchers who have NGS needs with the most suitable lab for each project

It works like this

Project design & QA

Offers & Picking a lab

Match & talks

Human and diseases

Virus and bacteria

Plants and animals

Over to Shawn and the $1000 Genome

The $1000 genome is here!

(sort of…)

The HiSeq X Ten: What is it?

Data output:

– 600 Gb/day

– 1.8 Tb/run

– ~5 whole human genomes/day

– 1800 genomes per year
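The throughput figures above hang together if one assumes roughly 120 Gb of raw data per human genome. That per-genome figure is an assumption made here for illustration (it matches the ">120Gb" quoted for the example dataset), and the function names are hypothetical. A minimal sketch of the arithmetic:

```python
# Back-of-the-envelope check of the per-instrument throughput figures.
# ASSUMPTION: ~120 Gb of raw data per human genome (not stated on this slide).
GB_PER_DAY = 600       # from the slide
GB_PER_RUN = 1800      # 1.8 Tb/run, from the slide
GB_PER_GENOME = 120    # assumed

def genomes_per_day(gb_per_day=GB_PER_DAY, gb_per_genome=GB_PER_GENOME):
    """Genomes one instrument can finish per day of sequencing."""
    return gb_per_day / gb_per_genome

def run_length_days(gb_per_run=GB_PER_RUN, gb_per_day=GB_PER_DAY):
    """Days needed to produce one full run at the daily output rate."""
    return gb_per_run / gb_per_day

def genomes_per_year(days=365):
    """Upper bound on genomes per year, ignoring any downtime."""
    return genomes_per_day() * days

print(genomes_per_day())    # 5.0
print(run_length_days())    # 3.0
print(genomes_per_year())   # 1825.0
```

The no-downtime upper bound of 1825 genomes is close to the 1800 genomes per year quoted on the slide.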

Patterned flow cells

Improved optics

What’s the catch?

$1000 Genome = $800 (sequencing) + $135 (amortization) + $65 (library prep)

$1000 Genome = $1M (one instrument)

$1000 Genome = $10M (ten instruments)

1 day = $5000

1 year = $1,800,000 (one instrument)

1 year = $18,000,000 (ten instruments)

4 years = $72,000,000 (ten instruments)
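The dollar figures on these slides follow from the per-genome price and the throughput numbers quoted earlier. The sketch below reproduces them; the constant names are illustrative, and reading the "4 years" figure as ten instruments running for four years is an interpretation, not something the slides state outright.

```python
# Scaling the $1000 genome up to a full HiSeq X Ten installation.
# All figures come from the slides; names are illustrative only.
PRICE_PER_GENOME = 1000   # USD
COST_BREAKDOWN = {"sequencing": 800, "amortization": 135, "library prep": 65}
GENOMES_PER_DAY = 5       # per instrument
GENOMES_PER_YEAR = 1800   # per instrument
INSTRUMENTS = 10          # the HiSeq X Ten is a set of ten instruments
YEARS = 4                 # ASSUMPTION: horizon implied by the "4 years" slide

def spend(genomes, price=PRICE_PER_GENOME):
    """Total spend in USD for a given number of genomes."""
    return genomes * price

# The three cost components add up to the headline price.
assert sum(COST_BREAKDOWN.values()) == PRICE_PER_GENOME

per_day = spend(GENOMES_PER_DAY)
per_year = spend(GENOMES_PER_YEAR)
per_year_ten = per_year * INSTRUMENTS
four_years = per_year_ten * YEARS

print(per_day, per_year, per_year_ten, four_years)
# 5000 1800000 18000000 72000000
```

Read this way, the $1000 price only holds for a lab that keeps all ten instruments busy for the whole period.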

Allseq.com/1000-genome

…ACCATGATCTAGCCGATTTCGA…

…TGGTACTAGATCGGCTAAAGCT…

Whole Genome vs Exome

Whole Genome

~2.8 Gb = ~95% coverage



Exome Sequencing

~40 Mb = ~1.3% coverage
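The two percentages can be roughly reproduced by dividing the sequenced target by the genome size. The genome size used below (~3.0 Gb) is an assumption, not a figure from the slide; the slide's ~95% implies a slightly smaller denominator. A minimal sketch:

```python
# Fraction of the genome targeted by each approach.
# ASSUMPTION: a haploid human genome of ~3.0 Gb (value not on the slide).
GENOME_BP = 3.0e9
WGS_BP = 2.8e9     # ~2.8 Gb, from the slide
EXOME_BP = 40e6    # ~40 Mb, from the slide

def percent_of_genome(target_bp, genome_bp=GENOME_BP):
    """Percentage of the genome that the target region represents."""
    return 100 * target_bp / genome_bp

print(f"WGS:   {percent_of_genome(WGS_BP):.1f}%")    # WGS:   93.3%
print(f"Exome: {percent_of_genome(EXOME_BP):.1f}%")  # Exome: 1.3%
print(f"WGS targets {WGS_BP / EXOME_BP:.0f}x more bases than the exome")
```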



Whole Genome vs Exome

WGS Exome

Price ✓

Coverage ✓

Uniformity ✓

Analysis ✓

HiSeq X Ten Dataset

NA12878D and NA12878J – Coriell Cell Repository

Illumina TruSeq Nano, 2×150 bp, 350 bp insert

>120 Gb, 87% of bases >Q30
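For readers unfamiliar with the "Q30" shorthand: a Phred quality score Q corresponds to a base-call error probability of 10^(-Q/10), so Q30 means about one error in 1000 bases. The helper names below are hypothetical; the formula is the standard Phred definition.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert a base-call error probability back to a Phred score."""
    return -10 * math.log10(p)

# Show what common quality thresholds mean in terms of error rate.
for q in (10, 20, 30, 40):
    errors_in = round(1 / phred_to_error_prob(q))
    print(f"Q{q}: about 1 error in {errors_in:,} bases")
```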

Analyzing the Data

Primary

• Base calling

• QC

Secondary

• Assembly

• Alignment

Tertiary

• Annotations

• Visualization

• Statistics

Reporting

• Research

• Clinical

IT Infrastructure/Data Management

Analyzing the Data

fastq file:

@EAS54_6_R1_2_1_413_324
CCCTTCTTGTCTTCAGCGTTTCTCC
+
;;3;;;;;;;;;;;;7;;;;;;;88
@EAS54_6_R1_2_1_540_792
TTGGCAGGCCAAGGCCGATGGATCA
+
;;;;;;;;;;;7;;;;;-;;;3;83
@EAS54_6_R1_2_1_443_348
GTTGCTTCTGGCGTGGGTGGGGGGG
+EAS54_6_R1_2_1_443_348
;;;;;;;;;;;9;7;;.7;393333
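Each FASTQ record is four lines: an "@" header, the bases, a "+" separator, and one quality character per base. The sketch below parses the three example records and decodes their qualities. It assumes strict four-line records and a Phred+33 encoding; neither is confirmed by the slide, and the function names are hypothetical.

```python
# Minimal FASTQ reader for four-line records.
# ASSUMPTIONS: no line-wrapped sequences; qualities are Phred+33 encoded.
EXAMPLE = """\
@EAS54_6_R1_2_1_413_324
CCCTTCTTGTCTTCAGCGTTTCTCC
+
;;3;;;;;;;;;;;;7;;;;;;;88
@EAS54_6_R1_2_1_540_792
TTGGCAGGCCAAGGCCGATGGATCA
+
;;;;;;;;;;;7;;;;;-;;;3;83
@EAS54_6_R1_2_1_443_348
GTTGCTTCTGGCGTGGGTGGGGGGG
+EAS54_6_R1_2_1_443_348
;;;;;;;;;;;9;7;;.7;393333
"""

def parse_fastq(text):
    """Yield (read_id, sequence, quality_string) for each record."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual

def phred_scores(qual, offset=33):
    """Decode a quality string into integer Phred scores."""
    return [ord(c) - offset for c in qual]

def fraction_at_least(scores, threshold=30):
    """Fraction of bases whose score meets or exceeds the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

records = list(parse_fastq(EXAMPLE))
for read_id, seq, qual in records:
    scores = phred_scores(qual)
    print(read_id, len(seq), f"mean Q = {sum(scores) / len(scores):.1f}",
          f">=Q30: {fraction_at_least(scores):.0%}")
```

Under the Phred+33 assumption these short example reads contain no Q30 bases, which is far below the 87% quoted for the HiSeq X Ten dataset.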

Data Analysis & Interpretation

Medical report:

Example from knomeDISCOVERY

Analyzing the Data

Long Reads: PacBio

~2kb ~10kb

Long Reads: Moleculo

Moleculo TruSeq Synthetic Long Reads

10kb ‘synthetic’ reads

Long Reads: Oxford Nanopore

Single Cell/Cell-Free DNA Sequencing

Moving Beyond the Genome

Credits: Darryl Leja (NHGRI), Ian Dunham (EBI)

Topic: Researchers vs. clinicians

Trends: Transition to the Clinic

Researchers:

– Increased output

– Lower cost

– Rapid updates

Clinicians:

– Ease of use

– Quick turnaround time (TAT)

– Stability

Approval trend: Transition to the Clinic

MiSeq Dx

– FDA clearance Nov 2013

– Will also submit the HiSeq 2500 and an NIPT assay

PGM

– Listed with FDA Sept 2014

Opportunities and challenges

What is great

– We are getting there…

– It is getting faster, better and cheaper

– More and more people are starting to understand

What is not so great

– We are not there yet

– We are not even as far along as many people think we are

– Lack of standards (especially for the clinical market)

First: The bad part

Technical error sources:

– Sampling

– Sequencing

– Bioinformatics

– Interpretation

Lack of standards…

Then: The good part

Large steps in the right direction on all fronts. It is only a matter of time now…

The new genomics technologies are slowly maturing for the clinic!

We are collectively making the world a better place!

www.allseq.com
