Post on 15-Aug-2019
1
George Church
Thanks to:
Personal Genome Project
Applied Biosystems, Helicos, Roche/454, Illumina, CGI, IBS, Affymetrix, Enzymatics
PGP Volunteers & Donors!
2
What about government requiring testing of babies for intelligence genes?
PKU: Phenylketonuria (phenylalanine hydroxylase deficiency)
Tested in nearly all 4M newborns per year
1 in 15,000 births
Close to 100% heritable (& 100% environmental)
Nutritional preventative
3
Is anonymity in genomics realistic? http://arep.med.harvard.edu/PGP/Anon.htm
(10) Re-identification after “de-identification” using other public data. A Group Insurance Commission list of birth date, gender, and zip code was sufficient to re-identify medical records of Governor Weld & family via voter-registration records (1998).
(9) Hacking. A hacker gained access to confidential medical info at the U. Washington Medical Center: 4000 files (names, conditions, etc., 2000).
(8) Combination of surnames from genotype with geographical info. An anonymous sperm donor was traced on the internet in 2005 by his 15-year-old son, who used his own Y-chromosome genealogy to access surname relations.
(7) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, racial features, dysmorphologies, etc. are known & the list is growing.
(6) Self-identification. An example of this at Celera undermined confidence in the investigators. Kennedy D. Science. 2002 297:1237. Not wicked, perhaps, but tacky.
(5) A tiny amount of DNA data in the public domain with a name leverages the rest. This would allow the vast amount of DNA data in the HapMap (or other study) to be identified. This can happen, for example, in court cases even if the suspect is acquitted.
(4) Laptop theft. 26 million Veterans' medical records, SSN & disabilities stolen Jun 2006.
(3) Unauthorized access to DNA-bearing samples (e.g. hair, dandruff, hand-prints, etc.).
(2) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance. Even blood chemistry can be identifying in some cases.
(1) Government subpoena. False positive IDs can be very disruptive.
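Item (10) describes a linkage attack: records stripped of names are joined to a named public list on shared quasi-identifiers. The following is a minimal sketch of that idea, not the method used in the 1998 case. The function name `reidentify`, the field names, and all records are invented for illustration.

```python
from collections import defaultdict

# Quasi-identifiers named on the slide: birth date, gender, zip code.
KEYS = ("birth_date", "gender", "zip")

def reidentify(deidentified, public_list, keys=KEYS):
    """Return {record_id: name} for records whose quasi-identifiers
    match exactly one person in the named public list."""
    index = defaultdict(list)
    for person in public_list:
        index[tuple(person[k] for k in keys)].append(person["name"])
    matches = {}
    for row in deidentified:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # only a unique match re-identifies the record
            matches[row["record_id"]] = names[0]
    return matches

# Invented example data (hypothetical people and records).
medical_records = [
    {"record_id": "r1", "birth_date": "1950-01-02", "gender": "M", "zip": "00001"},
    {"record_id": "r2", "birth_date": "1960-03-04", "gender": "F", "zip": "00002"},
    {"record_id": "r3", "birth_date": "1970-05-06", "gender": "F", "zip": "00003"},
]
voter_list = [
    {"name": "Voter A", "birth_date": "1950-01-02", "gender": "M", "zip": "00001"},
    {"name": "Voter B", "birth_date": "1960-03-04", "gender": "F", "zip": "00002"},
    {"name": "Voter C", "birth_date": "1960-03-04", "gender": "F", "zip": "00002"},
]

print(reidentify(medical_records, voter_list))
```

In this toy data only r1 is linked: r2 matches two voters and r3 matches none. The more quasi-identifiers a data set carries, the more records fall into the unique-match case.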
4
Personal Genome Project (PGP) -- ELSI
• Submitted May’03 to NIH, $10M approved Mar 2004 (technology)
• HMS IRB Human Subjects protocol approved Aug 2005. (Possibly unique in including identifiable traits)
• Highly-informed individuals consenting to potentially non-anonymous genomes & extensive phenotypes (medical records, imaging, omics). Scaling to 100K volunteers: http://pgen.us
• Cell lines in Coriell NIGMS Repository (B-cells, keratinocytes, fibroblasts)
Church GM (2005) The Personal Genome Project. Molecular Systems Biology doi:10.1038/msb4100040
Kohane IS, Altman RB (2005) Health-information altruists--a potentially critical resource. N Engl J Med. 353:2074-7.
McGuire AL, Gibbs RA (2006) Genetics. No longer de-identified. Science. 312:370-1.
5
PGP: comprehensive traits, [diseases], (treatments)
Hair: Baldness [alopecia] (minoxidil)
Eyes: [Near/Far-sightedness] (glasses); Iris color [ARMD] (glasses)
Face: [Developmental syndromes, Wrinkles] (Botox)
Brain: ADHD (Ritalin); Depression (Prozac); Headache (analgesics)
  Sleep & Circadian (caffeine, amphetamine, modafinil)
  Motion sickness (Dramamine and Scopolamine)
Ears: Sensitivity (hearing aids)
Nose: Shape [breathing disorders] (CPAP)
Lip: [Cleft palate] (surgery); [Hirsutism] (calcium thioglycolate)
Mouth: Halitosis, throat exams; aerosols [airborne pathogens]
  Digestion [reflux, gas, ulcer] (antibiotics, antacids, PPIs)
Back: Strain sensitivity [IDD] (analgesics)
Skin: Perspiration, Body odor, Pheromones (deodorants)
  Surface texture [psoriasis] (topicals, photo-treatments)
  Immune components [acne] (topical antibiotics)
  Skin color [vitamin D & sunburn] (supplements, SPF cream)
Hands: Dermatoglyphics [syndromes]; [Arthritis] (corticosteroids)
Internal sensors: Proprioceptor, Repetitive stress (NSAIAs)
Body: Height [Marfan] [short stature] (hGH)
  Weight [anorexia] [obesity] (Orlistat, Phentermine, Sibutramine)
  Allergies (antihistamines, cortisone, epinephrine, theophylline)
  Metabolic polymorphisms (vitamins, minerals, insulin)
Feet: Plantar fasciitis (orthotic shoes)
  Athlete’s foot (miconazole, itraconazole, terbinafine, salicylate)
6
Status quo “de-identification” problems & potential solutions
1) Less integrated, holistic, comprehensive
2) Less enabling of system-wide medicine
3) Subjects not informed enough to “opt-out”
4) Life-impacting info can’t be shared with subjects
5) False sense of anonymity (see http://pgen.us)
In contrast, PGP IRB emphasizes genetics education to enable:
1) Active subject participation, informed opt-out
2) Choice of research-only -OR- open public database
3) Genomics, EMRs, and traits questionnaire linked
4) Scalable to millions of research subjects, leveraging inexpensive geno/phenotypes
5) Possibly self-funding: early adopters pay for a less financially able (but well-informed) set
7
What if no treatment exists?
Huntington's chorea: Nancy Wexler’s family
Adrenoleukodystrophy: Augusto Odone’s son
Doug Melton’s son, Sam, has diabetes
Inspire personal health activism.
Parkinson’s disease: Michael J. Fox
Cancer, substance abuse: Betty Ford
Schizophrenia: Jim Watson’s son Rufus
8
Siebold 2004 “Crystal structure of HLA-DQ0602 that protects against type 1 diabetes and confers strong susceptibility to narcolepsy”
Causative alleles: cell/animal/drug models, physiological/anatomical mechanisms
MAO-A, FoxP2
9
Polony sequencing process: open-source software, hardware, wetware
Shendure, Porreca et al., Science 309:1728-32
10
Multiplex Cyclic Sequencing by Synthesis (next-gen, polonies on glass or beads)
Polymerase -or- Ligase
Shendure, Porreca, et al. 2005 Science
Mitra, et al. 2003 Analyt. Biochem.
1999 NAR
Illumina, IBS
AB-SOLiD, CGI
[Figure labels: G, A, C, T]
11
ACUCAUC…(3’)…TAGAGT????????????????TGAGTAG…(5’)
5’-Cy5-nnnnAnnnn-3’
5’-Cy3-nnnnGnnnn-3’
5’-TR-nnnnCnnnn-3’
5’-Cy3+Cy5-nnnnTnnnn-3’
5'PO4
Sequencing by Ligation (SBL) with fluorescent combinatorial 9-mers
Excitation (nm)   Emission (nm)
647               700
555               605
572               630
555               700
Shendure, Porreca, et al. (2005) Science 309:1728
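The probe set on this slide ties each fluorescent label to the base at the query position of the 9-mer (Cy5 to A, Cy3 to G, TR to C, Cy3+Cy5 to T). A minimal sketch of how one cycle could be turned into a base call is shown here. Taking the brightest channel is a simplification for illustration, not the published base-calling procedure, and the intensity values and function names are invented.

```python
# Label-to-base mapping taken from the probes listed on the slide.
LABEL_TO_PROBE_BASE = {"Cy5": "A", "Cy3": "G", "TR": "C", "Cy3+Cy5": "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def call_probe_base(intensities):
    """Pick the brightest channel and return the probe base it encodes.

    `intensities` maps a label name to a measured signal (arbitrary units).
    """
    brightest = max(intensities, key=intensities.get)
    return LABEL_TO_PROBE_BASE[brightest]

def call_template_base(intensities):
    """The ligated probe pairs with the template, so the template base
    at the query position is the complement of the probe base."""
    return COMPLEMENT[call_probe_base(intensities)]

# Hypothetical signals for four ligation cycles on one bead.
cycles = [
    {"Cy5": 910, "Cy3": 40, "TR": 35, "Cy3+Cy5": 60},
    {"Cy5": 30, "Cy3": 870, "TR": 55, "Cy3+Cy5": 90},
    {"Cy5": 25, "Cy3": 45, "TR": 790, "Cy3+Cy5": 20},
    {"Cy5": 80, "Cy3": 70, "TR": 30, "Cy3+Cy5": 650},
]

probe_bases = "".join(call_probe_base(c) for c in cycles)
template_bases = "".join(call_template_base(c) for c in cycles)
print(probe_bases, template_bases)
```

A real pipeline would also need image registration, channel normalization, and a quality score per call, none of which are modeled here.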
12
HPLC autosampler (96 wells)
syringe pump
Polony Sequencing Equipment: HMS/AB/APG
microscope with xyz controls
flow-cell
temperature control
13
2nd-generation sequencing
AB-SOLiD: $550K
36 flow-cells * 28 bp * 60M beads = 60 Gbp / 30-180 h run
36 * 28 * 2000 * 1 Mpix * 4 colors * 2 bytes = 16 Terabytes / run
Harvard-model-F07: $106K incl. computer. $14K support.
Open-source software, hardware, wetware.
Reduce reagent volume & per-volume cost 100X each.
E07 (Nikon), F07
Porreca, Terry
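The per-run figures on this slide can be checked with a few lines of arithmetic. The sketch below only multiplies the numbers quoted above. Reading "2000" as image frames per flow-cell per cycle and "1 Mpix" as one million pixels are assumptions, since the slide gives the factors without labels.

```python
# Sequence yield per run: flow-cells * read length * beads per flow-cell.
flow_cells = 36
read_length_bp = 28
beads_per_flow_cell = 60_000_000
bases_per_run = flow_cells * read_length_bp * beads_per_flow_cell

# Raw image data per run. The meaning of 2000 (frames) and the
# decimal megapixel are assumptions; the slide lists bare factors.
frames = 2000
pixels_per_frame = 1_000_000
colors = 4
bytes_per_pixel = 2
image_bytes_per_run = (flow_cells * read_length_bp * frames
                       * pixels_per_frame * colors * bytes_per_pixel)

# Instrument price ratio from the two list prices on the slide.
equipment_ratio = 550_000 / 106_000

print(f"{bases_per_run / 1e9:.1f} Gbp per run")
print(f"{image_bytes_per_run / 1e12:.1f} TB of images per run")
print(f"{equipment_ratio:.1f}X equipment cost ratio")
```

The products agree with the rounded values on the slide (60 Gbp, 16 Terabytes), and the price ratio is close to a 5X difference in equipment cost.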
14
Reducing costs of 2nd generation genome sequencing in an open-access model
(1) 5X reduction in equipment: F07: $106K
(2) >20X reads per run: 60M beads in each of 36 flow-cells = 60 Gbp per run (cf. 1G to 4G) (>= 3X coverage minimum: 3E-7 error rate)
(3) Kit costs are inflated 50X (relative to standard enzymes)
(4) Enzyme costs (e.g. Taq Pol) are inflated 100X.
(5) Flow-cell volume reduced 15-fold to improve flow, which also reduces reagent use (another 70-fold reduction in progress).
15
Selective genome sequencing
Shendure, et al. Science 309(5741):1728-32. Nilsson et al. (2006) Trends Biotechnol 24:83.
Red=Synthetic; Yellow=genome/cDNA
How do we optimize >100K 100mers ?
7 ways to capture alleles from genomic or cDNA
1. In vitro paired-tag library (for rearrangements)
2. Gap-fill
3. Cleave & ligate
4. Hybridize-select
5. Allelic-RNA-ratio
6. Mb region primers
7. Dilution haplotype
Zhang, Chou, Shendure, Li, Leproust, Dahl, Davis, Nilsson, Church
16
10 Mbp of oligos / $300 chip
8K    Atactic/Xeotron/Invitrogen   Photo-generated acid
12K   Combimatrix                  Electrolytic
44K   Agilent                      Ink-jet standard reagents
380K  Nimblegen/GA                 Photolabile 5' protection
Tian et al. Nature 432:1050
Carr & Jacobson 2004 NAR
Smith & Modrich 1997 PNAS
~1000X lower oligo costs
Amplify pools of 50mers using flanking universal PCR primers & 3 paths to 10X error correction
17
Circle Capture Oligos from Chips
18
Circle Capture
2, 4: duplicate capture experiments
3, 5: duplicate controls (no genome)
Expected: 140–271 bp
19
Estimated # of false positives by consensus error rate:
Error rate   Source                    Human exons   Full genome
1E-4         Bermuda/Hapmap            600           600,000
4E-5         454 Nature ‘05            240           240,000
3E-7         Polony-SbL Science ‘05    18            1,800
Goal of genotyping & resequencing: discovery of variants
E.g. cancer somatic mutations ~1E-6 (or lab-evolved cells)
Why low error rates?
false negative genotype reduction by allelic-ratio-RNA & haplotypes
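The false-positive estimates on this slide follow from multiplying a consensus error rate by the number of bases examined. The sketch below redoes that arithmetic. The target sizes (6 Mbp of exons, 6 Gbp for a diploid genome) are not stated on the slide; they are assumptions inferred from the 1E-4 and 4E-5 rows.

```python
# Assumed target sizes, inferred from the slide's first two rows.
EXON_BASES = 6_000_000         # assumption: ~6 Mbp of exon sequence
GENOME_BASES = 6_000_000_000   # assumption: ~6 Gbp diploid genome

# Consensus error rates quoted on the slide.
ERROR_RATES = {
    "Bermuda/Hapmap": 1e-4,
    "454 Nature '05": 4e-5,
    "Polony-SbL Science '05": 3e-7,
}

def expected_false_positives(error_rate, n_bases):
    """Expected number of wrong consensus calls among n_bases positions."""
    return error_rate * n_bases

for source, rate in ERROR_RATES.items():
    exons = expected_false_positives(rate, EXON_BASES)
    genome = expected_false_positives(rate, GENOME_BASES)
    print(f"{source:24s} exons: {exons:>8,.1f}   genome: {genome:>12,.1f}")
```

Under these assumed sizes the first two rows and the full-genome figure of the third row match the slide. For the Polony-SbL exon entry the same arithmetic gives 1.8 rather than the 18 shown, so either that entry uses a different target size or a decimal point was lost from the slide.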
20
Monitoring resistance to BCR-ABL-kinase inhibitors with polonies during CML patient therapy
Nardi, Raz, Chao, Wu, Stone, Cortes, Deininger, Church, Zhu, Daley (submitted)
E255K
T315I
M244V
21
Rearrangements detected using polony paired-end reads
Shendure et al. Science Sep 2005
Deletion, Insertion, Inversion (rare in this clonal population)
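Paired-end reads reveal rearrangements when the two tags of a pair map to the reference at an unexpected distance or orientation. The following is a generic sketch of that logic, not the analysis from the cited paper. The function name and the expected distance window of 700 to 1300 bp are hypothetical.

```python
def classify_pair(mapped_distance, orientation_ok,
                  min_expected=700, max_expected=1300):
    """Classify one tag pair mapped to a reference genome.

    mapped_distance: separation of the two tags on the reference (bp).
    orientation_ok:  True if the tags map in the expected relative
                     orientation.
    min_expected, max_expected: hypothetical window for the library's
                     tag separation.
    """
    if not orientation_ok:
        return "inversion"      # one tag lies inside a flipped segment
    if mapped_distance > max_expected:
        return "deletion"       # sample lacks sequence the reference has
    if mapped_distance < min_expected:
        return "insertion"      # sample has extra sequence between tags
    return "consistent"

# Invented example pairs: (distance on reference, orientation as expected)
pairs = [(1000, True), (5200, True), (150, True), (980, False)]
calls = [classify_pair(d, ok) for d, ok in pairs]
print(calls)
```

Calling a rearrangement from a single pair would be unreliable, so a practical analysis would require several independent pairs supporting the same event.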