Viral Detection and Discovery Using DNA Microarrays
David Wang, Ph.D.
Washington University in St. Louis
School of Medicine
Many Diseases Have an Infectious Etiology
???
Clinical Infectious Diseases 2001
• Which other cancers and diseases may be caused by unrecognized pathogens?
Which Viruses Cause Respiratory Tract Infections?
Virus                    Frequency
Rhino                    30-50%
Corona                   10-15%
Influenza                5-15%
Parainfluenza            5%
Respiratory Syncytial    5%
Adeno                    <5%
Entero                   <5%
Metapneumovirus          <5% ??
UNKNOWN                  20-30%
Lancet (2003)
How can one find pathogens in the unknown cases?
Existing Viral Detection/Discovery Methods Have Limitations
• Viral culture
– Many viruses not culturable
• Immunoassays
– Require special reagents, candidate viruses
• PCR
– Limited breadth (candidate viruses)
• Subtractive Hybridization
– e.g. Representational difference analysis (RDA)
– Requires matched uninfected sample, low throughput
• Solution: Develop a comprehensive, unbiased, and high-throughput method
GOAL: Create a microarray for simultaneous screening of ALL known viruses
The “Virochip”
cDNA vs Oligo-Based Microarray?
• cDNA Arrays
– Require 1000 viral genomes as templates for PCR
• Oligonucleotide Arrays
– Require only sequence information
– Increasing sensitivity with increasing length
– Length limited by efficiency of chemical synthesis
Use 70mers
Which 70mer Sequences Should Be On the Array?
Two Step Strategy:
1. Select most highly conserved 70mers within each viral family
– Maximize cross hybridization
– Predict that unknown/unsequenced species detectable
2. Add more specific 70mers to discriminate species
Initial prototype array: conserved oligos only
– Focus on 5 families containing known respiratory viruses
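Step 1 of the strategy above can be sketched in a few lines. This is only an illustration under stated assumptions: it replaces the BLAST search used in the talk with a plain mismatch count, and the names `hamming`, `best_mismatch`, `most_conserved`, and the `max_mismatch` cutoff are hypothetical, not taken from the original work.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_mismatch(probe, genome):
    """Fewest mismatches between probe and any same-length window of genome."""
    k = len(probe)
    if len(genome) < k:
        return k  # genome too short to contain the probe at all
    return min(hamming(probe, genome[i:i + k])
               for i in range(len(genome) - k + 1))


def most_conserved(source, family, k=70, max_mismatch=10, top=5):
    """Rank k-mers of `source` by how many family genomes hold a near match.

    Stand-in for BLAST: a k-mer "hits" a genome when some window of that
    genome differs at no more than `max_mismatch` positions (assumed cutoff).
    Returns up to `top` tuples of (hit_count, start_position, k-mer).
    """
    scored = []
    for i in range(len(source) - k + 1):
        probe = source[i:i + k]
        hits = sum(best_mismatch(probe, g) <= max_mismatch for g in family)
        scored.append((hits, i, probe))
    # Most widely shared k-mers first; ties broken by position in source.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored[:top]
```

With toy sequences and `k=4`, `most_conserved("AAAACCCCGGGG", ["TTAAAATT", "AAAAGG", "CCCCAAAA"], k=4, max_mismatch=0, top=1)` picks the window shared by all three genomes. A real design would also need to handle reverse complements and gapped alignments, which this sketch ignores.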
Conserved Oligonucleotide Selection
Example 1: BLAST reveals highly conserved 70mers from CoxsackieA21
Background: Rhinoviruses (RV)
• Single-stranded RNA (positive strand) viruses in Picornavirus family
• >100 serotypes, 5 fully sequenced – RVs 1b, 2, 14, 16, 89
Generation 1 Virochip: 1592 Oligos from 140 viruses
EM Images: Lina Stannard
Wang et al. PNAS (2002)
Research Plan
• Phase I: Positive Control Samples
– Tissue culture infected with various viruses
– Individuals experimentally infected with RV16
• Phase II: Clinical Study
– Determine the viral flora in patients with upper respiratory tract infections
Hybridization strategy: tissue culture infections
Cy5: Red
Cy3: Green
RV16 Infected Cell Culture
Equally abundant in both samples: Human genes
More abundant in infected sample: Viral spots!
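The two-color readout above can be illustrated with a small sketch. It assumes the infected sample is labeled with Cy5 and the uninfected reference with Cy3; the function names, the intensity `floor`, and the `min_log2` threshold are hypothetical choices for illustration only.

```python
import math


def log_ratios(cy5, cy3, floor=1.0):
    """log2(Cy5/Cy3) per spot; `floor` avoids dividing by zero intensities."""
    return {spot: math.log2(max(cy5[spot], floor) / max(cy3[spot], floor))
            for spot in cy5}


def enriched_spots(cy5, cy3, min_log2=1.0):
    """Spots at least 2**min_log2-fold brighter in Cy5 (assumed threshold)."""
    ratios = log_ratios(cy5, cy3)
    return sorted(spot for spot, r in ratios.items() if r >= min_log2)
```

A human gene present equally in both samples gives a log ratio near 0, while a viral spot that lights up only in the infected sample gives a large positive value.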
“BarView” Data Visualization
• Each viral microarray element converted to a stripe
• Stripes organized by viral family of origin
• Red (Cy5) hybridization intensity plotted in linear scale (degrees of yellow)
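The three rules above map onto a short data transformation. The following is a minimal sketch, not the actual BarView tool: the input layout `(oligo_id, family, cy5_intensity)`, the function names, and the text rendering with `#` characters in place of yellow shading are all assumptions.

```python
from collections import defaultdict


def barview_stripes(elements):
    """Group array elements by viral family and scale Cy5 intensity linearly.

    elements: iterable of (oligo_id, family, cy5_intensity) tuples.
    Returns {family: [(oligo_id, level), ...]} with level in [0, 1].
    """
    elements = list(elements)
    peak = max((inten for _, _, inten in elements), default=0) or 1
    by_family = defaultdict(list)
    for oligo_id, family, inten in elements:
        by_family[family].append((oligo_id, inten / peak))
    # Families in alphabetical order so stripes from one family stay adjacent.
    return {fam: by_family[fam] for fam in sorted(by_family)}


def render(stripes, width=10):
    """Text stand-in for the color scale: longer bar = brighter stripe."""
    lines = []
    for fam, items in stripes.items():
        for oligo_id, level in items:
            bar = "#" * round(level * width)
            lines.append(f"{fam:<18}{oligo_id:<12}{bar}")
    return "\n".join(lines)
```

Calling `print(render(barview_stripes(elements)))` on a list of elements gives one line per stripe, grouped by family.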
BarView of RV16 Infection
Viruses from Cell Culture Can Be Successfully Classified by Family
Can viral subtypes be distinguished?
Each Rhinovirus (RV) Serotype Yields a Distinct Hybridization Pattern
Conserved Probes Enable Detection of Unrepresented RV Serotypes
Potential to detect novel viruses!
Clinical Sample Analysis
• Patients with upper respiratory tract infections
1. Experimental inoculation with RV16*
2. Undefined community-acquired upper respiratory tract infections (colds)*
Analysis of RNA isolated from nasal lavage samples
* IRB-approved studies, UCSF Asthma Center
Experimental in vivo infection of RV16 Detectable
Which viruses can be found in patients with respiratory tract infections?
Detection of Rhinoviruses in Clinical Nasal Lavage Samples
Other Respiratory Viruses Are Also Detected
Generation 2 Virochip: ~11000 70mers
• ALL (934) Reference Viral Genome Sequences (August, 2002)
– http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/viruses.html
– Human, animal, plant, bacteriophage
• 30 Bacterial genomes
Clinical Studies in Progress
• What is the viral flora of the respiratory tract?
1. Comparison to conventional hospital diagnostic panel?
– IFA for Influenza A & B, Parainfluenza, RSV
2. Evidence for new respiratory viruses?
– 20-30% unknown etiology
Known Virus Detection: How Well Are We Doing?
Virus              Clinical Lab    ViroChip
RSV                13              11
Influenza A        1               3
Influenza B        3               2
Parainfluenza 1    0               0
Parainfluenza 2    1               1
Parainfluenza 3    0               0
Rhino              N/A             6
Metapneumo         N/A             1
Corona OC43        N/A             3
What about NOVEL viruses?
2003 SARS Outbreak Timeline
March 15 WHO declares SARS travel alert: ~150 cases
March 17 WHO assembles international team
March 21 Hong Kong Univ. & CDC culture unknown virus
March 22 Samples from CDC received for microarray analysis
Array Hybridization Yields 2 Candidate Viral Families
Astrovirus: Turkey Astrovirus, Human Astrovirus, Ovine Astrovirus, Avian Nephritis
Coronavirus: Avian Infectious Bronchitis (2 spots), Bovine Coronavirus, Human Coronavirus 229E
Potentially 2 different viruses present?
A Consensus Motif Among 5 of the 70mers
A single virus (Corona family) present!
Viral Recombination Generated a Shared 3’ Motif
70mers from Several Coronavirus Regions Hybridize
A Novel Coronavirus Present!
Evidence that the Unknown Virus is a Coronavirus
• CDC data:
– Degenerate PCR
– Serology
– EM
How Can Sequences of Novel Viruses Be Obtained From Array Leads?
1. Conventional PCR
Use cross hybridizing array elements as primers
2. Novel Approach
Array hybridization physically isolates viral sequence
~1kB of Unknown Viral Genome Sequenced
Wang et al. PLoS Biology (2003)
2003 SARS Outbreak Timeline
March 15 WHO declares SARS travel alert: ~150 cases
March 17 WHO assembles international team
March 21 Hong Kong Univ. & CDC culture unknown virus
March 22 Samples from CDC received for microarray analysis
March 23 Microarray detection of a novel Coronavirus
March 24 CDC: “…a previously unrecognized virus from the coronavirus family is the leading hypothesis…”
April 13 Canada: Complete coronavirus genome sequenced
April 16 Netherlands: Proof of causality in monkey model
E-Predict: Automated Interpretation of ViroChip Data
Goal: Which KNOWN virus best accounts for the observed pattern of hybridization?
Approach: Create a library of “virtual” hybridization signatures for comparison
1. BLAST all known viruses vs all 70mers on the array
2. Calculate a theoretical binding energy
3. Assume intensity proportional to binding energy
4. Compare observed result to theoretical results
E-Predict: Comparison of All Theoretical Profiles to Observed Profile
score_x > score_y > … > score_n
Virus x is the best candidate!
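The comparison and ranking step can be sketched as below. The slides do not say which similarity measure E-Predict uses, so cosine similarity is an assumption here, as are the function names and the precomputed `profiles` dictionary of theoretical binding energies (one value per 70mer, in the same order as the observed intensities).

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(observed, profiles):
    """Score every theoretical profile against the observed intensities.

    observed: list of measured intensities, one per 70mer.
    profiles: {virus_name: [theoretical energy per 70mer, same order]}.
    Returns (score, virus_name) pairs, best match first.
    """
    scores = [(cosine(observed, prof), virus) for virus, prof in profiles.items()]
    return sorted(scores, reverse=True)
```

Because cosine similarity ignores overall scale, it is consistent with step 3 (intensity proportional to binding energy) without needing to know the proportionality constant. The first entry of the returned list plays the role of "virus x".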
E-Predict of SARS Microarray
Highest Scoring Candidate:
gi|9626535|ref|NC_001451.1| Avian infectious bronchitis virus
9626535_1099 Avian infectious bronchitis virus
9635576_275 Turkey astrovirus
9635572_255 Ovine astrovirus
9626535_568 Avian infectious bronchitis virus
15081544_766 Bovine coronavirus
9630726_269 Human astrovirus
12175745_728 Human coronavirus 229E
9626535_727 Avian infectious bronchitis virus
The unknown virus should be similar to AIB
Increasing Coverage at Taxonomic Nodes
Family
  Genus A: sp. A1, sp. A2
  Genus B: sp. B1, sp. B2
  Genus C: sp. C1, sp. C2
Family Probes Span Multiple Genera
Genus Probes Span Multiple Species
Species Probes Increase Specificity
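The idea of placing probes at family, genus, and species nodes can be sketched as a lookup of the lowest taxonomic node that covers every species a probe matches. The toy `TAXONOMY` table mirrors the diagram's placeholder names, and `probe_level` is a hypothetical helper, not part of the ViroChip design software.

```python
# Hypothetical toy taxonomy: species -> (family, genus), as in the diagram.
TAXONOMY = {
    "A1": ("Family", "Genus A"), "A2": ("Family", "Genus A"),
    "B1": ("Family", "Genus B"), "B2": ("Family", "Genus B"),
    "C1": ("Family", "Genus C"), "C2": ("Family", "Genus C"),
}


def probe_level(species_hit, taxonomy=TAXONOMY):
    """Return (level, node) for the lowest node covering all matched species."""
    species = sorted(set(species_hit))
    if not species:
        raise ValueError("probe matches no species")
    families = {taxonomy[s][0] for s in species}
    genera = {taxonomy[s][1] for s in species}
    if len(species) == 1:
        return ("species", species[0])      # species-specific probe
    if len(genera) == 1:
        return ("genus", next(iter(genera)))  # spans species within one genus
    if len(families) == 1:
        return ("family", next(iter(families)))  # spans several genera
    return ("none", None)                   # matches more than one family
```

For example, a probe matching A1 and A2 is a genus-level probe for Genus A, while one matching A1, B2, and C1 is a family-level probe.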
Generation 3 ViroChip
22,000+ viral sequences representing all viral species in GenBank as of June 2004.
Conclusions
• Designed a novel pan-viral microarray
– ~1000 viral genomes
• Detected many unrelated viruses
– No preconceptions about targets
• Discovered novel virus via cross-hybridization
• Recovered viral sequence directly from hybridized microarray element
• Automating data interpretation
Ongoing Projects
• Outbreaks/Bioterrorism
• Viral Flora Surveys
– “Healthy” individuals
– Potential Viral Reservoirs (insects, rodents etc)
  Arbovirus surveillance (Bob Tesh, UT Galveston)
– Blood Supply
• Diseases of Unknown Etiology
– Cancers
– Neurodegenerative Diseases
– Hepatitis (15% unknown)
– Encephalitis (~60% unknown)
– Veterinary, Agricultural Diseases
Joe DeRisi, Don Ganem, Anatoly Urisman, Shoshannah Beck, YT Liu, Homer Boushey, Pedro Avila, Tara Greenhow, Peggy Weintraub, Lawrence Drew, Kael Fischer, Amy Kistler
Wang Lab: Kathie Mihindukulasuriya, Stacy Finkbeiner
CDC: Dean Erdman, Tom Ksiazek
CA Dept of Health: David Schnurr, Shigeo Yagi