Post on 16-Aug-2020

Viral Detection and Discovery Using DNA Microarrays

David Wang, Ph.D.
Washington University in St. Louis School of Medicine

Many Diseases Have an Infectious Etiology

???

Clinical Infectious Diseases 2001

• Which other cancers and diseases may be caused by unrecognized pathogens?

Which Viruses Cause Respiratory Tract Infections?

Virus                    Frequency
Rhino                    30-50%
Corona                   10-15%
Influenza                5-15%
Parainfluenza            5%
Respiratory Syncytial    5%
Adeno                    <5%
Entero                   <5%
Metapneumovirus          <5% ??
UNKNOWN                  20-30%

Lancet (2003)

How can one find pathogens in the unknown cases?

Existing Viral Detection/Discovery Methods Have Limitations

• Viral culture
  – Many viruses not culturable
• Immunoassays
  – Require special reagents, candidate viruses
• PCR
  – Limited breadth (candidate viruses)
• Subtractive Hybridization
  – e.g. Representational difference analysis (RDA)
  – Requires matched uninfected sample, low throughput

• Solution: Develop a comprehensive, unbiased, and high-throughput method

GOAL: Create a microarray for simultaneous screening of ALL known viruses

The “Virochip”

cDNA vs Oligo-Based Microarray?

• cDNA Arrays
  – Require 1000 viral genomes as templates for PCR
• Oligonucleotide Arrays
  – Require only sequence information
  – Increasing sensitivity with increasing length
  – Length limited by efficiency of chemical synthesis

Use 70mers

Which 70mer Sequences Should Be On the Array?

Two Step Strategy:

1. Select most highly conserved 70mers within each viral family
   – Maximize cross hybridization
   – Predict that unknown/unsequenced species detectable
2. Add more specific 70mers to discriminate species

Initial prototype array: conserved oligos only
– Focus on 5 families containing known respiratory viruses
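The first step of the strategy, picking the 70mers that are most conserved within a family, can be illustrated with a small sketch. This is not the procedure used to design the array (the slides indicate BLAST was used); it is a brute-force toy that assumes a hypothetical mismatch tolerance (`max_mismatches`) as a stand-in for cross hybridization, and the function names are invented for illustration.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_match_mismatches(probe, genome):
    """Fewest mismatches between the probe and any same-length window of the genome."""
    k = len(probe)
    if len(genome) < k:
        return k
    return min(hamming(probe, genome[i:i + k]) for i in range(len(genome) - k + 1))


def rank_conserved_kmers(genomes, k=70, max_mismatches=10):
    """Rank every distinct k-mer by how many genomes in the family hold a near match.

    genomes: dict mapping a virus name to its sequence (one viral family).
    max_mismatches: hypothetical tolerance, an assumption and not a value from the talk.
    Returns (hit_count, kmer) tuples, most widely conserved first.
    """
    candidates = {g[i:i + k] for g in genomes.values() for i in range(len(g) - k + 1)}
    scored = []
    for kmer in candidates:
        hits = sum(best_match_mismatches(kmer, g) <= max_mismatches for g in genomes.values())
        scored.append((hits, kmer))
    scored.sort(key=lambda t: (-t[0], t[1]))
    return scored


# Toy family with made-up sequences; k is shortened to 8 so the example stays readable.
family = {
    "virusA": "ACGTACGTTTGACCA",
    "virusB": "GGGTACGTTTGACTT",
    "virusC": "TTTTACGTTTGAGGG",
}
print(rank_conserved_kmers(family, k=8, max_mismatches=0)[:2])
```

The cost grows with the square of the total sequence length, so real genomes call for an indexed search such as BLAST rather than this exhaustive scan.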

Conserved Oligonucleotide Selection

Example 1: BLAST reveals highly conserved 70mers from Coxsackie A21

Background: Rhinoviruses (RV)

• Single-stranded RNA (positive strand) viruses in Picornavirus family

• >100 serotypes, 5 fully sequenced
  – RVs 1b, 2, 14, 16, 89

Generation 1 Virochip: 1592 Oligos from 140 viruses

EM Images: Lina Stannard

Wang et al. PNAS (2002)

Research Plan

• Phase I: Positive Control Samples
  – Tissue culture infected with various viruses
  – Individuals experimentally infected with RV16
• Phase II: Clinical Study
  – Determine the viral flora in patients with upper respiratory tract infections

Hybridization strategy: tissue culture infections

Cy5: Red
Cy3: Green

RV16 Infected Cell Culture

Equally abundant in both samples: Human genes
More abundant in infected sample: Viral spots!
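The two-color readout above can be expressed as a per-spot ratio. The sketch below assumes the infected sample is labeled with Cy5 and the comparison sample with Cy3, and it uses a hypothetical log2 threshold and pseudocount; none of these numeric choices come from the slides.

```python
import math


def enriched_spots(spots, min_log2_ratio=1.0, pseudocount=1.0):
    """Return (spot_id, log2 ratio) for spots brighter in Cy5 than in Cy3.

    spots: dict mapping spot id to a (cy5, cy3) intensity pair.
    min_log2_ratio and pseudocount are illustrative assumptions.
    """
    result = []
    for spot_id, (cy5, cy3) in spots.items():
        ratio = math.log2((cy5 + pseudocount) / (cy3 + pseudocount))
        if ratio >= min_log2_ratio:
            result.append((spot_id, ratio))
    # Strongest enrichment first.
    result.sort(key=lambda t: -t[1])
    return result


# Made-up intensities: a human gene is equal in both channels, a viral spot is not.
example = {
    "human_gene_spot": (1000, 1000),
    "RV16_spot": (799, 99),
    "background_spot": (10, 12),
}
print(enriched_spots(example))
```

Spots that are equally abundant in both samples have a log ratio near zero and are filtered out, which leaves the candidate viral spots.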

“BarView” Data Visualization

• Each viral microarray element converted to a stripe
• Stripes organized by viral family of origin
• Red (Cy5) hybridization intensity plotted in linear scale (degrees of yellow)
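A text-only approximation of this display is sketched below. It is not the BarView tool itself: the character ramp replaces the color scale, the function name is hypothetical, and intensities are assumed to be non-negative.

```python
def barview(elements, ramp=" .:-=+*#%@"):
    """Render one character "stripe" per array element, grouped by viral family.

    elements: list of (family, probe_id, cy5_intensity) tuples.
    Intensity is scaled linearly against the brightest element on the array.
    Returns a dict mapping family name to its string of stripes.
    """
    peak = max((intensity for _, _, intensity in elements), default=0) or 1
    rows = {}
    for family, _probe_id, intensity in elements:
        level = int(intensity * (len(ramp) - 1) / peak)
        rows.setdefault(family, []).append(ramp[level])
    return {family: "".join(chars) for family, chars in rows.items()}


# Made-up intensities for four probes in two families.
demo = [
    ("Picornaviridae", "probe1", 900),
    ("Picornaviridae", "probe2", 450),
    ("Coronaviridae", "probe3", 0),
    ("Coronaviridae", "probe4", 300),
]
for family, stripes in barview(demo).items():
    print(f"{family:15s} |{stripes}|")
```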

BarView of RV16 Infection

Viruses from Cell Culture Can Be Successfully Classified by Family

Can viral subtypes be distinguished?

Each Rhinovirus(RV) Serotype Yields a Distinct Hybridization Pattern

Conserved Probes Enable Detection of Unrepresented RV Serotypes

Potential to detect novel viruses!

Clinical Sample Analysis

• Patients with upper respiratory tract infections
  1. Experimental inoculation with RV16*
  2. Undefined community-acquired upper respiratory tract infections (colds)*

Analysis of RNA isolated from nasal lavage samples

* IRB-approved studies, UCSF Asthma Center

Experimental In Vivo RV16 Infection Is Detectable

Which viruses can be found in patients with respiratory tract infections?

Detection of Rhinoviruses in Clinical Nasal Lavage Samples

Other Respiratory Viruses Are Also Detected

Generation 2 Virochip: ~11000 70mers

• ALL (934) Reference Viral Genome Sequences (August, 2002)
  – http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/viruses.html
  – Human, animal, plant, bacteriophage

• 30 Bacterial genomes

Clinical Studies in Progress

• What is the viral flora of the respiratory tract?
  1. Comparison to conventional hospital diagnostic panel?
     – IFA for Influenza A & B, Parainfluenza, RSV
  2. Evidence for new respiratory viruses?
     – 20-30% unknown etiology

Known Virus Detection: How Well Are We Doing?

Virus              Clinical Lab    ViroChip
RSV                13              11
Influenza A        1               3
Influenza B        3               2
Parainfluenza 1    0               0
Parainfluenza 2    1               1
Parainfluenza 3    0               0
Rhino              N/A             6
Metapneumo         N/A             1
Corona OC43        N/A             3

What about NOVEL viruses?

2003 SARS Outbreak Timeline

March 15 WHO declares SARS travel alert: ~150 cases

March 17 WHO assembles international team

March 21 Hong Kong Univ. & CDC culture unknown virus

March 22 Samples from CDC received for microarray analysis

Array Hybridization Yields 2 Candidate Viral Families

Astrovirus
– Turkey Astrovirus
– Human Astrovirus
– Ovine Astrovirus
– Avian Nephritis

Coronavirus
– Avian Infectious Bronchitis (2 spots)
– Bovine Coronavirus
– Human Coronavirus 229E

Potentially 2 different viruses present?

A Consensus Motif Among 5 of the 70mers

A single virus (Corona family) present!
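One simple way to look for a motif shared by several hybridizing oligos is to search for their longest common substring. The sketch below is a simplification: a real consensus motif may tolerate mismatches, the sequences are made-up toy strings rather than actual array 70mers, and the function name is hypothetical.

```python
def longest_common_substring(seqs):
    """Return the longest substring present in every sequence, or "" if there is none."""
    if not seqs:
        return ""
    # Only substrings of the shortest sequence can be shared by all of them.
    ref = min(seqs, key=len)
    for length in range(len(ref), 0, -1):
        for start in range(len(ref) - length + 1):
            candidate = ref[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


# Toy sequences standing in for oligos from different viruses.
oligos = ["AAGGCCGTACGATT", "TTCCGTACGAGG", "CCGTACGAAAAC"]
print(longest_common_substring(oligos))
```

A shared stretch found across probes from nominally different families is consistent with a single virus accounting for all of the hybridizing spots.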

Viral Recombination Generated a Shared 3’ Motif

70mers from Several Coronavirus Regions Hybridize

A Novel Coronavirus Present!

Evidence that the Unknown Virus is a Coronavirus

• CDC data:
  – Degenerate PCR
  – Serology
  – EM

How Can Sequences of Novel Viruses Be Obtained From Array Leads?

1. Conventional PCR
   – Use cross-hybridizing array elements as primers
2. Novel Approach
   – Array hybridization physically isolates viral sequence

~1 kb of Unknown Viral Genome Sequenced

Wang et al. PLoS Biology (2003)

2003 SARS Outbreak Timeline

March 23 Microarray detection of a novel Coronavirus

March 24 CDC: “…a previously unrecognized virus from the coronavirus family is the leading hypothesis…”

April 13 Canada: Complete coronavirus genome sequenced

April 16 Netherlands: Proof of causality in monkey model

E-Predict: Automated Interpretation of ViroChip Data

Goal: Which KNOWN virus best accounts for the observed pattern of hybridization?

Approach: Create a library of “virtual” hybridization signatures for comparison
1. BLAST all known viruses vs all 70mers on the array
2. Calculate a theoretical binding energy
3. Assume intensity proportional to binding energy
4. Compare observed result to theoretical results.
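The comparison in step 4 can be sketched as scoring each virus's theoretical profile against the observed intensities and ranking the scores. The slides do not name the similarity measure, so cosine similarity is used here purely as an assumption; the binding energies are taken as precomputed inputs, and the function names are hypothetical, not part of E-Predict.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(observed, theoretical):
    """Rank known viruses by similarity of their predicted profile to the observed one.

    observed: dict mapping probe id to measured intensity.
    theoretical: dict mapping virus name to {probe id: predicted binding energy};
                 probes missing from a profile are treated as zero.
    Returns (score, virus) tuples, best candidate first.
    """
    probes = sorted(observed)
    obs = [observed[p] for p in probes]
    scores = []
    for virus, profile in theoretical.items():
        predicted = [profile.get(p, 0.0) for p in probes]
        scores.append((cosine(obs, predicted), virus))
    scores.sort(reverse=True)
    return scores


# Made-up numbers: three probes, three candidate viruses.
observed = {"p1": 10, "p2": 0, "p3": 5}
theoretical = {
    "virus_x": {"p1": 2.0, "p3": 1.0},
    "virus_y": {"p2": 3.0},
    "virus_z": {"p1": 1.0, "p2": 1.0, "p3": 1.0},
}
for score, virus in rank_candidates(observed, theoretical):
    print(f"{virus}: {score:.3f}")
```

Because cosine similarity ignores overall scale, it fits the stated assumption that intensity is proportional to binding energy without needing the proportionality constant.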

E-Predict: Comparison of All Theoretical Profiles to Observed Profile

score_x > score_y > … > score_n

Virus x is the best candidate!

E-Predict of SARS Microarray

Highest Scoring Candidate:

gi|9626535|ref|NC_001451.1| Avian infectious bronchitis virus

9626535_1099   Avian infectious bronchitis virus
9635576_275    Turkey astrovirus
9635572_255    Ovine astrovirus
9626535_568    Avian infectious bronchitis virus
15081544_766   Bovine coronavirus
9630726_269    Human astrovirus
12175745_728   Human coronavirus 229E
9626535_727    Avian infectious bronchitis virus

The unknown virus should be similar to avian infectious bronchitis virus (AIB)

Increasing Coverage at Taxonomic Nodes

[Figure: taxonomic tree. A Family branches into Genus A, Genus B, and Genus C; each genus branches into two species (sp.A1, sp.A2, sp.B1, sp.B2, sp.C1, sp.C2), with corresponding probe labels A1, A2, B1, B2, C1, C2.]

Family Probes Span Multiple Genera

Genus Probes Span Multiple Species

Species Probes Increase Specificity

Generation 3 ViroChip

22,000+ viral sequences representing all viral species in GenBank as of June 2004.

Conclusions

• Designed a novel pan-viral microarray
  – ~1000 viral genomes
• Detected many unrelated viruses
  – No preconceptions about targets
• Discovered novel virus via cross-hybridization
• Recovered viral sequence directly from hybridized microarray element
• Automating data interpretation

Ongoing Projects

• Outbreaks/Bioterrorism
• Viral Flora Surveys
  – “Healthy” individuals
  – Potential Viral Reservoirs (insects, rodents etc)
    Arbovirus surveillance (Bob Tesh, UT Galveston)
  – Blood Supply
• Diseases of Unknown Etiology
  – Cancers
  – Neurodegenerative Diseases
  – Hepatitis (15% unknown)
  – Encephalitis (~60% unknown)
  – Veterinary, Agricultural Diseases

Joe DeRisi, Don Ganem, Anatoly Urisman, Shoshannah Beck, YT Liu, Homer Boushey, Pedro Avila, Tara Greenhow, Peggy Weintraub, Lawrence Drew, Kael Fischer, Amy Kistler

Wang Lab: Kathie Mihindukulasuriya, Stacy Finkbeiner

CDC: Dean Erdman, Tom Ksiazek

CA Dept of Health: David Schnurr, Shigeo Yagi