Candidate Gene Resource Steering Committee Meeting July 25, 2006


Page 1: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Candidate Gene Resource Steering Committee Meeting July 25, 2006

Page 2: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Goals for Today

• Strengthen relationships among CARE investigators

• Define pilot project (phenotypes & SNPs)

• Establish principles of data release

• Discuss genotyping study design

• Select phenotypes to be analyzed

Page 3: Candidate Gene Resource Steering Committee Meeting July 25, 2006

CARE Governance

• Steering committee

  – Representative of each CARE organization

  – Subcommittees: Data Release, Phenotypes, Study Design, Informatics, SNP Selection, DNA/Genotyping

• NHLBI staff

• NHLBI appointed oversight committee

Page 4: Candidate Gene Resource Steering Committee Meeting July 25, 2006

CARE: timeline

• RFP released March 2005

• Response submitted July 15, 2005

• Awarded April 1, 2006

• Four year award

  – Y1: Create DNA and phenotype database

  – Y2: Genotyping

  – Y3 / 4: Joint analysis and data distribution

Page 5: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Resources Provided by NHLBI

• $18.3M over 4 years to create a resource to relate genotype-phenotype across cohorts:

  – Create a consortium among CARE cohorts

  – Database DNA and phenotypes

  – Genotype a common set of SNPs across cohorts

  – Create software tools to enable joint analysis

  – Data distribution as per CARE data release policy

  – Project management and coordination (PM hired: Deb Farlow)

Page 6: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Areas for Discussion Today

• Data Release

• Study Design

• Phenotypes

NHLBI

Current state of genotyping technology

Presentation of informatics tools

Page 7: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Data release

• Data release policy to be established by CARE steering committee with NHLBI and local IRBs

• Broad proposed a secure, HIPAA-compliant web architecture to implement this policy and to enable an access-controlled environment for data sharing and analysis

Page 8: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Areas for Discussion Today

• Data Release

• Study Design

• Phenotypes

NHLBI

Current state of genotyping technology

Presentation of informatics tools

Page 9: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Original CARE Study Design

• Candidate Gene Study

  – 50,000 samples

  – average 10 SNPs/gene x 1700 genes = 17,000 SNPs

  – Requirement: $0.01 / genotype (fully loaded)

• Whole Genome Association Study

  – 500 cases / 1,000 controls

  – At least 300,000 SNPs genome wide
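Note: the candidate gene arithmetic on this slide can be checked with a short sketch. The figures come from the slide; the function and variable names are illustrative only. Multiplying samples by SNPs by the per-genotype price gives a total consistent with the $8.5M listed on the In Summary slide, assuming every sample is typed at every SNP.

```python
# Figures from the slide; names are hypothetical, chosen for illustration.
SAMPLES = 50_000
SNPS_PER_GENE = 10            # average
GENES = 1_700
CENTS_PER_GENOTYPE = 1        # $0.01 per genotype, fully loaded


def candidate_gene_totals(samples, snps_per_gene, genes, cents_per_genotype):
    """Return (SNP count, genotype count, total cost in dollars)."""
    snps = snps_per_gene * genes
    genotypes = samples * snps          # assumes every sample is typed at every SNP
    cost_dollars = genotypes * cents_per_genotype / 100
    return snps, genotypes, cost_dollars


snps, genotypes, cost = candidate_gene_totals(
    SAMPLES, SNPS_PER_GENE, GENES, CENTS_PER_GENOTYPE
)
print(f"{snps:,} SNPs, {genotypes:,} genotypes, ${cost:,.0f}")
```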

Page 10: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Candidate gene study

• Targeted genotyping technology has remained stable: same price and throughput as in approved proposal

• Key issue: criteria for selecting 17,000 candidate gene-based SNPs

  – biological hypotheses

Page 11: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Developments since RFP

• Whole genome scans promise new hypotheses for candidate genes

• Evaluation of coverage / performance of whole genome arrays

• Price for whole genome genotyping technology has improved

Page 12: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Whole genome scanning

• SHARE will genotype 15,000 people from NHLBI cohorts (FHS and TBA)

• RFA for 4-5 whole genome scans

• GAIN, WTCCC, etc.

• Implication: hypotheses that could be confirmed and extended by CARE

• Challenge: timing doesn’t sync up well with original CARE timeline

Page 13: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Developments since RFP

• Whole genome scans promise new hypotheses for candidate genes

• Evaluation of coverage / performance of whole genome arrays

• Price for whole genome genotyping technology has improved

Page 14: Candidate Gene Resource Steering Committee Meeting July 25, 2006


Coverage

Page 15: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Do they work?

Platform (site)           Samples   Average call rate   Concordance with HapMap   Trio concordance
Affymetrix 500K (Broad)   1200      99.10%              48 CEU samples, 99.10%    60 trios, 99.9%
Illumina 317K (CIDR*)     1400      99.80%              8 CEU samples, 99.85%     10 trios, 99.85%

* from http://www.cidr.jhmi.edu/human_gwa.html
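Note: a minimal sketch of how the two headline metrics on this slide are commonly defined, under stated assumptions. Call rate is taken here as the fraction of attempted genotypes that received a call, and concordance as the fraction of matching genotypes among sites called in both data sets. The slide does not spell out its exact definitions, and the toy genotypes below are invented for illustration.

```python
def call_rate(genotypes):
    """Fraction of attempted genotypes that were called (None = no call)."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def concordance(calls, reference):
    """Fraction of matching genotypes among sites called in both sets."""
    pairs = [
        (a, b) for a, b in zip(calls, reference)
        if a is not None and b is not None
    ]
    matches = sum(1 for a, b in pairs if a == b)
    return matches / len(pairs)


# Toy data (hypothetical): one no-call and one discordant site.
calls = ["AA", "AB", None, "BB", "AB"]
reference = ["AA", "AB", "AB", "BB", "BB"]

print(call_rate(calls))               # 4 of 5 genotypes called
print(concordance(calls, reference))  # 3 of 4 comparable sites match
```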

Page 16: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Do They Work at High Scale?

Recent Call Rate Data (at Broad)

Product     Chips    Call Rate
Affy 500K   12,000   98.7%
ILMN 317K   250      99.2%

In-Process QC test: HapMap sample vs HapMap

[Chart: concordance (control vs HapMap, n=42); y-axis 97.50% to 100.00%, x-axis 0 to 45. Avg = 99.62%, 7,947,748 comparisons]

Page 17: Candidate Gene Resource Steering Committee Meeting July 25, 2006

QC statistics: MS and T2D Scans

Counts are followed by % of total in parentheses.

                                   T2D Scan                          MS Scan
                                   Nsp              Sty              Nsp              Sty
Samples attempted                  1530 (100%)      1558 (100%)      1117 (100%)      867 (100%)
Pass DM (0.26) >=85%               1474 (96%)       1476 (97%)       1040 (93%)       817 (94%)
Pass BRLMM >=95%                   1438 (94%)       1428 (93%)       1008 (90%)       792 (91%)
Avg call rate, passing samples     99.10%           99.00%           99.00%           98.70%
Passing SNPs in passing samples    253,172 (97%)    230,816 (97%)    251,248 (96%)    228,972 (96.10%)

Page 18: Candidate Gene Resource Steering Committee Meeting July 25, 2006

DM vs. BRLMM 2500 chips

<5% of chips fail

Page 19: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Genotyping Costs per Sample

[Chart: chip cost per sample, $0 to $1,800, over time from Jul-05 to Jun-07. Series: Affy 500K, ILMN 317K, ILMN 550K, ILMN 650Y, MIP (20K)]

Page 20: Candidate Gene Resource Steering Committee Meeting July 25, 2006

WGAS: Then and Now

Original Plan

Product: Affymetrix 500K

Total cost per sample: $1600 (chip+reagents+equipment+labor+IDC)

Study Design: 500 cases / 1,000 controls

Budget = $2,400,000

Page 21: Candidate Gene Resource Steering Committee Meeting July 25, 2006

WGAS: Then and Now

Now possible

Product: Affymetrix 500K

Total cost per sample: $530 (chip+reagents+equipment+labor+IDC)

Study Design: 4,500 samples

Budget = $2,400,000

Page 22: Candidate Gene Resource Steering Committee Meeting July 25, 2006

WGAS: Then and Now

January 2007

Product: Affymetrix 500K

Total cost per sample: $410 (chip+reagents+equipment+labor+IDC)

Study Design: 5,800 samples

Budget = $2,400,000
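Note: the three "Then and Now" slides hold the budget fixed and vary the per-sample cost. The sketch below reproduces that arithmetic, assuming the sample counts on the slides are the budget divided by the all-in cost per sample and then rounded down to a convenient figure. The function name and scenario labels are illustrative.

```python
BUDGET = 2_400_000  # dollars, from the slides


def samples_affordable(budget, cost_per_sample):
    """Whole number of samples that fit within the budget."""
    return budget // cost_per_sample


# Per-sample costs as quoted on the three WGAS slides.
scenarios = [
    ("Original plan", 1600),
    ("Now possible", 530),
    ("January 2007", 410),
]

results = {label: samples_affordable(BUDGET, cost) for label, cost in scenarios}
for label, n in results.items():
    print(f"{label}: {n:,} samples")
```

The exact quotients for the two later scenarios come out slightly above the 4,500 and 5,800 quoted on the slides, which is consistent with the slides rounding down.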

Page 23: Candidate Gene Resource Steering Committee Meeting July 25, 2006

In Summary

Date      Study            SNPs      Samples   Cost
7/15/05   Whole genome     500,000   1,500     $2.4M
          Candidate gene   17,000    50,000    $8.5M
7/25/06   Whole genome     500,000   4,500     $2.4M
          Candidate gene   17,000    50,000    $8.5M
1/07      Whole genome     500,000   5,800     $2.4M
          Candidate gene   17,000    50,000    $8.5M

Page 24: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Conclusions: genotyping

• Targeted genotyping (custom set of candidate genes) stable @ $0.01 / gt

• Timing of candidate gene selection

• Improved cost and performance of whole genome arrays @ $0.001 / gt
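Note: a rough check of the per-genotype figures, assuming the whole genome price per genotype is simply the all-in cost per sample divided by the nominal 500,000 SNPs on the array. The slides do not state how the $0.001 figure was derived, so this is a sketch and the names are illustrative.

```python
SNPS_PER_ARRAY = 500_000          # nominal SNP count for the 500K product
TARGETED_COST_PER_GENOTYPE = 0.01  # dollars, from the slide


def cost_per_genotype(cost_per_sample, snps_per_sample):
    """Dollars per genotype, given an all-in per-sample cost."""
    return cost_per_sample / snps_per_sample


current = cost_per_genotype(530, SNPS_PER_ARRAY)    # "Now possible" slide
projected = cost_per_genotype(410, SNPS_PER_ARRAY)  # "January 2007" slide
ratio = TARGETED_COST_PER_GENOTYPE / current

print(f"current: ${current:.5f}/gt, projected: ${projected:.5f}/gt")
print(f"targeted genotyping costs about {ratio:.1f}x more per genotype")
```

Both whole genome values land near $0.001 per genotype, roughly a tenth of the targeted price.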

Page 25: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Areas for Discussion Today

• Data Release

• Study Design

• Phenotypes

NHLBI

Current state of genotyping technology

Presentation of informatics tools

Page 26: Candidate Gene Resource Steering Committee Meeting July 25, 2006

High Level Workflow – for CaRE

[Diagram labels, order not recoverable from the transcript]

Steps: Upload Samples, Peds, Individuals, Phenotypes; Create Experiments (Samples x Features); Design and Execute Experiments; QC/Curate Results; Data Compile; Summarize/Filter; PLINK; Association & Statistics Viewers; Cohort’s Custom Algorithms, Viewers

Data stores: ProjectDB; LIMS DBs; BSP DB; FeatureDB; Data Vault

Web Services

Analysis: Gene Pattern + CaRE analysis tools

Production: BSP/GAP + CaRE enhancements

Page 27: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Designing a Pilot

• A trial run for DNA quality, genotyping, phenotype and joint analysis, and publication

• Scale and content of pilot to be refined, topic for today’s discussion sessions

Page 28: Candidate Gene Resource Steering Committee Meeting July 25, 2006


Our shared aspiration: the greatest genetic epidemiology experiment to date


CSSCD

Page 29: Candidate Gene Resource Steering Committee Meeting July 25, 2006
Page 30: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Technological Advance

Current 500K assay New 500K assay

DNA DNA

Page 31: Candidate Gene Resource Steering Committee Meeting July 25, 2006

How?

Smaller format

BRLMM

Sequence Variability (DNA Analysis)

A/A   A/B   B/B

Mismatch probes not needed

Fewer probes needed


Single format

Page 32: Candidate Gene Resource Steering Committee Meeting July 25, 2006

No drop in Het Calls

Page 33: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Mendel Errors Per Plate

Accuracy 99.4%

Sty/Nsp : one family 25,000 errors
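Note: a minimal sketch of what counting Mendel errors in parent-offspring trios involves, assuming unphased biallelic genotypes written as two-character strings. A child genotype is treated as consistent if it can be formed from one allele of each parent. This is a generic illustration with invented toy data, not the pipeline used for the slide.

```python
from itertools import product


def is_mendel_consistent(father, mother, child):
    """True if the child genotype can be formed from one allele per parent."""
    possible = {"".join(sorted(a + b)) for a, b in product(father, mother)}
    return "".join(sorted(child)) in possible


def count_mendel_errors(trios):
    """Count (father, mother, child) genotype triples that violate inheritance."""
    return sum(1 for f, m, c in trios if not is_mendel_consistent(f, m, c))


# Toy trio genotypes at four SNPs (hypothetical data).
trios = [
    ("AA", "BB", "AB"),  # consistent
    ("AA", "BB", "AA"),  # error: child lacks a maternal B allele
    ("AB", "AB", "BB"),  # consistent
    ("AA", "AA", "AB"),  # error: neither parent carries B
]

errors = count_mendel_errors(trios)
print(errors)
```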

Page 34: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Coverage of Common Variants by Whole-genome Products

Tag SNPs

Affymetrix Mapping 500K GeneChip

Illumina HumanHap300 BeadChip

Page 35: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Coverage Mostly Provided by Pairwise Correlations

[Figure: aligned haplotypes (columns of A/C/G/T alleles) illustrating pairwise correlation between SNPs; layout not recoverable from the transcript]
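Note: the pairwise correlation named in this slide's title is conventionally measured as r2 between two SNPs. The sketch below computes it from phased haplotypes using the standard definition, r2 = D^2 / (pA(1-pA)pB(1-pB)) with D = pAB - pA*pB. The haplotypes are invented toy data and the function name is illustrative.

```python
def r_squared(haplotypes, i, j):
    """r2 between biallelic sites i and j across a list of haplotype strings."""
    n = len(haplotypes)
    a = haplotypes[0][i]  # arbitrary reference allele at site i
    b = haplotypes[0][j]  # arbitrary reference allele at site j
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one site is monomorphic")
    return d * d / denom


# Toy haplotypes over three SNPs (hypothetical data).
haps = ["AGT", "AGT", "TCT", "TCG"]

perfect = r_squared(haps, 0, 1)  # sites 0 and 1 always travel together
partial = r_squared(haps, 0, 2)  # site 2 is only partly correlated with site 0
print(perfect, partial)
```

When r2 between a typed SNP and an untyped one is high, the typed SNP serves as a proxy (tag) for the untyped one.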

Page 36: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Specified Multimarker Tests Improve Effective Coverage

[Figure: aligned haplotypes (columns of A/C/G/T alleles) illustrating a specified multimarker test; layout not recoverable from the transcript]

Page 37: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Coverage of the genome

YRI Coverage

[Bar chart: fraction of common SNPs captured at r2 of 0.8 (0% to 100%), by array: Affy100k, Affy500k, Ilmn300k, Ilmn550k. Series: single markers, 2-marker predictors]

Page 38: Candidate Gene Resource Steering Committee Meeting July 25, 2006

CEU Coverage

[Bar chart: fraction of common SNPs captured at r2 of 0.8 (0% to 100%), by array: Affy100k, Affy500k, Ilmn300k, Ilmn550k. Series: single markers, 2-marker predictors]
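Note: the coverage metric on these two slides, the fraction of common SNPs captured at r2 of 0.8, can be sketched as follows. The sketch assumes that, for each common SNP, the best r2 to any marker (or 2-marker predictor) on the array has already been computed. The SNP names and values are invented for illustration.

```python
def coverage(best_r2_by_snp, threshold=0.8):
    """Fraction of SNPs whose best r2 to an array marker meets the threshold."""
    captured = sum(1 for r2 in best_r2_by_snp.values() if r2 >= threshold)
    return captured / len(best_r2_by_snp)


# Hypothetical best-r2 values for four common SNPs.
best_r2 = {
    "snp1": 1.00,  # typed directly on the array
    "snp2": 0.85,  # captured by a correlated marker
    "snp3": 0.50,  # not captured
    "snp4": 0.79,  # just below the threshold, not captured
}

fraction = coverage(best_r2)
print(f"{fraction:.0%} of common SNPs captured at r2 >= 0.8")
```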

Page 39: Candidate Gene Resource Steering Committee Meeting July 25, 2006

Other recent developments

• Whole genome scan planned in 9,000 FHS participants (SHARE)

• Other whole genome scans will be funded (recent NHLBI RFA)