Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.


Transcript of Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Page 1: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif

Page 2: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Bayesian Hierarchical Model for QTLs

Susan Simmons

University of North Carolina Wilmington

Page 3: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Collaborators

Dr. Edward Boone

Dr. Ann Stapleton

Mr. Haikun Bao

Page 4: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

DNA

Page 5: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Chromosome

Page 6: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Genes

Page 7: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Genetic Map

Page 8: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Chromosome 1 of Protozoa Cryptosporidium parvum

Page 9: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Chromosome 1 of Homo sapiens

Page 10: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Alleles

Page 11: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Genetic Maps

Many more maps available at www.ncbi.nih.gov

Knowing information about genes now allows us to find associations between genes and outcomes (phenotypes)

Page 12: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Some examples

In 1989 a breakthrough was made for the disease of cystic fibrosis.

Location (or locus) is 7q31.2 - The CFTR gene is found in region q31.2 on the long (q) arm of human chromosome 7 (single gene responsible for this disease).

The disease arises when an individual has two recessive copies at this location.

An individual with one dominant and one recessive copy is said to be a carrier of the disease.

Genetic screening can determine disease status.

Page 13: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Green revolution

The Green Revolution is the increase in food production stemming from the improved strains of wheat, rice, maize and other cereals developed in the 1960s by Dr. Norman Borlaug in Mexico, and by others, under the sponsorship of the Rockefeller Foundation.

It created new varieties of wheat and rice that produced higher yields.

Page 14: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

QTL

Better medical treatments and increased agricultural yield are only two examples in which identifying a location on the genome can have an impact.

A region on the genome (or on a chromosome) responsible for a quantitative trait (as opposed to a qualitative trait such as disease status) is known as a Quantitative Trait Locus (QTL).

Page 15: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Existing software

Zhao-Bang Zeng’s group at NC State has QTL Cartographer

Karl Broman (Johns Hopkins) has an R program that implements a number of algorithms for QTLs

These algorithms (and a number of other published algorithms) can use only one observation per genotype

Page 16: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

World of plants

Page 17: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Why plants?

Increase yield to feed our increasing population

Make plants resistant to UV-B exposure

Page 18: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Plants, continued

Control

– Design and Environment

– Reproduction

– Design (RIL is one of the best designs for detecting QTLs)… Alleles are homozygous

Cost

Time

Page 19: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Plant QTL experiments

In most experiments, a number of replicates or clones are observed within each line

A number of plant biologists reduce the replicates to some summary measure so that conventional methods can be used

Information is lost, and the results can be misleading (example in Conte et al., unpublished)

Hierarchical model to incorporate replicates within each line
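To illustrate how a summary measure can hide information, here is a minimal sketch with made-up numbers (the two lines and their replicate values are hypothetical, not data from the talk). Both lines have the same mean, so a method that sees one value per line cannot tell them apart, even though their replicate-to-replicate variability differs greatly.

```python
from statistics import mean, variance

# Hypothetical replicate measurements for two lines (illustrative values only).
line_a = [9.5, 10.0, 10.5]
line_b = [4.0, 10.0, 16.0]

# One summary value per line: this is all a conventional method would see.
summary = {"A": mean(line_a), "B": mean(line_b)}

# Within-line sample variance: the information the summary discards.
spread = {"A": variance(line_a), "B": variance(line_b)}

print(summary)
print(spread)
```

A hierarchical model keeps every replicate, so the within-line variability enters the analysis instead of being averaged away.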

Page 20: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Data

Trait or phenotype, $y_{ij}$, $i = 1, \dots, L$, where $L$ is the number of lines, and $j = 1, \dots, n_i$, where $n_i$ is the number of replicates within line $i$

Design matrix $X$ is $L \times M$, where $M$ is the number of markers on the genetic map

Page 21: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Hierarchical Model

Hierarchical Model

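$y_{ij} \sim N(\mu_i, \sigma_i^2)$

$\mu_i \sim N(X_i^T \beta, \sigma^2)$

Priors

$\sigma^2 \sim \text{Inverse-}\chi^2(1)$

$\beta_k \sim N(0, 100)$

$\sigma_i^2 \sim \text{Inverse-}\chi^2(1)$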
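As a worked step, the model and priors on this slide imply a joint posterior that factorises as sketched below. The symbol names are the reading assumed here: $\mu_i$ is the mean of line $i$, $\sigma_i^2$ its within-line variance, $\beta$ the vector of marker effects, $\sigma^2$ the between-line variance, and $p(\cdot)$ the Inverse-$\chi^2(1)$ prior density.

```latex
p(\mu, \beta, \sigma^2, \{\sigma_i^2\} \mid y) \;\propto\;
\prod_{i=1}^{L} \prod_{j=1}^{n_i} N(y_{ij} \mid \mu_i, \sigma_i^2)
\;\times\; \prod_{i=1}^{L} N(\mu_i \mid X_i^T \beta, \sigma^2)\, p(\sigma_i^2)
\;\times\; \prod_{k} N(\beta_k \mid 0, 100)
\;\times\; p(\sigma^2)
```

The first product is where the replicates enter: every observation $y_{ij}$ contributes, rather than a single summary per line.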

Page 22: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Posterior Model Probability

Let $\mathcal{K}$ denote the set of all possible models. Given data $D$, the posterior probability of model $k_i$ is given by Bayes' rule:

$$P(k_i \mid D) = \frac{P(D \mid k_i)\, P(k_i)}{\sum_{k_j \in \mathcal{K}} P(D \mid k_j)\, P(k_j)}$$

(These probabilities are implicitly conditioned on the set $\mathcal{K}$.)
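Bayes' rule for the model probabilities can be sketched numerically as follows. The function name `posterior_model_probs` and the three log marginal likelihood values are hypothetical, chosen only for illustration; working on the log scale is a common precaution against underflow, not something stated in the talk.

```python
import math

def posterior_model_probs(log_marginal_likelihoods, prior_probs):
    """Return P(k_i | D) for each model via Bayes' rule.

    log_marginal_likelihoods[i] is log P(D | k_i); prior_probs[i] is P(k_i).
    """
    log_joint = [ll + math.log(p)
                 for ll, p in zip(log_marginal_likelihoods, prior_probs)]
    shift = max(log_joint)                      # subtract the max before exponentiating
    weights = [math.exp(v - shift) for v in log_joint]
    total = sum(weights)                        # denominator: sum over all models
    return [w / total for w in weights]

# Hypothetical log marginal likelihoods for three candidate models, equal priors.
log_ml = [-120.0, -118.0, -125.0]
priors = [1 / 3, 1 / 3, 1 / 3]
probs = posterior_model_probs(log_ml, priors)
print(probs)
```

With equal priors, the ratio of two posterior model probabilities equals the ratio of their marginal likelihoods.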

Page 23: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Posterior Model continued

To compute the probability of the model given the data from the previous slide, $P(k_i \mid D)$, we need to compute $P(D \mid k_i)$:

$$P(D \mid k_i) = \int P(D \mid \theta_i, k_i)\, P(\theta_i \mid k_i)\, d\theta_i$$

where $\theta_i$ is the vector of unknown parameters for model $k_i$.

Page 24: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Integration

This integration can become difficult, since the number of unknown parameters is $2L + M + 2$. Use a Monte Carlo estimate of the integral:

$$\int P(D \mid \theta_i, k_i)\, P(\theta_i \mid k_i)\, d\theta_i \approx \frac{1}{t} \sum_{j=1}^{t} P(D \mid \theta_i^{(j)}, k_i)$$

where $\theta_i^{(j)}$, $j = 1, \dots, t$, are samples from the posterior distribution.
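The idea of replacing an integral over the parameters by an average over draws can be sketched on a toy problem where the exact answer is known. Everything here is an assumption made for illustration and is not the talk's model: a single observation $y \sim N(\theta, 1)$ with prior $\theta \sim N(0, 1)$, so that the marginal is $y \sim N(0, 2)$. The sketch draws $\theta$ from the prior, the density that appears under the integral, and averages the likelihood over those draws.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def mc_marginal_likelihood(y, t, rng):
    """Average the likelihood P(D | theta) over t draws of theta from its prior."""
    total = 0.0
    for _ in range(t):
        theta = rng.gauss(0.0, 1.0)         # toy prior: theta ~ N(0, 1)
        total += normal_pdf(y, theta, 1.0)  # toy likelihood: y ~ N(theta, 1)
    return total / t

rng = random.Random(0)                      # fixed seed for reproducibility
y_obs = 0.5
estimate = mc_marginal_likelihood(y_obs, 20000, rng)
exact = normal_pdf(y_obs, 0.0, 2.0)         # closed form: y ~ N(0, 1 + 1)
print(estimate, exact)
```

In the hierarchical model the parameter vector is much longer, which is why a closed form is unavailable and a sampling-based estimate is attractive.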

Page 25: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Search strategy

The activation probability, $P(\beta_j \neq 0 \mid D)$, is defined as

$$P(\beta_j \neq 0 \mid D) = \sum_{i} P(\beta_j \neq 0 \mid k_i, D)\, P(k_i \mid D)$$

There are $2^M$ potential models, which can make the calculation of $P(\beta_j \neq 0 \mid D)$ computationally intensive.

Instead, we define a conditional probability search approach.
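The model-averaged sum behind the activation probability, and the reason exhaustive enumeration scales badly, can be sketched for a tiny number of markers. The sketch assumes that $P(\beta_j \neq 0 \mid k_i, D)$ is 1 when marker $j$ is included in model $k_i$ and 0 otherwise; that convention, the function name and the model scores are all hypothetical and are not taken from the talk.

```python
from itertools import product

def activation_probabilities(model_probs):
    """Sum P(k_i | D) over the models that include each marker.

    model_probs maps a tuple of 0/1 inclusion flags (one per marker) to P(k_i | D).
    Assumes P(beta_j != 0 | k_i, D) is 1 if marker j is in model k_i, else 0.
    """
    n_markers = len(next(iter(model_probs)))
    activation = [0.0] * n_markers
    for flags, prob in model_probs.items():
        for j, included in enumerate(flags):
            if included:
                activation[j] += prob
    return activation

M = 3
models = list(product([0, 1], repeat=M))    # all 2**M inclusion patterns

# Hypothetical unnormalised scores favouring models that include marker index 1.
scores = {m: (8.0 if m[1] else 1.0) for m in models}
total = sum(scores.values())
model_probs = {m: s / total for m, s in scores.items()}

activation = activation_probabilities(model_probs)
print(len(models), activation)
```

The list `models` has $2^M$ entries, so this direct enumeration is only feasible for a handful of markers, which is the motivation for a search strategy.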

Page 26: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

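[Figure: search tree. Top level: C1, C2, C3, C4, C5. C2 splits into C21 and C22; C21 splits into C211 and C212. C4 splits into C41 and C42; C42 splits into C421 and C422; C421 splits into C4211 and C4212.]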

Page 27: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Simulated data

Using the line information from the Bay x Sha RIL population, a single QTL was simulated on the fourth marker of the first chromosome.

The Bay x Sha population has 5 chromosomes.
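A simulation of this kind can be sketched as follows. This is not the talk's simulation: the genotype matrix is random rather than taken from the Bay x Sha lines, and the function name, effect size, noise levels, number of lines and number of replicates are all hypothetical. The QTL is placed at marker index 3 (the fourth marker, counting from zero).

```python
import random

def simulate_ril_data(n_lines, n_markers, qtl_index, effect, n_reps, rng):
    """Simulate replicated phenotypes with one QTL at marker `qtl_index`.

    Genotypes are coded 0/1 because RIL alleles are homozygous; the genotype
    matrix is random here rather than taken from a real population.
    """
    X = [[rng.randint(0, 1) for _ in range(n_markers)] for _ in range(n_lines)]
    y = []
    for i in range(n_lines):
        # Line mean: QTL effect plus between-line noise.
        line_mean = effect * X[i][qtl_index] + rng.gauss(0.0, 0.5)
        # Replicates within the line scatter around the line mean.
        y.append([line_mean + rng.gauss(0.0, 1.0) for _ in range(n_reps)])
    return X, y

rng = random.Random(1)
X, y = simulate_ril_data(n_lines=100, n_markers=10, qtl_index=3,
                         effect=2.0, n_reps=4, rng=rng)

# Compare the mean phenotype between the two genotype classes at the QTL marker.
with_allele = [v for i, reps in enumerate(y) if X[i][3] == 1 for v in reps]
without_allele = [v for i, reps in enumerate(y) if X[i][3] == 0 for v in reps]
difference = sum(with_allele) / len(with_allele) - sum(without_allele) / len(without_allele)
print(difference)
```

The nested list `y` keeps the replicates grouped by line, which is the shape of data the hierarchical model is meant to use.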

Page 28: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

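[Figure: search tree for the simulated data, with the value shown at each node]

C1: 1
  C11: 1
    C111: 0.818
      C1111 (M1): 0.041
      C1112 (M2): 0.014
    C112: 0.927
      C1121 (M3): 0.083
      C1122 (M4): 1
  C12: 0.9362
    C121: 0.114
    C122: 0.108
C2: 0.4
C3: 0.6
  C31: 0.063
  C32: 0.063
C4: 0.4
C5: 0.0029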

Page 29: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Comments

Need to run model on more simulations

Would like to compare this search strategy to a stochastic search

Would like to include epistasis in the model

Page 30: Www. geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif.

Thank you