Lecture 3 quantitative traits and heritability full

Post on 14-Jun-2015

This session

• The theory behind quantitative traits
• Heritability – definition and estimation
• Data preparation
• Lab on data preparation

Learning objectives

• Primary
– Know what a quantitative vs. a categorical trait is.
– Be able to calculate heritability from twin correlations
– Explain why heritability is population specific
– Be able to transform a variable to a normal distribution
• Secondary
– Explain why we think quantitative traits are caused by many genetic variants.
– Evaluate why heritability estimates may not be accurate
– Discuss the importance of sample and trait characteristics.

Part 1: The theory of quantitative traits

Qualitative traits

• PKU
• Deficiency of phenylalanine hydroxylase.
• Characterized by
– Intellectual disability
– Microcephaly
– Delayed speech
– Seizures
– Eczema
– Behavior abnormalities

Qualitative traits

• Qualitative trait: PKU
• 12q23.2

Qualitative traits

• “You have it or you don’t”

[Figure: bar chart for PKU with two categories, Affected and Not Affected]

Qualitative traits

– Not “a bit pregnant”

[Figure: bar chart with three categories: Not, Somewhat Pregnant, Very pregnant]

Quantitative traits

• Attention Deficit Hyperactivity Disorder
• Characterized by developmentally inappropriate
– Hyperactivity / impulsivity
– Inattentiveness

• Ever said: I am sure I am ‘a bit ADHD’?

Quantitative traits

– A bit ADHD

Qualitative vs. Quantitative traits

• Qualitative
• Categorical
• Dichotomous

• Quantitative
• Continuous
• Dimensional

Is it easy to decide whether traits are qualitative or quantitative?
• Autism
• Type II Diabetes
• Eating disorders
• Dissociation disorders

“Traits are influenced by many variants of small effect, no one variant being necessary, nor sufficient, for the disorder”.

Implications of the dimensional approach for genetics

Quantitative Trait Loci approach: Quantitative trait loci (QTLs) are stretches of DNA containing or linked to the genes that underlie a quantitative trait

• Imagine 1 locus contributing to a trait
• Each locus has 2 alleles, one of which is the risk allele
• The presence of each copy of the risk allele conveys an additional ‘score’ on the trait
• What happens as you add loci?

Why QTLs give rise to a normal distribution

• 1 locus. Aa.
• How many genotypes?
– AA
– Aa
– aa

Why QTLs give rise to a normal distribution

• Given our genotypes, each risk allele gives you a score of +1 on the phenotype

• If A is the risk allele, what are our phenotype scores?

– AA
– Aa
– aa

Why QTLs give rise to a normal distribution

Genotype  Effect  Trait  Frequency
AA        +2      21     .25
Aa        +1      20     .5
aa        +0      19     .25

Why QTLs give rise to a normal distribution

[Figure: bar chart of the one-locus trait distribution over trait values 19, 20 and 21]

• 2 loci. Aa / Bb
• 2 risk alleles (A and B)
• Aa / Bb take a value of 20
• Fill in Table 1

Why QTLs give rise to a normal distribution

Genotypes  Effect  Trait  Frequency
AA/BB      +4      21     1/16
AA/Bb      +3      20     2/16
Aa/BB      +3      20     2/16
AA/bb      +2      19     1/16
Aa/Bb      +2      19     4/16
aa/BB      +2      19     1/16
Aa/bb      +1      18     2/16
aa/Bb      +1      18     2/16
aa/bb      +0      17     1/16

• 2 loci. Aa / Bb
• 2 risk alleles (A and B)
• Aa / Bb take a value of 20
• Fill in Table 1
• Sketch out a graph (assuming 16 individuals)

Why QTLs give rise to a normal distribution

[Figure: bar chart of the two-locus trait distribution over trait values 17 to 21, for 16 individuals]

[Figure: side-by-side bar charts of the trait distribution. Panel "1 locus": trait values 19 to 21. Panel "2 loci": trait values 17 to 21.]

What do you notice about the trait as the number of loci increases?

*Note: this shows additive genetic variance. Dominance genetic variance calculations are in the resources section.
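The pattern in the one-locus and two-locus tables can be extended to any number of loci with a short sketch. It assumes, as the tables do, that each risk allele adds +1, that both alleles at every locus have frequency 0.5, and that loci are independent and purely additive. The function name `risk_score_distribution` is made up for illustration.

```python
from math import comb

def risk_score_distribution(n_loci):
    """Return P(score = k) for k = 0..2*n_loci risk alleles.

    Assumes allele frequency 0.5 at every locus, independent loci,
    and purely additive effects (+1 per risk allele).
    """
    n_alleles = 2 * n_loci   # two allele copies per locus
    total = 2 ** n_alleles   # number of equally likely allele combinations
    return [comb(n_alleles, k) / total for k in range(n_alleles + 1)]

for n in (1, 2, 5):
    dist = risk_score_distribution(n)
    print(f"{n} loci:", [round(p, 3) for p in dist])
```

With 1 locus the three scores have frequencies .25, .5 and .25; with 2 loci the five scores have frequencies 1/16, 4/16, 6/16, 4/16 and 1/16, matching the tables. As loci are added, the bars fill in towards a bell shape.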

Quantitative traits

– A bit ADHD

• Waiting for ‘proof’ that the phenotypes & genes are the same

• Does not mirror how clinicians work

Controversies of the QTL

• Much easier to find study participants• Can be more powerful

Advantages of the QTL approach

• Quantitative traits are normally distributed traits

• Assumed that these arise from the combined effects of multiple genetic variants

• Assumption is that finding variants associated with traits will find genes associated with disease:

• Attentiveness -> ADHD
• Blood sugar -> T2DM

Quantitative traits in genetic research

Part 2: Heritability

Heritability is an estimate of the proportion of observed trait variance attributable to genetic influences.

What is heritability?

Trait variance (Vp)

Vp = Variance due to genetics (Vg) + variance due to non genetics (VE)

• Twin studies
• Vp = A + C + E
• A = Genes
• C = Common environment
• E = Unique environment

How do we calculate heritability?

• Assumptions of twin studies
• MZ twins correlate (rMZ) 100% for A
• DZ twins correlate (rDZ) 50% for A
• MZ and DZ twins correlate 100% for C
• MZ and DZ twins correlate 0% for E

How do we calculate heritability?

• A = h² = 2 (rMZ – rDZ)
• C = rMZ – A
• E = 1 – rMZ
• If rMZ = .8 and rDZ = .4
• A = 2 (.8 – .4) = .8
• C = .8 – .8 = 0
• E = 1 – .8 = .2
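The same arithmetic can be written as a small function, which is handy for checking the exercise table. This is a minimal sketch of the three formulas on this slide (the first is commonly called Falconer's formula); the function name `falconer_ace` is an illustrative choice, not something from the lecture.

```python
def falconer_ace(r_mz, r_dz):
    """Estimate A, C and E from MZ and DZ twin correlations."""
    a = 2 * (r_mz - r_dz)  # additive genetic share (h2)
    c = r_mz - a           # common (shared) environment
    e = 1 - r_mz           # unique environment
    return a, c, e

# Worked example from the slide: rMZ = .8, rDZ = .4
a, c, e = falconer_ace(0.8, 0.4)
print(f"A = {a:.2f}, C = {c:.2f}, E = {e:.2f}")
```

Because the three components are defined from the same two correlations, they always sum to 1.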

How do we calculate heritability?

Calculate the heritability

rMZ  rDZ  A    C    E
.5   .3
.5   .4
.7   .7
.1   .1
.9   .6
.2   .1
.9   .8

Calculate the heritability

rMZ  rDZ  A    C    E
.5   .3   .4   .1   .5
.5   .4   .2   .3   .5
.7   .7   .0   .7   .3
.1   .1   .0   .1   .9
.9   .6   .6   .3   .1
.2   .1   .2   .0   .8
.9   .8   .2   .7   .1

• The equal environments assumption (EEA) (including prenatal)
• Assortative mating
• Generalizability.

Assumptions of the twin method

• Gene environment correlation (rGE)
• Passive rGE
– Increases C

• Active rGE
– Increases or decreases heritability
– (why does this increase with age?)

• Evocative rGE
– Increases or decreases heritability

• Gene environment interaction (G*E)
– Increases E

Limitations of the twin method

Our concept of heritability is tied up with variation, and with our population.

What is heritability?

A thought experiment:
• Where do our hearts come from? If you have a heart, is this from your genetics? Or from the environment?
• What is the heritability of having a heart?

Heritability & Variation

• Tonsillectomies (N. G. Martin, 1991)
• Thought experiment: Reading ability

Population specific heritability

1. How much genetics contributes to some trait that an individual shows.

Heritability?

2. Proportion of trait variation between individuals in a given population due to genetic variation.

Heritability?

2 definitions:
1. How much genetics contributes to some trait that an individual shows.
2. Proportion of trait variation between individuals in a given population due to genetic variation.

What is heritability?

What is the difference?

Question: Does a high heritability for a disease mean that we should target our treatments at genetics?

What is heritability?

Qualitative traits

• PKU
• Deficiency of phenylalanine hydroxylase.
• Characterized by
– Intellectual disability
– Microcephaly
– Delayed speech
– Seizures
– Eczema
– Behavior abnormalities

• Parent-offspring

Heritability estimates in non twins

[Figure: scatter plot of offspring phenotypic trait value (y-axis) against mid-parent phenotypic trait value (x-axis). Slope = 0.89, so h² = 0.89]

• If offspring do not resemble parents, then the best-fit line has a slope of approximately zero.

• A slope of zero indicates that most variation between individuals is due to variation in environment.

• If offspring strongly resemble parents, then the slope of the best-fit line will be close to 1.
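To make the slope idea concrete, here is a minimal sketch of a parent-offspring regression using ordinary least squares. All trait values are invented for illustration and are not from any real sample; the names `regression_slope` and `h2_estimate` are likewise illustrative.

```python
def regression_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x

# Hypothetical trait values for five families (invented for illustration).
father = [170, 180, 165, 175, 185]
mother = [160, 170, 158, 168, 172]
offspring = [167, 174, 164, 171, 177]

# Mid-parent value = average of the two parents.
mid_parent = [(f + m) / 2 for f, m in zip(father, mother)]

h2_estimate = regression_slope(mid_parent, offspring)
print(f"slope (h2 estimate) = {h2_estimate:.2f}")
```

A real analysis would use many more families, and the estimate only applies to the population sampled.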

Heritability estimates in non twins

• Most traits in most populations fall somewhere in the middle with offspring showing moderate resemblance to parents.

Heritability estimates in non twins

• Heritability can be ascertained from twin correlations, and parent-offspring data

• The heritability point estimate is not exact (not like a mean)
• Furthermore, it applies only to your population, at your time.

Heritability summary

Part 3: Lab: Using phenotype data

The world of quantitative genetics

• Genetics… without genotype data.

Phenotype data
1. Sample characterization
2. Quantitative trait distribution
3. Heritability

Genotype data
– Variant description
– Missing data
– HWE
– LD and haplotypes

• Make a table of sample characteristics
• Prepare a quantitative trait for genetic analysis

Goals of this lab

Summarize the sample characteristics (covariates) for our population, often broken down by gender or ethnicity.
Summarize trait distribution

1. Summarize data characteristics

Why?

– Define the population parameters for comparison of results with those from other samples (e.g. gender, age, health)

– Help to identify biases in the data

Why summarize sample characteristics?

LOOK AT THE TABLE CLOSELY!!!

– Population definition– Generalizability

Why summarize trait distribution?

LOOK AT THE TABLE CLOSELY!!!

‘Table 1’

‘Table 1’

1. Is the distribution normal?
2. Are there outliers?

2. Prepare a quantitative trait for genetic analysis

1. What is a normal distribution?
For continuous data we don’t have equally spaced discrete values, so instead we use a curve or function that describes the probability density over the range of the distribution.

Continuous data

The normal distribution describes a special class of continuous distributions that are symmetric and can be described by two parameters:
(i) μ = the mean of the distribution
(ii) σ = the standard deviation of the distribution
Changing the values of μ and σ alters the positions and shapes of the distributions.

The normal distribution

The normal distribution

Deviations from normal - skew

The normal distribution will have a skew of 0

Deviations from normal - Kurtosis
‘Tails’ are misshapen

The normal distribution will have an excess kurtosis of 0 (a raw kurtosis of 3)
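Skew and kurtosis can be computed directly from the moments of the data. The sketch below uses the simple moment-based formulas and reports excess kurtosis (raw kurtosis minus 3), so a normal distribution scores 0 on both. The two example lists are invented for illustration, and statistical packages may apply small-sample corrections that give slightly different numbers.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance (population form)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skew, excess_kurtosis

# Hypothetical data: one symmetric sample, one with a long right tail.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 8, 15]

for name, data in (("symmetric", symmetric), ("right skewed", right_skewed)):
    skew, kurt = skew_and_excess_kurtosis(data)
    print(f"{name}: skew = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```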

Why we care about the normal distribution

• Assumption of most (including genetic association) tests.

How do we test for a normal distribution?
• The chi-square and Kolmogorov-Smirnov (KS) goodness-of-fit tests (low power).
• Eyeball methods: look at the histogram, and check that skew and kurtosis fall between -1 and +1.
• Shapiro-Wilk formal test

What is the H0 for Shapiro-Wilk?

What do we do about non-normal distributions?
• Run a monotonic transformation
• You can try
– Log
– Square root
– Cube root
– Reciprocal
• STATA: the lnskew0 command, which does it for you!

Example of a log transformation

Pre transformation Log transformed

NOTE

ALWAYS check the correlation of the transformed variable with the original variable.
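A minimal sketch of that check: log-transform a right-skewed variable, compare the skew before and after, and confirm the transformed variable still correlates strongly with the original. The data are invented for illustration, and the helper names `pearson_r` and `skewness` are illustrative rather than from any particular package.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def skewness(values):
    """Moment-based skewness; 0 for a symmetric distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed trait values, invented for illustration.
raw = [0.8, 1.1, 1.3, 1.6, 2.0, 2.7, 3.9, 6.5, 12.0]
logged = [math.log(v) for v in raw]  # natural log; requires values > 0

print(f"skew before: {skewness(raw):.2f}, after: {skewness(logged):.2f}")
print(f"correlation with original: {pearson_r(raw, logged):.2f}")
```

Because the log is monotonic, the rank order of individuals is unchanged; a low correlation with the original would suggest the transformation is being driven by a few extreme values.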

Screening outliers

Screen ‘odd’ or extreme values
Subjective definition: sometimes values 3 or 4 standard deviations above or below the mean
Contentious. Positives and negatives.
My personal recommendation: ‘sensitivity analysis’
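One way to set up such a sensitivity analysis is to build two screened versions of the variable, one under each definition, and run the analysis on both. The sketch below assumes the cut-offs are 3 and 4 standard deviations from the mean; the trait values are invented for illustration and `screen_outliers` is a made-up helper name.

```python
from statistics import mean, stdev

def screen_outliers(values, n_sd):
    """Keep values within n_sd standard deviations of the mean."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - m) <= n_sd * s]

# Hypothetical trait values with one extreme observation (21.5).
trait = [20.1, 19.8, 20.4, 20.0, 19.7, 20.3, 19.9, 20.2, 20.0, 19.6,
         20.5, 19.9, 20.1, 20.2, 19.8, 20.0, 20.3, 19.7, 20.1, 21.5]

kept_3sd = screen_outliers(trait, 3)  # stricter definition
kept_4sd = screen_outliers(trait, 4)  # more lenient definition

print(f"original n = {len(trait)}")
print(f"kept within 3 SD: {len(kept_3sd)}, within 4 SD: {len(kept_4sd)}")
```

In this made-up sample the extreme value is dropped under the 3 SD rule but kept under the 4 SD rule, which is exactly the situation where comparing results across definitions is informative.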

Summary

Normal distribution is a symmetrical distribution
Skew and kurtosis represent a deviation from normality
Most genetic tests require a normal distribution
Therefore we try to transform our distributions

Lab Goals

Going to prepare three variables for analysis: fasting VLDL, LDL and HDL particle size (cardiovascular disease risk factors)

1. Prepare a summary table, split by gender, for the trait and relevant covariate characteristics of the sample

2. Decide if the variables need to be transformed, and transform if so.

3. Prepare a variable with no outliers (using 2 definitions)
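For the first goal, a summary table split by gender boils down to grouping the records and reporting n, mean and standard deviation per group. The sketch below uses only the Python standard library; the records, column layout and the `summarize` helper are all hypothetical and stand in for whatever the lab dataset actually contains.

```python
from statistics import mean, stdev

# Hypothetical records, invented for illustration: (gender, age, LDL particle size).
records = [
    ("F", 45, 21.1), ("F", 52, 20.8), ("F", 38, 21.4),
    ("M", 47, 20.5), ("M", 60, 20.2), ("M", 41, 20.9),
]

def summarize(records, column):
    """Return {gender: (n, mean, sd)} for the given column index."""
    groups = {}
    for row in records:
        groups.setdefault(row[0], []).append(row[column])
    return {g: (len(v), mean(v), stdev(v)) for g, v in groups.items()}

for name, column in (("age", 1), ("LDL size", 2)):
    for gender, (n, m, sd) in sorted(summarize(records, column).items()):
        print(f"{name:9s} {gender}  n={n}  mean={m:.2f}  sd={sd:.2f}")
```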

Learning objectives
• Primary
– Give an example of a qualitative trait
– Give an example of a quantitative trait
– With rMZ = .6 and rDZ = .6, what is the heritability?
– Why is heritability population specific?
– How would you recognize a non-normal variable and transform it to a normal distribution?
• Secondary
– Explain why we think quantitative traits are caused by many genetic variants.
– Give one reason why heritability estimates may not be accurate
– Why do we need to include covariate characteristics?
