2008 PGSAS G-nomes

The Human Genome Project

➢ June 26, 2000: Successful completion of the first ‘draft’ of the entire human genome!!!

➢ The race between Celera and NIH is finished. The private company appears to have won.

• But, what the heck is a ‘genome’? What did they/we win?

http://www.celera.com/

http://www.ncbi.nih.gov/

The Chicken Genome Project

➢ An initiative begun by NIH in 2002
➢ Completed in 2004

➢ Other species considered:
➢ Cats, cows, sheep, horses, dogs

➢ Cow begun in 2004
➢ Pig begun in 2005
➢ Who's next?

➢ Look here: Ensembl

➔ But, what the heck is a ‘genome’? What did they/we win?

The Genome (?)

➢ G-nomes; Grumpy and Sleepy?

➢ With apologies to Dr. Dean Snow

✔ Not really.
✔ A genome is a complete sequence of all the known genes of an organism, including their structure and function.

Maps and markers

➢ What’s a genetic map?

➢ With apologies to Dr. David Botstein.

✔ WELL, let’s start with a simpler question:
✔ How do you get to Penn State??

One kind of map of Penn State

Here’s a better view

Now I know this will be helpful

Perhaps we need a different kind of map?

How about this?

Or, this?

Or, even this?

The Genome (among friends)

➢ Chromosomes
➢ Each chromosome is one molecule of DNA.
➢ 10⁷ to 10⁸ base pairs
➢ A structural gene, coding for a polypeptide/protein, is between 10³ and 10⁴ bp.

➢ Approximately 10% of the genome is coding.

➢ DO THE MATH!!
➢ A chromosome contains 1,000 to 10,000 genes.
➢ Vertebrate genomes contain approximately 50,000 to 100,000 genes.

➢ These are generalizations and are highly species specific.

➢ Indeed, calculations from the human genome project suggest that there are approx. 35,000 genes
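The "do the math" step can be retraced as a back-of-envelope calculation. This is only a sketch: it assumes the per-chromosome figure uses the smaller gene size (10³ bp), and the names `CODING_PERCENT`, `GENE_BP` and `genes_per_chromosome` are made up for illustration.

```python
# Back-of-envelope gene counts from the slide's round numbers.
CODING_PERCENT = 10   # roughly 10% of the genome is coding
GENE_BP = 10**3       # assumption: the smaller structural-gene size from the slide


def genes_per_chromosome(chromosome_bp: int) -> int:
    """Coding base pairs on the chromosome divided by the size of one gene."""
    coding_bp = chromosome_bp * CODING_PERCENT // 100
    return coding_bp // GENE_BP


low = genes_per_chromosome(10**7)    # smallest chromosome size on the slide
high = genes_per_chromosome(10**8)   # largest chromosome size on the slide
print(low, high)
```

With the larger gene size (10⁴ bp) the same arithmetic gives ten-fold fewer genes, which is one reason these figures are only generalizations.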

Genes and Markers and Maps

➢ Gene Mapping
➢ Assigning genes to specific positions (i.e., loci) on specific chromosomes.

Structural Genes

➢ Consider Hemoglobin!
➢ Normal adult hemoglobin consists of 2 molecules each of 2 different polypeptides.
➢ α (141 aa) and β (146 aa)
➢ On chromosomes 16 and 11

➢ Given 3 bp per aa
➢ the β chain is coded by 438 bp, so there are 4⁴³⁸ possible sequences of that length
➢ This number exceeds the total number of fundamental particles in the universe.
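The size of that number is easy to check, since Python integers are exact. A minimal sketch: the figure of about 10⁸⁰ particles is a commonly quoted rough estimate and is an outside assumption, not something from these slides. For comparison, the code also counts single-base substitutions from one reference sequence (3 alternatives at each of the 438 positions).

```python
# How many DNA sequences are as long as the beta-globin coding region?
BETA_CHAIN_AA = 146
BP_PER_AA = 3
BASES = 4  # A, C, G, T

coding_bp = BETA_CHAIN_AA * BP_PER_AA       # 146 x 3 coding base pairs
possible_sequences = BASES ** coding_bp     # 4 choices at every position
digits = len(str(possible_sequences))       # number of decimal digits

# Single-base substitutions from one reference sequence:
# each position can change to any of the 3 other bases.
single_bp_substitutions = coding_bp * (BASES - 1)

# Assumption: ~10^80 is a commonly quoted rough particle count.
ROUGH_PARTICLE_COUNT = 10**80

print(coding_bp, digits, single_bp_substitutions)
print(possible_sequences > ROUGH_PARTICLE_COUNT)
```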

Hemoglobin-β mutations

Allele   DNA codon   mRNA codon   Amino Acid   Type
Normal   CTC         GAG          Glutamate    Wild-type
Mutant   CTT         GAA          Glutamate    Same-sense
Mutant   CAC         GUG          Valine       Mis-sense
Mutant   ATC         UAG          Nil-STOP     Non-sense
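The three mutation types in the table can be told apart mechanically. This is a small sketch, not a full genetic code: `CODON_TO_AA` holds only the four mRNA codons that appear in the table, and `transcribe` and `classify` are hypothetical helper names. The DNA codons are read as template-strand codons written position by position against the mRNA, as the table does.

```python
# Only the four mRNA codons from the table; not a complete codon table.
CODON_TO_AA = {"GAG": "Glu", "GAA": "Glu", "GUG": "Val", "UAG": "STOP"}

# Base pairing from DNA template to mRNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_codon: str) -> str:
    """mRNA codon paired position by position with a DNA template codon."""
    return "".join(DNA_TO_RNA[base] for base in template_codon)


def classify(normal_dna: str, mutant_dna: str) -> str:
    """Name the mutation type by comparing the encoded amino acids."""
    normal_aa = CODON_TO_AA[transcribe(normal_dna)]
    mutant_aa = CODON_TO_AA[transcribe(mutant_dna)]
    if mutant_aa == normal_aa:
        return "same-sense"   # codon changed, amino acid did not
    if mutant_aa == "STOP":
        return "non-sense"    # codon now terminates translation
    return "mis-sense"        # a different amino acid is inserted


for mutant in ("CTT", "CAC", "ATC"):
    print(mutant, transcribe(mutant), classify("CTC", mutant))
```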

Mapping

➢ Prior to the 1980s, all mapping was accomplished using major genes of obvious phenotypic effect.
➢ With the advent of RFLPs, AFLPs, microsatellites and other molecular markers, we can identify large numbers of segregating loci simultaneously in the same cross.

➢ Remember that these markers are not true genes; they make up ‘framework maps’ that provide the ‘road map’ to locate genes of interest.
➢ Useful for locating and studying QTL / MAS.
➢ Invaluable for investigating genomic organization across related species/genera.

Electrophoresis Apparatus

(Figure labels: power supply, electrode, buffer solution, gel)

Restriction Endonucleases

Enzyme   Recognition sequence
EcoRI    5' - G A A T T C - 3' / 3' - C T T A A G - 5'
BamHI    5' - G G A T C C - 3' / 3' - C C T A G G - 5'
XhoI     5' - C T C G A G - 3' / 3' - G A G C T C - 5'
HaeIII   5' - G G C C - 3' / 3' - C C G G - 5'
HhaI     5' - G C G C - 3' / 3' - C G C G - 5'
AluI     5' - A G C T - 3' / 3' - T C G A - 5'

• These enzymes cleave DNA at sites of specific, short nucleotide sequences
• More than 500 are commercially available
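Finding where these enzymes would recognize a piece of DNA is a plain string search. A minimal sketch, assuming the standard recognition sequences for the six enzymes named on the slide; the test sequence `dna` is invented, and the code reports where a site starts, not where within it the enzyme cuts.

```python
# Standard recognition sequences (top strand, 5' to 3') for the slide's enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "HaeIII": "GGCC",
    "HhaI": "GCGC",
    "AluI": "AGCT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(dna: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of site, overlaps included."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


# Made-up sequence for illustration only.
dna = "TTGAATTCAAGGCCTTAGCTGGATCC"

for enzyme, site in RECOGNITION_SITES.items():
    print(enzyme, site, find_sites(dna, site))
```

Each of these sites reads the same on both strands (`reverse_complement(site) == site`), so searching one strand is enough.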

Southern Transfer Procedure

(Figure, panels (a) to (c); labels: gel, filter, DNA restriction fragments)

Restriction Fragment Length Polymorphisms

(Figure: genotypes AA, Aa and aa)

Design for Interpreting a Marker Locus and Phenotype

Hypothetical F1 Genotype

Q M / q m, with recombination fraction r between the two loci

F1 Gametes   Frequencies
QM           (1 - r)/2
qm           (1 - r)/2
Qm           r/2
qM           r/2

F2 Genotypic Array

Genotypes   Frequency             Value
QQMM        (1 - r)²/4            +a
QqMM        (r - r²)/2            d
qqMM        r²/4                  -a
QQMm        (r - r²)/2            +a
QqMm        [r² + (1 - r)²]/2     d
qqMm        (r - r²)/2            -a
QQmm        r²/4                  +a
Qqmm        (r - r²)/2            d
qqmm        (1 - r)²/4            -a

Estimation of additive and dominance effects

Marker class   Mean expression
MM             (1 - 2r)a + 2r(1 - r)d
Mm             [(1 - r)² + r²]d
mm             (1 - 2r)(-a) + 2r(1 - r)d

Marker-locus class means (frequency adjusted)

Additivity: (MM - mm)/2 = a(1 - 2r)
Dominance: Mm - (MM + mm)/2 = d(1 - 2r)²
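These class means can be checked numerically by building the F2 from the F1 gamete frequencies. A rough sketch under the slide's setup (an F1 of Q M / q m, recombination fraction r, QTL values +a, d, -a): the function name `f2_marker_means` and the values chosen for r, a and d are arbitrary illustrations.

```python
from itertools import product


def f2_marker_means(r: float, a: float, d: float) -> dict[str, float]:
    """Mean trait value within each marker class (MM, Mm, mm) of an F2."""
    # F1 is QM/qm: parental gametes (1 - r)/2 each, recombinant gametes r/2 each.
    gametes = {"QM": (1 - r) / 2, "qm": (1 - r) / 2, "Qm": r / 2, "qM": r / 2}
    qtl_value = {2: a, 1: d, 0: -a}          # value by number of Q alleles
    marker_class = {2: "MM", 1: "Mm", 0: "mm"}
    freq = {"MM": 0.0, "Mm": 0.0, "mm": 0.0}
    weighted = {"MM": 0.0, "Mm": 0.0, "mm": 0.0}
    # Random union of two F1 gametes gives every F2 genotype.
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        n_q = int(g1[0] == "Q") + int(g2[0] == "Q")
        n_m = int(g1[1] == "M") + int(g2[1] == "M")
        cls = marker_class[n_m]
        freq[cls] += p1 * p2
        weighted[cls] += p1 * p2 * qtl_value[n_q]
    # Frequency-adjusted mean: divide by the marker-class frequency.
    return {cls: weighted[cls] / freq[cls] for cls in freq}


r, a, d = 0.1, 2.0, 1.0   # illustrative values only
means = f2_marker_means(r, a, d)
additive = (means["MM"] - means["mm"]) / 2
dominance = means["Mm"] - (means["MM"] + means["mm"]) / 2

print(additive, a * (1 - 2 * r))         # the two numbers should agree
print(dominance, d * (1 - 2 * r) ** 2)   # likewise
```

Setting r = 0 returns a and d exactly, while r = 0.5 (marker unlinked to the QTL) makes both contrasts vanish.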

Gene Order and Arrangements

➢ Now that we’ve talked about structure and function …

➢ How do we figure out their placement on the map?

➢ We take advantage of a violation of the law.

✔ Specifically, Mendel’s law of independent assortment.

Consequences of crossing over (1)

(Figure labels: chiasma; homologues A B and a b; meiosis)

Linkage between a mutant gene and a marker

(Figure labels: mutant gene, DNA marker, wild-type gene, variant DNA marker)

Consequences of crossing over (2a)

Consequences of crossing over (2b)

Chiasma frequency and distance between loci

(Plot: x-axis, chiasma frequency, 0 to 3.5; y-axis, cM or RF%, 0 to 90; series: map distance and recombination frequency)

Using the test-cross

Class          F1 gamete   Progeny Genotype   Progeny Phenotype   Number
Parentals      AB          AaBb               AB                  68
               ab          aabb               ab                  60
Recombinants   Ab          Aabb               Ab                  4
               aB          aaBb               aB                  3
Total                                                             135

Calculating Recombination Frequency

✔ Test for single factor ratios:

➢ Number of ‘A’ individuals: 68 + 4 = 72
➢ Number of ‘a’ individuals: 60 + 3 = 63
χ² = 0.60; ns

➢ Number of ‘B’ individuals: 68 + 3 = 71
➢ Number of ‘b’ individuals: 60 + 4 = 64
χ² = 0.36; ns

✔ RF = (4 + 3)/135 = 0.0519 or 5.19%
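The same arithmetic in a few lines of code, using the test-cross counts from the table. A minimal sketch: the helper name `chi_square_1to1` is made up, and 3.841 is the usual 5% critical value for χ² with 1 degree of freedom.

```python
# Test-cross progeny counts by phenotype.
counts = {"AB": 68, "ab": 60, "Ab": 4, "aB": 3}
total = sum(counts.values())


def chi_square_1to1(n1: int, n2: int) -> float:
    """Chi-square statistic for a 1:1 expected ratio (1 degree of freedom)."""
    expected = (n1 + n2) / 2
    return ((n1 - expected) ** 2 + (n2 - expected) ** 2) / expected


# Single-factor segregation: each locus on its own should split 1:1.
n_A = counts["AB"] + counts["Ab"]
n_a = counts["ab"] + counts["aB"]
n_B = counts["AB"] + counts["aB"]
n_b = counts["ab"] + counts["Ab"]
chi_A = chi_square_1to1(n_A, n_a)
chi_B = chi_square_1to1(n_B, n_b)

# Recombination frequency: recombinant progeny over all progeny.
rf = (counts["Ab"] + counts["aB"]) / total

CRITICAL_1DF_5PCT = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05
print(f"chi2(A) = {chi_A:.2f}, chi2(B) = {chi_B:.2f}, RF = {rf:.2%}")
```

Both statistics fall well below 3.841, so neither locus departs from 1:1 and the RF estimate can be taken at face value.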

What if you had 3 genes of interest?

➢ Start with an F1 produced by 2 pureline parents (AABBCC x aabbcc).

➢ Backcross the F1 to the triple-recessive parent.
➢ Check that all alleles are segregating in a 1:1 ratio in the backcross
➢ Altered segregation will give a poor estimate of RF%.
➢ differential survival
➢ misclassification

Here’s how to determine gene order

F1 gametes

Class   Progeny phenotypes   Number of progeny
1       ABC                  103
        abc                  120
2       ABc                  51
        abC                  49
3       Abc                  14
        aBC                  7
4       AbC                  24
        aBc                  32
TOTAL                        400

Calculate RF% as before

➢ ALL χ² are non-significant.

➢ A – B = (14+7+24+32)/400 = 0.1925 or 19.25%

➢ A – C = (51+49+14+7)/400 = 0.3025 or 30.25%

➢ B – C = (51+49+24+32)/400 = 0.3900 or 39.00%


And the answer is:

B --- 19.25% --- A --- 30.25% --- C
B to C: 39.00%

Since B-C is the largest RF, genes B and C must be the furthest apart, with A in between.
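The three pairwise RFs and the gene order can be worked out from the progeny counts in one pass. This is a sketch of the reasoning above, not a general mapping tool: it assumes the parental gametes are ABC and abc, so upper versus lower case tells which parent an allele came from; `is_recombinant` is a made-up helper name.

```python
from itertools import combinations

# Backcross progeny counts from the three-gene table.
counts = {"ABC": 103, "abc": 120, "ABc": 51, "abC": 49,
          "Abc": 14, "aBC": 7, "AbC": 24, "aBc": 32}
genes = "ABC"
total = sum(counts.values())


def is_recombinant(phenotype: str, i: int, j: int) -> bool:
    """True when loci i and j carry alleles from different parents."""
    return phenotype[i].isupper() != phenotype[j].isupper()


# Pairwise recombination frequencies.
rf = {}
for i, j in combinations(range(3), 2):
    recombinants = sum(n for p, n in counts.items() if is_recombinant(p, i, j))
    rf[genes[i] + genes[j]] = recombinants / total

# The pair with the largest RF sits at the ends; the remaining gene is in the middle.
widest = max(rf, key=rf.get)
middle = (set(genes) - set(widest)).pop()
order = widest[0] + middle + widest[1]

print(rf, order)
```

Note that 19.25% + 30.25% = 49.5%, which is more than the 39.00% measured directly between B and C. The difference, 10.5%, is twice the 21 class 3 progeny out of 400: they are double crossovers, and a double crossover leaves B and C in their parental combination, so it goes uncounted in the B – C total.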

Genes and Markers and Maps

➢ Gene Mapping
➢ Assigning genes to specific positions (i.e., loci) on specific chromosomes.
➢ Linkage
➢ Genes that are located on the same chromosome are ‘linked’.

The Human Map

Chromosome   Size
1            263 Mb
17           92 Mb
21           50 Mb
X            164 Mb
