2008 PGSAS G-nomes


Page 1: 2008 PGSAS G-nomes

The Human Genome Project

➢ June 26, 2000: Successful completion of the first ‘draft’ of the entire human genome!!!

➢ The race between Celera and NIH is finished. The private company appears to have won.

• But, what the heck is a ‘genome’? What did they/we win?

Page 2: 2008 PGSAS G-nomes

The Chicken Genome Project

➢ An initiative begun by NIH in 2002
➢ Completed in 2004

➢ Other species considered:
➢ Cats, cows, sheep, horses, dogs

➢ Cow begun in 2004
➢ Pig begun in 2005
➢ Who's next?

➢ Look here: Ensembl

➔ But, what the heck is a ‘genome’? What did they/we win?

Page 3: 2008 PGSAS G-nomes

The Genome (?)

➢ G-nomes: Grumpy and Sleepy?

➢ With apologies to Dr. Dean Snow

✔ Not really.
✔ A genome is the complete DNA sequence of an organism, including all of its known genes and their structure and function.

Page 4: 2008 PGSAS G-nomes

Maps and markers

➢ What’s a genetic map?

➢ With apologies to Dr. David Botstein.

✔ WELL, let’s start with a simpler question:
✔ How do you get to Penn State??

Page 5: 2008 PGSAS G-nomes

One kind of map of Penn State

Page 6: 2008 PGSAS G-nomes

Here’s a better view

Page 7: 2008 PGSAS G-nomes

Now I know this will be helpful

Page 8: 2008 PGSAS G-nomes

Perhaps we need a different kind of map?

Page 9: 2008 PGSAS G-nomes

How about this?

Page 10: 2008 PGSAS G-nomes

Or, this?

Page 11: 2008 PGSAS G-nomes

Or, even this?

Page 12: 2008 PGSAS G-nomes

The Genome (among friends)

➢ Chromosomes
➢ Each chromosome is one molecule of DNA.
➢ $10^7$ to $10^8$ base pairs
➢ A structural gene, coding for a polypeptide/protein, is between $10^3$ and $10^4$ bp.

➢ Approximately 10% of the genome is coding.

➢ DO THE MATH!!
➢ A chromosome contains 1,000 to 10,000 genes.
➢ Vertebrate genomes contain approximately 50,000 to 100,000 genes.

➢ These are generalizations and are highly species specific.

➢ Indeed, calculations from the Human Genome Project suggest that there are approx. 35,000 genes.
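The "do the math" step can be sketched in a few lines of Python. This is only a back-of-the-envelope check under the slide's own round numbers; the function name `genes_per_chromosome` is made up for illustration. It suggests that the 1,000 to 10,000 range corresponds to genes of about 1,000 bp, while genes of about 10,000 bp would give ten-fold fewer.

```python
from fractions import Fraction

# "Approximately 10% of the genome is coding" (figure taken from the slide).
CODING_FRACTION = Fraction(1, 10)


def genes_per_chromosome(chromosome_bp, coding_fraction, gene_bp):
    """Coding base pairs on the chromosome divided by the length of one gene."""
    return int(chromosome_bp * coding_fraction / gene_bp)


# Try both ends of the slide's ranges for chromosome size and gene length.
for chromosome_bp in (10**7, 10**8):
    for gene_bp in (10**3, 10**4):
        n = genes_per_chromosome(chromosome_bp, CODING_FRACTION, gene_bp)
        print(f"{chromosome_bp:.0e} bp chromosome, {gene_bp:.0e} bp genes: {n:,} genes")
```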

Page 13: 2008 PGSAS G-nomes

Genes and Markers and Maps

➢ Gene Mapping
➢ Locating genes at specific positions (i.e., loci) on specific chromosomes.

Page 14: 2008 PGSAS G-nomes

Structural Genes

➢ Consider Hemoglobin!
➢ Normal adult hemoglobin consists of 2 molecules each of 2 different polypeptides.
➢ α (141 aa) and β (146 aa)
➢ On chromosomes 16 and 11

➢ Given 3 bp per aa
➢ the β chain coding sequence is 146 × 3 = 438 bp, so there are $4^{438}$ possible sequences of that length
➢ This number exceeds the total number of fundamental particles in the universe.
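The size of that number is easy to check with Python's arbitrary-precision integers. The sketch below assumes the commonly quoted order of magnitude of about $10^{80}$ particles in the observable universe, which is not a figure from the slides. It also counts the single-base substitutions of one given sequence, which is a much smaller number than the count of all possible sequences.

```python
BETA_CHAIN_AA = 146   # amino acids in the beta chain (from the slide)
BP_PER_AA = 3         # base pairs per amino acid

coding_bp = BETA_CHAIN_AA * BP_PER_AA          # 438 bp
possible_sequences = 4 ** coding_bp            # every position can be A, C, G or T
single_bp_substitutions = coding_bp * 3        # each position can change to 3 other bases

# Assumption: rough order of magnitude often quoted for the observable universe.
PARTICLES_IN_UNIVERSE = 10 ** 80

print(f"coding length: {coding_bp} bp")
print(f"possible sequences: a {len(str(possible_sequences))}-digit number")
print(f"single-base substitutions of one sequence: {single_bp_substitutions}")
print(f"exceeds 10^80: {possible_sequences > PARTICLES_IN_UNIVERSE}")
```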

Page 15: 2008 PGSAS G-nomes

Hemoglobin-β mutations

Allele    DNA codon    mRNA codon    Amino acid    Type
Normal    CTC          GAG           Glutamate     Wild-type
Mutant    CTT          GAA           Glutamate     Same-sense
Mutant    CAC          GUG           Valine        Mis-sense
Mutant    ATC          UAG           Nil (STOP)    Non-sense
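The four alleles on this slide can be classified mechanically. The following is a minimal sketch, not a general tool: the lookup table holds only the four mRNA codons shown here, and the helper names `transcribe` and `classify_mutation` are made up for illustration. The DNA codons are read as the template strand, paired base by base with the mRNA as on the slide.

```python
# Base pairing from the DNA template strand to mRNA.
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Only the four codons that appear on the slide; not the full genetic code.
CODON_TO_AMINO_ACID = {
    "GAG": "Glutamate",
    "GAA": "Glutamate",
    "GUG": "Valine",
    "UAG": "STOP",
}


def transcribe(dna_codon):
    """Return the mRNA codon that pairs with a DNA template codon."""
    return "".join(DNA_TO_MRNA[base] for base in dna_codon)


def classify_mutation(normal_dna, mutant_dna):
    """Label a codon change as wild-type, same-sense, mis-sense or non-sense."""
    if mutant_dna == normal_dna:
        return "wild-type"
    normal_aa = CODON_TO_AMINO_ACID[transcribe(normal_dna)]
    mutant_aa = CODON_TO_AMINO_ACID[transcribe(mutant_dna)]
    if mutant_aa == "STOP":
        return "non-sense"
    if mutant_aa == normal_aa:
        return "same-sense"
    return "mis-sense"


for codon in ("CTC", "CTT", "CAC", "ATC"):
    print(codon, transcribe(codon), classify_mutation("CTC", codon))
```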

Page 16: 2008 PGSAS G-nomes

Mapping

➢ Prior to the 1980s, all mapping was accomplished using major genes of obvious phenotypic effect.
➢ With the advent of RFLPs, AFLPs, microsatellites and other molecular markers, we can identify large numbers of segregating loci simultaneously in the same cross.

➢ Remember that these markers are not true genes. They form ‘framework maps’ that provide the ‘road map’ to locate genes of interest.
➢ Useful for locating and studying QTL / MAS.
➢ Invaluable for investigating genomic organization across related species/genera.

Page 17: 2008 PGSAS G-nomes

Electrophoresis Apparatus

[Figure labels: Power Supply, Electrode, Buffer Solution, Gel]

Page 18: 2008 PGSAS G-nomes

Restriction Endonucleases

Enzyme    Top strand               Bottom strand
EcoRI     5' - G A A T T C - 3'    3' - C T T A A G - 5'
BamHI     5' - G G A T C C - 3'    3' - C C T A G G - 5'
XhoI      5' - C T C G A G - 3'    3' - G A G C T C - 5'
HaeIII    5' - G G C C - 3'        3' - C C G G - 5'
HhaI      5' - G C G C - 3'        3' - C G C G - 5'
AluI      5' - A G C T - 3'        3' - T C G A - 5'

· These enzymes cleave DNA at sites of specific, short nucleotide sequences

· More than 500 are commercially available
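Finding where such enzymes would cut comes down to a substring search. The sketch below uses the recognition sequences as they are commonly documented for these six enzymes; the demo sequence is invented, the helper names are hypothetical, and the exact cut position within each site is not modelled. It also checks that each site reads the same on both strands, which is why only one strand needs to be searched.

```python
# Recognition sequences (5' to 3') as commonly documented for these enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "HaeIII": "GGCC",
    "HhaI": "GCGC",
    "AluI": "AGCT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, site):
    """Return every 0-based start position of `site` in `seq` (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Hypothetical sequence, invented purely for this demonstration.
demo_sequence = "TTGAATTCAAGGCCTTAGCTGGATCC"

for enzyme, site in RECOGNITION_SITES.items():
    palindromic = site == reverse_complement(site)
    print(enzyme, site, "palindromic" if palindromic else "asymmetric",
          find_sites(demo_sequence, site))
```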

Page 19: 2008 PGSAS G-nomes

Southern Transfer Procedure

[Figure: panels (a), (b), (c); labels: Gel, Filter, DNA restriction fragments]

Page 20: 2008 PGSAS G-nomes

Restriction Fragment Length Polymorphisms

[Figure: genotypes AA, Aa, aa]

Page 21: 2008 PGSAS G-nomes

Design for Interpreting a Marker Locus and Phenotype

Hypothetical F1 genotype: Q M / q m, with recombination fraction r between Q and M

F1 gametes    Frequency
QM            $(1 - r)/2$
qm            $(1 - r)/2$
Qm            $r/2$
qM            $r/2$

Page 22: 2008 PGSAS G-nomes

F2 Genotypic Array

Genotype    Frequency                  Value
QQMM        $(1 - r)^2/4$              +a
QqMM        $(r - r^2)/2$              d
qqMM        $r^2/4$                    -a
QQMm        $(r - r^2)/2$              +a
QqMm        $[r^2 + (1 - r)^2]/2$      d
qqMm        $(r - r^2)/2$              -a
QQmm        $r^2/4$                    +a
Qqmm        $(r - r^2)/2$              d
qqmm        $(1 - r)^2/4$              -a
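The F2 array follows from pairing the four F1 gamete types at random. A minimal sketch, assuming exact arithmetic with `fractions.Fraction` and hypothetical function names, builds the nine genotype frequencies from the gamete frequencies $(1 - r)/2$ and $r/2$. A useful sanity check is that the nine frequencies sum to 1, which requires every class with one parental and one recombinant gamete to have frequency $(r - r^2)/2$.

```python
from fractions import Fraction
from itertools import product


def gamete_frequencies(r):
    """F1 gamete frequencies for a Q M / q m parent with recombination fraction r."""
    return {"QM": (1 - r) / 2, "qm": (1 - r) / 2, "Qm": r / 2, "qM": r / 2}


def f2_array(r):
    """Genotype frequencies in the F2, from random union of F1 gametes."""
    gametes = gamete_frequencies(r)
    array = {}
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        # sorted() puts the uppercase allele first, so "qQ" becomes "Qq".
        q_locus = "".join(sorted(g1[0] + g2[0]))
        m_locus = "".join(sorted(g1[1] + g2[1]))
        genotype = q_locus + m_locus
        array[genotype] = array.get(genotype, 0) + f1 * f2
    return array


r = Fraction(1, 10)  # illustrative value only
for genotype, freq in sorted(f2_array(r).items()):
    print(genotype, freq)
print("total:", sum(f2_array(r).values()))
```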

Page 23: 2008 PGSAS G-nomes

Estimation of additive and dominance effects

Marker class    Mean expression
MM              $(1 - 2r)a + 2r(1 - r)d$
Mm              $[(1 - r)^2 + r^2]d$
mm              $(1 - 2r)(-a) + 2r(1 - r)d$

Marker-locus class means (frequency adjusted)

Additivity: $(MM - mm)/2 = a(1 - 2r)$
Dominance: $Mm - (MM + mm)/2 = d(1 - 2r)^2$
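Read in reverse, these two contrasts give estimates of a and d when r is known. The sketch below is an illustration under that assumption only: the function names and the example values a = 10, d = 4, r = 0.1 are invented, and in practice r is usually not known independently, so the marker contrasts alone give only $a(1 - 2r)$ and $d(1 - 2r)^2$.

```python
from fractions import Fraction


def marker_class_means(a, d, r):
    """Expected means of the MM, Mm and mm marker classes."""
    mean_MM = (1 - 2 * r) * a + 2 * r * (1 - r) * d
    mean_Mm = ((1 - r) ** 2 + r ** 2) * d
    mean_mm = (1 - 2 * r) * (-a) + 2 * r * (1 - r) * d
    return mean_MM, mean_Mm, mean_mm


def estimate_effects(mean_MM, mean_Mm, mean_mm, r):
    """Recover a and d from the marker class means, given r (must not be 1/2)."""
    if r == Fraction(1, 2):
        raise ValueError("marker and QTL are unlinked; effects cannot be estimated")
    a = (mean_MM - mean_mm) / 2 / (1 - 2 * r)
    d = (mean_Mm - (mean_MM + mean_mm) / 2) / (1 - 2 * r) ** 2
    return a, d


# Invented example values, used only to show the round trip.
a_true, d_true, r = 10, 4, Fraction(1, 10)
means = marker_class_means(a_true, d_true, r)
print("class means (MM, Mm, mm):", means)
print("apparent additive effect:", (means[0] - means[2]) / 2)
print("recovered (a, d):", estimate_effects(*means, r))
```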

Page 24: 2008 PGSAS G-nomes

Gene Order and Arrangements

➢ Now that we’ve talked about structure and function …

➢ How do we figure out their placement on the map?

➢ We take advantage of a violation of the law.

✔ Specifically, Mendel’s law of independent assortment.

Page 25: 2008 PGSAS G-nomes

Consequences of crossing over (1)

[Figure labels: Chiasma; chromatids carrying A B and a b]

Page 26: 2008 PGSAS G-nomes

Linkage between a mutant gene and a marker

[Figure labels: Meiosis; Mutant gene; DNA marker; Wild-type gene; Variant DNA marker]

Page 27: 2008 PGSAS G-nomes

Consequences of crossing over (2a)

Page 28: 2008 PGSAS G-nomes

Consequences of crossing over (2b)

Page 29: 2008 PGSAS G-nomes

Chiasma frequency and distance between loci

[Plot: x-axis, Chiasma Frequency (0 to 3.5); y-axis, cM or RF% (0 to 90); series: MAP DISTANCE and RECOMBINATION FREQUENCY]

Page 30: 2008 PGSAS G-nomes

Using the test-cross

Class           F1 gamete    Progeny Genotype    Progeny Phenotype    Number
Parentals       AB           AaBb                AB                   68
                ab           aabb                ab                   60
Recombinants    Ab           Aabb                Ab                   4
                aB           aaBb                aB                   3
Total                                                                 135

Page 31: 2008 PGSAS G-nomes

Calculating Recombination Frequency

✔ Test for single factor ratios:

➢ Number of ‘A’ individuals: 68 + 4 = 72
➢ Number of ‘a’ individuals: 60 + 3 = 63

$\chi^2 = 0.60$; ns

➢ Number of ‘B’ individuals: 68 + 3 = 71
➢ Number of ‘b’ individuals: 60 + 4 = 64

$\chi^2 = 0.36$; ns

✔ RF = (4 + 3)/135 = 0.0519 or 5.19%
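The same arithmetic can be written out as a short script. This is a sketch using the test-cross counts from the previous slide; the function names are hypothetical, and the critical value of 3.84 is the standard 5% cutoff for a chi-square with 1 degree of freedom.

```python
# Test-cross progeny counts by phenotype (from the test-cross table).
counts = {"AB": 68, "ab": 60, "Ab": 4, "aB": 3}

CHI2_CRITICAL_1DF = 3.84  # 5% significance level, 1 degree of freedom


def chi_square_1to1(n1, n2):
    """Chi-square statistic for an expected 1:1 segregation ratio."""
    expected = (n1 + n2) / 2
    return ((n1 - expected) ** 2 + (n2 - expected) ** 2) / expected


def recombination_fraction(counts, recombinant_classes):
    """Recombinant progeny divided by total progeny."""
    recombinants = sum(counts[c] for c in recombinant_classes)
    return recombinants / sum(counts.values())


# Single-factor counts: pool the classes that share an allele.
n_A = counts["AB"] + counts["Ab"]   # 72
n_a = counts["ab"] + counts["aB"]   # 63
n_B = counts["AB"] + counts["aB"]   # 71
n_b = counts["ab"] + counts["Ab"]   # 64

chi2_A = chi_square_1to1(n_A, n_a)
chi2_B = chi_square_1to1(n_B, n_b)
rf = recombination_fraction(counts, ("Ab", "aB"))

print(f"A locus: chi2 = {chi2_A:.2f}, significant: {chi2_A > CHI2_CRITICAL_1DF}")
print(f"B locus: chi2 = {chi2_B:.2f}, significant: {chi2_B > CHI2_CRITICAL_1DF}")
print(f"RF = {rf:.4f} ({rf * 100:.2f}%)")
```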

Page 32: 2008 PGSAS G-nomes

What if you had 3 genes of interest?

➢ Start with an F1 produced by 2 pure-line parents (AABBCC x aabbcc).

➢ Backcross the F1 to the triple-recessive parent.
➢ Check that all alleles are segregating in a 1:1 ratio in the backcross.
➢ Altered segregation will give a poor estimate of RF%.

➢ differential survival
➢ misclassification

Page 33: 2008 PGSAS G-nomes

Here’s how to determine gene order

Class    F1 gametes / Progeny phenotypes    Number of progeny
1        ABC                                103
         abc                                120
2        ABc                                51
         abC                                49
3        Abc                                14
         aBC                                7
4        AbC                                24
         aBc                                32
TOTAL                                       400

Page 34: 2008 PGSAS G-nomes

Calculate RF% as before

➢ All $\chi^2$ values are non-significant.

➢ A – B = (14 + 7 + 24 + 32)/400 = 0.1925 or 19.25%

➢ A – C = (51 + 49 + 14 + 7)/400 = 0.3025 or 30.25%

➢ B – C = (51 + 49 + 24 + 32)/400 = 0.3900 or 39.00%

Page 35: 2008 PGSAS G-nomes

And the answer is:

B ---- 19.25% ---- A ---- 30.25% ---- C
B ------------- 39.00% -------------- C

Since B–C is the largest RF, genes B and C must be the furthest apart, while A is in between.
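The three-point analysis can be reproduced from the progeny counts alone. In this sketch (function names are hypothetical), a progeny class counts as recombinant for a pair of loci when one locus shows the uppercase allele and the other the lowercase one, since the parental gametes are ABC and abc. The pair with the largest RF is taken as the two outer genes.

```python
from fractions import Fraction

# Three-point test-cross counts by progeny phenotype (from the table above).
counts = {
    "ABC": 103, "abc": 120,
    "ABc": 51, "abC": 49,
    "Abc": 14, "aBC": 7,
    "AbC": 24, "aBc": 32,
}
LOCI = "ABC"


def recombination_fraction(counts, i, j):
    """RF between loci i and j: classes where the two loci differ in case."""
    recombinants = sum(
        n for phenotype, n in counts.items()
        if phenotype[i].isupper() != phenotype[j].isupper()
    )
    return Fraction(recombinants, sum(counts.values()))


def gene_order(rf_by_pair):
    """The pair with the largest RF is outermost; the remaining locus is in the middle."""
    outer = max(rf_by_pair, key=rf_by_pair.get)
    middle = (set(LOCI) - set(outer)).pop()
    return outer[0] + middle + outer[1]


rf_by_pair = {
    (LOCI[i], LOCI[j]): recombination_fraction(counts, i, j)
    for i, j in ((0, 1), (0, 2), (1, 2))
}

for (x, y), rf in rf_by_pair.items():
    print(f"{x} - {y}: {float(rf) * 100:.2f}%")
print("gene order:", gene_order(rf_by_pair))
```

Note that 19.25% + 30.25% = 49.5%, which is more than the 39.00% measured directly between B and C. The difference is explained by the two rarest classes (Abc and aBC, 21 progeny), which are the double crossovers: they are recombinant for both intervals but look parental for B and C, so the B–C estimate misses them twice (2 × 21/400 = 10.5%).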

Page 36: 2008 PGSAS G-nomes

Genes and Markers and Maps

➢ Gene Mapping
➢ Locating genes at specific positions (i.e., loci) on specific chromosomes.

➢ Linkage
➢ Genes that are located on the same chromosome are ‘linked’.

Page 37: 2008 PGSAS G-nomes

The Human Map

Chromosome    Size
1             263 Mb
17            92 Mb
21            50 Mb
X             164 Mb