Crossing over and map distance
Crossing over and map distance
• Replicated chromosomes during meiosis consist of two sister chromatids.
• Crossovers occur between non-sister chromatids from homologous chromosomes, thereby producing recombinant haplotypes.
• Recombination frequencies between widely spaced genes tend towards 50%.
Crossing over and map distance
• We observe the frequency of recombinants, not the frequency of crossing over
• An odd number of crossovers between two loci produces recombinant haplotypes, whereas an even number of crossovers between two loci produces non-recombinant haplotypes.
– recombination frequency ≠ crossover frequency
• The presence of one crossover often suppresses crossover in the immediate vicinity, a phenomenon known as interference.
– recombination frequencies are not additive
Map distance
• The genetic map distance between two genes is the expected number of crossovers between them on a single chromatid. It is expressed in Morgans (M) or centiMorgans (cM), where 1 M = 100 cM.
• The mean number of crossovers per chromatid between A and C in the diagram shown below is one: (0 + 1 + 2 + 1)/4 = 1.
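[Figure: the four chromatids (1 to 4) of a pair of homologous chromosomes carrying loci A, B, and C, shown at the meiotic and post-meiotic stages. The parental haplotypes are ABC and abc; the post-meiotic haplotypes are ABC, Abc, abC, and aBc.]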
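The arithmetic for the four chromatids can be checked with a short sketch. The list of crossover counts (0, 1, 2, 1) is taken from the slide; the function and variable names are illustrative assumptions. It also shows that only chromatids with an odd number of crossovers between A and C count as recombinant.

```python
from fractions import Fraction

# Crossovers between A and C on each of the four chromatids (from the slide).
crossovers_per_chromatid = [0, 1, 2, 1]

def mean_crossovers(counts):
    """Mean number of crossovers per chromatid, i.e. map distance in Morgans."""
    return Fraction(sum(counts), len(counts))

def recombinant_fraction(counts):
    """Fraction of chromatids with an odd number of crossovers (recombinant for A and C)."""
    odd = sum(1 for c in counts if c % 2 == 1)
    return Fraction(odd, len(counts))

map_distance_morgans = mean_crossovers(crossovers_per_chromatid)
r_ac = recombinant_fraction(crossovers_per_chromatid)
print(map_distance_morgans, r_ac)  # 1 1/2
```

Here the map distance is 1 M while the recombinant fraction is only 1/2, because the double-crossover chromatid is non-recombinant for the outer loci.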
Mapping functions
• Recombination frequency and map distance are equal (r = d) in the absence of multiple crossovers.
• Mapping functions transform recombination frequencies into additive map distances. They give a better estimate of map distance because single crossover events are counted once and double crossover events are counted twice.
Morgan mapping function (1928)
• With complete interference (coefficient of coincidence C = 0) and no multiple crossovers:
$$d = r$$
d = map distance in M
r = recombination frequency
• In most cases, this assumption applies only to very tightly linked loci (r < 0.1)
Haldane mapping function (1919)
• With no interference (C = 1), the distribution of crossovers is Poisson
$$d = -\tfrac{1}{2}\ln(1 - 2\hat{r})$$
x = number of crossovers
d = map distance in M
r = recombination frequency
Haldane map distance
Distances will be additive only when there is no interference
$$\Pr(x; d) = \frac{e^{-d} d^{x}}{x!}$$
$$d_{AC} = d_{AB} + d_{BC}$$
$$\hat{r}_{AC} = \hat{r}_{AB} + \hat{r}_{BC} - 2\hat{r}_{AB}\hat{r}_{BC}$$
Haldane mapping function (1919)
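Observed recombination frequencies:
$$\hat{r}_{AB} = 0.22, \quad \hat{r}_{AC} = 0.10, \quad \hat{r}_{BC} = 0.3$$
Order BAC
$$d_{AB} = -\tfrac{1}{2}\ln(1 - 2\hat{r}_{AB}) = -\tfrac{1}{2}\ln[1 - 2(0.22)] = 0.2899$$
$$d_{AC} = -\tfrac{1}{2}\ln(1 - 2\hat{r}_{AC}) = -\tfrac{1}{2}\ln[1 - 2(0.10)] = 0.1116$$
$$\hat{r}_{BC} = \hat{r}_{AB} + \hat{r}_{AC} - 2\hat{r}_{AB}\hat{r}_{AC} = 0.22 + 0.1 - 2(0.22)(0.1) = 0.276$$
$$\hat{r}_{BC} = 0.276 \neq \hat{r}_{AB} + \hat{r}_{AC} = 0.22 + 0.10 = 0.32$$
$$d_{BC} = -\tfrac{1}{2}\ln(1 - 2\hat{r}_{BC}) = -\tfrac{1}{2}\ln[1 - 2(0.276)] = 0.4015$$
$$d_{BC} = d_{AB} + d_{AC} = 0.2899 + 0.1116 = 0.4015$$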
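The worked example for order BAC can be reproduced numerically. This is a minimal sketch using the Haldane function as stated on the slides; the function names are illustrative assumptions, and the inputs are the slide values of 0.22 and 0.10.

```python
import math

def haldane_distance(r):
    """Haldane map distance in Morgans for a recombination fraction 0 <= r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)

def haldane_recombination(d):
    """Inverse of haldane_distance: recombination fraction for map distance d (Morgans)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))

r_ab, r_ac = 0.22, 0.10                     # values from the slide
d_ab = haldane_distance(r_ab)
d_ac = haldane_distance(r_ac)
r_bc = r_ab + r_ac - 2.0 * r_ab * r_ac      # no-interference addition rule
d_bc = haldane_distance(r_bc)

print(round(d_ab, 4), round(d_ac, 4), round(r_bc, 4), round(d_bc, 4))
# 0.2899 0.1116 0.276 0.4015
```

The distances add (d_ab + d_ac equals d_bc) even though the recombination fractions do not.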
Kosambi mapping function (1944)
• Condition: C = 2r
• Makes sense biologically
– C tends to 1 as r approaches 0.5
– C tends to 0 as r approaches 0.0
• Widely used
$$d = \frac{1}{4}\ln\frac{1 + 2\hat{r}}{1 - 2\hat{r}}$$
• For three linked loci ordered ABC, Kosambi’s addition formula for the recombination fractions of the loci is
$$\hat{r}_{AC} = \frac{\hat{r}_{AB} + \hat{r}_{BC}}{1 + 4\hat{r}_{AB}\hat{r}_{BC}} = \frac{C}{2}$$
Kosambi mapping function (1944)
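Observed recombination frequencies:
$$\hat{r}_{AB} = 0.22, \quad \hat{r}_{AC} = 0.10, \quad \hat{r}_{BC} = 0.3$$
Order BAC
$$d_{AB} = \frac{1}{4}\ln\frac{1 + 2\hat{r}_{AB}}{1 - 2\hat{r}_{AB}} = \frac{1}{4}\ln\frac{1 + 2(0.22)}{1 - 2(0.22)} = 0.2361$$
$$d_{AC} = \frac{1}{4}\ln\frac{1 + 2\hat{r}_{AC}}{1 - 2\hat{r}_{AC}} = \frac{1}{4}\ln\frac{1 + 2(0.1)}{1 - 2(0.1)} = 0.1014$$
$$C = 2\hat{r}_{BC} = \frac{2(\hat{r}_{AB} + \hat{r}_{AC})}{1 + 4\hat{r}_{AB}\hat{r}_{AC}} = \frac{2(0.22 + 0.1)}{1 + 4(0.22)(0.1)} = 0.5882$$
$$\hat{r}_{BC} = \hat{r}_{AB} + \hat{r}_{AC} - 2C\hat{r}_{AB}\hat{r}_{AC} = 0.22 + 0.1 - 2(0.5882)(0.22)(0.1) = 0.2941$$
$$\hat{r}_{BC} = 0.2941 \neq \hat{r}_{AB} + \hat{r}_{AC} = 0.22 + 0.10 = 0.32$$
$$d_{BC} = \frac{1}{4}\ln\frac{1 + 2\hat{r}_{BC}}{1 - 2\hat{r}_{BC}} = \frac{1}{4}\ln\frac{1 + 2(0.2941)}{1 - 2(0.2941)} = 0.3375$$
$$d_{BC} = d_{AB} + d_{AC} = 0.2361 + 0.1014 = 0.3375$$
Kosambi’s addition formula for order BAC:
$$\hat{r}_{BC} = \frac{\hat{r}_{AB} + \hat{r}_{AC}}{1 + 4\hat{r}_{AB}\hat{r}_{AC}} = \frac{C}{2}$$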
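The Kosambi example can be checked the same way. This is a minimal sketch with illustrative function names; the inputs are the slide values of 0.22 and 0.10 for order BAC.

```python
import math

def kosambi_distance(r):
    """Kosambi map distance in Morgans for a recombination fraction 0 <= r < 0.5."""
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_add(r1, r2):
    """Kosambi's addition formula for two adjacent intervals."""
    return (r1 + r2) / (1.0 + 4.0 * r1 * r2)

r_ab, r_ac = 0.22, 0.10            # values from the slide
d_ab = kosambi_distance(r_ab)
d_ac = kosambi_distance(r_ac)
r_bc = kosambi_add(r_ab, r_ac)     # recombination fraction over the outer interval
coincidence = 2.0 * r_bc           # Kosambi's condition C = 2r
d_bc = kosambi_distance(r_bc)

print(round(d_ab, 4), round(d_ac, 4), round(coincidence, 4),
      round(r_bc, 4), round(d_bc, 4))
# 0.2361 0.1014 0.5882 0.2941 0.3375
```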
Felsenstein Mapping Function (1979)
$$d = \frac{1}{2(k - 2)}\ln\frac{1 - 2r}{1 - 2(k - 1)r}$$
where 0 ≤ k ≤ 2
• equal to the Haldane mapping function when k = 1
• equal to the Kosambi mapping function when k = 0
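The two special cases can be verified numerically. This sketch assumes the commonly quoted form of the Felsenstein function, d = ln[(1 - 2r)/(1 - 2(k - 1)r)] / (2(k - 2)); the function names are illustrative, and k = 2 is excluded because the prefactor is undefined there.

```python
import math

def felsenstein_distance(r, k):
    """Felsenstein map distance (Morgans); assumes k != 2 and 0 <= r < 0.5."""
    return math.log((1.0 - 2.0 * r) / (1.0 - 2.0 * (k - 1.0) * r)) / (2.0 * (k - 2.0))

def haldane_distance(r):
    return -0.5 * math.log(1.0 - 2.0 * r)

def kosambi_distance(r):
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

for r in (0.05, 0.10, 0.22, 0.40):
    # k = 1 reduces to Haldane, k = 0 reduces to Kosambi
    assert math.isclose(felsenstein_distance(r, 1.0), haldane_distance(r))
    assert math.isclose(felsenstein_distance(r, 0.0), kosambi_distance(r))
```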
Binomial mapping function
• Proposed by Karlin (1984)
• When a maximum of N crossovers arise in an interval of length d with binomial probability, the mapping function is
$$r = \frac{1}{2}\left[1 - \left(1 - \frac{2d}{N}\right)^{N}\right]$$
if d < N/2, and r = 1/2 otherwise
• Map distance is estimated by the inverse of the binomial mapping function
$$d = \frac{N}{2}\left[1 - (1 - 2r)^{1/N}\right]$$
• Incorporates interference
• Multilocus feasible
• Recommended by Ott (1991)
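A small sketch of the binomial mapping function and its inverse, under the formulas above; the function names and the test values (d = 0.3 M, N = 2) are illustrative assumptions.

```python
def binomial_recombination(d, n):
    """Recombination fraction for map distance d (Morgans) with at most n crossovers."""
    if d >= n / 2.0:
        return 0.5
    return 0.5 * (1.0 - (1.0 - 2.0 * d / n) ** n)

def binomial_distance(r, n):
    """Inverse mapping: map distance (Morgans) for recombination fraction 0 <= r <= 0.5."""
    return (n / 2.0) * (1.0 - (1.0 - 2.0 * r) ** (1.0 / n))

r = binomial_recombination(0.3, 2)   # 0.5 * (1 - 0.7**2) = 0.255
d = binomial_distance(r, 2)          # round trip back to about 0.3
print(round(r, 4), round(d, 4))

# With N = 1 the function reduces to r = d, the complete-interference (Morgan) case.
assert abs(binomial_recombination(0.2, 1) - 0.2) < 1e-12
```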
[Figure: Recombination fraction as a function of the map distance. x-axis: map distance (M), 0.00 to 1.00; y-axis: recombination fraction, 0.0 to 0.5. Curves: Haldane, Kosambi, C = 0, and Binomial N = 2.]
Backcross (testcross)
P1 (AB/AB) × P2 (ab/ab)
F1 (AB/ab) × tester (ab/ab)
Coupling linkage phase

Genotype   Class                    Expected frequency
AB/ab      NR (non-recombinant)     f1 = (1/2)(1 − r)
Ab/ab      R (recombinant)          f2 = (1/2)r
aB/ab      R (recombinant)          f3 = (1/2)r
ab/ab      NR (non-recombinant)     f4 = (1/2)(1 − r)
Genotyping Error or Incomplete Penetrance

Aa misclassified as aa
Genotype   Observed count   Fully penetrant   Incompletely penetrant
AB/ab      f1               (1/2)(1 − r)      (1/2)(1 − r)(1 − p)
Ab/ab      f2               (1/2)r            (1/2)r(1 − p)
aB/ab      f3               (1/2)r            (1/2)(r + p − pr)
ab/ab      f4               (1/2)(1 − r)      (1/2)[1 − r + pr]

aa misclassified as Aa
Genotype   Observed count   Fully penetrant   Incompletely penetrant
AB/ab      f1               (1/2)(1 − r)      (1/2)[1 − r + pr]
Ab/ab      f2               (1/2)r            (1/2)(r + p − pr)
aB/ab      f3               (1/2)r            (1/2)r(1 − p)
ab/ab      f4               (1/2)(1 − r)      (1/2)(1 − r)(1 − p)

Prob. of apparent recombinants is:
p = r(1 − s) + (1 − r)s
p = r + s(1 − 2r)
r = prob. of recombinants
1 − r = prob. of non-recombinants
s = prob. of misclassifying recombinants as non-recombinants and vice versa
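A short numerical sketch of the apparent-recombinant probability under symmetric misclassification. The values r = 0.02 and s = 0.01 are hypothetical, and the function names are illustrative assumptions.

```python
def apparent_recombination(r, s):
    """Probability of an apparent recombinant: true recombinants kept plus
    non-recombinants misclassified, each with error rate s."""
    return r * (1.0 - s) + (1.0 - r) * s

def apparent_recombination_alt(r, s):
    """Algebraically equivalent form r + s(1 - 2r)."""
    return r + s * (1.0 - 2.0 * r)

r_true, s_error = 0.02, 0.01   # hypothetical values
p = apparent_recombination(r_true, s_error)
print(round(p, 4))  # 0.0296
```

With tightly linked markers, even a 1% error rate raises the apparent recombination fraction by almost half in this example, which is why errors inflate map lengths most when loci are close together.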
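Incomplete Penetrance
Fully penetrant:
$$\ln L = \hat{f}_1\ln\left[\tfrac{1}{2}(1 - r)\right] + \hat{f}_2\ln\left[\tfrac{1}{2}r\right] + \hat{f}_3\ln\left[\tfrac{1}{2}r\right] + \hat{f}_4\ln\left[\tfrac{1}{2}(1 - r)\right]$$
Aa misclassified as aa:
$$\ln L_{Aa \to aa} = \hat{f}_1\ln\left[\tfrac{1}{2}(1 - r)(1 - p)\right] + \hat{f}_2\ln\left[\tfrac{1}{2}r(1 - p)\right] + \hat{f}_3\ln\left[\tfrac{1}{2}\big(r + p(1 - r)\big)\right] + \hat{f}_4\ln\left[\tfrac{1}{2}(1 - r + rp)\right]$$
aa misclassified as Aa:
$$\ln L_{aa \to Aa} = \hat{f}_1\ln\left[\tfrac{1}{2}(1 - r + rp)\right] + \hat{f}_2\ln\left[\tfrac{1}{2}\big(r + p(1 - r)\big)\right] + \hat{f}_3\ln\left[\tfrac{1}{2}r(1 - p)\right] + \hat{f}_4\ln\left[\tfrac{1}{2}(1 - r)(1 - p)\right]$$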
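One way to see why the model with Aa misclassified as aa has closed-form estimates is to reparameterize it. The helper symbols u and v below are introduced only for this sketch and do not appear in the slides.

$$u = (1 - r)(1 - p), \qquad v = r(1 - p)$$

The four expected frequencies then become $\tfrac{1}{2}u$, $\tfrac{1}{2}v$, $\tfrac{1}{2}(1 - u)$, and $\tfrac{1}{2}(1 - v)$, so the log likelihood separates into two independent binomial parts:

$$\ln L = \hat{f}_1\ln u + \hat{f}_3\ln(1 - u) + \hat{f}_2\ln v + \hat{f}_4\ln(1 - v) + \text{constant}$$

$$\hat{u} = \frac{\hat{f}_1}{\hat{f}_1 + \hat{f}_3}, \qquad \hat{v} = \frac{\hat{f}_2}{\hat{f}_2 + \hat{f}_4}$$

Because $u + v = 1 - p$ and $v/(u + v) = r$, the estimates follow by substitution:

$$\hat{r} = \frac{\hat{v}}{\hat{u} + \hat{v}} = \frac{\hat{f}_2(\hat{f}_1 + \hat{f}_3)}{\hat{f}_1(\hat{f}_2 + \hat{f}_4) + \hat{f}_2(\hat{f}_1 + \hat{f}_3)}, \qquad \hat{p} = 1 - \hat{u} - \hat{v} = \frac{\hat{f}_3\hat{f}_4 - \hat{f}_1\hat{f}_2}{(\hat{f}_1 + \hat{f}_3)(\hat{f}_2 + \hat{f}_4)}$$

The model with aa misclassified as Aa follows by exchanging the roles of classes 1 and 4 and of classes 2 and 3.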
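Incomplete Penetrance
Fully penetrant:
$$\hat{r} = \frac{\hat{f}_2 + \hat{f}_3}{\hat{f}_1 + \hat{f}_2 + \hat{f}_3 + \hat{f}_4}$$
Aa misclassified as aa:
$$\hat{r} = \frac{\hat{f}_2(\hat{f}_1 + \hat{f}_3)}{\hat{f}_1(\hat{f}_2 + \hat{f}_4) + \hat{f}_2(\hat{f}_1 + \hat{f}_3)}$$
$$\hat{p} = \frac{\hat{f}_3\hat{f}_4 - \hat{f}_1\hat{f}_2}{(\hat{f}_1 + \hat{f}_3)(\hat{f}_2 + \hat{f}_4)}$$
aa misclassified as Aa:
$$\hat{r} = \frac{\hat{f}_3(\hat{f}_4 + \hat{f}_2)}{\hat{f}_4(\hat{f}_3 + \hat{f}_1) + \hat{f}_3(\hat{f}_4 + \hat{f}_2)}$$
$$\hat{p} = \frac{\hat{f}_1\hat{f}_2 - \hat{f}_3\hat{f}_4}{(\hat{f}_4 + \hat{f}_2)(\hat{f}_3 + \hat{f}_1)}$$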
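The estimators for the case of Aa misclassified as aa can be sanity-checked by feeding them the expected counts of the model and confirming that r and p are recovered. This is a sketch with hypothetical parameter values (r = 0.2, p = 0.1, 1000 progeny) and illustrative function names.

```python
def expected_counts_aa_misclass(r, p, n):
    """Expected counts (f1..f4) when a fraction p of Aa is scored as aa."""
    return (
        0.5 * n * (1 - r) * (1 - p),   # AB/ab
        0.5 * n * r * (1 - p),         # Ab/ab
        0.5 * n * (r + p - p * r),     # aB/ab
        0.5 * n * (1 - r + p * r),     # ab/ab
    )

def estimate_r_p(f1, f2, f3, f4):
    """Closed-form estimates of r and p for Aa misclassified as aa."""
    r_hat = f2 * (f1 + f3) / (f1 * (f2 + f4) + f2 * (f1 + f3))
    p_hat = (f3 * f4 - f1 * f2) / ((f1 + f3) * (f2 + f4))
    return r_hat, p_hat

def naive_r(f1, f2, f3, f4):
    """Fully penetrant estimator, which ignores misclassification."""
    return (f2 + f3) / (f1 + f2 + f3 + f4)

counts = expected_counts_aa_misclass(0.2, 0.1, 1000)   # (360, 90, 140, 410)
r_hat, p_hat = estimate_r_p(*counts)
print(round(r_hat, 4), round(p_hat, 4), round(naive_r(*counts), 4))
# 0.2 0.1 0.23
```

The fully penetrant estimator overstates r here (0.23 instead of 0.2), while the two-parameter estimator recovers both values.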
Segregation Distortion

Genotype   Observed count   Exp. freq., no diff. viability   Exp. freq. with diff. viability for the A locus
AaBb       f1               (1/2)(1 − r)                     τ_A(1 − r)/(τ_A + 1)
Aabb       f2               (1/2)r                           τ_A r/(τ_A + 1)
aaBb       f3               (1/2)r                           r/(τ_A + 1)
aabb       f4               (1/2)(1 − r)                     (1 − r)/(τ_A + 1)

- τ_A = viability coefficient for the A locus
- Aa genotypes are less viable than aa when 0.0 < τ_A < 1.0
- Aa and aa genotypes are equally viable when τ_A = 1.0
- Aa genotypes are more viable than aa when 1.0 < τ_A < ∞
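Segregation distortion
Without differential viability:
$$\hat{r} = \frac{\hat{f}_2 + \hat{f}_3}{\hat{f}_1 + \hat{f}_2 + \hat{f}_3 + \hat{f}_4}$$
With differential viability for the A locus (τ_A = viability coefficient for the A locus), substituting the expected frequencies gives:
$$\hat{r} = \frac{\hat{f}_2 + \hat{f}_3}{\hat{f}_1 + \hat{f}_2 + \hat{f}_3 + \hat{f}_4} = \frac{\tau_A r + r}{\tau_A(1 - r) + \tau_A r + r + (1 - r)} = \frac{r(\tau_A + 1)}{\tau_A + 1} = r$$
The solution for recombinant fraction with differential viability for the A locus is the same as the solution without differential viability.
MLE of the viability coefficient:
$$\hat{\tau}_A = \frac{\hat{f}_1 + \hat{f}_2}{\hat{f}_3 + \hat{f}_4}$$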
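The claim that single-locus distortion leaves the estimate of r unchanged can be checked numerically. In this sketch the viability coefficient is called tau_a, and the values r = 0.2 and tau_a = 0.5 are hypothetical; the function names are illustrative assumptions.

```python
def expected_freqs_a(r, tau_a):
    """Expected testcross frequencies when Aa has relative viability tau_a (aa = 1).
    Order: AaBb, Aabb, aaBb, aabb."""
    weights = [tau_a * (1 - r), tau_a * r, r, 1 - r]
    total = sum(weights)               # equals tau_a + 1
    return [w / total for w in weights]

def estimate_r(f):
    """Usual recombinant-proportion estimator."""
    return (f[1] + f[2]) / sum(f)

def estimate_tau_a(f):
    """Ratio of Aa to aa progeny."""
    return (f[0] + f[1]) / (f[2] + f[3])

freqs = expected_freqs_a(0.2, 0.5)
print(round(estimate_r(freqs), 4), round(estimate_tau_a(freqs), 4))  # 0.2 0.5
```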
Segregation Distortion

Genotype   Observed count   Exp. freq., no diff. viability   Exp. freq. with diff. viability for the A and B loci
AaBb       f1               (1/2)(1 − r)                     τ_A τ_B (1 − r)/d
Aabb       f2               (1/2)r                           τ_A r/d
aaBb       f3               (1/2)r                           τ_B r/d
aabb       f4               (1/2)(1 − r)                     (1 − r)/d

- τ_A is the viability coefficient for the A locus
- τ_B is the viability coefficient for the B locus
- The viability effects for the two loci are independent
- d = τ_A τ_B (1 − r) + τ_A r + τ_B r + (1 − r)
Segregation Distortion
With differential viability for the A and B loci, where τ_A and τ_B are the viability coefficients for the A and B loci and d = τ_A τ_B (1 − r) + τ_A r + τ_B r + (1 − r), the MLEs are:
$$\hat{\tau}_A = \left(\frac{\hat{f}_1\hat{f}_2}{\hat{f}_3\hat{f}_4}\right)^{1/2}$$
$$\hat{\tau}_B = \left(\frac{\hat{f}_1\hat{f}_3}{\hat{f}_2\hat{f}_4}\right)^{1/2}$$
$$\hat{r} = \frac{(\hat{f}_2\hat{f}_3)^{1/2}}{(\hat{f}_1\hat{f}_4)^{1/2} + (\hat{f}_2\hat{f}_3)^{1/2}}$$
The solution for recombinant fraction with differential viability for the A and B loci is NOT the same as the solution without differential viability.
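A sketch of the two-locus case, showing that the square-root estimators recover the parameters from expected frequencies while the usual recombinant proportion does not. The names tau_a and tau_b and the values r = 0.2, tau_a = 0.5, tau_b = 2.0 are hypothetical.

```python
import math

def expected_freqs_ab(r, tau_a, tau_b):
    """Expected testcross frequencies with independent viability effects at A and B.
    Order: AaBb, Aabb, aaBb, aabb."""
    weights = [tau_a * tau_b * (1 - r), tau_a * r, tau_b * r, 1 - r]
    total = sum(weights)               # the normalizing denominator d
    return [w / total for w in weights]

def estimate_all(f1, f2, f3, f4):
    """Closed-form estimates of r, tau_a and tau_b."""
    r_hat = math.sqrt(f2 * f3) / (math.sqrt(f1 * f4) + math.sqrt(f2 * f3))
    tau_a_hat = math.sqrt((f1 * f2) / (f3 * f4))
    tau_b_hat = math.sqrt((f1 * f3) / (f2 * f4))
    return r_hat, tau_a_hat, tau_b_hat

def naive_r(f1, f2, f3, f4):
    """Recombinant proportion, which ignores viability differences."""
    return (f2 + f3) / (f1 + f2 + f3 + f4)

freqs = expected_freqs_ab(0.2, 0.5, 2.0)
r_hat, tau_a_hat, tau_b_hat = estimate_all(*freqs)
print(round(r_hat, 4), round(tau_a_hat, 4), round(tau_b_hat, 4),
      round(naive_r(*freqs), 4))
# 0.2 0.5 2.0 0.2381
```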
Effect of genotyping errors, missing values, and segregation distortion (Hackett and Broadfoot 2003)
• Locus-ordering criteria
– weighted least squares
– maximum likelihood
– SARF (sum of adjacent recombination fractions)
• Three linkage groups of 10 loci
– 2, 6, and 10 cM spacings
• Doubled haploid population of 150 individuals
Criteria for Evaluation
• Replicates with correctly estimated orders
• Mean rank correlation between estimated and true orders
• Mean total map length
Results
Summary
• Missing data and genotyping errors reduced the proportion of correctly ordered maps; this effect worsened as the distances between loci decreased
• The maximum likelihood criterion was the most successful at ordering loci correctly but gave inflated map lengths when typing errors were present
• Missing data produced shorter map lengths for more widely spaced markers under the weighted least-squares criterion
• The presence of segregation distortion had little effect