Module 7: Estimating Genetic Variances
– Why estimate genetic variances?
– Single factor mating designs
PBG 650 Advanced Plant Breeding
Why estimate genetic variances?
• New crop species
– ensure adequate genetic variance for selection
– determine appropriate type of cultivar
• pure lines, hybrids, open-pollinated varieties
• Predict response to short and long-term selection
• Determine optimum number and location of testing environments
• Use in selection indices
• Predict single-cross performance
Do Breeders Need to Estimate Genetic Variances?
• For breeders working with elite germplasm, it is often more useful to develop breeding populations for the purposes of selection than to estimate genetic variances
– use parents with high means
– make crosses between unrelated individuals to maintain high genetic variation (or assess diversity at molecular level)
– single-cross performance can be predicted from data routinely generated in breeding programs
– recurrent selection is not widely used in breeding programs for major crop species
Bernardo, 2010, Chapt. 7
What about newer crops, less developed germplasm?
• Options in mating designs for self-pollinated crops are limited
• Potential of pure lines or open-pollinated varieties vs hybrids can be assessed by comparing means of these types of cultivars and by considering costs of hybrid seed production
• Precision of genetic variance estimates is often low
• Selection indices can be constructed that do not require input of genetic variances
Do Breeders Need to Estimate Genetic Variances?
• Estimates provide valuable baseline information for breeding initiatives for minor crops and new traits
• For many crops and situations, recurrent selection is more efficient than pedigree selection
• Need to distinguish between genetic and environmental correlations among traits
• Better understanding of environmental influences and GXE is essential for effective, well-targeted breeding efforts
Obtain estimates of genetic variances as an integral part of the breeding program
– progeny trials, mapping populations
– realized selection response, correlated selection response
– monitor changes in genetic variances over time
– accumulate information about inheritance of important traits
Classic approach for estimating genetic variances
• Develop one or more types of progeny
– half sibs, full-sibs, testcrosses, recombinant inbreds
• Evaluate progeny in a set of environments
– representative of potential environments in target region
• Estimate variance components from mean squares in ANOVA (or directly using mixed models)
• Equate variance components with expectation based on covariances among relatives
Number of variance components that can be estimated = number of covariances among relatives in the design
Assumptions
• Relatives are noninbred and belong to a particular random-mating reference population
– estimates apply to that population alone
– relatives must represent a random sample from the population
• parents cannot be selected from the population, or chosen from different populations
• parents can be inbred, as long as their progeny (relatives) are not inbred (use of inbred parents can increase precision)
• The usual assumptions for equilibrium also apply
– diploid inheritance
– no linkage or linkage disequilibrium
• using fully inbred parents may reduce effects of linkage
Fixed vs Random effects
• Fixed effects
– interested in the effects of the treatments per se
– $\sum_i \tau_i = 0$
• Random effects
– treatments are a random sample from a larger reference population that has a mean of 0 and variance $\sigma^2_t$
– objectives are to extend conclusions to all members of the population
– interested in estimating magnitude of variance among and within groups
– $\sum_i t_i \neq 0$ for any given experiment
Single-factor analysis, one location
• Families and blocks are considered to be random effects
Source      df            MS     Expected Mean Square
Blocks      r-1           MSR    $\sigma^2_e + f\sigma^2_R$
Families    f-1           MSF    $\sigma^2_e + r\sigma^2_F$
Error       (r-1)(f-1)    MSE    $\sigma^2_e$
$\sigma^2_F = (MS_F - MS_E)/r$
$\sigma^2_F = Cov_{Family}$
However, estimate of additive genetic variance will be biased upward if there is GXE or epistasis
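The one-location estimator can be sketched in Python as follows. The function name and the mean squares are made up for illustration and do not come from the source; the only thing taken from the slide is the relation $\sigma^2_F = (MS_F - MS_E)/r$.

```python
def family_variance_one_location(ms_families, ms_error, reps):
    """Method-of-moments estimate of sigma^2_F from one location.

    Uses E[MSF] = sigma^2_e + r * sigma^2_F and E[MSE] = sigma^2_e,
    so sigma^2_F = (MSF - MSE) / r.
    """
    if reps < 1:
        raise ValueError("reps must be at least 1")
    return (ms_families - ms_error) / reps


# Hypothetical mean squares (illustration only, not from the source).
sigma2_f = family_variance_one_location(ms_families=10.0, ms_error=4.0, reps=2)
print(sigma2_f)  # 3.0
```

With data from a single location, any family x environment variance cannot be separated from $\sigma^2_F$, which is why the slide warns that the estimate is biased upward when GXE is present.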
Single-factor analysis, multiple environments
• An environment could be a location or a different year or season at the same location
• Environments are generally considered to be random, because we want to make inferences about the performance that could be expected at other potential sites in the target production environment
• Specific environments, such as irrigation, fertilizer levels, temperature or daylength regimes, would be fixed effects
• Note that aspects of the experimental design (blocks, locations) are often treated as fixed effects in molecular studies where the objective is to make associations between markers and phenotypes.
Single-factor analysis, multiple environments
Source              df             MS      Expected Mean Square
Years               y-1
Blocks/Years        y(r-1)
Families            f-1            MSF     $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$
Families x Years    (f-1)(y-1)     MSFY    $\sigma^2_e + r\sigma^2_{FY}$
Error               y(r-1)(f-1)    MSE     $\sigma^2_e$
$\sigma^2_F = (MS_F - MS_{FY})/(ry)$
$\sigma^2_F = Cov_{Family}$
Not biased by GXE
Additive genetic variance from single-factor design
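$\sigma^2_F = Cov_{Family} = \frac{1+F}{4}\sigma^2_A$ (half-sib families; F is the inbreeding coefficient of the parents)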
Genotypes divided into sets
Source                   df              MS      Expected Mean Square
Years                    y-1
Sets                     s-1
Years x Sets             (y-1)(s-1)
Blocks/(Years x Sets)    (r-1)ys
Families/Sets            (f-1)s          MSF     $\sigma^2_e + r\sigma^2_{FY} + ry\sigma^2_F$
Years x Families/Sets    (y-1)(f-1)s     MSFY    $\sigma^2_e + r\sigma^2_{FY}$
Error                    (r-1)(f-1)ys    MSE     $\sigma^2_e$
Calculation of $\sigma^2_A$ is the same as before
• Large numbers of families can be divided into sets, and variances can be pooled across sets.
Example – single-factor analysis
• 60 maize S2 lines are allowed to open pollinate; bulked to form half-sib families
• 2 randomized complete blocks, 3 locations
Bernardo, pg 155
Source                  df     MS      Mean Square    Expected Mean Square
Location                2
Blocks/Locations        3
Families                59     MSF     14.36          $\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F$
Families x Locations    118    MSFL    6.18           $\sigma^2_e + r\sigma^2_{FL}$
Error                   177    MSE     4.00           $\sigma^2_e$
Are there significant differences among families?
F test: MSF/MSFL = 14.36/6.18 = 2.32
Compare to Fcritical with 59, 118 df; Pr>F is <0.0001
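As a cross-check on the ANOVA table, the degrees of freedom and the F ratio can be recomputed from the design (60 families, 3 locations, 2 blocks per location). This is a minimal sketch; the function name is hypothetical, and no p-value is computed here.

```python
def single_factor_df(families, locations, reps):
    """Degrees of freedom for families in RCBDs repeated across locations."""
    return {
        "Location": locations - 1,
        "Blocks/Locations": locations * (reps - 1),
        "Families": families - 1,
        "Families x Locations": (families - 1) * (locations - 1),
        "Error": locations * (reps - 1) * (families - 1),
    }


df = single_factor_df(families=60, locations=3, reps=2)
for source, value in df.items():
    print(f"{source}: {value}")

# Families are tested against Families x Locations because the two
# expected mean squares differ only by the term r*l*sigma^2_F.
f_ratio = 14.36 / 6.18
print(round(f_ratio, 2))  # 2.32
```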
What is the level of inbreeding in the S2 parents?
Expected frequency of heterozygotes: P12 = 2pq(1-F)
Plants        Families        P12                   F
F2 or S0      F3 or S1        P12 = 2pq             0
F3 or S1      F4 or S2        (0.5)P12              0.5
F4 or S2      F5 or S3        (0.25)P12             0.75
F5 or S3      F6 or S4        (0.125)P12            0.875
Fn or Sn-2    Fn+1 or Sn-1    $(1/2)^{n-2}$ P12     $1-(1/2)^{n-2}$
• A family represents the alleles of its parents
– Collectively, an S1 family has the same distribution of alleles as the S0 plant from which it was derived
• The distinction between plants and families decreases as F approaches 1
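The pattern in the table can be sketched in Python, indexing plants by the number of selfing generations k (so S_k plants correspond to F_(k+2) plants). The function names and the allele frequency p = 0.5 are assumptions chosen for illustration.

```python
def inbreeding_coefficient(selfing_generations):
    """F for S_k plants derived by selfing a non-inbred S0: 1 - (1/2)**k."""
    if selfing_generations < 0:
        raise ValueError("selfing_generations must be non-negative")
    return 1.0 - 0.5 ** selfing_generations


def heterozygote_frequency(p, selfing_generations):
    """Expected heterozygote frequency P12 = 2pq(1 - F)."""
    q = 1.0 - p
    return 2.0 * p * q * (1.0 - inbreeding_coefficient(selfing_generations))


# p = 0.5 is a hypothetical allele frequency, used only for illustration.
for k in range(4):
    f_coef = inbreeding_coefficient(k)
    p12 = heterozygote_frequency(0.5, k)
    print(f"S{k} plants (F{k + 2}): F = {f_coef}, P12 = {p12}")
```

In the maize example, the S2 families trace back to S1 plants, which is the row with F = 0.5.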
$\sigma^2_F = (MS_F - MS_{FL})/(rl) = (14.36 - 6.18)/(2 \times 3) = 1.36$
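The same method-of-moments logic yields all three variance components from the expected mean squares. The sketch below uses the mean squares from the example; the function name is hypothetical, and the values for $\sigma^2_{FL}$ and $\sigma^2_e$ are not reported in the source but follow directly from the expected mean squares.

```python
def variance_components(ms_f, ms_fl, ms_e, reps, locations):
    """Solve the expected mean squares for the variance components.

    E[MSE]  = s2_e
    E[MSFL] = s2_e + r * s2_fl
    E[MSF]  = s2_e + r * s2_fl + r * l * s2_f
    """
    return {
        "sigma2_e": ms_e,
        "sigma2_fl": (ms_fl - ms_e) / reps,
        "sigma2_f": (ms_f - ms_fl) / (reps * locations),
    }


components = variance_components(14.36, 6.18, 4.00, reps=2, locations=3)
for name, value in components.items():
    print(f"{name} = {value:.2f}")
```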
Example – single-factor analysis
Estimate additive genetic variance
$\sigma^2_A = \frac{4}{1+F}\sigma^2_F = \frac{4}{1+\frac{1}{2}}(1.36) = 3.63$
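The conversion from half-sib family variance to additive variance can be written as a small function. The function name is hypothetical; F = 0.5 is the value used in the calculation above.

```python
def additive_variance_from_half_sibs(sigma2_f, parent_inbreeding):
    """sigma^2_A = 4 / (1 + F) * sigma^2_F for half-sib families."""
    if not 0.0 <= parent_inbreeding <= 1.0:
        raise ValueError("parent_inbreeding must lie between 0 and 1")
    return 4.0 / (1.0 + parent_inbreeding) * sigma2_f


sigma2_a = additive_variance_from_half_sibs(1.36, parent_inbreeding=0.5)
print(round(sigma2_a, 2))  # 3.63

# The multiplier shrinks from 4 (non-inbred parents) to 2 (fully inbred).
print(additive_variance_from_half_sibs(1.0, 0.0))  # 4.0
print(additive_variance_from_half_sibs(1.0, 1.0))  # 2.0
```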
Heritability based on family means
• For animals, a family consists of multiple progeny from an individual
– each of the progeny is a replicate
– usually measure variance among progeny within each family
• For plants, we usually take collective measurements of multiple plants in a plot, and replicate the plots across reps and environments
• Heritabilities in plants are usually expressed on the basis of family means. Meaning will vary depending on the size of the plots, number of replications and number of environments
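$h^2 = \frac{Cov(G,P)}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{\bar{X}}} = \frac{\sigma^2_G}{\sigma^2_G + \frac{\sigma^2_{GL}}{l} + \frac{\sigma^2_e}{rl}}$
Variance of family means
$\sigma^2_{\bar{X}} = MS_{error}/(rl)$
• $MS_{error}$ = appropriate error term for families
• $rl$ = number of observations on each family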
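Source                  df     MS      Mean Square    Expected Mean Square
Families                59     MSF     14.36          $\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F$
Families x Locations    118    MSFL    6.18           $\sigma^2_e + r\sigma^2_{FL}$
Error                   177    MSE     4.00           $\sigma^2_e$
$\sigma^2_{\bar{X}} = MS_{FL}/(rl) = 6.18/(2 \times 3) = 1.03$
think of this as the square of the standard error of a family mean
$\sigma^2_P = MS_F/(rl) = 14.36/(2 \times 3) = 2.39$
$\sigma^2_P = (\sigma^2_e + r\sigma^2_{FL} + rl\sigma^2_F)/(rl) = \sigma^2_e/(rl) + \sigma^2_{FL}/l + \sigma^2_F$
$\sigma^2_P = \sigma^2_F + \sigma^2_{\bar{X}} = 1.36 + 1.03 = 2.39$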
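A quick numerical check of this decomposition, using the mean squares from the example: the phenotypic variance of family means computed as MSF/(rl) should equal the sum of its parts. Variable names are illustrative only.

```python
import math

reps, locations = 2, 3
ms_f, ms_fl, ms_e = 14.36, 6.18, 4.00

# Variance components solved from the expected mean squares.
sigma2_e = ms_e
sigma2_fl = (ms_fl - ms_e) / reps
sigma2_f = (ms_f - ms_fl) / (reps * locations)

# Error variance of a family mean, computed two equivalent ways.
sigma2_xbar = ms_fl / (reps * locations)
sigma2_xbar_parts = sigma2_fl / locations + sigma2_e / (reps * locations)

# Phenotypic variance of family means, computed two equivalent ways.
sigma2_p = ms_f / (reps * locations)
sigma2_p_parts = sigma2_f + sigma2_xbar

print(round(sigma2_xbar, 2), round(sigma2_p, 2))  # 1.03 2.39
print(math.isclose(sigma2_xbar, sigma2_xbar_parts))
print(math.isclose(sigma2_p, sigma2_p_parts))
```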
Heritability on a family mean basis
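$h^2 = \frac{Cov(G,P)}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \frac{\sigma^2_{GL}}{l} + \frac{\sigma^2_e}{rl}} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_{\bar{X}}}$
$h^2 = \frac{\sigma^2_F}{\sigma^2_F + \frac{\sigma^2_{FL}}{l} + \frac{\sigma^2_e}{rl}} = \frac{1.36}{1.36 + 1.03} = 0.57$
where $\sigma^2_F = \frac{1+F}{4}\sigma^2_A$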
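Because the replication and location numbers appear in the denominator, family-mean heritability depends on the testing scheme. The sketch below reuses the variance components implied by the example mean squares; the function name and the alternative (reps, locations) combinations are hypothetical.

```python
def family_mean_heritability(sigma2_f, sigma2_fl, sigma2_e, reps, locations):
    """h^2 = s2_f / (s2_f + s2_fl / l + s2_e / (r * l))."""
    sigma2_xbar = sigma2_fl / locations + sigma2_e / (reps * locations)
    return sigma2_f / (sigma2_f + sigma2_xbar)


# Components solved from the example mean squares (r = 2, l = 3).
sigma2_f = (14.36 - 6.18) / (2 * 3)
sigma2_fl = (6.18 - 4.00) / 2
sigma2_e = 4.00

h2_example = family_mean_heritability(sigma2_f, sigma2_fl, sigma2_e, 2, 3)
print(round(h2_example, 2))  # 0.57

# Hypothetical alternative testing schemes, for comparison only.
for reps, locations in [(1, 1), (2, 6)]:
    h2 = family_mean_heritability(sigma2_f, sigma2_fl, sigma2_e, reps, locations)
    print(f"r = {reps}, l = {locations}: h2 = {h2:.2f}")
```

Adding locations reduces both the $\sigma^2_{FL}/l$ and $\sigma^2_e/(rl)$ terms, while adding replications within a location reduces only the latter.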