Nested association mapping
Graeme Hammer, Scott Chapman, Alan Cruickshank, Colleen Hunt, Adrian Hathorn, Yongfu Tao, Emma Mace
David Jordan
DO NOT COPY
Context
Medium size sorghum pre-breeding program focused on improving the productivity of hybrid grain sorghum in highly variable, water-limited environments characterised by high levels of GxE
Delivery of varieties via the private sector
[Figure: genetic diversity plot of sorghum lines. Labels: "R lines", "B lines", "Sample of elite MR SG UQ/DAF Germplasm (1990+)", "Grain Sorghum Females (B lines)", "Grain Sorghum Males R lines". Axis scale 0.0 to 0.5. The remaining labels are individual line names.]
A need to increase genetic diversity
Low diversity
Low separation of male and female parents
Genetic diversity study ~1996
Increasing genetic diversity while retaining 40 years of breeding
Jordan DR, et al (2011). Crop Sci 51:1444-1457
Pop size >5000 individuals
Number of families >100
Challenge of needing to increase genetic diversity in elite germplasm without destroying adaptive complexes built up over decades of breeding
Design parameters
1. Needed to increase genetic diversity by crossing our elite lines with exotic material
2. Needed to maintain existing adaptive complexes (height, maturity, stay-green, midge resistance etc)
3. Needed to be able to evaluate genes in a genetic context that was relevant to our environments, management systems and populations
4. Needed to develop material that could be used in breeding relatively quickly
5. Wanted to generate populations where molecular technologies could enhance our understanding of traits
Backcross nested association mapping BC NAM population
Why backcross
Allowed the recovery of enough lines with appropriate height and maturity
Elite alleles for important traits are at high frequency (genetic context and utility for breeding)
Simplifies the genetics and helps identify genes that contribute to yield in an elite background
‒ contribution of exotic x exotic epistasis is reduced
‒ easy to generate isolines if required
[Figure: box plot of the distribution of the percentage of non-recurrent parent genome present in a sample of lines from 9 of the populations (Ai4, Dorado, Macia, Malisor84-7, QL12, Rio, SC326-6, SC35C, SC56-14E). Axis label "%RP Simple", scale 0 to 100.]
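As a point of comparison for the box plot, the expected donor (non-recurrent parent) share of the genome under backcrossing can be sketched from standard Mendelian expectations. This is an illustrative calculation only: it assumes no selection and no linkage drag, the function name is hypothetical, and the slides do not state how many backcross generations were used. Selection for height and maturity during population development would shift the observed values away from these expectations.

```python
def expected_donor_fraction(n_backcrosses: int) -> float:
    """Expected donor genome fraction after n backcrosses to the recurrent parent.

    n_backcrosses = 0 is the F1 (half donor); each backcross halves the
    expected donor share. Assumes no selection (illustrative only).
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 0.5 ** (n_backcrosses + 1)


for n in range(4):
    donor = expected_donor_fraction(n)
    label = "F1" if n == 0 else f"BC{n}"
    print(f"{label}: donor {donor:.1%}, recurrent parent {1 - donor:.1%}")
```

Selfing after the backcross does not change this expectation on average; it changes how the donor segments are distributed among lines, which is part of what the box plot spread reflects.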
Selection for agronomic traits (e.g. height, days to flowering, etc.)
We wanted to be able to rapidly evaluate the impact of QTL on yield in relevant genetic backgrounds and environments
Variation for major genes for height and flowering time in sorghum causes lack of adaptation
This lack of adaptation makes evaluation for complex traits like yield difficult and the results irrelevant for applied breeding
Example
[Figure: F1 and backcross progeny; Sorghum bicolor subsp. verticilliflorum]
Selection of Germplasm
A range of strategies were used to choose the exotic germplasm including:
‒ visual phenotypic diversity and racial diversity
‒ geographic diversity
‒ phenotypic extremes (published or unpublished)
‒ elite lines from international breeding programs
‒ fertile wild species
Context dependencies
There is no absolute relative merit value for a gene or trait or genotype
The value of a gene always depends on the context (genetic background, environment, management)
Context dependencies occur at all scales and generate major challenges for crop improvement programs (e.g. epistasis, heterosis, GxExM)
Too much complexity to deal with, so it is critical to get the context right when dissecting genetic architecture with a view to improving performance
Phenotyping for yield as an MET
Our BCNAM is part of an ongoing pre-breeding activity
Each year we evaluate 20 populations in multi-environment trials (yield, flowering time, height, head shape, stay-green, ...)
15 new populations and 5 previous populations (allows analysis as a single experiment)
Environment characterisation to allow simulation modelling to be used as an analysis tool
A resource for breeding and trait dissection
[Figure: BLUP by parent line, shown for two panels, Environment 1 and Environment 4. Axis scale -0.5 to 1.0. The row labels are individual parent line names.]
>5000 lines
>100 exotic parents evaluated
GBS marker data on ~1300 lines
Data from >50 trials (>2M phenotypic data points)
The reference parent, the female testers and 28 exotic donors have been re-sequenced
>500 lines directly licensed to commercial companies (many more derivatives licensed)
Advantages of NAM for mapping
Conventional bi-parental mapping populations
‒ Power to detect QTL (allele replication)
‒ Limited sample of available alleles
‒ Low resolution
Association mapping panels
‒ Sample multiple alleles
‒ High resolution due to historic recombination
‒ Less power to detect QTL, particularly rare alleles
‒ Prone to false associations
‒ Genetic background effects are large and often compromise phenotyping
Nested association mapping (Yu et al. 2008)
‒ Combines the power of linkage analysis with the high resolution of association mapping through the joint analysis of multiple interlinked populations with sequenced parents
A powerful population for mapping quantitative traits
[Figure: marker map across the ten sorghum chromosomes, SBI-01 to SBI-10, with centromere positions marked.]
Structure of allelic variation
The number of QTL segregating in each population varied between 2 and 18 (average >10)
Most genotypes have a mixture of early and late alleles
Comparative genomics: maize and sorghum
• 75 significant markers associated with flowering time in maize NAM
• Sequence mapped these significant maize markers onto the sorghum genome and compared them with our sorghum NAM flowering time QTL
• 90% of the sorghum QTL were <10cM away from the comparable maize QTL
• 37.5% were less than 1cM away
• Maize sorghum comparison provides mutual insights
McMullen et al 2009. Science 325, 737
[Figure: LOD score plot across chromosomes.]
Buckler et al 2009. Science 325, 714
[Figure: comparative maps of maize and sorghum chromosome regions with QTL positions. Panels labelled Zm Chr1 (50-200cM), SBI-01, Zm Chr5 (0-75cM) and Zm Chr9 (40-100cM); and Zm Chr2 (50-90cM), SBI-06 (0-90cM) and Zm Chr10 (30-80cM). Annotation: higher recombination rate provides higher resolution.]
Mace, Hunt & Jordan, 2013. Theor Appl Genet. 126: 1377-1395
Design and analysis
Genetic design factors: BCNAM
‒ Reference parent
‒ Backcross-derived populations with selection
‒ Large number of populations
‒ 50-100 individuals per population
‒ Total genotyped families: 23 (~2500 individuals)
• Individual populations often lack power to detect quantitative traits on their own
• Selection during population development makes joint linkage analysis difficult
• Small populations allow greater numbers of populations to be tested for the same amount of resource (greater diversity, more QTL alleles)
Analysis methods
‒ Association: GWAS (e.g. TASSEL)
‒ Haplotype based and joint linkage: cpQTL (Mace et al 2013)
‒ FarmCPU
‒ Joint inclusive composite interval mapping (JICIM)
GWAS on NAM can lack power if population sizes are small
Power of the GWAS depends on:
‒ LD and number of SNPs
‒ SNP allele frequency
‒ Number of QTL alleles
‒ Frequency of QTL alleles
[Figure: the reference parent and NAM parents P1 to P5, showing SNP and QTL positions.]
[Figure: SNP allele calls for the reference parent and NAM parents, in the order listed on the slide: Reference 1, NAM P1 1, NAM P2 1, NAM P3 1, NAM P4 0, NAM P5 0.]
Haplotype based methods
Haplotype based methods make use of position information and multiple SNPs
‒ Allow access to multiple alleles at a QTL
‒ Give greater power to detect QTL in NAM type populations
[Figure: haplotype labels for the reference parent and NAM parents, in the order listed on the slide: Reference A, NAM P1 A, NAM P2 C, NAM P3 D, NAM P4 B, NAM P5 B.]
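The contrast between the two slides can be made concrete with a small sketch. It uses the SNP calls (1, 1, 1, 1, 0, 0) and haplotype labels (A, A, C, D, B, B) shown on the slides, and assumes they are listed in the same order as the six parent labels. The helper `group_by` and the one-hot coding are illustrative choices, not the authors' analysis pipeline.

```python
from collections import defaultdict

# Values as read from the slides; the pairing with parents assumes matching order.
parents = ["Reference", "NAM P1", "NAM P2", "NAM P3", "NAM P4", "NAM P5"]
snp_allele = [1, 1, 1, 1, 0, 0]
haplotype = ["A", "A", "C", "D", "B", "B"]


def group_by(labels, names):
    """Group parent names by the allele or haplotype label they carry."""
    groups = defaultdict(list)
    for label, name in zip(labels, names):
        groups[label].append(name)
    return dict(groups)


snp_groups = group_by(snp_allele, parents)
hap_groups = group_by(haplotype, parents)

# A single biallelic SNP can only separate the parents into two classes,
# while the haplotype coding distinguishes more alleles at the same locus.
print(len(snp_groups), "SNP classes:", snp_groups)
print(len(hap_groups), "haplotype classes:", hap_groups)

# One-hot design matrix: one column per haplotype, one row per parent.
hap_levels = sorted(set(haplotype))
design = [[int(h == level) for level in hap_levels] for h in haplotype]
for name, row in zip(parents, design):
    print(f"{name:10s}", row)
```

In this toy case the SNP lumps haplotypes A, C and D into one class, so any difference in QTL effect among them is invisible to a single-SNP test.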
Our experience
GWAS for flowering time was less effective than a haplotype based method
Both methods found some unique QTL but more were found by the haplotype based method

Method   QTL for flowering time   % of total QTL   Number of QTL unique to method
GWAS     25/40                    62.5%            6
cpQTL    34/40                    85%              15
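The numbers in the table are internally consistent, which a short check makes explicit. The sketch assumes that "unique to method" means detected by that method and not by the other, and that only these two methods contribute to the total of 40 QTL; the variable names are illustrative.

```python
total_qtl = 40
detected = {"GWAS": 25, "cpQTL": 34}  # QTL found by each method
unique = {"GWAS": 6, "cpQTL": 15}     # QTL found by that method only

# QTL shared with the other method = detected minus unique.
shared = {method: detected[method] - unique[method] for method in detected}

# Percentage of the total QTL set recovered by each method.
pct_of_total = {method: 100 * detected[method] / total_qtl for method in detected}

# Union of the two methods: unique to each plus the shared set (counted once).
union = unique["GWAS"] + unique["cpQTL"] + shared["GWAS"]

print("shared:", shared)
print("percent of total:", pct_of_total)
print("union:", union)
```

Both methods imply the same shared count, and the union equals the stated total, so under these assumptions every one of the 40 QTL was found by at least one method.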
FarmCPU
PLOS Genetics | DOI:10.1371/journal.pgen.1005767 February 1, 2016
Our experience is that FarmCPU works well for QTL detection in our diversity panel and in our BCNAM population
Simulation
Simulated 40 QTL with equal effect and found >20 in our association panel with a threshold based on the Bonferroni correction (adjusted for the number of independent tests)
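A Bonferroni threshold adjusted for the number of independent tests can be sketched as follows. The slides give neither the significance level nor the number of independent tests, so `alpha = 0.05` and `n_independent_tests = 10_000` are placeholder assumptions, and the function name is hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_independent_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if n_independent_tests < 1:
        raise ValueError("n_independent_tests must be at least 1")
    return alpha / n_independent_tests


alpha = 0.05                  # assumed family-wise error rate
n_independent_tests = 10_000  # hypothetical count of independent tests

p_cut = bonferroni_threshold(alpha, n_independent_tests)
print(f"p-value threshold: {p_cut:.2e}")
print(f"-log10(p) threshold: {-math.log10(p_cut):.2f}")
```

Using the number of independent tests rather than the raw SNP count makes the threshold less conservative when many markers are in strong LD with each other.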
Mapping in complementary interlinked populations that vary in LD and diversity
[Figure: four interlinked resources: Mapping Populations, NAM Population, Breeding Populations and Diversity Panel. Sample sizes shown: n=213, n=563, n=1385, n>900.]
Joint analysis of multiple populations
Different populations have different advantages and disadvantages
Typically we no longer use a single type of population for genetic analysis of traits of interest
Multiple populations allow:
‒ Use of reduced thresholds for QTL detection (detection in unique datasets)
‒ Use of prior information
‒ Tracing alleles from one population to another using haplotypes, which creates the potential to use prior information
Resources for trait dissection: Diversity panel
Diversity panel resource
• >900 genotypes
• Highly diverse genotypes
• Excellent point of entry to explore native variation
• High density SNP genotyping
• Low LD, high precision
• High false positive rate
Resources for trait dissection: BCNAM
Nested Association Mapping Resource
• >100 exotic parental lines back-crossed to a single elite genotype
• Less diversity than the diversity panel but provides opportunity to evaluate exotic alleles in an elite genetic background
• 23 populations, consisting of 1385 progeny, genotyped with high density SNP markers
• Whole genome sequence data available for 17 of the 23 parental lines
• High LD, high power, low precision
[Figure: diversity plot; orange points are NAM parents.]
Resources for trait dissection: Breeding population
Elite sorghum breeding trials
• Advanced Yield Testing (multiple environments)
• 342 unique genotypes tested in hybrid combination with multiple females
• 563 hybrid combinations
• High LD, less diversity
• Relevance to target
[Figure: breeding diversity; female lines in pink, male lines in dark blue.]
Conclusions
Use of complementary populations can improve the utility of genetic dissection, particularly when the aim is application in applied breeding
Use of multiple populations can allow the use of lower thresholds
Haplotype based methods are more powerful than standard GWAS methods in BCNAM and in our diversity and breeding populations
Acknowledgements
Adrian Hathorn
Emma Mace
Colleen Hunt
Alan Cruickshank
Bob Henzell
DAF technical team