
Nested association mapping

Graeme Hammer, Scott Chapman, Alan Cruickshank, Colleen Hunt, Adrian Hathorn, Yongfu Tao, Emma Mace

David Jordan


Context

Medium-size sorghum pre-breeding program focused on improving the productivity of hybrid grain sorghum in highly variable, water-limited environments characterised by high levels of GxE

Delivery of varieties via the private sector


A need to increase genetic diversity

[Figure: genetic diversity study, ~1996. Scale 0.0 to 0.5. Group labels: "R lines", "B lines", "Sample of elite MR SG UQ/DAF Germplasm (1990+)", "Grain Sorghum Females (B lines)", "Grain Sorghum Males (R lines)". Individual line labels are not legible in the transcript.]

Low diversity

Low separation of male and female parents


Increasing genetic diversity while retaining 40 years of breeding

Jordan DR, et al (2011). Crop Sci 51:1444-1457

Pop size >5000 individuals

Number of families >100

Challenge: increasing genetic diversity in elite germplasm without destroying adaptive complexes built up over decades of breeding

Design parameters
1. Needed to increase genetic diversity by crossing our elite lines with exotic material
2. Needed to maintain existing adaptive complexes (height, maturity, stay-green, midge resistance, etc.)
3. Needed to be able to evaluate genes in a genetic context that was relevant to our environments, management systems and populations
4. Needed to develop material that could be used in breeding relatively quickly
5. Wanted to generate populations where molecular technologies could enhance our understanding of traits

Backcross nested association mapping (BCNAM) population


Why backcross

Allowed the recovery of enough lines with appropriate height and maturity

Elite alleles for important traits are at high frequency (genetic context and utility for breeding)

Simplifies the genetics and helps identify genes that contribute to yield in an elite background
‒ contribution of exotic x exotic epistasis is reduced
‒ easy to generate isolines if required

[Figure: "%RP Simple". Box plot of the distribution of the percentage of non-recurrent parent genome present in a sample of lines from 9 of the populations (Ai4, Dorado, Macia, Malisor84-7, QL12, Rio, SC326-6, SC35C, SC56-14E); axis 0 to 100]
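As a rough guide to reading such a box plot, the expected donor (non-recurrent parent) share of the genome halves with each backcross when there is no selection. The sketch below is a minimal illustration under that textbook assumption; the function name is hypothetical and the number of backcross generations used in the program is not stated in these slides.

```python
def expected_donor_fraction(n_backcrosses: int) -> float:
    """Expected share of donor genome after n backcrosses to the recurrent parent.

    Assumes no selection and no linkage drag: the F1 is 50% donor and each
    backcross halves the donor share on average.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 0.5 ** (n_backcrosses + 1)


for n in range(4):
    # n = 0 corresponds to the F1; n = 1 to BC1, and so on
    print(f"BC{n}: expected donor genome = {100 * expected_donor_fraction(n):.2f}%")
```

Selection during population development and chance segregation mean that individual lines can deviate widely from these averages, which is the spread a box plot of observed percentages captures.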


Selection for agronomic traits (e.g. height, days to flowering (dtf), etc.)

We wanted to be able to rapidly evaluate the impact of QTL on yield in relevant genetic backgrounds and environments

Variation in major genes for height and flowering time in sorghum causes lack of adaptation

This lack of adaptation makes evaluation of complex traits like yield difficult and the results irrelevant for applied breeding


Example

F1

Backcross progeny

Sorghum bicolor subsp. verticilliflorum


Selection of Germplasm

A range of strategies was used to choose the exotic germplasm, including:
‒ visual phenotypic diversity and racial diversity
‒ geographic diversity
‒ phenotypic extremes (published or unpublished)
‒ elite lines from international breeding programs
‒ fertile wild species



Context dependencies

There is no absolute merit value for a gene, trait or genotype

The value of a gene always depends on the context (genetic background, environment, management)

Context dependencies occur at all scales and generate major challenges for crop improvement programs (e.g. epistasis, heterosis, GxExM)

There is too much complexity to deal with, so it is critical to get the context right when dissecting genetic architecture with a view to improving performance


Phenotyping for yield as an MET

Our BCNAM is part of an ongoing pre-breeding activity

Each year we evaluate 20 populations in multi-environment trials (yield, flowering time, height, head shape, stay-green, ...)

15 new populations and 5 previous populations (allows analysis as a single experiment)

Environment characterisation to allow simulation modelling to be used as an analysis tool


A resource for breeding and trait dissection

[Figure: BLUP by exotic parent, panels "Environment 1" and "Environment 4", axis -0.5 to 1.0. Parents listed: IS 14257, SDS1948-3, TAM422, RS29, LR2917-2, KS115, M35-1, Macia, RTx430, LR2844, R9733, LR2716, S. purpureosericeum, Malisor 84-7, SC108-14E, RTx2737, Rio, LR2931-2, ICSV400, LR2490-3, Sureno, RTx2785, IS 22525, Dorado, SC111-14E, SC999-14E, LR9198, RTx7000, SC157-6, Ai4, SC173-14E, RTx2536, Kuyuma, QL12, LR2755-2, SC172-12E, SC103-14E, IS 22457, LR2528, SC23-14E, MP531, IS 17214, SC56-14E, RTx2903, IS 3614-2, SC62-14E, MLT135, S. drummondii, LR2505, SC35-14E, SC326-6, SC636-6, IS 3614-3, Karper669, ICSV745, SC871-6, R931945-2-2]

>5000 lines

>100 exotic parents evaluated

GBS marker data on ~1300 lines

Data from >50 trials (>2M phenotypic data points)

The reference parent, the female testers and 28 exotic donors have been re-sequenced

>500 lines directly licensed to commercial companies (many more derivatives licensed)


Advantages of NAM for mapping

Conventional bi-parental mapping populations
‒ Power to detect QTL (allele replication)
‒ Limited sample of available alleles
‒ Low resolution

Association mapping panels
‒ Sample multiple alleles
‒ High resolution due to historic recombination
‒ Less power to detect QTL, particularly rare alleles
‒ Prone to false associations
‒ Genetic background effects are large and often compromise phenotyping

Nested association mapping (Yu et al. 2008)
‒ Combines the power of linkage analysis with the high resolution of association mapping through the joint analysis of multiple interlinked populations with sequenced parents

A powerful population for mapping quantitative traits

[Figure: map of the ten sorghum chromosomes, SBI-01 to SBI-10, with marker positions and centromeres (cen)]


Structure of allelic variation

The number of QTL segregating in each population varied between 2 and 18 (average >10)

Most genotypes have a mixture of early and late alleles


Comparative genomics: maize and sorghum

• 75 significant markers associated with flowering time in maize NAM
• Sequence-mapped these significant maize markers onto the sorghum genome and compared them with our sorghum NAM flowering time QTL
• 90% of the sorghum QTL were <10cM away from the comparable maize QTL
• 37.5% were less than 1cM away
• Maize-sorghum comparison provides mutual insights

McMullen et al 2009. Science 325, 737
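Percentages like the 90% and 37.5% figures quoted on this slide come from measuring how far each sorghum QTL lies from the nearest projected maize marker. The sketch below shows one way such a summary could be computed; every position in it is invented for illustration, and the function names are hypothetical rather than part of the authors' pipeline.

```python
from bisect import bisect_left


def nearest_distance(pos: float, sorted_refs: list[float]) -> float:
    """Distance from pos to the closest value in a sorted, non-empty list."""
    i = bisect_left(sorted_refs, pos)
    # Only the neighbours on either side of the insertion point can be closest
    candidates = sorted_refs[max(i - 1, 0): i + 1]
    return min(abs(pos - r) for r in candidates)


def share_within(qtl: list[float], refs: list[float], threshold_cm: float) -> float:
    """Fraction of QTL lying closer than threshold_cm to any reference position."""
    refs_sorted = sorted(refs)
    hits = sum(nearest_distance(q, refs_sorted) < threshold_cm for q in qtl)
    return hits / len(qtl)


# Hypothetical positions (cM) on one chromosome, for illustration only
sorghum_qtl = [12.0, 40.5, 63.0, 91.2]
projected_maize = [12.4, 47.0, 80.0]

print(share_within(sorghum_qtl, projected_maize, 10.0))
print(share_within(sorghum_qtl, projected_maize, 1.0))
```

A real comparison would run per chromosome after projecting maize markers onto the sorghum map, which this sketch assumes has already been done.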

[Figure: axes "LOD score" and "Chromosomes". Buckler et al 2009. Science 325, 714]

[Figure: comparison of maize (Zm) chromosome segments with sorghum chromosomes. Panels: Zm Chr1 (50-200cM), Zm Chr5 (0-75cM) and Zm Chr9 (40-100cM) with SBI-01; Zm Chr2 (50-90cM) and Zm Chr10 (30-80cM) with SBI-06 (0-90cM). Annotation: "Higher recombination rate provides higher resolution"]

Mace, Hunt & Jordan, 2013. Theor Appl Genet. 126: 1377-1395

Design and analysis

Genetic design factors: BCNAM
‒ Reference parent
‒ Backcross-derived populations with selection
‒ Large number of populations
‒ 50-100 individuals per population
‒ Total genotypes: 23 families (~2500 individuals)

• Individual populations often lack power to detect quantitative traits on their own
• Selection during population development makes joint linkage analysis difficult
• Small populations allow greater numbers of populations to be tested for the same amount of resource (greater diversity, more QTL alleles)

Analysis methods

Association: GWAS (e.g. TASSEL)

Haplotype based and joint linkage: cpQTL (Mace et al 2013)

FarmCPU

Joint inclusive composite interval mapping (JICIM)


GWAS on NAM can lack power if population sizes are small

Power of the GWAS depends on
‒ LD and number of SNPs
‒ SNP allele frequency
‒ Number of QTL alleles
‒ Frequency of QTL alleles

[Figure: schematic of the Reference and NAM P1 to NAM P5 lines, showing a SNP and a QTL]

[Figure: the Reference and NAM P1 to NAM P5 schematic with SNP allele codes 1, 1, 1, 1, 0, 0]
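The dependence of GWAS power on allele frequency can be made concrete: an allele contributed by only one or a few donors is rare once all families are pooled. The sketch below is a minimal illustration assuming equal family sizes; the 23 families match the count of genotyped populations given elsewhere in these slides, while the donor fraction of 0.25 and the function name are assumptions made for the example.

```python
def pooled_allele_frequency(families_with_allele: int,
                            total_families: int,
                            donor_fraction: float) -> float:
    """Expected frequency of a donor allele across a pooled NAM population.

    Assumes equal family sizes, inbred lines, and that each line in a
    carrying family inherits the donor allele with probability donor_fraction.
    """
    if not 0 <= families_with_allele <= total_families:
        raise ValueError("families_with_allele must lie between 0 and total_families")
    return donor_fraction * families_with_allele / total_families


# donor_fraction = 0.25 is an assumed value, not one reported in the slides
for k in (1, 4, 23):
    freq = pooled_allele_frequency(k, 23, 0.25)
    print(f"allele carried by {k:2d} donor(s): pooled frequency = {freq:.3f}")
```

Under these assumptions an allele private to a single donor sits at roughly 1% in the pooled population, which is the regime where single-SNP tests tend to struggle.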


Haplotype based methods

Haplotype based methods make use of position information and multiple SNPs

Allow access to multiple alleles at a QTL

Give greater power to detect QTL in NAM type populations

[Figure: schematic of the Reference and NAM P1 to NAM P5 lines with haplotype labels A, A, C, D, B, B]
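A small sketch can show why combining adjacent SNPs gives access to more alleles than a single biallelic SNP. The parental genotypes below are invented, and the labelling function is a hypothetical helper rather than the method used by cpQTL or any other tool named in these slides.

```python
# Hypothetical parental genotypes at three adjacent SNPs (0 = reference allele)
parents = {
    "Reference": (0, 0, 0),
    "NAM P1": (0, 0, 0),
    "NAM P2": (1, 0, 1),
    "NAM P3": (1, 1, 0),
    "NAM P4": (1, 0, 0),
    "NAM P5": (1, 0, 0),
}


def label_haplotypes(genotypes: dict[str, tuple[int, ...]]) -> dict[str, str]:
    """Assign a letter to each distinct multi-SNP haplotype, in order of first appearance."""
    letters: dict[tuple[int, ...], str] = {}
    labels = {}
    for name, hap in genotypes.items():
        if hap not in letters:
            letters[hap] = chr(ord("A") + len(letters))
        labels[name] = letters[hap]
    return labels


# Classes distinguishable using only the first SNP
single_snp_classes = {hap[0] for hap in parents.values()}
# Classes distinguishable using the three-SNP haplotype
haplotype_labels = label_haplotypes(parents)

print(len(single_snp_classes))
print(len(set(haplotype_labels.values())))
print(haplotype_labels)
```

With these made-up genotypes the first SNP alone splits the parents into two groups, whereas the three-SNP window separates four haplotype classes that can each be tested for their own effect.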


Our experience

GWAS for flowering time was less effective than a haplotype based method

Both methods found some unique QTL but more were found by the haplotype based method

Method   QTL for flowering time   % of total QTL   Number of QTL unique to method
GWAS     25/40                    62.5%            6
cpQTL    34/40                    85%              15
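The counts in this table are mutually consistent if 40 is read as the total number of distinct QTL found by either method. That reading is an assumption, not something the slide states; under it, the number of QTL found by both methods follows by subtraction, as the short check below works through.

```python
total_qtl = 40                       # assumed to be the union across both methods
unique = {"GWAS": 6, "cpQTL": 15}    # QTL found by only one method (from the table)

# QTL found by both methods, under the union assumption
shared = total_qtl - sum(unique.values())

for method, n_unique in unique.items():
    found = shared + n_unique
    print(f"{method}: {found}/{total_qtl} = {100 * found / total_qtl:.1f}%")
```

The implied overlap of 19 QTL reproduces the 25/40 and 34/40 totals reported for GWAS and cpQTL respectively.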


FarmCPU

PLOS Genetics | DOI:10.1371/journal.pgen.1005767 February 1, 2016

Our experience is that FarmCPU works well for QTL detection in our diversity panel and in our BCNAM population

Simulation

Simulated 40 QTL with equal effect and found >20 in our association panel with a threshold based on the Bonferroni correction (adjusted for the number of independent tests)
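A Bonferroni threshold adjusted for the number of independent tests divides the target error rate by that number instead of by the raw marker count. The sketch below shows the arithmetic only; the significance level and the number of independent tests are hypothetical, since the slides report neither, and how the independent-test count was estimated is not described.

```python
import math


def bonferroni_threshold(alpha: float, n_independent_tests: int) -> float:
    """Per-test p-value threshold controlling the family-wise error rate at alpha."""
    if n_independent_tests < 1:
        raise ValueError("n_independent_tests must be at least 1")
    return alpha / n_independent_tests


# Hypothetical values: the slides do not give alpha or the number of independent tests
alpha = 0.05
n_eff = 10_000

p_cut = bonferroni_threshold(alpha, n_eff)
print(f"p-value threshold: {p_cut:.2e}")
print(f"-log10(p) threshold: {-math.log10(p_cut):.2f}")
```

Because markers in LD are correlated, the number of independent tests is usually smaller than the number of SNPs, so this adjustment is less conservative than dividing by the full marker count.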


Mapping in complementary interlinked populations that vary in LD and diversity

[Figure: four interlinked resources: Mapping Populations (n=213), NAM Population (n=1385), Breeding Populations (n=563), Diversity Panel (n>900). Panel labels: NAM, Breeding populations, Diversity Panel]


Joint analysis of multiple populations

Different populations have different advantages and disadvantages

Typically we no longer use a single type of population for genetic analysis of traits of interest

Multiple populations allow
‒ Use of reduced thresholds for QTL detection (detection in unique datasets)
‒ Use of prior information
‒ Tracing alleles from one population to another using haplotypes, which creates the potential to use prior information
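The point about reduced thresholds can be illustrated with a simple probability argument: if two populations give independent tests, a null locus must pass in both to be declared, so each test can use a more relaxed threshold for the same overall false positive rate. The sketch below assumes full independence and uses invented rates; it is not a description of how the authors set their thresholds.

```python
def joint_false_positive_rate(per_population_rates: list[float]) -> float:
    """Chance that a null locus passes the threshold in every population.

    Assumes the populations provide statistically independent tests.
    """
    rate = 1.0
    for r in per_population_rates:
        if not 0.0 <= r <= 1.0:
            raise ValueError("rates must be probabilities")
        rate *= r
    return rate


# Hypothetical per-test false positive rates, for illustration only
strict_single = 1e-6
relaxed_pair = joint_false_positive_rate([1e-3, 1e-3])

print(strict_single, relaxed_pair)
```

In practice the populations share parents and the same locus has to be matched across datasets within some window, so the real gain is smaller than this idealised product suggests.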


Resources for trait dissection: Diversity panel

Diversity panel resource
• >900 genotypes
• Highly diverse genotypes
• Excellent point of entry to explore native variation
• High density SNP genotyping
• Low LD, high precision
• High false positive rate


Resources for trait dissection: BCNAM

Nested Association Mapping Resource
• >100 exotic parental lines back-crossed to a single elite genotype
• Less diversity than the diversity panel but provides opportunity to evaluate exotic alleles in an elite genetic background
• 23 populations, consisting of 1385 progeny, genotyped with high density SNP markers
• Whole genome sequence data available for 17 of the 23 parental lines
• High LD, high power, low precision

[Figure legend: orange points are NAM parents]


Resources for trait dissection: Breeding population

Elite sorghum breeding trials
• Advanced Yield Testing (multiple environments)
• 342 unique genotypes tested in hybrid combination with multiple females
• 563 hybrid combinations
• High LD, less diversity
• Relevance to target

[Figure: "Breeding diversity". Legend: female lines pink, male lines dark blue]


Conclusions

Use of complementary populations can improve the utility of genetic dissection, particularly when the aim is application in applied breeding

Use of multiple populations can allow the use of lower thresholds

Haplotype based methods are more powerful than standard GWAS methods in BCNAM and in our diversity and breeding populations


Acknowledgements

Adrian Hathorn

Emma Mace

Colleen Hunt

Alan Cruickshank

Bob Henzell

DAF technical team
