2007: Genetic Diversity of Tetraploid Cotton Species Based ...


2007 Beltwide Cotton Conferences, New Orleans, Louisiana, January 9-12, 2007 6746

GENETIC DIVERSITY OF TETRAPLOID COTTON SPECIES BASED ON AFLP AND GT-AFLP ANALYSIS

Jinfa Zhang, Wu Wang, Mingxiong Pang

New Mexico State University, Las Cruces, NM

R. Esmail, James Stewart

University of Arkansas, Fayetteville, AR

Richard G. Percy

USDA-ARS, Maricopa, AZ

Abstract

This paper presents the results of using AFLP and GT-AFLP markers to evaluate 88 germplasm lines and accessions representing the five tetraploid cotton species. A gene-targeted (GT) AFLP system (GT-AFLP) and a gene-targeted TRAP system (GT-TRAP) were devised and tested using primers designed from gene families encoding transcription factors (TF). Clustering of germplasm lines with the new GT-AFLP marker system was highly consistent with clustering based on AFLP markers. The cDNA GT-TRAP system gave highly reproducible results on a segregating population using degenerate TF primers and a random TRAP primer. Therefore, the two new marker systems, GT-AFLP and GT-TRAP, which use degenerate primers in gene-targeted PCR reactions, can be used to survey genetic diversity at the DNA level and differential gene expression at the RNA level.

Introduction

In Arabidopsis, approximately 1,000 gene families with more than 8,300 genes (>32% of the 26,000 genes in the genome) have been identified (http://www.arabidopsis.org/browse/genefamily/index.jsp). A gene family is a group of genes coding for diverse proteins with related functions which, by virtue of their high degree of sequence similarity, are believed to have evolved from a single ancestral gene. For example, ~2,000 transcription factor (TF) genes (7.4%) belong to more than 60 gene families in Arabidopsis (http://datf.cbi.pku.edu.cn; http://arabtfdb.bio.uni-postsdam.de/v1.1; Gong et al., 2004; Xiong et al., 2005). Transcription factors are regulatory proteins in eukaryotes that often exhibit sequence-specific DNA binding and are capable of activating or repressing transcription of multiple target genes. The DNA-binding domains of transcription factors are highly conserved and can be used to design degenerate primers for gene-targeted (GT) PCR reactions that survey genetic diversity at the DNA level and differential gene expression at the RNA level. The objective of the current study was to develop two high-throughput marker systems, GT-AFLP (amplified fragment length polymorphism) and GT-TRAP (target region amplification polymorphism), built on AFLP (Vos et al., 1995) and TRAP (Hu and Vick, 2003), respectively.

Materials and Methods

Plant Materials, DNA and RNA Extraction

Eighty-eight genotypes/accessions of the five tetraploid cotton species (Gossypium hirsutum-Gh, AD1; G. barbadense-Gb, AD2; G. tomentosum-Gt, AD3; G. mustelinum-Gm, AD4; and G. darwinii-Gd, AD5) were grown in either a greenhouse or a field in Las Cruces, NM. Leaf tissue was used for DNA extraction (Zhang and Stewart, 2000); however, only 32 genotypes were used for a comprehensive analysis. D5 (G. raimondii) was used as an outgroup. Sixty-four lines were selected from a backcross inbred line (BIL) population for RNA extraction from 10 days post-anthesis (DPA) bolls using a modified hot-borate method (Wan and Wilkins, 1994).

AFLP Analysis

The AFLP analysis of DNA was performed as outlined by Vos et al. (1995) with minor modifications (Zhang et al., 2005). Briefly, genomic DNA was restricted with EcoRI and MseI and ligated with EcoRI and MseI adaptors in the same reaction. The diluted ligation product was used in the first round of AFLP amplification with two pre-selective primers, each carrying a single selective nucleotide extension. The second round of amplification was then performed with two selective AFLP primers, using the diluted pre-selective PCR product as the template. The PCR products were separated and analyzed using a CEQ 8000 Sequencer and Fragment Analysis Software (Beckman-Coulter Inc., Fullerton, CA), as described in Zhang et al. (2005).

GT-AFLP Analysis

For DNA GT-AFLP analysis, one of the degenerate TF primers (Table 1) was used in combination with one of the selective AFLP primers in the second round of PCR amplification. The GT-AFLP PCR products were electrophoresed and analyzed using a CEQ 8000 Sequencer (Zhang et al., 2005). Here, GT-AFLP is also called TF-AFLP.

Table 1. A list of TF gene families. The families used for designing degenerate PCR primers are highlighted, and the degenerate primer for each of these families is given below.

TF gene families: ABI3VP1, Alfin-like, AP2-EREBP, ARF, ARID, ARR-B, AtRKD, BBR/BPC, bHLH, bZIP, BZR, C2C2-CO-like, C2C2-Dof, C2C2-Gata, C2C2-YABBY, C2H2, C3H, CAMTA, CCAAT-DR1, CCAAT-HAP2, CCAAT-HAP3, CCAAT-HAP5, CPP, E2F-DP, EIL, G2-like, GeBP, GRAS, GRF, Homeobox, HRT, HSF, JUMONJI, MADS, MYB, MYB-related, NAC, NLP, Orphan, PHD, RAV, REM, SBP, TCP, Trihelix, TUB, VOZ-9, WHIRLY, WRKY, ZF-HD


Primer ID Primer sequence

1. bHLH-1 NNNGARMGINNNMGIMGIGAR

2. bZIP-1 AAYMGIGARTCNGCNNNNMGIAGY

3. bZIP-2 NNNMGIAGYMGINNNMGIAAR

4. G2-1 AGYCAYCTNCARAARTAYAGR

5. MADS-1 NNNAARMGIMGINNNGGNTTR

6. MADS-2 GARTTRATNNNNTTRTGYMGI

7. Myb-1 GGNAAGTCNTCYMGIYTNMGIT

8. Myb-2 CCNGGNMGICANGAYAAYGAA

9. NAC-1 NNNTGGNNNATGCAYGARTAY

10. WRKY-1 TGGMGIAARTAYGGNCARAAG
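To make the primer notation in the list above concrete, the following minimal Python sketch expands a degenerate primer into a regular expression and counts how many plain A/C/G/T target sequences it can match. It assumes the standard IUPAC ambiguity codes (for example R = A/G, Y = C/T, M = A/C, N = any base) and, as a further assumption not stated in the paper, treats I (inosine) as able to pair with any base. The function names are illustrative only and are not part of the authors' software.

```python
import re

# IUPAC nucleotide ambiguity codes. "I" (inosine) is not an IUPAC code;
# treating it as a wildcard for any base is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "ACGT",
}


def target_count(primer):
    """Number of distinct A/C/G/T sequences the degenerate primer can match."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def to_regex(primer):
    """Compile the degenerate primer into a regular expression."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


# WRKY-1 sequence copied from the primer list above.
WRKY_1 = "TGGMGIAARTAYGGNCARAAG"

print(target_count(WRKY_1))
print(bool(to_regex(WRKY_1).fullmatch("TGGAGGAAGTATGGACAGAAG")))
```

The same two functions can be applied to any of the ten primers; a higher count indicates a more degenerate primer, which would be expected to anneal to a wider range of family members.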

GT-TRAP Analysis

For cDNA GT-TRAP analysis, PCR was performed on double-stranded (ds) cDNA made from the RNA extracted from the 64 BIL lines, using the TRAP system as described in Hu and Vick (2003). The TF-TRAP products were separated on a 6.5% polyacrylamide sequencing gel in a Li-Cor Global DNA Sequencer. Here, GT-TRAP is also called TF-TRAP.

Data Analysis

The AFLP and GT-AFLP markers were scored as present (1) or absent (0). To estimate the genetic similarities among genotypes, a genetic similarity matrix based on the Jaccard coefficient was computed in NTSYSpc (Numerical Taxonomy System, Version 2.1; Exeter Software, Setauket, New York, USA). Phylogenetic trees were constructed using the unweighted pair group method with arithmetic means (UPGMA), which groups genotypes that are genetically related to each other based on the genetic similarity matrix.
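The data analysis described above can be sketched in a few lines of Python. This is only an illustration of the two steps (Jaccard similarity on 0/1 marker scores, then UPGMA clustering on 1 minus the similarity), not the NTSYSpc implementation used in the study. The genotype names and band profiles are made-up toy data, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles:
    bands shared by both / bands present in at least one."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def upgma(names, profiles):
    """Average-linkage (UPGMA) clustering on 1 - Jaccard similarity.
    Returns the tree as nested tuples of genotype names."""
    # cluster id -> (subtree, indices of the genotypes it contains)
    clusters = {i: (names[i], [i]) for i in range(len(names))}
    dist = {(i, j): 1.0 - jaccard(profiles[i], profiles[j])
            for i, j in combinations(range(len(names)), 2)}

    def avg_dist(members_a, members_b):
        # UPGMA distance: mean over all genotype pairs across two clusters.
        total = sum(dist[(min(i, j), max(i, j))]
                    for i in members_a for j in members_b)
        return total / (len(members_a) * len(members_b))

    next_id = len(names)
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda p: avg_dist(clusters[p[0]][1], clusters[p[1]][1]))
        tree_a, mem_a = clusters.pop(a)
        tree_b, mem_b = clusters.pop(b)
        clusters[next_id] = ((tree_a, tree_b), mem_a + mem_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy data (hypothetical): four genotypes scored for six bands.
names = ["G1", "G2", "G3", "G4"]
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
]
tree = upgma(names, profiles)
print(tree)
```

Note that the Jaccard coefficient ignores bands absent from both genotypes, so shared absences do not inflate similarity; this is one common reason for choosing it with dominant markers.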

Results and Analysis

AFLP Marker Analysis

With eight AFLP primer combinations, a total of 1,434 fragments (an average of 179 fragments per primer combination) were resolved by capillary electrophoresis on the CEQ 8000 Sequencer (Tables 2 and 3). The number of AFLP markers was sufficient to separate the 32 genotypes into five groups (Fig. 1), consistent with their origins and known species relationships. Accessions from AD1, AD3, and AD4 fell into separate groups without ambiguity. Although the majority of AD2 and AD5 accessions formed two separate groups, a few accessions, including two AD2 and one AD5, were placed into the same group, and one AD2 accession was grouped with the AD5 group. These may represent hybrids or could be the result of misidentification. The results also confirmed that AD1 is closer to AD3 than to the other species, while AD4 is the most distant of the five tetraploid species.

TF-AFLP Marker Analysis

Twelve TF-AFLP primer combinations amplified a total of 645 fragments (an average of 54 fragments per primer combination) on the same electrophoresis system (Tables 2 and 4). It is understandable that the average number of TF-AFLP markers is only about 1/3 of that for AFLP (54 vs. 179), since the TF-AFLP marker system specifically targets transcription factor genes, unlike AFLP, which amplifies the genome at random. However, similar to AFLP markers, cluster analysis revealed five major species groups (Fig. 2). All the AD1 accessions were grouped together. While the majority of AD3 and AD4 accessions were grouped into their own species groups, several genotypes were scattered across the dendrogram, perhaps because of the limited number of TF-AFLP markers. Because AD2 and AD5 are closely related, as expected, two accessions from AD2 and two accessions from AD5 were grouped into their sister species group (Fig. 2). Good separation of closely related accessions will probably require a greater number of TF-AFLP markers than reported here. Further study will be needed to estimate the percentage of TF-AFLP markers that are amplified from the corresponding TF gene families and to map their genome locations in cotton.

cDNA TF-TRAP Analysis

Using the same arbitrary TRAP primer in combination with the degenerate TF primers, PCR and electrophoresis were performed on the 64 cDNA samples. Figs. 3, 4, 5, and 6 show examples of the gel images. As can be seen, primer combinations that share the random primer Ga5-800 but differ in the TF primer produced PCR products with different banding patterns. The high reliability of the TF-TRAP system can be gauged from the high proportion of monomorphic bands amplified by the same primer combination in all 64 cDNA samples. Many fragments were clearly polymorphic in the BIL population, while many other fragments showed intensity differences that might reflect differential expression of the corresponding genes among the BIL lines. The results demonstrated that degenerate TF primers can be used in combination with TRAP primers for genome-wide profiling of TF gene expression.

Table 2. TF-AFLP and AFLP markers: primer combinations, number, and size ranges

Primer combination    No. of fragments    Min. size (bp)    Max. size (bp)

TF-AFLP markers
EcoRI-E/bZIP-2         29                  81                381
EcoRI-E/Myb-2          48                  63                339
EcoRI-E/MADS-2         45                  62                458
EcoRI-H/NAC-1          28                  65                211
EcoRI-H/WRKY           28                  65                211
EcoRI-H/bHLH-1         50                  62                457
EcoRI-B/NAC-1          54                  54                459
EcoRI-B/WRKY           54                  54                459
EcoRI-B/G2-1           68                  66                459
EcoRI-D/bZIP-1         51                  54                421
EcoRI-D/Myb-1          87                  55                400
EcoRI-D/MADS-1        103                  55                463
Sum                   645

AFLP markers
EcoRI-A/MseI-1        262                  55                446
EcoRI-B/MseI-2        139                  65                459
EcoRI-C/MseI-3        158                  58                444
EcoRI-D/MseI-7        132                  58                435
EcoRI-G/MseI-5        118                  64                387
EcoRI-B/MseI-5        229                  68                461
EcoRI-H/MseI-4        158                  56                464
EcoRI-H/MseI-4        238                  56                463
Sum                  1434

Table 3. Genetic similarities (GS) between and within tetraploid species estimated using Jaccard coefficient based on AFLP markers

Species     D5       Gh (AD1)        Gb (AD2)      Gd (AD5)      Gm (AD4)      Gt (AD3)
D5          NA       0.286a/0.267b   0.276/0.186   0.287/0.194   0.315/0.169   0.290/0.259
Gh (AD1)    0.279c   0.676           0.573/0.313   0.609/0.301   0.671/0.247   0.604/0.392
Gb (AD2)    0.249    0.475           0.607         0.675/0.206   0.556/0.127   0.581/0.313
Gd (AD5)    0.263    0.499           0.524         0.563         0.816/0.244   0.612/0.290
Gm (AD4)    0.259    0.436           0.393         0.441         0.475         0.584/0.242
Gt (AD3)    0.276    0.530           0.491         0.511         0.423         0.739

a: maximum GS. b: minimum GS. c: average GS. Cells above the diagonal give maximum/minimum GS; cells on and below the diagonal give average GS.
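Each cell of Table 3 condenses many pairwise comparisons into a maximum, a minimum, or an average. As a rough illustration of how such a summary could be derived from a genotype-by-genotype similarity matrix, the following Python sketch computes all three statistics for any pair of species groups. The function name, the group labels, and the similarity values are hypothetical toy inputs, not data from this study.

```python
from itertools import combinations, product
from statistics import mean


def group_summary(sim, groups, g1, g2):
    """Return (max, min, mean) of pairwise similarities between groups g1 and g2.
    When g1 == g2, only distinct pairs within that group are compared."""
    idx1 = [i for i, g in enumerate(groups) if g == g1]
    idx2 = [i for i, g in enumerate(groups) if g == g2]
    if g1 == g2:
        pairs = list(combinations(idx1, 2))
    else:
        pairs = list(product(idx1, idx2))
    values = [sim[i][j] for i, j in pairs]
    return max(values), min(values), mean(values)


# Toy symmetric similarity matrix (hypothetical values) for four accessions.
groups = ["AD1", "AD1", "AD2", "AD2"]
sim = [
    [1.0, 0.7, 0.5, 0.4],
    [0.7, 1.0, 0.6, 0.5],
    [0.5, 0.6, 1.0, 0.8],
    [0.4, 0.5, 0.8, 1.0],
]

between = group_summary(sim, groups, "AD1", "AD2")
within = group_summary(sim, groups, "AD1", "AD1")
print(between, within)
```

Reporting the maximum and minimum alongside the average makes outlying accessions visible, such as possible hybrids or misidentified lines, which an average alone would hide.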


Table 4. Genetic similarities (GS) between and within tetraploid species estimated using Jaccard coefficient based on TF-AFLP markers

Species     D5       Gh (AD1)        Gb (AD2)      Gd (AD5)      Gm (AD4)      Gt (AD3)
D5          NA       0.137a/0.084b   0.168/0.092   0.138/0.096   0.104/0.033   0.136/0.082
Gh (AD1)    0.113c   0.549           0.428/0.262   0.476/0.301   0.512/0.237   0.453/0.291
Gb (AD2)    0.128    0.325           0.384         0.485/0.258   0.377/0.145   0.399/0.225
Gd (AD5)    0.112    0.345           0.373         0.425         0.450/0.193   0.512/0.258
Gm (AD4)    0.072    0.354           0.253         0.286         0.323         0.510/0.224
Gt (AD3)    0.107    0.369           0.307         0.380         0.327         0.477

a: maximum GS. b: minimum GS. c: average GS. Cells above the diagonal give maximum/minimum GS; cells on and below the diagonal give average GS.

Fig. 1. A UPGMA dendrogram based on AFLP markers (cluster labels in the figure: AD1, AD3, AD5, AD2, AD4)


Fig. 2. A UPGMA dendrogram based on TF-AFLP markers (cluster labels in the figure: AD1/AD3/AD4, AD2/AD5)

Fig. 3. cDNA TF-TRAP using primer combination of WRKY-1 and Ga5-800


Fig. 4. cDNA TF-TRAP using primer combination of bZIP-2 and Ga5-800

Fig. 5. cDNA TF-TRAP using primer combination of MADS-2 and Ga5-800


Fig. 6. cDNA TF-TRAP using primer combination of bHLH-1 and Ga5-800

References

Gong, W., Y.P. Shen, L.G. Ma, Y. Pan, Y.L. Du, et al. 2004. Genome-wide ORFeome cloning and analysis of Arabidopsis transcription factor genes. Plant Physiol. 135:773-782.

Hu, J., and B.A. Vick. 2003. TRAP (target region amplification polymorphism), a novel marker technique for plant genotyping. Plant Mol. Biol. Rep. 21:289-294.

Vos, P., R. Hogers, M. Bleeker, M. Reijans, T. van de Lee, M. Hornes, A. Frijters, J. Pot, J. Peleman, M. Kuiper, and M. Zabeau. 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 23:4407-4414.

Wan, C.H., and T.A. Wilkins. 1994. A modified hot borate method significantly enhances the yield of high-quality RNA from cotton (Gossypium hirsutum L.). Anal. Biochem. 223:7-12.

Xiong, Y., T. Liu, C. Tian, S. Sun, J. Li, and M. Chen. 2005. Transcription factors in rice: a genome-wide comparative analysis between monocots and eudicots. Plant Mol. Biol. 59:191-203.

Zhang, J.F., and J.M. Stewart. 2000. Economic and rapid method for extracting cotton genomic DNA. J. Cotton Sci. 4:193-201.

Zhang, J.F., Y.Z. Lu, and S.X. Yu. 2005. Cleaved AFLP (cAFLP), a modified amplified fragment length polymorphism analysis for cotton. Theor. Appl. Genet. 111:1385-1395.
