Cluster Analysis Using Quantitative, Qualitative and Molecular Traits for the Study of the Genetic Diversity in Pineapple Genotypes
C. de Fatima Machado, F.V.D. Souza, J.R.S. Cabral, C.A. da Silva Ledo, A.P. de Matos and R. Ritzinger
Embrapa Mandioca e Fruticultura Tropical, CP 07, Cruz das Almas, Bahia, 44.380-000, Brazil
Keywords: Ananas spp., multivariate analysis, Gower algorithm, genetic distance, variability
Abstract
Cluster analysis using quantitative, qualitative and molecular variables is a useful tool for estimating genetic diversity among genotypes in germplasm collections. The objective of this study was to carry out a simultaneous analysis of quantitative, qualitative and molecular variables, followed by clustering, to study the genetic diversity among pineapple genotypes using the Gower algorithm. Eleven quantitative, five qualitative and forty-three molecular characteristics were evaluated in ninety pineapple genotypes. The cophenetic correlation coefficient of the joint analysis was higher than the coefficients of the individual analyses. Ten groups were observed, which indicates high variability among the genotypes evaluated. The simultaneous analysis of the quantitative, qualitative and molecular variables was more efficient in expressing the genetic diversity among pineapple genotypes than the individual analysis of each type of variable.
INTRODUCTION
Germplasm can be characterized using phenotypic traits (morphological and agronomic descriptors), passport data and molecular markers. Characterizing the accessions is important for determining genetic variability, identifying duplicate accessions and establishing core collections. The use of multivariate techniques is one of the factors that has driven the increase in studies of genetic diversity among genotypes (Gonçalves et al., 2008, 2009; Mohammadi and Prasanna, 2003; Podani and Schmera, 2006; Sudré et al., 2007).
Cluster analyses of these traits are usually performed separately, because genetic distances are calculated according to the type of trait. Cruz (2008) reports that dissimilarity measures can be estimated from quantitative traits (Euclidean or Mahalanobis distances), binary traits (Jaccard index, Nei and Li index) and multicategorical traits (Cole-Rodgers et al., 1997). As a result, the separate analyses often disagree with respect to the clusters formed and the inferences drawn about the variability among accessions of a germplasm bank. A technique that allows the combined analysis of quantitative and qualitative data was proposed by Gower (1971).
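For reference, the general similarity coefficient of Gower (1971) between accessions i and j over p traits can be written as below. The symbols are introduced here for illustration only, and the conversion to a dissimilarity as one minus the similarity is a common convention rather than something stated in the text.

```latex
\[
S_{ij} = \frac{\sum_{k=1}^{p} w_{ijk}\, s_{ijk}}{\sum_{k=1}^{p} w_{ijk}},
\qquad
d_{ij} = 1 - S_{ij}
\]
\[
s_{ijk} =
\begin{cases}
1 - \dfrac{|x_{ik} - x_{jk}|}{R_k} & \text{quantitative trait } k \text{ with range } R_k \\[1ex]
1 \text{ if } x_{ik} = x_{jk},\ 0 \text{ otherwise} & \text{qualitative or binary trait } k
\end{cases}
\]
```

The weight w_ijk is 1 when trait k can be compared for the pair and 0 otherwise (for example, when a value is missing or, for presence/absence data, when the character is absent in both accessions).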
The objective of this work was to characterize pineapple accessions of the Active Germplasm Bank of Embrapa Cassava and Tropical Fruits through the combined analysis of quantitative, qualitative and molecular data.
MATERIALS AND METHODS
Ninety pineapple accessions of the Active Germplasm Bank of Embrapa Cassava and Tropical Fruits were evaluated. Each accession was evaluated with regard to 11 quantitative traits (fruit weight, crown weight, fruit length and diameter, crown length, axis diameter, soluble solids and acidity of the pulp, plant height, stalk length and diameter), 5 qualitative traits (leaf color, presence of spines along the borders of the leaves, fruit shape, colors of the fruit and the pulp) and 43 RAPD (Random Amplified Polymorphic DNA) molecular markers.
Proc. 7th International Pineapple Symposium. Eds.: H. Abdullah et al. Acta Hort. 902, ISHS 2011
A combined analysis of the quantitative, qualitative and molecular data was performed to determine the genetic distances, based on the algorithm described by Gower (1971). The accessions were clustered hierarchically by UPGMA (Unweighted Pair-Group Method with Arithmetic Averages; Sneath and Sokal, 1973). The clustering was validated by the cophenetic correlation coefficient (Sokal and Rohlf, 1962), whose significance was assessed by the Mantel test with 10,000 permutations (Mantel, 1967).
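As an illustration of the distance step, the following is a minimal Python sketch of a Gower dissimilarity for mixed traits. It is not the authors' R code: the function names, the three toy accessions and their trait values are hypothetical, and ignoring joint absences of a marker band is an assumption about how binary markers might be treated.

```python
def gower_distance(a, b, kinds, ranges):
    """Gower dissimilarity (1 - similarity) between two accessions.

    kinds[k] is "quant", "qual" or "binary"; ranges[k] is the observed
    range of trait k for quantitative traits (unused otherwise).
    """
    num = 0.0  # sum of per-trait similarities
    den = 0.0  # number of comparable traits (sum of weights)
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # missing value: weight 0
        if kind == "quant":
            s = 1.0 - abs(x - y) / rng if rng > 0 else 1.0
        elif kind == "qual":
            s = 1.0 if x == y else 0.0
        else:  # binary marker (band present = 1, absent = 0)
            if x == 0 and y == 0:
                continue  # assumption: joint absence carries no weight
            s = 1.0 if x == y else 0.0
        num += s
        den += 1.0
    return 1.0 - num / den if den else float("nan")


def gower_matrix(rows, kinds):
    """Full symmetric dissimilarity matrix for a list of accessions."""
    ranges = []
    for k, kind in enumerate(kinds):
        if kind == "quant":
            vals = [r[k] for r in rows if r[k] is not None]
            ranges.append(max(vals) - min(vals))
        else:
            ranges.append(None)
    n = len(rows)
    return [[gower_distance(rows[i], rows[j], kinds, ranges)
             for j in range(n)] for i in range(n)]


# Hypothetical toy data: fruit weight (g), leaf color, two marker bands.
kinds = ["quant", "qual", "binary", "binary"]
rows = [
    [1500.0, "green", 1, 0],   # accession A
    [1000.0, "green", 1, 1],   # accession B
    [500.0, "purple", 0, 0],   # accession C
]
D = gower_matrix(rows, kinds)
for row in D:
    print([round(v, 3) for v in row])
```

In this toy example accessions A and B share leaf color and one band and differ by half the observed range in fruit weight, so they end up much closer to each other than either is to C.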
The R statistical software (R Development Core Team, 2006) was used to compute the Gower distances (Gower, 1971). The cophenetic correlation coefficient was calculated using the Genes software (Cruz, 2008), and the dendrogram was generated from the distance matrix using the Statistica software (Statsoft, 2005).
RESULTS AND DISCUSSION
The dendrogram of dissimilarity obtained through the algorithm of Gower (Gower, 1971) from the evaluation of 11 quantitative traits, 5 qualitative traits and 43 molecular markers is shown in Figure 1. Using the mean of the dissimilarity matrix (0.15) as the cut-off point in the UPGMA dendrogram resulted in 10 groups. Table 1 lists the accessions belonging to each of the 10 groups.
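The grouping step can be sketched as follows, assuming that the groups are obtained by cutting the UPGMA dendrogram at the mean pairwise dissimilarity. This is a small pure-Python illustration on a hypothetical 5 x 5 distance matrix, not the Statistica procedure used in the study; in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` would normally be used.

```python
from itertools import combinations


def upgma(dist):
    """Average-linkage (UPGMA) clustering of a symmetric distance matrix.

    Returns the list of merge heights and the cophenetic matrix, i.e. the
    height at which each pair of items first falls into the same cluster.
    """
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    coph = [[0.0] * n for _ in range(n)]
    heights = []

    def avg(a, b):
        # Mean distance over all pairs with one member in each cluster.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(dist[i][j] for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: avg(*p))
        h = avg(a, b)
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = h
        heights.append(h)
        clusters[a] = clusters[a] + clusters.pop(b)
    return heights, coph


def cut_groups(coph, threshold):
    """Groups of items whose cophenetic distance is at most the threshold."""
    n = len(coph)
    groups, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        members = [j for j in range(n) if j == i or coph[i][j] <= threshold]
        assigned.update(members)
        groups.append(members)
    return groups


# Hypothetical dissimilarities among five accessions (indices 0 to 4).
dist = [
    [0.0, 0.1, 0.5, 0.5, 0.6],
    [0.1, 0.0, 0.5, 0.5, 0.6],
    [0.5, 0.5, 0.0, 0.2, 0.6],
    [0.5, 0.5, 0.2, 0.0, 0.6],
    [0.6, 0.6, 0.6, 0.6, 0.0],
]
pairs = list(combinations(range(len(dist)), 2))
threshold = sum(dist[i][j] for i, j in pairs) / len(pairs)  # mean dissimilarity

heights, coph = upgma(dist)
groups = cut_groups(coph, threshold)
print("cut-off:", round(threshold, 2))
print("groups:", groups)
```

Because UPGMA merge heights never decrease, the cophenetic distances form an ultrametric, so thresholding them gives the same groups as cutting the tree at that height.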
The cophenetic correlation coefficient (CCC) was 0.92** (significant by the Mantel test with 10,000 permutations). The high CCC value indicates a high correlation between the original distance matrix and the matrix derived from the clustering; Mohammadi and Prasanna (2003), Podani and Schmera (2006) and Gonçalves et al. (2008) came to the same conclusion.
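The validation step can be illustrated with a short Python sketch of the cophenetic correlation and a one-sided Mantel permutation test. The two matrices are hypothetical toy data (the second is the UPGMA cophenetic matrix worked out by hand for the first), the function names are introduced here for illustration, and the p-value formula with one added to numerator and denominator is one common convention, not necessarily the one implemented in the Genes software.

```python
import random
from math import sqrt


def upper(m, order=None):
    """Upper-triangle entries of m, optionally after relabelling the items."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(m1, m2, n_perm=10000, seed=0):
    """Matrix correlation and one-sided permutation p-value.

    Rows and columns of m1 are permuted together, so the dependence
    among the entries of a distance matrix is preserved.
    """
    rng = random.Random(seed)
    y = upper(m2)
    r_obs = pearson(upper(m1), y)
    order = list(range(len(m1)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        if pearson(upper(m1, order), y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical original dissimilarities among five accessions.
dist = [
    [0.0, 0.1, 0.4, 0.6, 0.6],
    [0.1, 0.0, 0.5, 0.5, 0.6],
    [0.4, 0.5, 0.0, 0.2, 0.6],
    [0.6, 0.5, 0.2, 0.0, 0.6],
    [0.6, 0.6, 0.6, 0.6, 0.0],
]
# Cophenetic matrix of the UPGMA tree for `dist`, derived by hand:
# merges at 0.1 (items 0, 1), 0.2 (items 2, 3), 0.5 and 0.6.
coph = [
    [0.0, 0.1, 0.5, 0.5, 0.6],
    [0.1, 0.0, 0.5, 0.5, 0.6],
    [0.5, 0.5, 0.0, 0.2, 0.6],
    [0.5, 0.5, 0.2, 0.0, 0.6],
    [0.6, 0.6, 0.6, 0.6, 0.0],
]
ccc, p_value = mantel(dist, coph)
print("cophenetic correlation:", round(ccc, 3))
print("Mantel p-value:", p_value)
```

With only five items there are just 120 distinct relabellings, so the p-value here is coarse; with 90 accessions, as in the study, 10,000 random permutations sample a far larger space.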
The analysis using the algorithm of Gower (Gower, 1971) was efficient in expressing the degree of genetic diversity among the pineapple accessions evaluated, demonstrating that combining quantitative and qualitative data with molecular markers improves the assessment of genetic diversity. According to Gonçalves et al. (2009), the type and number of variables chosen can compromise the efficiency of the combined analysis in quantifying genetic diversity, especially when a large number of molecular marker data are used.
CONCLUSION
The pineapple accessions of the Active Germplasm Bank of Embrapa Cassava and Tropical Fruits show high genetic variability based on quantitative, qualitative and molecular traits.
Literature Cited
Cole-Rodgers, P., Smith, D.W. and Bosland, P.W. 1997. A novel statistical approach to analyze genetic resource evaluations using Capsicum as an example. Crop Sci. 37:1000-1002.
Cruz, C.D. 2008. Programa Genes (versão Windows): aplicativo computacional em genética e estatística. Viçosa: UFV.
Gonçalves, L.S., Rodrigues, R., Amaral, A.T.Jr., Karasawa, M. and Sudré, C.P. 2009. Heirloom tomato gene bank: assessing genetic divergence based on morphological, agronomic and molecular data using a Ward-modified location model. Genet. Mol. Res. 8:364-374.
Gonçalves, L.S.A., Rodrigues, R. and Amaral Júnior, A.T. 2008. Comparison of multivariate statistical algorithms to cluster tomato heirloom accessions. Genet. Mol. Res. 7:1289-1297.
Gower, J.C. 1971. A general coefficient of similarity and some of its properties. Biometrics, Arlington 27:857-874.
Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research, Birmingham 27:209-220.
Mohammadi, S.A. and Prasanna, B.M. 2003. Analysis of genetic diversity in crop plants - salient statistical tools and considerations. Crop Sci. 43:1235-1248.
Podani, J. and Schmera, D. 2006. On dendrogram-based measures of functional diversity. Oikos 115:179-185.
R Development Core Team. 2006. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Sneath, P.H. and Sokal, R.R. 1973. Numerical taxonomy: the principles and practice of numerical classification. San Francisco: W.H. Freeman. 573p.
Sokal, R.R. and Rohlf, F.J. 1962. The comparison of dendrograms by objective methods. Taxon 11:33-40.
Statsoft, Inc. 2005. Statistica for Windows (data analysis software system), version 7.1. Statsoft, Tulsa, Oklahoma (USA).
Sudré, C.P., Leonardez, E., Rodrigues, R., Amaral Júnior, A.T. et al. 2007. Genetic resources of vegetable crops: a survey in the Brazilian germplasm collections pictured through papers published in the journals of the Brazilian Society for Horticultural Science. Hortic. Bras. 25:496-503.
Tables
Table 1. Groups and accessions within groups formed in the cluster analysis obtained by the algorithm of Gower in the evaluation of 11 quantitative traits, 5 qualitative traits and 43 molecular markers.
Group   Accessions
1       FRF-632
2       Guiana
3       Perolera
4       RBR-1, SNG-3
5       SNG-2 (Quinari)
6       FRF-678, FRF-762, FRF-1399, Pao de Acucar
7       LBB-1384, LBB-1374
8       LBB-1385
9       Híbrido-3607, Natal Queen, Puerto Rico I-67, Smooth Cayenne, Smooth Cayenne 3, Smooth Cayenne 2
10      Alto Turi, AUST-2480, BGA-25, Boituva, Champaka F-153, Codazzi, Comum, Fazenda Barreiro, Fernando Costa, FFR-1200, FRF-11, FRF-1202, FRF-1208, FRF-1220, FRF-1221, FRF-1226, FRF-1358, FRF-1369, FRF-1396, FRF-1403, FRF-150, FRF-154, FRF-156, FRF-160, FRF-162, FRF-168, FRF-235, FRF-250, FRF-297, FRF-351, FRF-585, FRF-609, FRF-634, FRF-640, FRF-652, FRF-680, FRF-684, FRF-7, FRF-733, FRF-737, FRF-770, FRF-8, FRF-800, FRF-820, FRF-845, G-79, G-80, Inerme de Rondônia, Jupi, Jupi 2, Jupi-TO, LBB-1383, LBB-1386, LBB-1396, LBB-1413, LBB-1444, LBB-1450, LBB-569, LBB-612, Local de Tefé, MD-2, Muito Rústico, Pérola, Primavera, Rondon 2, Rondon 3, Roxo de Tefé, Semi-Selvagem, SNG-1, SNG-4, TD-240
Figures
Fig. 1. Dendrogram of dissimilarity of 90 pineapple accessions obtained through the algorithm of Gower in the evaluation of 11 quantitative traits, 5 qualitative traits and 43 molecular markers.
[Figure 1 image not reproduced. Axis: Linkage Distance (tick marks from 0.0 to 0.4); leaf labels: the 90 accession names.]