RESEARCH ARTICLE
Analyses of genetic diversity and population structure of anchote (Coccinia abyssinica (Lam.) Cogn.) using newly developed EST-SSR markers
Bekele Serbessa Tolera · Kifle Dagne Woldegebriel · Abel Teshome Gari ·
Mulatu Geleta Dida · Kassahun Tesfaye Geletu
Received: 30 April 2020 / Accepted: 29 January 2021 / Published online: 17 February 2021
© The Author(s) 2021
Abstract Anchote (Coccinia abyssinica (Lam.) Cogn.) is a perennial
root crop belonging to the Cucurbitaceae family. It is endemic to
Ethiopia and distributed over a wide range of agro-ecologies. For
further improvement and efficient conservation of this crop,
characterization of its genetic diversity and its pattern of
distribution is a vitally important step. Expressed sequence
tag-simple sequence repeat (EST-SSR) markers were developed from
publicly available watermelon [Citrullus lanatus (Thunb.) Matsum. &
Nakai] ESTs in the GenBank database. Among those novel markers, eight
were polymorphic and were subsequently used for genetic diversity and
population structure analyses of 30 anchote accessions collected from
western Ethiopia. A total of 24 alleles were obtained across the eight
polymorphic loci and 30 accessions, revealing a moderate level of
genetic diversity in this minor crop. Among the eight loci, locus
CA_06 was the most informative, with six alleles and a polymorphic
information content (PIC) of 0.76.
The accessions showed about threefold variation in terms of genetic
diversity, with expected heterozygosity (He) ranging from 0.15
(accession An) to 0.44 (accession Dg). Other accessions with higher
genetic diversity include Ar and Gu (He = 0.43 and 0.41,
respectively). Analysis of molecular variance (AMOVA) revealed that
the variation within accessions and among accessions accounted for
84.7% and 15.3% of the total variation, respectively. The study
revealed low but significant population differentiation in this crop
with no clear pattern of population structure. The EST-SSR markers
developed in this study are the first of their kind for anchote and
can be used for characterization of its wider genetic resources for
conservation and breeding purposes.
Supplementary information The online version of this article
(https://doi.org/10.1007/s10722-021-01132-5) contains supplementary
material, which is available to authorized users.
B. Serbessa Tolera (✉)
Department of Biology, Wollega University,
P. O. Box 395, Nekemte, Ethiopia
e-mail: [email protected]
K. Dagne Woldegebriel · K. Tesfaye Geletu
Department of Microbial, Cellular and Molecular
Biology, Addis Ababa University,
P. O. Box 1176, Addis Ababa, Ethiopia
e-mail: [email protected]; [email protected]
K. Tesfaye Geletu
e-mail: [email protected]
A. Teshome Gari
The John Innes Centre, Norwich Research Park,
Norwich NR4 7UH, UK
e-mail: [email protected]
M. Geleta Dida
Department of Plant Breeding, Swedish University of
Agricultural Sciences, Box 101, 23053 Alnarp, Sweden
e-mail: [email protected]
K. Tesfaye Geletu
Ethiopian Biotechnology Institute, Addis Ababa, Ethiopia
Genet Resour Crop Evol (2021) 68:2337–2350
https://doi.org/10.1007/s10722-021-01132-5
Keywords Accession · Anchote · Coccinia abyssinica · EST-SSR · Genetic diversity · Population structure
Abbreviations
AMOVA Analysis of molecular variance
AR Allelic richness
EST Expressed sequence tags
FMA Frequency of major allele
SNP Single nucleotide polymorphism
SSR Simple sequence repeat
Introduction
Anchote (Coccinia abyssinica (Lam.) Cogn.) is a
perennial root crop that belongs to the Cucurbitaceae
family (Edwards et al. 1995; Hora et al. 1995). It is
endemic to Ethiopia and distributed over a wide range
of agro-ecologies adapting to altitudes as low as
550 m above sea level (masl) and as high as 2800 masl
(Getahun 1973). The plant grows well in areas where
annual rainfall is between 950 and 2000 mm (Getahun
1973; Westphal 1974). This species can reproduce
both sexually and asexually through its root tubers.
Although anchote is monoecious, the main mode of
sexual reproduction is outcrossing due to protandry
(Edwards et al. 1995). Asexually, it undergoes an
annual cycle in which its herbaceous shoots die back
and new shoots emerge from its “everlasting” rootstock
at the onset of the rainy season. Sowing
seeds is a preferred means of anchote cultivation as the
crop produces hundreds of seeds per fruit (Hora et al.
1995; Wondimu et al. 2014).
Anchote tuber is rich in protein and calcium with
low content of anti-nutritional factors, and hence
highly recommended as human food (Hora et al.
1995). It has also been used as traditional medicine in
Ethiopia to treat various ailments such as bone
fracture, backache and displaced joints, and diseases
such as gonorrhea, tuberculosis, and cancer (Dawit
and Estifanos 1991; Gelmesa 2010). Even though
anchote is an economically, nutritionally, and
medicinally valuable crop plant, there is limited
information regarding its genetics, breeding, best
agronomic practices and phylogeography to date. Recent studies
on its nutritional contents (Desta 2011), morpholog-
ical traits (Wondimu et al. 2014) and genetic diversity
using ISSR markers (Bekele et al. 2014) revealed key
information that can be used for its improvement.
However, these studies targeted only limited
geographical areas where this crop is currently grown in
Ethiopia, and hence the sample size and geographic
coverage of the previous studies did not fully represent the
wider gene pool of anchote in the country.
In order to harness the nutritional values and other
benefits of this crop, quantifying its genetic diversity is
a primary step. At present, single nucleotide polymor-
phism (SNP) markers are the marker of choice as they
have genome-wide coverage and abundance in genic
regions of most studied crops. However, such molec-
ular markers require significant investment in terms of
resources and time and hence are not an immediate
option for an orphan crop like anchote. Alternatively,
expressed sequence tag-simple sequence repeats
(EST-SSRs) are the second-best option, as they are
quicker and cheaper to develop and use (Ellis et al.
2006). EST-SSR markers are easy to develop,
multi-allelic, transferable across genera, and derived
from the expressed portion of the genome.
They are also among markers of choice for QTL
analysis to identify genes driving traits of agronomic
interest (Ellis and Burke 2007). Therefore, developing
such markers for anchote would lead to an enhanced
understanding of its genome in general. In particular,
such molecular tools can guide in-situ and ex-situ
conservation efforts in Ethiopia and also play a role in
breeding new varieties. The present study was therefore
aimed at developing new EST-SSR markers for
anchote and using them to assess the genetic
diversity and population structure of anchote
accessions grown in western Ethiopia, where this crop is
a staple food.
Materials and methods
Plant material
Seed samples were collected from anchote accessions
in 2013 from areas where the plant exists in nature and
where it is widely cultivated by small-scale farmers
(Fig. 1). Thirty accessions, each represented by five
individual plants, were selected based on their geo-
graphic distribution in the sampling area and their
seeds were planted at Holeta Agricultural Research
Center, Ethiopian Institute of Agricultural Research,
located 41.7 km west of Addis Ababa, during the
2013/14 crop growing season. Three weeks after
planting, young and healthy leaves were sampled
separately from each seedling representing the 30
accessions. Leaf samples were kept in zipper plastic bags
containing sufficient silica gel for efficient desiccation
and stored at room temperature until DNA extraction
was conducted.
DNA extraction
Each silica gel-dried leaf sample was transferred to a
2 ml Eppendorf tube containing two glass beads and
ground to a fine powder using a Mixer Mill MM 400
(Retsch GmbH, Germany). DNA was extracted from
the ground samples following the protocol
described in Geleta et al. (2012). The quality and
quantity of the extracted DNA were assessed by
agarose gel electrophoresis (1.5% (w/v)) and
NanoDrop measurement. The genomic DNA
extracted from four of the samples was of poor quality
and was consequently discarded. Hence, genomic DNA
from 146 individual plants representing the 30
accessions was used in this study (Table 1).
Fig. 1 Map of Ethiopia showing sample collecting sites (districts) of the 30 accessions used in this study
EST-SSRs screening and PCR
Watermelon [Citrullus lanatus (Thunb.) Matsum. &
Nakai] expressed sequence tags were downloaded
from the National Center for Biotechnology Information
(NCBI) database (https://www.ncbi.nlm.nih.gov/nuccore).
Then WebSat (Martins et al. 2009), a web
software for microsatellite marker development, was
used to identify ESTs containing simple sequence
repeats (di-, tri-, tetra-, penta- and hexanucleotide
repeats). Primer3 (Rozen et al. 2000), an online primer
design program, was then used to design primer-pairs
targeting the SSR-containing regions of the ESTs. In the
end, forty-seven primer-pairs were successfully designed
and tested for their potential to amplify the homologous
DNA regions in the anchote genome. Genomic DNA from
five randomly selected anchote accessions was used to
test these primer-pairs.
Table 1 List of 30 anchote accessions used in this study, their sampling districts, and coordinates and altitude of the sampling sites
No  Accession code  Sampling district*  Coordinates                      Altitude (m asl)
1   Ay              Ayira               9°05′50.16″ N  35°23′46.86″ E    1625.803
2   La              Lalo Asabi          9°10′14.89″ N  35°28′21.77″ E    1574.597
3   Ds              Dale Sadi           8°53′59.12″ N  35°13′28.30″ E    1516.685
4   Dw              Dale Wobera         8°55′46.01″ N  35°03′09.30″ E    1606.296
5   Sy              Sayo                7°36′05.68″ N  36°43′31.35″ E    1843.43
6   An              Anfilo              8°28′19.64″ N  34°35′26.55″ E    1496.568
7   Gm              Gimbi               9°10′15.15″ N  35°50′08.88″ E    1967.789
8   Nj              Nejo                9°30′08.26″ N  35°30′10.22″ E    1870.862
9   Mt              Metu                8°18′00.01″ N  35°35′00.65″ E    1703.222
10  Al              Alle                8°08′57.04″ N  35°32′12.79″ E    1950.415
11  Hr              Hurumu              8°20′51.31″ N  35°41′49.24″ E    1815.084
12  Yy              Yayo                8°20′09.42″ N  35°49′20.82″ E    1598.981
13  Ch              Chora               8°21′39.86″ N  36°07′14.56″ E    2021.129
14  De              Dega                8°34′06.69″ N  36°06′36.03″ E    2266.798
15  Gc              Gechi               8°20′51.11″ N  35°51′53.91″ E    1642.872
16  Dh              Dhidhesa            8°04′53.55″ N  36°27′32.67″ E    1902.257
17  Gu              Gumay               8°01′00.00″ N  36°31′00.00″ E    1621.536
18  Go              Goma                7°51′26.25″ N  36°35′18.16″ E    1671.523
19  Gr              Gera                7°30′00.14″ N  36°04′00.00″ E    2164.69
20  Sh              Shebe Sombo         7°30′22.25″ N  36°30′48.40″ E    1776.07
21  Dd              Dedo                7°30′58.80″ N  36°52′39.12″ E    2112.569
22  Dc              Decha               6°29′48.33″ N  43°57′58.05″ E    2150.001
23  Gg              Guto Gida           9°01′22.30″ N  36°28′24.35″ E    2242.414
24  Dg              Diga                9°01′32.40″ N  36°27′14.53″ E    2215.896
25  Ar              Arjo                8°45′18.34″ N  36°29′53.66″ E    2510.638
26  Ld              Leka Dulecha        8°53′51.92″ N  36°29′11.29″ E    2161.946
27  Ss              Sibu Sire           9°01′58.13″ N  36°51′59.67″ E    1818.742
28  Wt              Wayu Tuka           9°04′30.37″ N  36°30′47.77″ E    2320.002
29  Ac              Abay Chomen         9°31′28.09″ N  37°21′04.60″ E    1980.01
30  Bt              Bako Tibe           9°07′05.17″ N  37°03′30.65″ E    1780.005
The total reaction volume for the amplification of
the EST-SSRs was 25 µl, composed of 2.5 µl of
10× PCR buffer (10 mM Tris–HCl pH 8.3, 50 mM
KCl, 1.25 mM MgCl2), 1.5 µl of 25 mM MgCl2,
0.3 µl of 25 mM dNTPs, 0.75 µl of 10 µM of each of
the forward and reverse primers, 1.0 U (0.2 µl) of Dream
Taq DNA polymerase (Sigma, Germany), 2.5 µl of
10 ng template DNA, and 16.5 µl Millipore water.
PCR amplification was performed in 96-well plates
using a Thermal Cycler (S1000™) with the
following touchdown PCR conditions: initial preheating
at 95 °C for 3 min, then nine cycles of denaturation at
94 °C for 30 s, annealing at 58 °C for 45 s and primer
extension at 72 °C for 45 s. The latter three steps were
repeated nine times, reducing the annealing temperature
by 1 °C in each subsequent touchdown
cycle. This was followed by 30 cycles of denaturation at
94 °C for 30 s, primer annealing at 48 °C for 45 s
and primer extension at 72 °C for 45 s. At the end of
these cycles, an additional 3 min primer extension was
done at 72 °C. To check the quality and size of the
amplification products, a sub-sample from each PCR
reaction was loaded onto a 1.5% agarose gel containing
ethidium bromide and electrophoresed in 1× TAE buffer.
The gels were photographed using a gel documentation
system, and the fragment sizes were compared with a 50 bp
DNA ladder (GeneRuler™, Fermentas Life Sciences),
which was used as the molecular size standard.
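As a quick arithmetic check of the reaction mix described above, the following minimal sketch sums the listed component volumes and scales them for a batch of reactions. The component volumes are those given in the text; the function name `master_mix` and the `overage` parameter are hypothetical conveniences and are not part of the published protocol.

```python
import math

# Volumes in microlitres per 25 µl reaction, as listed in the text.
MIX_UL = {
    "10x PCR buffer": 2.5,
    "25 mM MgCl2": 1.5,
    "25 mM dNTPs": 0.3,
    "10 uM forward primer": 0.75,
    "10 uM reverse primer": 0.75,
    "Taq DNA polymerase": 0.2,
    "template DNA": 2.5,
    "Millipore water": 16.5,
}


def master_mix(n_reactions, overage=0.10):
    """Scale all components except template DNA for n reactions.

    `overage` is an assumed pipetting allowance (hypothetical, not from the
    paper); template DNA is left out because it is added per sample.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in MIX_UL.items()
        if name != "template DNA"
    }


total_volume = sum(MIX_UL.values())
print(f"Total per reaction: {total_volume:.2f} ul")
print(master_mix(96))
```

The per-reaction total should come to the stated 25 µl, which is a useful sanity check on the individual volumes.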
Out of the 47 EST-SSRs primer-pairs tested, 12
primer-pairs amplified a maximum of two DNA
fragments of expected size per sample and hence
were selected for use in this study. The description of
these 12 primer-pairs is provided in Table 2. The
forward primers were 5′-end labeled with
carboxyfluorescein (6-FAM™) or hexachlorofluorescein
(HEX™) fluorescent dyes, whereas the reverse
primers were tailed with GCTTCT to reduce
polyadenylation and improve genotyping as described
in Ballard et al. (2002). These 12 labeled primer-pairs
were used for PCR amplifications of the DNA samples
extracted from the 146 individuals representing the 30
accessions (Table 2).
The amplified products were multiplexed into
panels based on the fragment size differences and
the type of fluorescent dye of the forward primers. This
was followed by capillary gel electrophoresis using
ABI Prism® 3730xl genetic analyzer (Applied
Biosystems). The peak identification and fragment
size determination, based on the Genescan-500 LIZ
internal size standard, were done using GeneMarker
ver. 2.4.0 (SoftGenetics).
Data analyses
Various genetic diversity parameters were analyzed
using different software. Observed number of alleles
(Na), effective number of alleles (Ne), percentage of
polymorphic loci (%PL), Shannon–Weaver diversity
index (I), observed heterozygosity (Ho), expected
heterozygosity (He) and gene flow (Nm) were analyzed
using POPGENE ver. 32 (Yeh and Yang 1999).
DARwin ver. 6.0.112 (2014) was used for cluster
analysis of the 30 accessions using the neighbor-joining
method. POWERMARKER ver. 3.25 (Liu and Muse 2005)
was used to calculate the polymorphic information
content (PIC) of each locus (Anderson et al. 1993).
Analysis of molecular variance (AMOVA) was conducted
using ARLEQUIN ver. 3.01 (Excoffier et al. 2005).
HP-Rare ver. 1.0 (Kalinowski 2005) was used to
calculate allelic richness and private alleles, allele
frequency per locus, and gene diversity per locus and
per population. GenAlEx ver. 6.502 (Peakall and Smouse
2004), an Excel add-in, was used to estimate pair-wise
population differentiation and pair-wise genetic distance.
The population structure analysis by clustering
method and determination of the optimum number of
clusters (K) were conducted using STRUCTURE ver.
2.3.4 (Pritchard et al. 2000) with the admixture
model (Gilbert et al. 2012). The STRUCTURE
analysis was conducted for K = 2 to K = 25 with a
burn-in period of 200,000 and 200,000 Markov
chain Monte Carlo (MCMC) replications,
with 10 runs at each K. The optimum K value (maximum
ΔK), an estimate of the number of distinct genetic
groups, was determined using STRUCTURE HARVESTER
according to Evanno et al. (2005). Cluster
alignment across the replicates was done using
CLUMPP software (Jakobsson and Rosenberg 2007)
and the population clusters were visualized by
DISTRUCT software (Rosenberg 2004).
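The ΔK criterion used to choose K can be illustrated with a minimal sketch. It assumes one common reading of the Evanno et al. (2005) statistic, namely the absolute second-order difference of the mean log-likelihood L(K) across runs divided by the standard deviation of L(K). The log-likelihood values below are hypothetical and are not results from this study, and the function `delta_k` is illustrative rather than the STRUCTURE HARVESTER implementation.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to the list of log-likelihoods from replicate runs.
    """
    result = {}
    for k, runs in lnp_by_k.items():
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # the statistic is undefined at the smallest and largest K
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(runs) + mean(lnp_by_k[k - 1])
        )
        spread = stdev(runs)
        if spread > 0:
            result[k] = second_diff / spread
    return result


# Hypothetical replicate log-likelihoods for K = 1..5 (three runs each).
lnp = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-800.0, -801.0, -799.0],
    4: [-795.0, -805.0, -800.0],
    5: [-798.0, -790.0, -806.0],
}

dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, "optimum K:", best_k)
```

Because the statistic needs both neighbouring K values, it cannot be evaluated at the lowest and highest K tested.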
Results
In the present study, twelve EST-SSR markers were
developed for anchote, of which nine were di-nucleotide
repeats and the remaining three were tri-nucleotide
repeats (Table 2). Four of these loci (CA_03, CA_04,
CA_05 and CA_12) were monomorphic (Table 2).
Table 2 Description of 12 EST-SSR primer-pairs developed for Coccinia abyssinica based on Citrullus lanatus ESTs, and expected and observed size (in bp) of their amplified products (alleles)
Locus/primer name  Repeat motif  Accession number*  Forward primer (5′→3′)     Ta (°C)  Reverse primer (5′→3′)        Ta (°C)  EAS  OASR     Na
CA_01              (CTT)8        AI563449.1         GCAAGCTGCCTTCATTCTCT       59.7     AAAAGTGGGGGTGATAAGAGG         59.3     189  170–179  4
CA_02              (ATC)6        JG705240.1         GCGTCAAGAAATTCCCTCTC       58.9     TTGAAACTTGGAATGGATTGG         59.8     390  395–398  2
CA_03              (AT)9         JG704431.1         CAACGAAAAAGAAAACACATCC     58.6     AAGGAGAAGCTTCCAGGACA          59.0     372  397      1
CA_04              (AT)9         JG704425.1         ATCGAGGGATATGTGGTGGT       59.1     CCCCCATTTACTCAATTTCC          58.2     284  291      1
CA_05              (AT)8         JG704264.1         GAAAAAGAAAACACATCCCTTCC    60.2     GAGGCTGCTCCTCATCATCA          61.5     297  316      1
CA_06              (TC)6         JG704618.1         AGCAGCTGCAGCAGTGAAAT       61.3     GATGACCTTAAAAGATATTCTACAAGC   57.4     240  258–268  6
CA_07              (CCT)7        JG703774.1         AACCTCTCACACACAAACAAAAA    58.7     CGATGCGATTGAGAACGAA           60.9     242  234–246  3
CA_08              (TA)7         JG705018.1         TGGGTTGTTGCACAAATTGA       60.9     AAAATGAATTCCCAAGAAAGATG       58.5     300  323–339  2
CA_09              (TA)11        JG703427.1         GACAATCCAATACCCTTTACATTT   57.5     TTTGCCTTCATGTTTGCGTA          60.2     236  227–231  3
CA_10              (AG)14        JG703239.1         CCTCCAAACTCATCATTACCC      58.4     TCAATCTCAATGGACATGAAAGA       59.6     400  402–404  2
CA_11              (CT)14        JG703179.1         TTGGCTTTTGGATTCTGCTT       59.8     TAAGCCAAGACCCCTTGATG          60.1     245  225–229  2
CA_12              (AT)8         JG702160.1         CAAAATGAACACTATGACAACGAA   59.1     CATGAAGAAGATCAAGGAAAGGA       59.7     300  322–324  1
EAS estimated allele size; OASR observed allele size range; Na observed number of alleles
*Accession number of watermelon ESTs used to develop the corresponding SSRs
The other eight loci showed different levels of
polymorphism, and their data were used for genetic
diversity and population structure analyses of the 30
accessions. A total of 24 alleles were identified across
the 146 samples and the eight loci (Table 3). The
observed number of alleles per locus (Na) ranged from
two to six, with an average of three alleles per locus.
The effective number of alleles (Ne) ranged from 1.07
(locus CA_08) to 4.70 (locus CA_06), with an average
of 1.90 (Table 4). The frequency of the alleles across
all accessions varied from 0.01 (alleles A and D of
locus CA_01) to 0.97 (allele B of locus CA_08)
(Table 3). Locus CA_06 was the most polymorphic locus,
with six alleles and a polymorphic information content
(PIC) of 0.76, whereas the lowest PIC (0.06) was
recorded for locus CA_08. The highest (1.64) and
lowest (0.14) Shannon’s information index (I) were
recorded for loci CA_06 and CA_08, respectively. The
eight loci had mean PIC and I values of 0.31 and
0.63, respectively (Table 4). Among the loci with only
two alleles, locus CA_11 was the most informative, with
a PIC value of 0.37. Overall, loci CA_06 and CA_08 had
the highest and lowest values, respectively, of Ne, I
and PIC among the eight loci (Table 4).
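To make the per-locus statistics concrete, the sketch below computes Ne, I and PIC from allele frequencies. It assumes the standard textbook definitions (Ne = 1/Σp², I = −Σp ln p, and PIC = 1 − Σp² − ΣΣ 2pᵢ²pⱼ²); the paper does not print these formulas, so the function names and the choice of definitions are assumptions. The frequencies used are the rounded values reported in Table 3 for loci CA_11 and CA_02.

```python
import math


def effective_alleles(freqs):
    """Effective number of alleles: reciprocal of expected homozygosity."""
    return 1.0 / sum(p * p for p in freqs)


def shannon_index(freqs):
    """Shannon information index using natural logarithms."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def pic(freqs):
    """Polymorphic information content for one locus."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross_term


# Rounded allele frequencies as reported in Table 3.
ca_11 = [0.51, 0.49]
ca_02 = [0.22, 0.78]

for name, freqs in (("CA_11", ca_11), ("CA_02", ca_02)):
    print(
        name,
        round(effective_alleles(freqs), 2),
        round(shannon_index(freqs), 2),
        round(pic(freqs), 2),
    )
```

For these two loci the results agree with the Ne, I and PIC values listed in Table 4 to two decimal places; loci with more alleles may differ slightly because the tabulated frequencies are rounded.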
Gene diversity (He) estimated across the accessions
for each locus varied from 0.06 (locus CA_08) to 0.79
(locus CA_06) with a mean of 0.37, whereas observed
heterozygosity (Ho) varied from 0.01 (locus CA_08)
to 0.84 (locus CA_11) with a mean of 0.35 (Table 5).
Loci CA_01, CA_07, and CA_08 indicated significant
inbreeding of individuals within accessions (FIS),
whereas all loci except CA_10 and CA_11 showed
significant inbreeding of individual genotypes relative
to random association of alleles in the whole
population (FIT). Loci CA_02, CA_06 and CA_08 were the
major contributors to significant population
differentiation (FST). The mean total (Ht),
within-accession (Hs) and among-accession (DST) gene
diversity were 0.37, 0.35 and 0.02, respectively. The
overall average population differentiation (FST) and
gene flow (Nm) were 0.11 and 2.06, respectively (Table 5).
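The relationships among these statistics can be checked with a short sketch. It assumes Wright's identity (1 − FIT) = (1 − FIS)(1 − FST) and the widely used island-model approximation Nm = (1 − FST)/(4 FST); the paper states only that Nm was estimated from FST (Nei 1987), so the exact formula here is an assumption, and the function names are illustrative. Inputs are the rounded per-locus values reported in Table 5.

```python
def fit_from(fis, fst):
    """Total inbreeding coefficient from Wright's hierarchical identity."""
    return 1.0 - (1.0 - fis) * (1.0 - fst)


def gene_flow(fst):
    """Approximate number of migrants per generation (assumed island model)."""
    return 0.25 * (1.0 - fst) / fst


# Rounded (FIS, FST) pairs for two loci as reported in Table 5.
ca_08 = (0.90, 0.30)
ca_11 = (-0.69, 0.01)

print("CA_08 FIT:", round(fit_from(*ca_08), 2), "Nm:", round(gene_flow(ca_08[1]), 2))
print("CA_11 FIT:", round(fit_from(*ca_11), 2))
print("Mean Nm from mean FST = 0.11:", round(gene_flow(0.11), 2))
```

The FIT values recovered this way match the tabulated ones, while Nm differs in the second decimal place because the tabulated FST values are rounded.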
Gene diversity across the loci for each accession
varied from 0.15 (accession An) to 0.44 (accession
Dg). Other accessions with a relatively high diversity
include Ar (0.43), Gm (0.41), Gu (0.41), Ac (0.40) and
Sy (0.40) (Table 6). The average allelic richness of the
accessions varied from 1.38 (accession An) to 2.38
(accession Ar) (Table 6). Accessions with the second
highest allelic richness (AR = 2.13) are Dw, Gm, Nj,
Dg, Ld, Ac, and Bt.
The minimum frequency of major allele (FMA)
observed was 0.2, for accession Ld at locus CA_10.
Table 3 Frequency distribution of different alleles of the eight polymorphic loci
Allele/Locus  CA_01  CA_02  CA_06  CA_07  CA_08  CA_09  CA_10  CA_11
Allele A      0.01   0.22   0.05   0.02   0.03   0.10   0.28   0.51
Allele B      0.61   0.78   0.24   0.05   0.97   0.88   0.72   0.49
Allele C      0.37   –      0.23   0.93   –      0.02   –      –
Allele D      0.01   –      0.17   –      –      –      –      –
Allele E      –      –      0.25   –      –      –      –      –
Allele F      –      –      0.05   –      –      –      –      –
The alleles are named from “A” to “F” in order of their allele size (small to large) within the observed allele size range
(OASR) provided in Table 2 for each locus
Table 4 Summary of genetic diversity parameters across all accessions for the eight polymorphic loci
Locus  Na  Ne    I     PIC   AFR
CA_01  4   1.96  0.75  0.39  0.01–0.61
CA_02  2   1.52  0.53  0.28  0.22–0.78
CA_06  6   4.70  1.64  0.76  0.05–0.25
CA_07  3   1.15  0.29  0.13  0.02–0.93
CA_08  2   1.07  0.14  0.06  0.03–0.97
CA_09  3   1.28  0.42  0.20  0.02–0.88
CA_10  2   1.67  0.59  0.32  0.28–0.72
CA_11  2   2.00  0.69  0.37  0.49–0.51
Mean   3   1.90  0.63  0.31
Na observed number of alleles, Ne effective number of alleles,
I Shannon diversity index, PIC polymorphic information
content, AFR allele frequency range
The maximum allelic richness (AR) scored was 5.0, for
accessions Ld and Bt, both at locus CA_06.
Average gene diversity (He) for the 30 anchote
accessions for each of the eight polymorphic loci, as
well as their mean values, are presented in
Supplementary Table 1.
The pair-wise Nei’s standard genetic distance
between the accessions varied from 0.02 (accessions
Ss vs Sh) to 0.48 (accessions Gu vs Ds), with an overall
mean genetic distance of 0.13 (Supplementary
Table 2). Of the accession-pairs, 12.9% had a genetic
distance above 0.2, whereas 34.9% had a genetic
distance below 0.1
tiated with a mean genetic distance of 0.30 from the
other accessions whereas accessions Ar and Ac are the
least differentiated with a mean genetic distance of
0.09. The analysis of pair-wise FST revealed that
12.9% accession-pairs were significantly differenti-
ated (Supplementary Table 2).
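For readers unfamiliar with the distance measure, the sketch below computes Nei's standard genetic distance, D = −ln(Jxy / √(Jx·Jy)), between two accessions from per-locus allele frequencies. The frequencies are hypothetical and are not taken from this study; the function name is illustrative, and no sample-size correction is applied.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    Each argument is a list of loci; each locus is a list of allele
    frequencies given in the same allele order for both populations.
    """
    n_loci = len(pop_x)
    jx = sum(sum(p * p for p in locus) for locus in pop_x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in pop_y) / n_loci
    jxy = sum(
        sum(p * q for p, q in zip(locus_x, locus_y))
        for locus_x, locus_y in zip(pop_x, pop_y)
    ) / n_loci
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical two-locus allele frequencies for two accessions.
acc_1 = [[0.5, 0.5], [0.8, 0.2]]
acc_2 = [[1.0, 0.0], [0.8, 0.2]]

d_12 = nei_standard_distance(acc_1, acc_2)
print(round(d_12, 3))
```

A distance of zero indicates identical allele frequencies, and the measure is symmetric in its two arguments.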
Table 5 Estimates of overall heterozygosity, genetic diversity, population differentiation, F-statistics and gene flow for the eight polymorphic loci across the 30 anchote accessions
Locus  Ho    He    Hs    Ht    DST   FIS     FIT     FST     Nm
CA_01  0.35  0.49  0.48  0.49  0.01  0.23**  0.30**  0.09    2.60
CA_02  0.29  0.35  0.29  0.35  0.06  -0.06   0.16*   0.20**  0.98
CA_06  0.71  0.79  0.72  0.80  0.07  -0.04   0.09*   0.13**  1.65
CA_07  0.08  0.13  0.12  0.13  0.01  0.26**  0.35**  0.12    1.82
CA_08  0.01  0.06  0.07  0.10  0.02  0.90**  0.93**  0.30**  0.59
CA_09  0.19  0.22  0.21  0.22  0.01  0.06    0.15*   0.09    2.61
CA_10  0.36  0.40  0.38  0.41  0.02  0.00    0.10    0.10    2.23
CA_11  0.84  0.50  0.50  0.50  0.00  -0.69   -0.67   0.01    19.71
Mean   0.35  0.37  0.35  0.37  0.02  -0.08   0.04    0.11    2.06
Ho observed heterozygosity; He expected heterozygosity; Hs gene diversity within populations; Ht total gene diversity; DST Nei’s
(1978) unbiased average gene diversity between populations; FIS, FIT, FST fixation indices; Nm gene flow estimated from FST (Nei 1987)
*P < 0.05; **P < 0.01
Table 6 Average frequency of major alleles, allelic richness, and gene diversity of the 30 anchote accessions
Accession  FMA   AR    He      Accession  FMA   AR    He
Ay         0.74  2.00  0.37    Dh         0.83  1.63  0.21
La         0.78  1.88  0.30    Gu         0.70  1.88  0.41
Ds         0.83  1.75  0.27    Go         0.82  1.75  0.27
Dw         0.73  2.13  0.38    Gr         0.75  1.75  0.34
Sy         0.70  2.00  0.40    Sh         0.76  1.88  0.34
An         0.90  1.38  0.15    Dd         0.82  1.75  0.26
Gm         0.75  2.13  0.41    Dc         0.83  1.63  0.25
Nj         0.71  2.13  0.39    Gg         0.74  2.00  0.37
Mt         0.78  1.88  0.32    Dg         0.66  2.13  0.44
Al         0.80  1.75  0.27    Ar         0.72  2.38  0.43
Hr         0.74  1.88  0.35    Ld         0.74  2.13  0.36
Yy         0.79  1.75  0.29    Ss         0.73  1.75  0.36
Ch         0.81  1.75  0.26    Wt         0.69  1.75  0.34
De         0.82  1.75  0.26    Ac         0.73  2.13  0.40
Gc         0.83  1.50  0.25    Bt         0.72  2.13  0.39
Analysis of molecular variance (AMOVA) of the
genotypic data of the 30 accessions partitioned the
total variance into among-individuals-within-accessions
and among-accessions components, which
accounted for 84.65% and 15.35% of the total
variance, respectively. The analysis revealed a
significant population differentiation, with an FST value
of 0.15 (P < 0.001) (Table 7). Furthermore, the genetic
relationship between the 30 accessions was deter-
mined through Neighbor-joining cluster analysis
based on Nei’s standard genetic distance between the
accessions. The cluster analysis revealed three major
clusters, with cluster-I, II and III comprising 25, 2 and
3 accessions, respectively (Fig. 2), but failed to show
any clear association with geographic origin. The
optimal number of genetic clusters (K) revealed
through admixture model-based population structure
analysis was four (Fig. 3a). Hence, the 146 individuals
of the 30 accessions most likely originated from four
genetic populations. The analysis clearly suggested
low differentiation between the accessions, as all
accessions have alleles originating from the four
genetic populations (clusters) as shown by graphical
representation of the population genetic structure of
the 30 accessions at K = 4 (Fig. 3b).
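The AMOVA percentages and FST follow directly from the variance components. The minimal sketch below uses the rounded components reported in Table 7 (0.179 among accessions and 0.989 within accessions); the function name `amova_summary` is illustrative, and FST is taken here as the among-accession share of the total variance.

```python
def amova_summary(var_among, var_within):
    """Percentage of variation per level and FST from variance components."""
    total = var_among + var_within
    return {
        "pct_among": 100.0 * var_among / total,
        "pct_within": 100.0 * var_within / total,
        "fst": var_among / total,
    }


# Rounded variance components as reported in Table 7.
summary = amova_summary(0.179, 0.989)
print({key: round(value, 2) for key, value in summary.items()})

# Degrees of freedom: 146 diploid individuals give 292 gene copies.
gene_copies = 146 * 2
df_total = gene_copies - 1
df_among = 30 - 1
df_within = df_total - df_among
print(df_among, df_within, df_total)
```

The percentages differ from the tabulated 15.35% and 84.65% only in the second decimal place, which is consistent with the components having been rounded to three decimals.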
Discussion
Anchote has a good potential to share the burden of
cereals and other crops as additional calorie source in
Ethiopia. The fact that anchote is a drought-tolerant
tuber crop (Getahun 1973) makes it an ideal candidate
for the current erratic climatic conditions
characterized by frequent droughts. However, despite its
attributes in terms of economy, nutrition and
resilience, it has yet to receive research attention and is
only semi-domesticated (Dawit and Estifanos 1991; Hora
1995; Gelmesa 2010). In the present study, a total of
12 EST-SSR markers were developed from publicly
available EST sequences of watermelon. The fact that
these markers were able to amplify fragments of the
expected size in anchote opens the door to adopting other
genomic tools, such as SNPs/indels developed in
related species like watermelon, for anchote gene
pool characterization and improvement. In addition,
these newly developed EST-SSR markers can be a
vital tool for taxonomic status determination of the
species and subspecies of the genus Coccinia.
Among the newly developed EST-SSR markers,
four were monomorphic across the 146 individuals
included in the study and were hence excluded from
further diversity analyses. These monomorphic loci may
turn out to be polymorphic if additional anchote
accessions from wider geographic areas are included,
because analysis of a wider gene pool of a crop species
facilitates the identification of rare alleles (Kalinowski
2004). Rare alleles within genes of significant function
are usually associated with out-breeding species, like
anchote, where they occur in heterozygous form;
otherwise, they can be deleterious in their homozygous
state (Frankham 2002). Hence, further study is required
to elucidate the significance of these loci in population
genetic analyses of anchote.
Table 7 Analysis of molecular variance (AMOVA) of the 30 anchote accessions based on data from the eight polymorphic loci
Source              Degrees of freedom  Sum of squares  Variance components  Percentage of variation  FST   P
Between accessions  29                  79.29           0.179 (Va)           15.35                    0.15  < 0.001
Within accessions   262                 259.10          0.989 (Vb)           84.65
Total               291                 338.39          1.168                100
The two most commonly occurring types of
microsatellites in flowering plant species are di-nucleotide
and tri-nucleotide repeats (Simko 2009; Cavagnaro et al.
2010; Wen et al. 2010; Dillon et al. 2014). In the present
study, di-, tri-, tetra-, penta- and hexanucleotide repeat
EST-SSRs were included in the initially targeted 47
loci, although most of them were di- and trinucleotide
repeat SSRs. However, only the dinucleotide and
trinucleotide repeat SSRs were successful in amplifying
homologous regions in anchote (Table 2), suggesting
higher transferability of these two groups of SSRs
than the other groups. However, changes in the repeat
number of di-nucleotide repeat SSRs cause frameshift
mutations that may alter the associated protein
and/or function (Metzgar et al. 2000). Interestingly,
the highest allelic richness in the present study was
observed for locus CA_06 which is a di-nucleotide
repeat. The other polymorphic loci were also moder-
ately informative, with locus CA_01 (trinucleotide
repeat) being the second most polymorphic locus with
four alleles. In general, the maximum number of
alleles per locus (EST-SSR marker) was six, which is
comparable with what was observed in related species
like Cucumis sativus (Kong et al. 2006), Cucur-
bitaceae species (Reddy 2009) and Cucurbita pepo
(Mao et al. 2014).
The moderate genetic diversity observed in this
study indicates that strong selection pressure, probably
farmers’ intervention to improve yield and related
traits, may have reduced polymorphism. A
similarly narrow genetic base was observed in
watermelon cultivars genotyped with
similar or other types of molecular markers (Wang
2011). Therefore, in future breeding efforts, the
accessions with the highest genetic diversity among the
30 accessions, such as Dg, Ar and Gm, should be
considered for diversifying the genetic base of
cultivated varieties and for incorporating genes that
confer resilience to different abiotic stresses. In
contrast, An, Dh and Gc were the accessions with the
lowest genetic diversity, which might be
attributed to recent adaptation of anchote to the areas
where these accessions were collected (founder effect)
or to a recent bottleneck (Hawks et al. 2000).
Fig. 2 Neighbor joining tree of 30 anchote accessions constructed based on Nei’s standard genetic distance
The highest values of fixation indices for inbreeding
(FIS and FIT) were recorded for loci CA_08 and
CA_07 (Table 5). These loci have a higher heterozygote
deficiency, because most populations had fewer than
two alleles at these loci. Such loci
should be given further research attention, as they
might be linked to or located at vital genes that
minimize recombination frequency in order to prevent
loss of vital function. In contrast to these loci, locus
CA_11 showed excess heterozygosity,
which suggests heterozygote advantage at that locus.
Multi-allelic loci are of greater value for poorly
studied crops like anchote as they can serve as
molecular barcodes for variety/accession identifica-
tion or can play a role in efficient conservation and
management of anchote germplasms in Ethiopia and
beyond. Hence, loci CA_06 and CA_01 should get
priority for these purposes, as they have higher PIC
than other loci. In this study, low but significant
differentiation was obtained between the accessions
(FST = 0.15; P\ 0.001) with 54.2% of the accession-
pairs showing significant differentiation. Similar study
using inter simple sequence (ISSR) markers revealed
higher population differentiation (FST = 0.49) (Bekele
et al. 2014). Lower population differentiation and
reduced genetic diversity are expected when EST-
SSRs are used instead of neutral markers such as ISSR,
as mutation rate in EST-SSRs is relatively low (Serra
et al. 2007). Hence, significantly differentiated acces-
sions need to be given priority both for conservation
and breeding purposes. The low population differen-
tiation is attributable to the moderate level of gene
flow between anchote populations similar with other
studies (Wickert et al. 2012). In future germplasm
collecting missions, districts from which more diverse
populations were obtained should be given priority in
order to capture further diversity. Analysis of molecular
variance revealed that about 85% of the total
variation was accounted for by variation within accessions.
This is probably due to direct germplasm
exchange between farmers of different districts, and
Fig. 3 a ΔK plot showing its maximum value at K = 4,
suggesting that the optimal number of genetic clusters (populations)
is four; and b CLUMPAK plot (K = 4) of the 30 anchote
accessions based on the STRUCTURE and STRUCTURE
Harvester outputs. Each color represents a different cluster,
black vertical lines separate the accessions, and accession codes
are shown under the plot
through local market system as well as pollen flow
between anchote populations.
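As an illustration of the quantities discussed above, the following minimal sketch relates FST to gene flow and to the within-accession share of variation. It assumes Wright's island-model approximation Nm ≈ (1 − FST)/(4 FST), which the present study does not state it used; the heterozygosity values passed to f_is are hypothetical, and all function names are illustrative only.

```python
def f_is(h_obs: float, h_exp: float) -> float:
    """Inbreeding coefficient F_IS = 1 - Ho/He.

    Positive values indicate heterozygote deficiency,
    negative values indicate heterozygote excess.
    """
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


def nm_from_fst(fst: float) -> float:
    """Effective migrants per generation under Wright's island model.

    Assumption: Nm ~ (1 - FST) / (4 * FST); this is an approximation
    that holds only under idealized equilibrium conditions.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be in (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


def within_fraction(fst: float) -> float:
    """Share of total variation found within groups, taken as 1 - FST."""
    return 1.0 - fst


# FST values reported in the text (EST-SSR: 0.15; ISSR: 0.49).
for label, fst in (("EST-SSR", 0.15), ("ISSR", 0.49)):
    print(label, "Nm =", round(nm_from_fst(fst), 2),
          "within =", round(within_fraction(fst), 2))

# Hypothetical Ho/He pairs illustrating deficiency and excess.
print("deficiency:", round(f_is(0.2, 0.4), 2))
print("excess:", round(f_is(0.6, 0.4), 2))
```

Under this approximation, FST = 0.15 corresponds to roughly 1.4 migrants per generation, compared with about 0.26 for the ISSR-based estimate of 0.49, and 1 − FST = 0.85 is consistent with the roughly 85% within-accession variation reported by the analysis of molecular variance.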
The genetic distance between populations did not
follow the geographic distance between the sampling
sites of the accessions. The neighbor-joining cluster
analysis also showed a complex pattern of clustering
that is not in line with the geographic origin of
the accessions. The population structure analysis
revealed a high level of admixture between the accessions.
It also revealed little difference between the genotypes
in the likelihood of their membership in each of the
genetic populations, indicating a lack of clear
population structure in anchote. The anchote accessions
included in the present study were collected from areas
where the local communities share similar socio-economic
conditions and food culture, which might have resulted in
lower than expected differentiation/diversity between the
accessions and a lack of clear population structure. It
would be interesting to analyze anchote germplasm
representing local communities with different
socio-economic conditions and food cultures in future studies.
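The choice of K = 4 reported in Fig. 3 rests on the ΔK criterion of Evanno et al. (2005). The sketch below shows how ΔK can be computed from replicate STRUCTURE log-likelihoods, taking ΔK as the absolute second difference of the mean ln P(K) divided by the standard deviation of ln P(K) across runs. The log-likelihood values are hypothetical and were chosen only so that the peak falls at K = 4; they are not data from this study, and the function name is illustrative.

```python
from statistics import mean, stdev


def delta_k(lnp_runs: dict[int, list[float]]) -> dict[int, float]:
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_runs maps each K to the ln P(K) values of its replicate runs.
    Assumption: the second difference is taken on per-K means.
    """
    ks = sorted(lnp_runs)
    m = {k: mean(lnp_runs[k]) for k in ks}
    s = {k: stdev(lnp_runs[k]) for k in ks}
    out = {}
    for k in ks:
        # Delta K is undefined at the ends of the range and when
        # replicate runs do not vary at all.
        if k - 1 in m and k + 1 in m and s[k] > 0:
            out[k] = abs(m[k + 1] - 2 * m[k] + m[k - 1]) / s[k]
    return out


# Hypothetical replicate log-likelihoods for K = 2..6 (three runs each).
lnp_runs = {
    2: [-1000.0, -1002.0, -998.0],
    3: [-950.0, -952.0, -948.0],
    4: [-900.0, -901.0, -899.0],
    5: [-895.0, -897.0, -893.0],
    6: [-892.0, -896.0, -888.0],
}

dk = delta_k(lnp_runs)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

In this toy data set the log-likelihood rises steeply up to K = 4 and then plateaus, so ΔK peaks at K = 4. A peak in ΔK identifies the uppermost level of structure only and does not by itself rule out the weak structure and high admixture described above.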
Conclusions
The development of novel EST-SSR markers and
their application to the analysis of genetic diversity
is the first of its kind in anchote. Although the number
of markers developed in this study is small, they have an
important role to play in the characterization of anchote
genetic resources for conservation and breeding
purposes. The results of this study also provide
additional evidence for the cross-species
transferability, within a family, of a good proportion of
EST-SSR markers. The present study revealed a moderate
level of genetic diversity in anchote accessions
and suggested potential hotspots for conservation.
However, further research including more germplasm
and molecular markers is required to obtain
a clearer picture of its population genetics and
to support its improvement through breeding. Hence,
this study serves as motivation for researchers to
develop genomic tools for this vital but orphan crop so
that modern breeding tools such as marker-assisted
selection (MAS) and genomic selection (GS) can be
applied for the accelerated delivery of high-yielding and
“climate smart” anchote varieties with wider adoption,
which in turn would contribute towards food and nutrition
security in Ethiopia and Eastern Africa.
Acknowledgement Wollega University partially financed
this work. We thank Addis Ababa University (Microbial,
Cellular and Molecular Biology Department) and Swedish
University of Agricultural Sciences (Department of Plant
Breeding) for their collaboration to facilitate this research;
Holeta Agricultural Research Center is acknowledged for
allowing us to grow anchote accessions used for this study at
their experimental site.
Funding This research work is financed by the Swedish
Research Council (VR) through Swedish Research Link project
grant and Addis Ababa University’s Thematic Research Project.
Availability of data and materials The authors declare that
the data supporting the findings of this study are available within
the article.
Conflict of interest The authors declare that they have no
conflict of interest.
Ethics approval and consent to participate Not applicable.
Consent to publish We all agree to publish this work.
Consent for publication Not applicable.
Open Access This article is licensed under a Creative Com-
mons Attribution 4.0 International License, which permits use,
sharing, adaptation, distribution and reproduction in any med-
ium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative
Commons licence, and indicate if changes were made. The
images or other third party material in this article are included in
the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not
included in the article’s Creative Commons licence and your
intended use is not permitted by statutory regulation or exceeds
the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/.
References
Anderson J, Churchill G, Autrique J, Tanksley S, Sorrells M
(1993) Optimizing parental selection for genetic linkage
maps. Genome 36:181–186
Ballard LW, Adams PS, Bao Y, Bartley D, Bintzler D, Kasch L,
Petukhova L, Rosato C (2002) Strategies for genotyping:
effectiveness of tailing primers to increase accuracy in short
tandem repeat determinations. J Biomol Technol 13:20–29
Bekele A, Feyissa T, Tesfaye K (2014) Genetic diversity of
anchote (Coccinia abyssinica (Lam.) Cogn.) from Ethiopia
as revealed by ISSR markers. Genet Resour Crop Evol
61:707–719
Cavagnaro PF, Senalik DA, Yang L, Simon PW, Harkins TT,
Kodira CD, Huang S, Weng Y (2010) Genome-wide
characterization of simple sequence repeats in cucumber
(Cucumis sativus L.). BMC Genom 11:1–18
Dawit A, Estifanos H (1991) Plants as a primary source of drugs
in the traditional health practices of Ethiopia. In: Engels
JMM, Hawkes JG, Worede M (eds) Plant genetic resources
of Ethiopia. Cambridge University Press, Cambridge
Desta F (2011) Phenotypic and nutritional characterization of
anchote [Coccinia abyssinica (Lam.) Cogn] accessions of
Ethiopia. M.Sc. thesis, Jimma University
Dillon NL, Innes DJ, Bally ISE, Wright CL, Devitt LC, Dietzgen
RG (2014) Expressed sequence tag-simple sequence repeat
(EST-SSR) marker resources for diversity analysis of
mango (Mangifera indica L.). Diversity 6:72–87
Edwards S, Tadesse M, Hedberg I (1995) Flora of Ethiopia and
Eritrea. National Herbarium, Addis Ababa University and
Uppsala University, Addis Ababa
Ellis JR, Burke JM (2007) EST-SSRs as a resource for popu-
lation genetic analyses. Heredity 99:125–132
Ellis JS, Knight ME, Darvill B, Goulson D (2006) Extremely
low effective population sizes, genetic structuring, and
reduced genetic diversity in a threatened bumblebee species,
Bombus sylvarum (Hymenoptera: Apidae). Mol Ecol
15:4375–4386
Evanno G, Regnaut S, Goudet J (2005) Detecting the number of
clusters of individuals using the software STRUCTURE: a
simulation study. Mol Ecol 14:2611–2620
Excoffier L, Laval G, Schneider S (2005) Arlequin (version 3.0):
An integrated software package for population genetics
data analysis. Evol Biol 1:47–50
Frankham R, Ballou JD, Briscoe DA, McInnes KH (2002)
Introduction to conservation genetics. Cambridge Univer-
sity Press, New York
Geleta M, Herrera I, Monzón A, Bryngelsson T (2012) Genetic
Diversity of arabica coffee (Coffea arabica L.) in Nicar-
agua as estimated by simple sequence repeat markers. Sci
World J 2012:1–11
Gelmesa D (2010) Shifting to alternative food source: potential
to overcome Ethiopians’ malnutrition and poverty prob-
lems. In: Innovation and sustainable development in agri-
culture and food (ISDA), Montpellier-France, pp 1–10
Getahun A (1973) Developmental anatomy of tubers of anchote:
a potential dry land crop. Acta Hortic Tech Commun ISHS
33:51–64
Gilbert KJ, Andrew RL, Bock DG, Franklin MT, Kane NC,
Moore JS, Moyers BT, Renaut S, Rennison DJ, Veen T,
Vines TH (2012) Recommendations for utilizing and
reporting population genetic analyses: the reproducibility
of genetic clustering using the program STRUCTURE.
Mol Ecol 21(20):4925–4930
Hawks J, Hunley K, Lee SH, Wolpoff M (2000) Population
bottlenecks and pleistocene human evolution. Mol Biol
Evol 17(1):2–22
Hora A, Edwards S, Giday M, Tesfaye Y (1995) Anchote an
endemic tuber crop. Artistic Printing Press Enterprise,
Addis Ababa
Jakobsson M, Rosenberg NA (2007) CLUMPP: a cluster
matching and permutation program for dealing with label
switching and multimodality in analysis of population
structure. Bioinformatics 23(18):1–6
Kalinowski ST (2004) Counting alleles with rarefaction: Private
alleles and hierarchical sampling designs. Conserv Genet
5:539–543
Kalinowski ST (2005) HP-RARE 1.0: a computer program for
performing rarefaction on measures of allelic richness. Mol
Ecol Notes 5:187–189
Kong Q, Xiang C, Yu Z (2006) Development of EST-SSRs in
Cucumis sativus from sequence database. Mol Ecol Notes
6:1234–1236
Liu K, Muse SV (2005) PowerMarker: an integrated analysis
environment for genetic marker analysis. Bioinformatics
21:2128–2129
Mao W, Xu S, Ye S, Wang H, Mo GX, Gong X, Y, (2014)
Development of 19 transferable Cucurbita pepo EST-SSR
markers for the study of population structure and genetic
diversity in pumpkin (Cucurbita moschata). POJ
7(5):345–352
Martins WS, Lucas DCS, Neves KF, Bertioli DJ (2009) WebSat -
A web software for microsatellite marker development.
Bioinformation 3(6):282–283
Metzgar D, Bytof J, Wills C (2000) Selection against frame shift
mutations limits microsatellite expansion in coding DNA.
Genome Res 10:72–80
Peakall PR, Smouse PP (2004) GenAlEx 6.5: genetic analysis in
excel. population genetic software for teaching and
research-an update. Bioinformatics 28:2537–2539
Pritchard JK, Stephens M, Donnelly P (2000) Inference of
population structure using multilocus genotype data.
Genetics 155:945–959
Redd BU (2009) Cladistic analyses of a few members of
Cucurbitaceae using rbcL nucleotide and amino acid
sequences. Int J Biol Res 1:58–64
Rosenberg NA (2004) Distruct: a program for the graphical
display of population structure. Mol Ecol Notes 4:137–138
Rozen S, Skaletsky H (2000) Primer3 on the WWW for general
users and for biologist programmers. Meth Mol Biol
132:365–386
Serra IA, Procaccini G, Intrieri MC, Migliaccio M, Mazzuca S,
Innocenti AM (2007) Comparison of ISSR and SSR
markers for analysis of genetic diversity in the seagrass
Posidonia oceanica. Mar Ecol Prog Ser 338:71–79
Simko I (2009) Development of EST-SSR markers for the study
of population structure in lettuce (Lactuca sativa L.). J Hered 100:256–262
Wang JH, Kang J, Son B, Kim K, Park Y (2011) Genetic
diversity in watermelon cultivars and related species based
on AFLPs and EST-SSRs. Not Bot Hortic Agrobiol
39(2):285–292
Wen W, Wang H, Xia Z, Zou M, Lu C, Wang W (2010)
Development of EST-SSR and genomic-SSR markers to
assess genetic diversity in Jatropha curcas L. BMC Res
Notes 3(42):1–8
Westphal E (1974) Pulses in Ethiopia, their taxonomy and
agricultural significance. Doctoral thesis, Wageningen:
Centre for Agricultural Publishing and Documentation
Wickert E, de Goes A, de Souza A, Lemos EG (2012) Genetic
diversity and population differentiation of the causal agent
of citrus black spot in Brazil. Sci World J 2012:1–14.
https://doi.org/10.1100/2012/368286
Wondimu T, Alamerew S, Ayana A, Garedew W (2014) Genetic
diversity analysis among anchote (Coccinia abyssinica) accessions in western Ethiopia. Int J Agric Res 9:149–157
Yeh FC, Yang R (1999) Popgene Version 1.31. Microsoft
Window based freeware for population genetic analysis.
University of Alberta and Tim Boyle, Centre for Interna-
tional Forestry Research, Alberta, Canada, vol 1, pp 1–28
Publisher’s Note Springer Nature remains neutral with
regard to jurisdictional claims in published maps and
institutional affiliations.