Global Database on Crop Wild Relatives


A global database for the distributions of crop wild relatives

Nora P. Castañeda-Álvarez1,2,*, Colin K. Khoury1,3, Chrystian C. Sosa1, Ruth J. Eastwood4, Ruth Harker4, Holly Vincent2, Harold A. Achicanoy1, Vivian Bernau1, Nigel Maxted2 and Andy Jarvis1

1 International Center for Tropical Agriculture (CIAT), Colombia; 2 University of Birmingham, UK; 3 Wageningen University, The Netherlands; 4 Royal Botanic Gardens, Kew, UK; *[email protected]

Introduction

Crop wild relatives (CWR) are undomesticated plant species that are important for agriculture because of their wide genetic diversity and their relatively close genetic relationship to cultivated species, which makes them an important source of unique traits for crop improvement (Maxted et al., 2006). They have been used successfully in plant breeding programs for many decades, contributing improved yield and nutritional content, resistance to pests and diseases, and tolerance to abiotic stresses (e.g. drought, flood, temperatures outside the optimal range of the crop, soils with unsuitable pH conditions) (Hajjar and Hodgkin, 2007; Maxted and Kell, 2009). Their use has grown during the last two decades, as the techniques and understanding underlying the transfer of such traits into cultivated species have improved (Tanksley and McCouch, 1997). Because of these features, CWR have been specifically identified as valuable resources that can contribute to the adaptation of agriculture to climate change (Dempewolf et al., 2013; McCouch et al., 2013).

Despite their importance and potential, crop wild relatives are underrepresented in ex situ collections (FAO, 2010) and their habitats are increasingly threatened due to rapid changes in land use, invasive species, pollution and climate change (Jarvis et al., 2008). A sufficient understanding of the natural distributions of CWR is fundamental to guide future collecting and conservation efforts.

Figure 1: Examples of crop wild relatives

Main Objective

To build a comprehensive occurrence database of crop wild relatives that can be used as an input for research in ecology, botany, evolution, biodiversity, and conservation, specifically to further our understanding of the conservation status, major threats and potential uses of these valuable wild genetic resources.

Methodology

1. We designed a MySQL database following Darwin Core standards to capture information on the taxonomic classification of each record (e.g. scientific name, botanical family, author and date of determination), the location where the germplasm/herbarium sample was originally found (e.g. country, coordinates, elevation, locality description), details of the institution where the germplasm accession or herbarium specimen is held, and other associated information. We also included fields from the multi-crop passport descriptors (Alercia et al., 2012) to store details of the germplasm accessions of crop wild relatives included in the database.

2. Following the inventory of CWR (Vincent et al., 2013), we compiled information for these taxa from multiple sources, namely genebanks, herbaria, national programs, international agricultural research centers, online databases, scientific literature and individual scientists. We approached institutions and individuals through e-mail requests, visits, and posters and presentations at conferences from 2011 to 2014.

3. The information obtained from direct visits to herbaria was digitized manually, capturing all the information available for the specimen (e.g. recent determinations, locality description, coordinates if available, barcodes, duplicates sent to other herbaria, flowers/fruits present in the specimen). Such digitized information was returned in kind to the providing herbaria.

4. We followed an iterative process to detect errors in the coordinates (e.g. records falling in the ocean, records outside the countries where the passport data say they were found, multiple records coinciding with the centroid of the country) (Hijmans et al., 1999).

5. We used an automated procedure, based on the Google Geocoding API®, to assign coordinates to records that had a sufficient description of the provenance of the sample (e.g. country, locality) but no coordinates.

6. The entire dataset was taxonomically verified using the Taxonomic Name Resolution Service (TNRS) (Boyle et al., 2013), TaxonStand (Cayuela et al., 2012) and GRIN-Taxonomy (USDA ARS National Genetic Resources Program, n.d.).
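The coordinate checks in step 4 can be illustrated with a minimal Python sketch. This is not the procedure used for the database; it only shows the kind of rule involved. The `Record` class, the flag names, and the bounding box and centroid values are hypothetical and deliberately rough, and a bounding box is a crude stand-in for real country polygons. Detecting records that fall in the ocean would additionally need a land or coastline layer, which is not covered here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Rough, illustrative values only (assumptions, not authoritative boundaries).
# Bounding boxes are (min_lat, max_lat, min_lon, max_lon) in decimal degrees.
COUNTRY_BBOX = {"CO": (-4.3, 12.6, -79.1, -66.8)}
COUNTRY_CENTROID = {"CO": (4.0, -72.0)}


@dataclass
class Record:
    """Hypothetical occurrence record with passport country and coordinates."""
    record_id: str
    country: str            # ISO 3166-1 alpha-2 code from the passport data
    lat: Optional[float]
    lon: Optional[float]


def coordinate_flags(rec: Record, centroid_tol: float = 0.01) -> List[str]:
    """Return a list of quality flags for one record (empty list = no issue found)."""
    if rec.lat is None or rec.lon is None:
        return ["missing_coordinates"]
    if not (-90 <= rec.lat <= 90 and -180 <= rec.lon <= 180):
        return ["out_of_range"]

    flags = []
    if rec.lat == 0 and rec.lon == 0:
        flags.append("zero_zero")

    # Passport country versus coordinates (bounding-box approximation).
    bbox = COUNTRY_BBOX.get(rec.country)
    if bbox is not None:
        min_lat, max_lat, min_lon, max_lon = bbox
        inside = min_lat <= rec.lat <= max_lat and min_lon <= rec.lon <= max_lon
        if not inside:
            flags.append("outside_country")

    # Coordinates that coincide with the country centroid are suspicious.
    centroid = COUNTRY_CENTROID.get(rec.country)
    if centroid is not None:
        if (abs(rec.lat - centroid[0]) <= centroid_tol
                and abs(rec.lon - centroid[1]) <= centroid_tol):
            flags.append("country_centroid")
    return flags
```

Running such a function over all records, correcting or discarding the flagged ones, and then running it again is one way to read the "iterative" character of the process described in step 4.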

Results

We created a MySQL database comprising 165 fields and holding 5,647,442 records, of which 34% correspond to germplasm accessions and 66% to herbarium samples. A total of 3,231,286 records have cross-checked coordinates (see Figure 2). In addition, 322,735 records were newly georeferenced using the Google Geocoding API® and 15,713 new records were obtained by digitizing the information contained in herbarium specimens. Data were gathered from more than 100 data providers, including GBIF (a comprehensive list of institutions and individuals is available here: http://www.cwrdiversity.org/data-sources/). The geographic coverage of the dataset includes 96% of the world's countries, and the dataset also includes records of cultivated plants (one third of the dataset).

Records of the crop wild relatives of 80 crop gene pools can be queried and visualized in this interactive map: http://www.cwrdiversity.org/distribution-map/.
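The proportions implied by the reported totals can be worked out with a few lines of Python. This is only an arithmetic sketch on the figures stated above. The variable names are ours, and the germplasm and herbarium counts are estimates, because the 34% and 66% shares are rounded in the text.

```python
# Figures as reported in the Results text.
TOTAL_RECORDS = 5_647_442
WITH_CHECKED_COORDS = 3_231_286
NEWLY_GEOREFERENCED = 322_735
NEWLY_DIGITIZED = 15_713


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Approximate counts implied by the rounded 34% / 66% split.
germplasm_est = round(TOTAL_RECORDS * 34 / 100)
herbarium_est = round(TOTAL_RECORDS * 66 / 100)

# Shares of the whole dataset.
checked_pct = share(WITH_CHECKED_COORDS, TOTAL_RECORDS)
georef_pct = share(NEWLY_GEOREFERENCED, TOTAL_RECORDS)
digitized_pct = share(NEWLY_DIGITIZED, TOTAL_RECORDS)

print(f"germplasm (est.): {germplasm_est:,}")
print(f"herbarium (est.): {herbarium_est:,}")
print(f"cross-checked coordinates: {checked_pct:.1f}% of records")
print(f"newly georeferenced: {georef_pct:.1f}% of records")
print(f"newly digitized: {digitized_pct:.2f}% of records")
```

With these figures, records with cross-checked coordinates make up roughly 57% of the dataset.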

Figure 2: Distributions of records currently held in the CWR occurrences database

Figure 3: Existing records for 29 selected genera of wild relatives important for global food security. The size of the box represents the proportion of records held in the database.

Forthcoming steps

We are preparing this dataset for public release through established, robust platforms such as GBIF (through a collaboration with the Bioversity International node), and we are improving the user experience of the interactive map (http://www.cwrdiversity.org/distribution-map). We are currently refreshing the data by querying new records available through GBIF and other datasets such as speciesLink (http://splink.cria.org.br/), checking and assigning coordinates, and verifying taxonomy when necessary.
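As an illustration of what querying new records from GBIF can look like, the sketch below pages through the public GBIF occurrence search API for one taxon. The poster does not say how GBIF was actually queried, so this is an assumption-laden example: the function names are hypothetical, and the taxon in the usage comment is an arbitrary wild tomato relative chosen for illustration, not one named in the poster.

```python
import json
import urllib.parse
import urllib.request

# Public GBIF occurrence search endpoint.
GBIF_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_gbif_url(scientific_name: str, limit: int = 300, offset: int = 0) -> str:
    """Build a search URL for georeferenced occurrences of one taxon."""
    params = {
        "scientificName": scientific_name,
        "hasCoordinate": "true",   # only records that already carry coordinates
        "limit": limit,            # page size
        "offset": offset,          # start of the page
    }
    return GBIF_SEARCH + "?" + urllib.parse.urlencode(params)


def fetch_occurrences(scientific_name: str, max_records: int = 600,
                      page_size: int = 300) -> list:
    """Download up to max_records occurrence records, one page at a time."""
    records, offset = [], 0
    while len(records) < max_records:
        url = build_gbif_url(scientific_name, page_size, offset)
        with urllib.request.urlopen(url, timeout=30) as resp:
            page = json.load(resp)
        records.extend(page.get("results", []))
        if page.get("endOfRecords", True):
            break
        offset += page_size
    return records[:max_records]


# Example call (requires network access):
# records = fetch_occurrences("Solanum chilense", max_records=300)
```

Records retrieved this way would still need the coordinate and taxonomic checks described in the Methodology before being merged into the database.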

References

Alercia, A., Diulgheroff, S. and Mackay, M. (2012), ‘FAO/Bioversity Multi-crop passport descriptors V.2’.

Boyle, B., Hopkins, N., Lu, Z., Raygoza Garay, J. A., Mozzherin, D., Rees, T., Matasci, N., Narro, M. L., Piel, W. H., McKay, S. J., Lowry, S., Freeland, C., Peet, R. K. and Enquist, B. J. (2013), ‘The taxonomic name resolution service: an online tool for automated standardization of plant names’, BMC Bioinformatics 14(1), 16.

Cayuela, L., Granzow-de la Cerda, Í., Albuquerque, F. S. and Golicher, D. J. (2012), ‘taxonstand: An R package for species names standardisation in vegetation databases’, Methods in Ecology and Evolution 3(6), 1078–1083.

Dempewolf, H., Eastwood, R. J., Guarino, L., Khoury, C., Müller, J. V. and Toll, J. (2013), ‘Adapting Agriculture to Climate Change: A Global Initiative to Collect, Conserve, and Use Crop Wild Relatives’, Agroecology and Sustainable Food Systems 38, 369–377.

FAO (2010), The Second Report on the State of the World's Plant Genetic Resources for Food and Agriculture, FAO, Rome, Italy.

Hajjar, R. and Hodgkin, T. (2007), ‘The use of wild relatives in crop improvement: a survey of developments over the last 20 years’, Euphytica 156(1-2), 1–13.

Hijmans, R. J., Schreuder, M., De La Cruz, J. and Guarino, L. (1999), ‘Using GIS to check co-ordinates of genebank accessions’, Genetic Resources and Crop Evolution 46, 291–296.

Jarvis, A., Lane, A. and Hijmans, R. J. (2008), ‘The effect of climate change on crop wild relatives’, Agriculture, Ecosystems & Environment 126, 13–23.

Maxted, N., Ford-Lloyd, B. V., Jury, S., Kell, S. and Scholten, M. (2006), ‘Towards a definition of a crop wild relative’, Biodiversity and Conservation 15(8), 2673–2685.

Maxted, N. and Kell, S. (2009), Establishment of a global network for the in situ conservation of crop wild relatives: status and needs, Technical report, Commission on Genetic Resources for Food and Agriculture, Rome, Italy. URL: http://www.fao.org/docrep/013/i1500e/i1500e18a.pdf

McCouch, S., Baute, G. J., Bradeen, J., Bramel, P., Bretting, P. K., Buckler, E., Burke, J. M., Charest, D., Cloutier, S., Cole, G., Dempewolf, H., Dingkuhn, M., Feuillet, C., Gepts, P., Grattapaglia, D., Guarino, L., Jackson, S., Knapp, S., Langridge, P., Lawton-Rauh, A., Lijua, Q., Lusty, C., Michael, T., Myles, S., Naito, K., Nelson, R. L., Pontarollo, R., Richards, C. M., Rieseberg, L., Ross-Ibarra, J., Rounsley, S., Hamilton, R. S., Schurr, U., Stein, N., Tomooka, N., van der Knaap, E., van Tassel, D., Toll, J., Valls, J., Varshney, R. K., Ward, J., Waugh, R., Wenzl, P. and Zamir, D. (2013), ‘Feeding the future’, Nature 499, 23–24.

Tanksley, S. D. and McCouch, S. R. (1997), ‘Seed Banks and Molecular Maps: Unlocking Genetic Potential from the Wild’, Science 277, 1063–1066.

USDA ARS National Genetic Resources Program (n.d.), ‘Germplasm Resources Information Network - (GRIN)’. URL: http://www.ars-grin.gov/~sbmljw/cgi-bin/taxcrop.pl?language=en

Vincent, H., Wiersema, J., Kell, S., Fielder, H., Dobbie, S., Castañeda Álvarez, N. P., Guarino, L., Eastwood, R., León, B. and Maxted, N. (2013), ‘A prioritized crop wild relative inventory to help underpin global food security’, Biological Conservation 167, 265–275.

Acknowledgements

Data gathering and analyses were undertaken as part of the initiative “Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives”, which is supported by the Government of Norway. The project is managed by the Global Crop Diversity Trust with the Millennium Seed Bank of the Royal Botanic Gardens, Kew, UK, and implemented in partnership with national and international genebanks and plant breeding institutes around the world. For further information, see http://www.cwrdiversity.org/.