TLI 2012: Data flows in integrated breeding
Data Flows in Integrated Breeding
Graham McLaren
Principles of DM for Integrated Breeding (IB)
IB requires high standards of sample and pedigree identification and the integration of field and lab data, and data quality is of paramount importance. Data collected during breeding processes has immediate value for breeders, and it also has cumulative value over years and populations.
Information Cycle for Crop Improvement
Public Crop Information accessible via internet
Genetic Resources Information Systems
Genomics and Genetics Databases
Crop Lead Centers: Curation, integration and publication of Public Crop Information
Breeding Informatics Community of Practice
Institutional CIS
National CIS
Project CIS
Private CIS
ARI Local CIS
NARS Local CIS
Networks Local CIS
SMEs Local CIS
Shared Information management Practices
Compatibility of DM Schemes
Users may have existing DM systems which need to be accommodated.
DM needs to be compatible across all members working on the same project.
Use of analysis and decision support tools and sharing of data with partners requires data to be formatted and stored in defined ways.
Training and support in DM and analysis are essential for IB projects.
Breeding Partner 1
Breeding Partner 2
Breeding Partner 3
Breeding Partner n
Copy of Project Database
Data manager (DM):
• Database management
• Breeding logistics
• Fieldbook preparation
• Data entry/checking
• Data management
Breeding Project n
Breeding Project 1
Breeding Project 2
Breeding Project 3
Project data management
Project Data
Local Breeding Data
Central DB curator:
• QA for public data
• Curation and integration
• Distribution to projects
• Publication on Internet
• Global Trait Dictionary
• Catalogue of Templates
• Training of DMs and Curators
Update to project database
Public Database
Public Crop Information
Crop Lead Center n
Project data curator:
• QA for project data
• Curation and integration
• Distribution to partners
• Project Trait Dictionary
• Fieldbook Templates
• Update to public DB
• Download of public DB
• Training of partner DMs
Breeding data management
Project database < shared >
Central database < shared and published >
Public Crop Central Database
Breeding Data Flows
Genetic Resources
Improved Lines
Parental Material
Crossing Block
Nursery 1
Nursery 2
Evaluation Trials
GRSS
Cultivars and breeding lines
High density genotyping
Phenotypic characterization
High density genotyping
Phenotypic evaluation
Multi-location testing
Breeding Information system
Public Crop Information
STLIMS
FDM
MSL
TSL
A&DS: Choose parental material based on haplotype values, known genes, traits and adaptation
A&DS: Develop crossing scheme based on genotype and phenotype compatibility
ST
LIMS
FDM
MSL
TSL
PIM: Pedigree information updated
A&DS: Selection of lines based on QTL analysis / estimation of marker breeding values
Marker genotyping
PIM
ST LIMS MSL
A&DS
Pedigree information updated
Selection on index of marker values
ST FDM TSL
PIM
A&DS: Selection of improved lines based on trait improvement and adaptation
Pedigree information updated
GRSS
GRSS
MSL
TSL
Key
Information System
ST: Sample Tracking
PIM: Pedigree Information
LIMS: Laboratory Information
FDM: Field Data
A&DS: Analysis & Decision Support
Platform Services
GRSS: Genetic Resource Service
MSL: Marker Service
TSL: Trait Service
n cycles of selection and recombination
Interaction of breeding workflow and platform elements
The IBP Configurable Workflow System
Breeding Activities
Germplasm Management: Parental selection, Crossing, Population development
Project Planning: Open Project, Specify objectives, Identify team, Data resources, Define strategy
Germplasm Evaluation: Experimental Design, Fieldbook production, Data collection, Data loading
Molecular Analysis: Marker selection, Fingerprinting, Genotyping, Data loading
Data Analysis: Quality Assurance, Trait analysis, Genetic Analysis, QTL Analysis, Index Analysis
Breeding Decisions: Selected lines, Recombines, Recombination plans
Breeding Applications
Breeding Project Planning: MB design tool, Cross prediction and Strategic simulation
Breeding Management System: Breeding nursery and pedigree record management
Field Trial Management System: Trial field book and environment characterization system
Genotypic Data Management System: Lab book, quality assurance and diversity analysis
Analytical Pipeline: Statistical analysis applications and selection indices
Decision Support System: MABC, MAS, MARS, GWS
The Breeding Management System
Breeding Management System
ST
• Nursery Management
• Characterization lists
• Pedigree maintenance
• Evaluation lists
• Seed Inventory
Genotypic Data Management System
Field Trial Management System
Sample Tracking: ST
Characterization lists
Genotyping Data Management System
Genotypic Data Management System
Breeding Management System
• Planting list
• Sample list
LIMS
• Genotyping Data
• Quality Assurance
ST
Analytical Pipeline
Data Transformation
- Genotyping Database
- Application file formats
Tracking Genotyping Samples: ST
Genotyping order form: LIMS
Genotyping results: LIMS
Evaluation lists
Field Trial Management System
Field Trial Management System
Breeding Management System
• Fieldbook preparation
Data Collection
- Hand-held devices
- Automatic measurement
• Environmental characterization
• Quality Assurance
• Phenotyping data
Analytical Pipeline
Data Transformation
- Phenotyping Database
- Application file formats
Experimental design and randomization
CWS Configuration System
Trait templates
The Trial Template
Diversity scores, Pedigree trees, COP matrices, Phenotype means, Genotype BLUPs, Stability measures, Adaptation scores, Marker scores, Genetic distance, Genetic maps, QTL estimates
Analytical Pipeline
Analytical Pipeline
Genotypic Data Management System
• Genotyping QA
• Diversity analysis
• Genetic mapping
• Phenotyping QA
• Single site analysis
• Multi site analysis
• GxE Analysis
• QTL Analysis
• QTLxE Analysis
Field Trial Management System
Phenotyping data
Decision Support Tools
Genotyping data
Genotyping scores: LIMS
Diversity scores, Pedigree trees, COP matrices, Phenotype means, Genotype BLUPs, Stability measures, Adaptation scores, Marker scores, Genetic distance, Genetic maps, QTL estimates
Decision Support and Simulation
Decision Support Tools
• MBDT
• Breeding indices
• OptiMas
Analytical Pipeline
Simulation Tools
• QuLine
• QuHybrid
• QuMARS
• QuGene
Breeding Decisions
Germplasm lists for characterization, Foreground markers, Background markers, Target genotypes, Donor germplasm, Recipient germplasm, Ranked germplasm, Selection lists, Parental lists, Crossing schemes
Population sizes, Selection intensity, Marker densities, Crossing schemes, Selection schemes
Trait selection, GE targeting, Optimal breeding systems
Genetic models, GE systems, Breeding methods
ICIS COP matrix
Lower Triangular part of Coefficient of Parentage Matrix
ROWID  COLID  ROWNO  COLNO  COP     Optional Labels
50533  50533  1      1      0.9577  "IR 64"  "IR 64"
70125  50533  2      1      0.2231  "IR 72"  "IR 64"
70125  70125  2      2      0.9896  "IR 72"  "IR 72"
11105  50533  3      1      0.1872  "IR 36"  "IR 64"
11105  70125  3      2      0.5108  "IR 36"  "IR 72"
11105  11105  3      3      0.9478  "IR 36"  "IR 36"
Lower Triangular part of Inverse Coefficient of Parentage Matrix
ROWID  COLID  ROWNO  COLNO  INV-COP     Optional Labels
50533  50533  1      1       1.1113776  "IR 64"  "IR 64"
70125  50533  2      1      -0.1900738  "IR 72"  "IR 64"
70125  70125  2      2       1.4324875  "IR 72"  "IR 72"
11105  50533  3      1      -0.1170834  "IR 36"  "IR 64"
11105  70125  3      2      -0.7344297  "IR 36"  "IR 72"
11105  11105  3      3       1.4739708  "IR 36"  "IR 36"
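The lower-triangular listing can be expanded into a full symmetric matrix before use in analysis. The following is a minimal sketch, assuming whitespace-separated records in the column order shown on the slide (ROWID COLID ROWNO COLNO VALUE) with the optional labels left out; the function names are hypothetical and are not part of ICIS. It rebuilds both matrices and multiplies them as a consistency check.

```python
# Sketch only: record layout taken from the slide; helper names are hypothetical.

COP_RECORDS = """\
50533 50533 1 1 0.9577
70125 50533 2 1 0.2231
70125 70125 2 2 0.9896
11105 50533 3 1 0.1872
11105 70125 3 2 0.5108
11105 11105 3 3 0.9478
"""

INV_COP_RECORDS = """\
50533 50533 1 1 1.1113776
70125 50533 2 1 -0.1900738
70125 70125 2 2 1.4324875
11105 50533 3 1 -0.1170834
11105 70125 3 2 -0.7344297
11105 11105 3 3 1.4739708
"""


def full_matrix(text):
    """Expand lower-triangular records into a full symmetric matrix."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    n = max(int(r[2]) for r in rows)          # ROWNO gives the dimension
    matrix = [[0.0] * n for _ in range(n)]
    for _rowid, _colid, rowno, colno, value in rows:
        i, j = int(rowno) - 1, int(colno) - 1  # slide uses 1-based indices
        matrix[i][j] = matrix[j][i] = float(value)
    return matrix


def matmul(a, b):
    """Plain matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


cop = full_matrix(COP_RECORDS)
inv_cop = full_matrix(INV_COP_RECORDS)
product = matmul(cop, inv_cop)  # should be close to the identity matrix
```

Because the listed values are rounded, the product matches the identity only to about three decimal places.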
Flapjack QTL Information File
Compulsory Fields: QTL, Chromosome, Position, Minimum, Maximum, Trait, Experiment
Optional Fields: AddEffects, AddSE, Minlog10(P), %VarExplained, PosMinFM, PosMaxFM, LFM, RFM
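A data manager may want to check a QTL file for the compulsory fields before loading it. The following is a minimal sketch, assuming a tab-delimited file whose first line is a header using the field names exactly as spelled on the slide; the actual header spelling expected by Flapjack is not confirmed here, and `missing_fields` is a hypothetical helper.

```python
# Sketch only: field names copied from the slide; the header layout is an assumption.
COMPULSORY_FIELDS = [
    "QTL", "Chromosome", "Position", "Minimum", "Maximum", "Trait", "Experiment",
]


def missing_fields(header_line, sep="\t"):
    """Return the compulsory fields that are absent from a header line."""
    present = {name.strip() for name in header_line.split(sep)}
    return [field for field in COMPULSORY_FIELDS if field not in present]


# Hypothetical header that lacks the Experiment column.
example_header = "QTL\tChromosome\tPosition\tMinimum\tMaximum\tTrait\tAddEffects"
problems = missing_fields(example_header)
```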
Flapjack Map Data
The map file should contain information on the markers, the chromosome they are on, and their position within that chromosome. The markers do not need to be in any particular order as Flapjack will group and sort them by chromosome and distance once they are loaded.
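The grouping and sorting described here can be illustrated outside Flapjack. The following is a minimal sketch, assuming tab-delimited lines of marker name, chromosome and position; it is not Flapjack's own code, and the marker names are made up for the example.

```python
from collections import defaultdict


def group_markers(lines):
    """Group markers by chromosome and order them by map position."""
    by_chromosome = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        name, chromosome, position = line.split("\t")[:3]
        by_chromosome[chromosome].append((float(position), name))
    # Chromosome labels are sorted as text here, so "10" sorts before "2".
    return {
        chromosome: [name for _pos, name in sorted(markers)]
        for chromosome, markers in sorted(by_chromosome.items())
    }


# Hypothetical markers supplied in no particular order.
unordered = ["M3\t2\t10.5", "M1\t1\t30.0", "M2\t1\t5.2", "M4\t2\t1.0"]
ordered = group_markers(unordered)
```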
Breeding program designer
Blue/gray – strategy
Green – generation
Yellow – selection round
Pink/red – trait selection step
• To start, open ‘BreedingProgram.jar’
• Can create/drag/drop any new objects anywhere
• Use left mouse click to drag any piece and drop on higher hierarchy
• Use centre mouse click to zoom
• Edit in list/value boxes to set parameters
+ add new object at next level
X delete object
clone object
Scott Chapman
Available breeding simulation tools
QuLine, software that simulates breeding programs for developing inbred lines
QuHybrid, software that simulates breeding programs for developing hybrids
QuMARS, software that simulates marker-assisted recurrent selection and genome-wide selection
Jiankang Wang
What can QuLine do?
Comparison of genetic gains from different selection methods
• Change in population mean
• Change in gene frequency
• Change in Hamming distance (distance of a selected genotype to the target genotype)
Comparison of cross performance
• Selection history
• Rogers’ genetic distance
• Number of lines retained from each cross
Comparison of cost efficiency
• Number of families
• Individual plants per generation
Validation of theories
Jiankang Wang
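The Hamming distance mentioned for QuLine can be illustrated with a short example. The following is a minimal sketch, assuming genotypes are written as equal-length strings with one character per allele; how QuLine itself encodes genotypes and counts differences is not stated in the slides, so the encoding and the function name are illustrative only.

```python
def hamming_distance(genotype, target):
    """Count the positions at which a genotype differs from the target."""
    if len(genotype) != len(target):
        raise ValueError("genotypes must have the same number of alleles")
    return sum(1 for g, t in zip(genotype, target) if g != t)


# Hypothetical three-locus diploid genotypes: the selected line still
# carries two unfavourable alleles at the third locus.
distance = hamming_distance("AABBcc", "AABBCC")
```

A distance of zero means the selected genotype matches the target genotype at every position.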
Breeding Management System
• Nursery Management
• Characterization lists
• Pedigree maintenance
• Evaluation lists
• Seed Inventory
Genotypic Data Management System
• Planting list
• Sample list
LIMS
• Genotyping Data
• Quality Assurance
Field Trial Management System
• Fieldbook preparation
Data Collection
- Hand-held devices
- Automatic measurement
• Environmental characterization
• Quality Assurance
• Phenotyping data
Genotypic Data Management System
• Planting list
• Sample list
• Genotyping Data
• Quality Assurance
Analytical Pipeline
• Genotyping QA
• Diversity analysis
• Genetic mapping
• Phenotyping QA
• Single site analysis
• Multi site analysis
• GxE Analysis
• QTL Analysis
• QTLxE Analysis
Decision Support Tools
• MBDT
• Breeding indices
• OptiMas
Simulation Tools
• QuLine
• QuHybrid
• QuMARS
• QuGene
GMS DMS GDMS
Integrating the applications of the Configurable Workflow System