The Complex Inheritance of Maize Domestication …...2014/06/22  · The dissertation is approved by...



The Complex Inheritance of Maize Domestication Traits and

Gene Expression

By

Zachary H. Lemmon

A dissertation submitted in partial fulfillment of

the requirements for the degree of

Doctor of Philosophy

(Genetics)

at the

UNIVERSITY OF WISCONSIN – MADISON

2014

Date of final oral examination: 4/29/14

The dissertation is approved by the following members of the Final Oral Committee:

John F. Doebley, Professor, Genetics

David A. Baum, Professor, Botany and Genetics

Shawn M. Kaeppler, Professor, Agronomy

Patrick H. Masson, Professor, Genetics

Bret A. Payseur, Professor, Genetics


Acknowledgements

I want to extend my thanks to John Doebley for making this dissertation possible. John has been a constant voice of encouragement and insight throughout my graduate career. He has been instrumental in keeping me focused on the big question, while allowing me the freedom to chase down side interests and projects. John has taught me the importance of focusing my scientific inquiry on the core of a research question, which has shaped the way I approach research. While I have carried out the experiments described in this work, the first steps taken in these projects belong to John and I am grateful for the chance I was given to shepherd them to completion. Every day and conversation I have had with John as my advisor has made me into a better scientist and I am extremely thankful for the opportunity I was given six years ago when I joined the Doebley lab.

I have been fortunate enough to also work in an outstanding lab full of supportive individuals on both a personal and professional level. The work performed by a number of my fellow lab members was crucial to the completion of these experiments. Without their help the many DNA and RNA extractions, PCR reactions, measured phenotypes, and plants grown would simply not have happened. Fellow graduate students, postdocs, lab technicians, and undergraduate workers have all assisted in their own way. I am also thankful that in addition to being wonderful coworkers in a professional sense, lab members have contributed to making the lab a fun, exciting, and enjoyable place to spend my Ph.D. career. I will never forget the power of “Tak”, being “skinny up top”, or the “lab master”. To Tony, Laura, CJ, Ali, Bao, Tina, Lisa, Eric III, Jesse, Elizabeth, David, Claudia, Wei, and the numerous undergrads, thank you for making this wonderful experience possible.

In addition to my friends and colleagues at Wisconsin, I have been fortunate enough to be involved in a larger community of maize researchers at Cornell University, University of Missouri, North Carolina State University, and University of California - Davis. Working with these scientists has exposed me to a variety of questions and topics in maize research regarding phenotype, quantitative genetics, and large-scale data collection and analysis, resulting in a greatly expanded experience. In particular, collaborations with Qi Sun and Robert Bukowski at Cornell have greatly contributed to the analysis in the third chapter of this thesis. Also, dialog with Jeff Ross-Ibarra and Matt Hufford at UC-Davis has continuously provided me with insight into the population genetics of maize domestication and given me a valuable resource to draw on.

My Ph.D. committee has been an excellent resource during my graduate career. Bret Payseur and Shawn Kaeppler in particular have provided valuable insight into scientific questions and suggested analyses that have become part of this dissertation. David Baum has always made time in his busy schedule to meet with me and keep up to date with my progress. Finally, Patrick Masson has been a constant source of encouragement and has assisted me in several capacities both within and outside of the Ph.D. committee.

I am also eternally grateful to my family, who have stood by my side throughout this process. My parents, Karen and Holden, for giving me the tools and opportunity to pursue my goals. My sisters, Addie and Kelsey, for always being there and my wonderful nieces, Laney and Havi, for always making me smile. My amazing friend, Alex, who has been a constant source of support in my life and is one of the family now. Finally, my wife Megan, you have kept me grounded throughout these six years in Madison in both the good and bad times. You are my rock and this would not have been possible without you.


Contents

Acknowledgments

Table of Contents

List of Figures

List of Tables

Abstract

Preface


1 Genetic dissection of a genomic region with pleiotropic effects on domestication traits in maize reveals multiple linked QTL

1.1 Abstract

1.2 Introduction

1.3 Materials and Methods

1.3.1 Plant Material, Genotypes, and Phenotypes

1.3.2 Mixed Models and Heritability

1.3.3 QTL Mapping

1.3.4 Simulation Experiment

1.4 Results

1.4.1 QTL mapping

1.4.2 Simulation Experiment

1.5 Discussion

2 Fine mapping of chromosome five domestication genes in maize

2.1 Abstract

2.2 Introduction

2.3 Materials and Methods

2.3.1 Plant material

2.3.2 Field Trials and Phenotypes

2.3.3 Genotyping with PCR and next generation sequencing

2.3.4 Statistical analysis and segregation of phenotypes

2.4 Results

2.4.1 RCNIL generation and phenotype least squared means

2.4.2 PCR and GBS genotyping

2.4.3 QTL fail to segregate as Mendelian traits

2.4.4 Multiple factors contribute to culm diameter and kernel row number


2.5 Discussion

2.5.1 The complex genetic architecture of culm and kernel row number

2.5.2 Future work on chromosome five QTL

3 The role of cis regulatory evolution in maize domestication

3.1 Abstract

3.2 Introduction

3.3 Materials and Methods

3.3.1 Plant material, RNA preparation, and sequencing

3.3.2 Bioinformatics

3.3.3 Maize:teosinte gene expression ratios

3.3.4 Testing for cis and trans effects

3.3.5 Candidate genes

3.3.6 Proportion of cis variation in maize and teosinte

3.3.7 Additive and dominant gene expression

3.3.8 CCT gene enrichment in various functional categories

3.4 Results

3.4.1 RNAseq provides expression data for more than 17,000 genes per tissue

3.4.2 Prolific regulatory variation characterized by relatively few consistent cis differences

3.4.3 Possible directional bias in cis evolution

3.4.4 Gene expression variation is greater in teosinte

3.4.5 Selection candidate genes are enriched for CCT genes

3.4.6 Microarray and RNAseq data partially correspond

3.4.7 CCT genes are unrelated to differentially methylated regions

3.4.8 Dominant and additive gene expression inheritance


3.4.9 Candidate genes enriched in various functional categories

3.5 Discussion

3.5.1 Regulatory change between and within maize and teosinte

3.5.2 What is the frequency of cis and trans regulatory change?

3.5.3 Tissue specific expression of CCT candidates

3.5.4 Bias toward increased maize expression?

3.5.5 Selection-candidates enriched for cis regulatory change

3.5.6 Leaf tissue candidates are enriched for photosynthesis and chloroplast GO terms

3.5.7 Do crop domestication genes show cis differences?

3.5.8 A catalog of genes with cis regulatory variation


Appendices

A Supplemental Content: Genetic dissection of a genomic region with pleiotropic effects on domestication traits in maize reveals multiple linked QTL

A.1 Figures

A.2 Tables

B Supplemental Content: Fine mapping of chromosome five domestication genes in maize

B.1 Tables

C Supplemental Content: The role of cis regulatory evolution in maize domestication

C.1 Figures

C.2 Tables

D Characterization of domestication traits for selection candidate gene Zea agamous2

D.1 Forward

D.2 Introduction

D.3 Methods

D.3.1 RCNILs

D.3.2 Transgenic RNAi lines

D.4 Results

D.4.1 RCNILs

D.4.2 Transgenic RNAi lines

D.5 Discussion


References


List of Figures

1.1 Cumulative plot of QTL detected in the mapping experiment.

1.2 The number of detected QTL and mean detected QTL effect size versus number of simulated causative loci.

1.3 The proportion of detected QTL with zero, one, or more than one simulated causative genes in the 1.5 LOD support interval.

2.1 Histograms of least squared means for the culm diameter and kernel row number phenotypes.

2.2 GBS genotypes for kernel row number RCNILs.

2.3 RCNILs sorted by phenotype from least to greatest.

2.4 Density plots of the culm diameter and kernel row number phenotypes grouped by founding HIF.

2.5 QTL LOD profiles for fine mapping of culm diameter and kernel row number traits.

3.1 Overlap of genes assessed in the three tissues overall and in the CCT-AB gene list.

3.2 Parent versus hybrid ear tissue allele specific expression ratios.

3.3 Proportion of expression divergence due to cis regulatory difference.


3.4 Cis versus estimated trans regulatory effect for CCT-ABC genes in the ear, leaf, and stem.

3.5 The proportion of average maize to teosinte R² from linear models explaining F1 hybrid expression by maize and teosinte parent.

3.6 Density plots of ln(XPCLR) score of conserved versus CCT-AB candidate genes.

3.7 Proportion of cis only and trans only genes identified as having dominant or additive inheritance.

A.1 Histograms of the least squared means for phenotyped traits from the QTL mapping population.

A.2 Example histograms of simulated traits for several different conditions in terms of number of causative loci, effect size, and heritability.

A.3 Proportion of detected QTL with zero, one, or multiple causative genes in the 1.5 LOD support interval.

C.1 Parent versus hybrid leaf tissue allele specific expression ratios.

C.2 Parent versus hybrid stem tissue allele specific expression ratios.

C.3 Dominance by additivity ratio grouped by regulatory category.

D.1 Single kernel weight estimates for zag2 RCNILs.


List of Tables

1.1 NIRIL phenotyped traits, descriptions, approximate distribution, between year Pearson correlation coefficients, and Pearson p-values.

1.2 Final models selected for the thirteen NIRIL phenotypes.

1.3 Detected QTL for the T5S mapping population with position, heritability, and LOD score statistics.

2.1 Final linear mixed models used to produce least squared means for fine mapping RCNILs.

2.2 Detected QTL and HIF effects including LOD, percent variation explained, and additive effect.

3.1 Regulatory category as defined by significant (Sig.) or not significant (Not Sig.) binomial tests (BT) and Fisher’s Exact Tests (FET).

3.2 Assignable RNAseq Read Counts from F1 hybrids and parents.

3.3 Genes for which RNAseq data was collected and expression was assayed.

3.4 Fisher’s Exact Tests for overlap of selection and CCT candidates.

3.5 Fisher’s Exact Tests for enrichment/depletion of cis and trans only genes in selection features.

3.6 Fisher’s Exact Tests for overlap between microarray and CCT differentially expressed genes.


3.7 Regulatory category of the closest maize homolog of 6 maize and 22 non-maize domestication loci.

A.1 RFLP Markers used during backcrossing of QTL mapping population.

A.2 Genetic markers used to score BC6S6 mapping population.

B.1 PCR markers used for genotyping RCNILs including gene or SNP target, AGPv2 position, and primer sequence.

C.1 Biological replicates for RNAseq experiment.

C.2 Adapter name, barcode sequence, and barcode length for Illumina adapters used in RNAseq libraries.

C.3 Number of genomic paired end reads and coverage obtained for constructing pseudo-transcriptomes.

C.4 Proportion of divergence due to cis regulatory effect grouped by overall parental divergence.

C.5 The number of genes for which the maize or teosinte allele is expressed at a higher level.

C.6 Bias for the maize allele grouped by inbred line for the three tissues in the CCT-ABC gene list.

C.7 Allele specific expression variation among F1 hybrids explained by maize and teosinte parent.

C.8 Number of genes with significant cis expression variation explained by maize and/or teosinte.

C.9 Comparison of observed and expected numbers of genes classified as differentially expressed (DE) or not differentially expressed (NDE) by RNAseq and MicroArray assays in groups A, B, and C in the three tissue types.


C.10 Regulatory categories for genes identified as differentially expressed between maize and teosinte by microarray assays.

C.11 Fisher’s Exact Tests for the overlap between genes associated with differentially methylated regions (DMRs) and CCT-ABC genes from each of the three experimental tissues in our work.

C.12 Number of candidate genes neighboring differentially methylated regions (DMRs) between maize and teosinte and proportion in which expression data agrees with methylated status.

C.13 Dominance/additivity ratios for genome-wide gene expression.

C.14 Contingency tables for additive and dominant gene counts for A, AB, and ABC candidate lists.

C.15 Degree of overlap between our CCT (AB list) genes and genes in different transcription factor families.

C.16 Degree of overlap between CCT (AB list) differentially expressed genes and genes in the 1.5 support intervals for QTL from a previous study.

C.17 Degree overlap between our CCT (AB list) differentially expressed genes and genes in metabolic pathways defined in KEGG.

C.18 Significantly enriched and depleted GO terms from CCT and trans only gene lists.

D.1 Trait abbreviations and descriptions from the zag2 experiment.

D.2 Zag2 transgenic RNAi insertion event, background, phenotype, and t-test p-value.


Abstract

The genetic basis for morphological change in divergent species is a central question in evolutionary biology. The domestication of maize from its wild progenitor, teosinte, is an excellent system to address this question. We explore the large effect on domestication phenotypes of a poorly understood region of the maize genome using a chromosome five specific mapping population. Unlike other large effect regions of the maize genome, many traits have multiple QTL that do not stack on a single locus, suggesting that multiple genes on the fifth chromosome influence domestication traits. Simulation studies show clear evidence of limited power to detect QTL for highly polygenic traits, so the detected QTL do not accurately portray the true complexity of the underlying genetic architecture. Two QTL in different locations were chosen for fine mapping studies to identify the underlying causative genes. While a single gene was not identified for either QTL, both were successfully narrowed to intervals of less than three centimorgans with relatively few genes and evidence of positive selection during maize domestication. Finally, the first genome-wide effort to characterize cis and trans regulatory change between a domesticated crop and its wild progenitor found extensive regulatory variation with relatively few genes having consistent cis differences, which were determined to be under positive selection during the domestication and crop improvement of maize. Consistent with loss of diversity during the domestication bottleneck, cis expression variation explained by the maize parent is reduced in comparison to teosinte, with an even greater reduction seen in cis candidate genes. A general increase in the expression of maize alleles was also observed, suggesting domestication in maize may have led to a general increase in gene expression. Collectively, these experiments shed light on the evolution of divergent phenotypes and gene regulation in domesticated maize and its wild progenitor.


Preface

The nature of functional changes to the genes responsible for phenotypic divergence in related species is a topic of ongoing research in evolutionary biology. Many types of genomic features have been shown to influence the development of novel phenotypes. Studies in closely related species have identified gene duplications [1], various types of expression modification [2–4], and gene coding changes [5, 6] that give rise to altered phenotypes. A major contributor to evolutionary biology research is the study of domesticated crops and their wild ancestors, where the intense artificial selection for agronomic traits during the domestication process serves as a proxy for natural selection mechanisms. Experiments characterizing the functional changes responsible for novel phenotypes in the domesticated systems of rice, tomato, wheat, and sorghum have been met with great success [7].

One of the most successfully used domestication crop models is maize, where scientists have extensively investigated the morphological differences between maize (Zea mays ssp. mays) and its wild progenitor (Zea mays ssp. parviglumis). Maize is an excellent system to pursue evolutionary questions for a number of reasons. Maize was domesticated approximately 9,000 years ago in the Balsas River valley of Mexico [8]. Like other domesticated systems, maize-teosinte F1 hybrids are fertile, which allows the use of powerful genetic techniques to dissect the genetics of complex traits. The maize reference genome also greatly facilitates research by empowering the use of sequence based analyses and comparative genomics [9]. A common collection of phenotypic differences seen between domesticated crops and their wild progenitors is also observed when comparing maize and teosinte. This “domestication syndrome” [10, 11] consists of phenotypes that improve the suitability of a crop for human use such as loss of shattering (natural seed dispersal), increased apical dominance, loss of prolificacy (concentration of seed into one unit), and gigantism of vegetative and reproductive tissues.


One method commonly used to examine genetic factors controlling morphological variation in maize is quantitative trait locus (QTL) mapping. Studies examining the domestication of maize have shown QTL representing the profound morphological differences between maize and its wild progenitor teosinte can be primarily attributed to six regions of large effect on the first five chromosomes of maize [8]. Three of these genomic features have been further characterized, identifying single genes of large, pleiotropic effect. The functional causative polymorphisms of these genes include new tissue specific expression patterns [4], elevated expression [3], and coding sequence change [5]. In contrast to these well characterized loci, other regions of the genome with large effect on domestication phenotypes are poorly understood.

A prominent theory in evolutionary biology suggests the primary mechanism by which adaptive evolution occurs is through modification of cis regulatory elements [12, 13]. Consistent with this theory, altered cis regulatory elements in domesticated crops account for a large proportion of identified domestication genes [7]. A striking characteristic of these genes is the variety of functional changes that result from cis regulatory change with examples including elevated and decreased expression [3, 14], development of novel tissue specific expression patterns [4, 15], and heterochronic shifts in expression [16]. The demonstrated importance of gene regulatory change in the evolution of new forms has led to several studies investigating genome-wide gene expression in domesticated crops [17–19]. While measuring gene expression differences between a modern crop and its wild relative is an important step in exploring regulatory variation in an evolutionary context, it falls short of the global analyses in yeast and fruit fly [20, 21] that specifically dissect cis and trans regulatory variation.
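The logic behind dissecting cis and trans regulatory variation can be illustrated with a minimal sketch. In an F1 hybrid both parental alleles share the same trans environment, so an allelic imbalance in the hybrid points to a cis difference, whereas a shift in the allelic ratio between the parents and the hybrid points to a trans difference. The sketch below is a simplified illustration under stated assumptions, not the hierarchical procedure used in this dissertation: the function names, the 0.05 threshold, and the read counts are hypothetical, and the binomial and Fisher's exact tests are written with the Python standard library only.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum of all outcomes no more likely than k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-7)))


def fisher_exact_two_sided_p(a, b, c, d):
    """Exact two-sided Fisher test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = prob(a)
    tail = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, tail)


def classify(parent_maize, parent_teosinte, hybrid_maize, hybrid_teosinte, alpha=0.05):
    """Assign a simplified regulatory category from allele-specific read counts.

    Assumption: parental counts are already normalized to be comparable.
    """
    # cis: the two alleles are expressed unequally within the hybrid.
    cis = binomial_two_sided_p(hybrid_maize, hybrid_maize + hybrid_teosinte) < alpha
    # trans: the maize:teosinte ratio differs between parents and hybrid.
    trans = fisher_exact_two_sided_p(
        parent_maize, parent_teosinte, hybrid_maize, hybrid_teosinte
    ) < alpha
    if cis and trans:
        return "cis + trans"
    if cis:
        return "cis only"
    if trans:
        return "trans only"
    return "conserved"


# Hypothetical read counts for one gene: parents differ 3:1 and so does the hybrid.
print(classify(150, 50, 150, 50))
```

A genome-wide analysis would apply such tests to every gene and correct for multiple testing, which this sketch omits.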

The work presented in this dissertation seeks to explore two facets of diversification between maize and teosinte. First, quantitative genetic methods are used to specifically assess the architecture of domestication QTL and causative genes on the fifth chromosome of maize, providing insight into the genetic factors underlying this previously uncharacterized region of large phenotypic effect in the maize genome. Second, regulatory variation due to cis and trans regulatory change is investigated on a genome-wide scale using deep RNA sequencing. This work is presented in three chapters.

1. The first chapter describes a chromosome five specific QTL mapping experiment. A large BC6S6 population was developed while fixing other regions known to impact domestication traits for a homozygous genotype. Thirteen phenotypes representing differences between the progenitor and maize were measured in two summers and QTL mapping was performed. We detected an average of approximately two QTL per trait with QTL mapping to multiple regions. This suggested that unlike other genomic regions of importance in maize domestication, the fifth chromosome houses a complex of linked loci that all contribute to the phenotypic effect. Additional efforts were made to examine the power and precision of our mapping population with simulated trait datasets. Heritability of a trait was found to have the primary influence on the maximum number of detectable QTL and we observed the Beavis Effect on estimated QTL effect size. This work provides a focused examination of a previously poorly understood region of the maize genome with large phenotypic effects on domestication traits.

2. The second chapter focuses on fine mapping efforts for two QTL for culm diame-

ter and kernel row number on the fifth chromosome identified in chapter one. Our

strategy used a population of plants with homozygous recombinant chromosomes

in replicated field trials. Neither QTL was successfully mapped to a single gene;

however, the culm diameter QTL was greatly reduced in size (to ∼2.5% of the original

1.5 LOD support interval). The kernel row number QTL was analyzed with whole

genome genotyping data and a complex set of genetic factors influencing the trait

was identified. The main kernel row number QTL in terms of LOD score on chromosome

five shifted to a different region outside of the original support interval. The

culm diameter and kernel row number QTL contained 40 and 63 genes, respectively,

which were examined for attractive candidate genes. Neither QTL had a clear best

candidate, but several genes showed evidence for cis regulatory change and multiple

genes had evidence of positive selection during the domestication of maize. While

this work was unsuccessful in identifying a single causative gene, we greatly reduced

the size of the culm diameter QTL and found evidence for complex inheritance of the

kernel row number phenotype.

3. Finally, the extent of genome-wide gene regulatory change is examined using next

generation sequencing methods. Three tissues from a collection of maize-teosinte

F1 hybrids and their inbred parents were harvested and next generation Illumina

sequencing was performed to assess differential expression of alleles. Using a hier-

archical series of statistical tests, we differentiate between significant cis and trans

regulatory effects for approximately 17,000 genes in each of the three tissues studied.

We produce a list of filtered candidate genes (∼500 genes per tissue) with significant

and consistent cis effects. These genes are significantly associated with selection

features from a recent genome-wide scan for selection in maize, suggesting genes

with cis regulatory changes are frequently the target of positive selection. Addi-

tionally, the proportion of effect due to cis was observed to be positively correlated

with overall divergence. Several other characteristics of the candidate cis genes were

also analyzed including gene ontology and other functional annotations. This study

represents the first genome-wide effort in a domesticated crop and wild progenitor

to assess allele specific expression dissecting cis and trans effects using F1 hybrids.

Chapter 1

Genetic dissection of a genomic region with pleiotropic effects on domestication traits in maize reveals multiple linked QTL

1.1 Abstract

The domesticated crop maize and its wild progenitor, teosinte, have been used in numerous

experiments to investigate the nature of divergent morphologies. This study examines a

poorly understood region on the fifth chromosome of maize associated with a number of

traits under selection during domestication using a QTL mapping population specific to

the fifth chromosome. In contrast with other major domestication loci in maize where

large effect, highly pleiotropic, single genes are responsible for phenotypic effects, our

study found the region on chromosome five fractionates into multiple QTL, none with

singularly large effects. The smallest 1.5 LOD support interval for a QTL contained

54 genes, one of which was a MADS MIKC^C transcription factor, a member of a family of proteins

implicated in many developmental programs. We also used simulated trait datasets to

investigate the power of our mapping population to identify QTL for which there is a

single underlying causal gene. This analysis showed that while QTL for traits controlled

by single genes can be accurately mapped, our population design can detect no more than

∼4.5 QTL per trait even when there are 100 causal genes. Thus when a trait is controlled

by 5 or more genes in the simulated data, the number of detected QTL can represent a

simplification of the underlying causative factors. Our results show how a QTL region

with effects on several traits may be due to multiple linked QTL of small effect as opposed

to a single gene with large and pleiotropic effects.

1.2 Introduction

In evolutionary biology, quantitative trait locus (QTL) mapping has been used with great

success to define the genetic architecture controlling morphological differences between

species. These QTL mapping experiments have identified a number of QTL with large

effects in animal [22–24] and plant systems [25–28]. Often these experiments identify QTL

clusters in a relatively small number of genomic regions, suggesting an underlying genetic

architecture of single pleiotropic genes or several closely linked genes [8, 24, 29–31]. The

phenotypic effects of QTL have been successfully mapped to single large effect pleiotropic

genes in many species [3, 5, 15, 16, 32–34]. However, these large effect genes often only

explain a portion of the divergence between species, leaving a considerable amount of

phenotypic differences unexplained. Characterization of QTL clusters not associated with

single genes will lead to a more comprehensive understanding of the genetic architecture

that contributes to divergent phenotypes.

Domesticated crop plants and maize in particular provide a well-suited system in which

to study the evolution of new morphologies for a number of reasons. First, maize (Zea

mays ssp. mays) and its wild progenitor teosinte (Z. mays ssp. parviglumis) differ for a

suite of traits commonly seen in domesticated crop pairs. Collectively, these differences

are known as the domestication syndrome and include reduced lateral branching, loss

of natural seed dispersal, and gigantism of vegetative and reproductive tissues [10, 11].

Second, intense artificial selection upon domesticated crops, including maize, for desirable

agronomic traits leaves a signature of selection (reduced nucleotide diversity) allowing for

identification of putative targets of artificial selection in selective sweeps [35]. Third, like

most domestication events, maize domestication took place in the last 10,000 years and

surviving wild progenitor populations serve as reasonable surrogates for the ancestor [36].

In addition, maize and teosinte are inter-fertile, allowing for the use of genetic techniques

and crosses to dissect the genetic architecture underlying divergent traits [37, 38]. Finally,

researchers studying maize have the advantage of a powerful tool in the reference maize

genome sequence providing the ability to anchor genetic markers to physical positions,

annotation of candidate genes, and characterization of important genomic features such

as centromeres [9]. The combination of these characteristics and available tools make

maize an effective model system in which to study the evolution of new forms.

Previous work in maize and its wild progenitor suggests the genes responsible for

phenotypic change are scattered throughout the genome but with several concentrations

of genes (QTL) controlling large portions of the phenotypic differences [8, 25]. To date,

three large effect pleiotropic genes have been mapped to these genomic regions of large

phenotypic importance. The short arm of chromosome one is home to grassy tillers1 (gt1),

which influences tillering [39] and is largely responsible for the concentration of seed into

a single large ear [4]. The gene teosinte branched1 (tb1) is found on the long arm of

chromosome one and has a large pleiotropic impact on plant and inflorescence branching

[3, 40]. Finally, the gene teosinte glume architecture1 (tga1) liberates the kernel from

its stony fruit case in teosinte [5]. In comparison to these extensively studied genes,

little is known about the genetic factors on other chromosomes responsible for phenotypic

divergence during maize domestication.

While early studies identified tb1 as the gene responsible for much of the phenotypic

effect on the long arm of chromosome one [41], a more recent study has identified at least

two additional loci upstream of tb1 with significant effects on phenotype [42]. These loci

influence the expression of tb1-like phenotypes in both additive and epistatic ways. The

nearest of these loci was only 5 centimorgans (cM) away from tb1 itself and also had an

effect specific to ear traits, leaving plant architecture traits such as tillering unaffected.

This suggests secondary factors to major effect genes are potentially quite closely linked

and could also mediate tissue specific effects. Similarly, the work identifying gt1 also found

evidence of a secondary factor located downstream of the identified causative region that

slightly increases prolificacy (the number of ears) in plants carrying the teosinte allele [4].

One of the six genomic regions of large pleiotropic effect identified in maize is on

chromosome five where the genetic architecture underlying the large phenotypic effects

is largely unknown [8]. Previous work has found a number of domestication QTL on

chromosome five for culm diameter, kernel row number, ear diameter, disarticulation,

and pedicellate spikelet length [8, 37, 38]. A more recent experiment also found QTL

for a number of these traits on chromosome five, some of which (kernel row number, ear

diameter, and disarticulation) had particularly large effect and LOD score [25]. While

these previous mapping experiments found significant QTL for domestication traits on

chromosome five, they could not determine whether this region contained a major QTL

with pleiotropic effects on several traits or multiple linked QTL.

In this paper, we undertook a QTL mapping study to better characterize the effect of

chromosome five on domestication traits. This experiment utilized a population of nearly

isogenic recombinant inbred lines (NIRILs) that allowed for concentration of informative

crossover events in the region of interest (chromosome five) and replicated block experi-

ments to improve trait measurements. Both of these characteristics increase the mapping

power specifically on chromosome five in comparison with a standard F2 mapping pop-

ulation, improving the ability to differentiate between closely linked, moderate to small

effect, and interacting QTL. Our QTL mapping detected QTL at multiple locations on

the fifth chromosome, none of which have singularly large effect. This suggests that un-

like other regions of the maize genome with single large effect genes [3–5], chromosome

five houses several linked factors influencing phenotype. We also performed a simulation

study to gauge the power and precision of our mapping population. This analysis indi-

cates that for some traits the genetic architecture could be more complex than observed

with empirical data.

1.3 Materials and Methods

1.3.1 Plant Material, Genotypes, and Phenotypes

We conducted a QTL mapping experiment to investigate the genetic architecture of do-

mestication traits on maize chromosome five using a collection of nearly isogenic recombi-

nant inbred lines (NIRILs) in the summers of 2009 and 2010. The experimental population

was built by introgressing the majority of the short arm of chromosome five and part of

the long arm from a teosinte (Iltis and Cochrane collection 81) into the maize inbred

W22 by six generations of backcrossing. RFLP markers (Supplemental Table A.1) were

used during this process to follow the desired genomic segment and eliminate teosinte

segments at other known domestication QTL identified in a previous study [43]. The

extensive backcrossing in tandem with tracking and eliminating teosinte segments from

specific regions of the genome allowed the experiment to be focused on the segregating

teosinte introgression on chromosome five. Five BC6 individuals heterozygous for the tar-

get segment on chromosome five were selfed to produce five BC6S1 families. The families

were then selfed for five additional generations to give an experimental BC6S6 population

of 259 highly homozygous NIRILs, which carried a collection of teosinte fifth chromosome

introgressions in an isogenic W22 background.

Genomic DNA was extracted with a standard CTAB protocol from tissue collected

from an average of 15 individuals from each NIRIL in the summer of 2009. A collec-

tion of 25 insertion/deletion and microsatellite markers (Supplemental Table A.2) were

genotyped across the fifth chromosome introgression using standard PCR and gel elec-

trophoresis methods. In total, there were 443 observed recombination breakpoints among

the NIRILs or approximately 1.7 events per line. The range of recombination breakpoints

went from zero to six with the majority of lines (51.7%) having either zero or a single

recombination event. The number of lines with each number of breakpoints are as fol-

lows: 56 (0 breakpoints), 78 (1 breakpoint), 49 (2 breakpoints), 48 (3 breakpoints), 19 (4

breakpoints), 7 (5 breakpoints), and 2 (6 breakpoints).
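
The breakpoint tallies above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the function and variable names are illustrative and are not part of the original analysis.

```python
# Number of NIRILs observed with each number of recombination breakpoints,
# copied from the counts reported in the text.
lines_by_breakpoints = {0: 56, 1: 78, 2: 49, 3: 48, 4: 19, 5: 7, 6: 2}


def summarize_breakpoints(counts):
    """Return (lines, total breakpoints, mean per line, share with <= 1 breakpoint)."""
    n_lines = sum(counts.values())
    total = sum(k * n for k, n in counts.items())
    share_le_one = (counts.get(0, 0) + counts.get(1, 0)) / n_lines
    return n_lines, total, total / n_lines, share_le_one


n_lines, total, mean_per_line, share_le_one = summarize_breakpoints(lines_by_breakpoints)
print(n_lines, total, round(mean_per_line, 2), round(100 * share_le_one, 1))
# prints: 259 443 1.71 51.7
```

The totals agree with the 259 lines, 443 breakpoints, roughly 1.7 events per line, and 51.7% of lines with at most one breakpoint stated above.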

Phenotype data was collected for the experimental NIRILs in three replicated blocks,

two in the summer of 2009 and one in 2010, grown at the West Madison Agricultural

Research Station in Madison, Wisconsin. Blocks consisted of the 259 NIRILs planted

in randomized plots of ten or twelve plants each in 2009 and 2010, respectively. Five

plants from each plot were assessed for thirteen phenotypes (Table 1.1) representing a

number of plant and inflorescence phenotypic differences between teosinte and maize.

Plant traits included plant height, days to pollen shed, the amount of tillering, length of

the primary lateral branch, prolificacy, and culm diameter. Inflorescence traits measured

in the female inflorescence (ear) were kernels per rank, kernel row number, ear diameter,

ear length, and percent staminate spikelets. Several traits from the male inflorescence or

tassel were also measured and include the pedicellate spikelet length and tassel branch

number. Genotype and phenotype data are available from the Dryad Digital Repository:

http://dx.doi.org/10.5061/dryad.7sq67.

1.3.2 Mixed Models and Heritability

We estimated the NIRIL phenotype for all traits by fitting a linear mixed model. Fixed

effects consisted of NIRIL, NIRIL family, and position within block, while block and year

were used as random effects. A model (Equation 1.1) was fit with the MIXED procedure

in SAS [44] as an initial scope. In this model, $Y_{ijklmno}$ is the individual trait value,

$\mu$ the overall mean, $f_j$ the family effect, $a_i(f_j)$ the line nested in family, $b_k$ the random

block effect, $c_l(b_k)$ and $d_m(b_k)$ the horizontal and vertical positions in the field nested in

block, respectively, $t_n$ the year, $e_{ijklmn}$ the experimental error (between

plots), and $g_{ijklmno}$ the within-plot sampling error. Each model term was tested for

significance on a trait-by-trait basis with t-tests for fixed effects and likelihood ratio tests

Table 1.1: NIRIL phenotyped traits, descriptions, approximate distribution, between-year Pearson correlation coefficients, and Pearson p-values.

Trait   Description                           Distribution   Pearson r   p-value
CULM    Diameter of culm                      normal         0.688       <0.0001
DTP     Days to pollen shed                   normal         0.668       <0.0001
EARD    Ear diameter                          bimodal        0.907       <0.0001
EARL    Ear length                            normal         0.409       <0.0001
KPR     Kernels per rank                      bimodal        0.698       <0.0001
KRN     Kernel row number                     bimodal        0.718       <0.0001
LBLH    Primary lateral branch length         normal         0.519       <0.0001
PLHT    Plant height                          normal         0.652       <0.0001
PROL    Prolificacy, ears on lateral branch   exponential    0.422       <0.0001
SPLH    Spikelet length                       normal         N/A         N/A
STAM    Percent staminate spikelets           exponential    0.321       <0.0001
TBN     Tassel branch number                  normal         0.691       <0.0001
TILL    Tillering index                       exponential    0.346       <0.0001

Table 1.2: Final models selected for the thirteen NIRIL phenotypes.

Trait   Model
CULM    line(family) + family + x(plot) + y(plot)
DTP     line(family) + family + x(plot) + y(plot) + x*y(plot)
EARD    line(family) + family + x(plot) + y(plot) + x*y(plot)
EARL    line(family) + family + x(plot) + y(plot) + x*y(plot)
KPR     line(family) + family + x(plot) + y(plot) + x*y(plot)
KRN     line(family) + family + x(plot)
LBLH    line(family) + family + x(plot) + y(plot) + x*y(plot)
PLHT    line(family) + family + x(plot) + y(plot)
PROL    line(family) + family + x(plot)
SPLH    line(family) + family + x
STAM    line(family) + family + x(plot) + y(plot) + x*y(plot)
TBN     line(family) + family + x(plot) + y(plot) + x*y(plot)
TILL    line(family) + family + y(plot)

with one degree of freedom for random effects. Likelihood ratio and t-tests with p-values

greater than 0.05 were deemed not significant and the corresponding terms were removed

from the model. While the initial scope of the model included a random block and year

effect, none of the random effects were found to be significant. Following definition of

appropriate models for the studied traits (Table 1.2), least squared means for each trait

were calculated and used for QTL mapping.

$$Y_{ijklmno} = \mu + a_i(f_j) + f_j + b_k + c_l(b_k) + d_m(b_k) + c_l(b_k) * d_m(b_k) + t_n + e_{ijklmn} + g_{ijklmno} \qquad (1.1)$$

Broad-sense heritabilities on a plot means basis ($H^2$) were calculated for each of the

traits. The variance components needed for this calculation were found using a linear

mixed model with plot means as the dependent variable and plot and line as random

independent variables. Variance components for the line or genotypic component ($\sigma^2_g$),

the plot ($\sigma^2_p$), and the residual variance due to environment ($\sigma^2_e$) were extracted and

Equation 1.2 was used to calculate $H^2$. The plot variance ($\sigma^2_p$) was calculated in the

model as a known source of variation in phenotype. Since this plot variance is known, it

does not contribute to unaccounted for environmental variation as seen by the residual

variance ($\sigma^2_e$) and was not used to calculate heritability.

$$H^2 = \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_e} \qquad (1.2)$$
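
As a minimal sketch of Equation 1.2, the function below computes broad-sense heritability from the line and residual variance components only, leaving the plot variance out as described above. The function name and the example variance values are hypothetical and serve only to illustrate the calculation.

```python
def broad_sense_heritability(var_line, var_resid):
    """Equation 1.2: H^2 = var_g / (var_g + var_e); plot variance is excluded."""
    if var_line < 0 or var_resid < 0:
        raise ValueError("variance components must be non-negative")
    total = var_line + var_resid
    if total == 0:
        raise ValueError("at least one variance component must be positive")
    return var_line / total


# Hypothetical variance components, chosen for illustration only.
h2 = broad_sense_heritability(var_line=2.0, var_resid=1.0)
print(round(h2, 3))  # prints: 0.667
```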

1.3.3 QTL Mapping

We mapped QTL using a model based approach in R/qtl [45, 46] with phenotype, repre-

sented by least squared means, and 25 genetic markers for the NIRILs. The introgression

on the fifth chromosome started as a heterozygous segment in the BC6 generation and

segregates as an S6 population. Consequently, we analyzed the population as a BC0S6

in R/qtl. Genotypes were first used to produce a genetic map for the teosinte segment

introgression using the Kosambi mapping function [47], with a 0.0001 genotyping error

rate as implemented in R/qtl. Genetic marker order was initially found by BLAST to

the AGPv2 genome and confirmed using the ripple function in R/qtl with a five marker

window. Significant LOD score thresholds were determined for each trait with a 5% cutoff

based on 10,000 permutations of the data.
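
The permutation approach to a 5% LOD threshold can be sketched as follows. This is not the R/qtl implementation used in this study; it is a simplified illustration that assumes LOD scores from single-marker regression at the markers only, numeric genotype codes, and randomly generated (unlinked) genotypes. The function names, the simulated data, and the reduced number of permutations are all assumptions made for the example.

```python
import numpy as np


def marker_lod(pheno, geno):
    """Single-marker regression LOD score for every marker.

    pheno: (n,) trait values; geno: (n, m) numeric genotype codes.
    LOD = (n / 2) * log10(RSS_null / RSS_marker).
    """
    n = pheno.shape[0]
    y = pheno - pheno.mean()
    g = geno - geno.mean(axis=0)
    rss_null = float(y @ y)
    if rss_null == 0.0:
        return np.zeros(geno.shape[1])
    sxy = g.T @ y
    sxx = np.sum(g * g, axis=0)
    # Sum of squares explained by each marker; monomorphic markers explain none.
    explained = np.divide(sxy ** 2, sxx, out=np.zeros_like(sxy), where=sxx > 0)
    rss_marker = np.maximum(rss_null - explained, np.finfo(float).tiny)
    return (n / 2.0) * np.log10(rss_null / rss_marker)


def permutation_threshold(pheno, geno, n_perm=1000, alpha=0.05, seed=0):
    """Upper-alpha quantile of the maximum LOD over shuffled phenotypes."""
    rng = np.random.default_rng(seed)
    max_lods = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffling phenotypes breaks any genotype-phenotype association.
        max_lods[i] = marker_lod(rng.permutation(pheno), geno).max()
    return float(np.quantile(max_lods, 1.0 - alpha))


# Hypothetical data: 259 lines, 25 unlinked markers, one causal marker (index 10).
rng = np.random.default_rng(42)
geno = rng.choice([-1.0, 1.0], size=(259, 25))
pheno = geno[:, 10] + rng.normal(0.0, 1.0, size=259)
threshold = permutation_threshold(pheno, geno, n_perm=500)
lod = marker_lod(pheno, geno)
print(int(lod.argmax()), bool(lod.max() > threshold))
```

Because the threshold is taken from the distribution of the genome-wide maximum LOD under permutation, it accounts for testing many positions at once.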

QTL models for each phenotype were determined by scanning for potential QTL using

the Haley-Knott regression method and testing for QTL significance one-by-one. Defini-

tion of QTL models was accomplished by first scanning for QTL with the R/qtl function

scanone to find an initial QTL position with a LOD score greater than the 5% cutoff

calculated by permutations. Next, we scanned for additional QTL using the addqtl func-

tion. If this secondary QTL scan detected a QTL that exceeded the 5% LOD score cutoff

defined by permutations, it was added to the model and QTL positions were refined using

the R/qtl function refineqtl. QTL were added to the model using this cycle of: (1) scan-

ning for additional QTL, (2) adding significant QTL to the model, and (3) refining QTL

positions until no more significant QTL could be added. Once all significant QTL were

added, pairwise interactions between QTL were tested using the addint function of R/qtl.

Significant pairwise interactions (F-test, p < 0.05) were added to the model one by one

until no more significant interactions were detected. After the model was finalized, each

QTL in the final QTL model was tested for significance with dropone ANOVA analysis.
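
The cycle of scanning, adding, and rescanning described above is a form of forward selection. The sketch below illustrates that idea under simplifying assumptions: candidate positions are restricted to the markers, each step uses ordinary least squares, the threshold is a fixed hypothetical value rather than one from permutations, and the position refinement, interaction testing, and drop-one steps are omitted. It is not a reimplementation of the R/qtl functions scanone, addqtl, or refineqtl.

```python
import numpy as np


def rss(y, X):
    """Residual sum of squares from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def forward_select(pheno, geno, threshold, max_qtl=10):
    """Add markers one at a time while the best conditional LOD exceeds threshold."""
    n, m = geno.shape
    chosen = []
    design = np.ones((n, 1))  # intercept-only starting model
    current_rss = rss(pheno, design)
    while len(chosen) < max_qtl:
        best_lod, best_j = 0.0, None
        for j in range(m):
            if j in chosen:
                continue
            trial_rss = rss(pheno, np.column_stack([design, geno[:, j]]))
            # LOD for adding marker j, conditional on markers already in the model.
            lod = (n / 2.0) * np.log10(current_rss / max(trial_rss, 1e-300))
            if lod > best_lod:
                best_lod, best_j = lod, j
        if best_j is None or best_lod <= threshold:
            break
        chosen.append(best_j)
        design = np.column_stack([design, geno[:, best_j]])
        current_rss = rss(pheno, design)
    return chosen


# Hypothetical data: two causal markers (indices 3 and 18) among 25 unlinked markers.
rng = np.random.default_rng(7)
geno = rng.choice([-1.0, 1.0], size=(259, 25))
pheno = 1.0 * geno[:, 3] - 0.8 * geno[:, 18] + rng.normal(0.0, 1.0, size=259)
chosen = forward_select(pheno, geno, threshold=3.0)
print(sorted(chosen))
```

Conditioning each new scan on the QTL already in the model is what allows a second, smaller QTL to be detected after a large one has absorbed its share of the variance.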

1.3.4 Simulation Experiment

In order to explore the theoretical maximum number of detectable QTL possible in this

study, we mapped QTL with simulated datasets where causative genes were randomly

chosen from the genes in the teosinte introgressed region. Simulated traits were made for

one to 15 causative genes, then 20 to 50 genes by fives, and then 75 and 100 causative

genes for a total of 24 different causative gene set sizes. The 25 genotyped markers in

our 259 NIRILs were used to assign genotype probabilities to the 2,576 total genes in

the introgressed segment of chromosome five based on the genotype of flanking markers.

These genotype probabilities were assigned based on physical proximity to the two flanking

markers assuming physical distance was proportional to genetic distance so that a gene

closely linked to a given marker had a high probability of sharing that marker genotype.

When consecutive markers had identical genotypes, this method resulted in all genes

between them matching the flanking genotypes.
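
One way to realize this assignment is sketched below. The exact probability rule used in the study is not spelled out beyond proportionality to physical distance, so the linear rule here (probability of the left marker genotype falls linearly from one to zero across the interval) is an assumption, as are the function name, the handling of genes outside the marker range, and the example positions and genotype labels.

```python
import bisect
import random


def assign_gene_genotype(gene_pos, marker_pos, marker_geno, rng):
    """Draw a genotype for one gene from its two flanking markers.

    marker_pos must be sorted in strictly increasing order.
    """
    i = bisect.bisect_right(marker_pos, gene_pos)
    if i == 0:                      # gene lies before the first marker
        return marker_geno[0]
    if i == len(marker_pos):        # gene lies at or beyond the last marker
        return marker_geno[-1]
    left, right = i - 1, i
    if marker_geno[left] == marker_geno[right]:
        return marker_geno[left]    # identical flanking genotypes: no ambiguity
    span = marker_pos[right] - marker_pos[left]
    p_left = (marker_pos[right] - gene_pos) / span
    return marker_geno[left] if rng.random() < p_left else marker_geno[right]


# Hypothetical markers at 10, 20, and 30 Mbp with maize (M) and teosinte (T) calls.
rng = random.Random(1)
marker_pos = [10.0, 20.0, 30.0]
marker_geno = ["M", "M", "T"]
print(assign_gene_genotype(15.0, marker_pos, marker_geno, rng))  # prints: M
draws = [assign_gene_genotype(29.0, marker_pos, marker_geno, rng) for _ in range(10000)]
share_teosinte = draws.count("T") / len(draws)
print(round(share_teosinte, 1))
```

A gene at 29 Mbp sits nine tenths of the way toward the teosinte marker, so about 90% of draws assign it the teosinte genotype.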

Phenotypic trait values are based on both the underlying genetic contributions of genes

and random environmental noise, which together define the heritability of a trait. The

genetic values in the simulated data were set as follows. For each simulated dataset, the

randomly chosen causative genes were assigned a genotype based on the previously derived

genotype probabilities and two effect types: equal and random gamma distributed (alpha

= 1.36 and beta = 1) [48]. The effect types for each gene were given a positive, zero,

or negative value depending on whether the assigned genotype was homozygous maize,

heterozygous, or homozygous teosinte, respectively. Thus, each simulated causative gene

had two numeric values (one for equal and one for gamma distributed effects) representing

the magnitude and direction of effect on the trait. The total genetic contribution to NIRIL

phenotype was then found by simply summing the gene values (equal and gamma effects

kept separate) for all simulated causative genes.

Environmental noise was added to the summed NIRIL genetic phenotype values by

taking random draws from a normal distribution with variance equal to the additional

variance needed to reach the desired level of heritability. Two levels of heritability were

simulated, 67% and 90%, to mimic the heritabilities of two actual traits, the moderately

heritable culm diameter and highly heritable ear diameter. Heritability of the simulated

traits was required to be within 2.5% of the desired heritability, otherwise the normal

distribution was resampled. This process resulted in each set of simulated causative genes

having four states for the NIRILs: equal effect 67% $H^2$, equal effect 90% $H^2$, gamma

effect 67% $H^2$, and gamma effect 90% $H^2$.
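
The construction of a simulated trait can be sketched as follows. Several details are assumptions made for the example: genotypes are coded +1 (maize), 0 (heterozygous), and -1 (teosinte); equal effects are set to one; realized heritability is taken as the genetic variance divided by the total phenotypic variance; and the 2.5% tolerance is read as an absolute difference of 0.025. The function name and the randomly generated genotypes are hypothetical.

```python
import numpy as np


def simulate_trait(gene_geno, h2, effect_type, rng, tol=0.025, max_tries=1000):
    """Return (phenotypes, realized heritability) for one simulated trait.

    gene_geno: (n_lines, n_genes) array coded +1 maize, 0 heterozygous, -1 teosinte.
    """
    n_genes = gene_geno.shape[1]
    if effect_type == "equal":
        effects = np.ones(n_genes)
    elif effect_type == "gamma":
        # Shape 1.36 and scale 1, matching the gamma parameters given in the text.
        effects = rng.gamma(shape=1.36, scale=1.0, size=n_genes)
    else:
        raise ValueError("effect_type must be 'equal' or 'gamma'")
    genetic = gene_geno @ effects
    var_g = genetic.var()
    if var_g == 0:
        raise ValueError("genetic values do not vary among lines")
    # Noise variance needed so that var_g / (var_g + var_e) equals h2.
    sd_e = np.sqrt(var_g * (1.0 - h2) / h2)
    for _ in range(max_tries):
        pheno = genetic + rng.normal(0.0, sd_e, size=genetic.shape[0])
        realized = var_g / pheno.var()
        if abs(realized - h2) <= tol:
            return pheno, float(realized)
    raise RuntimeError("no draw met the heritability tolerance")


# Hypothetical genotypes for 259 lines at 5 causative genes.
rng = np.random.default_rng(1)
gene_geno = rng.choice([-1, 1], size=(259, 5))
pheno, realized = simulate_trait(gene_geno, h2=0.67, effect_type="gamma", rng=rng)
print(pheno.shape, abs(realized - 0.67) <= 0.025)
```

Resampling the noise until the realized heritability falls within tolerance keeps replicates comparable, since a finite sample of 259 lines would otherwise scatter around the target value.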

We simulated twenty-four causative gene set sizes with two effect types and two her-

itabilities for a total of 96 distinct simulated states. Each of these states was replicated

1,000 times resulting in 96,000 simulated sets of phenotypes for the 259 NIRILs. These

phenotype values were then used with actual NIRIL genotypes to map QTL in the R/qtl

software using the same method as described in the previous section. Pairwise QTL in-

teractions were not tested for or added in the simulated datasets because interactions

were not part of the simulated conditions. Mapping of QTL for thousands of simulated

traits could not be accomplished manually and consequently was done with a custom R

script that automated the addition of QTL and saved summary information including

QTL estimated effect size, position, LOD scores, and number of QTL.
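
The size of the simulation design follows directly from the factors described above. The short sketch below enumerates them; the variable names are illustrative.

```python
from itertools import product

# One to 15 genes, then 20 to 50 by fives, then 75 and 100.
gene_set_sizes = list(range(1, 16)) + list(range(20, 51, 5)) + [75, 100]
effect_types = ("equal", "gamma")
heritabilities = (0.67, 0.90)
replicates = 1000

states = list(product(gene_set_sizes, effect_types, heritabilities))
print(len(gene_set_sizes), len(states), len(states) * replicates)
# prints: 24 96 96000
```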

1.4 Results

1.4.1 QTL mapping

Previous work has shown chromosome five to be home to several high LOD score and large

effect size QTL for a number of inflorescence and plant architecture domestication traits

[8, 25]. We undertook a high resolution mapping experiment with a population of NIRILs

with variable fifth chromosome teosinte introgressions in a W22 maize background. In

the summers of 2009 and 2010, the 259 NIRILs were grown in randomized plots arranged

in three replicated blocks. Phenotype data for thirteen traits was collected for five plants

per plot. Spikelet length was only collected for a single block in the summer of 2010. We

analyzed trait measurements from all three growing environments together in a single linear

mixed model with block and year as random effects and position, NIRIL, and family as

fixed explanatory variables. Least squared means were estimated from the mixed models

and later used for QTL mapping.

Histograms of the least squared means show several distribution types including nor-

mal, bimodal, and exponential (Supplemental Figure A.1). NIRILs genotyped as 100%

maize (29 lines) and 100% teosinte (27 lines) were used to determine whether traits

behaved as expected with the full teosinte introgression lines having more teosinte-like

phenotypes. Several traits believed to not be primary targets of selection during domes-

tication such as days to pollen shed and plant height appear to have little or no overall

difference between NIRILs containing the maize and teosinte introgression, while traits

that were the primary focus of selection during domestication including kernel row num-

ber (KRN) and ear diameter (EARD) have a substantial phenotypic difference between

homozygous maize and teosinte NIRILs. For all domestication traits, we observed a dif-

ference (sometimes quite small) between the least squared means for maize and teosinte

NIRILs consistent with the expected effect of domestication. Particularly large differences

are shown for EARD and KRN traits, where the maize genotype is 17.3% and 14.8% larger

than the teosinte genotype, respectively. Also of interest is the CULM trait, where the

maize genotype was 6.5% larger than teosinte.

There was a balanced representation of maize and teosinte genotypes with a high de-

gree of homozygosity in the QTL mapping population. Overall genotypes of the NIRILs

were 48.3% maize, 48.2% teosinte, and 3.5% heterozygous. The NIRIL population in-

cluded lines with teosinte introgressions across 162.24 megabases (Mbp), from position

6,985,619 to 169,231,037 on the maize reference genome (AGPv2). This introgression

included 74.47% of the approximately 218 megabase fifth chromosome. Of the 4,503 fifth

chromosome genes in the Filtered Gene Set (version 5b), 411 genes on the tip of the short

arm and 1,516 genes on the long arm were not included in the teosinte introgressions used

in this study. The genetic map generated with the Kosambi mapping function in R/qtl

Figure 1.1: Cumulative plot of QTL detected in the mapping experiment. Molecular marker positions are shown in centimorgans at the bottom. QTL names, consisting of an abbreviated trait name, chromosome number, and QTL number, are located on the left side. The 1.5 LOD support intervals for QTL are indicated by horizontal bars and peak LOD scores by vertical lines. Hatched bars indicate interacting QTL while solid bars are non-interacting. In total, 24 QTL were identified across the fifth chromosome with a variety of confidence interval sizes, max LOD scores, and effect sizes (see Table 1.3 for QTL statistics). Five QTL clusters with contiguous regions of five or more QTL 1.5 LOD support intervals are indicated by grey shading. A grey-scale heat map depicting the number of QTL 1.5 LOD support intervals from white (0) to black (8) is located at the top.

was calculated to be 86.64 centimorgans (cM), giving an average Mbp to cM ratio of 1.873

Mbp/cM.
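
The figures in this paragraph can be checked from the reported coordinates and gene counts. The sketch below uses only values stated in the text; the variable names are illustrative. The span computed directly from the two coordinates is about 162.25 Mbp, which agrees with the reported 162.24 Mbp to within rounding.

```python
# Values reported in the text (AGPv2 coordinates, Filtered Gene Set 5b counts).
start_bp, end_bp = 6_985_619, 169_231_037
map_length_cm = 86.64
genes_chr5, genes_short_arm_tip, genes_long_arm = 4_503, 411, 1_516

span_mbp = (end_bp - start_bp) / 1e6
mbp_per_cm = span_mbp / map_length_cm
genes_in_introgression = genes_chr5 - genes_short_arm_tip - genes_long_arm

print(round(mbp_per_cm, 3), genes_in_introgression)
# prints: 1.873 2576
```

The 2,576 genes recovered here match the number of genes in the introgressed segment used for the simulation experiment.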

We analyzed 13 traits and identified 24 QTL (Figure 1.1, Table 1.3) with a broad range

of LOD scores ranging from 2.70 (KPR) to 47.22 (KRN). A single epistatic interaction

was detected between the two kernel row number QTL, suggesting epistasis is minimal.

QTL 1.5 LOD support intervals ranged from 2.3 cM (KRN) to 50.6 cM (KPR) with an

average value of approximately 12.5 cM. Heritability on a plot mean basis (Table 1.3) for

each trait varied with an average $H^2$ of 63% and range of 23% (PROL) to 90% (EARD).

Five QTL clusters, defined as contiguous regions with five or more QTL 1.5 LOD support

intervals, were found in the mapping region on chromosome five near 2, 51, 61, 70, and 84

cM (Figure 1.1). There is no clear single concentration of QTL, suggesting this genomic

region lacks a single gene of large, pleiotropic effect and that multiple linked factors at loci

spread across the fifth chromosome are responsible for the previously identified influence

of chromosome five on domestication traits.
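
The cluster definition used here, contiguous regions covered by five or more 1.5 LOD support intervals, can be computed with a simple sweep over interval endpoints. The sketch below is an illustration and not the procedure used to draw Figure 1.1; the function name is hypothetical, intervals are treated as closed, and the example uses only the seven support intervals in Table 1.3 that begin at 0.0 cM, so it recovers just one region rather than all five clusters.

```python
def cluster_regions(intervals, min_depth=5):
    """Return maximal regions covered by at least min_depth closed intervals."""
    events = []
    for start, end in intervals:
        events.append((start, 0))  # kind 0 sorts openings before closings at ties
        events.append((end, 1))
    events.sort()
    regions, depth, region_start = [], 0, None
    for pos, kind in events:
        if kind == 0:
            depth += 1
            if depth == min_depth:
                region_start = pos          # coverage has just reached min_depth
        else:
            if depth == min_depth and region_start is not None:
                regions.append((region_start, pos))  # coverage is about to drop
                region_start = None
            depth -= 1
    return regions


# Support intervals (cM) for dtp5.1, eard5.1, earl5.1, kpr5.1, plht5.1,
# splh5.1, and tbn5.1, taken from Table 1.3.
subset = [(0.0, 11.7), (0.0, 24.2), (0.0, 5.4), (0.0, 50.6),
          (0.0, 2.4), (0.0, 18.7), (0.0, 4.0)]
print(cluster_regions(subset))
# prints: [(0.0, 5.4)]
```

The region from 0.0 to 5.4 cM found in this subset is consistent with the cluster reported near 2 cM.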

1.4.2 Simulation Experiment

We performed a simulation experiment to determine the power and precision of our map-

ping population. Using causative genes projected onto actual NIRIL genotypes, a total

of 96 distinct simulated states in terms of number of genes (between one and 100), heri-

tability (67% and 90%), and effect type (equal and gamma) were replicated 1,000 times

for a grand total of 96,000 simulated NIRIL trait datasets. Histograms of simulated traits

with 90% heritability were clearly bimodal when one causative gene was simulated and

progressively moved towards a normal distribution as more and more causative genes

were simulated. In comparison, simulated traits with 67% heritability lack a clear bi-

modal distribution even when only a single causative gene was simulated and are clearly

approximately normal when 100 genes are simulated (Figure A.2).
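
As a rough illustration of how a trait with a target heritability can be simulated from genotypes, the following Python sketch sums per-gene effects and adds normally distributed noise scaled to the requested heritability. It is a minimal sketch under stated assumptions, not the simulation code used in this study: the function name `simulate_trait`, the gamma shape and scale, and the randomly drawn genotypes (standing in for the actual NIRIL genotypes) are all hypothetical.

```python
import random
import statistics
from typing import List


def simulate_trait(genotypes: List[List[int]], h2: float, effect_type: str,
                   rng: random.Random) -> List[float]:
    """Simulate one phenotype per line for a target heritability h2 (0 < h2 <= 1).

    genotypes[i][j] is 0 (maize) or 1 (teosinte) for line i at causative gene j.
    """
    n_genes = len(genotypes[0])
    if effect_type == "equal":
        effects = [1.0] * n_genes
    elif effect_type == "gamma":
        # Shape and scale are placeholders, not the values used in the study.
        effects = [rng.gammavariate(1.0, 1.0) for _ in range(n_genes)]
    else:
        raise ValueError("effect_type must be 'equal' or 'gamma'")

    # Genetic value is the sum of effects at loci carrying the teosinte allele.
    genetic = [sum(e * g for e, g in zip(effects, row)) for row in genotypes]
    var_g = statistics.pvariance(genetic)
    # Choose the noise variance so that var_g / (var_g + var_e) equals h2.
    sd_e = (var_g * (1.0 - h2) / h2) ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genetic]


rng = random.Random(1)
# Hypothetical genotypes: 200 lines x 5 causative genes, drawn at random here.
geno = [[rng.randint(0, 1) for _ in range(5)] for _ in range(200)]
pheno = simulate_trait(geno, h2=0.90, effect_type="equal", rng=rng)
print(len(pheno))
```

With a single causative gene and high heritability, the two genotype classes stay well separated by the small noise term, which is one way to picture the bimodal histograms described above.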

Table 1.3: Detected QTL for the T5S mapping population with position, heritability, and LOD score statistics.

QTL           LOD      1.5 LOD SI (cM)    Peak Location (cM)    Percent Variation    H²

culm5.1       13.50    58.9 – 69.3        65.3                  21.3%                66.5%

dtp5.1        16.36    0.0 – 11.7         2.3                   20.1%                —
dtp5.2        18.76    75.7 – 80.0        77.4                  23.6%                —
dtp model     28.93    —                  —                     40.1%                67.3%

eard5.1       3.00     0.0 – 24.2         12.9                  1.7%                 —
eard5.2       17.99    50.1 – 54.4        51.9                  11.7%                —
eard5.3       33.76    82.9 – 85.9        84.4                  25.6%                —
eard model    65.62    —                  —                     69.0%                90.0%

earl5.1       12.38    0.0 – 5.4          1.9                   19.7%                49.1%

kpr5.1        2.70     0.0 – 50.6         2.2                   3.0%                 —
kpr5.2        6.80     44.9 – 64.8        63.2                  7.9%                 —
kpr5.3        4.11     76.0 – 86.2        80.9                  4.6%                 —
kpr model     27.41    —                  —                     38.5%                72.7%

krn5.1        6.22     18.8 – 24.7        21.5                  4.8%                 —
krn5.2        47.22    82.6 – 84.9        83.8                  53.4%                —
krn5.1:2      3.32     —                  —                     2.5%                 —
krn model     50.56    —                  —                     59.2%                73.7%

lblh5.1       24.61    75.0 – 81.1        79.0                  35.3%                53.5%

plht5.1       7.64     0.0 – 2.4          0.0                   11.3%                —
plht5.2       2.89     24.3 – 39.2        31.7                  4.1%                 —
plht model    14.06    —                  —                     22.0%                63.1%

prol5.1       8.38     56.9 – 71.6        64.2                  13.8%                22.9%

splh5.1       9.14     0.0 – 18.7         13.0                  10.2%                —
splh5.2       7.16     65.7 – 68.4        67.7                  7.9%                 —
splh5.3       2.78     74.3 – 86.6        78.0                  2.9%                 —
splh model    30.60    —                  —                     41.8%                88.3%

stam5.1       6.50     50.7 – 86.6        83.8                  10.9%                25.9%

tbn5.1        8.28     0.0 – 4.0          0.3                   13.1%                —
tbn5.2        4.60     43.6 – 53.2        47.3                  7.1%                 —
tbn model     10.46    —                  —                     16.9%                69.9%

till5.1       7.21     44.1 – 62.9        58.7                  9.8%                 —
till5.2       3.22     77.2 – 85.9        81.8                  4.2%                 —
till model    18.61    —                  —                     28.1%                34.3%

Since calculating significant LOD score thresholds via permutations for all 96,000

simulated phenotype sets would have taken weeks of computation time, we calculated

LOD score cutoffs in the first 50 replicates of the 96 states. The average threshold was

lower for 90% heritability than 67% heritability with no clear difference in threshold

caused by the effect type of causative genes. Simulated phenotypes with few causative

genes had a lower threshold on average with this effect more pronounced for the gamma

distributed effect type. The range of LOD score thresholds determined was quite narrow

(2.37 to 2.59 for gamma distributed and 2.38 to 2.60 for equal effects). Consequently,

instead of running permutations for the remaining datasets we set a conservative LOD

score threshold for mapping all simulated traits. The cutoff we chose was the maximum

of the 5% cutoffs found in the first 50 replicates of each of the 96 states.
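
The choice of a single conservative cutoff can be sketched as follows: take the 5% cutoff (the 95th percentile of the per-permutation maximum LOD scores) for each permuted replicate, then keep the largest of those cutoffs. This Python sketch is an assumption-laden illustration rather than the code used here. The names `percentile_95` and `conservative_threshold`, the nearest-rank percentile convention, and the toy values are all hypothetical.

```python
from typing import Dict, List


def percentile_95(values: List[float]) -> float:
    """95th percentile by the nearest-rank method (one of several conventions)."""
    ordered = sorted(values)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n) using integer arithmetic
    return ordered[max(rank, 1) - 1]


def conservative_threshold(perm_max_lod: Dict[str, List[List[float]]]) -> float:
    """Largest 5% cutoff over every replicate of every simulated state.

    perm_max_lod[state][replicate] holds the maximum LOD score observed in each
    permutation of that replicate's phenotypes.
    """
    return max(percentile_95(perms)
               for replicates in perm_max_lod.values()
               for perms in replicates)


# Toy values only: two states, one replicate each, 20 permutations per replicate.
toy = {
    "state_a": [[k * 0.125 for k in range(1, 21)]],
    "state_b": [[k * 0.125 + 0.25 for k in range(1, 21)]],
}
print(conservative_threshold(toy))
```

Because the observed cutoffs spanned a narrow range, taking the maximum costs little power while guaranteeing that no simulated state is mapped with a threshold more lenient than its own permutation result.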

After simulated phenotypes were generated and significance thresholds were set, QTL

were mapped using the 96,000 simulated datasets with actual genotypes for the NIRILs

in this study. Increasing the number of simulated causative genes from one to 100 caused

the mean number of detected QTL to rise from one to ∼4.5 or ∼3.0 for simulated traits

with 90% or 67% heritability, respectively (Figure 1.2). Thus, heritability was an impor-

tant factor in determination of the number of detectable QTL in our experiment. The

simulated gamma effects, as opposed to equal effects, appeared to cause the maximum

number of detectable QTL to be reached at a larger number of simulated causative genes,

but there was no difference in the overall maximum number of QTL detected.

Our results show that QTL 1.5 LOD support intervals quickly become associated with

multiple genes when many causative genes are simulated (Figure 1.3). In the case of

five causative genes with equal effect and 67% heritability, the chance of a QTL containing

a single causative gene has already dropped to approximately 50% (similar patterns

are seen for gamma simulated phenotypes in Supplemental Figure A.3). This suggests

that when making decisions about fine mapping of QTL, researchers would be well advised to

consider factors such as trait heritability and the power of their mapping population to

identify QTL support intervals that contain single causative genes.

Figure 1.2: The number of detected QTL and mean detected QTL effect size versus number of simulated causative loci. Black lines indicate 95% confidence intervals. (A) Simulations consistently detect one QTL when a single causative gene is simulated, but when using as few as three or four causative genes, we lose the ability to distinguish between genes. With high numbers of simulated causative genes, total QTL detected reaches a ceiling of ∼4.5 QTL for simulated traits with 90% heritability and ∼3.0 for traits with 67% heritability. (B) The effects of unresolved genes are merged into the few large effect QTL that are detected, consistent with the Beavis Effect. This is seen in the negative correlation between mean estimated effect and number of causative genes.

In our simulation experiment, increasing the number of causative genes also led to an

increase in the average estimated effect size of detected QTL (Figure 1.2). We interpreted

this as the effects of multiple underlying causative genes being combined into a single

detected QTL with a cumulative effect, consistent with the Beavis Effect where multiple

small effect loci are detected as single QTL of larger effect [49]. On average, the total

additive effect for each simulated phenotype should be the product of the total number of

simulated causative genes and the average effect size. We found this expected relationship

between number of detected QTL, average estimated additive effect of each detected QTL,

and expected total additive effect for both equal and gamma distributed effect size and

both heritabilities.

Our mapping results using empirically measured traits found three QTL for a trait

with heritability of 90% (ear diameter) and a single QTL for a trait with 67% heritability

(culm diameter). Comparison of these results with the simulations shows that for traits

with 90% heritability, when three or more QTL are detected there are likely to be anywhere

from four to six underlying causative genes, making a 1:1 relationship between number

of QTL and causative genes uncertain (Figure 1.2). In contrast to this result, simulated

traits with heritability of 67% and a single causative gene averaged a single detected

QTL which contained the causative gene 90% to 95% of the time. These observations

have implications for future fine mapping efforts to identify the causative gene underlying

QTL.

Figure 1.3: The proportion of detected QTL with zero, one, or more than one simulated causative genes in the 1.5 LOD support interval. High numbers of causative genes lead to detected QTL that contain multiple causative genes. There is a reasonable percentage of detected QTL in the simulations that contain a single causative gene when few (less than 4) causative genes are simulated, but as the number of simulated causative genes increases we quickly lose the power to distinguish between closely linked causative genes and they become lumped into single detected QTL. Equal effect simulations shown here are very similar to those seen for the gamma distributed effects (Supplemental Figure A.3).

1.5 Discussion

Previous studies in maize have found single genes underlying genomic regions of large

effect on multiple domestication traits [3–5, 41, 50]. This is in stark contrast to our work

on chromosome five, where the previously observed large effect of chromosome five on

several domestication traits in maize [8, 25] is caused by multiple regions spread across

the chromosome. This suggests the nature of genetic factors controlling domestication

traits on chromosome five of maize is different from that of other large domestication loci in

maize. Whether the situation of chromosome five is unique within maize

or among crop plants remains to be seen, but the several loci identified in this study suggest

that in addition to effectively acting on highly pleiotropic, large effect single genes, the

domestication process also has the capacity to work on several linked genes of variable

effect to produce a chromosomal region of large QTL effect.

Although our results show that several regions on chromosome five contain QTL af-

fecting different traits, this chromosomal region was initially defined as several tightly

clustered QTL in F2 crosses between teosinte and a small-eared primitive Mexican lan-

drace [43]. In contrast, our NIRIL population was developed from a cross of teosinte by a

modern agronomic maize inbred (W22) and is expected to harbor domestication QTL as

well as improvement QTL selected on during the past 9,000 years since maize was domes-

ticated. Thus while results from this analysis suggest chromosome five houses a complex

made of multiple linked factors, we cannot discount the possibility that a simpler genetic

architecture would have been observed had we used a primitive maize landrace rather

than the maize W22 inbred line.

One potential use of QTL mapping results is interrogation of the genes within QTL 1.5

LOD support intervals for likely candidates. The marker density in our experiment leads

to most QTL 1.5 LOD support intervals containing hundreds of annotated genes. How-

ever, two QTL had a narrow support interval that contained a relatively small number of

genes. These two QTL were krn5.2 and eard5.3, which co-localize to the same ∼2.3 cM

region. When expanded to the nearest genetic markers, these QTL fell between umc1348

and umc1966, which spanned a 4.81 cM region that included 2.654 Mbp with 54 genes

from the maize filtered gene set (AGPv2). One interesting candidate that falls in this

range is AC212823.4_FG003, which encodes a MADS-box transcription factor previously

cataloged as MADS-transcription factor 65 (mads65) in the GRASSIUS transcription

factor database [51]. Initially identified in plants as important floral organ identity reg-

ulators [52, 53], the MADS-box family of transcription factors has since been shown to

be involved in a wide variety of developmental programs in various organs and stages

of plant development [54]. This particular MADS-box gene has homology to the rice

gene OsMADS57, a type II MIKC^C MADS gene. The large subclass of MIKC^C MADS

genes is quite diverse with members involved in floral specification, phase transition, and

root development among other developmental functions [54]. This gene was also found to

be selected during crop improvement by a recent study [55] and was expressed in many

tissues as described in the maize gene expression atlas [56]. All of these factors make

AC212823.4_FG003 an attractive candidate in future studies to fine map the causative

gene for kernel row number on chromosome five.

The limits of a QTL experiment in terms of power and resolution are important factors

to consider when undertaking an experiment in any mapping population. To better inform

our QTL results with empirically measured traits, we explored the computational limits of

the experimental mapping population using simulated trait datasets. In this experiment,

we never detected more than six QTL for any of the simulated conditions. The most

important characteristic of simulated traits in determining number of detected QTL was

heritability and not effect type. As expected, when the number of underlying causative

genes increased to a high level, we saw the effect of multiple causative genes being rolled

into single detected QTL. This result is consistent with the Beavis Effect [49], a phenomenon

that describes the tendency for QTL of small effect to be combined into a single QTL with

large estimated effect. If these polygenic QTL, which can have quite high LOD score and

effect size, were chosen for fine mapping we would be unlikely to find a single underlying

causative polymorphism. Consequently, when considering QTL for fine mapping purposes,

researchers must be careful in choosing QTL that have high heritability and mapping

populations with sufficient power to resolve QTL to single genes. It is important to

realize that the simulation results reflect the specific markers, genotypes, and mapping

population used in this study. While some results are likely generally applicable to other

QTL experiments, simulations using mapping population specific parameters will provide

the best insight into potential genetic architectures and information on population power

and precision.

QTL mapping has been used to great effect to characterize the genomic regions con-

trolling traits selected on during domestication in maize. These studies have shown that

while genetic factors controlling domestication traits are spread throughout the genome,

there are concentrated genomic regions where QTL for several domestication traits are

in close proximity to each other [8, 25]. In this study, we use a QTL mapping popula-

tion of NIRILs with teosinte introgressions specific to chromosome five to closely examine

previously mapped QTL for a number of domestication traits. We confirmed that QTL for

these traits exist on chromosome five; however, in our population these QTL further

fractionate into multiple QTL. This is in contrast to other genomic regions of large effect

in maize where single pleiotropic genes were identified as the causative factor underly-

ing genomic regions of large effect [3–5, 50]. The presence of multiple QTL in several

locations on chromosome five suggests the existence of a complicated, linked, multi-gene

locus controlling various aspects of domestication traits. This apparent complexity of the

chromosome five locus is consistent with results from our simulation experiment, where

we show that traits with multiple mapped QTL likely have a more complicated underlying

genetic architecture than is indicated by the initial QTL mapping results.

Chapter 2

Fine mapping of chromosome five

domestication genes in maize

2.1 Abstract

The fifth chromosome of Zea mays has previously been shown to contain large effect

QTL for several domestication traits. In this work I describe efforts to identify the

causative polymorphisms responsible for two of these QTL, those for the domestication traits

of culm diameter and kernel row number. These two QTL represent the first and eighth

highest LOD scores detected in the QTL mapping experiment of chapter 1. We utilized

several heterogeneous inbred families drawn from a BC2S3 mapping population that were

heterozygous in the 1.5 LOD support interval of these QTL to generate two sets of recom-

binant chromosome nearly isogenic lines, one for the culm diameter QTL and one for the

kernel row number QTL. Lines were grown in replicated, randomized blocks in four years

and phenotypes were measured. A linear mixed model was used to obtain least squares

means for each line and we looked for segregation of the phenotype based on indel and

genotyping by sequencing markers. Simple Mendelian segregation of the lines was not

observed for any of the traits of interest, suggesting a single locus does not explain the

differences in phenotype. Consequently, we used QTL mapping software to map QTL in

the segregating regions of interest on chromosome five for culm diameter and kernel row

number. These analyses showed a highly significant heterogeneous inbred family effect

as well as multiple QTL in the target region for kernel row, suggesting the genetic fac-

tors underlying kernel row number and culm diameter have a complex relationship with

multiple loci on several chromosomes.

2.2 Introduction

The ultimate goal of many studies investigating the evolution of novel morphology in di-

vergent lineages is identification of the causative genes responsible for phenotypic change.

Towards this end, genes causing new forms have been identified a number of times in

many species including maize, tomato, wheat, barley, and, most successfully, rice. Over

the years there have been more than 20 genes identified in rice with important effects

on agronomic and domestication phenotypes such as loss of shattering in domesticated

plants [15], increased grain yield in terms of grain number [57], grain weight [58, 59], and

plant architecture [60, 61]. In contrast, there are considerably fewer success stories in fine

mapping in other organisms. In maize, recent experiments have mapped several high LOD

score, large effect domestication QTL to single genes including teosinte branched1 (tb1)

[3, 41], grassy tillers1 (gt1) [4, 39], teosinte glume architecture1 (tga1) [5], and ZmCCT

[50, 62]. One common characteristic of these genes is they were initially characterized as

massive, high LOD, large effect size QTL.

In maize, domestication phenotypes have been shown to be largely controlled by six

regions of the genome [8]. The large concentration of domestication QTL on the fifth

chromosome has been repeatedly observed in several studies [25, 37, 43]; however, little

is known about the causative genes and underlying polymorphisms that cause this large

effect. Experiments designed to examine chromosome five in maize have several challenges

caused by characteristics of the chromosome. First, this chromosome has gametophyte

factor2 (ga2) [63], a pollen incompatibility factor that greatly influences pollination rates of

specific genotype combinations. Second, there is an extended region of low recombination

rate around the centromere (102.3 megabase to 109.2 megabase) that complicates collec-

tion of recombinant chromosomes for mapping experiments. In spite of these challenges,

characterizing the many domestication QTL for plant architecture and inflorescence traits

on the fifth chromosome of maize is a necessary step towards fully understanding the ef-

fect domestication had on the maize genome. While many traits have QTL that map

to the fifth chromosome, QTL with exceptionally high LOD score and effect size are of

particular interest for fine mapping studies.

A high LOD score, large effect QTL for kernel row number (krn) and ear diameter

(eard), previously reported on chromosome five of maize [25, 37, 43], was shown to frac-

tionate into at least two or three QTL in Chapter 1. The largest QTL for both of these

traits in terms of LOD score and effect size (eard5.3 and krn5.2 ) were both located to-

wards the right of the mapping interval between umc1966 and umc1348. The krn5.2 QTL

had a LOD score of 45.2, explained 51.98% of phenotypic variation, and was estimated

to have an additive effect of −0.73 kernel rows. The co-localizing eard5.3 QTL also had

a trait-high LOD score of 32.7, explained 25.1% of variation, and had an effect of −1.41 mm. This

region was ∼1.3 cM or 2.65 Mb and was the narrowest confidence interval found for the

mapping population used in chapter 1. The kernel row number and ear diameter traits

are highly related, both affecting ear size in the transverse plane. This fact, viewed in the

context of co-localization of eard5.3 and krn5.2, suggests a single gene influences both

traits.

In addition to the high LOD score QTL for krn and eard, the fifth chromosome of

maize was shown (Chapter 1) to have QTL for plant architecture traits including tillering,

lateral branch length, and culm diameter. The QTL for culm diameter in chapter 1 had

the eighth highest LOD score detected. In contrast with the krn5.2 and eard5.3 QTL,

mapping for culm diameter revealed a single QTL of moderate effect, culm5.1. This QTL

had a considerably larger 1.5 LOD support interval (97.3 megabases), lower LOD score

(19.8), lower variation explained (21.27%), and smaller additive effect size (−0.67 mm).

The characteristics of culm5.1 in terms of number of QTL, LOD score, and effect size

make for a different type of fine mapping candidate than krn5.2 and eard5.3.

An experiment was designed to further investigate and identify the causative poly-

morphisms behind the large effect and LOD score krn5.2 /eard5.3 QTL and the moderate

effect culm5.1 QTL. This project used a collection of recombinant chromosome nearly iso-

genic lines (RCNILs) grown in replicated randomized blocks over multiple years. These

RCNILs were derived from heterogeneous inbred families (HIFs) drawn from a BC2S3

population with a massive ear diameter QTL [25] with a maximum LOD score of 144.4.

Lines were generated, genotyped, and grown in replicated blocks in the summers of 2010,

2012, and 2013. RCNILs did not segregate cleanly in the target QTL 1.5 LOD support

intervals for the kernel row number and culm diameter phenotypes. I next used genome-

wide genotyping and QTL mapping methods to account for secondary segregating regions

in the genome. The results of this analysis suggest that not only are secondary sites

segregating with significant effects on kernel row number and culm diameter, but that multiple

factors are again segregating within the initial target QTL support interval. Overall, these

results suggest the genetic architecture controlling domestication traits is quite complex

with multiple loci contributing to kernel row number and culm diameter phenotypes across

the genome. Chromosome five in particular appears to house a collection of genes affect-

ing several domestication traits and represents at least three linked loci that may have

been selected as a unit during maize domestication.

2.3 Materials and Methods

2.3.1 Plant material

We chose to identify the causative genes underlying the large LOD score and effect size

QTL for kernel row number (krn) and culm diameter (culm) on chromosome five using re-

combinant chromosome nearly isogenic lines (RCNILs). These lines consist of individuals

carrying two copies of a recombinant chromosome with a recombination breakpoint in the

region of interest, which corresponds with the 1.5 LOD support intervals for culm5.1 and

krn5.2. Based on QTL mapping results from chapter 1, the two QTL are adjacent to each

other with the culm diameter QTL from 54,416,924 to 151,717,831 bp and the kernel row

number QTL from 166,576,639 to 169,231,037 bp. Base pair coordinates for these QTL

are based on BLAST of flanking marker primer sequences against the second version of

the maize reference genome (AGPv2) [9]. I chose to generate RCNILs from segregating

heterogeneous inbred families (HIFs) taken from a large BC2S3 mapping population. Four

founding HIFs, two per QTL, heterozygous for the genomic region of interest defined by

QTL 1.5 LOD support intervals and surrounding regions were used in production of RC-

NILs. Care was taken to use HIFs with limited heterozygosity adjacent to the primary

region of interest and elsewhere in the genome.

A large number of plants from each HIF were screened with PCR based insertion dele-

tion (indel) markers flanking the region of interest to identify plants with recombinant

chromosomes in the summers of 2009 and 2010. The initial screening of HIFs for indi-

viduals with recombinant chromosomes used three flanking markers (ZHL0029, ZHL0033,

and umc1966) located at 38,994,478 bp, 151,446,717 bp, and 169,230,959 bp, respectively.

These markers were chosen to be as close as possible to the boundaries of the QTL.

Individuals with recombinant chromosomes were self pollinated and seed was harvested

and planted in the following winter growing seasons. Plants were grown in winter seasons

in a greenhouse environment, where they were genotyped again at the flanking markers

to identify plants homozygous for the initially detected recombinant chromosome. These

individuals were then self pollinated to make RCNIL seed, carrying two copies of the origi-

nal recombinant chromosome, for use in subsequent summers for randomized phenotyping

blocks and seed increasing purposes.
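
The two-step marker screen described above (find plants whose genotype changes between flanking markers, then find selfed progeny homozygous for that recombinant chromosome) can be sketched in Python. This is a hedged illustration, not the scoring procedure actually used. The marker order comes from the positions given in the text, while the single-letter genotype codes and the function names `recombinant_intervals` and `is_fixed_recombinant` are assumptions.

```python
from typing import Dict, List, Tuple

# Marker order along chromosome five, following the positions given in the text.
MARKERS: List[str] = ["ZHL0029", "ZHL0033", "umc1966"]


def recombinant_intervals(calls: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return adjacent marker pairs between which the genotype call changes.

    calls maps marker name to 'M' (maize), 'T' (teosinte), or 'H' (heterozygous).
    Markers with any other call (e.g. missing data) are skipped.
    """
    scored = [(m, calls[m]) for m in MARKERS if calls.get(m) in ("M", "T", "H")]
    return [(left, right)
            for (left, a), (right, b) in zip(scored, scored[1:])
            if a != b]


def is_fixed_recombinant(calls: Dict[str, str]) -> bool:
    """True when every marker is homozygous and at least one breakpoint exists."""
    homozygous = all(calls.get(m) in ("M", "T") for m in MARKERS)
    return homozygous and bool(recombinant_intervals(calls))


# Hypothetical plants: a first-generation recombinant and its fixed progeny.
summer_plant = {"ZHL0029": "H", "ZHL0033": "H", "umc1966": "M"}
winter_plant = {"ZHL0029": "T", "ZHL0033": "T", "umc1966": "M"}
print(recombinant_intervals(summer_plant), is_fixed_recombinant(winter_plant))
```

In this sketch the summer screen corresponds to `recombinant_intervals` returning a non-empty list, and the winter greenhouse screen corresponds to `is_fixed_recombinant` returning true for a selfed progeny plant.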

2.3.2 Field Trials and Phenotypes

The RCNILs were grown in a total of 16 replicated, randomized blocks in multiple sum-

mers between 2010 and 2013. Phenotyping experiments took place at the West Madison

Agricultural Research Station (WMARS), with RCNILs for the culm5.1 QTL grown in

2010 and 2012 and the krn5.2 QTL lines grown in 2012 and 2013. When possible, seed

for a single RCNIL was taken from up to five seed packets and mixed prior to planting in

order to minimize the effect of any single seed lot (mother plant) on phenotype. In each

summer, four blocks of RCNILs per QTL were grown in twelve plant plots. Individuals

were planted with equal spacing in 14 foot rows with 30 inches between adjacent rows

and two foot walkways separating the end and start of a new row. Up to five individuals

per plot were measured in 2010 and 2012, while in 2013 kernel row number was assessed

for all possible plants.
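
A replicated, randomized block layout of the kind described here can be generated with a few lines of Python. The sketch below is only one plausible way to do it: the function name `randomized_blocks`, the line names, and the seed are hypothetical, and it does not reproduce the actual field maps.

```python
import random
from typing import Dict, List


def randomized_blocks(lines: List[str], n_blocks: int, seed: int) -> Dict[int, List[str]]:
    """Assign every line to one plot per block, in an independent random order."""
    rng = random.Random(seed)
    layout: Dict[int, List[str]] = {}
    for block in range(1, n_blocks + 1):
        order = list(lines)   # copy so the caller's list is untouched
        rng.shuffle(order)    # plot order within this block
        layout[block] = order
    return layout


# Hypothetical RCNIL names; four blocks per QTL per summer as in the text.
layout = randomized_blocks(["RCNIL_01", "RCNIL_02", "RCNIL_03"], n_blocks=4, seed=2012)
for block, plots in layout.items():
    print(block, plots)
```

Each line appears exactly once in every block, so block effects can be separated from line effects in the later analysis. Fixing the seed makes the planting map reproducible.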

In addition to twelve plant plots, select lines for the culm diameter QTL were grown in

a phenotyping block of fully randomized single plant plots (SPP) in the summer of 2012.

This block of plants consisted of 60 individuals each from seventeen RCNILs and eight

control RCNILs (homozygous for the maize or teosinte chromosomal segment) grown in a

completely randomized scheme. The seventeen RCNILs were chosen due to recombination

breakpoints being close to preliminary estimates of the causative gene location based on

initial analysis of data from the summer of 2010. Individual plants were separated by

a larger than normal distance (30 inches in the X and 48 inches in the Y dimension) in

order to allow them to grow to their full phenotypic potential with minimal competition

and shading from neighboring plants.

Traits were measured by hand: culm diameter was taken in the field with

calipers at the narrowest point of the stalk, and kernel row number was counted after harvest

in the lab (2012) or in the field (2013). In the SPP we also measured culm diameter at the

largest point to calculate culm area and other basic plant architecture traits (plant height

and tiller number) for use in later analyses. In total, 3,182 individuals were assessed for

culm diameter (1,021 in 2010 and 2,161 in 2012) and kernel row number was counted

for 8,625 individuals (3,168 in 2012 and 5,457 in 2013). Ear diameter, which had a

co-localizing QTL in chapter 1 (see Table 1.3

for details), was also measured in some environments but was not considered for later

analyses since kernel row number and ear diameter are highly related traits.

2.3.3 Genotyping with PCR and next generation sequencing

Genomic DNA was extracted from the initial plant of each RCNIL with a standard CTAB

method and genotyping from this “founder” individual was used to represent the RCNIL

genotype in later analyses. The genotypes of RCNILs were obtained using two strategies,

a PCR based method targeting known polymorphisms and a high throughput next gen-

eration sequencing protocol. All RCNILs were genotyped using PCR of known indels and

single nucleotide polymorphisms (SNPs), while a subset were genotyped using the high

throughput genotyping by sequencing (GBS) protocol. All RCNILs developed for fine

mapping of krn5.2 were genotyped by GBS while only a subset of culm5.1 RCNILs were

genotyped by GBS. However, genotyping of culm5.1 lines was done with a more extensive

collection (18 markers) of PCR markers than krn5.2 RCNILs (5 markers).

PCR based genetic markers (Supplemental Table B.1) were used to genotype RCNILs

with standard agarose gel electrophoresis, fluorescent fragment analysis, and Sanger

sequencing detected SNPs. These three styles of marker were initially developed by identi-

fication of scorable polymorphisms that distinguished maize and teosinte control RCNILs

through Sanger sequencing of annotated genes in the maize reference genome (AGPv2).

Size polymorphism differences greater than approximately 10% of total PCR product

length were scored on 4% agarose gels and smaller size polymorphisms were redesigned

with fluorescently labeled primers and genotyped using GeneScan software (v1.70) from

Applied Biosystems. If the only scorable polymorphism was a SNP, RCNILs were geno-

typed by Sanger sequencing and hand calling of SNPs.
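
The rule for assigning each PCR marker to one of the three assays can be written as a small decision function. This Python sketch assumes a hard 10% cutoff where the text says "approximately 10%", and the function name `choose_assay` and its return labels are hypothetical.

```python
def choose_assay(product_length_bp: int, indel_size_bp: int) -> str:
    """Pick a genotyping assay for a PCR marker from its size polymorphism."""
    if indel_size_bp == 0:
        return "sanger"             # only a SNP distinguishes the alleles
    if indel_size_bp > 0.10 * product_length_bp:
        return "agarose"            # difference resolvable on a 4% agarose gel
    return "fragment_analysis"      # small indel: fluorescently labeled primers


# Hypothetical markers: (PCR product length, indel size) in base pairs.
for product, indel in [(300, 40), (300, 6), (300, 0)]:
    print(product, indel, choose_assay(product, indel))
```

In practice the boundary between gel scoring and fragment analysis is a judgment call that depends on gel resolution, so the 10% figure is best read as a guideline.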

While great care was taken to choose founding HIFs with minimal heterozygosity, all

HIFs had secondary sites segregating elsewhere in the genome. In order to identify these

regions and account for their effect on phenotype, we performed GBS [64] on RCNIL

genomic DNA for all kernel row number and the subset of culm diameter lines grown in

the single plant plot experiment. In order to use the GBS protocol, additional molecular

work was required. DNA was treated with 1 µL of RNaseI at room temperature for 30

minutes to remove total RNA from the CTAB DNA preparation. Next, the samples were

digested using the methylation sensitive ApeKI restriction enzyme and 96-plex barcoded

sequencing adapters were ligated to individual samples. Finally, the 96 barcoded samples

were mixed and sequenced (100 bp reads) on an Illumina HiSeq machine [64]. Sequence

tags were aligned to the reference maize genome (AGPv2) and SNPs were called and im-

puted using the GBS pipeline as implemented at Cornell University. This GBS procedure

resulted in 955,650 SNPs made up of raw A, T, C, and G SNP calls for the RCNILs across

the ten maize chromosomes.

Raw SNP calls were further processed in order to call RCNIL genotypes into maize

and teosinte using a custom Perl script. The genotype calls were made using SNP calls

from the pure maize parent inbred line, W22. Only biallelic markers (43,025 total) were

kept and the non-W22 SNP in the RCNILs was assumed to be the teosinte allele. After

converting the genotypes into maize, teosinte, and heterozygous calls, SNPs separated

by less than 100 base pairs were merged into a single marker, leaving 25,736. If SNP

genotypes within a merged marker did not agree they were converted to missing data,

“N”. After marker genotypes were called and merged, a final genotype imputation step

was carried out using another custom Perl script. In an effort to have this script correct

bad and missing data, all genotype calls were subject to imputation. The criteria for

changing a call in any given RCNIL involved ten marker windows both upstream and

downstream of a given marker. If all markers in one direction or the other were 100%

consistent, then the genotype was changed if and only if seven of the ten markers on the

other side were also the same genotype.
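
The window rule above can be read in more than one way, and the original Perl script is not reproduced here. The following Python sketch shows one possible reading: a call is replaced when the ten markers on one side all carry the same non-missing genotype and at least seven of the ten markers on the other side carry that genotype too. The function name, the single-letter genotype codes (M, T, H, N), and the decision to skip markers within ten positions of either end are assumptions made for illustration.

```python
from typing import List

WINDOW = 10      # markers examined on each side of the focal marker
MIN_SUPPORT = 7  # agreeing markers required in the less consistent window


def impute_calls(calls: List[str]) -> List[str]:
    """Return a copy of calls ('M', 'T', 'H', 'N') after window-based imputation.

    Windows are always read from the original calls, so the result does not
    depend on the order in which markers are visited.
    """
    imputed = list(calls)
    for i in range(WINDOW, len(calls) - WINDOW):
        upstream = calls[i - WINDOW:i]
        downstream = calls[i + 1:i + 1 + WINDOW]
        for uniform, other in ((upstream, downstream), (downstream, upstream)):
            genotype = uniform[0]
            if (
                genotype != "N"
                and all(call == genotype for call in uniform)
                and other.count(genotype) >= MIN_SUPPORT
            ):
                imputed[i] = genotype
                break
    return imputed
```

Under this reading, a missing call flanked by ten maize calls upstream and eight maize calls downstream becomes maize, while a true recombination breakpoint (ten maize calls followed by teosinte calls) is left unchanged.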

The imputation methods for GBS data described above greatly improved genotype

continuity; however, certain regions of the genome were still questionably called. The

most inconsistently called genomic regions included extended heterozygous regions and

recombination breakpoints where genotypes switched. Following the processing steps using

custom Perl scripts, the data were manually screened to remove or correct inconsistently

called markers. Uninformative markers where the adjacent marker to either side

had exactly the same genotypes were also removed from the dataset. Regions of the

genome where RCNILs had maize and teosinte fixed genotypes that associated with HIF

(non-segregating regions fixed for different genotypes in the founding HIFs) were also

removed. Finally, independently segregating regions on the same chromosome were given

unique chromosome names (5a, 5b, etc.) to avoid inflation of the genetic map between

fixed ancestral recombination breakpoints. After imputation and filtering, a total of 522

genome-wide GBS markers spread across 13 segregating regions of the genome on six

chromosomes were used in the final analysis. The four other maize chromosomes were

completely fixed for a single homozygous genotype and consequently were excluded from

the analysis.
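
As an illustration of the uninformative-marker filter, the sketch below collapses runs of adjacent markers whose genotypes are identical across all RCNILs, keeping the first marker of each run. This is one interpretation of the manual curation step, not the procedure actually used; the function name and data layout are hypothetical, and the filter would be applied within a single linkage group at a time.

```python
from typing import Dict, List


def drop_redundant_markers(
    order: List[str], genotypes: Dict[str, List[str]]
) -> List[str]:
    """Keep a marker only if its genotype vector differs from the last kept marker.

    order lists marker names in map order; genotypes maps each marker name to
    its calls across RCNILs (same RCNIL order for every marker).
    """
    kept: List[str] = []
    for name in order:
        if kept and genotypes[name] == genotypes[kept[-1]]:
            continue  # identical to its neighbor, so it adds no recombination information
        kept.append(name)
    return kept
```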

2.3.4 Statistical analysis and segregation of phenotypes

We used the statistical program SAS to fit linear mixed models with the PROC

MIXED procedure [44]. Variables used in the model included the RCNIL, HIF, block, year

grown, and position within the block. A forward model selection method was used in which

the starting model had a minimum number of variables (fixed effects for HIF and RCNIL

nested in HIF) and additional variables were added to the model one at a time until the

Akaike Information Criterion (AIC) reached its lowest point. The most complicated model

selected was for the culm diameter single plant plot experiment, where five explanatory

variables were used (Table 2.1). In these models $Y$ stands for the measured phenotype,

$\mu$ stands for the grand mean, $a_i$ the RCNIL, $f_j$ corresponds to the HIF, $b_k$ the block, $c_l$

and $d_m$ denote the horizontal and vertical position in block respectively, $t_n$ stands for the

year, $h_p$ is the tiller number phenotype used in the SPP experiment only, and finally $e$ and $g$

are error terms. While the single plant plot culm diameter had the most complex model,

the other models had only one less variable.
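
The forward selection procedure can be sketched generically. The code below is not the SAS workflow used in this chapter; it is a minimal Python illustration in which the model-fitting step is abstracted behind a caller-supplied function `aic_of`. The term names and the `toy_aic` values are invented for demonstration only.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    base_terms: Sequence[str],
    candidates: Sequence[str],
    aic_of: Callable[[Tuple[str, ...]], float],
) -> Tuple[List[str], float]:
    """Greedy forward selection: add one term at a time while AIC keeps dropping."""
    selected = list(base_terms)
    remaining = list(candidates)
    best_aic = aic_of(tuple(selected))
    while remaining:
        # Score every one-term extension of the current model.
        scored = [(aic_of(tuple(selected + [term])), term) for term in remaining]
        aic, term = min(scored)
        if aic >= best_aic:  # no candidate improves the fit, so stop
            break
        selected.append(term)
        remaining.remove(term)
        best_aic = aic
    return selected, best_aic


def toy_aic(terms: Tuple[str, ...]) -> float:
    """Made-up AIC values for demonstration; a real run would refit the mixed model."""
    score = 100.0 + 2.0 * len(terms)  # penalty of 2 per term
    if "block" in terms:
        score -= 10.0
    if "year" in terms:
        score -= 5.0
    return score
```

Starting from the minimal model (HIF and RCNIL nested in HIF) with candidates block, year, and position, the toy example adds block and then year, and stops before adding position because AIC would rise.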

Least squared means for the RCNIL nested in HIF effect were extracted and used

as an average line phenotype value for subsequent analyses. The goal of these analyses

was to associate RCNIL phenotype and genotype. If a single locus in the segregating

region is responsible for the phenotypic effect, one should observe simple, clean Mendelian

segregation of least squared means based on genotype. Towards this end, RCNILs were

sorted by phenotypic value (as represented by least squared mean). Unfortunately, we did

not observe segregation of least squared means based on genotype, suggesting multiple

factors influence the measured traits.
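
The expectation of clean segregation can be stated as a simple check: after sorting lines by least squared mean, all lines of one genotype at the causal marker should precede all lines of the other. The sketch below encodes that check for a single marker; the function name and the tuple layout are assumptions for illustration, and heterozygous or missing calls are assumed to have been removed beforehand.

```python
from typing import List, Tuple


def segregates_cleanly(lines: List[Tuple[float, str]]) -> bool:
    """Return True if sorting by phenotype separates the two genotype classes.

    lines holds (least squared mean, genotype at one marker) pairs, with
    genotype coded 'M' (maize) or 'T' (teosinte).
    """
    ordered = [genotype for _, genotype in sorted(lines)]
    # A clean split changes genotype at most once along the sorted list.
    switches = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return switches <= 1
```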

We have two main hypotheses as to why RCNILs failed to segregate in a Mendelian

manner. First, because the BC2S3 founding HIFs of the RCNILs are less advanced than

the BC6S6 population from chapter 1, additional factors may be segregating

elsewhere in the genome and confounding Mendelian segregation. Second,

the primary locus of interest on chromosome five may not be a single gene, but rather multiple

linked genes that, when split up by the various recombination breakpoints in our RCNILs,

lead to complicated segregation patterns. In order to investigate both of these possibilities,

we obtained whole genome genotypes using GBS (described above) and mapped

QTL in the R/qtl software package for plants grown in twelve plant rows. The single

plant plot culm diameter experiment was not analyzed with QTL mapping methods since

it only included seventeen RCNILs and consequently lacked power for a QTL analysis.

Table 2.1: Final linear mixed models used to produce least squared means for fine mapping RCNILs.

Trait          Linear Mixed Model

culm (rows)    $Y_{ijkmo} = \mu + a_i(f_j) + f_j + b_k + d_m(b_k) + e_{ijklm} + g_{ijkmo}$

culm (SPP)     $Y_{ijlmop} = \mu + a_i(f_j) + f_j + c_l + d_m + h_p + e_{ijklmnp} + g_{ijkmnpo}$

krn (rows)     $Y_{ijkno} = \mu + a_i(f_j) + f_j + b_k(t_n) + t_n + e_{ijklmn} + g_{ijkmno}$

The benefit of using GBS and the statistical methods of QTL mapping is the simultaneous

exploration of multiple factors in the target QTL region and secondary genomic

regions of significant effect outside the QTL. A potential flaw in this approach is lack of

statistical power to differentiate closely linked, moderate effect QTL in the relatively small

RCNIL fine mapping populations. The full set of RCNILs was used to map QTL for the

krn and culm diameter traits in order to maximize our potential power to differentiate be-

tween tightly linked factors. In total, 75 lines were used in mapping of the culm5.1 QTL

(67 recombinant RCNILs and 8 homozygous maize and teosinte controls). The krn5.2

QTL was mapped with 92 lines, all of which were recombinant chromosome lines. QTL

mapping was conducted using the R/qtl package [45] with genetic maps calculated using

the Kosambi mapping function with 0.001 error rate. Ten thousand permutations of the

data were used to define a significant QTL threshold. QTL were mapped using a step-wise

model based approach where QTL were added to a model one-by-one using the addqtl,

fitqtl, and refineqtl functions of R/qtl until no more significant QTL were detected. In

addition to using detected QTL in the model, the founding HIF was used as an additive

covariate to account for variation caused by fixed non-segregating regions of the genome

that differ between HIFs and were removed during the manual curation of GBS genotypes. Details of

the step-wise QTL mapping method are available in chapter 1 methods.
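
The permutation approach to setting a significance threshold can be illustrated outside of R/qtl. The Python sketch below is not the R/qtl implementation: it uses a simple single-marker LOD (a normal model with one mean per genotype class compared against a single grand mean), shuffles phenotypes relative to genotypes, records the maximum LOD across markers for each shuffle, and takes an upper quantile of those maxima. All function names, the default of 1,000 permutations, and the fixed random seed are assumptions for illustration, and genotypes are assumed to be fully observed.

```python
import math
import random
from typing import List, Sequence


def marker_lod(pheno: Sequence[float], geno: Sequence[str]) -> float:
    """LOD for one marker: one mean per genotype class versus a single grand mean."""
    n = len(pheno)
    grand = sum(pheno) / n
    rss_null = sum((y - grand) ** 2 for y in pheno)
    rss_alt = 0.0
    for g in set(geno):
        ys = [y for y, x in zip(pheno, geno) if x == g]
        class_mean = sum(ys) / len(ys)
        rss_alt += sum((y - class_mean) ** 2 for y in ys)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variation to explain
    if rss_alt == 0.0:
        return float("inf")  # genotype classes explain the phenotype exactly
    return (n / 2.0) * math.log10(rss_null / rss_alt)


def permutation_threshold(
    pheno: Sequence[float],
    markers: Sequence[Sequence[str]],
    n_perm: int = 1000,
    alpha: float = 0.05,
    seed: int = 1,
) -> float:
    """Upper (1 - alpha) quantile of the genome-wide maximum LOD under permutation."""
    rng = random.Random(seed)
    shuffled = list(pheno)
    max_lods: List[float] = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the phenotype-genotype association
        max_lods.append(max(marker_lod(shuffled, g) for g in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return max_lods[index]
```

Taking the maximum LOD over all markers within each permutation is what makes the threshold genome-wide rather than per-marker.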

2.4 Results

2.4.1 RCNIL generation and phenotype least squared means

I screened 4,180 total individuals from the four founding HIFs for recombinant chro-

mosomes in the summers of 2009 to 2011. In total 67 and 92 recombinant individuals

were identified and turned into RCNILs in the 1.5 LOD support intervals of culm5.1 and

krn5.2/eard5.3, respectively. The vast majority (3,230 of 4,180) of screened individuals

came from HIFs intended for study of the culm diameter QTL. This large number

of individuals was required due to the presence of the centromere in the middle of the

target QTL region, which greatly reduced recombination rate and limited the number of

recombinant individuals.

Figure 2.1: Histograms of least squared means for the culm diameter and kernel row number phenotypes. Distribution of least squared means is approximately normal for the culm diameter least squared means. The kernel row number counts have a noticeable left skew. Average least squared mean for homozygous teosinte and maize RCNILs (designated by solid and dashed lines, respectively) have the expected relationship with the teosinte average always being the lower phenotypic value.

Three linear mixed models were used to analyze the phenotype data for kernel row

number and culm diameter. Each model was selected using a forward selection method in

which one variable was added to the model at a time until the model fit, as measured by

AIC, did not improve. When plotted as histograms, the least squared means of the various

RCNILs followed a roughly normal distribution for culm diameter, while the kernel row

number trait had a left skew. RCNILs homozygous for the maize and teosinte segment

showed the expected relationship with maize RCNILs having larger culm diameter and

more kernel rows (Figure 2.1).

2.4.2 PCR and GBS genotyping

Initial genotyping of the RCNIL homozygous recombinant chromosome genomic DNA

was carried out through traditional methods using PCR. In total I placed 18 markers on

75 RCNILs (including four maize and four teosinte control lines) for the culm diameter

QTL and five markers on 92 RCNILs (no maize or teosinte control lines) for the kernel

row number QTL (Table B.1). Only five markers were placed on kernel row number

RCNILs for two reasons. First, the kernel row RCNILs had recombination events in a

much smaller physical distance (17.78 Mb versus 112.45 Mb for the culm diameter QTL).

Second, we expected to obtain thorough genome-wide genotyping using GBS, which had

already been initiated for the krn RCNILs.

Figure 2.2: GBS genotypes for kernel row number RCNILs. Thirteen regions across the genome are segregating in the kernel row number RCNILs. The primary QTL of interest is located in the 5b region where all RCNILs have crossover events. Secondary segregating regions in only one of the two founding HIFs are clearly visible for several genomic regions, for example chromosome 8c segregates in HIF MR0841 but not MR0818. Figure is scaled to marker, so each unit of length represents a single marker.

Of the nearly one million original SNP calls, only about 5% appeared to be segregat-

ing in a biallelic manner. The end genotyping resulted in zero segregating markers on

chromosomes one, two, four, and six. The structure of the founding HIFs implies each in-

dependent region of the genome that was heterozygous segregates independently of other

regions. To account for this, each segregating region was assigned its own linkage group

(5a, 5b, etc.) for QTL mapping so that non-segregating segments between heterozygous

regions would not influence the results. Overall, 522 markers in 13 linkage groups were

segregating across the six other chromosomes (Figure 2.2).

2.4.3 QTL fail to segregate as Mendelian traits

RCNILs were sorted by the phenotype least squared mean from least to greatest and we

looked for distinct maize and teosinte RCNIL groupings. There was not a clean segregation

of RCNILs into maize and teosinte classes for a single marker, suggesting multiple factors

within the primary QTL of interest or elsewhere in the genome are influencing the traits

of interest (Figure 2.3). The culm diameter trait came closer than the kernel row number

trait to clean segregation, especially for lines planted in the single plant plot.

An additional complication for both the culm diameter and kernel row number fine

mapping was the distinct difference between the grand mean of RCNILs derived from

different founding HIFs. For kernel row number, there was an average difference of ap-

proximately 1.8 kernel rows between RCNILs from different HIFs and the average rank

of HIFs differed by over 40. For culm diameter, the two HIFs differed by an average of

approximately 0.1 cm (Figure 2.4). Founding HIF was part of the linear mixed model used

to produce least squared means, but obviously the model failed to fully correct for differ-

ences between founding HIFs. With this in mind, HIF was used in subsequent mapping

methods to further account for differences caused by the starting HIFs.

Figure 2.3: RCNILs sorted by phenotype from least to greatest. Genotypes for RCNILs are indicated on the left by green (teosinte), yellow (maize), grey (heterozygous), or white (N) with least squared means as barplots on the right. (A) Culm diameter least squared mean from the twelve plant rows. (B) Culm area as measured in the single plant plots. (C) Kernel row number counted from twelve plant rows. A single causative gene should lead to segregation as a Mendelian locus when sorting RCNILs by phenotype. This was not seen and the genotypes appear more or less random, suggesting multiple factors influencing phenotype in the RCNILs.

Figure 2.4: Density plots of the culm diameter and kernel row number phenotypes grouped by founding HIF. Distinct differences between distributions are visible between the two founding HIFs for culm diameter (both in the (A) single plant plot and (B) twelve plant row designs) as well as for the (C) kernel row number phenotypes. The overall phenotype means for each HIF are designated by the dashed line for the red HIF and the solid line for the blue HIF.

2.4.4 Multiple factors contribute to culm diameter and kernel

row number

QTL mapping was performed using least squared means as phenotypes and merged geno-

types from GBS and PCR methods. Since a limited number (17) of culm diameter RCNILs

were genotyped with GBS, we used PCR markers only for culm diameter mapping and

consequently QTL were only mapped in the primary segregating region of interest on

chromosome five between 59.6 Mb and 144.8 Mb. In contrast, all 92 RCNILs generated

for fine mapping of the kernel row number phenotype were genotyped by GBS allowing

for full accounting of QTL in genomic locations away from the primary QTL of interest.

A single QTL was detected for the culm diameter trait, suggesting a single factor could

be responsible for culm5.1 (Figure 2.5). However, there was a very significant founding

HIF effect in the QTL mapping model (Table 2.2). This founding HIF effect (F-test, p

< 8.59e-10) suggests secondary sites in the genome are still at play and could explain the

inability to observe clean, simple segregation of RCNILs based on genotype in the QTL of

interest. While mapping of a single QTL for culm diameter is encouraging, the relatively

weak QTL LOD score (5.1) and small additive effect (-0.035) in comparison with the HIF

LOD (8.7) and effect (0.098) tells us that secondary sites are more important contributors

to culm diameter than the QTL we were seeking to fine map.

Results for kernel row number QTL mapping are not particularly comparable to the

culm diameter results due to the inclusion of full genome genotypes, which extended

the mapping to segregating sites elsewhere in the genome. Four QTL were detected

(Figure 2.5), two in the primary region of interest on chromosome five with a single QTL

each detected on chromosomes seven and ten (Table 2.2). Unsurprisingly, the founding

HIF once again had a very significant effect (F-test, p < 2e-16). The two QTL in the target

region had the highest LOD and additive effect of mapped QTL. Like culm diameter, the

HIF effect had the overall highest LOD score and effect.

Figure 2.5: QTL LOD profiles for fine mapping of culm diameter and kernel row number traits. QTL are color coded and labeled as “chromosome@position”. So the highest LOD score kernel row number QTL (5b@7.0) should be read as QTL on chromosome 5b at position 7.0. (A) Culm diameter LOD profile for the single QTL detected in the primary mapping region. LOD score (y-axis) versus map position in centimorgans is shown. (B) Kernel row number LOD profiles for four detected QTL, two in the primary region of interest on chromosome 5b. Secondary QTL on chromosomes 7b and 10a have lower LOD score and effect size than the 5b QTL. In addition to significant QTL, a highly significant HIF effect with high LOD score (culm = 8.688, krn = 19.787) was also included in QTL models for both traits.

Table 2.2: Detected QTL and HIF effects including LOD, percent variation explained, and additive effect.

Name          LOD       Var. Explained (%)    Add. Effect

krn5b.1       10.161     7.28                 -0.413
krn5b.2       13.309    10.40                 -0.387
krn7b.1        4.758     2.95                  0.264
krn10a.1       5.391     3.40                 -0.114
krn HIF       19.787    18.59                 -1.550

krn model     44.127    89.02                 n/a

culm5.1        5.126    18.73                  0.0975
culm HIF       8.688    35.69                 -0.0353

culm model    11.081    49.36                 n/a

2.5 Discussion

2.5.1 The complex genetic architecture of culm and kernel row

number

Efforts to identify causative factors underlying QTL have recently been met with great

success in maize. These successful studies have identified genes contributing to loss of

prolificacy [4], day length neutrality [50, 62], liberation of the kernel from its fruitcase [5],

and apical dominance [3]. Our study set out to contribute to this growing list of genes

by examining domestication QTL affecting important traits on the fifth chromosome.

Unfortunately, we were unable to identify a single gene contributing to the domestication

traits of culm diameter and kernel row number. Instead, we found evidence of multiple

factors on chromosome five and other chromosomes controlling kernel row number and

culm diameter suggesting the underlying genetic architecture for these traits is quite

complex.

Prior analyses identified domestication QTL for culm diameter and kernel row number

spanning the fifth chromosome from 54.4 to 151.7 Mb and 166.6 to 169.2 Mb, respectively.

Using QTL mapping with fine mapping RCNILs produced mixed results. The culm5.1

QTL was further refined to a much smaller region from 83.74 Mb to 86.26 Mb on the fifth

chromosome, a reduction in size to ∼2.5 Mb from an initial 1.5 LOD support interval of

close to 100 Mb. Unfortunately, the RCNILs used to fine map krn5.2 resulted in multiple

causative QTL on the fifth and other chromosomes while also harboring major differences

between founding HIFs. The fine mapping QTL for kernel row number closest to the

original target region was located from 160.7 Mb to 163.94 Mb on the fifth chromosome,

shifted upstream of the original interval by approximately 3 Mb. It is interesting that

the largest LOD score QTL from chapter 1 moved and fractionated into multiple factors,

while the comparably smaller LOD score and effect size QTL culm5.1 was narrowed to

an interval ∼2.5% the size of the original interval. In terms of number of genes in the 1.5

LOD support intervals, the fine mapping region for culm diameter had a total of 40 genes

and the kernel row number QTL had 63 genes. While this fell short of the ultimate goal

of a single gene, a small enough number of genes are in the confidence intervals to begin

looking for interesting candidate genes.
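
The interval sizes quoted in this paragraph follow directly from the reported coordinates, as the short calculation below shows. The variable and function names are illustrative only; the coordinates are those given in the text.

```python
def interval_size_mb(start_mb: float, end_mb: float) -> float:
    """Length of a physical interval in megabases."""
    return end_mb - start_mb


# culm5.1: original 1.5 LOD support interval versus fine-mapped interval
culm_original_mb = interval_size_mb(54.4, 151.7)
culm_fine_mb = interval_size_mb(83.74, 86.26)
culm_fraction = culm_fine_mb / culm_original_mb

# krn5.2: distance between the end of the fine-mapped interval (163.94 Mb)
# and the start of the original interval (166.6 Mb)
krn_shift_mb = interval_size_mb(163.94, 166.6)
```

The fine-mapped culm5.1 interval works out to about 2.5 Mb, roughly 2.6% of the original interval of about 97 Mb, and the kernel row number interval ends a little under 3 Mb upstream of where the original interval began.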

The forty genes in the culm diameter QTL were characterized by looking at functional

annotation, expression results from chapter 3 of this thesis, and inclusion in selection

features from a recent genome-wide population genetics scan in maize [55]. In terms of

protein functional annotations, these genes had a variety of biological functions such as

nucleases, transmembrane proteins, metabolic enzymes, chlorophyll binding proteins, and

a number of transcription factors. Gene regulatory differences from the allele specific

RNAseq experiment gave results for 21 of the 40 genes. Seven genes were classified as

having a significant cis regulatory change. However, none of these seven genes were part of

the final filtered candidate gene list. Eight of the forty genes were also inside domestication

selection features, suggesting genes in the culm QTL were under positive selection during

maize domestication. While evidence points to differential gene expression and selection

on the genes in the culm5.1 QTL, no single gene has multiple lines of supporting evidence.

Genes in the highest LOD score kernel row number QTL were also examined for

interesting candidates. Like the culm5.1 QTL, genes in the kernel row QTL had many

different functions including ubiquitin association, ribosomal proteins, nucleases, nuclear

transporters, and several transcription factors. Of the 63 total genes, 36 were not assessed

by the chapter 3 RNAseq experiment. The majority of the assayed genes (24) were

not on filtered candidate gene lists for cis regulatory change; however, three genes were.

The most interesting candidate is an armadillo repeat containing protein with a U-box

domain. Armadillo proteins were first characterized in fruit fly and are implicated in

a number of functions including intracellular signaling and cytoskeletal regulation. The

U-box family of proteins is a class of ubiquitin-protein E3 ligases. While there is evidence

for positive selection during maize domestication in the krn QTL, none of the genes with

cis regulatory change show signs of positive selection, leaving no ideal candidate for the

kernel row number QTL defined in our fine mapping experiment.

This work provides a cautionary note for researchers looking to identify causal genes

for QTL. In this study we set out to identify the causative gene underlying two QTL, a

large effect and LOD score QTL with a narrow confidence interval and a moderate effect

and LOD score QTL with a larger support interval. Contrary to expectations, we were

actually more successful in narrowing the QTL region for the weaker effect QTL, while

the high LOD kernel row number QTL shifted positions slightly and was influenced by

multiple factors. We show that the inheritance of genetic factors influencing kernel row

number on chromosome five is quite complicated and that a previously mapped high

LOD score QTL fractionates into multiple linked factors. The lower LOD score QTL for

culm diameter actually resulted in the better fine mapping result with a greatly reduced

confidence interval.

2.5.2 Future work on chromosome five QTL

The fifth chromosome of maize has been implicated as a major contributor to maize do-

mestication in several studies [8, 25]. QTL for the kernel row number and ear diameter

traits are of particular interest due to their large effect, high LOD score, and obvious link

to desirable domestication phenotypes. The work in this thesis shows that ear diameter

and kernel row number fractionate into multiple linked QTL on the fifth chromosome.

Evidence from fine mapping and chapter 1 of this thesis put the kernel row number QTL

between 160.7 and 169.2 Mb. Unfortunately, this region contains over 100 genes and we

could not identify a single highly attractive candidate gene based on gene annotation, ex-

pression profiles, and scans for selection. Even though these efforts were not met with full

success, the importance of chromosome five to domestication traits (kernel row number

and ear diameter in particular) cannot be overstated and future studies looking at this

chromosome are inevitable.

To aid future studies on these QTL, insight can be taken from this work to maximize

the chances of success. I believe there are two primary insights that would be useful

to future researchers in this endeavor. First, the uniformity of the genetic background

on chromosome five and other chromosomes appears to be of critical importance. The

founding HIFs taken from a BC2S3 population in this experiment proved to have a com-

plicated background with multiple secondary segregating sites that caused problems when

mapping in the target QTL interval. Consequently, a more advanced population would

be desired. Second, distinct differences between founding HIFs were detected for both the

culm diameter and kernel row number QTL, suggesting comparison of RCNILs generated

from different HIFs could be misleading. Either designing the experiment to draw on a

single founding HIF to avoid this issue or accounting for HIF in analysis of the pheno-

type data will be important. In spite of accounting for founding HIF in the linear mixed

models, I still observed a large difference in kernel row number between founding HIFs

suggesting simplification of the experiment to use a single HIF may be the best design.

The use of more extensive backcrossing and generation of RCNILs from a single found-

ing HIF will allow for an overall more isogenic genomic background with minimal seg-

regation outside of the desired region. Drawing starting HIFs from the BC6S6 NIRIL

population from chapter 1 is an easy and logical way to do this. Additionally, the kernel

row number QTL is already confirmed in the population. Towards this end, we have

started the crosses necessary to produce a new population of segregating RCNILs from

several of the lines in the mapping population from chapter 1. These RCNILs will be used

in future field trials for a new and improved fine mapping attempt of the highly important

kernel row number and ear diameter phenotypes on chromosome five.

Chapter 3

The role of cis regulatory evolution

in maize domestication

3.1 Abstract

Gene expression differences in divergent lineages caused by modification of cis regulatory

elements are thought to be a critically important process in the evolution of species.

In this study, we assay genome-wide cis and trans regulatory differences between maize

and its wild progenitor, teosinte, using deep RNA sequencing in F1 hybrid and parent

inbred lines. Three tissues were sampled and approximately 70% of ∼17,000 genes showed

evidence of allele specific expression. Approximately 1,000 of these genes show consistent

cis differences among the sampled maize and teosinte lines, of which ∼70% are specific to a

single tissue. The number of genes with cis regulatory differences is greatest for ear, which

underwent a drastic transformation in form during domestication. Genes with cis effects

were also under positive selection during maize domestication and improvement more often

than expected by chance. Over all genes, maize was shown to possess less cis regulatory

variation than teosinte, a deficit that is greatest for genes with cis regulatory divergence.

We observed a directional bias where genes with cis differences favored higher expression in

maize, suggesting domestication led to a general upregulation of gene expression. Finally,

this work documents the cis and trans regulatory changes between maize and teosinte in

over 17,000 genes for three tissues.

3.2 Introduction

Changes in the cis regulatory elements (CREs) of genes with functionally conserved pro-

teins have been considered a key mechanism, if not the primary mechanism, by which the

diverse forms of multicellular eukaryotic organisms evolved [12, 13, 65].

Variation in CREs allows for the deployment of tissue specific patterning of gene expres-

sion, differences in developmental timing of expression, and variation in the quantitative

levels of gene expression. Furthermore, modification of CREs, as opposed to coding

sequence changes, is assumed to have less pleiotropy and consequently a lower chance of

being deleterious due to unintended consequences in secondary tissues. The importance

of CREs for the development of novel morphologies is supported by the growing catalog

of examples for which differences in CREs of specific genes between closely related species

contributed to the evolution of diversity in form and pigmentation patterning [66].

While compelling evidence for the importance of CREs in evolution has come from

mapping causative variants to CREs, additional evidence has been emerging from genomic

analyses. These analyses have shown that cis regulatory variation is abundant both within

[67–70] and between species [20, 21, 71]. Some studies have reported a bias such that genes

with cis differences between species or ecotypes often show preferential upregulation of

the alleles of one parent, possibly as a result of natural selection [21, 68, 72]. Consistent

with the proposal that cis differences are a key element of adaptive divergence, divergence

for cis regulation between yeast species is more often associated with positive selection

than trans divergence [20, 73].

Crop plants offer a powerful system for the investigation of evolutionary mechanisms

because they display considerable divergence in form from their wild progenitors, yet

exhibit complete cross-fertility with these progenitors [7, 36, 74]. QTL fine-mapping

experiments have provided multiple examples of changes in CREs that underlie trait

divergence between crops and their ancestors. These studies include examples in which

cis changes confer the upregulation of a gene during domestication [3], the downregulation

of a gene [14, 62], the loss of a tissue specific expression pattern [15], the gain of a tissue

specific expression pattern [4], and a heterochronic shift in the expression profile [16].

These diverse results suggest that changes in CREs offer a powerful means to fine-tune

gene expression to generate new plant morphologies.

Several genomic scale assays of gene expression differences between crops and their

ancestors have been performed, although the experimental designs used did not allow

the separation of cis and trans effects. These studies have shown that hundreds or even

thousands of genes have altered expression in crops as compared to their progenitors and

that genes with altered expression are more likely to show evidence for past selection than

genes with conserved expression [17–19]. The data suggest massive alterations in gene

expression profiles accompanied domestication. Work in cotton and maize shows a more

frequent upregulation of genes in the cultivated as compared to the wild parent; however,

whether this was due to cis or trans effects was not discernible [17, 18].

In this study, we used RNAseq to parse genome-wide expression differences between

maize and its progenitor, teosinte (Zea mays ssp. parviglumis), into cis and trans effects.

Three tissue types were assayed: immature ear, seedling leaf, and seedling stem. Approx-

imately 70% of the 17,000 genes assayed show evidence of regulatory divergence between

maize and teosinte. Over 1,000 genes show cis divergence that is highly consistent across

our sampled lines of maize and teosinte. For ∼70% of genes with consistent cis effects,

the cis effects are specific to just one of the three tissue types. The number of genes with

cis differences is greatest for the ear, which underwent a profound transformation in form

during domestication. Genes with cis regulatory differences between maize and teosinte

more frequently show evidence for positive selection associated with domestication than

do trans genes. Maize also possesses less cis regulatory variation than teosinte over all

genes, and this deficit in maize is greatest for genes with cis regulatory divergence from

teosinte. We observed a directional bias in that genes with cis differences more frequently

have upregulated expression of maize alleles over teosinte, although we cannot exclude

the possibility that this is an artifactual result. Finally, our data provide a catalog of cis

and trans regulatory variation for over 17,000 genes in three tissue types for maize and

teosinte.

3.3 Materials and Methods

3.3.1 Plant material, RNA preparation, and sequencing

Six maize inbred lines, nine teosinte inbred lines, and 29 of their 54 possible maize-teosinte

F1 hybrids were used in this experiment (Supplemental Table C.1). An average of 1.96

biological replicates (range 1 to 4) of each genotype were used. Plants were grown in

growth chambers with a 12 hour dark-light cycle for up to 6 weeks, after which they were

moved to a greenhouse. Fifty to 100 milligram samples of the immature ear, leaf, and

seedling stem were harvested for RNA extraction during this time. Leaf and seedling stem

(including the shoot apical meristem) tissue was collected at the v4 leaf stage. Single ears

from maize and F1 hybrid plants were collected when the ears weighed 50 to 100 milligrams

with silks just beginning to be visible. Teosinte ears were also collected when silks just

started to appear; however, due to the small size of teosinte ears, 7 to 16 ears (average of

11.27) from each plant were pooled to obtain ∼50 milligrams of tissue. These three tissue

types will from here on be referred to as the ear, leaf, and stem tissues.

Total RNA was extracted from the plant tissues using a standard TRIzol protocol. To-

tal RNA was then quantified by spectrophotometer and normalized to 1 µg/µL in nuclease

free water. Starting with 5 µg total RNA, we generated polyA selected, strand specific,

barcoded RNAseq libraries with a previously published protocol using a five minute frag-

mentation time and 12 PCR amplification cycles [75]. Library adapters used barcode

sequences of four and five base pairs (Supplemental Table C.2) designed to balance per-

cent nucleotide composition within the first five base pairs of sequence reads and to have

at least two base pair differences from any other barcode. RNAseq libraries were then

pooled in groups of 14 (F1s) or 15 (parents), and the pooled libraries sequenced on one

lane (parents) or two lanes (F1s) of an Illumina HiSeq2000 sequencer at the University of

Wisconsin Biotech Center.

3.3.2 Bioinformatics

A pipeline was developed to quantify gene expression in F1 hybrid and parental inbred

lines using the RNAseq reads. The pipeline, based on work by Wang et al. [76], has two

main steps: (1) construction of a pseudo-transcriptome for each parent line from the B73

reference genome and polymorphisms derived from non-B73 genomic paired-end reads

and (2) alignment of RNAseq reads to the pseudo-transcriptomes followed by evaluation

of read depth at segregating sites.

Pseudo-transcriptomes were constructed using the B73 reference genome (version

AGPv2) and transcriptome (version ZmB73_5a_WGS) plus an average of 403.1 million

(17.5X coverage) paired-end genomic sequencing reads from each of the other 14 inbred

lines (Supplemental Table C.3). For each of the 14 non-B73 inbreds, paired-end genomic

sequencing reads were aligned to the reference genome with the BWA aligner (version

0.5.9) [77]. Only uniquely mapping reads with up to two mismatches were used to limit

false polymorphism detection due to paralogous read alignment. Segregating sites from

single nucleotide polymorphisms (SNPs) and small insertion or deletion (indel) polymor-

phisms were called using the GATK package (version 1.0.5588) [78, 79] and filtered to

include only polymorphisms that were homozygous in the inbred with read depth of at

least 4X. A strand bias filter was also applied to ensure that the polymorphism was de-

tected on both the plus and minus strand. Polymorphisms surviving these filters were

then inserted into the reference B73 transcriptome to make a pseudo-transcriptome for

each parent.
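
The filtering and substitution steps described above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the `Variant` record, its field names, the 0-based coordinates, and the restriction to SNPs within a single transcript sequence are assumptions made for the example (indels would additionally require shifting downstream coordinates).

```python
from dataclasses import dataclass

@dataclass
class Variant:
    pos: int           # 0-based position in the transcript (assumed convention)
    ref: str           # reference (B73) base
    alt: str           # alternate base called in the non-B73 inbred
    depth: int         # genomic read depth at the site
    homozygous: bool   # True if the inbred is homozygous for the alternate base
    plus_reads: int    # supporting reads on the plus strand
    minus_reads: int   # supporting reads on the minus strand

def passes_filters(v, min_depth=4):
    """Keep homozygous calls with depth >= 4X that are seen on both strands."""
    return (v.homozygous and v.depth >= min_depth
            and v.plus_reads > 0 and v.minus_reads > 0)

def build_pseudo_transcript(reference, variants):
    """Substitute the filtered SNPs into the reference transcript sequence."""
    seq = list(reference)
    for v in variants:
        if not passes_filters(v):
            continue
        if seq[v.pos] != v.ref:
            raise ValueError(f"reference mismatch at position {v.pos}")
        seq[v.pos] = v.alt
    return "".join(seq)

reference = "ACGTACGT"
variants = [
    Variant(1, "C", "T", depth=10, homozygous=True, plus_reads=6, minus_reads=4),
    Variant(4, "A", "G", depth=3, homozygous=True, plus_reads=2, minus_reads=1),    # too shallow
    Variant(6, "G", "A", depth=12, homozygous=True, plus_reads=12, minus_reads=0),  # one strand only
]
pseudo = build_pseudo_transcript(reference, variants)
print(pseudo)
```

In this toy input only the first variant survives all three filters, so a single base of the reference sequence is changed.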

For each of the 29 maize-teosinte pairs, a robust set of segregating sites was determined

by comparing the pseudo-transcriptomes of the two parents and taking the sites where: the

two parental alleles differed, coverage in genomic read alignment was at least four for both

parents within the read length (88bp) of the site, and no heterozygous polymorphisms

were detected in genomic read alignments of the two parents within the read length of

the site.

RNAseq reads from each F1 hybrid and each corresponding pair of inbred parents

were then aligned to the combined pseudo-transcriptomes of the two parents (in the case

of the B73 parent, the B73 reference transcriptome was used) using the Bowtie aligner

(version 0.12.7) [80]. Allele specific expression was assessed by counting depths of reads

originating from each parent at segregating sites (determined as described above). Since

only perfect alignments were allowed, assignment of reads to parents was straightforward

(a read from a given parent could only align to this parent’s allele at a segregating site).

3.3.3 Maize:teosinte gene expression ratios

We calculated F1 hybrid and parent maize:teosinte expression ratios for each gene for

each of the 29 individual F1 hybrid comparisons. The F1 expression ratio for individual

F1s (e.g. B73 x TIL01) was calculated as the number of maize reads to the number of

teosinte reads summed over all segregating sites in the gene. The parent expression ratio

for individual F1 comparisons was calculated as the number of reads for the maize parent

(e.g. B73) to the number of reads for the teosinte parent (e.g. TIL01) summed over all

segregating sites in the gene after correcting for any difference in the total number of

reads between the two parent lines. The result of these calculations was a set of 29 matched

F1 and parent ratios of read counts for each gene. For example, for the B73 x TIL01

comparison at a single gene, the F1 and parent maize:teosinte ratios could be 52:56 and

34:30, respectively.
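
As a rough sketch of how the per-cross read counts behind these ratios could be assembled, consider the example below. The function names and input layout are hypothetical. The text states only that parent counts were corrected for differences in total read number, so the specific correction shown here (rescaling the teosinte parent to the maize library size) is an assumption.

```python
def f1_counts(sites):
    """Sum maize and teosinte allele reads over a gene's segregating sites.

    `sites` is a list of (maize_reads, teosinte_reads) tuples, one per site.
    """
    maize = sum(m for m, _ in sites)
    teosinte = sum(t for _, t in sites)
    return maize, teosinte

def parent_counts(maize_sites, teosinte_sites, maize_total, teosinte_total):
    """Sum parental reads over segregating sites, correcting for library size.

    Assumption: the teosinte count is rescaled to the maize library size.
    """
    maize = sum(maize_sites)
    teosinte = sum(teosinte_sites) * maize_total / teosinte_total
    return maize, teosinte

# Toy numbers chosen to reproduce the 52:56 and 34:30 ratios used in the text.
f1 = f1_counts([(20, 25), (32, 31)])
parents = parent_counts([14, 20], [18, 18],
                        maize_total=10_000_000, teosinte_total=12_000_000)
print(f1, parents)
```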

We also calculated F1 hybrid and parent maize:teosinte expression ratios for each

gene summed over all F1 hybrid comparisons by pooling the read depth values for the 29

F1 hybrids and their parents, respectively. To calculate the overall F1 expression ratio,

the maize and teosinte read counts from the F1 hybrids were simply summed over all

segregating sites in a gene and across all hybrids. The calculation of the overall parent

expression ratio required weighting. The weighting was necessary to avoid counting the

parent reads multiple times for each of the F1 hybrids in which it was a parent and to

compensate for the fact that different parents had variable total numbers of reads. Only

genes with a read depth of at least 100 in both the F1 and its parent were included. The

result of these calculations was an overall F1 and parent ratio of read counts for each gene.

For example, for a gene, the overall F1 and parent maize:teosinte ratios could be 804:796

and 123:130, respectively.

3.3.4 Testing for cis and trans effects

The combination of F1 hybrid and parent inbred expression data allows us to estimate

both the cis and trans effects on gene expression. For the F1 hybrids, the maize and

teosinte alleles at each gene are in a common trans cellular environment, and thus any

deviation of the maize:teosinte F1 expression ratio from 1:1 represents purely cis effects.

By contrast, the maize:teosinte parent expression ratio is a combination of the cis and

trans effects and any deviation of this ratio from 1:1 reflects the combined cis plus trans

effects. Therefore, the trans effects can be estimated by subtracting the F1 hybrid ratio

(cis) from the parent ratio (cis plus trans).
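
The subtraction can be made concrete with a short sketch, assuming it is carried out on log2-transformed maize:teosinte ratios. The function name is hypothetical and the counts are the illustrative 52:56 (F1) and 34:30 (parent) values from the previous section.

```python
import math

def cis_trans_effects(f1_maize, f1_teosinte, parent_maize, parent_teosinte):
    """Return (cis, trans) effects on the log2 scale.

    cis   = log2 maize:teosinte ratio in the F1 hybrid
    trans = log2 parent ratio (cis plus trans) minus the cis effect
    """
    cis = math.log2(f1_maize / f1_teosinte)
    total = math.log2(parent_maize / parent_teosinte)
    return cis, total - cis

cis, trans = cis_trans_effects(52, 56, 34, 30)
print(f"cis = {cis:.3f}, trans = {trans:.3f}")
```

With these counts the cis estimate is slightly negative (the teosinte allele is favored in the F1) while the trans estimate is positive, so the two effects act in opposite directions.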

Maize and teosinte gene expression as measured by the read depth counts at genes were

used for statistical testing of cis and trans effects. Significant cis and trans effects were

determined using binomial and Fisher’s Exact Tests as described in McManus et al. [21].

In brief, two binomial tests were used to identify genes with maize:teosinte expression

ratios significantly different from 1:1 in the F1 hybrid and parent comparisons. Genes

with an expression ratio significantly different from 1:1 for the F1 hybrid and/or parent

comparison were then subjected to a Fisher’s Exact Test to determine if the parent and F1

hybrid maize:teosinte expression ratios were different from one another. A false discovery

rate (FDR) of 0.5%, controlled using Storey’s q-value [81], was applied to compensate for

the large number of statistical tests being performed. The combination of the two binomial

tests and Fisher’s Exact Test allowed us to classify each gene into one of seven different

regulatory categories (Table 3.1) as described in McManus et al. [21].

Table 3.1: Regulatory category as defined by significant (Sig.) or not significant (Not Sig.) binomial tests (BT) and Fisher’s Exact Tests (FET).

Category       Parent BT   Hybrid BT   FET        Favored allele?
Cis            Sig.        Sig.        Not Sig.   -
Trans          Sig.        Not Sig.    Sig.       -
Cis + Trans    Sig.        Sig.        Sig.       Same
Cis x Trans    Sig.        Sig.        Sig.       Opposite
Compensatory   Not Sig.    Sig.        Sig.       -
Conserved      Not Sig.    Not Sig.    -          -
Ambiguous      All other patterns of significant or not significant
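
The decision rules in Table 3.1 can be written as a small lookup. This is a sketch rather than the code used for the analysis: the function name and arguments are hypothetical, the three significance flags are assumed to have been computed already (after q-value correction), and the "favored allele" column is interpreted as whether the log2 cis and trans effects share a sign.

```python
def classify_regulation(parent_sig, hybrid_sig, fet_sig, cis=0.0, trans=0.0):
    """Assign a regulatory category following the pattern in Table 3.1.

    parent_sig, hybrid_sig: binomial tests significant in parents / F1 hybrids.
    fet_sig: Fisher's Exact Test significant (parent and F1 ratios differ).
    cis, trans: log2 effects, used only to decide whether both effects favor
    the same allele (same sign) or opposite alleles.
    """
    if not parent_sig and not hybrid_sig:
        return "Conserved"  # the FET is not performed for these genes
    if parent_sig and hybrid_sig and not fet_sig:
        return "Cis"
    if parent_sig and not hybrid_sig and fet_sig:
        return "Trans"
    if not parent_sig and hybrid_sig and fet_sig:
        return "Compensatory"
    if parent_sig and hybrid_sig and fet_sig and cis * trans != 0:
        return "Cis + Trans" if cis * trans > 0 else "Cis x Trans"
    return "Ambiguous"  # every other pattern

print(classify_regulation(True, True, False))
print(classify_regulation(True, True, True, cis=0.8, trans=-0.3))
```

Genes returned as "Cis" or "Cis + Trans" correspond to the combined CCT group used in the analyses that follow.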

3.3.5 Candidate genes

Genes whose expression level was the direct target of selection during maize domestication

are expected to show a maize:teosinte cis expression ratio that is significantly different

from 1:1. These genes can fall into either the cis only (C) or cis plus trans (CT) groups in

Table 3.1 as determined by the binomial and Fisher’s Exact Tests. We call this combined

group CCT genes and they are the differential expression candidates that are the focus

of many of our analyses.

The list of CCT genes from the overall test was large (5,609 ear; 5,392 leaf; 5,426 stem;

see results). The large number of CCT genes reflects the considerable statistical power

to detect slight overall expression biases given that some genes had thousands of reads

aligning to segregating sites. We observed significant maize:teosinte expression biases

as small as 1.0:1.02 in the overall tests. Such small differences seem unlikely to have

biological importance and genes showing these small differences are weak candidates for

genes with cis expression variation that is causal in maize domestication and improvement.

Therefore, we applied filters to identify candidates with the strongest and most consistent

regulatory differences.

To narrow down the CCT gene list to candidate genes that show the strongest evidence

for differential cis regulation between maize and teosinte, we applied two filters. (1) Genes

with the strongest evidence should not only fall in the CCT group for the overall test using

the pooled data from all 29 F1 hybrid comparisons, but should also be supported by data

from a large proportion of our sampled maize and teosinte parents. Thus, we filtered the

initial list of CCT genes for those with data from at least fifteen F1 hybrids that include

at least three different maize inbreds and five different teosinte inbreds. (2) Genes with

cis differences that contributed to maize domestication/improvement should not only

appear in the CCT list from the overall test, but should also show an expression bias whose

direction is highly consistent among the individual F1 hybrids. To classify CCT genes for consistency of directionality

of expression bias among the F1s, we partitioned the genes into groups with 100%, 90%

and 80% of F1s showing the same directionality. In calculating these percentages, we used

read depth for each F1 at the gene to weight the contribution of the F1s to the overall

percentage. We refer to the CCT genes with 100%, 90% and 80% consistent directionality

among the F1s as the A-list, B-list and C-list, respectively. For comparative purposes, we

made similar A, B and C lists of genes for the cis only or trans only classes.
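
A minimal sketch of the depth-weighted consistency score and list assignment is given below. The function names are hypothetical, and two details are assumptions not stated in the text: consistency is measured against whichever direction carries the most read depth, and the 90% and 80% thresholds are treated as inclusive lower bounds.

```python
def weighted_consistency(f1_data):
    """Depth-weighted fraction of F1s that share the majority direction.

    `f1_data` is a list of (log2_ratio, read_depth) tuples, one per F1 hybrid.
    """
    total = sum(depth for _, depth in f1_data)
    maize_up = sum(depth for ratio, depth in f1_data if ratio > 0)
    teosinte_up = sum(depth for ratio, depth in f1_data if ratio < 0)
    return max(maize_up, teosinte_up) / total

def assign_list(consistency):
    """Map a consistency score to the A-, B-, or C-list (None if below 80%)."""
    if consistency == 1.0:
        return "A"
    if consistency >= 0.9:
        return "B"
    if consistency >= 0.8:
        return "C"
    return None

# Three F1s: two favor maize (depths 100 and 200), one favors teosinte (depth 20).
score = weighted_consistency([(0.5, 100), (0.3, 200), (-0.1, 20)])
print(score, assign_list(score))
```

Because the contribution of each F1 is weighted by depth, a single low-coverage F1 with the opposite direction lowers the score only slightly.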

3.3.6 Proportion of cis variation in maize and teosinte

The existence of multiple cis regulatory regimes within maize and teosinte populations

is expected to manifest as variation in the expression ratios among F1 hybrids. We asked

whether cis expression variation among F1 hybrid ratios was more heavily influenced by

the maize or teosinte inbred parent. Since three teosinte inbreds (TIL05, TIL10, and TIL15)

were involved in only a single F1 each, the three F1s involving these inbreds were removed

from the data in order to balance the number of maize and teosinte inbred parents in the

dataset for this analysis. Genes were tested for variation among the F1 expression ratios

(cis variation) using a linear model. The log2(maize:teosinte) F1 expression ratio as the

dependent variable was fit to the maize (j=1 to 6) and teosinte (k=1 to 6) parents as the

independent variables. All models were fit on a gene-by-gene basis. Significant maize and

teosinte parent terms were identified with an F-test (p < 0.05) using the drop1 function

in R. The data for each F1 was weighted by its total depth at the gene to account for

different read-depths in the F1 hybrids.

3.3.7 Additive and dominant gene expression

One theory in domesticated systems states that genes responsible for rapid morpholog-

ical evolution are primarily loss of function (LOF) alleles [82]. In this scenario, a non-

domesticated allele would be dominant to the LOF domesticated allele. While there is

some support for this theory in rice diversification and improvement [83], recent QTL and

domestication gene cloning experiments present a more diverse collection of functional

gene changes [84]. In domesticated systems, the mode of inheritance for gene expression

in terms of additivity and dominance has yet to be explored.

Our dataset consisting of parent inbred and hybrid expression profiles gives the op-

portunity to address the LOF hypothesis in terms of gene expression on a genome-wide

scale. We calculated the additive effect, dominant effect, and dominant/additive (D/A)

ratio for each gene and maize-teosinte F1 hybrid comparison. The overall maize-teosinte

average D/A ratio was then calculated after exclusion of outlier F1 D/A ratios using the

Dixon method [85]. Genes were next classified as having overdominant (|D/A| > 1.25),

dominant (0.75 < |D/A| < 1.25), semi-dominant (0.25 < |D/A| < 0.75), or additive

(|D/A| < 0.25) gene action depending on the D/A ratio. Following calculation of overall D/A

ratios and assignment of gene action, we looked for patterns in D/A ratios and gene action

that support the LOF hypothesis [82]. Specifically, we looked for evidence of extensive

dominance of the teosinte (non-domesticated) allele for genes with trans only regulatory

change.
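
The gene action classes can be illustrated with a brief sketch. The text does not define the additive and dominant effects explicitly, so the conventional quantitative genetic definitions are assumed here: A is half the difference between the maize and teosinte parent values and D is the deviation of the F1 from the midparent value. The function names are hypothetical and the handling of values falling exactly on a threshold is also an assumption.

```python
def additive_dominance(maize, teosinte, f1):
    """Return (A, D) under the assumed conventional definitions."""
    midparent = (maize + teosinte) / 2
    additive = (maize - teosinte) / 2   # A: half the parental difference
    dominance = f1 - midparent          # D: F1 deviation from the midparent
    return additive, dominance

def gene_action(d_over_a):
    """Classify gene action from |D/A| using the thresholds given in the text."""
    magnitude = abs(d_over_a)
    if magnitude > 1.25:
        return "overdominant"
    if magnitude > 0.75:
        return "dominant"
    if magnitude > 0.25:
        return "semi-dominant"
    return "additive"

a, d = additive_dominance(maize=120.0, teosinte=80.0, f1=118.0)
action = gene_action(d / a)
print(a, d, action)
```

Under these assumed definitions the sign of D/A indicates which parent the F1 resembles: positive values place the F1 closer to the maize parent and negative values closer to the teosinte parent.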

3.3.8 CCT gene enrichment in various functional categories

We assessed whether CCT genes are over- or underrepresented in several categories as

compared to all genes or genes with conserved expression levels between maize and teosinte.

The categories we tested include transcription factors, several metabolic pathways, gene

ontology (GO) categories, selection candidates, and domestication QTL. A list of maize

transcription factors and their associated families was downloaded from the plant

transcription factor database [86]. Metabolic enzyme cDNA sequences for starch and lipid

metabolism pathways in maize were downloaded from the Kyoto Encyclopedia of Genes

and Genomes (KEGG) [87, 88] and matched with genes from the maize filtered gene set

(version 5b) by BLAST. Matches (single gene hit with percent identity greater than 95%)

were found for 370 out of 379 genes and used to test for enrichment of CCT genes in the

various metabolic pathways. Genes under positive selection during maize domestication

and improvement were taken from a recent genomic scan for selection [55]. We obtained a

list of QTL associated with maize domestication and improvement traits from Table A.1

in work by Shannon [25].

In general, we tested for enrichment or depletion of CCT genes in various categories

using Fisher’s Exact Tests on 2x2 contingency tables that parse genes by CCT and cate-

gory status. Statistical testing was first done for CCT-AB candidate genes and extended

to CCT-A and CCT-ABC lists if an interesting result presented itself. Additionally, there

were a few differences in this general approach depending on what category was being

analyzed. For QTL, we looked for enrichment of CCT genes among the genes within the

1.5 LOD support intervals for each trait separately and only included QTL whose 1.5

LOD support intervals were narrow enough to encompass 20 or fewer genes. For genes

under positive selection during domestication and improvement, we performed an addi-

tional three tissue union comparison where genes on any of the three tissue CCT lists

were considered a CCT candidate gene.
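
The general enrichment test can be sketched with the standard library alone. The gene identifiers and function names below are hypothetical, and the two-sided p-value uses the common convention of summing the probabilities of all tables no more likely than the observed one, which may differ in detail from the software used for the analyses reported here.

```python
from math import comb

def contingency_table(candidates, category, background):
    """Build the 2x2 table (a, b, c, d) from sets of gene identifiers."""
    candidates = candidates & background
    category = category & background
    a = len(candidates & category)    # CCT and in category
    b = len(candidates - category)    # CCT, not in category
    c = len(category - candidates)    # not CCT, in category
    d = len(background) - a - b - c   # neither
    return a, b, c, d

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact Test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at most as probable as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

background = {f"g{i}" for i in range(16)}
cct = {f"g{i}" for i in range(10)}
selected = {f"g{i}" for i in range(8)} | {"g10"}
table = contingency_table(cct, selected, background)
p_value = fisher_exact_two_sided(*table)
print(table, round(p_value, 4))
```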

One expectation for genes under selection for CREs is the signature of selection at

the CRE itself, upstream of the gene in question. Since there is no hard rule as to

how far upstream cis enhancer and repressor elements can function, we addressed this

expectation by looking at selection pressure at the transcriptional start site of genes. The

raw selection score, represented by cross population composite likelihood ratio (XPCLR)

[89], from Hufford et al. [55] served as a test statistic for this analysis. A three tissue

union comparison was made between all genes on CCT-AB lists and all genes identified

as conserved in the initial assay. Differences in the XPCLR score at the transcriptional

start site between conserved and CCT genes were tested with Kolmogorov-Smirnov tests

and t-tests to detect changes in the overall distribution and the mean, respectively.

Finally, we used the goseq package [90] in R [91] to test for GO term enrichment and

depletion in our CCT gene lists, using median gene length to adjust the reference in the

goseq analysis. The base background GO term reference consisted of genes for which

allele specific expression was assessed in 15 crosses, three unique maize, and five unique

teosinte inbred lines with a cumulative depth of 100 at segregating sites in F1 and parent

comparisons. GO terms occurring at least five times in the background reference were

tested for enrichment and depletion in the CCT-A, CCT-AB, and CCT-ABC gene lists

with p-values corrected for multiple testing using the Benjamini-Hochberg method [92].
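
For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be sketched in a few lines. This illustrates only the multiple testing correction: the function name is hypothetical, and the gene length adjustment performed by goseq is not reproduced here.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(p, 4) for p in adjusted])
```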

3.4 Results

3.4.1 RNAseq provides expression data for more than 17,000 genes per tissue

RNAseq data for seedling leaf, seedling stem (including the shoot apical meristem), and

immature ear from six maize inbreds, nine teosinte inbreds, and 29 of their 54 possible

F1 hybrids were used to examine variation in gene expression on a genome-wide scale. In

total, 259 RNAseq libraries were constructed from an average of 1.96 biological replicates

for each parent inbred and F1.

Overall, 996 million, 1.13 billion, and 1.21 billion F1 hybrid and 286 million, 283

million, and 276 million parent RNAseq reads were collected for ear, leaf, and stem tissue

types, respectively (Table 3.2). These reads were aligned to custom-made parent specific

pseudo-transcriptomes containing an average of 54,000 segregating sites (SNPs or small

indels) in each of the 29 maize-teosinte contrasts. Out of the reads from the F1 hybrids,

556 million, 670 million, and 716 million reads mapped to pseudo-transcriptomes in ear,

leaf, and stem tissue, respectively. For parent inbred line reads, 171 million, 170 million,

and 163 million mapped to the pseudo-transcriptomes (Table 3.2). Thus, approximately

the same percentage of reads (58.1% and 59.6%) mapped to pseudo-transcriptomes in

both the F1 hybrids and parent datasets with about 7.15% of the total reads mapping to

segregating sites in the individual F1 hybrids and their parents.

The RNAseq reads from the pooled data for all 29 F1 hybrids and 15 parents that

aligned to segregating sites in the transcriptomes represent 23,045, 23,434, and 23,792

genes for ear, leaf and stem tissues, respectively (Table 3.3). The union of these three

groups is 24,983 genes, which is 63% of the 39,423 genes from the maize filtered gene set

(version 5b). We applied a filter to this list, requiring a read-depth of 100 in both the

parent inbreds and F1 hybrids. This filter reduced the lists to 15,939, 15,925, and 16,018

genes in ear, leaf, and stem tissues, respectively. The union of these three groups is 17,575

genes or about 45% of the filtered gene set. There is a large degree of overlap among the

genes expressed in the three tissues. From the total list of 17,575 genes, 14,420 (82%) were

seen in all three tissues. Of the remaining genes, 1,467 are in some combination of two

tissues and 1,688 are in only a single tissue (Figure 3.1). All except 16 of these single or

two tissue genes were detected at a read depth below 100 in additional tissues. However,

for the 1,688 genes expressed in only single tissues at 100 read-depth, an average of 67.4%

of their reads come from the tissue with the most reads. For genes detected in all three

tissues at 100 read-depth, this value is only 46.9%. Thus, while very few of the 1,688

genes are absolutely tissue specific, this group of 1,688 genes shows greater differences in

expression among tissues than the 14,420 genes detected in all three tissues.

Figure 3.1: Overlap of genes assessed in the three tissues overall and in the CCT-AB gene list. Each compartment of the Venn diagram contains the tissue combination on top, number of genes overall in the middle, and number of genes from the CCT-AB gene list on bottom. CCT-AB overlap numbers marked by an “*” indicate significantly more overlap than expected by chance (permutation tests, p < 1e-5). In the overall analysis the vast majority of genes (82%) were assayed in all three tissues. While this percent is much smaller for the CCT-AB candidate gene list (∼7%), this is still more of an overlap than expected by chance. The much higher degree of overlap of CCT-AB genes than expected suggests some CREs act in multiple tissues. Additionally, there are also many single tissue CCT-AB genes, which points towards the many cis elements that appear to function in tissue specific patterns.

Table 3.2: Assignable RNAseq Read Counts from F1 hybrids and parents.

Read class               Tissue   F1 Hybrid Count   Parent Count    F1 Hybrid Percent   Parent Percent
Total Reads              Ear        996,210,711     286,233,926     -                   -
                         Leaf     1,133,517,167     282,553,096     -                   -
                         Stem     1,211,779,746     276,295,164     -                   -
Aligned Reads            Ear        556,387,109     171,185,368     55.85%              59.81%
                         Leaf       670,175,942     169,564,817     59.12%              60.01%
                         Stem       716,223,906     162,866,225     59.11%              58.95%
Segregating Site Reads   Ear         74,556,872      85,296,872a    7.48%               29.80%a
                         Leaf        72,995,272      78,878,805a    6.44%               27.92%a
                         Stem        91,355,219      78,583,423a    7.54%               28.44%a

a A higher number and percentage of reads map to segregating sites in parents due to each set of parent reads being used in multiple comparisons. In contrast each of the F1 comparisons can only map to segregating sites between two pseudo-transcriptomes.

Table 3.3: Genes for which RNAseq data was collected and expression was assayed.¹

Gene set                                          Ear      Leaf     Stem     Union
Genes with mapped RNAseq reads                    32,858   32,645   33,316   34,636
Genes with RNAseq reads and segregating sites     22,072   22,393   22,901   24,052
Overall Genes (filtered 100 depth)                15,939   15,925   16,018   17,575
  Total CCT genes                                  5,618    5,402    5,435   10,101
    Filtered CCT Genes (15 F1 + 3M + 5T)           4,770    4,490    4,601    8,398
      ABC-List CCT                                 1,545    1,288    1,371    3,018
        C-List CCT                                   990      843      940    2,314
        B-List CCT                                   512      424      404    1,036
        A-List CCT                                    43       21       27       69

¹ Only genes from the maize filtered gene set (version 5b) were considered.

3.4.2 Prolific regulatory variation characterized by relatively few consistent cis differences

We measured log2 of the ratio of maize to teosinte read counts in F1 hybrids (cis regu-

latory effect) and the parent log2 ratio (combined cis and trans regulatory effect). The

trans effect was estimated as the difference between the F1 and parent log2 ratios. Bi-

nomial and Fisher’s Exact Tests were used on read counts to determine whether these

ratios deviated from 1:1 and to assign genes to one of seven regulatory categories (Ta-

ble 3.1). In an overall maize versus teosinte comparison, about 69% of genes (69.27% ear,

74.27% leaf, and 63.82% stem genes) from the three tissues were classified as having some

combination of significant cis and/or trans regulatory effect (Figure 3.2). The remaining

genes were classified as having conserved (18.6%, 15.3%, and 20.7%) expression in maize

and teosinte or ambiguous (12.1%, 10.4%, and 15.5%) expression patterns. All three

tissues had similar proportions of genes falling into the different regulatory categories in

the overall maize-teosinte comparison (Ear: Figure 3.2, Leaf: Supplemental Figure C.1,

Stem: Supplemental Figure C.2).
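The decomposition described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the function name, argument names, pseudocount, and the sign convention for the trans effect (parent ratio minus F1 ratio) are assumptions made for the example.

```python
import math

def regulatory_effects(f1_maize, f1_teosinte, parent_maize, parent_teosinte,
                       pseudocount=1.0):
    """Return (cis, cis_plus_trans, trans) log2 effects for one gene.

    The pseudocount is an assumption added here to avoid log(0);
    it is not taken from the text.
    """
    # cis effect: maize/teosinte allelic ratio measured within F1 hybrids
    cis = math.log2((f1_maize + pseudocount) / (f1_teosinte + pseudocount))
    # combined cis + trans effect: maize/teosinte ratio between inbred parents
    total = math.log2((parent_maize + pseudocount) /
                      (parent_teosinte + pseudocount))
    # trans effect: the part of the parental difference not explained by cis
    # (sign convention assumed: parent ratio minus F1 ratio)
    trans = total - cis
    return cis, total, trans

# Hypothetical read counts for a single gene
cis, total, trans = regulatory_effects(300, 100, 800, 100, pseudocount=0.0)
```

With these hypothetical counts, the parents differ 8-fold (log2 ratio of 3) while the F1 alleles differ only 3-fold, so the remainder is attributed to trans.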

We asked what proportion of regulatory divergence between maize and teosinte was

due to cis effects by calculating the ratio |cis|/(|cis| + |trans|) [21]. Over all genes, cis

effects account for 45%, 42% and 47% of regulatory divergence for ear, leaf and stem

tissue, respectively (Supplemental Table C.4). We further asked the relative contribution

of cis and trans in generating large expression differences by binning genes based on

overall expression difference between maize and teosinte (log2 parent ratio). This analysis

shows the magnitude of cis regulatory change is positively correlated with total divergence

in expression (Figure 3.3). At high degrees of expression divergence between maize and

teosinte (log2 change of 5 or more), over 75% of the divergence is due to cis. Thus, large

expression differences appear to be caused primarily by differences in cis regulation

as opposed to trans.
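The ratio |cis|/(|cis| + |trans|) and the binning by total divergence can be illustrated with a short sketch. The function names are hypothetical, and binning on the absolute value of the parent log2 ratio is an assumption; the bin edges follow those listed for Figure 3.3.

```python
def proportion_cis(cis, trans):
    """Fraction of regulatory divergence attributed to cis: |cis|/(|cis|+|trans|)."""
    denom = abs(cis) + abs(trans)
    if denom == 0:
        return None  # no divergence to apportion
    return abs(cis) / denom

def divergence_bin(parent_log2_ratio):
    """Assign a gene to a total-divergence bin (0-1, 1-2, ..., 4-5, 5+).

    Binning on the absolute value is an assumption of this sketch.
    """
    magnitude = abs(parent_log2_ratio)
    if magnitude >= 5:
        return "5+"
    lower = int(magnitude)
    return f"{lower}-{lower + 1}"

# Hypothetical gene: cis effect of 3 log2 units, trans effect of 1 log2 unit
share = proportion_cis(3.0, 1.0)      # 0.75
label = divergence_bin(3.0 + 1.0)     # "4-5"
```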

A primary goal in this study was to identify genes with cis regulatory differences

between maize and teosinte. Such genes are candidates for being direct targets of selection

during maize domestication or improvement for altered gene expression. Genes selected

for regulatory differences would be in either the cis only or cis plus trans regulatory

categories. We designate this combined group CCT genes. We identified 5,618 ear, 5,402

leaf and 5,435 stem CCT genes in the overall analysis (Table 3.3). To narrow the list

of CCT genes to those with a broad degree of support, the list was filtered to include

only those assayed in at least 15 maize-teosinte F1s involving at least three maize and five

teosinte inbred lines. This filtering resulted in reduced lists of 4,770 ear, 4,490 leaf, and

4,601 stem CCT genes. The union of these three sets includes 8,398 genes.

Next, we asked if the 8,398 genes on the filtered CCT list from the overall analysis

have a consistent directionality in favor of the maize or teosinte allele in the individual F1

hybrids. The goal was to exclude CCT genes for which the significant overall cis effect was

Figure 3.2: Parent versus hybrid ear tissue allele specific expression ratios. The parent (x-axis) versus F1 hybrid (y-axis) allele specific expression ratios are plotted against each other. Regulatory category, defined by the combination of significant statistical tests as described in Methods, is designated by color. Proportion and count of genes falling into the various regulatory categories are also shown in the lower right hand corner barplot.

Figure 3.3: Proportion of expression divergence due to cis regulatory difference. The amount of total differential expression between the maize and teosinte parents due to the directly measured cis effect (F1 hybrid expression ratio) is shown with error bars depicting one standard error. Total divergence (parent expression ratio) was binned from 0-1, 1-2, 2-3, 3-4, 4-5, and 5+. Divergence due to cis effects increases with total divergence, suggesting large expression differences tend to be caused by cis rather than trans regulatory differences.

caused by a large expression bias in a minority or even one of the F1 crosses. We defined

three levels of consistency: groups A, B and C for which 100%, 90% and 80% of F1s

showed the same directionality, respectively. Groups A, B, and C genes combined across

tissues contained 69, 1,036, and 2,314 genes respectively (Table 3.3). Thus, relatively

few of the 8,398 filtered CCT genes show a significant overall cis effect that is highly

consistent among 15 or more F1 hybrids.
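The consistency grouping can be sketched as below. This is an illustration under assumptions: the function name is hypothetical, each F1 cross is weighted equally, and the 90% and 80% thresholds are treated as inclusive lower bounds. Because a gene that meets the A threshold also meets B and C, the cumulative AB and ABC lists are unions of these labels.

```python
def consistency_group(f1_log2_ratios):
    """Label a gene A, B, or C by the share of F1s with the same direction.

    A: 100% of F1s agree, B: at least 90%, C: at least 80%.
    Returns None when fewer than 80% agree or no F1s are given.
    """
    n = len(f1_log2_ratios)
    if n == 0:
        return None
    maize_biased = sum(1 for r in f1_log2_ratios if r > 0)
    teosinte_biased = sum(1 for r in f1_log2_ratios if r < 0)
    majority = max(maize_biased, teosinte_biased)
    # Integer comparisons avoid floating point issues at the thresholds
    if majority == n:
        return "A"
    if 10 * majority >= 9 * n:
        return "B"
    if 10 * majority >= 8 * n:
        return "C"
    return None

# Hypothetical gene assayed in 15 F1s, 14 of which favor the maize allele
group = consistency_group([0.8] * 14 + [-0.3])   # "B"
```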

3.4.3 Possible directional bias in cis evolution

Visual examination of Figure 3.2 shows a greater density of cis genes (black dots) with

positive log2 hybrid expression ratios than with negative ratios, suggesting cis evolution

during domestication more often favored alleles with increased expression in maize relative

to teosinte. Consistent with this visual observation, the number of CCT (ABC list)

genes with a positive (maize biased) vs. negative (teosinte biased) log2 hybrid expression

ratio are 947:598, 814:474 and 826:545 for ear, leaf and stem, respectively (Supplemental

Table C.5). All of these ratios are significantly different from a 50:50 unbiased expectation

(binomial test, p< 0.001). Additionally, a plot of the distribution of log2 hybrid expression

ratio for CCT genes shows a much greater density of genes with positive values (Figure 3.4)

for all three tissue types.

The apparent bias in directionality of cis evolution could be the result of error in our

bioinformatics pipeline. One potential error is preferential alignment of maize RNAseq

reads due to overall greater sequence divergence of teosinte lines from the reference tran-

scriptome (B73) in comparison to non-reference maize inbred lines. If such systematic

error exists, the observed bias in directionality of cis evolution would be expected to be

greatest for F1s involving the reference B73 (zero alignment bias of maize reads and high

bias for teosinte) and less extreme for crosses between teosinte and non-reference maize

lines (moderate bias for non-reference maize and high bias for teosinte).

Figure 3.4: Cis versus estimated trans regulatory effect for CCT-ABC genes in the ear, leaf, and stem. CCT genes have a directional bias with more genes overall favoring the maize allele than teosinte. Genes with consistent cis regulatory differences tend to favor the domesticated maize allele. This phenomenon exists in all three tissues. While we cannot discount reference bias as the cause, this trend suggests there may be an overall directional bias for cis regulatory evolution in maize domestication.

To test this expectation, we calculated the number of CCT (ABC list) genes with

positive (maize biased) vs. negative (teosinte biased) log2 hybrid expression ratios sep-

arately for F1s involving B73 and non-B73 maize parents. For ear tissue, there are 569

teosinte-biased and 975 maize-biased genes for B73 F1s and 606 teosinte-biased and 939

maize-biased genes for non-B73 F1s. A Fisher’s Exact Test fails to reject the null hy-

pothesis that these two ratios are equivalent (p = 0.18). There was also no evidence for

non-equivalent ratios with the other two tissue types (Supplemental Table C.6). Thus,

we see no evidence for significantly greater bias for maize alleles in crosses involving B73

versus the non-reference maize parents, supporting the argument that alignment bias in-

troduced by use of pseudo-transcriptomes does not explain the excess of CCT genes with

the maize allele expressed higher than the teosinte allele.
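The two tests used in this section can be reproduced approximately from the counts reported in the text. The sketch below is a from-scratch implementation with hypothetical function names, not the software used in this study. The two-sided Fisher's Exact Test follows the common convention of summing all tables no more probable than the observed one, which is an assumption about how the reported p-values were computed.

```python
import math

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial test of k successes in n trials against 50:50."""
    # Under p = 0.5 the distribution is symmetric, so double the smaller tail
    tail = min(k, n - k)
    log_half_n = n * math.log(0.5)
    lower = sum(math.exp(log_comb(n, i) + log_half_n) for i in range(tail + 1))
    return min(1.0, 2.0 * lower)

def fisher_exact_two_sided_p(a, b, c, d):
    """Two-sided Fisher's Exact Test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_prob(x):
        # Hypergeometric probability of x in the top-left cell
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(n, col1)

    observed = log_prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        lp = log_prob(x)
        if lp <= observed + 1e-7:  # tables no more likely than the observed one
            total += math.exp(lp)
    return min(1.0, total)

# Ear CCT-ABC genes: 947 maize-biased versus 598 teosinte-biased
p_bias = binomial_two_sided_p(947, 947 + 598)

# Ear tissue, maize-biased versus teosinte-biased: B73 F1s versus non-B73 F1s
p_reference = fisher_exact_two_sided_p(975, 569, 939, 606)
```

Here `p_bias` should fall well below 0.001 and `p_reference` should be close to the p = 0.18 reported above. Small differences from the reported values are possible because the exact software and conventions used for the original tests are not specified in this section.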

3.4.4 Gene expression variation is greater in teosinte

Both the domestication/improvement bottleneck and selection during domestication are

expected to reduce variation in maize as compared to teosinte. We asked if these reduc-

tions in variation are apparent in our gene expression data. To quantify whether variation

in maize or in teosinte was the source of the variation in our expression ratios among F1 hy-

brids, we fit a linear model on a gene-by-gene basis where maize and teosinte inbred parent

were used as explanatory factors for the expression ratio. Among ∼13,000 genes included

in this analysis, the maize parent explains only 85% as much variation as the teosinte

parent (Supplemental Table C.7). This represents the general reduction in diversity of

maize as compared to teosinte, presumably a result of the domestication/improvement

bottleneck.
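The comparison of variation explained by each parent can be illustrated with a simplified calculation. The sketch below does not fit the full linear model; instead it computes, for each factor separately, the between-group sum of squares as a fraction of the total (an R2 for a one-factor model). The function name, line names, and expression values are hypothetical, and treating the two factors one at a time is a simplifying assumption.

```python
from collections import defaultdict

def factor_r2(values, labels):
    """Fraction of total variance in `values` explained by grouping on `labels`."""
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    # Between-group sum of squares: group size times squared deviation of its mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    return ss_between / ss_total

# Hypothetical log2 expression ratios for one gene in four F1 hybrids
ratios = [1.0, 3.0, 1.2, 3.2]
maize_parent = ["M1", "M1", "M2", "M2"]
teosinte_parent = ["T1", "T2", "T1", "T2"]

r2_maize = factor_r2(ratios, maize_parent)
r2_teosinte = factor_r2(ratios, teosinte_parent)
maize_to_teosinte = r2_maize / r2_teosinte
```

In this toy example nearly all of the variation among F1s tracks the teosinte parent, so the maize to teosinte ratio is far below one. Averaging such ratios over genes would give a summary of the kind reported in Supplemental Table C.7.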

While the bottleneck should cause a reduction in expression variation in maize for all

genes, genes that were targets of selection for regulatory differences should have an even

greater reduction in expression variation. Consistent with this expectation, we observed

Figure 3.5: The proportion of average maize to teosinte R2 from linear models explaining F1 hybrid expression by maize and teosinte parent. Error bars represent ± 1 standard error. In all three tissues, the proportion of maize to teosinte R2 decreases in candidate CCT gene lists with the most ideal candidates (CCT-A) having the most extreme reduction.

a greater reduction in variation in maize as compared to teosinte for CCT genes than the

full set of ∼13,000 genes (Figure 3.5, Supplemental Table C.7). This greater reduction

likely reflects the combined effects of the bottleneck plus selection during domestication.

For the full ABC groups of CCT genes, maize contributes 79% of teosinte variation, for

the AB group about 74%, and for the A group about 52% of teosinte variation. Thus,

among our strongest candidates (A group) for genes with cis regulatory difference between

maize and teosinte, the data indicate that maize explains only about half as much of the

cis regulatory variation as teosinte.

The reduction in gene expression variation in maize vs. teosinte is also seen in the

number of individual genes with significant effects due to the maize and/or teosinte parent

(Supplemental Table C.8). In terms of numbers of genes, there were 2.0 to 2.5 fold more

genes for which only the teosinte parent effect was significant than genes for which only

the maize parent effect was significant among AB list genes, and 5-fold more among the

A list CCT genes.

3.4.5 Selection candidate genes are enriched for CCT genes

We compared our list of CCT genes to putative targets of selection during maize domes-

tication and improvement [55]. There is significant enrichment for CCT genes among

selection candidate genes for all three tissues (Table 3.4). The evidence for enrichment

is strongest for the union of CCT genes from all three tissues. For example,

there are 134 CCT-AB genes among the selected genes, while 86.7 would be expected by

chance. Also, there were 10 CCT (A-list) genes from stem tissue among selected genes,

although only 2.16 are expected by chance, a nearly 5-fold enrichment.
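The expected overlap and fold enrichment quoted above follow from simple arithmetic, sketched here. The function names are hypothetical, and the gene-list sizes in the first example are invented for illustration; only the observed (10) and expected (2.16) stem A-list counts come from the text.

```python
def expected_overlap(n_list_a, n_list_b, n_total):
    """Expected number of shared genes if two lists were drawn independently."""
    return n_list_a * n_list_b / n_total

def fold_enrichment(observed, expected):
    """Observed overlap relative to the chance expectation."""
    return observed / expected

# Hypothetical sizes: 200 CCT genes, 500 selected genes, 10,000 genes assayed
chance = expected_overlap(200, 500, 10_000)   # 10.0

# Stem A-list values reported in the text
stem_a_fold = fold_enrichment(10, 2.16)       # about 4.6, "nearly 5-fold"
```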

XPCLR scores (cross population composite likelihood ratios) [89] quantify the de-

gree of support for positive selection on a genomic region. We drew on a recent study

[55] looking at XPCLR score in 10 kilobase windows in maize on a genome-wide scale.

Figure 3.6: Density plots of ln(XPCLR) score of conserved versus CCT-AB candidate genes. CCT genes have a significantly higher signature of selection in the 10 kb window holding the transcriptional start site. The natural log transformed XPCLR scores for CCT-AB genes are consistently and statistically higher than genes that were identified as conserved in the initial analysis. The distributions of conserved and CCT-AB genes are significantly different by both the shape sensitive Kolmogorov-Smirnov test (p = 1.0587e-11) and simple difference of the means t-test (p = 2.2119e-10).

Table 3.4: Fisher’s Exact Tests for the overlap between genes in domestication and improvement selection candidate genes and CCT genes from each of the three experimental tissues.

CCT Group  Overlap    Ear        Leaf       Stem       Union
A          Expected   3.42       1.41       2.16       5.6
A          Observed   11         5          10         20
A          p-value    3.52e-04   9.73e-03   1.89e-05   2.49e-07
AB         Expected   44.71      35.29      34.78      86.7
AB         Observed   70         57         60         134
AB         p-value    9.12e-05   1.79e-04   1.74e-05   1.13e-07
ABC        Expected   125.48     105.68     109.89     248.92
ABC        Observed   174        135        139        317
ABC        p-value    2.11e-06   1.289e-03  1.626e-03  3.54e-07

Comparison of the distributions of ln(XPCLR) scores at the transcriptional start site for

CCT-AB genes and genes with conserved expression between maize and teosinte shows

that CCT genes have a higher mean XPCLR score than conserved genes (Figure 3.6). These

two distributions are significantly different in terms of shape (Kolmogorov-Smirnov test,

p = 1.06e-11) and overall mean (t-test, p = 2.21e-10).

A goal of this study was to explore the relative importance of cis versus trans regula-

tory divergence during maize domestication. To address this question, we looked at the

evidence for selection on genes with cis only effects in comparison to genes that had trans

only effects. Genes in the cis and trans only regulatory categories were filtered to only

include those that had consistent effects in the F1 hybrid contrasts. Consistent effect was

defined as 100%, 90%, and 80% of hybrid contrasts favoring the same directionality of

effect. Due to this definition, genes in the cis only group were simply the cis only subset

of CCT genes. For the trans only group in this analysis, the trans effect was estimated

from parent and hybrid expression ratios and a weighted percent of hybrid contrasts fa-

voring maize or teosinte alleles was calculated. Fisher’s Exact Tests on 2x2 contingency

tables tabulating cis and trans genes with selection feature genes from Hufford et al.

[55] show cis only genes are significantly enriched (p-value < 0.05) for selection in 7 of

9 comparisons, while trans only genes are never enriched and are actually significantly

underrepresented among selected genes in two cases (Table 3.5).

3.4.6 Microarray and RNAseq data partially correspond

We assessed the degree of correspondence between our CCT genes and 612 differentially

expressed genes identified by a recent microarray study in maize [18]. We constructed

2x2 contingency tables for differentially expressed (DE) and non-differentially expressed

(NDE) genes from the two studies. A Fisher’s Exact Test shows a highly significant degree

of correspondence between the two studies for all three tissue types (Table 3.6). Using our

Table 3.5: Fisher’s Exact Tests for enrichment/depletion of cis and trans only genes in selection features.

Tissue  Regulatory Category  Group     Observed  Expected  p-value
Ear     Cis only             A List    5         1.998     0.043
Leaf    Cis only             A List    3         0.751     0.032
Stem    Cis only             A List    3         1.316     0.138
Ear     Trans only           A List    4         5.327     0.818
Leaf    Trans only           A List    3         2.346     0.506
Stem    Trans only           A List    1         0.282     0.256
Ear     Cis only             AB List   36        24.449    0.018
Leaf    Cis only             AB List   24        13.516    0.006
Stem    Cis only             AB List   32        19.647    0.006
Ear     Trans only           AB List   28        41.954    0.020
Leaf    Trans only           AB List   34        38.388    0.490
Stem    Trans only           AB List   16        12.032    0.222
Ear     Cis only             ABC List  95        70.113    0.002
Leaf    Cis only             ABC List  54        45.427    0.175
Stem    Cis only             ABC List  84        65.615    0.016
Ear     Trans only           ABC List  78        97.036    0.033
Leaf    Trans only           ABC List  91        101.461   0.273
Stem    Trans only           ABC List  42        43.148    0.935

CCT-AB list, ∼25 genes are identified as DE in both studies while about 7 are expected

by chance. However, the absolute level of correspondence between the two studies is

rather low. For example, of the 328 leaf genes identified as DE by RNAseq, only 25 (7%)

were also identified by the microarray study (Supplemental Table C.9). Thus, while the

overlap between our two studies is statistically significant, the two methodologies resulted

in largely different lists of DE genes.

The largely different lists of DE genes identified by microarray and RNAseq analysis

could be due in part to the fact that the microarray analysis includes genes with trans and

cis x trans differences. To assess the proportion of the 612 genes that have trans versus

cis effects, we examined the regulatory categories of the ∼250 differentially expressed

genes (241, 261, 259; ear, leaf, and stem) for which there is both microarray and RNAseq

data (Supplemental Table C.10). About 20% of these genes are classified as trans only

or cis x trans by RNAseq, while 55% are classified as either cis only or cis + trans. The

remainder (25%) are classified as conserved, ambiguous or compensatory. These results

suggest the very different lists of DE genes from the two technologies are to a large degree

due to differences in tissue, germplasm, environment, sampling error, or technical error,

and that inclusion/exclusion of trans and cis x trans genes by the two studies does not

explain all of the difference.

3.4.7 CCT genes are unrelated to differentially methylated re-

gions

In a recent study, Eichten et al. [93] identified differentially methylated regions (DMRs)

in maize and teosinte. We compiled a list of the nearest genes both upstream and down-

stream of each DMR which gave a list of 332 genes. Of these genes, we have RNAseq data

from 115, 116, and 121 for the ear, leaf, and stem tissues, respectively. Of these genes, 19,

14, and 17 genes were on the CCT-ABC gene lists (Supplemental Table C.11). We asked if

Table 3.6: Fisher’s Exact Tests for the overlap between differentially expressed genes from the microarray study and CCT genes from each of the three experimental tissues in our work.

CCT Group  Overlap    Ear        Leaf       Stem       Union
A          Expected   0.556      0.274      0.359      1.040
A          Observed   4          3          2          8
A          p-value    2.14e-03   2.28e-03   4.92e-02   7.83e-06
AB         Expected   7.501      6.409      6.248      15.778
AB         Observed   23         25         25         48
AB         p-value    1.56e-06   4.84e-09   2.91e-09   1.61e-12
ABC        Expected   21.774     19.363     20.579     46.069
ABC        Observed   52         48         46         90
ABC        p-value    9.58e-10   1.69e-09   1.05e-07   6.34e-12

CCT-ABC list genes are over-represented among the DMR associated genes as compared

to random expectation and found that they are not (Fisher’s Exact Test, p = 0.1092, p

= 0.4309, p = 0.1755; ear, leaf, and stem). Finally, the methylation

status of the DMR does not correspond with the differential expression of maize vs.

teosinte alleles at CCT-ABC list genes. Rather than observing that the more methylated

allele was expressed at a lower level, the data show that ∼50% of the time, the methylated

allele is expressed higher and ∼50% expressed lower (Supplemental Table C.12).

3.4.8 Dominant and additive gene expression inheritance

The dominance/additivity (D/A) ratio was calculated for genes that were assessed in

at least 15 crosses with three unique maize and five unique teosinte inbred lines. The

overall average of gene D/A ratios was close to zero in all three tissues (Supplemental

Table C.13), suggesting there is not an extreme overall trend for dominance of non-domesticated

teosinte alleles over domesticated maize alleles. Tissues with active developmental

programs, immature ear and seedling stem, are quite close to a 1:1 ratio of genes

with a negative D/A ratio to genes with a positive D/A ratio (1.084 and 0.982 for ear and

stem, respectively). In contrast, the leaf tissue has substantially more genes with a negative

D/A value (1.287 ratio of negative to positive D/A ratios), indicating a higher rate of

teosinte allele dominance in the leaf tissue. Of the three experimental tissues,

two (ear and leaf) have an overall mean significantly different from zero (z-test, p <

0.05) and significantly more negative than positive D/A ratios (binomial test, p < 0.05),

suggesting teosinte allele dominance (Supplemental Table C.13).

The average D/A ratios of the seven regulatory categories and three CCT gene lists

are also fairly close to zero. Even for the smallest CCT-A lists (21

to 43 genes), average D/A ratios were always less than a fully dominant D/A ratio of one. Density plots

for D/A ratio grouped by the seven regulatory categories do not show an obvious shift

in distribution (Supplemental Figure C.3). Thus, there is evidence for a weak overall

tendency for dominance of non-domesticated expression levels in the ear and leaf tissues

with no evidence for this teosinte dominance being linked to a specific regulatory category

or candidate CCT gene list.
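For readers unfamiliar with the D/A ratio, the sketch below uses the conventional quantitative genetic definition: the deviation of the hybrid from the midparent value, divided by half the parental difference. The exact formula and sign convention used in this study are given in the methods and are assumed here, as is the function name. Under this assumed convention, a value of 0 is purely additive, +1 is full dominance of the maize expression level, and -1 is full dominance of the teosinte expression level.

```python
def dominance_additivity_ratio(maize, teosinte, f1):
    """D/A ratio from parental and F1 hybrid expression levels.

    Assumes a = (maize - teosinte) / 2 and d = F1 - midparent, so that
    negative values indicate F1 expression closer to the teosinte parent.
    """
    midparent = (maize + teosinte) / 2.0
    additive = (maize - teosinte) / 2.0
    if additive == 0:
        return None  # parents do not differ, ratio is undefined
    return (f1 - midparent) / additive

# Hypothetical expression levels: maize 10, teosinte 2
additive_case = dominance_additivity_ratio(10.0, 2.0, 6.0)    # 0.0
teosinte_dominant = dominance_additivity_ratio(10.0, 2.0, 2.0)  # -1.0
```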

We compared the proportions of genes showing dominant versus additive gene action

in the cis only and trans only regulatory classes. Our trans only genes will show dominant

gene action when there are haplo-sufficient loss-of-function (LOF) alleles at their trans

regulators. In contrast, the effects of cis regulatory elements are expected to be purely

additive in the absence of transvection or a similar mechanism [94]. When one of our cis only

genes is classified as having dominant gene action, that may also indicate an error in classification

because of trans effects on its expression that were below the level of statistical

detection. Consistent with the expectation that dominance is more likely for trans only

genes, the proportion of genes classified as dominant is higher for trans only genes in all

three tissue types (Figure 3.7, Supplemental Table C.14).

It has been proposed that the allelic variants responsible for evolution during domes-

tication are primarily recessive LOF alleles [82]. Under this model, a non-domesticated

allele would be dominant to the recessive LOF domesticated allele. Among our cis only

genes with dominant gene action, dominance of the maize versus teosinte allele does not

differ from the 50:50 expectation (Figure 3.7, Supplemental Table C.14). Among our

trans only genes with dominant gene action, the maize allele is dominant to the teosinte

allele more often than expected by chance. These results are counter to the proposal that

domestication favored recessive LOF alleles.

3.4.9 Candidate genes enriched in various functional categories

We examined our list of CCT genes for enrichment of several functional classes of maize

genes including transcription factors, genes in known metabolic pathways, genes underly-

Figure 3.7: The proportion of genes showing dominant (red) versus additive (blue) gene action for cis only and trans only AB lists. For all tissues, trans only genes have a higher rate of dominance, however this difference is only significant for the ear and leaf tissues (Fisher’s exact test, p < 0.005 indicated by “*”). The proportion of genes in the trans only lists that are dominant for the teosinte allele (green) and the maize allele (yellow) is shown in the barplot to the right of each pie graph. There is significant deviation from the neutral expectation (1:1) for the ear and leaf tissue (binomial test, p < 0.005 indicated by “*”).

ing QTL, and gene ontology (GO) groups. First, a list of maize transcription factors and

their corresponding families was compiled from the transcription factor database [86]. Although

CCT genes (AB-list) were found to be slightly enriched for several transcription

factor families (ARF, MADS-MIKC, and LBD) by Fisher’s Exact Tests, these results do

not stand up to Bonferroni multiple test correction (Supplemental Table C.15). We con-

clude that there is no compelling evidence that CCT genes are enriched for transcription

factors.

Our list of CCT (AB list) genes was also compared with results from a recent QTL

mapping experiment for a number of domestication and improvement traits [25]. We

compared observed vs. expected overlap between CCT genes from the three tissues to

the genes located within 1.5 LOD QTL support intervals for 16 traits. Testing was done

on a trait by trait basis and restricted to 1.5 LOD QTL intervals containing 20 or fewer

genes. After correction for multiple testing (Bonferroni), no significant enrichment for

CCT-AB genes in domestication QTL was observed (Supplemental Table C.16). The

greatest enrichment was seen with the trait ear diameter for which there were four CCT

genes assayed in ear tissue within the QTL interval when only 1.22 were expected by

chance (Fisher’s Exact Test, p = 0.03).

A test for enrichment of CCT and trans only genes in 15 different metabolic path-

ways defined in the Kyoto Encyclopedia of Genes and Genomes (KEGG) was done using

Fisher’s Exact Test on 2x2 contingency tables. There was no compelling evidence for

enrichment/depletion of either group of genes in any of the 15 pathways tested (Supplemental

Table C.17). The smallest p-value identified was for the cutin, suberine, and wax

biosynthesis pathway in leaf tissue for trans only genes (p = 0.012); however, this result does

not remain significant after Bonferroni multiple test correction.
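The effect of the Bonferroni correction on this result can be shown with a one-line calculation. The function name is hypothetical, and using 15 as the number of tests is an assumption that counts only the pathways; correcting across tissues and gene lists as well would make the adjusted value larger still.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Smallest KEGG p-value reported in the text, corrected for 15 pathways
adjusted = bonferroni_adjust(0.012, 15)   # about 0.18, above the 0.05 threshold
```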

We tested for GO term enrichment and depletion in the CCT and trans only gene

lists. These analyses found significant GO term associations in the leaf CCT-ABC gene

list for five different categories including enrichment for chloroplast, plastid, thylakoid,

and chloroplast thylakoid membrane, and depletion for DNA binding (Supplemental Ta-

ble C.18). For trans only genes, significant enrichment for a number of GO terms in the

ear tissue was detected for transcription factor and photosynthesis related terms with

additional enrichment for ribosomal GO terms found in the leaf tissue (Supplemental

Table C.18).

3.5 Discussion

3.5.1 Regulatory change between and within maize and teosinte

Of the ∼17,000 genes assayed, 70% have significant cis and/or trans regulatory differences,

suggesting considerable regulatory change has occurred during maize domestication and

subsequent crop improvement. A similar proportion of genes was found to have cis

and/or trans differences in recent studies of two species of Drosophila [21] and of

yeast [73]. This high amount of variation between maize and teosinte is not surprising

given the incredible diversity of maize. Simple presence and absence of gene expression

within maize itself is quite variable as shown in a recent study where 27.9% of genes were

only expressed in a subset of maize inbred lines [95]. Additionally, this study found over

a thousand novel genes not present in the reference B73 genome, suggesting considerable

presence/absence variation (PAV) also exists within maize. This finding is consistent

with another study where PAV and copy number variation (CNV) were assessed, finding

hundreds of CNVs and thousands of PAVs that included at least 180 single copy genes

[96]. These CNVs and PAVs are accompanied by millions of additional SNPs both within

and between genes [97]. In light of the known diversity within maize, it is not particularly

surprising to see evidence for prolific cis and trans regulatory variation in gene expression

between maize and teosinte.

Gene expression differences between populations address only some of the variation

seen in the dataset. There is also a large amount of variation within the maize and

teosinte populations. Only considering cis differences through F1 hybrids, upwards of 60%

of genes have evidence for multiple maize or teosinte expression levels and consequently

multiple alleles within population. Furthermore, our study shows a drop in expression

variation in maize consistent with the reduction in overall diversity caused by the domesti-

cation/improvement bottleneck with an even greater reduction in expression variation for

genes thought to be under additional artificial selection (CCT candidate genes) [55, 98].

The high level of expression variation still present in teosinte represents an unexplored

source of diversity in maize, which may be useful for future crop improvement and plant

breeding efforts.

This study sheds light on the large amount of expression variation within and between

maize and teosinte. However, only a small fraction of this diversity results in consistent

expression differences that distinguish maize and teosinte inbred lines. The relatively

small number of genes in this study showing consistent expression differences between

maize and teosinte (∼1000 of 17,000, ∼6%) is similar to the fraction of genes seen in

another recent study by Swanson-Wagner et al. [18]. Thus, this study reveals an immense

amount of regulatory diversity within and between maize and teosinte, while also showing

only a small fraction of this diversity appears to be fixed for discrete expression patterns

that distinguish maize and teosinte populations.

3.5.2 What is the frequency of cis and trans regulatory change?

Our study shows cis and trans regulatory differences occur at a similar frequency. How-

ever, this is only part of the story, since we also show that cis effects are arguably more

important for the generation of large divergence in expression between maize and teosinte

(Figure 3.3). Our observation of cis effects accounting for the majority of large expression

differences was also seen in a recent Drosophila study by McManus et al. [21]. The frequencies

of cis and trans regulatory differences in our sampling of maize and teosinte are

fairly similar in the three experimental tissues and consistent with work in Drosophila.

However, cis regulatory effects account for a significant portion of large expression divergence.

In a recent study, Swanson-Wagner et al. [18] used microarrays to assess expression in

a number of maize and teosinte inbred lines, many in common with our RNAseq based

study. They found relatively few genes (612 of ∼18,000) with differential

expression between maize and teosinte. Of the genes assayed in both our RNAseq study

and the Swanson-Wagner microarray experiment, all seven regulatory categories were

found, with approximately 25% classified as cis only, 10% as trans only, and 25% as cis

plus trans. While only ∼50% of the microarray differentially expressed genes were classified

as cis only or cis plus trans in our study (potential CCT candidate genes), the overall

low correlation between our RNAseq and the Swanson-Wagner microarray experiment

makes direct comparison difficult. Comparisons made between two parental samples will

identify genes with cumulative cis plus trans regulatory differences. Consistent with this

expectation, cis only, cis plus trans, and trans only were the three most frequent regulatory

categories assigned to differentially expressed microarray genes.
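The relationship between parental and F1 comparisons described above can be made concrete with a small sketch. It assumes the widely used decomposition in which the F1 allele specific log2 ratio estimates the cis effect (both alleles share one trans environment) and the difference between the parent and F1 log2 ratios estimates the trans effect. The proportion of divergence due to cis is then taken as |cis| / (|cis| + |trans|). The function name, the read counts, and this exact formula for the cis proportion are illustrative assumptions, not the statistical pipeline used in this study.

```python
import math


def cis_trans(parent_maize, parent_teosinte, f1_maize, f1_teosinte):
    """Split parental expression divergence into cis and trans parts (log2 scale).

    All four arguments are hypothetical normalized read counts for one gene.
    """
    # Parental divergence reflects the cumulative cis plus trans difference.
    parent_ratio = math.log2(parent_maize / parent_teosinte)
    # In the F1 both alleles share the same trans factors, so the allelic
    # ratio is attributed to cis differences alone.
    cis = math.log2(f1_maize / f1_teosinte)
    # Whatever parental divergence is not explained by cis is assigned to trans.
    trans = parent_ratio - cis
    total = abs(cis) + abs(trans)
    fraction_cis = abs(cis) / total if total > 0 else float("nan")
    return cis, trans, fraction_cis


# Hypothetical gene: 8-fold higher in the maize parent, 4-fold allelic bias in the F1.
cis, trans, fraction_cis = cis_trans(800, 100, 400, 100)
print(f"cis={cis:.2f} trans={trans:.2f} fraction cis={fraction_cis:.2f}")
```

A gene compared only between parents would show the summed value (cis plus trans), which is why parental comparisons alone cannot separate the two sources of divergence.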

A prominent hypothesis in evolutionary biology is that mutation in CREs of func-

tionally conserved proteins is the primary mechanism by which morphological evolution

occurs [12]. In this hypothesis, mutation of the CREs of highly pleiotropic “master reg-

ulator” genes, and the resulting downstream effects, contribute substantially to overall

morphological change, which if true predicts large scale rearrangement of gene expression

networks based on trans effects. While it is true trans effects occur at a high frequency

in this study, these effects are accompanied by an equal number of larger cis regula-

tory driven expression differences. Thus, we believe the changes to gene regulation during

maize domestication are best interpreted as frequent “shaving” of expression by cis regula-

tory change to fine-tune various pathway elements in addition to the broader adjustments

to whole pathways through trans regulatory differences.

3.5.3 Tissue specific expression of CCT candidates

We compared the expression of genes identified as candidates between tissues. There

was significantly more overlap between the candidate genes from the three experimental

tissues than expected by random chance (Permutation tests, p < 1e-5, Figure 3.1). This

suggests a high degree of shared cis regulatory effects between tissues. The functioning

of CREs in multiple tissues is also supported by the high observed correlation between

the direction and magnitude of cis effect in different tissues (Adj. R2 ≈ 80%, Pearson

correlation ≈ 90%). These results suggest many CREs function in multiple tissues to

drive expression of a gene.
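The overlap comparison above rests on a permutation test. The following is a minimal sketch of one way such a test can be set up for two gene lists, assuming that random lists of the same sizes are drawn from a common universe of assayed genes. The function name, the gene identifiers, the number of permutations, and the restriction to two lists are all assumptions for illustration. The study compared three tissues and its exact permutation scheme is not reproduced here.

```python
import random


def overlap_permutation_test(list_a, list_b, universe, n_perm=1000, seed=0):
    """Return (observed overlap, one-sided permutation p-value)."""
    rng = random.Random(seed)
    set_a, set_b = set(list_a), set(list_b)
    observed = len(set_a & set_b)
    universe = list(universe)
    hits = 0
    for _ in range(n_perm):
        # Draw random gene lists of the same sizes from the assayed genes.
        perm_a = set(rng.sample(universe, len(set_a)))
        perm_b = set(rng.sample(universe, len(set_b)))
        if len(perm_a & perm_b) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 1000 assayed genes, two candidate lists sharing 25 genes.
universe = [f"gene{i}" for i in range(1000)]
ear_candidates = universe[:50]
leaf_candidates = universe[25:75]
observed, p_value = overlap_permutation_test(ear_candidates, leaf_candidates, universe)
print(observed, p_value)
```

With this setup the smallest attainable p-value is 1 / (n_perm + 1), so reporting a bound such as p < 1e-5 requires on the order of 100,000 permutations or more.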

While there is evidence for significant overlap of CCT genes between tissues, a very

high proportion of total CCT genes (∼70%) are only found in a single tissue. The lowest

overlap between tissues for the CCT-AB list (52 genes) was between the ear and leaf

tissue, arguably the two most developmentally different tissues studied. This trend is

seen in candidate genes as well as when considering all assayed genes. There have been

relatively few genome-wide studies using F1 hybrids to dissect cis and trans effects and

even fewer that consider multiple tissues [69, 72], but our results are consistent with these

previous studies where ∼70% of identified genes were identified in single tissues. Overall,

many CCT genes are shared between tissues, but the majority of genes are tissue specific,

suggesting modification of both globally active and tissue specific CREs occurred during

maize domestication.

Even though gene expression is highly correlated between tissues, there is evidence for

approximately 20% more functional, consistent cis regulatory changes in the ear seen in

the larger number of CCT genes in the ear tissue (555) than in leaf (445) and stem (431).

The imbalance in number of differentially expressed genes in different tissues was also

observed in a recent Arabidopsis study [72], where the three studied tissues

had an approximately 80% difference in number of differentially expressed genes. The

maize and teosinte ear have massive morphological differences in terms of size, placement

of spikelets, glume, and absence of fruit case. These morphological differences may be

due in part to these frequent tissue specific cis regulatory differences. This observation

is again at odds with the view of large morphological change in evolution/domestication

caused by mutation of CREs for a few “master regulator” genes [12]. Instead this data

again sheds light on the many single gene expression changes through “shaving” of allele

specific expression with modification of multiple tissue specific CREs.

3.5.4 Bias toward increased maize expression?

In the F1 hybrid analysis ∼55% of genes have higher expression of the maize allele than the

teosinte allele. High expression of the maize allele also occurs in the comparison between

parent inbred lines, except for leaf, where there is the same number of genes favoring

maize and teosinte alleles. This same trend of up regulated maize expression extends to

the CCT gene lists, where ∼60% of genes favor the maize allele. Our observation of high

expression for one of the parents (maize) is also consistent with several previous studies in

multiple organisms including maize [18], cotton [17], Arabidopsis [72], Cirsium [68], and

fruit fly [21]. Our experimental method using parent derived pseudo-transcriptomes and

perfect alignment to segregating sites should ameliorate the issue of alignment bias, but

we cannot be sure we have fully eliminated it. While potential alignment bias prevents

firm conclusions, genes consistent across all maize and teosinte inbreds are less likely to

be artifacts, suggesting the overall bias for maize alleles seen in candidate genes is a real

phenomenon.

3.5.5 Selection-candidates enriched for cis regulatory change

Changes in gene expression, specifically through altered CREs, are not uncommon in the

history of domesticated crops. These changes have led to increased fruit size in tomato

[16], maize apical dominance [3, 40, 99], loss of prolificacy in maize [4], and changes

in rice yield and flowering time [57, 58]. These examples represent cases where large

sometimes pleiotropic genetic changes are caused by singular genes. There is no disputing

the important role of these types of genetic changes in creating some of the world’s most

productive crops. However, this study sheds light on the hundreds of other genes with

differential expression patterns, caused by CREs, between maize and teosinte.

These hundreds of genes with regulatory differences between maize and teosinte are

enriched in selection features [55] and have stronger selection upstream and at the gene

in comparison to conserved genes. Positive selection for regulatory effects is restricted to

genes specifically with CRE differences, since genes with trans only regulatory change are

never enriched for selected genes. Genes with consistent CREs differentiating populations

are not all likely to play large, equal, or even critical roles in the domestication of

maize. However, corroborating evidence such as selection scans can provide the information needed

to elucidate truly important players in the domestication process, even if discovering the

function for all of these genes in domestication is likely an impossible task.

One example of how data from other sources, such as selection scans, can help shed

light on candidates is the importance of cis effect magnitude. A number of genes in

this study show large shifts in expression between maize and teosinte (log2(M:T) > 10).

However, the magnitude of cis effect has no correlation with strength of selection, suggesting

magnitude of effect is not particularly important. In retrospect, this is not surprising

considering subtle changes in gene expression are known to cause drastic phenotypic differ-

ences. New tissue specific shifts in gt1 expression largely led to elimination of secondary

ears in maize [4] and a relatively moderate 2-fold change in expression of tb1 leads to

greatly increased apical dominance [3, 40]. In light of this result, selection on CREs dur-

ing maize domestication may be best characterized as subtle fine-tuning of expression

patterns to generate phenotypic change.

3.5.6 Leaf tissue candidates are enriched for photosynthesis and

chloroplast GO terms

A number of gene ontology terms implicated in photosynthesis and carbon fixation were

found to be enriched in the leaf CCT-ABC list. Mapping these genes back to photosynthesis

and carbon fixation pathways shows two components in the photosystem I receptor

as well as part of the ATP synthase (delta subunit). Additionally, a number of enzymes

involved in carbon fixation were found to be up or down regulated in maize through cis

regulatory means. Most of these enzymes were involved in reactions converting malate to

other substrates in carbon fixation.

Cytosolic and mitochondrial forms of malate dehydrogenase (mdh) were two of the

identified differentially expressed genes. Mdh2, a mitochondrial form, is higher in teosinte,

whereas mdh4, cytosolic, is expressed at a higher level in maize. These expression dif-

ferences suggest there were changes made to malate-oxaloacetate flux between the mi-

tochondria and cytoplasm during maize domestication. Movement of oxaloacetate (OA)

has important implications in energy metabolism and photorespiration [100, 101]. The

changes in expression suggest there may be lower conversion between OA and malate

within the mitochondrial matrix, leading to reduced malate in the mitochondria and reduced

transport of OA into mitochondria. In theory this would leave more OA in the

cytoplasm where it would be available for conversion to malate and transport to bundle

sheath cells for photosynthesis. This could lead to improved rates of photosynthesis in

maize. However, these results should be treated with caution, since the malate dehydrogenase

enzymes identified are on a secondary candidate gene list and are not considered

to be our best candidates.

3.5.7 Do crop domestication genes show cis differences?

Domestication is characterized by a number of common phenotypes including gigantism,

loss of prolificacy, loss of shattering, changes to pollination mechanisms, apical dominance,

and branching that are collectively considered the domestication syndrome [10, 11]. While

domestication syndrome is characterized by several common phenotypes, the genetic mod-

ifications that lead to these traits may or may not be due to changes in homologous genes.

Genes such as waxy [102–104], tb1 [3, 105], and ghd7 [50, 57] represent several genes that

were selected in multiple crop species; however, there are many more unique genes

controlling domestication traits [106–109]. To get a sense of the regulatory status of sev-

eral crop domestication genes in maize, we generated a list of 28 domestication genes (6

maize and 22 non-maize) and identified the closest homologous gene in maize by protein

to protein BLAST (Table 3.7). Of these 28 genes, only sugary1 from maize, an isoamylase

starch debranching enzyme, in the ear was on the CCT-B gene list. Furthermore, only two

of the remaining genes were on the C list. The inability to identify cis regulatory changes

for maize homologs of non-maize domestication genes suggests cis regulatory change in

a domestication context may tend to operate on unique genes in different organisms as

opposed to a single gene with conserved functions in multiple species.

3.5.8 A catalog of genes with cis regulatory variation

A product of this study, similar to selection scans, is a list of candidates for future

investigation. The complete set of 25,000 genes (with information on RNAseq read counts,

parent and F1 expression ratios, regulatory classification, and other summary

information) will be a valuable tool to investigators for screening for new genes of interest and

answering preliminary questions about the expression of specific genes.

Table 3.7: Regulatory category of the closest maize homolog of 6 maize and 22 non-maize domestication loci.

                                              Ear               Leaf              Stem
Organism   Locus Name  Functional Change      Reg. Cat.    CCT  Reg. Cat.    CCT  Reg. Cat.    CCT
Maize      tga1        Coding                 trans only   -    -            -    -            -
Maize      ZmYAB2.1    Expression             cis + trans  -    -            -    -            -
Maize      Sh2         Expression             conserved    D    trans only   D    comp.        D
Maize      Su1         Coding                 cis only     B    comp.        D    -            -
Maize      gt1         Expression             trans only   D    -            -    cis x trans  D
Maize      tb1         Expression             cis + trans  D    -            -    ambiguous    D
Amaranths  waxy        Coding                 cis + trans  D    cis x trans  D    conserved    D
Barley     Nud         Deletion               -            -    trans only   D    trans only   D
Brassica   qFT10-4     Expression             trans only   D    -            -    -            -
Brassica   BoCAL       Coding                 cis only     C    cis x trans  -    trans only   -
Pea        PsELF3      Coding                 cis x trans  D    trans only   D    cis + trans  D
Rice       DTH2        Unclear                trans only   D    conserved    D    conserved    D
Rice       GS6         Coding                 ambiguous    -    conserved    -    -            -
Rice       GS5         Expression             conserved    D    ambiguous    D    conserved    D
Rice       qSH1        Expression             cis only     D    conserved    D    trans only   D
Rice       shat1       Coding                 comp.        D    ambiguous    D    trans only   D
Rice       Bh4         Coding                 -            -    conserved    -    -            -
Rice       TAC1        Expression             cis x trans  D    cis only     C    cis only     D
Rice       GW2         Coding                 comp.        D    cis only     D    cis only     D
Rice       Ehd1        Coding                 cis x trans  D    cis + trans  D    trans only   D
Rice       BADH2       Coding                 cis x trans  D    trans only   D    cis only     D
Rice       OsSPL16     Expression             cis only     -    -            -    -            -
Rice       qPE9-1      Loss of Function       comp.        D    trans only   D    -            -
Sorghum    Sh1         Expression             cis + trans  -    -            -    -            -
Sorghum    Tannin1     Coding                 conserved    -    conserved    -    trans only   -
Tomato     FAS         Expression             cis + trans  -    -            -    -            -
Wheat      Q           Coding and expression  cis x trans  D    trans only   D    conserved    D
Wheat      Vrn1        Expression             cis only     -    -            -    -            -

For example, one attractive CCT candidate gene is barren stalk1 (ba1), a known

maize single gene mutant that causes a defect in branch formation in both the whole plant

and tassel [110]. The wild type function of ba1 is inferred to be in branch initiation. In

our study, ba1 was one of our strongest candidates with all assayed crosses showing higher

expression of the maize allele in the ear. The overall shift in expression was substantial

( 4-fold) and this shift is caused by cis regulatory differences alone. ba1 was also found

to be under selection during maize domestication in two independent studies [55, 110].

These combined observations suggest that there was selection for a CRE that codes the

upregulation of ba1 in the ear, perhaps resulting in a greater number of rows (branches)

of kernels in the maize ear as compared to the teosinte ear. Compelling evidence for this

hypothesis could be obtained by fine-mapping and identifying the hypothesized CRE and

demonstrating with expression assays that the maize and teosinte alleles of the CRE have

the imagined effects on gene expression during ear development and on phenotype (kernel

row number) in the adult ear. ba1 illustrates the power of genomic scans to identify

strong candidates for future study that can inform us about the fine details of evolution

under domestication.

Appendices

Appendix A

Supplemental Content: Genetic

dissection of a genomic region with

pleiotropic effects on domestication

traits in maize reveals multiple

linked QTL

A.1 Figures

Figure A.1: Histograms of the least squared means for phenotyped traits from the QTL mapping population. Several of these distributions are approximately normal, but other traits take on an exponential distribution. The average least squared mean for NIRILs with 100% maize and teosinte genotypes is indicated with an arrow and “M” for maize and “T” for teosinte.

Figure A.2: Example histograms of simulated traits for several different conditions in terms of number of causative loci, effect size, and heritability. Histograms from traits with equal effects - 67% H2, equal effects - 90% H2, gamma distributed effect - 67% H2 and gamma distributed effect - 90% H2 are shown in different columns from left to right. Histograms from simulated traits with one, five, ten, twenty, fifty, seventy-five, and one hundred causative loci are shown from top to bottom. The average simulated phenotype value for NIRILs that are 100% maize and teosinte are indicated with arrows labeled by “M” for maize and “T” for teosinte.

Figure A.3: Proportion of detected QTL with zero, one, or multiple causative genes in the 1.5 LOD support interval. As seen in the equal effect size simulations, a high number of gamma distributed causative genes leads to detected QTL with multiple causative factors. There is a reasonable percentage of detected QTL in the simulations containing a single causative gene when few (less than 4) causative genes are simulated, but as the number of simulated causative genes increases we quickly lose the power to distinguish between closely linked causative genes and they become lumped into single detected QTL.

A.2 Tables

Table A.1: RFLP Markers used during backcrossing of QTL mapping population.

Marker     Chromosome   Marker     Chromosome
bnl5.62    1            php20725   4
umc157     1            umc19      4
umc37b     1            umc127a    4
npi255     1            bnl10.17b  4
BZ2        1            umc15      4
bnl8.10    1            bnl8.23    4
npi615     1            bnl8.33    5
umc107     1            bnl6.25    5
npi225     1            umc90      5
bnl8.45    2            umc27      5
umc53      2            umc166     5
npi320     2            bnl7.71    5
npi421     2            npi412     5
umc6       2            umc54      5
umc34      2            umc127b    5
umc134     2            umc104a    5
umc131     2            bnl6.29    6
umc2b      2            umc65      6
umc5a      2            umc21      6
php20005   2            umc46      6
umc122     2            umc132     6
umc49a     2            umc62      6
umc36      2            npi114     8
umc32      3            bnl9.11    8
umc121     3            umc117     8
php20042   3            umc7       8
umc42b     3            npi253     9
umc161     3            umc113     9
umc18      3            umc81      9
TE1        3            umc95      9
bnl5.37    3            bnl3.04    10
bnl8.01    3            umc130     10
umc60      3            umc49b     10
bnl12.97   3            umc117b    10
php10080   3            bnl7.49    10
npi425     3
umc2a      3

Table A.2: Genetic markers used to score BC6S6 mapping population.

Marker         Genetic Position  AGPv2
umc2036        0.00              6,985,618
bnlg565        6.54              8,492,871
bnlg105        20.90             13,812,586
phi008         21.54             14,072,755
umc2293        25.26             15,110,054
umc2060        27.79             16,462,750
bnlg1046       31.75             18,701,374
umc2035        42.17             23,891,611
umc1705        45.36             28,196,243
umc1056        48.10             32,036,007
umc2294        48.43             33,783,084
umc1935        53.24             51,438,549
umc1850        54.79             54,416,924
mmp58          61.98             74,916,830
GRMZM2G116761  63.55             82,236,166
umc2298        65.07             84,800,717
umc1110        65.39             84,825,409
umc1224        66.70             92,368,617
umc1283        67.52             111,997,867
bnlg1287       67.69             121,584,002
dupssr10       68.70             142,483,421
bnlg2323       74.26             151,717,831
ZHL0301        77.01             159,447,730
umc1348        81.83             166,576,639
umc1966        86.64             169,231,037

Appendix B

Supplemental Content: Fine

mapping of chromosome five

domestication genes in maize

B.1 Tables

Table B.1: PCR markers used for genotyping RCNILs including gene or SNP target, AGPv2 position, and primer sequence.

Gene or SNP Name    AGPv2 Position   Primers
GRMZM2G003313       38,994,478       CCACAGAATCTCTCCACCAGACTTTTGCTTCTCACCCCAGA
GRMZM2G048045       62,595,351       GCCTACGAGCTGCAACAGGGCCCTCCGTTCTACACACAG
GRMZM2G116761       82,236,265       TCGCATCTGGAAAGAGCTTC TGAATTGCAAAAGAGGAAACA
PZE-105075181       82,970,868       GGCCCGGGCTAGAGAACCGAGTGCGGAGCTTGGGACCGAC
GRMZM2G158520       82,952,563       TCGGGCACGAAAGGTGTCGCCACTCTCTCCCGCTCCCGCT
GRMZM2G387127       83,436,098       CGCAAGCCGATCTTTTACTCGCAGTTGAACTCGAAGTGGA
GRMZM2G387127       83,436,808       CGCAAGCCGATCTTTTACTCGCAGTTGAACTCGAAGTGGA
GRMZM2G026117       84,249,368       CTCAGGCCAAGGTCTCACTCAGAGTGTGCGGCTTTCAGTT
umc1110             84,825,350       TTACACCAAGGTCCGAAACAAGATTCTTGGAAGGCAAGACTCTACCTG
PZE-105076775       85,553,605       CAAACCTCCCAAGAGAATGCTTGATGCAGATTCGCTGAAC
GRMZM2G017882       85,864,165       GTCCGCCTCGGCGACCTAGACCAGAGGGGACCTGTGGGGG
AC207043.3 FG002    86,014,290       CCACACTCATTTGACCAACGTGACGCGTGTTCTAGCTTGT
AC207043.3 FG002    86,014,338       CCACACTCATTTGACCAACGTGACGCGTGTTCTAGCTTGT
PZE-105077135       86,221,700       AAAGACGCAGCAGGAGAGAGTGCTACGTTACAGGCTGTCG
GRMZM2G102758       86,783,453       AGCAGGGTCAAGGACTACCATCCTGCAGCTCCTCTTCTTC
GRMZM2G063106       87,114,719       TGCATTTCTCTGACCTCCTTGTCCGACTTGAGGATCCTGTT
umc1283             111,997,810      CTGCTCCCTTATGATGTGATGATGTGCACTGAGGTGTAGGTAGAGCAA
GRMZM2G012923       151,446,717      AGCAAAGCATGGGCTAGTGTGCCATGCTGCTTATGGATCT
GRMZM2G027886       159,447,674      AACAGCTTTGCTTCCCTGAACCCAGAGGATCCAGAGTCAG
umc1348             166,576,570      CTCACTGACACTTGAACACACACGTTACTGGTCTCCTGATCCTTAGCG
umc1221             168,671,954      GCAACAGCAACTGGCAACAG AAACAGGCACAAAGCATGGATAG
umc1966             169,230,959      GTTTTCGACGAGGGGACTACATTTCACGGTTGAGAACTTCGCTTGTAG

Appendix C

Supplemental Content: The role of

cis regulatory evolution in maize

domestication

C.1 Figures

Figure C.1: Parent versus hybrid leaf tissue allele specific expression ratios. The parent (x-axis) versus F1 hybrid (y-axis) allele specific expression ratios are plotted against each other. Regulatory category in terms of the combination of significant statistical tests determined using the method described in methods is shown designated by color. Proportion and count of genes falling into the various regulatory categories are also shown in the lower right hand corner barplot.

Figure C.2: Parent versus hybrid stem tissue allele specific expression ratios. The parent (x-axis) versus F1 hybrid (y-axis) allele specific expression ratios are plotted against each other. Regulatory category in terms of the combination of significant statistical tests determined using the method described in methods is shown designated by color. Proportion and count of genes falling into the various regulatory categories are also shown in the lower right hand corner barplot.

Figure C.3: Dominance by additivity ratio grouped by regulatory category. Density plots of gene dominance by additivity (D/A) ratios for the three tissues grouped by regulatory category. There is no obvious shift in the distribution for any of the tissues or regulatory categories, indicating the gene regulatory category does not significantly impact overall additivity or dominance.

C.2 Tables

Table C.1: Biological replicates of F1 hybrid and parent inbred lines for RNAseq expression study with hybrid replicates internal and parent around the perimeter.

B73 CML103 Ki3 Mo17 Oh43 W22 Inbred
TIL01 2/2/2 0/2/2 2/2/2 2/2/2 2/2/2
TIL03 2/1/1 2/2/2 1/2/2 1/2/2 2/2/2 2/1/1
TIL05 2/2/2 2/2/2
TIL09 2/2/1 2/2/2 3/2/2 2/2/2 2/2/2
TIL10 2/2/2 2/2/2
TIL11 2/2/2 2/2/2 2/2/2 2/2/2 2/2/2 2/2/2
TIL14 4/2/2 2/2/2 2/1/2 2/2/2 1/2/2 2/2/2
TIL15 2/2/2 2/2/2
TIL25 4/3/2 3/2/2 2/2/2 2/2/2
Inbred 2/2/2 2/2/2 2/2/2 2/2/2 2/2/2 2/2/2

Table C.2: Adapter name, barcode sequence, and barcode length for Illumina adapters used in RNAseq libraries.

Adapter #  Adapter Name  Barcode Sequence  Barcode Length
1          PE YC3        GCATGT            5 nt
2          PE YC4        TGTGCT            5 nt
3          PE YC5        AGTCAT            5 nt
4          PE YC6        GTAAGT            5 nt
5          PE YC7        TCCTCT            5 nt
6          PE YC8        CAGGTT            5 nt
7          PE JM 1       TCCAT             4 nt
8          PE JM 2       TAGCT             4 nt
9          PE JM 3       GTTCT             4 nt
10         PE JM 4       CGATT             4 nt
11         PE TB 1       ATCGT             4 nt
12         PE TB 2       GCTAT             4 nt
13         PE TB 3       TGGAT             4 nt
14         PE TB 4       ATGCT             4 nt
15         PE ZL1        CACTAT            5 nt

Table C.3: Number of genomic paired end reads and coverage obtained for constructing pseudo-transcriptomes.

Inbred Line  # Reads   genome coverage
CML103       4.46E+08  21.24
Ki3          4.38E+08  19.85
Mo17         2.57E+08  11.37
Oh43         5.59E+08  20.56
TI01         3.44E+08  14.5
TI03         3.16E+08  13.15
TI05         4.76E+08  17.8
TI09         3.42E+08  15.21
TI10         5.29E+08  24.29
TI11         3.41E+08  15.97
TI14         3.22E+08  13.82
TI15         5.39E+08  24.22
TI25         4.27E+08  19.93
W22          3.07E+08  13.19
Average      4.03E+08  17.50714

Table C.4: Proportion of divergence due to cis regulatory effect grouped by overall parental divergence.

Gene Group1  N      Tissue  % cis ± SE
All genes    15939  Ear     0.4519 ± 0.0021
0 to 1       14140  Ear     0.4583 ± 0.0022
1 to 2       1312   Ear     0.3918 ± 0.0081
2 to 3       268    Ear     0.3524 ± 0.0188
3 to 4       95     Ear     0.337 ± 0.0298
4 to 5       45     Ear     0.4713 ± 0.0495
5+           79     Ear     0.7777 ± 0.0273

All genes    15925  Leaf    0.4164 ± 0.0021
0 to 1       13784  Leaf    0.4262 ± 0.0022
1 to 2       1739   Leaf    0.3309 ± 0.0065
2 to 3       277    Leaf    0.3752 ± 0.0173
3 to 4       52     Leaf    0.4458 ± 0.0437
4 to 5       21     Leaf    0.6534 ± 0.0566
5+           52     Leaf    0.7707 ± 0.0298

All genes    16018  Stem    0.4704 ± 0.0021
0 to 1       14746  Stem    0.4715 ± 0.0022
1 to 2       1000   Stem    0.4284 ± 0.0096
2 to 3       149    Stem    0.4629 ± 0.0233
3 to 4       40     Stem    0.5051 ± 0.0539
4 to 5       23     Stem    0.6365 ± 0.059
5+           60     Stem    0.8081 ± 0.0248

1 Group (except for “All genes”) indicates grouping of genes by the absolute value of the parent log2(Maize:Teosinte) ratio.

Table C.5: The number of genes for which the maize or teosinte allele is expressed at a higher level.

CCT Group  Tissue  Maize  Teosinte
A          Ear     34     9
A          Leaf    16     5
A          Stem    19     8
B          Ear     319    193
B          Leaf    265    159
B          Stem    249    155
C          Ear     594    396
C          Leaf    533    310
C          Stem    558    382
ABC        Ear     947    598
ABC        Leaf    814    474
ABC        Stem    826    545
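As a quick arithmetic check on the maize bias reported for the CCT gene lists, the counts in Table C.5 can be converted to the fraction of genes favoring the maize allele. The sketch below simply divides the maize count by the sum of the two counts. The dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# (CCT group, tissue) -> (genes favoring maize allele, genes favoring teosinte allele)
# Counts are copied from Table C.5.
counts = {
    ("A", "Ear"): (34, 9), ("A", "Leaf"): (16, 5), ("A", "Stem"): (19, 8),
    ("B", "Ear"): (319, 193), ("B", "Leaf"): (265, 159), ("B", "Stem"): (249, 155),
    ("C", "Ear"): (594, 396), ("C", "Leaf"): (533, 310), ("C", "Stem"): (558, 382),
    ("ABC", "Ear"): (947, 598), ("ABC", "Leaf"): (814, 474), ("ABC", "Stem"): (826, 545),
}


def maize_fraction(maize, teosinte):
    """Fraction of genes whose maize allele is expressed at the higher level."""
    return maize / (maize + teosinte)


fractions = {key: maize_fraction(*value) for key, value in counts.items()}
for (group, tissue), fraction in fractions.items():
    print(f"{group:>3} {tissue:<4} {fraction:.3f}")
```

Every group and tissue combination gives a fraction above one half, in line with the roughly 60% figure quoted in the discussion.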

Table C.6: Bias for the maize allele grouped by inbred line for the three tissues in the CCT-ABC gene list.

Maize Inbred  CCT Group  Tissue  Teosinte Bias  No Bias  Maize Bias  Maize:Teosinte Ratio
B73           ABC        Ear a   569            1        975         1.7135
CML103                           661            6        839         1.2693
Ki3                              602            5        915         1.5199
Mo17                             605            12       845         1.3967
Oh43                             594            1        949         1.5976
W22                              640            4        889         1.3891
non-B73                          606            0        939         1.5495
B73           ABC        Leaf b  465            0        823         1.7699
CML103                           556            6        688         1.2374
Ki3                              506            3        760         1.5020
Mo17                             478            4        765         1.6004
Oh43                             477            0        807         1.6918
W22                              502            1        775         1.5438
non-B73                          494            0        794         1.6073
B73           ABC        Stem c  524            0        847         1.6164
CML103                           582            7        739         1.2698
Ki3                              555            2        793         1.4288
Mo17                             520            4        806         1.5500
Oh43                             512            1        857         1.6738
W22                              546            1        814         1.4908
non-B73                          545            0        826         1.5156

a Fisher’s Exact Test for B73 versus cumulative non-B73 ratio, p = 0.1821.
b Fisher’s Exact Test for B73 versus cumulative non-B73 ratio, p = 0.2539.
c Fisher’s Exact Test for B73 versus cumulative non-B73 ratio, p = 0.4326.
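The footnotes to Table C.6 report Fisher's exact tests comparing B73 with the cumulative non-B73 lines. A minimal sketch of a two-sided Fisher's exact test is given below, using only the Python standard library. It assumes the 2x2 table is built from the teosinte-bias and maize-bias counts of the B73 and non-B73 rows (ear tissue shown), which is an assumption about how the test was set up. The function name and the tie tolerance are illustrative, and the result is not guaranteed to reproduce the reported p-values.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating point ties.
    p_value = sum(q for q in probs if q <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Ear tissue counts read from Table C.6: (teosinte bias, maize bias).
b73 = (569, 975)
non_b73 = (606, 939)
p_ear = fisher_exact_two_sided(b73[0], b73[1], non_b73[0], non_b73[1])
print(f"two-sided p = {p_ear:.4f}")
```

A non-significant result here would indicate that using B73 as the reference genome does not detectably inflate the maize bias relative to the other maize inbreds.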

Table C.7: Allele specific expression variation among F1 hybrids explained by maize and teosinte parent.

Tissue  Category   R2 maize  R2 teosinte  Maize/Teosinte  Gene Count
Ear     All genes  32.48%    38.21%       85.01%          13194
Leaf    All genes  31.76%    37.18%       85.43%          13121
Stem    All genes  32.04%    38.56%       83.09%          13305
Ear     ABC        32.25%    41.37%       77.96%          1545
Leaf    ABC        31.94%    39.79%       80.27%          1288
Stem    ABC        32.20%    41.26%       78.05%          1371
Ear     AB         30.76%    42.95%       71.64%          555
Leaf    AB         30.61%    41.69%       73.42%          445
Stem    AB         32.28%    42.22%       76.45%          431
Ear     A          26.58%    48.86%       54.41%          43
Leaf    A          20.11%    47.63%       42.22%          21
Stem    A          28.86%    48.26%       59.80%          27

Table C.8: Number of genes for which the maize and/or teosinte parent contributed to the variance among the F1 hybrid gene expression ratios (heterogeneous) and genes for which there was no variance in expression attributable to the maize or teosinte parent (homogeneous). CCT genes in groups A, B, and C in the three tissue types are shown.

Tissue Category Heterogeneous (Maize) Heterogeneous (Teosinte) Heterogeneous (Maize+Teosinte) Homogeneous Total
Ear All genes 1880 2959 2504 5851 13194
Leaf All genes 1810 3005 2327 5979 13121
Stem All genes 1924 3215 2645 5521 13305
Ear ABC 195 417 350 583 1545
Leaf ABC 165 322 285 516 1288
Stem ABC 193 374 321 483 1371
Ear AB 67 157 120 211 555
Leaf AB 54 117 104 170 445
Stem AB 57 128 105 141 431
Ear A 3 17 5 18 43
Leaf A 1 6 3 11 21
Stem A 2 8 7 10 27


Table C.9: Comparison of observed and expected numbers of genes classified as differentially expressed (DE) or not differentially expressed (NDE) by RNAseq and MicroArray assays in groups A, B, and C in the three tissue types.

CCT Group Tissue RNAseq Observed MicroArray-NDE Observed MicroArray-DE Expected MicroArray-NDE Expected MicroArray-DE
A Ear RNAseq-NDE 9587 184 9583.56 187.44
A Ear RNAseq-DE 25 4 28.44 0.56
A Leaf RNAseq-NDE 9774 192 9771.27 194.73
A Leaf RNAseq-DE 11 3 13.73 0.27
A Stem RNAseq-NDE 9804 198 9802.36 199.64
A Stem RNAseq-DE 16 2 17.64 0.36
A Union RNAseq-NDE 10097 203 10090.04 209.96
A Union RNAseq-DE 43 8 49.96 1.04
AB Ear RNAseq-NDE 9244 165 9228.50 180.50
AB Ear RNAseq-DE 368 23 383.50 7.50
AB Leaf RNAseq-NDE 9482 170 9463.41 188.59
AB Leaf RNAseq-DE 303 25 321.59 6.41
AB Stem RNAseq-NDE 9532 175 9513.25 193.75
AB Stem RNAseq-DE 288 25 306.75 6.25
AB Union RNAseq-NDE 9414 163 9381.78 195.22
AB Union RNAseq-DE 726 48 758.22 15.78
ABC Ear RNAseq-NDE 8529 136 8498.77 166.23
ABC Ear RNAseq-DE 1083 52 1113.23 21.77
ABC Leaf RNAseq-NDE 8842 147 8813.36 175.64
ABC Leaf RNAseq-DE 943 48 971.64 19.36
ABC Stem RNAseq-NDE 8835 154 8809.58 179.42
ABC Stem RNAseq-DE 985 46 1010.42 20.58
ABC Union RNAseq-NDE 7970 121 7926.07 164.93
ABC Union RNAseq-DE 2170 90 2213.93 46.07
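The expected counts in Table C.9 are consistent with the usual independence calculation for a 2x2 table (row total times column total, divided by the grand total). The following is a minimal sketch of that arithmetic, not the code used for the dissertation; the function name `expected_counts` is hypothetical, and the example uses the observed group A ear counts from the table.

```python
def expected_counts(observed):
    """Expected cell counts for a contingency table under independence.

    Each expected cell is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Group A, ear tissue (observed counts from Table C.9).
# Rows: RNAseq-NDE, RNAseq-DE. Columns: MicroArray-NDE, MicroArray-DE.
observed_a_ear = [[9587, 184], [25, 4]]

for row in expected_counts(observed_a_ear):
    print([round(x, 2) for x in row])
# prints [9583.56, 187.44] then [28.44, 0.56]
```

The rounded values match the Expected columns reported for group A in ear tissue.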


Table C.10: Regulatory categories for genes identified as differentially expressed between maize and teosinte by microarray assays.

Regulatory Category Ear Leaf Stem
Ambiguous 5.81% 7.66% 9.65%
Cis + Trans 25.73% 29.12% 22.39%
Cis only 26.14% 28.74% 30.89%
Cis x Trans 6.64% 6.13% 8.49%
Compensatory 7.05% 8.05% 6.56%
Conserved 13.28% 8.05% 12.74%
Trans only 15.35% 12.26% 9.27%
Total Genes 241 261 259


Table C.11: Fisher’s Exact Tests for the overlap between genes associated with differentially methylated regions (DMRs) and CCT-ABC genes from each of the three experimental tissues in our work.

Overlap Ear Leaf Stem Union
Expected 13.466 11.387 12.468 27.493
Observed 19 14 17 34
p-value 0.1092 0.4309 0.1755 0.1605


Table C.12: Number of candidate genes neighboring differentially methylated regions (DMRs) between maize and teosinte and proportion in which expression data agrees with methylated status.

Category Ear Leaf Stem
Total 19 14 17
A 1 0 0
B 3 3 3
C 15 11 14
Total-agree 57.90% 50.00% 58.80%
A-agree 100% NA NA
B-agree 100% 33.30% 33.30%
C-agree 46.70% 54.50% 64.30%


Table C.13: Characteristics of dominance/additivity ratios from a genome-wide analysis including basic statistics such as max, min, mean, and median as well as average D/A ratio for seven regulatory categories and the CCT candidate lists.

Statistic Ear Leaf Stem
Min -10.4557 -273.675 -27.8545
Max 10.56194 70.80451 78.71309
Median 0.032991 0.160156 -0.01118
Mean 0.035682 0.211276 -0.01638
Positive D/A 6863 7385 6593
Negative D/A 6331 5736 6712
Pos:Neg Ratio 1.084031 1.287483 0.982271
N 13194 13121 13305
Z-test p-value 2.442e-05 1.486e-13 0.354
Binomial p-value 3.775e-06 4.741e-47 0.306
Ambiguous -0.00408 0.020225 -0.00841
Cis + Trans -0.00204 0.455915 0.05871
Cis only -0.02053 0.044602 0.063987
Cis x Trans 0.14616 0.32702 -0.16874
Compensatory 0.052921 -0.08854 0.002721
Conserved 0.049997 0.009092 -0.05574
Trans only 0.08708 0.382572 -0.10058
CCT-A 0.03508 0.329661 0.026347
CCT-AB -0.0169 0.094785 0.129459
CCT-ABC -0.04257 0.208951 0.077445
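The Pos:Neg ratio and the binomial p-value in Table C.13 can be illustrated from the positive and negative D/A counts alone. The sketch below assumes the binomial test is an exact two-sided test against an equal (0.5) probability of positive and negative D/A; the table does not state this, and the function name `binomial_two_sided_p` is hypothetical. It uses only the Python standard library and is not the code used for the dissertation.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial test of k successes in n trials against p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value is
    twice the upper tail starting at max(k, n - k), capped at 1.
    """
    m = max(k, n - k)
    term = comb(n, m)          # C(n, m), updated iteratively below
    tail = 0
    for i in range(m, n + 1):
        tail += term
        term = term * (n - i) // (i + 1)   # C(n, i + 1) from C(n, i)
    return min(1.0, 2 * tail / 2 ** n)


# Ear tissue counts from Table C.13.
positive, negative = 6863, 6331
print(round(positive / negative, 6))            # Pos:Neg ratio
print(binomial_two_sided_p(positive, positive + negative))
```

Under the stated assumption, the ear result should be of the same order as the tabulated binomial p-value (about 4e-06); an exact match depends on the test settings actually used.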


Table C.14: Additive and dominant gene counts for the A, AB, and ABC cis only and trans only candidate lists. Dominance cells contain the number of genes for which the maize:teosinte allele was dominant. Fisher’s exact tests (FET) interrogate whether the degree of dominance/additivity differs between the cis and trans classes. The binomial test (BT) asks whether the numbers of maize-dominant and teosinte-dominant alleles are equal.

List Class Ear Add Ear Dom Leaf Add Leaf Dom Stem Add Stem Dom
A Cis only 11 1:0 5 1:0 3 2:1
A Trans only 13 19:2* 5 4:3 2 0:2
A FET p<0.005 p>0.05 p>0.05
AB Cis only 95 22:18 53 18:17 52 19:20
AB Trans only 112 89:35* 72 81:29* 23 10:13
AB FET p<0.005 p<0.005 p>0.05
ABC Cis only 266 62:65 136 50:56 178 68:71
ABC Trans only 203 112:68* 121 107:65* 67 35:42
ABC FET p<0.005 p<0.005 p<0.05

* Binomial test p-value < 0.005.
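As an illustration of the Fisher’s exact test in Table C.14, the sketch below builds a 2x2 table of additive versus dominant gene counts for the cis only and trans only classes and computes a two-sided p-value from the hypergeometric distribution. It assumes the dominant count is the sum of the maize-dominant and teosinte-dominant genes and that the test is two-sided; neither is stated in the table. The function name `fisher_exact_two_sided` is hypothetical and this is not the code used for the dissertation.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Ear tissue, AB list: [additive, dominant] for cis only and trans only.
ear_ab = [[95, 22 + 18], [112, 89 + 35]]
print(fisher_exact_two_sided(ear_ab) < 0.005)  # consistent with "FET p<0.005"
```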


Table C.15: Degree of overlap between our CCT (AB list) genes and genes in different transcription factor families.

Family Tissue Assayed Genes Observed Overlap Expected Overlap FET p-value
AP2 Ear 6 0 0.25 1
ARF Ear 27 4 1.14 0.03
ARR-B Ear 8 0 0.34 1
B3 Ear 18 1 0.76 0.54
BBR-BPC Ear 4 0 0.17 1
BES1 Ear 3 0 0.13 1
bHLH Ear 42 1 1.77 0.84
bZIP Ear 51 0 2.15 1
C2H2 Ear 28 2 1.18 0.33
C3H Ear 42 1 1.77 0.84
CAMTA Ear 8 0 0.34 1
CO-like Ear 3 0 0.13 1
CPP Ear 7 1 0.29 0.26
DBB Ear 4 0 0.17 1
Dof Ear 7 0 0.29 1
E2F/DP Ear 10 0 0.42 1
EIL Ear 4 0 0.17 1
ERF Ear 17 0 0.72 1
FAR1 Ear 15 2 0.63 0.13
G2-like Ear 11 0 0.46 1
GATA Ear 10 0 0.42 1
GeBP Ear 14 0 0.59 1
GRAS Ear 21 1 0.88 0.59
GRF Ear 8 0 0.34 1
HB-other Ear 14 0 0.59 1
HB-PHD Ear 2 0 0.08 1
HD-ZIP Ear 19 1 0.8 0.56
HSF Ear 12 1 0.5 0.4
LBD Ear 3 0 0.13 1
LFY Ear 0 0 0 NA
LSD Ear 3 0 0.13 1
M-type Ear 6 1 0.25 0.23
MIKC Ear 23 2 0.97 0.25
MYB Ear 23 2 0.97 0.25
MYB related Ear 42 4 1.77 0.1
NAC Ear 25 0 1.05 1
NF-X1 Ear 2 0 0.08 1
NF-YA Ear 10 0 0.42 1
NF-YB Ear 7 0 0.29 1
NF-YC Ear 7 0 0.29 1
Nin-like Ear 11 1 0.46 0.38
RAV Ear 0 0 0 NA
S1Fa-like Ear 0 0 0 NA
SBP Ear 12 0 0.5 1
SRS Ear 2 0 0.08 1
STAT Ear 1 0 0.04 1
TALE Ear 12 0 0.5 1
TCP Ear 9 0 0.38 1
Trihelix Ear 22 0 0.93 1
VOZ Ear 2 0 0.08 1
Whirly Ear 2 0 0.08 1
WOX Ear 0 0 0 NA
WRKY Ear 20 0 0.84 1
YABBY Ear 4 0 0.17 1
ZF-HD Ear 1 0 0.04 1
ALL Ear 649 24 27.3 0.77
AP2 Leaf 8 0 0.27 1
ARF Leaf 27 0 0.92 1
ARR-B Leaf 8 0 0.27 1
B3 Leaf 16 1 0.54 0.42
BBR-BPC Leaf 4 0 0.14 1
BES1 Leaf 3 0 0.1 1
bHLH Leaf 41 0 1.39 1
bZIP Leaf 42 0 1.42 1
C2H2 Leaf 29 2 0.98 0.26
C3H Leaf 41 0 1.39 1
CAMTA Leaf 8 0 0.27 1
CO-like Leaf 5 0 0.17 1
CPP Leaf 7 0 0.24 1
DBB Leaf 6 0 0.2 1
Dof Leaf 8 0 0.27 1
E2F/DP Leaf 10 0 0.34 1
EIL Leaf 4 0 0.14 1
ERF Leaf 15 0 0.51 1
FAR1 Leaf 14 0 0.47 1
G2-like Leaf 16 0 0.54 1
GATA Leaf 14 0 0.47 1
GeBP Leaf 14 0 0.47 1
GRAS Leaf 19 1 0.64 0.48
GRF Leaf 6 0 0.2 1
HB-other Leaf 14 1 0.47 0.38
HB-PHD Leaf 2 0 0.07 1
HD-ZIP Leaf 16 1 0.54 0.42
HSF Leaf 10 0 0.34 1
LBD Leaf 1 1 0.03 0.03
LFY Leaf 0 0 0 NA
LSD Leaf 3 0 0.1 1
M-type Leaf 3 0 0.1 1
MIKC Leaf 9 2 0.31 0.04
MYB Leaf 31 2 1.05 0.28
MYB related Leaf 44 1 1.49 0.78
NAC Leaf 28 2 0.95 0.25
NF-X1 Leaf 2 0 0.07 1
NF-YA Leaf 9 0 0.31 1
NF-YB Leaf 5 0 0.17 1
NF-YC Leaf 8 0 0.27 1
Nin-like Leaf 10 0 0.34 1
RAV Leaf 0 0 0 NA
S1Fa-like Leaf 0 0 0 NA
SBP Leaf 11 0 0.37 1
SRS Leaf 0 0 0 NA
STAT Leaf 1 0 0.03 1
TALE Leaf 12 0 0.41 1
TCP Leaf 8 0 0.27 1
Trihelix Leaf 22 1 0.75 0.53
VOZ Leaf 2 0 0.07 1
Whirly Leaf 2 0 0.07 1
WOX Leaf 0 0 0 NA
WRKY Leaf 16 0 0.54 1
YABBY Leaf 4 0 0.14 1
ZF-HD Leaf 1 0 0.03 1
ALL Leaf 623 15 21.13 0.94
AP2 Stem 8 0 0.26 1
ARF Stem 27 3 0.87 0.06
ARR-B Stem 8 0 0.26 1
B3 Stem 14 0 0.45 1
BBR-BPC Stem 4 0 0.13 1
BES1 Stem 3 0 0.1 1
bHLH Stem 50 2 1.62 0.49
bZIP Stem 47 1 1.52 0.79
C2H2 Stem 28 2 0.91 0.23
C3H Stem 41 1 1.33 0.74
CAMTA Stem 8 0 0.26 1
CO-like Stem 4 0 0.13 1
CPP Stem 7 0 0.23 1
DBB Stem 6 0 0.19 1
Dof Stem 8 0 0.26 1
E2F/DP Stem 10 0 0.32 1
EIL Stem 4 1 0.13 0.12
ERF Stem 16 0 0.52 1
FAR1 Stem 15 0 0.49 1
G2-like Stem 14 0 0.45 1
GATA Stem 12 0 0.39 1
GeBP Stem 13 0 0.42 1
GRAS Stem 20 0 0.65 1
GRF Stem 7 0 0.23 1
HB-other Stem 15 2 0.49 0.08
HB-PHD Stem 2 0 0.06 1
HD-ZIP Stem 17 1 0.55 0.43
HSF Stem 14 0 0.45 1
LBD Stem 2 0 0.06 1
LFY Stem 0 0 0 NA
LSD Stem 3 0 0.1 1
M-type Stem 4 1 0.13 0.12
MIKC Stem 10 2 0.32 0.04
MYB Stem 23 2 0.75 0.17
MYB related Stem 42 1 1.36 0.75
NAC Stem 29 0 0.94 1
NF-X1 Stem 2 0 0.06 1
NF-YA Stem 10 1 0.32 0.28
NF-YB Stem 6 0 0.19 1
NF-YC Stem 7 0 0.23 1
Nin-like Stem 11 0 0.36 1
RAV Stem 0 0 0 NA
S1Fa-like Stem 0 0 0 NA
SBP Stem 11 0 0.36 1
SRS Stem 2 0 0.06 1
STAT Stem 1 0 0.03 1
TALE Stem 13 0 0.42 1
TCP Stem 6 1 0.19 0.18
Trihelix Stem 23 0 0.75 1
VOZ Stem 2 0 0.06 1
Whirly Stem 2 0 0.06 1
WOX Stem 0 0 0 NA
WRKY Stem 19 0 0.62 1
YABBY Stem 4 0 0.13 1
ZF-HD Stem 0 0 0 NA
ALL Stem 640 20 20.73 0.6
AP2 Union 10 0 0.76 1
ARF Union 27 6 2.06 0.01
ARR-B Union 8 0 0.61 1
B3 Union 18 2 1.38 0.41
BBR-BPC Union 4 0 0.31 1
BES1 Union 3 0 0.23 1
bHLH Union 53 3 4.05 0.78
bZIP Union 52 1 3.97 0.98
C2H2 Union 31 4 2.37 0.21
C3H Union 42 2 3.21 0.84
CAMTA Union 8 0 0.61 1
CO-like Union 5 0 0.38 1
CPP Union 7 1 0.54 0.43
DBB Union 6 0 0.46 1
Dof Union 9 0 0.69 1
E2F/DP Union 10 0 0.76 1
EIL Union 4 1 0.31 0.27
ERF Union 18 0 1.38 1
FAR1 Union 15 2 1.15 0.32
G2-like Union 18 0 1.38 1
GATA Union 15 0 1.15 1
GeBP Union 15 0 1.15 1
GRAS Union 23 2 1.76 0.53
GRF Union 8 0 0.61 1
HB-other Union 15 2 1.15 0.32
HB-PHD Union 2 0 0.15 1
HD-ZIP Union 20 3 1.53 0.19
HSF Union 14 1 1.07 0.67
LBD Union 3 1 0.23 0.21
LFY Union 0 0 0 NA
LSD Union 3 0 0.23 1
M-type Union 7 2 0.54 0.09
MIKC Union 25 5 1.91 0.04
MYB Union 32 3 2.45 0.45
MYB related Union 48 4 3.67 0.51
NAC Union 35 2 2.68 0.76
NF-X1 Union 2 0 0.15 1
NF-YA Union 10 1 0.76 0.55
NF-YB Union 7 0 0.54 1
NF-YC Union 8 0 0.61 1
Nin-like Union 11 1 0.84 0.58
RAV Union 0 0 0 NA
S1Fa-like Union 0 0 0 NA
SBP Union 12 0 0.92 1
SRS Union 2 0 0.15 1
STAT Union 1 0 0.08 1
TALE Union 14 0 1.07 1
TCP Union 9 1 0.69 0.51
Trihelix Union 24 1 1.83 0.85
VOZ Union 2 0 0.15 1
Whirly Union 2 0 0.15 1
WOX Union 0 0 0 NA
WRKY Union 23 0 1.76 1
YABBY Union 4 0 0.31 1
ZF-HD Union 1 0 0.08 1
ALL Union 724 49 55.34 0.84
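The Expected Overlap values in Table C.15 are consistent with scaling each family's assayed gene count by the fraction of all assayed genes that fall in the AB list (for ear, 555 of 13194 genes, taken from Table C.8). The sketch below shows that arithmetic together with a one-sided enrichment p-value from the hypergeometric upper tail. Both the choice of background gene set and the one-sided form of the test are assumptions on our part, and the function names are hypothetical; this is not the code used for the dissertation.

```python
from math import comb


def expected_overlap(family_size, list_size, universe_size):
    """Expected number of family genes in the candidate list by chance."""
    return family_size * list_size / universe_size


def enrichment_p(observed, family_size, list_size, universe_size):
    """One-sided p-value P(X >= observed) for a hypergeometric overlap X."""
    denom = comb(universe_size, family_size)
    upper = min(family_size, list_size)
    numer = sum(
        comb(list_size, x) * comb(universe_size - list_size, family_size - x)
        for x in range(observed, upper + 1)
    )
    return numer / denom


# ARF family in ear: 27 assayed genes, AB list of 555 among 13194 assayed genes.
print(round(expected_overlap(27, 555, 13194), 2))   # 1.14, as tabulated
# LBD family in leaf: 1 assayed gene, observed overlap 1, AB list of 445 of 13121.
print(round(enrichment_p(1, 1, 445, 13121), 2))     # 0.03, as tabulated
```

Families with no observed overlap give a p-value of 1 under this one-sided form, which agrees with the table.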


Table C.16: Degree of overlap between CCT (AB list) differentially expressed genes and genes in the 1.5 support intervals for QTL from a previous study.

Trait Tissue Assayed Genes Observed Overlap Expected Overlap FET p-value
BARE Ear 0 0 0 1
DIAM Ear 29 4 1.22 0.03
DIS Ear 4 1 0.17 0.16
DTP Ear 10 1 0.42 0.35
GLCO Ear 3 0 0.13 1
GLU Ear 0 0 0 1
KRN Ear 15 2 0.63 0.13
KW Ear 17 1 0.72 0.52
LEN Ear 4 1 0.17 0.16
PROL Ear 5 0 0.21 1
STAM Ear 10 1 0.42 0.35
BARE Leaf 0 0 0 1
DIAM Leaf 28 0 0.95 1
DIS Leaf 4 1 0.14 0.13
DTP Leaf 9 0 0.31 1
GLCO Leaf 3 0 0.1 1
GLU Leaf 0 0 0 1
KRN Leaf 13 0 0.44 1
KW Leaf 17 3 0.58 0.02
LEN Leaf 4 0 0.14 1
PROL Leaf 5 0 0.17 1
STAM Leaf 9 1 0.31 0.27
BARE Stem 0 0 0 1
DIAM Stem 28 1 0.91 0.6
DIS Stem 4 0 0.13 1
DTP Stem 10 0 0.32 1
GLCO Stem 3 0 0.1 1
GLU Stem 0 0 0 1
KRN Stem 14 1 0.45 0.37
KW Stem 18 3 0.58 0.02
LEN Stem 4 0 0.13 1
PROL Stem 5 0 0.16 1
STAM Stem 10 0 0.32 1


Table C.17: Degree of overlap between our CCT (AB list) differentially expressed genes and genes in metabolic pathways defined in KEGG.

Pathway Group1 Pathway Genes Assayed Genes Overlap (obs) Overlap (exp) FET p-value
Alpha-linoleic Acid Metabolism Ear-CCT-A 26 14 0 0.046 1
Arachidonic Acid Metabolism Ear-CCT-A 10 7 0 0.023 1
Biosynthesis of Unsaturated Fatty Acids Ear-CCT-A 33 16 0 0.052 1
Cutin, Suberine, and Wax Biosynthesis Ear-CCT-A 10 5 0 0.016 1
Ether Lipid Metabolism Ear-CCT-A 11 7 0 0.023 1
Fatty Acid Biosynthesis Ear-CCT-A 32 19 0 0.062 1
Fatty Acid Degradation Ear-CCT-A 34 27 0 0.088 1
Fatty Acid Elongation Ear-CCT-A 16 8 0 0.026 1
Glycerolipid Metabolism Ear-CCT-A 46 31 0 0.101 1
Glycerophospholipid Metabolism Ear-CCT-A 64 45 0 0.147 1
Linoleic Acid Metabolism Ear-CCT-A 12 5 0 0.016 1
Sphingolipid Metabolism Ear-CCT-A 21 13 0 0.042 1
Starch and sucrose metabolism Ear-CCT-A 98 59 0 0.192 1
Steroid Biosynthesis Ear-CCT-A 25 15 0 0.049 1
Synthesis/Degradation of Ketone Bodies Ear-CCT-A 8 8 0 0.026 1
ALL Ear-CCT-A 353 223 0 0.727 1

Table C.17: 1 Tissue, candidate, and level of list.

Alpha-linoleic Acid Metabolism Ear-CCT-AB 26 14 1 0.589 0.452
Arachidonic Acid Metabolism Ear-CCT-AB 10 7 0 0.294 1
Biosynthesis of Unsaturated Fatty Acids Ear-CCT-AB 33 16 0 0.673 1
Cutin, Suberine, and Wax Biosynthesis Ear-CCT-AB 10 5 0 0.21 1
Ether Lipid Metabolism Ear-CCT-AB 11 7 0 0.294 1
Fatty Acid Biosynthesis Ear-CCT-AB 32 19 2 0.799 0.189
Fatty Acid Degradation Ear-CCT-AB 34 27 1 1.136 0.687
Fatty Acid Elongation Ear-CCT-AB 16 8 0 0.337 1
Glycerolipid Metabolism Ear-CCT-AB 46 31 0 1.304 1
Glycerophospholipid Metabolism Ear-CCT-AB 64 45 0 1.893 1
Linoleic Acid Metabolism Ear-CCT-AB 12 5 1 0.21 0.193
Sphingolipid Metabolism Ear-CCT-AB 21 13 0 0.547 1
Starch and sucrose metabolism Ear-CCT-AB 98 59 3 2.482 0.454

Steroid Biosynthesis Ear-CCT-AB 25 15 1 0.631 0.475
Synthesis/Degradation of Ketone Bodies Ear-CCT-AB 8 8 1 0.337 0.291
ALL Ear-CCT-AB 353 223 8 9.38 0.726
Alpha-linoleic Acid Metabolism Ear-CCT-ABC 26 14 5 1.639 0.018
Arachidonic Acid Metabolism Ear-CCT-ABC 10 7 1 0.82 0.582
Biosynthesis of Unsaturated Fatty Acids Ear-CCT-ABC 33 16 2 1.874 0.575
Cutin, Suberine, and Wax Biosynthesis Ear-CCT-ABC 10 5 0 0.585 1
Ether Lipid Metabolism Ear-CCT-ABC 11 7 1 0.82 0.582
Fatty Acid Biosynthesis Ear-CCT-ABC 32 19 3 2.225 0.388
Fatty Acid Degradation Ear-CCT-ABC 34 27 5 3.162 0.203
Fatty Acid Elongation Ear-CCT-ABC 16 8 0 0.937 1
Glycerolipid Metabolism Ear-CCT-ABC 46 31 3 3.63 0.721
Glycerophospholipid Metabolism Ear-CCT-ABC 64 45 5 5.269 0.619

Linoleic Acid Metabolism Ear-CCT-ABC 12 5 1 0.585 0.464
Sphingolipid Metabolism Ear-CCT-ABC 21 13 0 1.522 1
Starch and sucrose metabolism Ear-CCT-ABC 98 59 7 6.909 0.545
Steroid Biosynthesis Ear-CCT-ABC 25 15 2 1.756 0.539
Synthesis/Degradation of Ketone Bodies Ear-CCT-ABC 8 8 1 0.937 0.631
ALL Ear-CCT-ABC 353 223 28 26.113 0.376
Alpha-linoleic Acid Metabolism Ear-trans-A 26 14 1 0.062 0.06
Arachidonic Acid Metabolism Ear-trans-A 10 7 0 0.031 1
Biosynthesis of Unsaturated Fatty Acids Ear-trans-A 33 16 0 0.07 1
Cutin, Suberine, and Wax Biosynthesis Ear-trans-A 10 5 0 0.022 1
Ether Lipid Metabolism Ear-trans-A 11 7 1 0.031 0.03
Fatty Acid Biosynthesis Ear-trans-A 32 19 0 0.084 1
Fatty Acid Degradation Ear-trans-A 34 27 0 0.119 1
Fatty Acid Elongation Ear-trans-A 16 8 0 0.035 1
Glycerolipid Metabolism Ear-trans-A 46 31 0 0.136 1

Glycerophospholipid Metabolism Ear-trans-A 64 45 2 0.198 0.017
Linoleic Acid Metabolism Ear-trans-A 12 5 1 0.022 0.022
Sphingolipid Metabolism Ear-trans-A 21 13 0 0.057 1
Starch and sucrose metabolism Ear-trans-A 98 59 2 0.259 0.028
Steroid Biosynthesis Ear-trans-A 25 15 0 0.066 1
Synthesis/Degradation of Ketone Bodies Ear-trans-A 8 8 0 0.035 1
ALL Ear-trans-A 353 223 5 0.98 0.003
Alpha-linoleic Acid Metabolism Ear-trans-AB 26 14 2 0.506 0.089
Arachidonic Acid Metabolism Ear-trans-AB 10 7 0 0.253 1
Biosynthesis of Unsaturated Fatty Acids Ear-trans-AB 33 16 1 0.578 0.445
Cutin, Suberine, and Wax Biosynthesis Ear-trans-AB 10 5 2 0.181 0.012
Ether Lipid Metabolism Ear-trans-AB 11 7 1 0.253 0.227
Fatty Acid Biosynthesis Ear-trans-AB 32 19 1 0.687 0.503
Fatty Acid Degradation Ear-trans-AB 34 27 2 0.976 0.255

Fatty Acid Elongation Ear-trans-AB 16 8 0 0.289 1
Glycerolipid Metabolism Ear-trans-AB 46 31 4 1.121 0.025
Glycerophospholipid Metabolism Ear-trans-AB 64 45 5 1.627 0.023
Linoleic Acid Metabolism Ear-trans-AB 12 5 1 0.181 0.168
Sphingolipid Metabolism Ear-trans-AB 21 13 0 0.47 1
Starch and sucrose metabolism Ear-trans-AB 98 59 3 2.133 0.36
Steroid Biosynthesis Ear-trans-AB 25 15 0 0.542 1
Synthesis/Degradation of Ketone Bodies Ear-trans-AB 8 8 0 0.289 1
ALL Ear-trans-AB 353 223 15 8.062 0.016
Alpha-linoleic Acid Metabolism Ear-trans-ABC 26 14 2 1.213 0.345
Arachidonic Acid Metabolism Ear-trans-ABC 10 7 0 0.606 1
Biosynthesis of Unsaturated Fatty Acids Ear-trans-ABC 33 16 1 1.386 0.766
Cutin, Suberine, and Wax Biosynthesis Ear-trans-ABC 10 5 2 0.433 0.063

Ether Lipid Metabolism Ear-trans-ABC 11 7 2 0.606 0.118
Fatty Acid Biosynthesis Ear-trans-ABC 32 19 1 1.646 0.821
Fatty Acid Degradation Ear-trans-ABC 34 27 3 2.339 0.418
Fatty Acid Elongation Ear-trans-ABC 16 8 1 0.693 0.516
Glycerolipid Metabolism Ear-trans-ABC 46 31 6 2.686 0.047
Glycerophospholipid Metabolism Ear-trans-ABC 64 45 7 3.898 0.09
Linoleic Acid Metabolism Ear-trans-ABC 12 5 1 0.433 0.364
Sphingolipid Metabolism Ear-trans-ABC 21 13 0 1.126 1
Starch and sucrose metabolism Ear-trans-ABC 98 59 5 5.111 0.588
Steroid Biosynthesis Ear-trans-ABC 25 15 2 1.299 0.378
Synthesis/Degradation of Ketone Bodies Ear-trans-ABC 8 8 0 0.693 1
ALL Ear-trans-ABC 353 223 23 19.319 0.218
Alpha-linoleic Acid Metabolism Leaf-CCT-A 26 13 0 0.021 1

Arachidonic Acid Metabolism Leaf-CCT-A 10 7 0 0.011 1
Biosynthesis of Unsaturated Fatty Acids Leaf-CCT-A 33 19 0 0.03 1
Cutin, Suberine, and Wax Biosynthesis Leaf-CCT-A 10 6 0 0.01 1
Ether Lipid Metabolism Leaf-CCT-A 11 7 0 0.011 1
Fatty Acid Biosynthesis Leaf-CCT-A 32 19 0 0.03 1
Fatty Acid Degradation Leaf-CCT-A 34 30 0 0.048 1
Fatty Acid Elongation Leaf-CCT-A 16 9 0 0.014 1
Glycerolipid Metabolism Leaf-CCT-A 46 34 0 0.054 1
Glycerophospholipid Metabolism Leaf-CCT-A 64 47 0 0.075 1
Linoleic Acid Metabolism Leaf-CCT-A 12 5 0 0.008 1
Sphingolipid Metabolism Leaf-CCT-A 21 14 0 0.022 1
Starch and sucrose metabolism Leaf-CCT-A 98 62 0 0.099 1
Steroid Biosynthesis Leaf-CCT-A 25 15 0 0.024 1
Synthesis/Degradation of Ketone Bodies Leaf-CCT-A 8 8 0 0.013 1
ALL Leaf-CCT-A 353 236 0 0.378 1
Alpha-linoleic Acid Metabolism Leaf-CCT-AB 26 13 0 0.441 1

Arachidonic Acid Metabolism Leaf-CCT-AB 10 7 0 0.237 1
Biosynthesis of Unsaturated Fatty Acids Leaf-CCT-AB 33 19 2 0.644 0.134
Cutin, Suberine, and Wax Biosynthesis Leaf-CCT-AB 10 6 2 0.203 0.016
Ether Lipid Metabolism Leaf-CCT-AB 11 7 1 0.237 0.215
Fatty Acid Biosynthesis Leaf-CCT-AB 32 19 0 0.644 1
Fatty Acid Degradation Leaf-CCT-AB 34 30 2 1.017 0.271
Fatty Acid Elongation Leaf-CCT-AB 16 9 0 0.305 1
Glycerolipid Metabolism Leaf-CCT-AB 46 34 1 1.153 0.691
Glycerophospholipid Metabolism Leaf-CCT-AB 64 47 2 1.594 0.477
Linoleic Acid Metabolism Leaf-CCT-AB 12 5 0 0.17 1
Sphingolipid Metabolism Leaf-CCT-AB 21 14 0 0.475 1
Starch and sucrose metabolism Leaf-CCT-AB 98 62 1 2.103 0.883
Steroid Biosynthesis Leaf-CCT-AB 25 15 0 0.509 1

Synthesis/Degradation of Ketone Bodies Leaf-CCT-AB 8 8 1 0.271 0.241
ALL Leaf-CCT-AB 353 236 9 8.004 0.408
Alpha-linoleic Acid Metabolism Leaf-CCT-ABC 26 13 1 1.276 0.739
Arachidonic Acid Metabolism Leaf-CCT-ABC 10 7 1 0.687 0.515
Biosynthesis of Unsaturated Fatty Acids Leaf-CCT-ABC 33 19 3 1.865 0.285
Cutin, Suberine, and Wax Biosynthesis Leaf-CCT-ABC 10 6 2 0.589 0.111
Ether Lipid Metabolism Leaf-CCT-ABC 11 7 1 0.687 0.515
Fatty Acid Biosynthesis Leaf-CCT-ABC 32 19 2 1.865 0.569
Fatty Acid Degradation Leaf-CCT-ABC 34 30 7 2.945 0.023
Fatty Acid Elongation Leaf-CCT-ABC 16 9 0 0.883 1
Glycerolipid Metabolism Leaf-CCT-ABC 46 34 4 3.338 0.432
Glycerophospholipid Metabolism Leaf-CCT-ABC 64 47 4 4.614 0.691
Linoleic Acid Metabolism Leaf-CCT-ABC 12 5 0 0.491 1

Sphingolipid Metabolism Leaf-CCT-ABC 21 14 0 1.374 1
Starch and sucrose metabolism Leaf-CCT-ABC 98 62 6 6.086 0.577
Steroid Biosynthesis Leaf-CCT-ABC 25 15 1 1.472 0.788
Synthesis/Degradation of Ketone Bodies Leaf-CCT-ABC 8 8 1 0.785 0.563
ALL Leaf-CCT-ABC 353 236 26 23.167 0.296
Alpha-linoleic Acid Metabolism Leaf-trans-A 26 13 0 0.026 1
Arachidonic Acid Metabolism Leaf-trans-A 10 7 0 0.014 1
Biosynthesis of Unsaturated Fatty Acids Leaf-trans-A 33 19 0 0.038 1
Cutin, Suberine, and Wax Biosynthesis Leaf-trans-A 10 6 0 0.012 1
Ether Lipid Metabolism Leaf-trans-A 11 7 0 0.014 1
Fatty Acid Biosynthesis Leaf-trans-A 32 19 0 0.038 1
Fatty Acid Degradation Leaf-trans-A 34 30 0 0.059 1
Fatty Acid Elongation Leaf-trans-A 16 9 0 0.018 1
Glycerolipid Metabolism Leaf-trans-A 46 34 0 0.067 1
Glycerophospholipid Metabolism Leaf-trans-A 64 47 0 0.093 1

Linoleic Acid Metabolism Leaf-trans-A 12 5 0 0.01 1
Sphingolipid Metabolism Leaf-trans-A 21 14 0 0.028 1
Starch and sucrose metabolism Leaf-trans-A 98 62 0 0.123 1
Steroid Biosynthesis Leaf-trans-A 25 15 0 0.03 1
Synthesis/Degradation of Ketone Bodies Leaf-trans-A 8 8 0 0.016 1
ALL Leaf-trans-A 353 236 0 0.468 1
Alpha-linoleic Acid Metabolism Leaf-trans-AB 26 13 1 0.447 0.365
Arachidonic Acid Metabolism Leaf-trans-AB 10 7 0 0.241 1
Biosynthesis of Unsaturated Fatty Acids Leaf-trans-AB 33 19 0 0.653 1
Cutin, Suberine, and Wax Biosynthesis Leaf-trans-AB 10 6 0 0.206 1
Ether Lipid Metabolism Leaf-trans-AB 11 7 0 0.241 1
Fatty Acid Biosynthesis Leaf-trans-AB 32 19 1 0.653 0.486
Fatty Acid Degradation Leaf-trans-AB 34 30 0 1.031 1
Fatty Acid Elongation Leaf-trans-AB 16 9 0 0.309 1

Glycerolipid Metabolism Leaf-trans-AB 46 34 0 1.169 1
Glycerophospholipid Metabolism Leaf-trans-AB 64 47 0 1.616 1
Linoleic Acid Metabolism Leaf-trans-AB 12 5 1 0.172 0.16
Sphingolipid Metabolism Leaf-trans-AB 21 14 1 0.481 0.387
Starch and sucrose metabolism Leaf-trans-AB 98 62 2 2.131 0.634
Steroid Biosynthesis Leaf-trans-AB 25 15 1 0.516 0.408
Synthesis and Degradation of Ketone Bodies Leaf-trans-AB 8 8 0 0.275 1
ALL Leaf-trans-AB 353 236 6 8.112 0.826
Alpha-linoleic Acid Metabolism Leaf-trans-ABC 26 13 2 1.212 0.345
Arachidonic Acid Metabolism Leaf-trans-ABC 10 7 0 0.652 1
Biosynthesis of Unsaturated Fatty Acids Leaf-trans-ABC 33 19 1 1.771 0.844
Cutin, Suberine, and Wax Biosynthesis Leaf-trans-ABC 10 6 0 0.559 1
Ether Lipid Metabolism Leaf-trans-ABC 11 7 0 0.652 1

Fatty Acid Biosynthesis Leaf-trans-ABC 32 19 2 1.771 0.54
Fatty Acid Degradation Leaf-trans-ABC 34 30 3 2.796 0.539
Fatty Acid Elongation Leaf-trans-ABC 16 9 0 0.839 1
Glycerolipid Metabolism Leaf-trans-ABC 46 34 2 3.169 0.839
Glycerophospholipid Metabolism Leaf-trans-ABC 64 47 3 4.381 0.827
Linoleic Acid Metabolism Leaf-trans-ABC 12 5 1 0.466 0.387
Sphingolipid Metabolism Leaf-trans-ABC 21 14 1 1.305 0.746
Starch and sucrose metabolism Leaf-trans-ABC 98 62 3 5.779 0.937
Steroid Biosynthesis Leaf-trans-ABC 25 15 3 1.398 0.158
Synthesis/Degradation of Ketone Bodies Leaf-trans-ABC 8 8 1 0.746 0.543
ALL Leaf-trans-ABC 353 236 17 21.997 0.897
Alpha-linoleic Acid Metabolism Stem-CCT-A 26 15 0 0.03 1
Arachidonic Acid Metabolism Stem-CCT-A 10 7 0 0.014 1

Biosynthesis of Unsaturated Fatty Acids Stem-CCT-A 33 17 0 0.034 1
Cutin, Suberine, and Wax Biosynthesis Stem-CCT-A 10 6 0 0.012 1
Ether Lipid Metabolism Stem-CCT-A 11 7 0 0.014 1
Fatty Acid Biosynthesis Stem-CCT-A 32 19 0 0.039 1
Fatty Acid Degradation Stem-CCT-A 34 30 0 0.061 1
Fatty Acid Elongation Stem-CCT-A 16 8 0 0.016 1
Glycerolipid Metabolism Stem-CCT-A 46 32 0 0.065 1
Glycerophospholipid Metabolism Stem-CCT-A 64 47 0 0.095 1
Linoleic Acid Metabolism Stem-CCT-A 12 6 0 0.012 1
Sphingolipid Metabolism Stem-CCT-A 21 14 0 0.028 1
Starch and sucrose metabolism Stem-CCT-A 98 61 1 0.124 0.117
Steroid Biosynthesis Stem-CCT-A 25 16 0 0.032 1
Synthesis/Degradation of Ketone Bodies Stem-CCT-A 8 8 0 0.016 1
ALL Stem-CCT-A 353 235 1 0.477 0.382
Alpha-linoleic Acid Metabolism Stem-CCT-AB 26 15 1 0.486 0.39
Arachidonic Acid Metabolism Stem-CCT-AB 10 7 0 0.227 1

Page 168: The Complex Inheritance of Maize Domestication …...2014/06/22  · The dissertation is approved by the following members of the Final Oral Committee: John F. Doebley, Professor,

149

Pathway Group1 PathwayGenes

AssayedGenes

Overlap(obs)

Overlap(exp)

FETp-value

Biosynthesis of Unsaturated Fatty Acids | Stem-CCT-AB | 33 | 17 | 1 | 0.551 | 0.429
Cutin, Suberine, and Wax Biosynthesis | Stem-CCT-AB | 10 | 6 | 0 | 0.194 | 1
Ether Lipid Metabolism | Stem-CCT-AB | 11 | 7 | 1 | 0.227 | 0.206
Fatty Acid Biosynthesis | Stem-CCT-AB | 32 | 19 | 1 | 0.615 | 0.465
Fatty Acid Degradation | Stem-CCT-AB | 34 | 30 | 1 | 0.972 | 0.628
Fatty Acid Elongation | Stem-CCT-AB | 16 | 8 | 0 | 0.259 | 1
Glycerolipid Metabolism | Stem-CCT-AB | 46 | 32 | 0 | 1.037 | 1
Glycerophospholipid Metabolism | Stem-CCT-AB | 64 | 47 | 2 | 1.523 | 0.453
Linoleic Acid Metabolism | Stem-CCT-AB | 12 | 6 | 0 | 0.194 | 1
Sphingolipid Metabolism | Stem-CCT-AB | 21 | 14 | 0 | 0.454 | 1
Starch and sucrose metabolism | Stem-CCT-AB | 98 | 61 | 1 | 1.976 | 0.866
Steroid Biosynthesis | Stem-CCT-AB | 25 | 16 | 1 | 0.518 | 0.41
Synthesis/Degradation of Ketone Bodies | Stem-CCT-AB | 8 | 8 | 1 | 0.259 | 0.232


ALL | Stem-CCT-AB | 353 | 235 | 8 | 7.613 | 0.494
Alpha-linoleic Acid Metabolism | Stem-CCT-ABC | 26 | 15 | 3 | 1.546 | 0.196
Arachidonic Acid Metabolism | Stem-CCT-ABC | 10 | 7 | 1 | 0.721 | 0.533
Biosynthesis of Unsaturated Fatty Acids | Stem-CCT-ABC | 33 | 17 | 2 | 1.752 | 0.535
Cutin, Suberine, and Wax Biosynthesis | Stem-CCT-ABC | 10 | 6 | 0 | 0.618 | 1
Ether Lipid Metabolism | Stem-CCT-ABC | 11 | 7 | 2 | 0.721 | 0.157
Fatty Acid Biosynthesis | Stem-CCT-ABC | 32 | 19 | 1 | 1.958 | 0.874
Fatty Acid Degradation | Stem-CCT-ABC | 34 | 30 | 4 | 3.091 | 0.374
Fatty Acid Elongation | Stem-CCT-ABC | 16 | 8 | 1 | 0.824 | 0.581
Glycerolipid Metabolism | Stem-CCT-ABC | 46 | 32 | 1 | 3.297 | 0.969
Glycerophospholipid Metabolism | Stem-CCT-ABC | 64 | 47 | 9 | 4.843 | 0.048
Linoleic Acid Metabolism | Stem-CCT-ABC | 12 | 6 | 0 | 0.618 | 1
Sphingolipid Metabolism | Stem-CCT-ABC | 21 | 14 | 0 | 1.443 | 1


Starch and sucrose metabolism | Stem-CCT-ABC | 98 | 61 | 9 | 6.286 | 0.172
Steroid Biosynthesis | Stem-CCT-ABC | 25 | 16 | 1 | 1.649 | 0.825
Synthesis/Degradation of Ketone Bodies | Stem-CCT-ABC | 8 | 8 | 1 | 0.824 | 0.581
ALL | Stem-CCT-ABC | 353 | 235 | 29 | 24.215 | 0.176
Alpha-linoleic Acid Metabolism | Stem-trans-A | 26 | 15 | 0 | 0.006 | 1
Arachidonic Acid Metabolism | Stem-trans-A | 10 | 7 | 0 | 0.003 | 1
Biosynthesis of Unsaturated Fatty Acids | Stem-trans-A | 33 | 17 | 0 | 0.006 | 1
Cutin, Suberine, and Wax Biosynthesis | Stem-trans-A | 10 | 6 | 0 | 0.002 | 1
Ether Lipid Metabolism | Stem-trans-A | 11 | 7 | 0 | 0.003 | 1
Fatty Acid Biosynthesis | Stem-trans-A | 32 | 19 | 0 | 0.007 | 1
Fatty Acid Degradation | Stem-trans-A | 34 | 30 | 0 | 0.011 | 1
Fatty Acid Elongation | Stem-trans-A | 16 | 8 | 0 | 0.003 | 1
Glycerolipid Metabolism | Stem-trans-A | 46 | 32 | 0 | 0.012 | 1
Glycerophospholipid Metabolism | Stem-trans-A | 64 | 47 | 0 | 0.018 | 1
Linoleic Acid Metabolism | Stem-trans-A | 12 | 6 | 0 | 0.002 | 1
Sphingolipid Metabolism | Stem-trans-A | 21 | 14 | 0 | 0.005 | 1


Starch and sucrose metabolism | Stem-trans-A | 98 | 61 | 0 | 0.023 | 1
Steroid Biosynthesis | Stem-trans-A | 25 | 16 | 0 | 0.006 | 1
Synthesis/Degradation of Ketone Bodies | Stem-trans-A | 8 | 8 | 0 | 0.003 | 1
ALL | Stem-trans-A | 353 | 235 | 0 | 0.088 | 1
Alpha-linoleic Acid Metabolism | Stem-trans-AB | 26 | 15 | 0 | 0.168 | 1
Arachidonic Acid Metabolism | Stem-trans-AB | 10 | 7 | 0 | 0.078 | 1
Biosynthesis of Unsaturated Fatty Acids | Stem-trans-AB | 33 | 17 | 0 | 0.19 | 1
Cutin, Suberine, and Wax Biosynthesis | Stem-trans-AB | 10 | 6 | 0 | 0.067 | 1
Ether Lipid Metabolism | Stem-trans-AB | 11 | 7 | 0 | 0.078 | 1
Fatty Acid Biosynthesis | Stem-trans-AB | 32 | 19 | 0 | 0.213 | 1
Fatty Acid Degradation | Stem-trans-AB | 34 | 30 | 0 | 0.336 | 1
Fatty Acid Elongation | Stem-trans-AB | 16 | 8 | 1 | 0.09 | 0.086
Glycerolipid Metabolism | Stem-trans-AB | 46 | 32 | 0 | 0.358 | 1
Glycerophospholipid Metabolism | Stem-trans-AB | 64 | 47 | 0 | 0.526 | 1


Linoleic Acid Metabolism | Stem-trans-AB | 12 | 6 | 0 | 0.067 | 1
Sphingolipid Metabolism | Stem-trans-AB | 21 | 14 | 0 | 0.157 | 1
Starch and sucrose metabolism | Stem-trans-AB | 98 | 61 | 1 | 0.683 | 0.498
Steroid Biosynthesis | Stem-trans-AB | 25 | 16 | 0 | 0.179 | 1
Synthesis/Degradation of Ketone Bodies | Stem-trans-AB | 8 | 8 | 0 | 0.09 | 1
ALL | Stem-trans-AB | 353 | 235 | 2 | 2.632 | 0.743
Alpha-linoleic Acid Metabolism | Stem-trans-ABC | 26 | 15 | 0 | 0.601 | 1
Arachidonic Acid Metabolism | Stem-trans-ABC | 10 | 7 | 0 | 0.28 | 1
Biosynthesis of Unsaturated Fatty Acids | Stem-trans-ABC | 33 | 17 | 0 | 0.681 | 1
Cutin, Suberine, and Wax Biosynthesis | Stem-trans-ABC | 10 | 6 | 0 | 0.24 | 1
Ether Lipid Metabolism | Stem-trans-ABC | 11 | 7 | 0 | 0.28 | 1
Fatty Acid Biosynthesis | Stem-trans-ABC | 32 | 19 | 0 | 0.761 | 1
Fatty Acid Degradation | Stem-trans-ABC | 34 | 30 | 1 | 1.202 | 0.707


Fatty Acid Elongation | Stem-trans-ABC | 16 | 8 | 1 | 0.32 | 0.279
Glycerolipid Metabolism | Stem-trans-ABC | 46 | 32 | 1 | 1.282 | 0.73
Glycerophospholipid Metabolism | Stem-trans-ABC | 64 | 47 | 1 | 1.883 | 0.854
Linoleic Acid Metabolism | Stem-trans-ABC | 12 | 6 | 0 | 0.24 | 1
Sphingolipid Metabolism | Stem-trans-ABC | 21 | 14 | 0 | 0.561 | 1
Starch and sucrose metabolism | Stem-trans-ABC | 98 | 61 | 4 | 2.444 | 0.228
Steroid Biosynthesis | Stem-trans-ABC | 25 | 16 | 1 | 0.641 | 0.48
Synthesis/Degradation of Ketone Bodies | Stem-trans-ABC | 8 | 8 | 1 | 0.32 | 0.279
ALL | Stem-trans-ABC | 353 | 235 | 8 | 9.414 | 0.73

Table C.17: 1 Tissue, candidate, and level of list.
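
The Overlap (exp) and FET p-value columns of Table C.17 can be illustrated with a minimal sketch. It assumes the expected overlap is the number of assayed pathway genes multiplied by the overall proportion of candidate genes, and that the p-value is the upper tail of a hypergeometric distribution, which appears consistent with the tabulated values. The total number of assayed genes and the candidate list size are not given in this part of the table, so `N_ASSAYED_TOTAL` and `N_CANDIDATES` below are hypothetical, as are the function names.

```python
from math import comb

def expected_overlap(assayed_in_pathway, n_candidates, n_assayed_total):
    """Expected count of candidate genes among a pathway's assayed genes."""
    return assayed_in_pathway * n_candidates / n_assayed_total

def upper_tail_p(observed, assayed_in_pathway, n_candidates, n_assayed_total):
    """P(X >= observed) for a hypergeometric X (one-sided Fisher's exact test)."""
    max_overlap = min(assayed_in_pathway, n_candidates)
    favourable = 0
    for x in range(observed, max_overlap + 1):
        favourable += (comb(assayed_in_pathway, x)
                       * comb(n_assayed_total - assayed_in_pathway, n_candidates - x))
    return favourable / comb(n_assayed_total, n_candidates)

# Hypothetical totals, chosen only to illustrate the calculation.
N_ASSAYED_TOTAL = 20000
N_CANDIDATES = 40

exp_count = expected_overlap(61, N_CANDIDATES, N_ASSAYED_TOTAL)
p_one_hit = upper_tail_p(1, 61, N_CANDIDATES, N_ASSAYED_TOTAL)
p_no_hit = upper_tail_p(0, 61, N_CANDIDATES, N_ASSAYED_TOTAL)
print(exp_count, p_one_hit, p_no_hit)
```

With zero observed overlap the upper-tail probability is 1 by construction, which matches the many rows of the table that report a p-value of 1.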


Table C.18: Significantly enriched and depleted GO terms from CCT and trans only gene lists including tissue, group, accession, description, counts, rate of occurrence, and FDR corrected p-values.

Group^1 | GO Description | Cand. genes in acc. | Genes in acc. | Prop. cand. genes | Prop. assayed genes | FDR
Leaf-CCT-ABC | chloroplast | 135 | 937 | 0.144 | 0.071 | 0.002
Leaf-CCT-ABC | plastid | 146 | 1062 | 0.137 | 0.081 | 0.007
Leaf-CCT-ABC | thylakoid | 35 | 171 | 0.205 | 0.013 | 0.012
Leaf-CCT-ABC | chloroplast thylakoid membrane | 26 | 115 | 0.226 | 0.009 | 0.017
Leaf-CCT-ABC | DNA binding^2 | 43 | 771 | 0.056 | 0.059 | 0.016
Ear-trans-A | chlorophyll biosynthetic process | 3 | 13 | 0.231 | 0.001 | 0.027
Ear-trans-AB | nucleic acid binding transcription factor activity | 26 | 228 | 0.114 | 0.017 | 0
Ear-trans-AB | sequence-specific DNA binding transcription factor activity | 26 | 228 | 0.114 | 0.017 | 0
Ear-trans-AB | regulation of transcription, DNA-dependent | 40 | 474 | 0.084 | 0.036 | 0
Ear-trans-AB | sequence-specific DNA binding | 20 | 160 | 0.125 | 0.012 | 0
Ear-trans-AB | chlorophyll biosynthetic process | 6 | 13 | 0.462 | 0.001 | 0.001
Ear-trans-AB | DNA binding | 49 | 798 | 0.061 | 0.06 | 0.013
Ear-trans-AB | biological process | 273 | 6578 | 0.042 | 0.499 | 0.022
Ear-trans-ABC | nucleic acid binding transcription factor activity | 45 | 228 | 0.197 | 0.017 | 0
Ear-trans-ABC | sequence-specific DNA binding transcription factor activity | 45 | 228 | 0.197 | 0.017 | 0
Ear-trans-ABC | regulation of transcription, DNA-dependent | 69 | 474 | 0.146 | 0.036 | 0.003
Ear-trans-ABC | chlorophyll biosynthetic process | 7 | 13 | 0.538 | 0.001 | 0.012
Ear-trans-ABC | sequence-specific DNA binding | 29 | 160 | 0.181 | 0.012 | 0.02
Leaf-trans-AB | ribosome | 30 | 294 | 0.102 | 0.022 | 0
Leaf-trans-AB | cell division | 11 | 77 | 0.143 | 0.006 | 0.03
Leaf-trans-AB | microtubule | 9 | 60 | 0.15 | 0.005 | 0.048
Leaf-trans-AB | structural constituent of ribosome | 20 | 224 | 0.089 | 0.017 | 0.048
Leaf-trans-ABC | cell division | 21 | 77 | 0.273 | 0.006 | 0.005

Table C.18: ^1 Tissue, candidate, and level of list. ^2 Under-represented GO term.
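
Table C.18 reports FDR corrected p-values, but this section does not say which correction procedure was used. As an illustration only, the sketch below implements the Benjamini-Hochberg adjustment, one widely used FDR procedure; treating it as the method behind the table is an assumption, and the input p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Made-up raw p-values, not taken from the dissertation.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```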


Appendix D

Characterization of domestication traits for selection candidate gene Zea agamous2


D.1 Foreword

This appendix details unpublished work characterizing a maize gene that was under selection, Zea agamous2 (zag2), a homolog of the Arabidopsis thaliana gene Agamous. I carried out the work, with other members of the Doebley Lab contributing to genotyping and phenotyping efforts.


D.2 Introduction

Multiple studies have sought to identify the signature of selection, both artificial and natural, in evolving species [20, 35, 111, 112]. In maize, recent studies have examined the signature of selection both on a gene by gene basis [113, 114] and in genome-wide scans [55]. Knowing that a gene was under selection during domestication says little about its phenotypic impact, because population genetic analyses inherently lack phenotype association. While protein domain annotation and gene ontology tools can give some indication of the phenotypic effect of selected genes, concrete association of a gene with a phenotypic effect using empirical data is still desired.

One gene identified as the target of artificial selection in a recent study [114] is a known homolog of Agamous from Arabidopsis thaliana. This Agamous homolog (Zea agamous2 or zag2) is located on the third chromosome at ∼137.2 megabases (AGPv2). The translated protein of zag2 is 258 amino acids long and, downstream of the highly conserved MADS-box domain, shares approximately 45% identity and 60% similarity with the Arabidopsis Agamous protein [115]. Expression of Agamous is associated with the carpel, or flowering section, of Arabidopsis thaliana, and in maize zag2 appears to be exclusively expressed in the carpels of developing ears [116]. The expression of zag2 mRNA in developing ears suggests a likely effect on domestication phenotypes in the female inflorescence.

Our study of zag2 involved two techniques. First, we generated a set of recombinant chromosome near isogenic lines (RCNILs) that had recombination breakpoints between zag2 and both the next upstream and downstream genes. RCNILs were genotyped using three markers (upstream, at the gene, and downstream) to locate the recombination breakpoints with respect to zag2. Lines were then planted and phenotyped in multiple environments for a large number of phenotypes that focused on ear traits but also included a number of other plant and tassel traits. Second, a transgenic RNAi construct carrying a portion of the zag2 gene was transformed into maize and backcrossed with two maize inbred lines. We assessed percent fill as a proxy for sterility, while also testing for presence of the construct using resistance to the BASTA herbicide. Neither of these experiments produced evidence of a concrete link between a domestication phenotype and zag2.

D.3 Methods

D.3.1 RCNILs

In the winter and summer of 2009, we screened 1,710 individuals drawn from a heterogeneous inbred family that was heterozygous at zag2. Markers used were umc1102 and PZD00100. From this screen, thirteen individuals with recombination breakpoints between the upstream and downstream genes were identified. Recombinant individuals were selfed and progeny were genotyped with the same markers in the winter 2010 season. Homozygous individuals were identified and selfed again to produce founding members of the RCNILs. RCNIL seed was then used in subsequent summers for seed increase and replicated field block trials.

Genomic DNA was also extracted from founding RCNIL individuals and used to genotype at the zag2 coding sequence. This was done with PZD00013.3 (a TaqMan SNP marker) and ZHL0285-ZHL0286 (an indel marker). We classified RCNILs by location of breakpoint (upstream or downstream of zag2) and genotype (maize or teosinte) at zag2. This resulted in four recombinant NIL classes and two control NIL classes.
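
The classification described above can be sketched as a small function. This is a minimal illustration, assuming each line is homozygous maize ('M') or teosinte ('T') at the three marker positions; the function name, the genotype codes, and the class labels are hypothetical and are not the names used in the dissertation.

```python
def classify_line(upstream, gene, downstream):
    """Classify a line from its homozygous genotypes at three marker positions.

    Each argument is 'M' (maize) or 'T' (teosinte). Returns a tuple of
    (breakpoint class, genotype at zag2).
    """
    for genotype in (upstream, gene, downstream):
        if genotype not in ("M", "T"):
            raise ValueError(f"unexpected genotype code: {genotype!r}")
    if upstream == gene == downstream:
        return ("control", gene)
    if upstream != gene and gene == downstream:
        return ("upstream breakpoint", gene)
    if upstream == gene and gene != downstream:
        return ("downstream breakpoint", gene)
    # Both flanks differ from the gene: not one of the classes in the text.
    return ("double recombinant", gene)

# Enumerate the six classes described in the text.
for markers in ["MMM", "TTT", "TMM", "MTT", "MMT", "TTM"]:
    print(markers, classify_line(*markers))
```

Two breakpoint positions crossed with two genotypes at zag2 give the four recombinant classes, and the two non-recombinant genotypes give the two control classes.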

Phenotyping blocks consisted of RCNILs and several control NILs that were homozygous maize or teosinte for the entire zag2 region. Lines were planted in randomized twelve-plant plots in four blocks each in the summers of 2010 and 2011 at the West Madison Agricultural Research Station (WMARS). Thirteen plant architecture traits and seven ear traits were measured (Table D.1) for up to five plants per phenotyping block.

Phenotype measurements were fit to a basic linear mixed model (Equation D.1) in R [91] using the lme4 package. This basic model included only an explanatory variable for the RCNIL line ($a_i$) and the block as a random effect ($b_j$). Positional terms were omitted because the overall size of blocks was small and positional variation due to X and Y position seemed unlikely to be significant.

$$y_{ijk} = \mu + a_i + b_j + e_{ijk} \tag{D.1}$$

After this model was fit, fixed effects estimates and standard errors were extracted, and we looked for association of the phenotypes (represented by the fixed effects estimates) with NIL class.
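
For reference, Equation D.1 can be written out together with the distributional assumptions that a model of this form usually carries. The variance symbols and the reading of the index k as the individual plant are assumptions added here and are not stated in the original text.

```latex
\begin{align}
y_{ijk} &= \mu + a_i + b_j + e_{ijk}, \\
b_j &\sim \mathcal{N}(0, \sigma_b^2), \qquad
e_{ijk} \sim \mathcal{N}(0, \sigma_e^2),
\end{align}
```

Here $y_{ijk}$ is the phenotype of plant $k$ of line $i$ in block $j$, $\mu$ is the overall mean, $a_i$ is the fixed effect of line $i$, $b_j$ is the random effect of block $j$, and $e_{ijk}$ is the residual. In lme4 a model of this shape is typically specified with a formula such as phenotype ~ line + (1 | block), where the column names are hypothetical.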

D.3.2 Transgenic RNAi lines

A zag2 RNA interference (RNAi) construct was developed and introduced into maize. Thirteen insertion events of the RNAi construct were recovered and crossed to the maize inbreds B73 and A682. The resulting progeny were planted in the summer of 2009 and ears were harvested for observation of phenotypes. We scored the percent fill of ears to assess sterility of individuals with and without the RNAi construct insertion events. The construct carried a BASTA herbicide resistance gene, which allowed presence or absence of the construct to be scored by BASTA herbicide treatment.

In total, 275 individuals, both BASTA resistant and susceptible (construct present or absent), were harvested and scored for the sterility phenotype. Percent fill was estimated in a randomized, blind manner to avoid bias caused by knowledge of an individual's construct genotype. Phenotypes were analyzed using simple t-test comparisons in R [91].
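
The comparison of resistant and susceptible plants can be sketched with a two-sample t statistic. R's `t.test` uses the Welch (unequal variance) form by default, but the text does not say which variant was run, so the Welch form below is an assumption. The percent fill scores are made up for illustration, and the p-value lookup is omitted because it needs a t-distribution routine outside the standard library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Squared standard error contributed by each sample.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, df

# Hypothetical percent fill scores for one event and background.
resistant = [20, 25, 30, 15]      # construct present
susceptible = [95, 100, 90, 95]   # construct absent

t_stat, df = welch_t(resistant, susceptible)
print(t_stat, df)
```

A large negative statistic here would correspond to lower percent fill, and so higher sterility, in plants carrying the construct.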


Table D.1: Trait abbreviations and descriptions from the zag2 experiment.

Trait abbreviation | Description
CULM | Culm diameter
BARE | Barren nodes
BRNO | Number of nodes with silks
LWID | Leaf width
LCS | Length of central spike
TBN | Tassel branch number
EAHT | Ear height
PLHT | Plant height
TILL | Tillering index
BRLH | Branch length including ear
NODE | Nodes on lateral branch
LBIL | Lateral branch internode length
PROL | Prolificacy
FILL | Percent fill
EARL | Ear length
EARD | Ear diameter
KRN | Kernel row number
CUPR | Cupules per rank
STAM | Percent staminate spikelets
KW | Single kernel weight


D.4 Results

D.4.1 RCNILs

The fixed effects estimates and standard errors were sorted from least to greatest, plotted as barplots, and inspected for association with RCNIL type, in terms of genotype upstream, at, and downstream of zag2. While a few single RCNILs differed from others, RCNIL types did not cluster into clearly differentiated phenotype groupings for any of the thirteen plant and seven ear phenotypes. Generally, the phenotype estimates for the maize and teosinte control NILs also did not cleanly separate from each other. An example of RCNIL estimates sorted from least to greatest is shown for single kernel weight (Figure D.1). While the maize and teosinte control NILs are not intermingled, the four RCNIL types do not cluster by genotype. Additionally, we see RCNILs with lower phenotype estimates than the maize control NILs, suggesting that if zag2 influences kernel weight it does so in an unexpected underdominant manner.
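
The inspection described above was visual. As a rough programmatic analogue, the sketch below sorts lines by estimate and checks whether each class occupies one contiguous run in the sorted order. The contiguity check is a stand-in introduced here, not the author's procedure, and the estimates and line names are made up; only the class labels t1 to t4 follow the Figure D.1 caption.

```python
def sort_by_estimate(records):
    """Return records ordered from the lowest to the highest estimate."""
    return sorted(records, key=lambda record: record["estimate"])

def classes_form_contiguous_runs(ordered_classes):
    """True if every class appears as a single unbroken run in the order."""
    seen = set()
    previous = None
    for label in ordered_classes:
        if label != previous:
            if label in seen:
                return False  # the class reappears after another class
            seen.add(label)
            previous = label
    return True

# Made-up estimates for illustration only.
records = [
    {"line": "L1", "nil_class": "t1", "estimate": 0.21},
    {"line": "L2", "nil_class": "t2", "estimate": 0.19},
    {"line": "L3", "nil_class": "t1", "estimate": 0.17},
    {"line": "L4", "nil_class": "maize", "estimate": 0.24},
    {"line": "L5", "nil_class": "teosinte", "estimate": 0.22},
]

ordered = sort_by_estimate(records)
ordered_classes = [record["nil_class"] for record in ordered]
clustered = classes_form_contiguous_runs(ordered_classes)
print(ordered_classes, clustered)
```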

D.4.2 Transgenic RNAi lines

Generally, high percent fill was seen in transgenic plants. The two maize backgrounds (B73 and A682) were not significantly different from each other in percent fill (t-test, p = 0.525). Data were collected in both the A682 and B73 maize inbred backgrounds for only three RNAi transformation events. Of these three events, only one gave a consistent result (event 39 had no effect in either background, Table D.2), suggesting the effect of an event is dependent on genetic background. Of the fifteen combinations of transformation event and maize background, only four had significantly different percent fill between resistant plants (construct positive) and susceptible plants (construct negative). Three of the significant results were large shifts of more than 60 percentage points in percent fill, while the fourth was a more moderate shift of 11 percentage points.
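
The counts quoted in this section can be re-derived from the values in Table D.2, as in the sketch below. The data rows are transcribed from the table; the 0.05 significance threshold is an assumption, since the text does not state the cutoff used, and the variable names are hypothetical.

```python
# (background, event, percent fill resistant, percent fill susceptible, p-value)
ROWS = [
    ("A682", 17, 97.8, 98.9, 6.63e-01), ("B73", 23, 90.0, 91.0, 6.49e-01),
    ("B73", 24, 100.0, 97.8, 3.47e-01), ("B73", 33, 23.0, 97.5, 2.94e-04),
    ("A682", 35, 20.0, 94.0, 6.01e-04), ("B73", 35, 98.0, 98.8, 6.87e-01),
    ("A682", 39, 98.0, 90.0, 1.15e-01), ("B73", 39, 93.3, 95.0, 4.89e-01),
    ("B73", 43, 32.2, 95.0, 1.17e-07), ("A682", 45, 100.0, 92.9, 2.53e-01),
    ("A682", 46, 95.6, 94.4, 8.29e-01), ("A682", 47, 84.4, 83.0, 8.56e-01),
    ("B73", 47, 100.0, 88.9, 2.75e-03), ("B73", 49, 94.4, 86.7, 1.68e-01),
    ("B73", 50, 91.1, 90.0, 3.47e-01),
]
ALPHA = 0.05  # assumed threshold, not stated in the source

# Significant combinations with the absolute shift in percentage points.
significant = [
    (background, event, round(abs(resistant - susceptible), 1))
    for background, event, resistant, susceptible, p in ROWS
    if p < ALPHA
]

# Significance call per event and background.
calls = {}
for background, event, _, _, p in ROWS:
    calls.setdefault(event, {})[background] = p < ALPHA

# Events scored in both backgrounds, and those with the same call in each.
shared = sorted(event for event, by_bg in calls.items() if len(by_bg) == 2)
consistent = [event for event in shared if len(set(calls[event].values())) == 1]

print(len(ROWS), significant, shared, consistent)
```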


Figure D.1: Single kernel weight estimates for zag2 RCNILs. RCNIL class is indicated in the bar with error bars indicating the standard error. Maize and teosinte NILs are not intermingled; however, there is also no clear separation of the RCNIL types (t1, t2, t3, t4) when lines are sorted by estimated phenotype. Furthermore, RCNILs have a lower phenotype than either of the control NILs, suggesting some sort of underdominance may be at work.


Table D.2: Zag2 transgenic RNAi insertion event, background, phenotype, and t-test p-value.

Maize Background | Event | Percent Fill (Resistant) | Percent Fill (Susceptible) | p-value
A682 | 17 | 97.8% | 98.9% | 6.63e-01
B73 | 23 | 90.0% | 91.0% | 6.49e-01
B73 | 24 | 100.0% | 97.8% | 3.47e-01
B73 | 33 | 23.0% | 97.5% | 2.94e-04
A682 | 35 | 20.0% | 94.0% | 6.01e-04
B73 | 35 | 98.0% | 98.8% | 6.87e-01
A682 | 39 | 98.0% | 90.0% | 1.15e-01
B73 | 39 | 93.3% | 95.0% | 4.89e-01
B73 | 43 | 32.2% | 95.0% | 1.17e-07
A682 | 45 | 100.0% | 92.9% | 2.53e-01
A682 | 46 | 95.6% | 94.4% | 8.29e-01
A682 | 47 | 84.4% | 83.0% | 8.56e-01
B73 | 47 | 100.0% | 88.9% | 2.75e-03
B73 | 49 | 94.4% | 86.7% | 1.68e-01
B73 | 50 | 91.1% | 90.0% | 3.47e-01


D.5 Discussion

The phenotypes measured in the RCNILs do not reveal a clear phenotypic effect of the zag2 gene. Estimates for the maize and teosinte control lines were never significantly different from each other. Furthermore, the remaining four genotype classes, distinguished by genotype upstream, at, and downstream of zag2, failed to cluster into distinct groups based on phenotype. Overall, there is very little if any evidence that zag2 affects any of the 20 measured phenotypes.

Reducing the expression of zag2 via transgenic RNAi constructs likewise failed to present compelling evidence for a phenotypic effect on percent fill of the ear. Overall, relatively few zag2 RNAi transformation events resulted in increased sterility (measured by percent fill). The effect of any given event on sterility seems to be highly dependent on genetic background, since only one of the three events assessed in both maize backgrounds gave the same result in each. Most significant results consisted of a drastic increase in sterility, suggesting a major genetic dysfunction. We conclude that the zag2 RNAi constructs produced largely non-significant results, punctuated by several cases of severe genetic dysfunction. Furthermore, the inconsistent effects of specific transformation events in different maize backgrounds seem unlikely to be related to zag2.

We failed to identify a phenotypic effect for zag2 in spite of evidence from the literature that zag2 is expressed in the ear [116] and codes for a homolog of a known floral development gene in Arabidopsis [115]. It may be that zag2 controls a phenotype that was under selection during maize domestication but that we did not measure. Work by Schmidt et al. [116] shows that zag2 and another Agamous homolog (zag1) are expressed in endosperm post-pollination, suggesting a potential role in kernel quality and composition. While we did measure kernel weight, there are many factors contributing to kernel quality and desirability that we did not assess, including the ratio of hard to soft endosperm and protein, oil, and starch content.


A potential complicating factor in our analysis of zag2 is the existence of three additional Agamous homologs in maize [115]. These homologs also share a high degree of identity with the Arabidopsis Agamous and, consequently, a high degree of identity and similarity with each other. Zea mays Mads1 (zmm1) has particularly high protein identity with zag2, at over 95%. The high degree of identity between the maize Agamous homologs is concerning in conjunction with their expression in the same tissues [116], as it suggests functional conservation as well as sequence conservation. For example, if the zmm1 gene can substitute functionally for the zag2 gene in the developing ear, then an experiment looking for an ear phenotypic response (such as the RCNIL experiment) would need to account for the genotype at both zag2 and zmm1.

The failure to associate a domestication phenotype with zag2 demonstrates the difficulty in using a population genetics approach to identify interesting candidate genes. From the perspective of population genetics, zag2 appears to have been under selection during the maize domestication event and has homology with a known floral development gene in Arabidopsis. A phenotypic effect on a domestication ear phenotype seems quite likely; however, we did not see any noticeable effects in the female inflorescence in these experiments. Similar difficulties in associating phenotype with selection candidate genes have been encountered for two other genes in our lab. The Prolamin-box Binding Factor1 gene was extensively phenotyped for plant architecture and ear traits (unpublished data) before a slight difference in kernel size and density was finally identified [14]. Additionally, the Zea agamous-like1 gene appears to have a significant effect on days to anthesis, or flowering time, in maize (unpublished data); however, flowering time is not a standard domestication trait. This study sheds light on the difficulty of associating phenotype with a selection candidate gene and provides a word of caution for future studies seeking to accomplish this feat.


References

[1] Gaines T, Zhang W, Wang D, Bukun B, Chisholm ST, et al. (2010) Gene amplification confers glyphosate resistance in Amaranthus palmeri. Proceedings of the National Academy of Sciences 107: 1029–34.

[2] Gompel N, Prud’homme B, Wittkopp PJ, Kassner V, Carroll SB (2005) Chance caught on the wing: cis-regulatory evolution and the origin of pigment patterns in Drosophila. Nature 433: 481–7.

[3] Studer A, Zhao Q, Ross-Ibarra J, Doebley J (2011) Identification of a functional transposon insertion in the maize domestication gene tb1. Nature Genetics 43: 1160–3.

[4] Wills DM, Whipple CJ, Takuno S, Kursel LE, Shannon LM, et al. (2013) From Many, One: Genetic Control of Prolificacy during Maize Domestication. PLoS Genetics 9: e1003604.

[5] Wang H, Nussbaum-Wagler T, Li B, Zhao Q, Vigouroux Y, et al. (2005) The origin of the naked grains of maize. Nature 436: 714–9.

[6] Sun L, Li X, Fu Y, Zhu Z, Tan L, et al. (2013) GS6, a member of the GRAS gene family, negatively regulates grain size in rice. Journal of Integrative Plant Biology: 1–37.

[7] Olsen KM, Wendel JF (2013) A bountiful harvest: genomic insights into crop domestication phenotypes. Annual Review of Plant Biology 64: 47–70.

[8] Doebley J (2004) The genetics of maize evolution. Annual Review of Genetics 38: 37–59.

[9] Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, et al. (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–5.

[10] Allaby RG, Fuller DQ, Brown TA (2008) The genetic expectations of a protracted model for the origins of domesticated crops. Proceedings of the National Academy of Sciences 105: 13982–6.

[11] Pickersgill B (2007) Domestication of plants in the Americas: insights from Mendelian and molecular genetics. Annals of Botany 100: 925–40.


[12] Carroll SB (2008) Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell 134: 25–36.

[13] Wittkopp PJ, Kalay G (2012) Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nature Reviews Genetics 13: 59–69.

[14] Lang Z, Wills D, Lemmon Z, Shannon L, Bukowski R, et al. (2014) Defining the role of prolamin-box binding factor1 gene during maize domestication. The Journal of Heredity: In Press.

[15] Konishi S, Izawa T, Lin SY, Ebana K, Fukuta Y, et al. (2006) An SNP caused loss of seed shattering during rice domestication. Science 312: 1392–6.

[16] Frary A, Nesbitt TC, Grandillo S, Knaap E, Cong B, et al. (2000) fw2.2: a quantitative trait locus key to the evolution of tomato fruit size. Science 289: 85–8.

[17] Rapp RA, Haigler CH, Flagel L, Hovav RH, Udall JA, et al. (2010) Gene expression in developing fibres of Upland cotton (Gossypium hirsutum L.) was massively altered by domestication. BMC Biology 8: 139.

[18] Swanson-Wagner R, Briskine R, Schaefer R, Hufford MB, Ross-Ibarra J, et al. (2012) Reshaping of the maize transcriptome by domestication. Proceedings of the National Academy of Sciences 109: 11878–83.

[19] Koenig D, Jimenez-Gomez JM, Kimura S, Fulop D, Chitwood DH, et al. (2013) Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. Proceedings of the National Academy of Sciences 110: E2655–62.

[20] Emerson JJ, Hsieh LC, Sung HM, Wang TY, Huang CJ, et al. (2010) Natural selection on cis and trans regulation in yeasts. Genome Research 20: 826–36.

[21] McManus CJ, Coolon JD, Duff MO, Eipper-Mains J, Graveley BR, et al. (2010) Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Research 20: 816–25.

[22] White MA, Stubbings M, Dumont BL, Payseur BA (2012) Genetics and evolution of hybrid male sterility in house mice. Genetics 191: 917–34.

[23] Alem S, Streiff R, Courtois B, Zenboudji S, Limousin D, et al. (2013) Genetic architecture of sensory exploitation: QTL mapping of female and male receiver traits in an acoustic moth. Journal of Evolutionary Biology 26: 2581–96.

[24] Miller CT, Glazer AM, Summers BR, Blackman BK, Norman AR, et al. (2014) Modular Skeletal Evolution in Sticklebacks Is Controlled by Additive and Clustered Quantitative Trait Loci. Genetics: In Press.

[25] Shannon LM (2012) The Genetic Architecture of Maize Domestication and Range Expansion. Ph.D. dissertation, University of Wisconsin - Madison.


[26] Wills DM, Burke JM (2007) Quantitative trait locus analysis of the early domestication of sunflower. Genetics 176: 2589–99.

[27] Paterson AH, Damon S, Hewitt JD, Zamir D, Rabinowitch HD, et al. (1991) Mendelian factors underlying quantitative traits in tomato: comparison across species, generations, and environments. Genetics 127: 181–97.

[28] Xiong LZ, Liu KD, Dai XK, Xu CG, Zhang Q (1999) Identification of genetic factors controlling domestication-related traits of rice using an F2 population of a cross between Oryza sativa and O. rufipogon. Theoretical and Applied Genetics 98: 243–251.

[29] Peng J, Ronin Y, Fahima T, Roder MS, Li Y, et al. (2003) Domestication quantitative trait loci in Triticum dicoccoides, the progenitor of wheat. Proceedings of the National Academy of Sciences 100: 2489–94.

[30] Cai W, Morishima H (2002) QTL clusters reflect character associations in wild and cultivated rice. Theoretical and Applied Genetics 104: 1217–1228.

[31] Gyenis L, Yun SJ, Smith KP, Steffenson BJ, Bossolini E, et al. (2007) Genetic architecture of quantitative trait loci associated with morphological and agronomic trait differences in a wild by cultivated barley cross. Genome 50: 714–23.

[32] Simons KJ, Fellers JP, Trick HN, Zhang Z, Tai YS, et al. (2006) Molecular characterization of the major wheat domestication gene Q. Genetics 172: 547–55.

[33] Li C, Zhou A, Sang T (2006) Rice domestication by reducing shattering. Science 311: 1936–9.

[34] Cong B, Barrero LS, Tanksley SD (2008) Regulatory change in YABBY-like transcription factor led to evolution of extreme fruit size during tomato domestication. Nature Genetics 40: 800–4.

[35] Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, et al. (2005) The effects of artificial selection on the maize genome. Science 308: 1310–4.

[36] Doebley JF, Gaut BS, Smith BD (2006) The molecular genetics of crop domestication. Cell 127: 1309–21.

[37] Briggs WH, McMullen MD, Doebley JF, Gaut BS (2007) Linkage mapping of domestication loci in a large maize teosinte backcross resource. Genetics 177: 1915–28.

[38] Doebley J, Stec A (1991) Genetic analysis of the morphological differences between maize and teosinte. Genetics 129: 285–95.

[39] Whipple CJ, Kebrom TH, Weber AL, Yang F, Hall D, et al. (2011) Grassy Tillers1 Promotes Apical Dominance in Maize and Responds To Shade Signals in the Grasses. Proceedings of the National Academy of Sciences 108: E506–12.

[40] Doebley J, Stec A, Hubbard L (1997) The evolution of apical dominance in maize. Nature 386: 485–8.

[41] Clark RM, Nussbaum-Wagler T, Quijada P, Doebley J (2006) A distant upstream enhancer at the maize domestication gene tb1 has pleiotropic effects on plant and inflorescent architecture. Nature Genetics 38: 594–7.

[42] Studer AJ, Doebley JF (2011) Do large effect QTL fractionate? A case study at the maize domestication QTL teosinte branched1. Genetics 188: 673–81.

[43] Doebley J, Stec A (1993) Inheritance of the morphological differences between maize and teosinte: comparison of results for two F2 populations. Genetics 134: 559–70.

[44] Littell R, Milliken G, Stroup W, Wolfinger R (1996) SAS system for mixed models. SAS Institute, Cary, NC., 2nd edition.

[45] Broman KW, Wu H, Sen S, Churchill G (2003) R/qtl: QTL mapping in experimental crosses. Bioinformatics 19: 889–890.

[46] Broman KW, Sen S (2009) A Guide to QTL Mapping with R/qtl. Statistics for Biology and Health. New York, NY: Springer New York. doi:10.1007/978-0-387-92125-9. URL http://www.springerlink.com/index/10.1007/978-0-387-92125-9.

[47] Kosambi DD (1944) The Estimation of Map Distances from Recombination Values. Annals of Eugenics 12: 172–175.

[48] Orr HA (1998) The Population Genetics of Adaptation: The Distribution of Factors Fixed during Adaptive Evolution. Evolution 52: 935.

[49] Beavis WD (1998) QTL Analyses: Power, Precision, and Accuracy. In: Paterson AH, editor, Molecular Dissection of Complex Traits, New York, NY: CRC Press, chapter 10. 1 edition, pp. 145–162.

[50] Hung HY, Shannon LM, Tian F, Bradbury PJ, Chen C, et al. (2012) ZmCCT and the genetic basis of day-length adaptation underlying the postdomestication spread of maize. Proceedings of the National Academy of Sciences 109: E1913–21.

[51] Yilmaz A, Nishiyama MY, Fuentes BG, Souza GM, Janies D, et al. (2009) GRASSIUS: a platform for comparative regulatory genomics across the grasses. Plant Physiology 149: 171–80.

[52] Yanofsky MF, Ma H, Bowman JL, Drews GN, Feldmann KA, et al. (1990) The protein encoded by the Arabidopsis homeotic gene agamous resembles transcription factors. Nature 346: 35–9.

[53] Schwarz-Sommer Z, Huijser P, Nacken W, Saedler H, Sommer H (1990) Genetic Control of Flower Development by Homeotic Genes in Antirrhinum majus. Science 250: 931–6.

[54] Smaczniak C, Immink RGH, Angenent GC, Kaufmann K (2012) Developmental and evolutionary diversity of plant MADS-domain factors: insights from recent studies. Development 139: 3081–98.

[55] Hufford MB, Xu X, van Heerwaarden J, Pyhajarvi T, Chia JM, et al. (2012) Comparative population genomics of maize domestication and improvement. Nature Genetics 44: 808–11.

[56] Sekhon RS, Lin H, Childs KL, Hansey CN, Robin Buell C, et al. (2011) Genome-wide atlas of transcription through maize development. The Plant Journal : 1–11.

[57] Xue W, Xing Y, Weng X, Zhao Y, Tang W, et al. (2008) Natural variation in Ghd7 is an important regulator of heading date and yield potential in rice. Nature Genetics 40: 761–7.

[58] Li Y, Fan C, Xing Y, Jiang Y, Luo L, et al. (2011) Natural variation in GS5 plays an important role in regulating grain size and yield in rice. Nature Genetics 43: 1266–9.

[59] Fan C, Xing Y, Mao H, Lu T, Han B, et al. (2006) GS3, a major QTL for grain length and weight and minor QTL for grain width and thickness in rice, encodes a putative transmembrane protein. Theoretical and Applied Genetics 112: 1164–71.

[60] Yu B, Lin Z, Li H, Li X, Li J, et al. (2007) TAC1, a major quantitative trait locus controlling tiller angle in rice. The Plant Journal 52: 891–8.

[61] Jin J, Huang W, Gao JP, Yang J, Shi M, et al. (2008) Genetic control of rice plant architecture under domestication. Nature Genetics 40: 1365–9.

[62] Yang Q, Li Z, Li W, Ku L, Wang C, et al. (2013) CACTA-like transposable element in ZmCCT attenuated photoperiod sensitivity and accelerated the postdomestication spread of maize. Proceedings of the National Academy of Sciences 110: 16969–74.

[63] Kermicle JL (2006) A selfish gene governing pollen-pistil compatibility confers reproductive isolation between maize relatives. Genetics 172: 499–506.

[64] Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, et al. (2011) A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6: e19379.

[65] Carroll SB (2005) Evolution at two levels: on genes and form. PLoS Biology 3: e245.

[66] Stern DL, Orgogozo V (2008) The loci of evolution: how predictable is genetic evolution? Evolution 62: 2155–77.

[67] Springer NM, Stupar RM (2007) Allele-specific expression patterns reveal biases and embryo-specific parent-of-origin effects in hybrid maize. The Plant Cell 19: 2391–402.

[68] Bell GDM, Kane NC, Rieseberg LH, Adams KL (2013) RNA-seq analysis of allele-specific expression, hybrid effects, and regulatory divergence in hybrids compared with their parents from natural populations. Genome Biology and Evolution 5: 1309–23.

[69] Song G, Guo Z, Liu Z, Cheng Q, Qu X, et al. (2013) Global RNA sequencing reveals that genotype-dependent allele-specific expression contributes to differential expression in rice F1 hybrids. BMC Plant Biology 13: 221.

[70] Zhang X, Borevitz JO (2009) Global analysis of allele-specific expression in Arabidopsis thaliana. Genetics 182: 943–54.

[71] Tirosh I, Reikhav S, Levy AA, Barkai N (2009) A yeast hybrid provides insight into the evolution of gene expression regulation. Science 324: 659–62.

[72] He F, Zhang X, Hu J, Turck F, Dong X, et al. (2012) Genome-wide Analysis of Cis-regulatory Divergence between Species in the Arabidopsis Genus. Molecular Biology and Evolution 29: 3385–3395.

[73] Schaefke B, Emerson JJ, Wang TY, Lu MYJ, Hsieh LC, et al. (2013) Inheritance of gene expression level and selective constraints on trans- and cis-regulatory changes in yeast. Molecular Biology and Evolution 30: 2121–33.

[74] Purugganan MD, Fuller DQ (2009) The nature of selection during plant domestication. Nature 457: 843–8.

[75] Zhong S, Joung JG, Zheng Y, Chen YR, Liu B, et al. (2011) High-throughput illumina strand-specific RNA sequencing library preparation. Cold Spring Harbor Protocols 2011: 940–9.

[76] Wang X, Soloway PD, Clark AG (2011) A survey for novel imprinted genes in the mouse placenta by mRNA-seq. Genetics 189: 109–22.

[77] Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–60.

[78] DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, et al. (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics 43: 491–8.

[79] McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, et al. (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20: 1297–303.

[80] Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10: R25.

[81] Storey JD (2002) A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64: 479–498.

[82] Lester RN (1989) Evolution under domestication involving disturbance of genic balance. Euphytica 44: 125–132.

[83] Gross BL, Olsen KM (2010) Genetic perspectives on crop domestication. Trends in Plant Science 15: 529–537.

[84] Burger JC, Chapman MA, Burke JM (2008) Molecular insights into the evolution of crop plants. American Journal of Botany 95: 113–122.

[85] Dean RB, Dixon WJ (1951) Simplified Statistics for Small Numbers of Observations. Analytical Chemistry 23: 636–638.

[86] Jin J, Zhang H, Kong L, Gao G, Luo J (2014) PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors. Nucleic Acids Research 42: D1182–7.

[87] Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M (2012) KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Research 40: D109–14.

[88] Kanehisa M (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research 28: 27–30.

[89] Chen H, Patterson N, Reich D (2010) Population differentiation as a test for selective sweeps. Genome Research 20: 393–402.

[90] Young MD, Wakefield MJ, Smyth GK, Oshlack A (2010) Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology 11: R14.

[91] R Development Core Team (2013). R: A language and environment for statistical computing. URL http://www.r-project.org/.

[92] Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B (Methodological) 57: 289–300.

[93] Eichten SR, Briskine R, Song J, Li Q, Swanson-Wagner R, et al. (2013) Epigenetic and genetic influences on DNA methylation variation in maize populations. The Plant Cell 25: 2783–97.

[94] Duncan IW (2002) Transvection effects in Drosophila. Annual Review of Genetics 36: 521–56.

[95] Hansey CN, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, et al. (2012) Maize (Zea mays L.) genome diversity as revealed by RNA-sequencing. PLoS One 7: e33071.

[96] Springer NM, Ying K, Fu Y, Ji T, Yeh CT, et al. (2009) Maize inbreds exhibit high levels of copy number variation (CNV) and presence/absence variation (PAV) in genome content. PLoS Genetics 5: e1000734.

[97] Chia JM, Song C, Bradbury PJ, Costich D, de Leon N, et al. (2012) Maize HapMap2 identifies extant variation from a genome in flux. Nature Genetics 44: 803–7.

[98] Tenaillon MI, U’Ren J, Tenaillon O, Gaut BS (2004) Selection versus demography: a multilocus investigation of the domestication process in maize. Molecular Biology and Evolution 21: 1214–25.

[99] Clark RM, Linton E, Messing J, Doebley JF (2004) Pattern of diversity in the genomic region near the maize domestication gene tb1. Proceedings of the National Academy of Sciences 101: 700–7.

[100] Hanning I, Baumgarten K, Schott K, Heldt H (1999) Oxaloacetate transport into plant mitochondria. Plant Physiology 119: 1025–32.

[101] Zoglowek C, Kromer S, Heldt HW (1988) Oxaloacetate and malate transport by plant mitochondria. Plant Physiology 87: 109–15.

[102] Hunt HV, Denyer K, Packman LC, Jones MK, Howe CJ (2010) Molecular basis of the waxy endosperm starch phenotype in broomcorn millet (Panicum miliaceum L.). Molecular Biology and Evolution 27: 1478–94.

[103] Fan L, Bao J, Wang Y, Yao J, Gui Y, et al. (2009) Post-domestication selection in the maize starch pathway. PLoS One 4: e7612.

[104] Park YJ, Nemoto K, Nishikawa T, Matsushima K, Minami M, et al. (2009) Waxy strains of three amaranth grains raised by different mutations in the coding region. Molecular Breeding 25: 623–635.

[105] Dussert Y, Remigereau MS, Fontaine MC, Snirc A, Lakis G, et al. (2013) Polymorphism pattern at a miniature inverted-repeat transposable element locus downstream of the domestication gene Teosinte-branched1 in wild and domesticated pearl millet. Molecular Ecology 22: 327–40.

[106] Sugimoto K, Takeuchi Y, Ebana K, Miyao A, Hirochika H, et al. (2010) Molecular cloning of Sdr4, a regulator involved in seed dormancy and domestication of rice. Proceedings of the National Academy of Sciences 107: 5792–7.

[107] Weller JL, Liew LC, Hecht VFG, Rajandran V, Laurie RE, et al. (2012) A conserved molecular basis for photoperiod adaptation in two temperate legumes. Proceedings of the National Academy of Sciences 109: 21158–63.

[108] Zhu BF, Si L, Wang Z, Zhou Y, Zhu J, et al. (2011) Genetic control of a transition from black to straw-white seed hull in rice domestication. Plant Physiology 155: 1301–11.

[109] Liu J, Van Eck J, Cong B, Tanksley SD (2002) A new class of regulatory genes underlying the cause of pear-shaped tomato fruit. Proceedings of the National Academy of Sciences 99: 13302–6.

[110] Gallavotti A, Zhao Q, Kyozuka J, Meeley RB, Ritter MK, et al. (2004) The role of barren stalk1 in the architecture of maize. Nature 432: 630–5.

[111] Carling MD, Brumfield RT (2009) Speciation in Passerina buntings: introgression patterns of sex-linked loci identify a candidate gene region for reproductive isolation. Molecular Ecology 18: 834–47.

[112] Pool JE, Corbett-Detig RB, Sugino RP, Stevens KA, Cardeno CM, et al. (2012) Population Genomics of sub-saharan Drosophila melanogaster: African diversity and non-African admixture. PLoS Genetics 8: e1003080.

[113] Zhao Q, Thuillet AC, Uhlmann NK, Weber AL, Rafalski JA, et al. (2008) The role of regulatory genes during maize domestication: evidence from nucleotide polymorphism and gene expression. Genetics 178: 2133–43.

[114] Zhao Q, Weber AL, McMullen MD, Guill K, Doebley J (2011) MADS-box genes of maize: frequent targets of selection during domestication. Genetics Research 93: 65–75.

[115] Theissen G, Strater T, Fischer A, Saedler H (1995) Structural characterization, chromosomal localization and phylogenetic evaluation of two pairs of AGAMOUS-like MADS-box genes from maize. Gene 156: 155–66.

[116] Schmidt RJ, Veit B, Mandel MA, Mena M, Hake S, et al. (1993) Identification and molecular characterization of ZAG1, the maize homolog of the Arabidopsis floral homeotic gene AGAMOUS. The Plant Cell 5: 729–37.