Trends in Genetics 2014 Jan


Editor Rhiannon Macrae

Portfolio Manager Milka Kostic

Journal Manager Basil Nyaku

Journal Administrators Ria Otten and Patrick Scheffmann

Advisory Editorial Board
K.V. Anderson, New York, USA
A. Clark, Ithaca, USA
G. Fink, Cambridge, USA
S. Gasser, Geneva, Switzerland
D. Goldstein, Durham, USA
L. Guarente, Cambridge, USA
Y. Hayashizaki, Yokohama, Japan
S. Henikoff, Seattle, USA
J. Hodgkin, Oxford, UK
H.R. Horvitz, Cambridge, USA
L. Hurst, Bath, UK
E. Koonin, Bethesda, USA
E. Meyerowitz, Pasadena, USA
S. Moreno, Salamanca, Spain
A. Nieto, Alicante, Spain
C. Scazzocchio, Orsay, France and London, UK
D. Tautz, Plön, Germany
O. Voinnet, Strasbourg, France
J. Wysocka, Stanford, California

Editorial Enquiries
Trends in Genetics
Cell Press
600 Technology Square, 5th floor
Cambridge MA 02139, USA
Tel: +1 617 397 2818
Fax: +1 617 397 2810
E-mail: [email protected]

Cover: The Ectodysplasin pathway controls the formation of ectodermal appendages such as teeth, hairs, and scales. Its role has been identified by combining the study of human patients with human genetics and mouse experimental approaches. Recently this pathway has been associated with specific adaptations in natural populations. One example of these adaptations is the extent of armor plating in sticklebacks: freshwater sticklebacks have a low plate phenotype with low activity of the pathway, whereas marine sticklebacks have a high armor plate phenotype with high activity of the pathway. On pages 24–31 of this issue, Sadier et al. review how this pathway fine-tunes the developmental network controlling the number, size, and density of ectodermal appendages and propose that its variation may underlie adaptive changes in ectodermal appendages in natural populations. The cover shows a freshwater (left) and marine (right) stickleback with distinct differences in plate number. Photographs by Jun Kitano.

January 2014 Volume 30, Number 1 pp. 1–40

Reviews

1 Ldb1 complexes: the new master regulators of erythroid gene transcription
Paul E. Love, Claude Warzecha, and LiQi Li

10 Roles of cilia, fluid flow, and Ca2+ signaling in breaking of left-right symmetry
Satoko Yoshiba and Hiroshi Hamada

18 Hemophilia B Leyden and once mysterious cis-regulatory mutations
Alister P.W. Funnell and Merlin Crossley

24 The ectodysplasin pathway: from diseases to adaptations
Alexa Sadier, Laurent Viriot, Sophie Pantalacci, and Vincent Laudet

32 Genetics of recessive cognitive disorders
Luciana Musante and H. Hilger Ropers


Ldb1 complexes: the new master regulators of erythroid gene transcription

Paul E. Love, Claude Warzecha, and LiQi Li

Eunice Kennedy Shriver National Institute of Child Health & Human Development, National Institutes of Health, Bethesda, MD 20892, USA

Elucidation of the genetic pathways that control red blood cell development has been a central goal of erythropoiesis research over the past decade. Notably, data from several recent studies have provided new insights into the regulation of erythroid gene transcription. Transcription profiling demonstrates that erythropoiesis is mainly controlled by a small group of lineage-restricted transcription factors [Gata binding protein 1 (Gata1), T cell acute lymphocytic leukemia 1 protein (Tal1), and Erythroid Kruppel-like factor (EKLF; henceforth referred to as Klf1)]. Binding-site mapping using ChIP-Seq indicates that most DNA-bound Gata1 and Tal1 proteins are contained within higher order complexes (Ldb1 complexes) that include the nuclear adapters Ldb1 and Lmo2. Ldb1 complexes regulate Klf1, and Ldb1 complex-binding sites frequently colocalize with Klf1 at erythroid genes and cis-regulatory elements, indicating strong functional synergy between Gata1, Tal1, and Klf1. Together with new data demonstrating that Ldb1 can mediate long-range promoter–enhancer interactions, these findings provide a foundation for the first comprehensive models of the global regulation of erythroid gene transcription.

Gata1, Tal1, and Klf1: the core erythroid transcription factors

Mammalian erythropoiesis is a dynamic, stepwise process that begins in multipotent hematopoietic progenitors (hematopoietic stem cells, HSCs) and ends with the generation of mature enucleated red blood cells. Erythrocyte development requires the coordinated expression and activity of several transcription factors that regulate terminal differentiation and induction of erythroid-specific genes. Among these, Gata1, Tal1 (Scl), and Klf1 (EKLF) have been shown to be exceptional, earning the designation erythroid ‘master regulators’ (Box 1). Gata1, Tal1, and Klf1 are each required for both primitive and definitive erythropoiesis. Indeed, absence of any one of these proteins in mice results in severe anemia and death by mid-gestation [1–6]. In addition, gene expression profiling has shown that Gata1, Tal1, and Klf1 are each required for β-globin (Hbb) and α-globin (Hba) gene expression, as well as for the induction of a large number of other erythroid signature genes, suggesting that Gata1, Tal1, and Klf1 function broadly and synergistically to regulate the erythroid transcriptional program [7–12].

0168-9525/$ – see front matter. Published by Elsevier Ltd. http://dx.doi.org/10.1016/j.tig.2013.10.001

Corresponding author: Love, P.E. ([email protected]).

Keywords: erythropoiesis; Gata1; Tal1; Klf1; Ldb1 complexes; transcriptional regulation; ChIP-Seq.

Trends in Genetics, January 2014, Vol. 30, No. 1

Widespread binding of Gata1, Tal1, and Klf1 at erythroid genes and enhancers

Results from recent experiments, in which chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) was used to map Gata1-, Tal1-, or Klf1-binding sites genome-wide in primary murine erythroblasts or in murine erythroid cell lines, provided important insights into the regulatory functions of these proteins in controlling the expression of erythroid genes [7–9,12–16]. Each of these studies included transcriptional profiling by microarray or RNA-sequencing so that ChIP-Seq binding profiles could be correlated with gene expression. Although these experiments were performed by independent groups and, in many cases, with different erythroid cell populations, several general conclusions can nevertheless be drawn from comparative analysis of the data. First, Gata1, Tal1, and Klf1 each bind at or near, and are required for the induction of, a large cohort of erythroid genes, including Hba and Hbb. Second, Gata1, Tal1, and Klf1 ChIP-Seq peaks strongly correlate with the presence of their respective consensus DNA-binding motifs (Box 1). Third, although Gata1, Tal1, and Klf1 frequently bind near transcription start sites of erythroid genes, most binding sites are either within introns of target genes or at intergenic regions often far removed from any known gene [notably, some of these distal sites are within well-characterized erythroid cis-regulatory elements, including the Hbb locus control region (LCR) and the Hba MARE regulatory domain]. Fourth, considerable Gata1-, Tal1-, and Klf1-binding site convergence can be inferred by the presence of consensus DNA-binding motifs for one or both of the other two factors at Gata1, Tal1, or Klf1 ChIP-Seq sites. For example, Tal1-binding E-box (CANNTG) motifs are frequently detected near Gata1 ChIP-Seq binding sites, especially those that are at or near genes that are induced by Gata1 [13,14]. Likewise, after E-boxes, GATA motifs were the most prevalent consensus sequences identified within Tal1 ChIP-Seq peaks [7].

Analysis of merged ChIP-Seq runs from independent studies confirmed frequent co-occupancy of binding sites by Gata1, Tal1, and Klf1. For example, comparison of Gata1 and Klf1 ChIP-Seq results revealed striking binding-site overlap, with approximately 48% of all Klf1 ChIP-Seq peaks located within 1 kb of Gata1-binding sites [8]. In another study, the authors noted strong positive correlation of DNA occupancy by Gata1 and Tal1 over a 66-Mb region of chromosome 7 that includes a large number of key erythroid genes [9]. Finally, merged analysis of independently generated Gata1, Tal1, and Klf1 ChIP-Seq runs identified >300 genes that were co-occupied by all three factors [16].
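The 48% figure quoted above is a proximity statistic: the share of peaks for one factor that lie within a fixed distance of any peak for another. The following Python sketch shows one way such a fraction could be computed. It assumes that each peak has been reduced to a single (chromosome, position) pair, which is a simplification introduced here; the function name fraction_near and the toy coordinates are hypothetical and are not taken from the cited studies.

```python
from bisect import bisect_left


def fraction_near(query_peaks, reference_peaks, max_dist=1000):
    """Fraction of query peaks within max_dist bp of any reference peak.

    Peaks are (chromosome, position) tuples; this single-point
    representation is an assumption made for illustration.
    """
    # Group reference positions by chromosome and sort them for binary search.
    ref_by_chrom = {}
    for chrom, pos in reference_peaks:
        ref_by_chrom.setdefault(chrom, []).append(pos)
    for positions in ref_by_chrom.values():
        positions.sort()

    near = 0
    for chrom, pos in query_peaks:
        positions = ref_by_chrom.get(chrom, [])
        i = bisect_left(positions, pos)
        # The nearest reference peak is either just left or just right of pos.
        neighbours = positions[max(i - 1, 0): i + 1]
        if any(abs(pos - p) <= max_dist for p in neighbours):
            near += 1
    return near / len(query_peaks) if query_peaks else 0.0


# Toy, made-up coordinates (not real ChIP-Seq data).
klf1_peaks = [("chr7", 1000), ("chr7", 5000), ("chr1", 200), ("chr2", 50)]
gata1_peaks = [("chr7", 1800), ("chr7", 9000), ("chr1", 1500)]
print(fraction_near(klf1_peaks, gata1_peaks, max_dist=1000))
```

Sorting the reference positions once and using a binary search keeps the comparison fast for genome-scale peak lists, although a dedicated interval library would be the more usual choice in practice.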

In general, cobinding by more than one of the three transcription factors is associated with gene induction (positive regulation) rather than repression. Notably, several groups found that most Gata1-induced genes were co-occupied by both Gata1 and Tal1 [9,12,17,18]. Similarly, cobinding by Gata1 and Klf1 was found to be strongly associated with gene activation [8,10]. Collectively, these observations suggested that Gata1, Tal1, and Klf1 function together to positively regulate erythroid gene expression and establish erythroid lineage identity.

Ldb1 complexes are major instruments of Gata1- and Tal1-regulated erythroid gene activation

An intriguing observation from several of the aforementioned ChIP-Seq studies was the frequent detection of a paired E-box–GATA DNA motif within DNA fragments bound by either Gata1 [13], Tal1 [7], or Klf1 [8]. This paired motif, which comprises a preferentially ordered and spaced partial or complete E-box and a consensus Gata1-binding sequence [(CANN)TG-N7-9-WGATAR], matches the consensus binding site for a multimeric erythroid protein complex (herein designated the Ldb1 complex) that contains Gata1 and Tal1 in addition to the nuclear adapter proteins LIM domain-binding protein 1 (Ldb1) and LIM domain only 2 (Lmo2) (Figure 1A) [19].
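The paired motif described above, (CANN)TG-N7-9-WGATAR, translates directly into a regular expression once the ambiguity codes are expanded (N is any base, W is A/T, R is A/G). The Python sketch below is illustrative only: the function name find_paired_motifs and the test sequence are hypothetical, only the strand supplied is scanned, and reading a "partial" E-box as the TG half-site alone is an assumption based on the parentheses in the motif notation.

```python
import re

# Ambiguity codes used by the motif: N = any base, W = A/T, R = A/G.
COMPLETE_EBOX = "CA[ACGT]{2}TG"   # CANNTG
PARTIAL_EBOX = "TG"               # assumed reading of "(CANN)TG" as optional CANN
SPACER = "[ACGT]{7,9}"            # N7-9
GATA_SITE = "[AT]GATA[AG]"        # WGATAR


def find_paired_motifs(seq, require_complete_ebox=True):
    """Return (start, matched_text) for paired E-box-GATA motifs in seq.

    Only the given strand is scanned; a fuller scan would also check
    the reverse complement.
    """
    ebox = COMPLETE_EBOX if require_complete_ebox else PARTIAL_EBOX
    # Wrapping the pattern in a lookahead reports overlapping candidates.
    pattern = re.compile(f"(?=({ebox}{SPACER}{GATA_SITE}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Made-up sequence: CAGCTG, an 8-bp spacer, then AGATAA.
example = "GG" + "CAGCTG" + "A" * 8 + "AGATAA" + "CC"
print(find_paired_motifs(example))
print(find_paired_motifs(example, require_complete_ebox=False))
```

With require_complete_ebox=False the scan accepts any TG followed by the spacer and the GATA site, which is far more permissive, so hits from that mode would normally need further filtering.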

That such higher order complexes may be important for erythroid gene expression is supported by data derived from Ldb1 and Lmo2 gene ablation studies in mice. Homozygous germline deletion of Ldb1 in mice results in a pleiotropic phenotype with arrested development and death at embryonic days (E) 9.5–10 [20]. Among the multiple abnormalities observed was a defect in the expansion of the yolk sac, including the absence of blood islands, indicating a role for Ldb1 in hematopoietic development. Further analysis demonstrated that Ldb1 is necessary for primitive erythropoiesis because Ldb1-null yolk sac cells are incapable of generating erythroid colonies in in vitro culture assays [21]. Using mice harboring conditional

Box 1. The core erythroid transcription factors

Gata1

Gata1 was the first erythroid ‘master regulator’ identified and remains the best-studied hematopoietic transcription factor [58]. Gata1 is the founding member of the Gata family of transcription factors, which each contain two zinc fingers and bind to a canonical GATA DNA motif. Gata1 can directly associate with several cofactors, such as the zinc finger protein Fog1 through its N-terminal zinc finger while making contact with DNA via the C-terminal zinc finger [59]. Gata1 also harbors an N-terminal transcriptional activation domain and has been shown to interact with multiple transcription factors, co-activators, and corepressors and is subject to a complex array of post-translational modifications [60,61]. Deletion of Gata1 in male mice (Gata1 is located on the X chromosome) results in death by E11.5 from severe anemia, demonstrating that Gata1 is essential for erythropoiesis [62]. Gata1 is expressed at low levels in common myeloid progenitors (CMP), but is highly upregulated in megakaryocyte-erythroid progenitors (MEP) and erythroblasts. Gata1 activates a set of early erythroid genes, such as the erythropoietin receptor at the onset of erythropoiesis, to sustain the proliferation and differentiation of immature erythroid cells. As these cells mature, Gata1 also induces later stage erythroid genes, such as those encoding the hemoglobin and heme biosynthesis proteins. Gata1 also has a critical role in the terminal maturation of other hematopoietic lineages, including megakaryocytes, eosinophils, and mast cells.

Tal1/Scl

Tal1 (also known as Scl) was discovered through its involvement in translocation events that give rise to T cell acute lymphoblastic leukemia. Tal1 belongs to the family of basic helix–loop–helix (bHLH) transcription factors characterized by a 50-residue HLH protein interaction domain, preceded by a ten-residue basic region that binds DNA [63]. Tal1 forms heterodimers with ubiquitous bHLH factors, such as the E2a-proteins E12 and E47 as a prerequisite to DNA binding at E-box (CANNTG) motifs. Follow-up loss-of-function studies demonstrated a critical role for Tal1 in early hematopoietic specification because Tal1 null mice die at E9.5 from lack of yolk-sac hematopoiesis [3,4]. Despite the requirement for Tal1 in hematopoietic stem cell generation, it is not specifically required for hematopoietic stem cell survival, multipotency, or long-term repopulating in mice due to functional redundancy with the related bHLH protein, Lyl1 [64]. However, there are severe defects in erythroid and megakaryocytic development in the absence of Tal1 [2,65]. Mice expressing a DNA-binding deficient form of Tal1 survive beyond the E9.5 time point when Tal1-deficient embryos die. These mutant mice displayed signs of anemia consistent with the idea that DNA binding is dispensable for HSC generation but is important for proper erythroid maturation [7].

Klf1/EKLF

Klf1 is the founding member of the KLF family of proteins that comprises 17 different transcription factors that function in diverse tissues and have many critical biological roles [66,67]. The KLF family is characterized by three similar C2H2-type zinc fingers at the C terminus that form a DNA-binding domain that binds the consensus sequence CC[A/C]C[A/G]CCC. Klf1 also contains an N-terminal transactivation domain comprising an acidic-patch and proline-rich region. Klf1 is remarkably erythroid lineage restricted, being marginally expressed in CMPs, upregulated in their MEP progeny, and reaching peak expression levels at the mature erythroblast stage [10,68]. Expression of Klf1 in megakaryocyte-erythroid progenitors restricts megakaryocyte development and promotes erythropoiesis [69,70]. Klf1 function is critical for the proper development of the erythroid lineage, as supported by knockout experiments in mice. Klf1-null mice die at E14–15 of gestation due to lack of proper definitive erythropoiesis [5,6]. Although the role of Klf1 in regulating β-globin gene expression has been the major focus of its characterization, recent studies have confirmed Klf1 as one of the primary players in establishing and maintaining global erythroid gene regulation. Combined approaches using expression profiling and ChIP-Seq in fetal liver erythroid cells have demonstrated that Klf1 targets hundreds of genes and acts primarily as a transcriptional activator [10,11].
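The KLF consensus sequence given in Box 1, CC[A/C]C[A/G]CCC, is not palindromic, so a scan for candidate sites has to consider both DNA strands. The Python sketch below illustrates this under stated assumptions: the function names klf1_sites and reverse_complement and the example sequence are hypothetical, and an exact match to the consensus is only a candidate site, not evidence of binding.

```python
import re

# Consensus from Box 1: CC[A/C]C[A/G]CCC, wrapped in a lookahead so that
# overlapping candidates are reported.
KLF_SITE = re.compile(r"(?=(CC[AC]C[AG]CCC))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase or lowercase A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def klf1_sites(seq):
    """Return sorted (start, strand, site) tuples in forward-strand coordinates."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in KLF_SITE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in KLF_SITE.finditer(rc):
        # Map the start on the reverse-complement string back to the forward strand.
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Made-up sequence containing one forward-strand match.
example = "AACCACACCCTT"
print(klf1_sites(example))
print(klf1_sites(reverse_complement(example)))
```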


deletion alleles of Ldb1 and expressing either an embryonic endothelial and hematopoietic lineage-specific Cre (Tie2-Cre) or an inducible Mx1-Cre, it was shown that Ldb1 is also required for both fetal and adult definitive erythropoiesis [21]. Deletion of Ldb1 resulted in a significant reduction of megakaryocyte-erythroid progenitors, megakaryocytes, and erythroblasts. Lmo2-null mice also displayed hematopoietic defects that phenocopy Ldb1 deletion in that both primitive and definitive erythropoiesis are almost absent [22,23].

To determine whether cobinding of Gata1 and Tal1 is mediated by Ldb1 complexes on a genome-wide scale, two groups performed ChIP-Seq studies to map Ldb1-, Gata1-, and Tal1-binding sites in erythroid cells [24,25]. Remarkably, both studies reported strong overlap of Gata1, Tal1, and Ldb1 peaks. Co-occupancy was especially high at sites within or near known erythroid genes; for example, 84% of Ldb1, 79% of Gata1, and 84% of Tal1-binding sites at erythroid ‘fingerprint’ genes were occupied by the other two factors [25]. Genes involved in all aspects of erythropoiesis, including transcriptional regulation (Klf1, E2f2, Zfpm1, and Sox6), heme and/or hemoglobin biosynthesis (Alad, Alas2, Cpox, Ppox, Fech, and Hbb), cytoskeletal organization (Add2, Ank1, Epb4.1, Epb4.2, and Tmod), and ion or solute transport (Slc4a1, Slc25a37, Aqp1, and Aqp9), as well as almost all known erythroid enhancer elements, contained Ldb1 complex-binding sites [24,25]. Ldb1 complex-bound genes were mainly induced during terminal erythroid differentiation, and Ldb1 knockdown confirmed that activation of these genes is Ldb1 dependent [25]. Furthermore, given that nearly all of the genes bound by Ldb1 complexes were previously shown to require Gata1

[Figure 1 graphic. Panel (A): E-box/GATA motif CANNTG-N7-9-WGATAR with Gata1 (N, C), Tal1, E2a, Lmo2 (Lim, Lim), and Ldb1 (LID). Panel (B): distal enhancer (GATA – E-box) and promoter (E-box – GATA) of a target gene, each bound by Gata1, Tal1, E2a, Lmo2, and Ldb1 (DD). TRENDS in Genetics.]

Figure 1. The structure and function of erythroid Ldb1 complexes. (A) Model of the core erythropoietic Ldb1 complex. The zinc finger DNA-binding protein Gata1 and a heterodimer of the basic helix-loop-helix (bHLH) proteins Tal1 and E2a bind to a paired E-box–WGATAR motif (W is A/T; R is A/G) with a restricted spacing of 7–9 bp. The dual LIM (Lin11, Isl-1 & Mec-3) domain protein Lmo2 bridges and associates with both Gata1 and the bHLH factors, whereas the LIM-interacting protein Ldb1 associates with Lmo2. Gata1 and Tal1 are described in more detail in Box 1. Ldb1 is a transcription cofactor widely expressed throughout embryonic and adult tissues. Ldb1 has no known enzymatic or nucleic acid-binding function, but rather it seems to act as an interface for specific protein interactions [29]. It achieves this through its two predominant functional domains: an N-terminal self-association (dimerization) domain and a C-terminal region that interacts with the LIM domain that is common to a large family of proteins that have important roles in tissue development. Lmo2 is a small protein comprising two LIM domains and is expressed in a variety of tissues, including hematopoietic precursors, as well as many but not all hematopoietic lineages. In vitro binding studies have shown that association of Lmo2 with Tal1 increases the affinity of Tal1 to bind E2a and in turn, more stably bind to the E-box sequence [71]. Lmo2 also binds to the N-terminal zinc finger of Gata1, thereby forming a bridge between Tal1 and Gata1 [45]. Furthermore, it has been demonstrated that Gata1 can recruit additional proteins, such as Fog1 (Zfpm1), to the complex, through its N-terminal zinc finger [45]. A major function of the Ldb1 complex is to provide a stable scaffold through which the hematopoietic transcription factors Tal1 and Gata1 act together to regulate erythroid gene transcription. (B) An illustration of Ldb1-mediated juxtaposition of two Ldb1 complexes bound to DNA at sites far apart from each other. Ldb1 can dimerize through its self-association domain, facilitating DNA looping and juxtaposition of two Ldb1 complexes. This property of Ldb1 suggests a model whereby enhancers can communicate with distal promoters (in cis or possibly in trans) via Ldb1 complex-mediated association. The Ldb1 self-association domain can also form trimeric structures as well as dimers, and these types of higher order structure are likely relevant in instances where multiple Ldb1 complexes are assembled near a gene [e.g., the β-globin (Hbb) locus] [30].


and/or Tal1 for their induction [7,9], these results advocated a model in which Ldb1 complexes represent key structures by which Gata1 and Tal1 positively regulate erythroid gene transcription.

Ldb1 complexes mediate long-range promoter–enhancer interactions

A consistent finding from studies using ChIP-Seq to map the binding profiles of essential erythropoietic transcription factors is that these proteins frequently bind at sites located far from potential target genes. Indeed, as mentioned above, most Gata1-, Tal1-, Klf1-, and Ldb1 complex-binding sites identified by ChIP-Seq are intronic or within intergenic regions [7,8,13,14,16,24,25]. Even at these distal locations, core erythroid transcription factors tend to co-occupy the same sites and are accompanied by epigenetic enhancer marks, suggesting that these regions function as regulatory elements.

The Hbb LCR establishes contact with promoters of the actively transcribed Hbb genes through chromatin looping [26]. Gata1, Tal1, and Klf1 are bound to the Hbb LCR and are each required for loop formation as well as for β-globin transcription [27,28]. However, these factors alone do not provide a satisfactory model to account for looping and distal interactions. By contrast, Ldb1 contains a self-association domain that is capable of facilitating the formation of stable long-range promoter–enhancer interactions through Ldb1-mediated oligomerization [29,30] (Figure 1B). Experiments using ChIP, Chromosome Conformation Capture (3C), and small hairpin (sh)RNA-mediated knockdown of Ldb1 in murine erythroleukemia (MEL) cells have shown that Ldb1 is physically present at the Hbb LCR and Hbb promoters, and that it is required for loop formation and for transcriptional activation of Hbb genes [31]. A reciprocal experiment demonstrated that enforced Ldb1 dimerization was sufficient for LCR-promoter looping and Hbb transcription even in the absence of Gata1 [32]. Ldb1 is also required for migration of the Hbb locus to regions of active transcription in the nucleus [33]. In addition to the Hbb promoters and LCR, Ldb1 complexes bind to distal regulatory elements upstream of the Myb gene [34]. 3C experiments demonstrated that these regulatory elements are brought into proximity with the Myb promoter in an Ldb1-dependent manner [34]. Novel long-range cis-interactions between the Hbb promoter and Ldb1 complexes bound to the promoters of other erythroid genes, including Uros and Tspan32, were also recently identified [24]. Thus, in addition to providing Gata1 and Tal1, key transcription factors necessary for erythropoiesis, Ldb1 complexes provide a mechanism for promoter–enhancer interactions through Ldb1 self-association.

Ldb1 complexes positively regulate Klf1

As noted above, the phenotype of Klf1−/− mice [5,6], together with the results of gene expression profiling studies, has identified a critical role for Klf1 in erythropoiesis and in the induction of many erythroid genes [10,11,35–37]. Interestingly, ChIP-Seq experiments revealed that Klf1 is a target of Ldb1 complexes [25]. In addition, knockdown of Ldb1 markedly attenuates induction of Klf1 in MEL cells, indicating that Klf1 is directly regulated by Ldb1 complexes [25]. This result is consistent with the previous finding that Gata1 regulates the expression of Klf1 and that Klf1 transcription is dependent upon a paired E-box–GATA motif within the Klf1 promoter [38,39]. Interestingly, known cis-regulatory elements for the Gata1, Tal1, and Lmo2 genes are also occupied by Ldb1 complexes in erythroid progenitors, suggesting a positive autoregulatory role for Ldb1 complexes in the expression of these subunits [24,25].

Ldb1 complexes cooperate with Klf1 to activate transcription of erythroid genes through distinct regulatory mechanisms

Comparative analysis of transcriptional profiling data from three studies [8,25,40] identified a cohort of 62 erythroid genes that are coregulated by Ldb1 complexes and by Klf1 (i.e., genes that are significantly downregulated in both Klf1−/− fetal liver erythroblasts and in differentiated MEL cells, where Ldb1 expression is reduced by shRNA) (Table 1). Strikingly, Ldb1 complex-binding sites were detected at or within 10 kb of 85% (53/62) of these coregulated genes in primary murine erythroid cells [25], indicating that Ldb1 complexes function primarily to regulate erythroid gene transcription directly. In most cases (e.g., Hba, Hbb, Gypc, Alad, Add2, Ermap, Urod, and Ppox), ChIP-Seq data revealed that both Klf1 and Ldb1 complexes bind to promoter-proximal sites, a configuration most consistent with a classical ‘feed-forward’ mechanism of transcriptional coregulation [41] (Figure 2A). However, a second group of coregulated genes (e.g., Ank1, Slc25a37, Ctsb, Rhd, and Gypa) were bound by Ldb1 complexes but not Klf1 near their transcription start site (TSS); yet, in each case, prominent colocalized Ldb1 complex and Klf1 binding was detected within the intron(s) of the same gene

Table 1. Ldb1 complex and Klf1 co-activated genes^a

Gene ontology term | Ldb1 complex and Klf1 co-activated genes
Membrane, cytoskeleton, or blood group | Cd24a, Cd47, Cd59a, Ermap, Gypa, Gypc, Kcnn4, Kel, Mgll, Rhd, Slc4a1, Slc2a4, Slc22a4, Tmcc2, Vamp5, Fam210b, Sppl2b
Cytoskeleton | Add2, Ank1, Epb4.1, Epb4.2, Spna1, Spnb1
Heme synthesis and transport or mitochondrial | Abcb10, Abcg2, Alad, Alas2, Blvrb, Bzrap1, Cat, Cpox, Fech, Hagh, Hmbs, Ppox, Urod, Vdac1
Hb or iron procurement | Fxn, Hba, Hbb, Ppox, Slc25a37/mitoferrin, Slc11a2/Dmt1, Steap3, Tfrc
Apoptosis, survival or cell cycle | Cdkn2c/p18INK4c, Ctsb, Dlgap5, Pim1, Prdx2, Ptp4a3, Rad23a, Rgcc
Cytoplasmic | Arrb1, Dck, Pcx, Stx2, Ube2c, Ubap1
Nuclear or transcription | Cdyl, E2f2, Mafk

^a Genes included in the table are those significantly down-regulated in Ldb1 shRNA-mediated knockdown murine erythroleukemia cells and in Klf1−/− fetal liver cells. Color code: black, genes bound by Ldb1 complex(es) (gene body ±10 kb) and by Klf1; blue, genes bound only by Klf1; red, genes bound only by Ldb1 complex(es) (gene body ±10 kb); green, genes not bound by Ldb1 complexes or Klf1. Gene list was compiled from [8,25,40,72].


[Figure 2 graphic. Panels (A), (B), and (C), each with (i) ChIP-Seq density tracks for IgG, Ldb1, Gata1, Tal1, and Klf1 and (ii) a model. Loci shown: (A) Ppox (scale bar 10 kb), (B) Rhd (scale bar 50 kb), (C) Spna1 (scale bar 100 kb). Model labels: Klf1-dependent transcription factory; cis or trans. Key: Ldb1 complex (Ldb1, Lmo2, Tal1/E2a, Gata1), Klf1, enhancer, erythroid gene. TRENDS in Genetics.]

Figure 2. Hypothetical models for the cooperation of Ldb1 complexes and Klf1 in the regulation of erythroid gene transcription. Genome-wide mapping of Ldb1 complex and Klf1 binding by ChIP-Seq suggests several possible mechanisms by which Klf1 functions in concert with the Ldb1 complex to activate erythroid genes known to be dependent on both Klf1 and Ldb1 for their expression. Images on the left represent raw ChIP-Seq read data for Ldb1, Gata1, Tal1, and Klf1 [8,25] transformed into a density plot for each factor and loaded into the University of California at Santa Cruz (UCSC) genome browser as a custom track. Models corresponding to the binding profiles are depicted on the right. (A) (i) Example of an erythroid gene (Ppox) where an Ldb1 complex and Klf1 bind in close proximity to each other and to the transcription start site of a gene to directly activate transcription. (ii) Model depicting the direct regulation of Ppox transcription by the Ldb1 complex and Klf1. (B) (i) Example of an erythroid gene (Rhd) where an Ldb1 complex but not Klf1 binds at the promoter and where both an Ldb1 complex and Klf1 bind to a distal enhancer. (ii) Model depicting the recruitment of Klf1 to the Rhd promoter through dimerization of the Ldb1 self-association domain. (C) (i) Example of an erythroid gene (Spna1) where Ldb1 complex binding is detected at the gene but Klf1 binding is not detected within 100 kb of the gene. (ii) Model depicts Ldb1 complex-mediated recruitment of Spna1 to a transcriptional hub near the β-globin (Hbb) gene. In this model, the transcriptional hub serves as a nexus where Klf1-dependent genes are brought into locations of direct Klf1 recruitment through the self-interactions of Ldb1 complexes.


or at intergenic sites located 10–100 kb away at known or suspected cis-regulatory elements (Figure 2B). Similar to what has been shown at the Hbb and Myb loci [31,32,34], Ldb1-mediated dimerization likely facilitates promoter–enhancer interactions necessary to bring Klf1 into proximity with the promoter, enabling transcriptional activation of these genes (Figure 2B). A third group of coregulated genes were bound by Ldb1 complexes, but no Klf1 binding was detected at or within 100 kb of the gene (Figure 2C). This configuration could reflect that Klf1 dependence is indirect (i.e., transcription is controlled by a Klf1-dependent factor but not directly by Klf1). Notwithstanding, recent data suggest that a more complex mechanism is responsible for the regulation of many of these genes, including Arrb1, Fech, Uros, Kcnn4, Spna1, Spnb1, Kel, Tmcc2, and Cpox. In definitive erythroid cells, each of the aforementioned genes is recruited to active sites of transcription (designated transcriptional interactomes, active chromatin hubs, or transcription factories) near the Hba or Hbb loci [40]. Previous work has shown that Ldb1 complexes [18,24,31] and Klf1 [8,42,43] bind to the Hbb promoters, and the α- and β-globin regulatory elements. Moreover, both Ldb1 and Klf1 are required to establish long-range promoter–LCR interactions at these loci and are also required for α- and β-globin gene transcription [28,31]. We speculate, based on these findings, that Klf1-dependent genes, such as Spna1, are recruited by Ldb1 complexes to sites of active transcription at the Hba and Hbb loci (Figure 2C). A corollary of this model (Figure 2C) is that, although Klf1 is necessary for initiating transcription at active chromatin hubs, it may not be directly required for the transcription of genes that are subsequently recruited to these sites. It is also likely that sites of Ldb1 complex–Klf1 cobinding other than those at the Hba and Hbb loci function as erythroid transcriptional interactomes. Finally, recent data indicate that Ldb1 proteins may preferentially form trimers or higher-order oligomers [30], raising the possibility that individual Ldb1 complex-bound enhancer modules could be recruited to more than one target gene or that multiple Ldb1 complex-bound regulatory elements can be recruited to genes with a single Ldb1 complex-binding site. Accordingly, these data suggest that Ldb1 complexes function in several distinct ways to orchestrate transcriptional activation on a global scale during terminal erythropoiesis.
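The three binding configurations discussed in this section (Figure 2A–C) amount to a small decision rule over ChIP-Seq calls. The Python sketch below restates that rule under stated assumptions: the three boolean inputs, the BindingMode names, and the classify function are hypothetical simplifications introduced here, and the example calls simply encode how Ppox, Rhd, and Spna1 are described in the text.

```python
from enum import Enum


class BindingMode(Enum):
    PROMOTER_PROXIMAL = "Figure 2A: Ldb1 complex and Klf1 both near the TSS"
    ENHANCER_LOOP = "Figure 2B: Ldb1 complex at the TSS, Klf1 with an Ldb1 complex at a distal site"
    HUB_RECRUITMENT = "Figure 2C: Ldb1 complex bound, no Klf1 within 100 kb"
    NOT_LDB1_BOUND = "no Ldb1 complex binding detected"


def classify(ldb1_complex_bound, klf1_near_tss, klf1_distal_with_ldb1):
    """Assign a coregulated gene to one of the configurations in Figure 2.

    The boolean inputs are hypothetical summaries of ChIP-Seq calls:
    klf1_distal_with_ldb1 stands for colocalized Klf1 and Ldb1 complex
    binding in an intron or at an intergenic site 10-100 kb away.
    """
    if not ldb1_complex_bound:
        return BindingMode.NOT_LDB1_BOUND
    if klf1_near_tss:
        return BindingMode.PROMOTER_PROXIMAL
    if klf1_distal_with_ldb1:
        return BindingMode.ENHANCER_LOOP
    return BindingMode.HUB_RECRUITMENT


# Example genes named in the text for each configuration.
calls = {
    "Ppox": classify(True, True, False),
    "Rhd": classify(True, False, True),
    "Spna1": classify(True, False, False),
}
for gene, mode in calls.items():
    print(gene, "->", mode.name)
```

The rule checks promoter-proximal Klf1 binding first, so a gene with Klf1 at both the promoter and a distal site would be placed in the Figure 2A group; that ordering is a choice made for this sketch and is not stated in the source.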

Role of Ldb1 complexes in gene repression

Although Ldb1 complex binding strongly correlates with gene activation, binding of Ldb1 complexes has been associated with gene repression in a few cases. For example, Ldb1 complexes bind to the Lyl1 and Egr1 promoters and knockdown of Ldb1 in MEL cells causes both genes to be significantly upregulated [25]. An emerging concept is that, whereas ‘core’ Ldb1 complexes (Ldb1/Lmo2/Tal1/Gata1) function mainly as transcriptional activators, repressive potential can be conferred by recruitment of additional factors by subunits of the Ldb1 complex. Gata1 can associate with the zinc finger protein Friend of Gata1 (Fog1/Zfpm1), which is capable of recruiting the nucleosome remodeling and deacetylase (NuRD) complex and the corepressor CtBP, providing a mechanism for Gata1-dependent gene repression [44]. It was recently shown that Gata1 can simultaneously interact with Lmo2 and Fog1, demonstrating that Fog1, and presumably other associated factors, can be recruited to Ldb1 complexes through Gata1 [45]. Ldb1 complexes can also potentially acquire repressive activity through Tal1-mediated recruitment of Cbfa2t3 (Eto2), which can bind histone deacetylases [46–48]. Interestingly, ChIP-Seq data have shown that Eto2 and the related protein Mtgr1 bind to Ldb1 complexes in undifferentiated MEL cells, but dissociate upon terminal erythroid differentiation, when most erythroid genes are strongly induced [24].

The polycomb repressive complex 2 (PRC2) has also recently been implicated in Gata1-mediated erythroid gene repression [9,14]. Using an estrogen receptor-inducible system to induce Gata1 nuclear localization, it was observed that a subset of the Gata1-repressed genes showed enrichment of repressive histone H3 trimethyl Lys27 (H3K27me3) histone modification and loss of cobinding of Tal1 with Gata1 [9,14]. Gata1 can physically associate with Suz12 and Ezh2, two core subunits of PRC2, and Suz12, in turn, can recruit the transcriptional repressor Gfi-1b [49]. Interestingly, Gata1-binding sites within several Gata1-repressed genes, including c-Kit, Gata2, and c-Myb, were co-occupied by Gfi-1b [14]. The loss of Tal1 (and presumably also Lmo2 and Ldb1) at these sites may facilitate, or be a consequence of, the formation of repressive Gata1/Gfi-1b/PRC2 complexes.

Function of Ldb1 complexes in other hematopoietic lineages

Several recent studies have identified important roles for Ldb1 complexes in nonerythroid hematopoietic lineages. Notably, DNA-binding complexes that include Ldb1, Tal1, and Gata2 in lieu of Gata1 have been shown to regulate a transcriptional program required for HSC maintenance [50]. Whereas Gata1 is essential for erythropoiesis, Gata2, which is highly expressed in hematopoietic progenitors, performs an equally critical role in the generation and maintenance of HSCs [51]. Commitment of HSCs to the erythroid lineage is associated with an event known as the ‘Gata switch’, which involves the induction of Gata1 and the Gata1-mediated repression of Gata2 (Box 2) [52]. Substitution of Gata1 for Gata2 within Ldb1 complexes results in induction of Klf1 and global erythroid gene activation. Thus, the modular design of Ldb1 complexes enables acquisition of distinct gene regulatory activities in HSCs and erythroblasts through the stage-specific assembly of different Ldb1 complexes that incorporate either Gata2 or Gata1. Ldb1 complexes that contain Gata2 are likely required for hematopoietic specification in the embryo and are also necessary for proper development of hemangioblasts, the common progenitors of hematopoietic and endothelial cells [53]. Megakaryocytes and mast cells have been shown to require Gata1, Gata2, Tal1, and Ldb1 for their normal development, indicating that Ldb1 complexes perform key functions in these lineages [21,54–57]. In agreement with this idea, Gata1, Tal1, and Ldb1 co-occupancy has been observed at several key megakaryocyte genes, including Mpl, αIIb, GpIa, Mc-Cpa, FcεR1-β, Pf4 [18], and Itga2b [25].

Review Trends in Genetics January 2014, Vol. 30, No. 1


Concluding remarks

In large part attributable to the advent of new technologies enabling genome-wide DNA-binding site profiling, the past few years have witnessed dramatic advances in understanding of the genetic regulatory mechanisms controlling erythropoiesis. One of the most intriguing findings is that two major erythroid transcription factors, Gata1 and Tal1, often function cooperatively within higher-order Ldb1-nucleated protein complexes to activate erythroid genes. Whether, or to what extent, the regulatory properties of Gata1 and/or Tal1 are affected by their inclusion within Ldb1 complexes remains to be determined. An important role of Ldb1 complexes is to facilitate, via Ldb1-mediated oligomerization, long-distance interactions, including the juxtaposition of erythroid promoters and enhancer elements and the recruitment of erythroid genes to transcriptional interactomes near the Hba and Hbb genes. Whether Ldb1 oligomerization can facilitate trans as well as cis chromosomal interactions is unclear and it is also currently unknown if a single complex can recruit multiple regulatory elements through Ldb1 oligomerization. Other important questions for future investigation are whether Ldb1 complexes are required for the formation of erythroid transcriptional interactomes and if Ldb1 complex-mediated associations are static or instead dynamic and transient. Although the core subunits of erythroid Ldb1 complexes, which include Ldb1, Lmo2, Gata1, Tal1, and E2a, have been identified, clear evidence exists for Ldb1 complex modularity that can potentially modify or fundamentally alter the regulatory properties of Ldb1 complexes. Identifying the precise subunit composition and binding sites of Ldb1 complexes at different stages of hematopoiesis and correlating these data with gene expression profiling represent important goals of ongoing and future studies.

Box 2. The ‘Gata switch’ model

Gata2 has a critical role in the emergence and maintenance of HSCs as well as in the specification of early erythroid progenitors [73]. The zinc fingers of Gata2 and Gata1 share a high degree of similarity and they both bind a consensus GATA DNA motif (Box 1). Gata2 can partially restore primitive erythropoiesis in the absence of Gata1 [74]. However, the N and C termini of Gata1 and Gata2 are divergent, suggesting that Gata1 and Gata2 interact with a unique set of cofactors. Gata1 and Gata2 are involved in a key regulatory loop during erythropoiesis designated the ‘Gata-switch’ [75]. Gata2 directly activates its own gene in HSCs but Gata1 is induced during the early stages of erythropoiesis (Figure IA). As a consequence, Gata1 replaces Gata2 at both genes, resulting in repression of Gata2 transcription and increased Gata1 transcription. Studies using ChIP-Seq to interrogate the binding profiles of Gata2 and Gata1 in erythroid precursor cells have found that they share many chromatin sites and also reveal that this exchange in Gata factor binding is widespread [12,13]. Thus, an elegant ‘switch’ of Gata factors triggers erythroid differentiation. In HSCs, Gata2 functions in large part within the context of Ldb1 complexes to control expression of genes responsible for HSC maintenance [50], whereas Ldb1 complexes that contain Gata1 regulate the expression of a large cohort of erythroid signature genes [24,25] (Figure IB,C). Consequently, the Ldb1 complex serves as a core structure through which the ‘Gata switch’ operates to regulate erythroid lineage commitment and differentiation.
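The regulatory loop described in Box 2 (Gata2 autoactivation, induction of Gata1, and Gata1-mediated repression of Gata2) can be illustrated with a toy simulation. The Python sketch below is only an illustration under stated assumptions and is not a model taken from the studies cited here: the Hill-type rate terms, the function name `simulate_gata_switch`, and every numerical value (basal rate, activation strengths, thresholds, induction time) are hypothetical and were chosen only so that the qualitative switch is visible.

```python
def hill(x, k, n=2):
    """Saturating activation term x^n / (k^n + x^n)."""
    return x ** n / (k ** n + x ** n)


def simulate_gata_switch(t_end=40.0, t_induce=10.0, dt=0.01):
    """Toy Euler integration of a two-gene 'Gata switch'.

    All parameters are arbitrary illustrative assumptions:
    - Gata2 activates its own gene (positive autoregulation).
    - An external signal induces Gata1 from time t_induce onward.
    - Gata1 reinforces its own expression and represses Gata2 autoactivation.
    Returns a list of (time, gata2_level, gata1_level) tuples.
    """
    gata2, gata1 = 1.5, 0.0  # progenitor-like start: Gata2 high, Gata1 absent
    trace = []
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        induction = 0.5 if t >= t_induce else 0.0
        # Gata1 occupancy weakens Gata2 autoactivation (assumed form).
        repression = 1.0 / (1.0 + (gata1 / 0.5) ** 2)
        d_gata2 = 0.1 + 2.0 * hill(gata2, 1.0) * repression - gata2
        d_gata1 = induction + 2.0 * hill(gata1, 1.0) - gata1
        gata2 += dt * d_gata2
        gata1 += dt * d_gata1
        trace.append((t + dt, gata2, gata1))
    return trace


if __name__ == "__main__":
    trace = simulate_gata_switch()
    for t, g2, g1 in trace[::500]:
        print(f"t={t:5.1f}  Gata2={g2:.2f}  Gata1={g1:.2f}")
```

In this sketch Gata2 stays high while Gata1 is absent, and the levels invert once Gata1 induction begins, which mirrors the qualitative expression pattern described for Figure IA. The numbers carry no biological meaning.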

[Figure I panel labels. (A) Expression of Gata2 and Gata1 across the HSC, CMP, MEP, and erythroblast stages during erythroid maturation. (B) Ldb1 complex containing Gata2, Tal1, E2a, Lmo2, and Ldb1 bound at a CANNTG-N7-9-WGATAR motif of HSC maintenance genes. (C) Ldb1 complex containing Gata1, Tal1, E2a, Lmo2, and Ldb1 bound at a CANNTG-N7-9-WGATAR motif of erythroid genes.]

Figure I. ‘Gata switch’ model. (A) Graph representing the expression of Ldb1 complex components as hematopoietic progenitor cells differentiate toward the erythroid lineage. Notably, Gata2 expression is high and Gata1 expression is low in the earliest progenitor cells, but as cells become committed to the erythroid lineage, Gata1 expression is induced whereas Gata2 is repressed. Relative expression levels were deduced from microarray data available at http://www.BioGps.org and RNA-Seq data available on the University of California at Santa Cruz (UCSC) Genome Browser. (B,C) Effect of the Gata switch on the subunit composition of Ldb1 complexes. In HSCs, Ldb1 complexes that contain Gata2 regulate expression of HSC maintenance genes. As a result of the Gata switch during erythropoiesis, Gata1 is incorporated into the Ldb1 complex to activate expression of erythroid genes. Abbreviations: CMP, common myeloid progenitor; HSC, hematopoietic stem cell; MEP, megakaryocyte-erythroid progenitor.
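The composite motif shown in Figure I, CANNTG-N7-9-WGATAR, combines an E-box (CANNTG) and a GATA site (WGATAR) separated by 7 to 9 bases. As a minimal sketch of how such a motif could be located in a DNA sequence, the Python example below expands the IUPAC codes (N, any base; W, A or T; R, A or G) into a regular expression. The function name `find_composite_motifs` and the example sequence are hypothetical, and the sketch scans only the forward strand and only the E-box-first orientation.

```python
import re

# IUPAC expansion: N = [ACGT], W = [AT], R = [AG].
# The lookahead lets matches that start at different positions overlap.
COMPOSITE_MOTIF = re.compile(
    r"(?=(CA[ACGT]{2}TG[ACGT]{7,9}[AT]GATA[AG]))"
)


def find_composite_motifs(sequence):
    """Return (start, matched_text) for each forward-strand motif match.

    start is the 0-based position of the first base of the E-box.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in COMPOSITE_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Made-up sequence: E-box CAGCTG, an 8-base spacer, GATA site AGATAA.
    example = "TT" + "CAGCTG" + "ACGTACGT" + "AGATAA" + "CC"
    for start, text in find_composite_motifs(example):
        print(start, text)
```

A fuller scan would also search the reverse complement and would typically use position weight matrices rather than an exact consensus. Those refinements are omitted here for brevity.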


Acknowledgments

This work was supported by the Intramural Research Programs of Eunice Kennedy Shriver NICHD [Project number 1ZIAHD001803-19 (to P.E.L.)]. The authors thank Karl Pfeifer for critical review of the manuscript.

References

1 Pevny, L. et al. (1991) Erythroid differentiation in chimaeric mice blocked by a targeted mutation in the gene for transcription factor GATA-1. Nature 349, 257–260
2 Schlaeger, T.M. et al. (2005) Tie2Cre-mediated gene ablation defines the stem-cell leukemia gene (SCL/tal1)-dependent window during hematopoietic stem-cell development. Blood 105, 3871–3874
3 Robb, L. et al. (1995) Absence of yolk sac hematopoiesis from mice with a targeted disruption of the scl gene. Proc. Natl. Acad. Sci. U.S.A. 92, 7075–7079
4 Porcher, C. et al. (1996) The T cell leukemia oncoprotein SCL/tal-1 is essential for development of all hematopoietic lineages. Cell 86, 47–57
5 Nuez, B. et al. (1995) Defective haematopoiesis in fetal liver resulting from inactivation of the EKLF gene. Nature 375, 316–318
6 Perkins, A.C. et al. (1995) Lethal beta-thalassaemia in mice lacking the erythroid CACCC-transcription factor EKLF. Nature 375, 318–322
7 Kassouf, M.T. et al. (2010) Genome-wide identification of TAL1’s functional targets: insights into its mechanisms of action in primary erythroid cells. Genome Res. 20, 1064–1083
8 Tallack, M.R. et al. (2010) A global role for KLF1 in erythropoiesis revealed by ChIP-seq in primary erythroid cells. Genome Res. 20, 1052–1063
9 Cheng, Y. et al. (2009) Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications, and mRNA expression. Genome Res. 19, 2172–2184
10 Tallack, M.R. et al. (2012) Novel roles for KLF1 in erythropoiesis revealed by mRNA-seq. Genome Res. 22, 2385–2398
11 Hodge, D. et al. (2006) A global role for EKLF in definitive and primitive erythropoiesis. Blood 107, 3359–3370
12 Wu, W. et al. (2011) Dynamics of the epigenetic landscape during erythroid differentiation after GATA1 restoration. Genome Res. 21, 1659–1671
13 Fujiwara, T. et al. (2009) Discovering hematopoietic mechanisms through genome-wide analysis of GATA factor chromatin occupancy. Mol. Cell 36, 667–681
14 Yu, M. et al. (2009) Insights into GATA-1-mediated gene activation versus repression via genome-wide chromatin occupancy analysis. Mol. Cell 36, 682–695
15 Papadopoulos, G.L. et al. (2013) GATA-1 genome-wide occupancy associates with distinct epigenetic profiles in mouse fetal liver erythropoiesis. Nucleic Acids Res. 41, 4938–4948
16 Wontakal, S.N. et al. (2012) A core erythroid transcriptional network is repressed by a master regulator of myelo-lymphoid differentiation. Proc. Natl. Acad. Sci. U.S.A. 109, 3832–3837
17 Wozniak, R.J. et al. (2008) Molecular hallmarks of endogenous chromatin complexes containing master regulators of hematopoiesis. Mol. Cell. Biol. 28, 6681–6694
18 Tripic, T. et al. (2009) SCL and associated proteins distinguish active from repressive GATA transcription factor complexes. Blood 113, 2191–2201
19 Wadman, I.A. et al. (1997) The LIM-only protein Lmo2 is a bridging molecule assembling an erythroid, DNA-binding complex which includes the TAL1, E47, GATA-1 and Ldb1/NLI proteins. EMBO J. 16, 3145–3157
20 Mukhopadhyay, M. et al. (2003) Functional ablation of the mouse Ldb1 gene results in severe patterning defects during gastrulation. Development 130, 495–505
21 Li, L. et al. (2010) A requirement for Lim domain binding protein 1 in erythropoiesis. J. Exp. Med. 207, 2543–2550
22 Yamada, Y. et al. (1998) The T cell leukemia LIM protein Lmo2 is necessary for adult mouse hematopoiesis. Proc. Natl. Acad. Sci. U.S.A. 95, 3890–3895
23 Warren, A.J. et al. (1994) The oncogenic cysteine-rich LIM domain protein rbtn2 is essential for erythroid development. Cell 78, 45–57
24 Soler, E. et al. (2010) The genome-wide dynamics of the binding of Ldb1 complexes during erythroid differentiation. Genes Dev. 24, 277–289
25 Li, L. et al. (2013) Ldb1-nucleated transcription complexes function as primary mediators of global erythroid gene activation. Blood 121, 4575–4585
26 Tolhuis, B. et al. (2002) Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol. Cell 10, 1453–1465
27 Drissen, R. et al. (2004) The active spatial organization of the beta-globin locus requires the transcription factor EKLF. Genes Dev. 18, 2485–2490
28 Vakoc, C.R. et al. (2005) Proximity among distant regulatory elements at the beta-globin locus requires GATA-1 and FOG-1. Mol. Cell 17, 453–462
29 Matthews, J.M. and Visvader, J.E. (2003) LIM-domain-binding protein 1: a multifunctional cofactor that interacts with diverse proteins. EMBO Rep. 4, 1132–1137
30 Cross, A.J. et al. (2010) LIM domain binding proteins 1 and 2 have different oligomeric states. J. Mol. Biol. 399, 133–144
31 Song, S.H. et al. (2007) A positive role for NLI/Ldb1 in long-range beta-globin locus control region function. Mol. Cell 28, 810–822
32 Deng, W. et al. (2012) Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233–1244
33 Song, S.H. et al. (2010) Multiple functions of Ldb1 required for beta-globin activation during erythroid differentiation. Blood 116, 2356–2364
34 Stadhouders, R. et al. (2012) Dynamic long-range chromatin interactions control Myb proto-oncogene transcription during erythroid development. EMBO J. 31, 986–999
35 Drissen, R. et al. (2005) The erythroid phenotype of EKLF-null mice: defects in hemoglobin metabolism and membrane stability. Mol. Cell. Biol. 25, 5205–5214
36 Nilson, D.G. et al. (2006) Major erythrocyte membrane protein genes in EKLF-deficient mice. Exp. Hematol. 34, 705–712
37 Pilon, A.M. et al. (2011) Genome-wide ChIP-Seq reveals a dramatic shift in the binding of the transcription factor erythroid Kruppel-like factor during erythrocyte differentiation. Blood 118, e139–e148
38 Anderson, K.P. et al. (1998) Multiple proteins binding to a GATA-E box-GATA motif regulate the erythroid Kruppel-like factor (EKLF) gene. J. Biol. Chem. 273, 14347–14354
39 Crossley, M. et al. (1994) Regulation of the erythroid Kruppel-like factor (EKLF) gene promoter by the erythroid transcription factor GATA-1. J. Biol. Chem. 269, 15440–15444
40 Schoenfelder, S. et al. (2010) Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat. Genet. 42, 53–61
41 Mangan, S. and Alon, U. (2003) Structure and function of the feed-forward loop network motif. Proc. Natl. Acad. Sci. U.S.A. 100, 11980–11985
42 Shyu, Y.C. et al. (2006) Chromatin-binding in vivo of the erythroid kruppel-like factor, EKLF, in the murine globin loci. Cell Res. 16, 347–355
43 Vernimmen, D. et al. (2007) Long-range chromosomal interactions regulate the timing of the transition between poised and active gene expression. EMBO J. 26, 2041–2051
44 Hong, W. et al. (2005) FOG-1 recruits the NuRD repressor complex to mediate transcriptional repression by GATA-1. EMBO J. 24, 2367–2378
45 Wilkinson-White, L. et al. (2011) Structural basis of simultaneous recruitment of the transcriptional regulators LMO2 and FOG1/ZFPM1 by the transcription factor GATA1. Proc. Natl. Acad. Sci. U.S.A. 108, 14443–14448
46 Schuh, A.H. et al. (2005) ETO-2 associates with SCL in erythroid cells and megakaryocytes and provides repressor functions in erythropoiesis. Mol. Cell. Biol. 25, 10235–10250
47 Meier, N. et al. (2006) Novel binding partners of Ldb1 are required for haematopoietic development. Development 133, 4913–4923
48 Goardon, N. et al. (2006) ETO2 coordinates cellular proliferation and differentiation during erythropoiesis. EMBO J. 25, 357–366
49 Saleque, S. et al. (2002) The zinc-finger proto-oncogene Gfi-1b is essential for development of the erythroid and megakaryocytic lineages. Genes Dev. 16, 301–306


50 Li, L. et al. (2011) Nuclear adaptor Ldb1 regulates a transcriptional program essential for the maintenance of hematopoietic stem cells. Nat. Immunol. 12, 129–136
51 Ling, K.W. et al. (2004) GATA-2 plays two functionally distinct roles during the ontogeny of hematopoietic stem cells. J. Exp. Med. 200, 871–882
52 Bresnick, E.H. et al. (2010) GATA switches as developmental drivers. J. Biol. Chem. 285, 31087–31093
53 Mylona, A. et al. (2013) Genome-wide analysis shows that Ldb1 controls essential hematopoietic genes/pathways in mouse early development and reveals novel players in hematopoiesis. Blood 121, 2902–2913
54 Mikkola, H.K. et al. (2003) Haematopoietic stem cells retain long-term repopulating activity and multipotency in the absence of stem-cell leukaemia SCL/tal-1 gene. Nature 421, 547–551
55 Salmon, J.M. et al. (2007) Aberrant mast-cell differentiation in mice lacking the stem-cell leukemia gene. Blood 110, 3573–3581
56 Cantor, A.B. et al. (2008) Antagonism of FOG-1 and GATA factors in fate choice for the mast cell lineage. J. Exp. Med. 205, 611–624
57 Migliaccio, A.R. et al. (2003) GATA-1 as a regulator of mast cell differentiation revealed by the phenotype of the GATA-1low mouse mutant. J. Exp. Med. 197, 281–296
58 Cantor, A.B. and Orkin, S.H. (2002) Transcriptional regulation of erythropoiesis: an affair involving multiple partners. Oncogene 21, 3368–3376
59 Trainor, C.D. et al. (2000) GATA zinc finger interactions modulate DNA binding and transactivation. J. Biol. Chem. 275, 28157–28166
60 Lee, H.Y. et al. (2009) Controlling hematopoiesis through sumoylation-dependent regulation of a GATA factor. Mol. Cell 36, 984–995
61 Lowry, J.A. and Mackay, J.P. (2006) GATA-1: one protein, many partners. Int. J. Biochem. Cell Biol. 38, 6–11
62 Fujiwara, Y. et al. (1996) Arrested development of embryonic red cell precursors in mouse embryos lacking transcription factor GATA-1. Proc. Natl. Acad. Sci. U.S.A. 93, 12355–12358
63 Green, A.R. and Begley, C.G. (1992) SCL and related hemopoietic helix-loop-helix transcription factors. Int. J. Cell Cloning 10, 269–276
64 Souroullas, G.P. et al. (2009) Adult hematopoietic stem and progenitor cells require either Lyl1 or Scl for survival. Cell Stem Cell 4, 180–186
65 Hall, M.A. et al. (2003) The critical regulator of embryonic hematopoiesis, SCL, is vital in the adult for megakaryopoiesis, erythropoiesis, and lineage choice in CFU-S12. Proc. Natl. Acad. Sci. U.S.A. 100, 992–997
66 Siatecka, M. and Bieker, J.J. (2011) The multifunctional role of EKLF/KLF1 during erythropoiesis. Blood 118, 2044–2054
67 McConnell, B.B. and Yang, V.W. (2010) Mammalian Kruppel-like factors in health and diseases. Physiol. Rev. 90, 1337–1381
68 Frontelo, P. et al. (2007) Novel role for EKLF in megakaryocyte lineage commitment. Blood 110, 3871–3880
69 Tallack, M.R. and Perkins, A.C. (2010) Megakaryocyte-erythroid lineage promiscuity in EKLF null mouse blood. Haematologica 95, 144–147
70 Bouilloux, F. et al. (2008) EKLF restricts megakaryocytic differentiation at the benefit of erythrocytic differentiation. Blood 112, 576–584
71 Ryan, D.P. et al. (2008) Assembly of the oncogenic DNA-binding complex LMO2-Ldb1-TAL1-E12. Proteins 70, 1461–1474
72 Tallack, M.R. and Perkins, A.C. (2010) KLF1 directly coordinates almost all aspects of terminal erythroid differentiation. IUBMB Life 62, 886–890
73 Vicente, C. et al. (2012) The role of the GATA2 transcription factor in normal and malignant hematopoiesis. Crit. Rev. Oncol. Hematol. 82, 1–17
74 Fujiwara, Y. et al. (2004) Functional overlap of GATA-1 and GATA-2 in primitive hematopoietic development. Blood 103, 583–585
75 Grass, J.A. et al. (2003) GATA-1-dependent transcriptional repression of GATA-2 via disruption of positive autoregulation and domain-wide chromatin remodeling. Proc. Natl. Acad. Sci. U.S.A. 100, 8811–8816


Roles of cilia, fluid flow, and Ca2+ signaling in breaking of left–right symmetry

Satoko Yoshiba* and Hiroshi Hamada

Developmental Genetics Group, Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamada-oka, Suita, Osaka 565-0871, Japan

The emergence of left–right (L–R) asymmetry during embryogenesis is a classic problem in developmental biology. It is only since the 1990s, however, that substantial insight into this problem has been achieved by molecular and genetic approaches. Various genes required for L–R asymmetric morphogenesis in vertebrates have now been identified, and many of these genes are required for the formation and motility of cilia. Breaking of L–R symmetry in the mouse embryo occurs in the ventral node, where two types of cilia are present. Whereas centrally located motile cilia generate a leftward fluid flow, peripherally located immotile cilia sense a flow-dependent signal, which is either chemical or mechanical in nature. Although Ca2+ signaling is implicated in flow sensing, the precise mechanism remains unknown. Here we summarize current knowledge of L–R symmetry breaking in vertebrates (focusing on the mouse), with a special emphasis on the roles of cilia, fluid flow, and Ca2+ signaling.

An introduction to L–R asymmetry

Visceral organs of vertebrates exhibit L–R asymmetry with regard to their position and morphology. Recent molecular and genetic studies, which began in the 1990s, have uncovered mechanisms responsible for the generation of such L–R asymmetry. These mechanisms are largely conserved among vertebrates, although substantial diversity has been identified [1]. Four steps are required to establish L–R asymmetric patterning in the mouse embryo (Figure 1) [2]: (i) symmetry breaking by a leftward fluid flow (nodal flow) generated by the rotational movement of primary cilia at the node (see Glossary); (ii) transmission of an asymmetric signal (or signals) produced in or around the node to the lateral plate mesoderm (LPM); (iii) asymmetric expression of Nodal and the gene for its feedback inhibitor Lefty2 in the left LPM; and (iv) situs-specific morphogenesis as a result of asymmetric expression of Pitx2, which encodes a transcription factor activated by Nodal signaling. Despite recent progress, a key issue that remains unresolved is how the asymmetric signal (or signals) is transferred from the node to the LPM. Here we describe current understanding of the mechanism of L–R patterning, which is based largely on genetic evidence, with special emphasis on cilia, fluid flow, and Ca2+ signaling. Observations discussed hereafter were made with mouse embryos, unless indicated otherwise.


Glossary

Crown cells: cells located at the periphery of the ventral node. Most possess an immotile cilium and express specific genes required for normal L–R patterning including Nodal, Gdf1, and Cerl2.

Endoderm: a group of cells that line digestive and respiratory tubes within the embryo or body. At E8 during mouse development, when breaking of L–R symmetry takes place, the surface layer of cells near the node is also endoderm.

Gastrocoel roof plate: a ciliated epithelium located at the posterior end of the notochord in amphibian embryos, where flow develops shortly before the onset of L–R asymmetric gene expression.

Headfold stage: a stage about 7.5–8.0 days after fertilization in the mouse embryo, when a ventral fold is formed by rapid growth of the head of the embryo.

Kupffer’s vesicle: a transient spherical structure that arises in the tailbud of teleost embryos. It is filled with fluid and its inside is covered with cells with motile cilia.

Lateral plate mesoderm (LPM): a group of mesoderm cells located in the lateral region of the embryo that will contribute to the mesenchyme of various organs. Although LPM is present on both sides of the embryo, that on the left side expresses Nodal and will be patterned differently from that on the right side.

Nodal: a transforming growth factor β (TGFβ)–related factor that regulates various aspects of early embryogenesis including mesoderm formation and L–R patterning.

Node: a ventral indentation with a ciliated epithelium located posterior to the notochord and anterior to the primitive streak in the mouse embryo. Although ‘the posterior notochord’ may be an embryologically more precise term [72], ‘node’ is used in this review.

Pit cells: cells located at the central region of the ventral node. Most possess a motile cilium that generates nodal flow.

Planar cell polarity (PCP): a conserved mechanism responsible for the polarization of cells along specific axes in a tissue. Its core components include Dvl (Dishevelled), Fz (Frizzled), Vangl, and Prickle, some of which are localized to one side of a cell while others are localized to the opposite side.

Primary ciliary dyskinesia (PCD): a group of genetic disorders caused by a defect of motile cilia. Its main symptoms include chronic respiratory infections, infertility, and laterality defects. A number of PCD-causing genes have been identified in human [73,74]. Many of these genes encode an axonemal dynein component, whereas others are required for assembly of dynein complex, transport of axonemal dyneins to the cilium, or the regulation of dynein activity.

Situs-specific morphogenesis: formation of visceral organs according to left–right positional information. Almost all visceral organs are left–right asymmetric in their position or shape.

0168-9525/$ – see front matter

© 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tig.2013.09.001

Corresponding authors: Yoshiba, S. ([email protected]); Hamada, H. ([email protected]).

Keywords: cilia; fluid flow; left–right asymmetry.

* Current address: Centrosome Biology Lab, National Institute of Genetics, Yata 1111, Mishima, Shizuoka 411-8540, Japan.


Symmetry breaking by cilia

Unidirectional fluid flow generated by rotational movement of cilia

There are 200–300 monociliated pit cells in the node cavity of the mouse embryo at embryonic (E) day E8.0 (early somite stage), and the cilia of these cells rotate in a clockwise direction at a speed of ~600 rpm. This rotational movement generates a leftward fluid flow, rather than a vortical flow, because the cilia protrude in a posteriorly tilted manner [3,4]. Asymmetry in the rotational stroke may also contribute to generation of the unidirectional flow [5].
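The geometric argument in this paragraph can be made concrete with a small calculation. The Python sketch below is a purely geometric toy, not a hydrodynamic simulation and not a method from the studies cited here. It converts the rotation speed of ~600 rpm into a frequency, and it traces the tip of a cilium that sweeps a cone whose axis is tilted toward the posterior. The tilt angle (40°) and cone half-angle (30°) are illustrative assumptions that are not given in the text, and all function names are hypothetical. The axes are: +x, the embryo's left; +y, anterior; +z, height above the cell surface. Rotation is clockwise when viewed from the ventral side.

```python
import math


def rpm_to_hz(rpm):
    """Convert rotations per minute to rotations per second."""
    return rpm / 60.0


def tip_samples(tilt_deg, half_angle_deg, length=1.0, steps=360):
    """Sample the cilium tip over one clockwise rotation.

    The cone axis is tilted by tilt_deg toward the posterior (-y).
    Returns (height, leftward_velocity) pairs, where leftward_velocity is
    the x-velocity of the tip divided by the angular speed.
    """
    theta = math.radians(tilt_deg)
    psi = math.radians(half_angle_deg)
    samples = []
    for k in range(steps):
        phi = -2.0 * math.pi * k / steps  # clockwise: phase decreases
        height = length * (math.cos(psi) * math.cos(theta)
                           + math.sin(psi) * math.sin(phi) * math.sin(theta))
        leftward_velocity = length * math.sin(psi) * math.sin(phi)
        samples.append((height, leftward_velocity))
    return samples


def mean_stroke_heights(tilt_deg, half_angle_deg):
    """Mean tip height during the leftward and rightward halves of a turn."""
    samples = tip_samples(tilt_deg, half_angle_deg)
    leftward = [h for h, v in samples if v > 1e-12]
    rightward = [h for h, v in samples if v < -1e-12]
    return sum(leftward) / len(leftward), sum(rightward) / len(rightward)


if __name__ == "__main__":
    print("frequency (Hz):", rpm_to_hz(600))
    left, right = mean_stroke_heights(tilt_deg=40.0, half_angle_deg=30.0)
    print("mean height, leftward stroke:", round(left, 3))
    print("mean height, rightward stroke:", round(right, 3))
```

With a posterior tilt, the tip moves leftward while it is far from the cell surface and rightward while it is close to it. Because fluid near a surface is harder to move, the leftward sweep is the more effective one, which is consistent with a net leftward flow. With zero tilt the two mean heights are equal, and this toy predicts no preferred direction.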

An essentially similar mechanism is adopted by many, if not all, vertebrates. The gastrocoel roof plate in Xenopus [6], and Kupffer’s vesicle (KV) in zebrafish, which are equivalent to the mouse node, also possess motile cilia whose movement generates a unidirectional flow. The architecture of KV in zebrafish differs from that of the mouse node, however. The motile cilia are present at the ventral and dorsal surface inside KV, but they are preferentially localized in the anterior region. In contrast to the cilia in the mouse node, the basal bodies of the cilia are not markedly shifted toward the posterior side. Nonetheless, a unidirectional flow is generated in the anterior–dorsal region of KV [7,8]. In the chick embryo, however, there is no cavity on the ventral surface of Hensen’s node in which a flow of fluid could be established, and motile cilia equivalent to those in the mouse node are not present. It has been suggested [9,10] that the chick embryo adopts a different symmetry-breaking mechanism involving L–R asymmetric cell migration around Hensen’s node. Interestingly, in the urochordate Ciona intestinalis, which also manifests L–R asymmetric expression of Nodal and Pitx2, cilia appear immediately before the onset of asymmetric Nodal expression. These cilia are immotile, however, and thus can only act as sensors [11]. Flow may be generated by motile cilia yet to be discovered in C. intestinalis or by an unknown mechanism not involving motile cilia.

Node cilia are tilted as a result of planar cell polarity

Each cilium protrudes from a basal body, and the location of the basal body within individual pit cells of the mouse node is posteriorly shifted (Figure 2). Time-lapse observations of the node in live mouse embryos have revealed that the position of the basal body changes during development [12]. At the early bud stage, when unidirectional flow is not yet apparent, the basal body is localized in a relatively central region of each pit cell. The position of the basal body gradually shifts toward the posterior side, however, such that at the three-somite stage, when the leftward flow is maximal, the basal body is found on the posterior side of most pit cells. The position of the basal body thus closely correlates with the strength of the leftward flow.

The basal body is positioned by a mechanism known as planar cell polarity (PCP). Deficiency of PCP core proteins such as Dvl [12] and Vangl2 [13] results in failure of the basal body of the pit cells to shift posteriorly. Furthermore, some PCP core proteins are asymmetrically distributed in the node cells: Dvl2 and Dvl3 are localized to the posterior side of these cells [12], whereas Prickle2 [14] and Vangl1 [13] are localized to the anterior side (Figure 2). Posterior positioning of the basal body is thus determined by planar polarization of node cells along the anterior–posterior (A–P) axis. Dynamic rearrangement of the actin cytoskeleton is important for the polarized localization of PCP core proteins, given that polarization of node cells is disrupted in the absence of cofilin1, a regulator of the actin cytoskeleton [15]. Both the posterior tilt of node cilia and leftward flow are also impaired in mutant mice lacking Bicaudal C [16]. This putative RNA-binding protein uncouples Dvl2 signaling from the canonical Wnt signaling pathway [16] and regulates Pkd2 (polycystin 2) expression in the kidney by antagonizing the activity of the microRNA miR-17 [17], but its exact role in positioning of the basal body remains unclear. In ciliated cells of developing Xenopus skin [18], flow refines the orientation of the cilia (and possibly the position of the basal body as well), giving rise to a flow-mediated self-organizing system. Such a mechanism does not seem to operate in the mouse embryo, however, given that the basal body of pit cells in the node is positioned normally in the iv/iv (inversus viscerum) mutant [4], which completely lacks nodal flow [19].

[Figure 1 panel labels. Timeline: E7.5, E8.5, E9.5, E10.5, E11.5. Steps: symmetry breaking; signal transfer from node to LPM; patterning of LPM; organogenesis (stomach, blood vessels, lung). Markers: Node, Nodal, Lefty; L and R indicate left and right.]

Figure 1. Four steps in the generation of left–right (L–R) asymmetry. The black arrow on the left represents a time-course during development, from earlier embryonic stages [embryonic (E) days E7.5–E8.5] to later ones (E10–E11.5). For the first step (symmetry breaking), an embryo exhibiting asymmetric gene expression at the node is shown (top photo). For patterning of the lateral plate mesoderm (LPM) (third step), an embryo exhibiting left-sided expression of Nodal is shown (middle photo). For the last step, three different mechanisms for the generation of morphological asymmetries are shown (bottom illustration): differential branching for the lung, directional looping for the stomach, and one-sided regression for the blood vessels.

The identity of the initial A–P cue (or cues) responsible for such polarization remains unknown. Noncanonical Wnt ligands such as Wnt5a and Wnt5b, which are expressed asymmetrically with respect to the position of the node, are good candidates for the initial A–P cue, as are secreted antagonists of noncanonical Wnt signaling such as Sfrp, which is expressed in the region anterior to the node [20].

Genesis and motility of node cilia

Ciliated cells in the node possess a single primary cilium. Impairment of the formation or motility of the node cilia results in a loss of unidirectional flow and consequent L–R patterning defects. Since the first report of a mouse mutant with impaired ciliogenesis [Kif3b (kinesin-like protein) mutant mouse] [21], numerous genes related to ciliogenesis have been identified in the mouse and other vertebrates, some of which are essential for formation of node cilia. Knowledge obtained from the study of these mutants has contributed to our understanding of how the primary cilia of the pit cells in the node are formed under normal conditions.

Unlike other primary cilia present in various organs of the body, the primary cilia of pit cells in the node are motile (as we will describe below, there are also immotile cilia in the node). Motile cilia in the node rotate in a clockwise direction (when observed from the ventral side of the embryo) and at an average speed of ~600 rpm. In general, motile cilia possess the 9+2 arrangement of microtubules having a ring of nine peripheral doublets of microtubules plus a pair of central microtubules. Motile cilia in the node, however, appear to possess the 9+0 arrangement lacking the central pair of microtubules. The existence of motile cilia without the central pair of microtubules is not necessarily surprising, given that studies with Chlamydomonas [22,23] and human [24] suggest that the central microtubules affect the pattern and speed of flagellar movement but are not essential for motility per se. Although the molecular basis for the driving force of rotational movement is not known, ciliary motility appears to depend on the sliding of dynein arms, as has been suggested for the movement of Chlamydomonas flagella [25]. Dynein arms, which are composed of the motor protein dynein and many associated proteins, are assembled in the cytoplasm and transported to the cilium. The assembly of these structures requires additional proteins, many of which are cytoplasmic [26–30]. Mutation of any of these proteins might thus be expected to result in a loss of ciliary motility and in disorders known as primary ciliary dyskinesia (PCD). Humans with PCD manifest laterality defects as well as infertility and respiratory disorders. Recessive mutations in DNAH5 (which encodes a heavy chain of the outer dynein arm) or in DNAI1 or DNAI2 (which encode intermediate chains of the outer dynein arm) cause PCD associated with L–R defects in some patients [31–33]. Mutations in CCDC39 and CCDC40 (which encode coiled-coil proteins required for assembly of the inner dynein arm) also result in PCD in humans [34,35].

The molecular phenotype (that is, the pattern of Nodal expression in the LPM) associated with the loss of motility of node cilia is illustrated by the well-studied iv/iv mouse mutant, which harbors a mutation in the Dnah11 (Lrd) gene for an axonemal dynein protein that renders node cilia immotile [36]. The loss of ciliary motility in this mutant results in a loss of leftward fluid flow in the node [19] and in consequent randomization of Nodal expression in the LPM, with the expression pattern being left-sided, right-sided, or bilateral [37,38]. Although it remains unknown how nodal flow is sensed (as discussed in the next section), the L–R decision becomes randomized in the absence of the flow.

Immotile cilia at the edge of the node sense the fluid flow

There are two types of ciliated cells at the node [39,40] (Figure 3). Cells located at the central region of the node (pit cells) possess motile cilia, which generate the fluid flow. By contrast, most cells located at the edge of the node (crown cells) possess immotile cilia [41].

Figure 2. Polarization of node cells. Polarized localization within node cells is shown for the basal body of the cilium and for PCP core proteins. A, anterior; D, dorsal; P, posterior; V, ventral. Note that the basal body is positioned at the posterior side of node cells, which results in tilting of the motile cilium toward the posterior side. Prickle2 and Vangl1 proteins are localized to the anterior side of the cells, whereas the Dvl protein is on the posterior side. Putative anterior–posterior information responsible for polarization of these proteins is indicated by the yellow gradient on the top.


Recent evidence indicates that these immotile cilia of the crown cells act as sensors of the fluid flow [41]. Kif3a−/− mouse embryos, which are deficient in the kinesin motor protein KIF3A, lack all cilia including those at the node, fail to develop nodal flow, and manifest L–R defects [42]. They are also unable to respond to flow. However, restoration of Kif3a expression specifically in crown cells resulted in the formation of cilia in these cells. More importantly, the modified embryos were also able to respond to flow, suggesting that cilia of the crown cells, most of which are immotile, sense the fluid flow [41].

Can motile cilia of pit cells also sense nodal flow? The flagella of Chlamydomonas not only move but also sense mechanical force [43]. In medaka, all of the cilia in KV are motile [44], suggesting that motile cilia may also sense fluid flow. Motile cilia of the mouse node are unlikely to sense nodal flow, however, based on studies in Pkd2 mutant embryos, which possess motile cilia and develop nodal flow but are unable to sense the flow [45]. Restoration of the expression of Pkd2 (a Ca2+ channel required for flow sensation) specifically in pit cells of Pkd2−/− mouse embryos did not prevent the development of L–R defects [41], supporting the idea that in the mouse, motile cilia of pit cells are not involved in detecting nodal flow.

Mechanosensing or chemosensing?

There are two prevailing models explaining how the embryo senses nodal flow. The embryo may sense the mechanical force of the flow (two-cilia model or mechanosensor model) or, alternatively, the flow may transport a determinant molecule toward the left side of the embryo (chemosensor model). Circumstantial evidence, including the recent observation that as few as two rotating cilia are sufficient for the breaking of L–R symmetry [46], favors the former model. Because the flow generated by two rotating cilia is highly local and the flow velocity would be attenuated over a distance from the rotating cilia by a factor of 2–3, it would take a long time for a molecule or a particle to travel from one side of the node to the other side. By contrast, even a weak, local mechanical force can be transmitted instantly from the rotating cilia to the edge of the node [47] because the node cavity is a semi-closed space. However, it is still not clear what exactly the cilia sense during the symmetry-breaking process.

Ca2+ signaling in flow sensing

The requirement for a Ca2+ channel composed of Pkd2 [45] and Pkd1l1 [44,48] in L–R patterning as well as the direct detection of L–R asymmetric Ca2+ signaling at the node [39] have suggested that Ca2+ signaling plays a role in the sensing of nodal flow. Given that the L–R defects of Pkd2−/− mutant mice can be rescued by crown cell-specific expression of a Pkd2 transgene, Pkd2-mediated Ca2+ signaling in crown cells appears to be sufficient for flow sensing. Indeed, several blockers of Ca2+ signaling have been shown to disrupt asymmetric gene expression in crown cells [41]. In particular, the observed effects of GdCl3 [an inhibitor of stretch-sensitive transient receptor potential (TRP) channels], 2-APB [an inhibitor of the inositol 1,4,5-trisphosphate (IP3) receptor], and thapsigargin (an inhibitor of Ca2+-dependent ATPase activity in the endoplasmic reticulum) implicate Ca2+ signaling by a TRP-type channel such as Pkd2 as well as that by the IP3 receptor in the sensing of nodal flow. Pkd2, together with Pkd1l1, likely functions in the ciliary compartment of crown cells, given that a mutation in Pkd2 that disrupts the ciliary localization of the encoded protein results in L–R defects similar to those of Pkd2−/− embryos [41,48]. Whereas Pkd2 encodes a Ca2+ channel with a short extracellular domain located between two transmembrane domains, Pkd1l1 possesses a much larger extracellular domain at its amino terminus, suggesting that Pkd1l1 is responsible for sensing the flow signal, be it mechanical or chemical.

However, Ca2+ signaling with obvious L–R asymmetry has not been detected in crown cells. Examination of transgenic mice that express the Ca2+ indicator GCaMP2 (a calmodulin–GFP fusion) specifically in crown cells revealed that Ca2+ signaling was operative in these cells, but that it was present bilaterally and was retained in Pkd2−/− embryos [41]. More recently [49], oscillations of Ca2+ signaling were detected in the node (oscillatory Ca2+ signals are found in both pit cells and crown cells, but those in crown cells may be functionally more relevant). The frequency of the oscillations was higher on the left side than on the right side, and it was reduced on both sides in Pkd2−/− embryos. Dynamic oscillatory (rather than static) Ca2+ signaling may thus be responsible for symmetry breaking.

Figure 3. Model for the sensing of nodal flow by immotile cilia. Two types of ciliated cell are present in the node: those located centrally (green) have motile cilia that generate nodal flow, whereas those located peripherally (pink) possess immotile cilia that sense the flow. Sensing of the flow requires ciliary localization of a Pkd2–Pkd1l1 complex with Ca2+ channel activity. The flow-mediated signal results in degradation of Cerl2 mRNA in crown cells on the left side. In this model, an immotile cilium located on the left side of the node is bent in response to the flow. However, such bending has not been verified by in vivo observations.

L–R asymmetry of Ca2+ signaling was first described at the node [39], with similar observations being subsequently reported by others. However, it is likely that this asymmetric Ca2+ signaling occurs in endoderm cells near the node, rather than in crown cells. As described below, Ca2+ signaling in endoderm may have a different role: signal transfer from the node to LPM.

Readout of the flow-mediated signal

What happens to crown cells after they have received the flow-mediated signal? Expression of Cerl2 appears to be the most immediate readout of the flow signal because it exhibits the earliest L–R asymmetry at the node [46,50] (Figure 3). Cerl2 encodes a Nodal antagonist, although its precise action is not clear. It is asymmetrically expressed in crown cells (i.e., the expression on the right side is higher than that on the left side), and its absence results in randomization of L–R decision making [51]. Whereas expression of Nodal is bilateral in crown cells, the R > L expression of Cerl2 renders Nodal activity in crown cells higher on the left side (Figure 4). The Cerl2-generated asymmetry (R < L) of Nodal activity at the node closely correlates with the asymmetric pattern of Nodal expression in the LPM [52]. Expression of Cerl2 is initially symmetric (R = L) at the early headfold stage, but it becomes R > L as the velocity of nodal flow increases, with expression on the left side being downregulated [46,52]. Recent genetic evidence suggests that Cerl2 is the major target of the flow signal. Pkd2−/−Cerl2−/− double-mutant embryos thus manifest randomized Nodal expression in LPM, resembling the Cerl2 single mutant (but unlike the Pkd2 single mutant, which loses Nodal expression in LPM) [41].

L–R asymmetry of Cerl2 expression is determined not at the transcriptional level but rather at the post-transcriptional level [53], specifically by the decay of Cerl2 mRNA in a manner dependent on its 3′ untranslated region. Preferential decay of Cerl2 mRNA on the left is initiated by the leftward fluid flow and further enhanced by the operation of Wnt–Cerl2 interlinked feedback loops, in which Wnt3 upregulates Wnt3 expression and promotes Cerl2 mRNA decay whereas Cerl2 promotes Wnt3 degradation. Mathematical modeling and experimental data [53] suggest that these feedback loops behave as a bistable switch that is able to amplify in a noise-resistant manner a small L–R bias conferred by the leftward fluid flow, which is weak at the stage when asymmetry of Cerl2 mRNA is established at the node [46].
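The amplifying behavior of such a bistable switch can be illustrated with a deliberately reduced toy model. The sketch below is not the model of ref. [53]: it assumes (hypothetically) constant synthesis of Wnt3 and Cerl2 mRNA, cooperative (squared) mutual promotion of decay, and a small extra decay term, `flow_decay`, standing in for the effect of the leftward flow. The Wnt3 self-activation loop is omitted, and all parameter values and names are arbitrary choices made for illustration. Both sides start from the same, nearly symmetric state.

```python
def simulate(flow_decay, wnt3=1.3, cerl2=1.45, synthesis=4.0, dt=0.01, n_steps=5000):
    """Forward-Euler integration of a toy Wnt3/Cerl2 mutual-antagonism model.

    Hypothetical equations (arbitrary units, not taken from ref. [53]):
        d(wnt3)/dt  = synthesis - (1 + cerl2**2) * wnt3
        d(cerl2)/dt = synthesis - (1 + wnt3**2 + flow_decay) * cerl2
    """
    for _ in range(n_steps):
        # Cerl2 accelerates Wnt3 loss; Wnt3 (plus flow) accelerates Cerl2 mRNA loss.
        d_wnt3 = synthesis - (1.0 + cerl2 ** 2) * wnt3
        d_cerl2 = synthesis - (1.0 + wnt3 ** 2 + flow_decay) * cerl2
        wnt3 += dt * d_wnt3
        cerl2 += dt * d_cerl2
    return wnt3, cerl2


# Same starting state on both sides; only the left side feels the flow.
left_wnt3, left_cerl2 = simulate(flow_decay=0.2)
right_wnt3, right_cerl2 = simulate(flow_decay=0.0)

print(f"left:  Wnt3 = {left_wnt3:.2f}, Cerl2 mRNA = {left_cerl2:.2f}")
print(f"right: Wnt3 = {right_wnt3:.2f}, Cerl2 mRNA = {right_cerl2:.2f}")
```

Under these assumptions, the side without flow settles into a high-Cerl2, low-Wnt3 state, whereas a modest increase in Cerl2 mRNA decay on the other side is enough to flip it into the low-Cerl2, high-Wnt3 state, so a small bias is converted into a large and stable R > L difference.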

Although Cerl2 mRNA exhibits R > L asymmetry at the node, Cerl2 protein displays a dynamic behavior [54]. Cerl2 protein initially localizes to crown cells on the right side, but it is later translocated to the left side. This right-to-left translocation is dependent on the flow, suggesting that Cerl2 protein may be transported from the right side to the left side of the node.

Figure 4. Generation of molecular asymmetries at the node. (A) Whereas Nodal mRNA and Gdf1 mRNA are present at similar levels on both sides of the mouse embryo, Cerl2 mRNA shows a right (R) > left (L) distribution. (B) Nodal (yellow) and GDF1 (purple) proteins form a heterodimer that constitutes an active form of Nodal. Given that Cerl2 (blue) is an inhibitor of Nodal, the effective level of active Nodal, which is reflected by the level of phosphorylated Smad2/3 (pSmad2), shows a R << L pattern [the rightmost panel of (A)]. (C) The Nodal–GDF1 heterodimer produced by crown cells may be transported to the left lateral plate mesoderm (LPM) via an extra-embryonic (black dotted arrow) or intra-embryonic route (red dotted arrow). According to the former route, the Nodal–GDF1 heterodimer would be secreted into the node cavity, transported to the left side by the leftward flow, absorbed by the endoderm, and transported to the LPM on the left side. Alternatively, the Nodal–GDF1 heterodimer may be secreted within the embryo and transported along extracellular matrix (ECM) to the left LPM (intra-embryonic route). In either route, Nodal–GDF1 that has reached the LPM will activate expression of Nodal, which is responsive to Nodal signaling. Note that the level of Nodal–GDF1 heterodimer is much lower on the right side than on the left side. Nodal signaling in crown cells is not essential for L–R asymmetry because inhibition of Nodal signaling in crown cells does not interfere with normal L–R patterning [52].

Signal transfer from the node to the LPM

Identity of the signal transferred from the node to the LPM: is it Nodal protein?

The asymmetric signal (or signals) generated in or near the node, whether it be a mechanical stress or a molecular determinant, must be transferred to the LPM, where it induces the asymmetric expression of Nodal. Several lines of evidence suggest that an active form of Nodal protein, perhaps the Nodal–GDF1 heterodimer, is the signal that is transferred from the node to the lateral plate (Figure 4).

Among several signaling molecules expressed at the node, Nodal–GDF1 (growth/differentiation factor 1) is the best candidate for the determinant that is transferred from the node to the left LPM. First, Nodal is absolutely essential for asymmetric gene expression in the lateral plate. Specific ablation of Nodal expression in the perinodal crown cells thus prevents Nodal expression in the left LPM [55,56]. Second, asymmetric Nodal expression in the LPM is conferred by two enhancers, both of which are Nodal-responsive and dependent on the transcription factor FoxH1 [57–61]. Third, introduction of a Nodal expression vector into the right LPM can induce ectopic expression of endogenous Nodal on the right side [62]. Fourth, Gdf1 is coexpressed with Nodal bilaterally in the perinodal crown cells, and Gdf1 mutant mice do not manifest asymmetric Nodal expression in LPM [63]. GDF1 alone does not activate signaling under physiological conditions, but it interacts with Nodal and thereby increases Nodal activity [64].

The route by which the Nodal signal is transferred

If Nodal protein is transferred from the node to the LPM, how is it transferred? Nodal is expressed in only two regions, the node (crown cells) and left LPM, at this stage of mouse embryogenesis, with its expression in the node turning on slightly earlier than that in left LPM. It is easy to envisage that Nodal secreted from crown cells into the node cavity would be transported by the leftward flow to the left side of the embryo (an extra-embryonic route; Figure 4). Nodal protein that reaches the left side could then be incorporated into the embryo via the endoderm and be transported to the left LPM, where it would induce endogenous Nodal expression. This is unlikely to be the case, however, because culture of mouse embryos in the presence of recombinant Nodal does not induce Nodal expression in the right LPM, whereas injection of Nodal into the right side of the embryo (near the paraxial mesoderm and LPM) does induce endogenous Nodal expression on the right side [52,65]. Instead, several lines of evidence suggest that Nodal–GDF1 secreted into the embryo is transported from the node to the LPM via an intra-embryonic route (Figure 4). First, Cryptic, an essential component of the Nodal signaling pathway, is required only in the LPM for correct L–R patterning: it is not required in the region between the node and the LPM [65]. Second, Nodal interacts with sulfated glycosaminoglycans, which are specifically localized to the basement membrane between the node and the lateral plate. Moreover, inhibition of sulfated glycosaminoglycan synthesis in Xenopus [66] and mouse [65] embryos prevented Nodal expression in the LPM.

These results collectively suggest that Nodal protein produced at the node and secreted inside the embryo is transported from the node to the lateral plate via the intra-embryonic route (Figure 4), where it activates Nodal expression. Secreted Nodal–GDF1 protein has not been directly detected in the mouse embryo, but secreted Nodal protein undergoing transport has been detected in the frog embryo [66].

The role of the endoderm in node–LPM signal transfer

Two recent studies of Sox17 mutant mice [67,68] suggest that the endoderm plays a role in signal transfer from the node to the LPM. These studies found that embryos deficient in Sox17 (SRY-box containing gene 17), a transcription factor required for definitive endoderm formation, manifest L–R patterning defects. Whereas asymmetric gene expression including that of Cerl2 was found to be maintained in the node (specifically in the perinodal crown cells), Nodal expression in the LPM was lost. The LPM retained the ability to respond to Nodal signaling, however. Sox17 is expressed in endoderm near the node, but it is not expressed in the node itself, including the perinodal crown cells. These observations strongly suggest that signal transfer from the node to the LPM depends on Sox17 in the endoderm.

The endoderm of the Sox17 mutant mice is defective in multiple respects [67,68]. First, the gap-junction component Connexin43 is absent or mislocalized, and gap-junctional transport (examined by injection of the diffusible dye Lucifer yellow) is impaired in the mutant endoderm. Second, epithelial polarity and cellular adhesion are disorganized in the mutant endoderm. Third, the distribution of extracellular matrix (ECM) components such as chondroitin sulfate and laminin is partially disorganized in the mutant embryos.

It is not clear how these defects in the endoderm are linked to the L–R defects of Sox17 mutant mice, but several mechanisms are possible. For instance, gap junctional communication in the endoderm is essential for transfer of signal to the LPM in Xenopus [69]. Gap junction-mediated propagation of signaling (e.g., Ca2+ signaling) may influence the underlying ECM and facilitate ECM-mediated transport of Nodal protein from the node to the LPM (Figure 4), as proposed previously [70]. This scenario is consistent with most, if not all, of the observations described above.

Concluding remarks

L–R symmetry breaking takes place in a transient structure termed the ‘node’ of the mouse embryo, or in an equivalent structure of other vertebrates. In most vertebrates including fish, frog and mouse, cilia play an essential role in symmetry breaking. They generate unidirectional fluid flow by rotating, and they sense the flow via cilia-localized Ca2+ channels. Sensing the flow at the node will eventually lead to L–R asymmetric gene expression and situs-specific formation of visceral organs. By contrast, other animals seem to make use of a completely different mechanism for symmetry breaking. In Drosophila and snail, for example, L–R symmetry breaking does not appear to depend on cilia [71]. It will thus be of interest to determine the precise mechanisms of symmetry breaking in diverse organisms. See Box 1 for other open questions in this field.

Because cilia and fluid flow are involved in a wide range of physiological processes, studying the role of cilia and flow in the mouse node will not only deepen our understanding of symmetry-breaking mechanisms but also provide valuable insights into the general function of cilia in human physiology. Further advances will require development of various new approaches including genetic, cellular, biophysical, and mathematical methods.

Acknowledgments

We thank the anonymous reviewers for helpful comments and colleagues in this field of research for discussion. The work performed in the laboratory of the authors was supported by a grant from CREST (Core Research for Evolutional Science and Technology) of the Japan Science and Technology Corporation and by a Grant-in-Aid from the Ministry of Education, Culture, Sports, Science, and Technology of Japan. S.Y. was supported by a fellowship from the Japan Society for the Promotion of Science for Japanese Junior Scientists and a Grant-in-Aid for Scientific Research on Innovative Areas.

References

1 Tabin, C. (2005) Do we know anything about how left–right asymmetry is first established in the vertebrate embryo? J. Mol. Histol. 36, 317–323

2 Nakamura, T. and Hamada, H. (2012) Left–right patterning: conserved and divergent mechanisms. Development 139, 3257–3262

3 Okada, Y. et al. (2005) Mechanism of nodal flow: a conserved symmetry breaking event in left–right axis determination. Cell 121, 633–644

4 Nonaka, S. et al. (2005) De novo formation of left–right asymmetry by posterior tilt of nodal cilia. PLoS Biol. 3, e268

5 Takamatsu, A. et al. (2013) Asymmetric rotational stroke in mouse node cilia during left–right determination. Phys. Rev. E: Stat. Nonlin. Soft Matter Phys. 87, 050701

6 Blum, M. et al. (2009) Xenopus, an ideal model system to study vertebrate left–right asymmetry. Dev. Dyn. 238, 1215–1225

7 Kreiling, J.A. et al. (2007) Analysis of Kupffer’s vesicle in zebrafish embryos using a cave automated virtual environment. Dev. Dyn. 236, 1963–1969

8 Sullivan-Brown, J. et al. (2008) Zebrafish mutations affecting cilia motility share similar cystic phenotypes and suggest a mechanism of cyst formation that differs from pkd2 morphants. Dev. Biol. 314, 261–275

9 Cui, C. et al. (2009) Rotation of organizer tissue contributes to left–right asymmetry. Anat. Rec. (Hoboken) 292, 557–561

10 Gros, J. et al. (2009) Cell movements at Hensen’s node establish left/right asymmetric gene expression in the chick. Science 324, 941–944

11 Thompson, H. et al. (2012) The formation and positioning of cilia in Ciona intestinalis embryos in relation to the generation and evolution of chordate left–right asymmetry. Dev. Biol. 364, 214–223

12 Hashimoto, M. et al. (2010) Planar polarization of node cells determines the rotational axis of node cilia. Nat. Cell Biol. 12, 170–176

13 Song, H. et al. (2010) Planar cell polarity breaks bilateral symmetry by controlling ciliary positioning. Nature 466, 378–382

14 Antic, D. et al. (2010) Planar cell polarity enables posterior localization of nodal cilia and left–right axis determination during mouse and Xenopus embryogenesis. PLoS ONE 5, e8999

15 Mahaffey, J.P. et al. (2013) Cofilin and Vangl2 cooperate in the initiation of planar cell polarity in the mouse embryo. Development 140, 1262–1271

16 Maisonneuve, C. et al. (2009) Bicaudal C, a novel regulator of Dvl signaling abutting RNA-processing bodies, controls cilia orientation and leftward flow. Development 136, 3019–3030

17 Tran, U. et al. (2010) The RNA-binding protein bicaudal C regulates polycystin 2 in the kidney by antagonizing miR-17 activity. Development 137, 1107–1116

18 Mitchell, B. et al. (2007) A positive feedback mechanism governs the polarity and motion of motile cilia. Nature 447, 97–101

19 Okada, Y. et al. (1999) Abnormal nodal flow precedes situs inversus in iv and inv mice. Mol. Cell 4, 459–468

20 Matsuyama, M. et al. (2009) Sfrp controls apicobasal polarity and oriented cell division in developing gut epithelium. PLoS Genet. 5, e1000427

21 Nonaka, S. et al. (1998) Randomization of left–right asymmetry due to loss of nodal cilia generating leftward flow of extraembryonic fluid in mice lacking KIF3B motor protein. Cell 95, 829–837

22 Mitchell, D.R. and Sale, W.S. (1999) Characterization of a Chlamydomonas insertional mutant that disrupts flagellar central pair microtubule-associated structures. J. Cell Biol. 144, 293–304

23 Yagi, T. and Kamiya, R. (2000) Vigorous beating of Chlamydomonas axonemes lacking central pair/radial spoke structures in the presence of salts and organic compounds. Cell Motil. Cytoskeleton 46, 190–199

24 Olbrich, H. et al. (2012) Recessive HYDIN mutations cause primary ciliary dyskinesia without randomization of left–right body asymmetry. Am. J. Hum. Genet. 91, 672–684

25 Gibbons, I.R. (1981) Cilia and flagella of eukaryotes. J. Cell Biol. 91, 107s–124s

26 Omran, H. et al. (2008) Ktu/PF13 is required for cytoplasmic pre-assembly of axonemal dyneins. Nature 456, 611–616

27 Mitchison, H.M. et al. (2012) Mutations in axonemal dynein assembly factor DNAAF3 cause primary ciliary dyskinesia. Nat. Genet. 44, 381–389, S381–382

28 Duquesnoy, P. et al. (2009) Loss-of-function mutations in the human ortholog of Chlamydomonas reinhardtii ODA7 disrupt dynein arm assembly and cause primary ciliary dyskinesia. Am. J. Hum. Genet. 85, 890–896

29 Yamamoto, R. et al. (2010) Discrete PIH proteins function in the cytoplasmic preassembly of different subsets of axonemal dyneins. J. Cell Biol. 190, 65–71

30 Kobayashi, D. and Takeda, H. (2012) Ciliary motility: the components and cytoplasmic preassembly mechanisms of the axonemal dyneins. Differentiation 83, S23–S29

31 Olbrich, H. et al. (2002) Mutations in DNAH5 cause primary ciliary dyskinesia and randomization of left–right asymmetry. Nat. Genet. 30, 143–144

32 Loges, N.T. et al. (2008) DNAI2 mutations cause primary ciliary dyskinesia with defects in the outer dynein arm. Am. J. Hum. Genet. 83, 547–558

33 Pennarun, G. et al. (1999) Loss-of-function mutations in a human gene related to Chlamydomonas reinhardtii dynein IC78 result in primary ciliary dyskinesia. Am. J. Hum. Genet. 65, 1508–1519

34 Merveille, A.C. et al. (2011) CCDC39 is required for assembly of inner dynein arms and the dynein regulatory complex and for normal ciliary motility in humans and dogs. Nat. Genet. 43, 72–78

35 Becker-Heck, A. et al. (2011) The coiled-coil domain containing protein CCDC40 is essential for motile cilia function and left–right axis formation. Nat. Genet. 43, 79–84

Box 1. Outstanding questions

Substantial progress has been achieved in our understanding of L–R symmetry breaking, but many important questions remain unanswered. For example:

• How is A–P information translated into the posterior tilt of node cilia?

• How is the rotational direction of node cilia determined?

• How is the fluid flow sensed by immotile cilia, and what is the precise role of Ca2+ signaling?

• How is Cerl2 mRNA degraded in response to the flow?

• To what extent is the mechanism for breaking of L–R symmetry conserved among species?


36 Supp, D.M. et al. (1997) Mutation of an axonemal dynein affects left–right asymmetry in inversus viscerum mice. Nature 389, 963–966

37 Collignon, J. et al. (1996) Relationship between asymmetric nodal expression and the direction of embryonic turning. Nature 381, 155–158

38 Lowe, L.A. et al. (1996) Conserved left–right asymmetry of nodal expression and alterations in murine situs inversus. Nature 381, 158–161

39 McGrath, J. et al. (2003) Two populations of node monocilia initiate left–right asymmetry in the mouse. Cell 114, 61–73

40 Tabin, C.J. and Vogan, K.J. (2003) A two-cilia model for vertebrate left–right axis specification. Genes Dev. 17, 1–6

41 Yoshiba, S. et al. (2012) Cilia at the node of mouse embryos sense fluid flow for left–right determination via Pkd2. Science 338, 226–231

42 Takeda, S. et al. (1999) Left–right asymmetry and kinesin superfamily protein KIF3A: new insights in determination of laterality and mesoderm induction by kif3A−/− mice analysis. J. Cell Biol. 145, 825–836

43 Fujiu, K. et al. (2011) Mechanoreception in motile flagella of Chlamydomonas. Nat. Cell Biol. 13, 630–632

44 Kamura, K. et al. (2011) Pkd1l1 complexes with Pkd2 on motile cilia and functions to establish the left–right axis. Development 138, 1121–1129

45 Pennekamp, P. et al. (2002) The ion channel polycystin-2 is required for left–right axis determination in mice. Curr. Biol. 12, 938–943

46 Shinohara, K. et al. (2012) Two rotating cilia in the node cavity are sufficient to break left–right symmetry in the mouse embryo. Nat. Commun. 3, 622

47 Happel, J. and Brenner, H. (1983) Low Reynolds Number Hydrodynamics, Martinus Nijhoff

48 Field, S. et al. (2011) Pkd1l1 establishes left–right asymmetry and physically interacts with Pkd2. Development 138, 1131–1142

49 Takao, D. et al. (2013) Asymmetric distribution of dynamic calcium signals in the node of mouse embryo during left–right axis formation. Dev. Biol. 376, 23–30

50 Schweickert, A. et al. (2010) The nodal inhibitor coco is a critical target of leftward flow in Xenopus. Curr. Biol. 20, 738–743

51 Marques, S. et al. (2004) The activity of the Nodal antagonist Cerl-2 in the mouse node is required for correct L/R body axis. Genes Dev. 18, 2342–2347

52 Kawasumi, A. et al. (2011) Left–right asymmetry in the level of active Nodal protein produced in the node is translated into left–right asymmetry in the lateral plate of mouse embryos. Dev. Biol. 353, 321–330

53 Nakamura, T. et al. (2012) Fluid flow and interlinked feedback loops establish left–right asymmetric decay of Cerl2 mRNA. Nat. Commun. 3, 1322

54 Inacio, J.M. et al. (2013) The dynamic right-to-left translocation of Cerl2 is involved in the regulation and termination of Nodal activity in the mouse node. PLoS ONE 8, e60406

55 Brennan, J. et al. (2002) Nodal activity in the node governs left–right asymmetry. Genes Dev. 16, 2339–2344

56 Saijoh, Y. et al. (2003) Left–right patterning of the mouse lateral plate requires nodal produced in the node. Dev. Biol. 256, 160–172

57 Norris, D.P. and Robertson, E.J. (1999) Asymmetric and node-specific nodal expression patterns are controlled by two distinct cis-acting regulatory elements. Genes Dev. 13, 1575–1588

58 Saijoh, Y. et al. (2000) Left–right asymmetric expression of lefty2 and nodal is induced by a signaling pathway that includes the transcription factor FAST2. Mol. Cell 5, 35–47

59 Adachi, H. et al. (1999) Determination of left/right asymmetric expression of nodal by a left side-specific enhancer with sequence similarity to a lefty-2 enhancer. Genes Dev. 13, 1589–1600

60 Saijoh, Y. et al. (2005) Two nodal-responsive enhancers control left–right asymmetric expression of Nodal. Dev. Dyn. 232, 1031–1036

61 Vincent, S.D. et al. (2004) Asymmetric Nodal expression in the mouse is governed by the combinatorial activities of two distinct regulatory elements. Mech. Dev. 121, 1403–1415

62 Yamamoto, M. et al. (2003) Nodal signaling induces the midline barrier by activating Nodal expression in the lateral plate. Development 130, 1795–1804

63 Rankin, C.T. et al. (2000) Regulation of left–right patterning in mice by growth/differentiation factor-1. Nat. Genet. 24, 262–265

64 Tanaka, C. et al. (2007) Long-range action of Nodal requires interaction with GDF1. Genes Dev. 21, 3272–3282

65 Oki, S. et al. (2007) Sulfated glycosaminoglycans are necessary for Nodal signal transmission from the node to the left lateral plate in the mouse embryo. Development 134, 3893–3904

66 Marjoram, L. and Wright, C. (2011) Rapid differential transport of Nodal and Lefty on sulfated proteoglycan-rich extracellular matrix regulates left–right asymmetry in Xenopus. Development 138, 475–485

67 Saund, R.S. et al. (2012) Gut endoderm is involved in the transfer of left–right asymmetry from the node to the lateral plate mesoderm in the mouse embryo. Development 139, 2426–2435

68 Viotti, M. et al. (2012) Role of the gut endoderm in relaying left–right patterning in mice. PLoS Biol. 10, e1001276

69 Beyer, T. et al. (2012) Connexin26-mediated transfer of laterality cues in Xenopus. Biol. Open 1, 473–481

70 Norris, D.P. (2012) Cilia, calcium and the basis of left–right asymmetry. BMC Biol. 10, 102

71 Okumura, T. et al. (2008) The development and evolution of left–right asymmetry in invertebrates: lessons from Drosophila and snails. Dev. Dyn. 237, 3497–3515

72 Blum, M. et al. (2007) Ciliation and gene expression distinguish between node and posterior notochord in the mammalian embryo. Differentiation 75, 133–146

73 Zariwala, M.A. et al. (2007) Genetic defects in ciliary structure and function. Annu. Rev. Physiol. 69, 423–450

74 Fliegauf, M. et al. (2007) When cilia go bad: cilia defects and ciliopathies. Nat. Rev. Mol. Cell Biol. 8, 880–893


Hemophilia B Leyden and once mysterious cis-regulatory mutations

Alister P.W. Funnell and Merlin Crossley

School of Biotechnology and Biomolecular Sciences, University of New South Wales, NSW 2052, Australia

Hemophilia B is a classic, monogenic blood clotting disease caused by mutations in the coagulation factor IX (F9) locus. Although interpreting mutations within the gene itself has been relatively straightforward, ascribing molecular mechanisms to the complete suite of mutations within the promoter region has proven somewhat difficult and has only recently been achieved. These mutations, which are clustered at discrete transcription factor binding sites, dynamically alter the developmental expression of F9 in different ways. They illustrate how single-nucleotide mutations in cis-regulatory regions can have drastic ramifications for the control of gene expression and in some instances be causative of disease. Here we present the human F9 promoter as a model example for which saturation mutation mapping has revealed the mechanisms of its regulation. Moreover, we suggest that the growing number of genome-wide studies of transcription factor activity will accelerate both the discovery and understanding of regulatory polymorphisms and mutations.

Single-nucleotide polymorphisms (SNPs) in regulatory regions are a driving force in evolution

Technological advances that facilitate data collection can have profound effects. Genome sequencing first revealed that, contrary to some expectations, the human genome does not contain vastly more genes than other organisms [1,2]. More recently, as large-scale sequencing technologies have progressed, genome-wide association studies (GWAS) have revealed that a large proportion of potentially functional SNPs do not reside in coding regions, but lie upstream or downstream of genes in what may be regulatory regions. It is now generally thought that phenotypic differences, not only between species but also between individuals, are largely due to differential gene expression profiles that arise from cis-regulatory mutations [3–5]. Indeed, in humans, gene expression and transcription factor occupancy differ markedly between individuals, owing in part to SNPs and genetic variations in regulatory regions [6,7]. Understanding how regulatory SNPs and mutations operate and affect the expression of their target genes has thus become a major priority in biology.

Somewhat surprisingly, however, although many SNPs have been identified in putative regulatory regions, it has often proved difficult to determine their mechanism of action or even to be sure that they are functionally important changes, rather than merely markers that are linked to yet-to-be-discovered functional mutations [8,9]. Many researchers may be disappointed by the apparent slow rate of progress. In this review we cover 20 years of analysis of single-nucleotide variants in the coagulation factor IX (F9) proximal promoter to provide a striking example of how difficult it has been to define the mechanism by which such mutations operate. Most importantly, we suggest that current advances in our understanding of transcription factors and their recognition sites, particularly derived from chromatin immunoprecipitation sequencing (ChIP-Seq) studies, may rapidly accelerate progress.

Hemophilia B Leyden: an unusual genetic disease that resolves after puberty

Hemophilia B is an X-linked, inherited bleeding disorder that results from mutations in the F9 gene and was first recognized as a condition distinct from hemophilia A (another X-linked disease) in the 1950s [10]. In 1970, a particularly unusual form of hemophilia B was described in the Netherlands [11]. This disease subtype, termed hemophilia B Leyden, was remarkable in that affected males exhibited symptoms in childhood but gradually improved, and often recovered clinically, after puberty [12]. Throughout the 1980s and 1990s independent families from many countries were also identified with hemophilia B that resolved after puberty [13–26]. Today more than 100 cases of hemophilia B Leyden have been reported [27,28].

Sequencing of the F9 coding region, splice sites, and upstream regulatory regions from families with Leyden-like symptoms identified point mutations in the proximal promoter [15,16,18,20,21,23,25,26,29,30]. Over 20 different point mutations have been identified, clustering into three regions: one around 20 bp upstream of the major transcriptional start site (TSS, +1); one at –5; and the third immediately downstream of the TSS, centered around +10 (Figure 1). These sites were immediately recognized as potential regulatory elements. It was hypothesized that each cluster of mutations disrupted the binding site for a transcriptional activator protein and thereby impaired the expression of the F9 gene, at least until puberty.

Review

0168-9525/$ – see front matter

© 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tig.2013.09.007

Corresponding author: Crossley, M. ([email protected]).

Keywords: hemophilia; regulatory mutations; promoter mutations; SNPs; gene regulation.


These mutations were of particular interest in that they demonstrated how individual point mutations could be associated with profound alterations in the developmental timing of gene expression. Each of these single-nucleotide mutations is sufficient to switch a gene from being expressed throughout life to only after puberty (Figure 2).

Mapping the F9 promoter in vivo via naturally occurring, single-nucleotide mutations

The hunt for the transcription factors that bind to the three functional cis-regulatory elements in the F9 promoter began in the late 1980s at the time when the first mammalian transcription factors were being identified. F9 is expressed primarily in hepatocytes, and because the liver is large and relatively homogeneous it was a suitable model organ for early studies in transcription factor purification. Several groups were successful in identifying, purifying, and ultimately cloning the cDNAs for liver-specific DNA-binding proteins [31,32]. It was possible to define approximate, albeit imperfect, consensus sequences for these transcription factors and, accordingly, some DNA-binding proteins were identified as candidates that might regulate the F9 gene. The first candidate assayed was the leucine zipper protein CCAAT/enhancer-binding protein α (C/EBPα), which was known to bind the site TTGNNCAA, a sequence that is similar to the +10 region of the F9 promoter (Figure 1).
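The comparison of a consensus such as TTGNNCAA with a promoter sequence can be illustrated with a short scan. The following Python sketch is not the method used in the studies cited; it is a minimal illustration that assumes the IUPAC convention in which N matches any base, and it uses the promoter sequence as it appears in Figure 1. The function name `scan_consensus` and the `max_mismatches` parameter are hypothetical. Because TTGNNCAA is its own reverse complement, scanning one strand is sufficient for this particular motif.

```python
# IUPAC nucleotide codes mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Promoter sequence as printed in Figure 1 of this review.
F9_FIGURE1_SEQ = "AGCTCAGCTTGTACTTTGGTACAACTAATCGACCTTACCACTTTCACAATC"


def count_mismatches(window, consensus):
    """Count positions where the base is not allowed by the consensus code."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))


def scan_consensus(seq, consensus, max_mismatches=0):
    """Return (start, window, mismatches) for each forward-strand window
    that differs from the consensus at no more than max_mismatches positions.
    start is a 0-based index into seq (hypothetical helper, for illustration)."""
    seq = seq.upper()
    width = len(consensus)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        n = count_mismatches(window, consensus)
        if n <= max_mismatches:
            hits.append((start, window, n))
    return hits


if __name__ == "__main__":
    # Allow one mismatch, since the text describes the site as "similar".
    for start, window, n in scan_consensus(F9_FIGURE1_SEQ, "TTGNNCAA", 1):
        print(start, window, n)
```

Allowing one mismatch reflects the description of the +10 region as similar, rather than identical, to the consensus. A sequence match of this kind only nominates a candidate; binding still has to be tested experimentally.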

It was found that C/EBPα bound to the +10 region and that mutations associated with reduced F9 gene expression before puberty disrupted binding [17]. Later studies on C/EBPα knockout mice, when they became available, confirmed that C/EBPα was functionally required for normal levels of F9 expression [33].

The next transcription factor investigated was hepatocyte nuclear factor 4 (HNF4). The Hnf4 gene was cloned in the early 1990s [32] and the purified protein was shown to bind to the –20 region of the F9 promoter and to activate the promoter in reporter assays [34,35]. Again, the known mutations disrupted binding and inhibited transactivation by HNF4, confirming that it was functionally required for gene expression.

The mechanism of recovery from hemophilia B after puberty

Around the same time, researchers began investigating the cause of the increased expression of F9 in Leyden patients after puberty. Systematic analysis of hemophilia B patients by sequencing the F9 gene identified another new subtype of the disease, termed hemophilia B Brandenburg [35]. Afflicted individuals presented with hemorrhagic symptoms throughout life and exhibited very low levels of F9. They were found to have normal F9 coding regions and splice sites, but carried mutations at position –26 in the proximal promoter (Figure 1) [27,28,35–39]. These mutations were of particular interest in that, unlike all the other mutations in the F9 promoter, they were not associated with spontaneous recovery at puberty (Figure 2).

It was noted that this region of the promoter resembled an androgen response element (ARE), the binding site for the androgen receptor (AR) transcription factor. The AR protein is a zinc-finger steroid receptor family transcription factor that binds to DNA sequences conforming to the ARE consensus and activates transcription in the presence of testosterone. It was demonstrated that AR can bind to the –26 region and that the hemophilia B Brandenburg mutations disrupt this binding [35]. Accordingly, it was concluded that the hemophilia B Brandenburg patients did not receive a boost in F9 expression after puberty owing to disruption of the ARE at –26. By contrast, all other patients with promoter mutations that impaired activity (that is, around –20, –5, and +10) did experience an increase in F9 after puberty because their AREs were intact and responsive to testosterone. Lastly, it should be noted that the ARE partially overlaps with the HNF4 element at –20 (Figure 1). As such, the Brandenburg mutations abolish both sites and the inhibition of binding of HNF4 and AR together is thought to cause a form of hemophilia that cannot be resolved by testosterone [38]. In addition to the –26 Brandenburg mutations, two additional mutations at –23 and –24 have been reported, and these are also predicted to disrupt both the AR and HNF4 sites (Figure 1). These patients would also be expected not to recover after puberty, but to our knowledge this has not been confirmed by longitudinal analysis [28,40].

[Figure 1: F9 proximal promoter sequence AGCTCAGCTTGTACTTTGGTACAACTAATCGACCTTACCACTTTCACAATC, with mutated positions –26, –24, –23, –21, –20, –19, –6, –5, +6, +7, +8, +9, +12, and +13 marked relative to the transcriptional start site (+1), and the binding sites of AR, HNF4α, ONECUT, and C/EBPα indicated.]

Figure 1. Hemophilia B Leyden mutations in the F9 promoter disrupt the binding of transcription factors. Single-nucleotide mutations associated with hemophilia B Leyden are denoted by arrows and their positions relative to the transcriptional start site (+1) are indicated. Mutations at –26 (boxed) are associated with a variant of the disease, termed hemophilia B Brandenburg, which does not resolve at puberty. Experimentally determined DNA-binding consensus sequences for the four transcription factors are shown [45,67,68]. Transcription factor abbreviations: AR, androgen receptor; C/EBPα, CCAAT/enhancer-binding protein α; HNF4α, hepatocyte nuclear factor 4α; ONECUT, one cut homeobox (ONECUT1/2). Detailed information on the mutations can be found in the Factor IX Mutation Database [28].


The importance of an intact ARE at –26 is also consistent with the observation that administration of testosterone and anabolic steroids is sufficient to elevate F9 expression in a pre-pubescent Leyden patient [13]. More recently, a case has been reported in which an individual with hemophilia B Leyden was utilizing anabolic steroids for athletic enhancement [41]. When this individual discontinued the use of steroids his F9 levels declined. Together these studies provide strong evidence that androgens are involved in the increase of F9 observed in hemophilia B Leyden patients after puberty.

The observation that androgens affect F9 levels in Leyden patients raises the question of whether sex steroids ordinarily play a role in the physiological regulation of F9 expression. The alternative view is that the ARE is normally largely non-functional and only becomes significant when the F9 promoter is compromised by mutation at the –20, –5, or +10 sites. Although this issue is difficult to resolve, we note that the ARE is not conserved in other mammals, for instance in mouse. Furthermore, in contrast to Leyden patients, the levels of F9 in normal males do not increase dramatically after puberty (Figure 2) [42] and are similar to those found in females [43]. Thus it seems quite possible that the ARE is normally a non-functional element that has become important only in the context of hemophilia B Leyden. If this is the case, it is an excellent example of how different patterns of developmental and hormone-responsive regulation can evolve in different species, or even different individuals in the same species, via very subtle mutations that essentially unmask consensus binding sites that may be present by chance.

The final piece of the puzzle

Given that the mechanisms by which the –20 and +10 point mutation clusters operate, and the defect accounting for hemophilia B Brandenburg, were all explained in the early 1990s, one might have expected that the transcription factor binding to the third cluster of mutations, those lying around –5, would also be rapidly found. The impetus to find this protein was particularly strong because the –5 cluster of mutations accounts for around half of all hemophilia B Leyden patients [27,28]. The –5 region contains a CpG dinucleotide and such sites had been noted at the time to be mutational hotspots in the F9 gene and other loci [44]. Moreover, the various mutations at this site are associated with different severities of disease [18,19,21,28], and therefore understanding their mechanisms of action was of some interest. Nevertheless, despite significant interest from researchers, no progress was made for more than 20 years, illustrating how difficult it can be to determine the mechanism by which regulatory SNPs and mutations affect the expression of nearby genes.

The –5 puzzle was finally solved by two groups working simultaneously, who recently published their results together [45]. Progress on several fronts prepared the ground for the advance. The cloning of virtually all DNA-binding proteins and the advent of ChIP-Seq were fundamental to the breakthrough. Twenty years ago, not all DNA-binding proteins had been identified, and it was therefore not possible to test all candidate factors. Now, more than 1500 putative human transcription factors are known [46] and systematic analysis, although labor-intensive, is at least possible [47,48]. The ChIP-Seq data for the transcription factor ONECUT1 (also known as hepatocyte nuclear factor 6, HNF6) were instrumental in suggesting it as a candidate for binding to the –5 element in the F9 promoter [49]. Additionally, ChIP-on-chip data provided evidence that (in the mouse at least) ONECUT1 occupied the proximal promoter of the F9 gene in hepatocytes [50].

Extensive ChIP-Seq studies have now shown that ONECUT1 specifically binds to the –5 region of the F9 promoter in human hepatocytes [45] (Figure 1). ONECUT1 functionally activates the promoter in reporter experiments, and knockout mouse embryos deficient in ONECUT1 have reduced levels of F9 [45]. Interestingly, the related factor ONECUT2 also binds the –5 site in vitro, and studies on Onecut1/Onecut2 double knockout embryos confirmed that when both proteins are absent the F9 locus is virtually silent [45].

Importantly, the various mutations lying within the –5 cluster all disrupt the binding of the ONECUT factors in vitro and inhibit transactivation in cellular assays [45]. Interestingly, the degree of inhibition varies, and there is a broad agreement between the severity of hemophilia and the degree to which the mutation disrupts binding. It should be stressed, however, that additional factors, such as other genetic, epigenetic, or environmental effects, may also influence symptoms, and thus the clinical severity of hemophilia B cannot be reliably predicted purely through knowledge of the mutation.

The F9 gene: saturation mapping of a human promoter

The F9 gene was one of the first human genes isolated and since that time more than 1000 distinct mutations have been identified in thousands of patients worldwide [27,28]. The number of mutations is such that it is possible that the gene has effectively been subjected to saturation

[Figure 2: plot of F9 level (% of normal, axis 0 to 150) against age (0 to 70 years) for normal individuals, hemophilia B Leyden patients, and hemophilia B Brandenburg patients, with AR activity indicated.]

Figure 2. The developmental expression of F9 is dramatically altered in hemophilia B Leyden and Brandenburg patients. F9 expression ordinarily increases steadily throughout life (shown in red). In hemophilia B Leyden sufferers (light blue), F9 levels rise rapidly at the onset of puberty, thought to be due in part to increasing androgen receptor (AR) activity. Hemophilia B Brandenburg patients (purple) contain a mutation in the androgen response element (ARE) and do not show clinical improvement after puberty. Estimations of F9 levels have been approximated from other studies [12,35,38,42,43,69]. Dotted lines represent extrapolations owing to a lack of available clinical data.


mutagenesis in vivo and most functional mutations have now been identified [40]. The F9 gene encodes a serine protease and many mutations interfere with the activity or the stability of this enzyme. Other mutations affect post-translational processing, including cleavage of the pre- and propeptides, or modifications such as γ-carboxylation or β-hydroxylation. There are also mutations that affect RNA processing at the level of splicing, as in the case of the ‘royal disease’ that afflicted descendants of Queen Victoria [51].

The 21 distinct mutations in the proximal promoter very possibly identify all essential elements in the regulatory region. This is not to say that no other transcription factors bind and contribute to expression, but rather that if there are other sites in the promoter they are not required to sustain clinically relevant levels of F9. There is also the possibility that there are cases of hemophilia B for which no mutation has yet been identified, perhaps most likely in distal regions that may not have been investigated by routine sequencing of the F9 coding region and promoter. These mutations, if they exist, may one day be found to map to an essential enhancer or to some other gene that is required for F9 expression or activity. Indeed, an upstream regulatory region appears to be required for sustained expression of F9 in transgenic mice [52]; however, no human mutations associated with hemophilia have been reported in this region.

The challenge of assigning functions to regulatory SNPs and mutations

Historically, disease-causing mutations were frequently described in the coding regions of genes rather than in cis-regulatory regions [8]. This was partly because the functional impact could much more easily be assigned to such mutations, but also because the sequencing of promoter and enhancer elements was rarely a part of routine diagnostic testing [53]. Moreover, regulatory regions of the genome were difficult to define and cannot always be predicted by sequence conservation [54–56]. Today, however, regulatory mutations represent an increasing fraction of all disease-causing variants (currently 1.9% in The Human Gene Mutation Database, August 2013 [9,57]). GWAS and linkage analyses have revealed many SNPs that are linked to a range of genetic disorders, and it is expected that many of these will also influence gene expression by disrupting transcription factor binding sites [8]. Will it also take a long time to identify the relevant transcription factors?

In some cases it may. In the case of hemophilia B Leyden the disorder was clearly defined, the mutations resided within the proximal promoter, they were encountered repeatedly in independent patients with similar phenotypes, and they typically altered the levels of F9 expression by at least an order of magnitude, both in vivo and in cellular transfection assays in vitro. By contrast, many SNPs that are linked to other genetic disorders will be more challenging. Some will reside in distal enhancers or silencer elements that are involved in long-range interactions and may not function properly in reporter assays, and will thus be more difficult to study. Some may be gain-of-function mutations creating new transcription factor binding sites that would not ordinarily be detected by ChIP-Seq analysis of wild type tissue, as in [58,59]. Others will have subtle but clinically relevant effects that are difficult to assess against a background of other genetic, epigenetic, and environmental factors.

Nevertheless, recent advances in defining both the regulatory components of the human genome as well as the genomic occupancy of transcription factors are likely to accelerate the elucidation of the molecular mechanisms by which regulatory SNPs and mutations operate. In the recent ENCODE studies, DNase-Seq was performed for a diverse array of cell types, highlighting regions of regulatory potential [60]. In addition, ChIP-Seq experiments were carried out for 119 DNA-binding proteins, further defining the in vivo DNA-binding preference matrices for these factors [48]. Other recent technical advances include methods for precisely determining genome-wide transcription factor occupancy [61], and high-throughput techniques for functional validation of regulatory elements [62] and determination of transcription factor binding consensuses [47,63].

Cumulatively, these studies are informing our classification of the regulatory segments of the genome. In the future, SNPs linked to genetic conditions via GWAS experiments will increasingly be mapped directly to compilations of ChIP-Seq and DNase-Seq data to see if any reside in known, in vivo binding sites for any studied transcription factors in the tissue of interest, as in [64]. Improved knowledge of the matrices for transcription factor binding in vivo will enable predictions about whether or not the SNP will significantly influence the affinity of transcription factor binding. The predicted impact can readily be assessed by in vitro DNA-binding assays or, where possible, ChIP-Seq studies on patient samples. Mutation of the candidate motif in vivo can now more easily be achieved with emerging genomic engineering technologies utilizing artificial zinc-finger nucleases, transcription activator-like effector nucleases (TALENs), and clustered regularly interspaced short palindromic repeat (CRISPR)-associated nucleases [65]. The contribution that the transcription factor makes to gene expression can be verified functionally either by RNAi knockdown experiments in cellular assays or by examination of knockout animal models, which are increasingly available from large-scale repositories (such as EUCOMM, http://www.eucomm.org/; KOMP, https://www.komp.org/; and MMRRC, http://www.mmrrc.org/).
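The first step described above, asking whether a disease-associated SNP falls within a known in vivo binding region, amounts to an interval lookup. The Python sketch below assumes peaks supplied as (chromosome, start, end, factor) tuples with 0-based, half-open coordinates. The coordinates are invented for illustration and are not taken from any dataset, and the function names `build_peak_index` and `factors_at` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def build_peak_index(peaks):
    """Group peaks by chromosome and sort them by start coordinate.
    peaks: iterable of (chrom, start, end, factor), 0-based half-open."""
    index = defaultdict(list)
    for chrom, start, end, factor in peaks:
        index[chrom].append((start, end, factor))
    for intervals in index.values():
        intervals.sort()
    return index


def factors_at(index, chrom, pos):
    """Return the sorted names of factors whose peak covers position pos."""
    intervals = index.get(chrom, [])
    # Only peaks that start at or before pos can contain it.
    upper = bisect_right(intervals, (pos, float("inf"), ""))
    return sorted({f for start, end, f in intervals[:upper] if start <= pos < end})


if __name__ == "__main__":
    # Invented peak coordinates, for illustration only.
    peaks = [
        ("chrX", 100, 200, "ONECUT1"),
        ("chrX", 150, 300, "HNF4A"),
        ("chr1", 100, 200, "CEBPA"),
    ]
    index = build_peak_index(peaks)
    for snp in [("chrX", 160), ("chrX", 250), ("chrX", 400)]:
        print(snp, factors_at(index, *snp))
```

A SNP that lands inside a peak is only a candidate. As the text notes, the effect on binding affinity still has to be predicted from the binding matrix and then tested by DNA-binding assays.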

Thus a range of technical advances mean that it is now easier to assign function to regulatory SNPs and mutations than it was in the past. There are undoubtedly multiple challenges that will continue to hinder progress. For instance, ChIP-Seq studies have shown that transcription factors typically bind to thousands of sites genome-wide, and many such binding events are thought to be functionally inconsequential [66]. Moreover, euchromatic cis-regulatory regions are frequently co-occupied by a multitude of transcription factors [66]. As such, discerning functionally relevant proteins from those that bind as a mere consequence of chromatin accessibility may prove difficult. In addition, standard ChIP-Seq assays ordinarily define relatively broad regions of occupancy that may contain numerous putative binding sites for a given transcription


factor. In such cases, specific bona fide binding sites might only be identified by subsequent functional validation or by high-resolution techniques such as ChIP-exo [61]. Lastly, although the number of ChIP-Seq datasets available is indeed on the rise, it remains an elaborate undertaking to investigate the complete cohort of human transcription factors across an extensive range of cell types that may or may not recapitulate typical gene regulation in vivo.

Gene regulation will remain a subtle and sophisticated field, and it is inevitable that the mechanism of action, and indeed the functional relevance, of many SNPs and mutations will remain mysterious for many years to come. Nevertheless, the major technical leaps forward in the past two decades mean that significant progress will be made in understanding how disease genes and other loci are regulated, and that this work will proceed more rapidly in the future.

Acknowledgments

This work has been supported by funding from the Australian Research Council and the National Health and Medical Research Council.

References

1 Lander, E.S. et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921

2 Venter, J.C. et al. (2001) The sequence of the human genome. Science 291, 1304–1351

3 Wray, G.A. (2007) The evolutionary significance of cis-regulatory mutations. Nat. Rev. Genet. 8, 206–216

4 Wittkopp, P.J. and Kalay, G. (2012) Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence. Nat. Rev. Genet. 13, 59–69

5 Zheng, W. et al. (2011) Regulatory variation within and between species. Annu. Rev. Genomics Hum. Genet. 12, 327–346

6 Kasowski, M. et al. (2010) Variation in transcription factor binding among humans. Science 328, 232–235

7 Cookson, W. et al. (2009) Mapping complex disease traits with global gene expression. Nat. Rev. Genet. 10, 184–194

8 Epstein, D.J. (2009) Cis-regulatory mutations in human disease. Brief. Funct. Genomic Proteomic 8, 310–316

9 Worsley-Hunt, R. et al. (2011) Identification of cis-regulatory sequence variations in individual genome sequences. Genome Med. 3, 65

10 Biggs, R. et al. (1952) Christmas disease: a condition previously mistaken for haemophilia. Br. Med. J. 2, 1378–1382

11 Veltkamp, J.J. et al. (1970) Another genetic variant of haemophilia B: haemophilia B Leyden. Scand. J. Haematol. 7, 82–90

12 Briet, E. et al. (1982) Hemophilia B Leyden: a sex-linked hereditary disorder that improves after puberty. N. Engl. J. Med. 306, 788–790

13 Briet, E. et al. (1985) The prophylactic treatment of hemophilia B Leyden with anabolic steroids. Ann. Intern. Med. 103, 225–226

14 Mandalaki, T. et al. (1986) Haemophilia B Leyden in Greece. Thromb. Haemost. 56, 340–342

15 Reitsma, P.H. et al. (1988) The putative factor IX gene promoter in hemophilia B Leyden. Blood 72, 1074–1076

16 Reitsma, P.H. et al. (1989) Two novel point mutations correlate with an altered developmental expression of blood coagulation factor IX (hemophilia B Leyden phenotype). Blood 73, 743–746

17 Crossley, M. and Brownlee, G.G. (1990) Disruption of a C/EBP binding site in the factor IX promoter is associated with haemophilia B. Nature 345, 444–446

18 Hirosawa, S. et al. (1990) Structural and functional basis of the developmental regulation of human coagulation factor IX gene: factor IX Leyden. Proc. Natl. Acad. Sci. U.S.A. 87, 4421–4425

19 Crossley, M. et al. (1990) A less severe form of Haemophilia B Leyden. Nucleic Acids Res. 18, 4633

20 Royle, G. et al. (1991) Haemophilia B Leyden arising de novo by point mutation in the putative factor IX promoter region. Br. J. Haematol. 77, 191–194

21 Vidaud, D. et al. (1993) Nucleotide substitutions at the –6 position in the promoter region of the factor IX gene result in different severity of hemophilia B Leyden: consequences for genetic counseling. Hum. Genet. 91, 241–244

22 Coyle, T.E. et al. (1994) Moderate hemophilia B Leyden: identification by polymerase chain reaction, sequencing, and oligomer restriction. Am. J. Hematol. 46, 234–240

23 Hall, A.J. et al. (1994) A single base pair deletion in the promoter region of the factor IX gene is associated with haemophilia B. Thromb. Haemost. 72, 799–803

24 Morgan, G.E. et al. (1995) The high frequency of the –6G>A factor IX promoter mutation is the result both of a founder effect and recurrent mutation at a CpG dinucleotide. Br. J. Haematol. 89, 672–674

25 Nielsen, L.R. et al. (1995) Detection of ten new mutations by screening the gene encoding factor IX of Danish hemophilia B patients. Thromb. Haemost. 73, 774–778

26 Crossley, P.M. et al. (1989) Unusual case of haemophilia B. Lancet 1, 960

27 Giannelli, F. et al. (1998) Haemophilia B: database of point mutations and short additions and deletions – eighth edition. Nucleic Acids Res. 26, 265–268

28 Rallapalli, P.M. et al. (2013) An interactive mutation database for human coagulation factor IX provides novel insights into the phenotypes and genetics of hemophilia B. J. Thromb. Haemost. 11, 1329–1340

29 Picketts, D.J. et al. (1992) An A to T transversion at position –5 of the factor IX promoter results in hemophilia B. Genomics 12, 161–163

30 Reijnen, M.J. et al. (1993) Hemophilia B Leyden: substitution of thymine for guanine at position –21 results in a disruption of a hepatocyte nuclear factor 4 binding site in the factor IX promoter. Blood 82, 151–158

31 Landschulz, W.H. et al. (1988) Isolation of a recombinant copy of the gene encoding C/EBP. Genes Dev. 2, 786–800

32 Sladek, F.M. et al. (1990) Liver-enriched transcription factor HNF-4 is a novel member of the steroid hormone receptor superfamily. Genes Dev. 4, 2353–2365

33 Davies, N. et al. (1997) Clotting factor IX levels in C/EBP alpha knockout mice. Br. J. Haematol. 99, 578–579

34 Reijnen, M.J. et al. (1992) Disruption of a binding site for hepatocyte nuclear factor 4 results in hemophilia B Leyden. Proc. Natl. Acad. Sci. U.S.A. 89, 6300–6303

35 Crossley, M. et al. (1992) Recovery from hemophilia B Leyden: an androgen-responsive element in the factor IX promoter. Science 257, 377–379

36 Belvini, D. et al. (2005) Molecular genotyping of the Italian cohort of patients with hemophilia B. Haematologica 90, 635–642

37 Wulff, K. et al. (1999) Molecular analysis of hemophilia B in Poland: 12 novel mutations of the factor IX gene. Acta Biochim. Pol. 46, 721–726

38 Morgan, G.E. et al. (1997) Further evidence for the importance of an androgen response element in the factor IX promoter. Br. J. Haematol. 98, 79–85

39 Heit, J.A. et al. (1999) Haemophilia B Brandenberg-type promoter mutation. Haemophilia 5, 73–75

40 Ketterling, R.P. et al. (1995) Two novel factor IX promoter mutations: incremental progress towards ‘saturation in vivo mutagenesis’ of a human promoter region. Hum. Mol. Genet. 4, 769–770

41 Rimmer, E.K. et al. (2012) Unintended benefit of anabolic steroid use in hemophilia B Leiden. Am. J. Hematol. 87, 122–123

42 Andrew, M. et al. (1992) Maturation of the hemostatic system during childhood. Blood 80, 1998–2005

43 Lowe, G.D. et al. (1997) Epidemiology of coagulation factors, inhibitors and activation markers: the Third Glasgow MONICA Survey. I. Illustrative reference ranges by age, sex and hormone use. Br. J. Haematol. 97, 775–784

44 Green, P.M. et al. (1990) The incidence and distribution of CpG–TpG transitions in the coagulation factor IX gene. A fresh look at CpG mutational hotspots. Nucleic Acids Res. 18, 3227–3231

45 Funnell, A.P. et al. (2013) A CpG mutational hotspot in a ONECUT binding site accounts for the prevalent variant of hemophilia B Leyden. Am. J. Hum. Genet. 92, 460–467

46 Vaquerizas, J.M. et al. (2009) A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 10, 252–263

47 Jolma, A. et al. (2013) DNA-binding specificities of human transcription factors. Cell 152, 327–339


48 ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74

49 Laudadio, I. et al. (2012) A feedback loop between the liver-enriched transcription factor network and miR-122 controls hepatocyte differentiation. Gastroenterology 142, 119–129

50 Odom, D.T. et al. (2004) Control of pancreas and liver gene expression by HNF transcription factors. Science 303, 1378–1381

51 Rogaev, E.I. et al. (2009) Genotype analysis identifies the cause of the ‘royal disease’. Science 326, 817

52 Kurachi, S. et al. (1999) Genetic mechanisms of age regulation of human blood coagulation factor IX. Science 285, 739–743

53 de Vooght, K.M. et al. (2009) Management of gene promoter mutations in molecular diagnostics. Clin. Chem. 55, 698–708

54 Parker, S.C. et al. (2009) Local DNA topography correlates with functional noncoding regions of the human genome. Science 324, 389–392

55 Schmidt, D. et al. (2010) Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036–1040

56 Odom, D.T. et al. (2007) Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nat. Genet. 39, 730–732

57 Stenson, P.D. et al. (2012) The Human Gene Mutation Database (HGMD) and its exploitation in the fields of personalized genomics and molecular evolution. Curr. Protoc. Bioinform. 39, 1.13.1–1.13.20

58 De Gobbi, M. et al. (2006) A regulatory SNP causes a human genetic disease by creating a new transcriptional promoter. Science 312, 1215–1217

59 Lettice, L.A. et al. (2008) Point mutations in a distant sonic hedgehog cis-regulator generate a variable regulatory output responsible for preaxial polydactyly. Hum. Mol. Genet. 17, 978–985

60 Thurman, R.E. et al. (2012) The accessible chromatin landscape of the human genome. Nature 489, 75–82

61 Rhee, H.S. and Pugh, B.F. (2011) Comprehensive genome-wide protein–DNA interactions detected at single-nucleotide resolution. Cell 147, 1408–1419

62 Kheradpour, P. et al. (2013) Systematic dissection of regulatory motifs in 2000 predicted human enhancers using a massively parallel reporter assay. Genome Res. 23, 800–811

63 Berger, M.F. and Bulyk, M.L. (2009) Universal protein-binding microarrays for the comprehensive characterization of the DNA-binding specificities of transcription factors. Nat. Protoc. 4, 393–411

64 Maurano, M.T. et al. (2012) Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195

65 Gaj, T. et al. (2013) ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends Biotechnol. 31, 397–405

66 Biggin, M.D. (2011) Animal transcription networks as highly connected, quantitative continua. Dev. Cell 21, 611–626

67 Tewari, A.K. et al. (2012) Chromatin accessibility reveals insights into androgen receptor activation and transcriptional specificity. Genome Biol. 13, R88

68 Portales-Casamar, E. et al. (2010) JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles. Nucleic Acids Res. 38, D105–D110

69 Simpson, N.E. and Biggs, R. (1962) The inheritance of Christmas factor. Br. J. Haematol. 8, 191–203


The ectodysplasin pathway: from diseases to adaptations

Alexa Sadier, Laurent Viriot, Sophie Pantalacci, and Vincent Laudet

Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, Université Lyon 1, Centre National de la Recherche Scientifique (CNRS), École Normale Supérieure de Lyon, 46 allée d’Italie, 69364 Lyon CEDEX 07, France

The ectodysplasin (EDA) pathway is active during the development of ectodermal organs, including teeth, hairs, feathers, and mammary glands, and is crucial for fine-tuning the developmental network controlling the number, size, and density of these structures. It was discovered by studying human patients affected by anhidrotic/hypohidrotic ectodermal dysplasia. It comprises three main gene products: EDA, a ligand that belongs to the tumor necrosis factor (TNF)-α family; EDAR, a receptor related to the TNFα receptors; and EDARADD, a specific adaptor. This core pathway relies on downstream NF-κB pathway activation to regulate target genes. The pathway has recently been found to be associated with specific adaptations in natural populations: the magnitude of armor plates in sticklebacks and the hair structure in Asian human populations. Thus, despite its role in human disease, the EDA pathway is a ‘hopeful pathway’ that could allow adaptive changes in ectodermal appendages which, as specialized interfaces with the environment, are considered hot-spots of morphological evolution.

Ectodermal dysplasias

In a section of his 1875 book The Variation of Animals and Plants under Domestication, Charles Darwin notes that ‘skin and the appendages of hair, feathers, hoofs, horns and teeth, are homologous over the whole body’. In this very same section, entitled ‘Correlated variation of homologous parts’, he describes the case of a ‘Hindoo family in Scinde, in which ten men, in the course of four generations, were furnished, in both jaws taken together, with only four small and weak incisor teeth and with eight posterior molars. The men thus affected have very little hair on the body, and become bald early in life. They also suffer much during hot weather from excessive dryness of the skin. It is remarkable that no instance has occurred of a daughter being thus affected’ [1]. In a few sentences he described what became known as anhidrotic/hypohidrotic ectodermal dysplasia (HED) and rightly pointed out two striking features of this disease: it affects several ectodermal organs in a correlated manner, and it predominantly affects men (one of the genes responsible is carried by the X chromosome). HED is in fact one of many (more than 150) relatively rare diseases, collectively termed ectodermal dysplasias, that affect the skin and other ectodermal organs.

In HED (OMIM 257980), patients lack hair (hypotrichosis), teeth (oligodontia), and nails. Teeth are often malformed in these HED patients. Sweat gland deficiency (anhidrosis/hypohidrosis) explains the inability to resist high temperatures mentioned by Darwin. Several other glands are also affected, including the lacrimal glands, the sebaceous glands, the Meibomian glands, and possibly mucous glands in the respiratory tract. Many affected individuals present a characteristic facial appearance (prominent forehead, thick lips), and in some cases an immune deficiency has been observed [2,3]. It is fascinating that this example, reported by Darwin as an interesting case of pathological variation in humans, is now a seminal example of how signaling pathways can affect human health as well as be recruited in wild species for specific adaptations. Two cases have been particularly illustrative in that respect: in stickleback, a variant of eda is associated with variation in the defensive armor [4], and in humans a point mutation in the receptor EDAR has been shown to be under positive selection and has been linked to hair thickness and specific tooth morphology [5]. Here we review these two cases and discuss the implications of this dual link between human disease and natural variation.

Corresponding author: Laudet, V. ([email protected]).
Keywords: ectodysplasin; anhidrotic/hypohidrotic ectodermal dysplasia; adaptation; ectodermal appendages; signaling pathways.

0168-9525/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tig.2013.08.006

From human genetics to animal mutants

More than 120 years after Darwin made his observations, the gene mutated in the X-linked forms of the disease was first mapped on the X chromosome of male patients and later identified using rare female patients who harbor an X:autosome translocation [6,7]. This gene was termed EDA, for ectodysplasin, in reference to the disease. A very similar phenotype (hair defects, tooth abnormalities, and absence of sweat glands [8]) had been described in the spontaneous X-linked mouse mutant tabby. This facilitated the cloning of the Eda/tabby gene, the mouse ortholog of the human EDA gene [9,10], and eventually revealed that the encoded protein is a member of the TNF superfamily. From then on, mouse mutants that closely resemble the tabby mutant enabled the identification of other genes of the core ‘EDA pathway’. The mouse mutant downless led to the Edar gene, a member of the TNF receptor superfamily ([11,12], reviewed in [13]). Lastly, the crinkled mouse mutant, almost identical to the tabby and downless mutants, enabled identification of the Edaradd gene, encoding a death-domain adaptor [14] (Figure 1A). Mutations in the orthologous genes in humans, EDAR and EDARADD, also cause HED [3,14,15]. To date, more than 100 different mutations, recessive or dominant, have been described in the X-linked EDA gene, whereas only 20 and 3 causative mutations have been found in EDAR and EDARADD, respectively [3,16,17] (Figure 1B). Hypomorphic alleles of EDA have been discovered that lead to a reduction in tooth number (affecting the incisors only), although a clinically normal phenotype has also been observed [18–20]. Of note, similar cases of ectodermal dysplasia have been attributed to mutations in EDAR and EDARADD orthologs in other mammals, including dogs [21], various cattle breeds [22], and rats [23]. The similarity of the ectodermal dysplasia phenotype in mutants of these five species highlights the conserved and important role of the pathway in mammalian ectodermal appendage development.

Recent data show that the EDA pathway is not restricted to mammals and is largely conserved in most, if not all, vertebrates. The genes of the EDA pathway are expressed during feather development in chicken (with the notable difference that EDA is expressed in the mesenchyme rather than in the epithelium) [24]. In addition, inhibition of the pathway through dominant negative receptor expression decreased the number of feather placodes (the embryonic structures that later develop into specific organs, here feathers) [25]. More recently, the EDA pathway was also implicated in ectodermal development in fish: an edar mutant in medaka is almost completely devoid of scales and exhibits a tooth phenotype [26,27]. Similarly, mutations in the zebrafish eda and edar genes that affect protein positions homologous to those mutated in human HED patients lead to defects in fins, scales, and pharyngeal teeth [28]. From these data, it is likely that the EDA pathway controls ectodermal appendages in all vertebrates and provides a common paradigm governing the development of all these structures, from hairs, scales, or feathers to various glands.

From EDA to the NF-κB pathway

Given the similar clinical syndrome produced by mutation of these three genes in humans and their three counterparts in mice, it is no surprise that their products form a linear pathway: the trimeric EDA ligand binds to the trimeric EDAR receptor which, upon binding, recruits the EDARADD adaptor via death-domain/death-domain interactions (Figure 1) [29]. These interactions were tested in vitro: first, EDAR was shown to interact with one protein product of the Eda gene, the secreted protein EDA-A1 [30–33], through its TNF-binding domain. Then, EDAR was shown to recruit the intracellular adaptor EDARADD through its death domain. Another protein product of the Eda gene (EDA-A2), which results from the skipping of two amino acids by alternative splicing, was later shown to bind another TNF receptor, EDA2R (previously termed XEDAR) [33]. A recent report based on a unique case suggests that loss of function of EDA2R can be associated with HED, although a mouse knockout does not induce an HED-like phenotype [34,35].

Figure 1. The canonical EDA pathway. (A) Principal model for EDA pathway downstream signaling: the trimeric ligand EDA binds to the trimeric receptor EDAR, leading to the recruitment of the adaptor EDARADD and the formation of a complex containing EDARADD, Traf6, Tab2, and Tak1. Tak1 activates the IKK complex (IKK1, IKK2, and NEMO), which phosphorylates IκB, triggering its ubiquitination and degradation and thereby releasing the NF-κB transcription factor. NF-κB is then translocated into the nucleus to activate target genes. Figure adapted from [37]. (B) EDA, EDAR, and EDARADD proteins and their functional domains. Col, collagen domain; DD, death domain; LBD, ligand-binding domain; TM, transmembrane domain; TNF, TNF domain; Traf6 BS, Traf6 binding site. Principal mutations are indicated (key: EDA X-linked mutations, dominant autosomal mutations, and recessive autosomal mutations). For EDAR and EDARADD autosomal mutations, it is interesting to note that dominant mutations are mainly localized in the interaction domain, resulting in impaired oligomerization and a dominant negative effect.
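The linear logic of the cascade summarized in Figure 1A, in which loss of any single step abolishes the downstream NF-κB response, can be sketched in a few lines of Python. This is only an illustrative abstraction under strong assumptions: each step is treated as simply intact or disrupted, the step labels and the `first_blocked_step` and `signal_reaches_target_genes` helpers are hypothetical names introduced here, and no quantitative signaling is modeled.

```python
# Steps of the canonical cascade, in the order given in the Figure 1A legend.
# Treating each step as simply intact or disrupted is an illustrative assumption.
CANONICAL_EDA_CASCADE = (
    "EDA",              # trimeric TNF-family ligand
    "EDAR",             # trimeric receptor
    "EDARADD",          # death-domain adaptor
    "TRAF6/TAB2/TAK1",  # complex formed together with EDARADD
    "IKK complex",      # IKK1, IKK2, and NEMO
    "NF-kB",            # transcription factor released from IkB
)


def first_blocked_step(disrupted):
    """Return the first disrupted step of the cascade, or None if all are intact."""
    disrupted = set(disrupted)
    for step in CANONICAL_EDA_CASCADE:
        if step in disrupted:
            return step
    return None


def signal_reaches_target_genes(disrupted):
    """In a strictly linear pathway, any single disrupted step blocks the output."""
    return first_blocked_step(disrupted) is None


if __name__ == "__main__":
    for gene in ("EDA", "EDAR", "EDARADD"):
        print(gene, "disrupted, output reached:", signal_reaches_target_genes({gene}))
    print("all intact, output reached:", signal_reaches_target_genes(set()))
```

In this toy model, disrupting EDA, EDAR, or EDARADD gives the same outcome, which mirrors the similar syndromes reported for mutations in the three genes.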

This cascade, from EDA and EDAR to EDARADD, is reminiscent of other TNF pathways (e.g., TNF/TNFR/TRADD); it was therefore hypothesized that this early signaling would lead to activation of the NF-κB pathway [30,36]. This was first shown in cellulo [14,30,36] and further confirmed by demonstrating the involvement of canonical partners of the NF-κB activation pathway such as TAB2, IKK, and TRAF6 ([37], for review see [29]). Although activation of the JNK pathway and apoptosis were also proposed, these results remain controversial because the response is very weak compared with that of other TNF family members [30,36,38]. The relevance of NF-κB activation was further suggested when mice mutant for Traf6 or IKK/NEMO were shown to exhibit tabby-like phenotypes, and HED patients were identified who had mutations in these genes [39–41]. However, there is also some evidence that an alternative, NF-κB-independent pathway may exist [42]. Finally, comparison of NF-κB–LacZ reporter activation in tabby versus wild type mice also indicated that decreased EDA signaling induces a severe decrease in NF-κB activation [38,43]. Therefore it is clear, from the data discussed above, that the NF-κB pathway is a major transducer of the EDA pathway.

The implication of the NF-κB pathway in development came as a surprise because it is primarily known for its major role in inflammation and immunity, acting downstream of the activation of TNF immune receptors [44]. By contrast, the activation of NF-κB by the EDA pathway does not seem to be obviously linked to inflammation. This raises the recurrent question of how cells can differentiate between the types of messages transduced by the NF-κB pathway and how specific target genes are activated for developmental or inflammatory responses. Interestingly, mice mutant for core proteins of the NF-κB pathway such as TRAF6 and NEMO [40,45] exhibit not only an HED syndrome and its developmental defects but also inflammation problems. In addition, recent data suggest that the EDA pathway induces the transcription of chemokines (key players in inflammation) that also have a developmental role [46]. Moreover, it has been observed that ectodermal overexpression of the glucocorticoid receptor (GR), a well-known inhibitor of NF-κB in inflammation, induces an HED phenotype [47]. This suggests that the pathways in ectoderm development and inflammation may not be fundamentally different, and that the specificity of target gene activation may largely be determined by the general context of a given cell type.

Tuning ectodermal appendages with the EDA pathway

Many developmental pathways other than the EDA pathway, such as Wnt, Fgf, BMP, and hedgehog, also participate in the development of ectodermal appendages. However, most of these pathways are highly pleiotropic, and disruption of even one core component generally leads to very severe developmental defects not only in ectodermal appendages but also in other organs. By contrast, the absence of any one factor in the EDA pathway specifically affects only ectodermal appendages, and generally (with a single exception to our knowledge) does not preclude their development. Instead, these organs typically present defects in terms of shape, number, size, or position in the organism. For example, in both humans and mice, the dental formula and tooth shape are abnormal (reviewed in [13,29]). In mice, primary (guard) hairs do not develop, but other hair types exhibit an abnormal morphology and decreased density [48–50]. In addition, mammary glands and salivary glands exhibit defects in branching and size [29,51–54]. The mutation of edar in medaka has a similar effect, leading to a reduction of scale number and a drastic decrease in tooth number and organization in both the oral and pharyngeal cavities [27]. Conversely, increasing EDA pathway activity tends to increase the number, size, and density of ectodermal appendages. For example, overexpression of Eda in the ectoderm under the control of the keratin 14 (K14) promoter (K14–Eda mice) results in ectopic teeth and mammary glands, larger hair placodes, and higher dental complexity [50,55–58]. In chicken, hyperactivation of EDAR during feather development results in increased feather placode density. Taken together, these findings suggest that the EDA pathway plays a conserved role in fine-tuning the size, spacing, and position (and probably thereby shape) of ectodermal organs in vertebrates.

This tuning role of the EDA pathway is reflected by its integral position in the gene regulatory network controlling ectodermal appendage development, as exemplified by studies carried out in mouse. Studies of hair and tooth development in mice have characterized the position of the EDA pathway relative to other more ‘classical’ developmental pathways (reviewed in [13,29]). The Wnt pathway regulates both EDA [29,59] and EDAR [60], and the activin and BMP pathways also regulate EDAR [58,59]. Transcriptomic and developmental studies also identified downstream targets of EDAR, including Shh, FGF20, and the Wnt and/or BMP antagonists Dkk4, CCN2/CTGF, and follistatin (reviewed in [29], see also [61]). Owing to its position both downstream and upstream of key developmental pathways (Figure 2), the EDA pathway can modulate the strength and range of activation of these pathways.

In conclusion, the EDA pathway is specific for ectodermal appendage development, where it serves to fine-tune growth, as can be seen in both phenotypes and molecular function. This may have allowed it to play a role in morphological changes during evolution because, in contrast to other major developmental pathways, it is not essential and therefore can be changed during evolution without producing large pleiotropic lethal effects.

EDA pathway variation in natural populations

The ectodermal appendages, including teeth, teleost scales, bird feathers, mammalian hair, and mammary glands, are specialized interfaces with the environment, and these structures are known to be hot-spots of morphological evolution at both the macroevolutionary and microevolutionary levels. The study of their evolution is still a major topic in evolutionary biology, and the EDA pathway is of interest in this regard. Two main cases show that the genes of the EDA pathway have indeed been recruited for specific adaptations: one in a teleost fish, the stickleback, and another in our own species, Homo sapiens, both of which we discuss below.

In the first case a gene of the EDA pathway, namely eda itself, was linked to a specific adaptation in the defensive armor of the threespine stickleback (Gasterosteus aculeatus). In this species, parallel adaptive evolution occurred when marine fishes repeatedly colonized freshwater streams and lakes 10 000 to 20 000 years ago. Extensive studies of natural populations showed that the complete armor of 30–36 plates is favored in marine habitats, probably because it facilitates escape from predator capture and renders ingestion by the predator more difficult. By contrast, the low armor of up to nine plates typical of freshwater populations could be favored because it allows more rapid growth and thus better odds for winter survival, but other explanations such as reduced ion availability or improved swimming performance in freshwater have also been proposed ([62,63], reviewed in [64]) (Figure 3). Whatever the cause, the reduced fitness of the full-plate phenotype in freshwater environments has been clearly demonstrated ([65] and references therein). QTL analysis and positional cloning methods [4] revealed that the gene eda was associated with this evolutionary change. Interestingly, despite their independent origins, the vast majority of freshwater populations present the same low-plate haplotype in the homozygous state. The low-plate allele is also found at very low frequency in marine fishes, and fishes heterozygous for this allele have mostly complete lateral plates. Freshwater low-armor populations thus evolved through the repeated selection of the same allele, which seems to be retained as a cryptic variant within marine populations. The opposite scenario explains a case of reverse evolution shown in an urban freshwater lake [66] (Figure 3). In this case, the standing highly plated allele increased in frequency in the freshwater population, probably as a consequence of increased predation by trout and, within 40 years, half the population had reverted to a complete armor phenotype.
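The rise of a standing allele from low frequency under renewed selection, as in the reversal described above, can be illustrated with the textbook deterministic recursion for viability selection at one locus with two alleles. This is only a minimal sketch: the fitness values, the starting frequency, and the number of generations below are hypothetical and are not estimates from [66] or any other study. Giving heterozygotes the same fitness as complete-armor homozygotes is likewise an assumption, loosely motivated by the observation that heterozygous fish have mostly complete plates.

```python
def next_frequency(p, w_cc, w_cl, w_ll):
    """One generation of deterministic viability selection at a diallelic locus.

    p is the frequency of the complete-armor allele (C); the alternative is the
    low-plate allele (L). w_cc, w_cl, and w_ll are relative genotype fitnesses.
    """
    q = 1.0 - p
    mean_fitness = p * p * w_cc + 2.0 * p * q * w_cl + q * q * w_ll
    return (p * p * w_cc + p * q * w_cl) / mean_fitness


def trajectory(p0, generations, w_cc, w_cl, w_ll):
    """Return the allele frequency at generation 0, 1, ..., generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_cc, w_cl, w_ll))
    return freqs


if __name__ == "__main__":
    # Hypothetical scenario: low-plate homozygotes suffer a 10% fitness cost.
    freqs = trajectory(p0=0.05, generations=40, w_cc=1.0, w_cl=1.0, w_ll=0.9)
    for generation in (0, 10, 20, 30, 40):
        print(generation, round(freqs[generation], 3))
```

With equal fitnesses the recursion leaves the frequency unchanged, so any rise in the output comes entirely from the assumed cost to low-plate homozygotes.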

The precise mutations within the eda locus responsible for the low activity of the EDA pathway in freshwater morphs have not yet been identified. It is clear that this allele is a partial loss of function based on the reduced scale phenotype of medaka and zebrafish mutants in the EDA pathway genes [26–28]. Moreover, it was predicted that these mutations should affect regulatory regions and decrease EDA expression [4,67]. Interestingly, it has also been shown that other genes of the pathway (in particular edar, edaradd, traf6, and nemo) are not associated with low-plated freshwater fishes, suggesting a bias of evolvability towards the eda gene [67].

Figure 2. The EDA pathway as a modulator of key developmental pathways. Proposed model for the role of the EDA pathway in regulating the establishment of placodes that give rise to ectodermal organs. The EDA pathway occupies a central position among other key developmental pathways. Regulators of the EDA pathway (Wnt, activin, and BMP) are listed on the left and targets are listed on the right. Two types of targets can be distinguished: those that are involved in morphogenesis (Shh and Fgf20), and those that participate in positive or negative feedback loops (Dkk, CTGF, follistatin, and Wnt10b) that are responsible for local reinforcement and diffusion range limitation of factors that control placode fate. Thus, enhanced or reduced signaling induced by the EDA pathway modulates the strength and range of activation of other key developmental pathways, ultimately fine-tuning the number, size, and position of placodes.


Finally, it is important to note that reduction of the armor plates may only be the most obvious phenotypic effect linked to a reduced activity of eda in sticklebacks. Recently, the eda locus has been associated with behavioral differences in freshwater versus marine sticklebacks. Interestingly, the low-plate allele was associated with a propensity to move to new environments, suggesting that marine sticklebacks carrying the low-plate allele may be more likely to colonize freshwater environments [68]. This may be one factor among others explaining the repeated selection of this allele in freshwater fishes: other subtle (but important in terms of fitness) morphological, physiological, or behavioral changes may be associated with freshwater eda alleles [49]. Indeed, the pleiotropy of the pathway is well illustrated in the following example of adaptation associated with the EDA pathway.

The second case involves the receptor EDAR in humans, which is modified by a specific mutation in the open reading frame in Asian populations. This mutation was identified in genome-wide scans for traces of past positive selection in human populations [69–74] and is a point mutation causing a conservative amino acid change: a conserved valine residue in the death domain of the protein is replaced by an alanine at position 370 (V370A) [5] (Figure 1). Although some debate exists, this mutation was found in East Asian and Native American populations, and it has been suggested that it was selected in Asia, or perhaps more precisely in central China, ~30 000 years ago [5,75]. It correlates with increased hair thickness [76] and specific tooth morphology [77,78] in East Asian populations. Interestingly, this variant was shown to enhance NF-κB activation in cellulo, indicating that it has a gain of function effect [75,79]. In agreement, the V370A mutation has been shown to attenuate the severity of EDA alleles responsible for the HED phenotype [80]. Consistently, increased Edar activity in transgenic mice with increased Edar copy number (EdarTg951 line) converts the hair phenotype to a typical East Asian morphology through the development of enlarged hair follicles [75,79]. However, the most convincing evidence that the V370A variant is responsible for the specific East Asian hair morphology was recently obtained by knocking this mutation into the endogenous mouse Edar gene [5]. Not only did the Edar370A mice display increased hair thickness, but they also had increased mammary gland branch density as well as increased eccrine sweat gland number [5]. This latter effect is also associated with the mutation in a Han Chinese cohort, further showing that the variant has pleiotropic effects in human populations [5]. For this reason it is still unclear which effects provided the selective value of the variant and which were hitch-hiking effects. To date there is no specific clue either way, but it is interesting to mention one hypothesis put forward [5] suggesting that positive selection was not for a specific single trait but instead for a combination of phenotypic changes elicited by the variant, possibly through different, mutually reinforcing, selective pressures: some traits could confer environmental advantages whereas others (e.g., effects on the mammary gland) could have been subject to sexual selection.
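The label V370A follows the common shorthand for an amino acid substitution: the one-letter code of the original residue, the position, and the one-letter code of the replacement. As a small illustration, the sketch below parses such a label and expands it to three-letter codes. The `Substitution` type and the `parse_substitution` and `to_three_letter` helpers are hypothetical names introduced here and are not part of any tool cited in this review.

```python
import re
from typing import NamedTuple

# Standard one-letter to three-letter codes for the 20 common amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


class Substitution(NamedTuple):
    reference: str  # residue in the reference protein
    position: int   # 1-based position in the protein
    variant: str    # residue in the variant protein


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Parse a label such as 'V370A' into a Substitution."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    reference, position, variant = match.groups()
    if reference not in ONE_TO_THREE or variant not in ONE_TO_THREE:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return Substitution(reference, int(position), variant)


def to_three_letter(sub):
    """Render a Substitution with three-letter codes, e.g. 'Val370Ala'."""
    return f"{ONE_TO_THREE[sub.reference]}{sub.position}{ONE_TO_THREE[sub.variant]}"


if __name__ == "__main__":
    print(to_three_letter(parse_substitution("V370A")))
```

For the EDAR variant discussed here, the parsed fields are valine as the reference residue, position 370, and alanine as the variant residue.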

In addition to providing interesting cases for which the genomic basis of specific adaptations is identified (still the holy grail of the evo/devo field), these two examples also illustrate the dichotomy between the pleiotropy of developmental genes and their recruitment for specific adaptations [81,82]. It has been suggested that mutations in cis-regulatory sequences are more likely to underlie phenotypic evolution than other types of genetic changes because they avoid the pleiotropic effects linked to mutations in protein-coding open reading frames. The case of the V370A mutation in humans illustrates that a single point mutation can be pleiotropic but that these effects, although fairly prominent, can be retained if the net adaptive value of the mutation is positive.

Studying the EDA pathway highlights the connections between diseases and evolutionary adaptations in natural populations. It can be anticipated that this will be even more prominent in coming years because recent developments at the forefront of clinical research now provide a unique tool kit for studying the involvement of the pathway in natural adaptations. Indeed, a series of reagents [a recombinant form of EDA-A1 able to cross the placental barrier (Fc-EDA-A1) and agonist anti-EDAR monoclonal antibodies] that aim to correct the HED phenotype observed in EDA pathway mutants have been developed and were shown to effectively rescue the phenotype in both mice and dogs [49,83,84]. This impressive proof of concept for a possible treatment of the disease also provides tools to manipulate the level of the EDA pathway in model organisms or in natural variants, opening the way to direct functional studies.

Figure 3. Sticklebacks with different activity levels of the EDA pathway. (Above) The marine stickleback morph with the full armor plate phenotype of 30–36 plates. (Center) A freshwater stickleback caught in 1957 in Lake Washington, Seattle, exhibits the typical freshwater low plate phenotype with up to nine plates, corresponding to reduced activity of the EDA pathway. (Below) A ‘reverted’ freshwater fish caught in 2006 in which a high armor plate phenotype is clearly visible. The bones have been labeled using alizarin red staining as described in [66]. Photographs courtesy of Jun Kitano.

Concluding remarksIn light of these examples and our knowledge of thepathway, we are convinced that the two described casesare only the tip of the iceberg – and that other adaptationswill be linked to the genes of the EDA pathway. This couldbe the case of the V370A mutation, which could putativelyevolve repeatedly in other species as seen for other genes[85,86], and especially if the hypothesis that it drivescorrelated advantageous traits is correct. Of course, thiscould also be the case for other unknown mutations in any

of the genes in the pathway. The diverse teeth and hairmorphologies, some of which are known to be adaptive inmammals, the scales and teeth morphologies observed inteleosts, and the modified feathers or scales in sauropsids,provide a fascinating reservoir to explore for changes in theEDA pathway activity (Box 1). In terms of sequence evolu-tion at the vertebrate scale, the EDA pathway genes havebeen shown to combine both conserved features and evo-lutionary shifts [87]. These shifts are found at differenttaxonomic levels, suggesting that changes in the pathwaymay well have been linked to ectodermal organ diversifi-cation in vertebrates. Orthologs of EDA, as well as a uniquenon-duplicated orthologs of EDAR/Xedar/Troy, are foundin invertebrate chordate genomes (Ciona and amphioxus):

Box 1. How changing the dose of the EDA pathway affects dentition in human and mouse

In both human and mouse, tooth number and morphology are highly

sensitive to modifications in the dose of EDA signaling (Figure I).

Interestingly, in both species loss or gain of function results in a large

range of phenotypic modifications.

In human, one can distinguish severe loss of function, which results

in a significant reduction of primary and permanent dentition (on

average 22 missing teeth, sometimes extending to complete lack of

teeth) and severe shape modifications (such as typical peg-shaped

teeth, reduction in tooth size and cusp number) [90,91], from less

severe cases in which no clear HED symptoms are diagnosed and in

which only incisors are affected [18,20]. In the opposite case, the gain-

of-function mutation found in East Asian populations results in

shovel-shaped incisors [77,78].

In mouse, the general trends observed in tabby (Figure I: tabby

mouse molar as compared to wild type), downless, and crinkled

mouse dentitions are reductions of both molar size and cusp number,

as well as modifications in the segmentation of dental rows leading to

2–4 cheek teeth bearing various shape anomalies. Detailed analysis

by two groups pointed out that the tabby phenotype with only two

molars and simplified molar morphology is reminiscent to some

insectivorous rodent species, such as the water rat Hydromys

chrysogaster [56]. By contrast, overexpression of the Eda or Edar

genes (see Figure I: K14–Eda mouse molar) also leads to occasional

abnormal segmentation of the molar dental row and to various shape

anomalies [50,55,56]. The dominant dental trait of these mutants is

the occurrence of longitudinal crests in both lower and upper molars.

These crests resemble a rodent molar phenotype known as stepha-

nodonty, which occurs in many murine rodents but never in mice, and

is thought to be an adaptation to herbivory [94].

It is interesting to note that detailed morphological analysis

revealed that the Eda/tabby and Edar/J mutations differently affect

the molar phenotype [92], which is most obvious in homozygous

mutants for lower molars, the Edar/downless phenotype being more

severe in terms of cusp reduction. Moreover, the sensitivity of upper

and lower dental rows shows variability in regards to loss of

function of EDA pathway genes, the lower row tending to be more

drastically modified. All this suggests that tooth morphology is

sensitive to small variations in EDA pathway signaling level, and

therefore that evolutionary trends in mammalian dentition – such as

the reduction in tooth number or crown crestization in herbivores –

are phenotypes that may have been controlled by genes of the EDA

pathway.

[Figure I panels. Mouse (upper tooth rows): Tabby, WT, and K14–Eda, arranged along an EDA signaling level axis marked + and –. Human: severe hypodontia (HED; reviewed in Clauss et al. 2008), minor hypodontia (no HED, incisors affected; Nikopensius et al. 2013), and EDAR gain of function/sinodontia (Park et al. 2012).]

Figure I. Tooth phenotypes resulting from alterations of the EDA pathway. Mouse upper tooth row phenotype and human tooth phenotype are compared against EDA signaling levels ranging from loss of EDA function to increased activation. Mouse phenotypes in tabby (Eda null mutant), wild type (WT), and K14–Eda (Eda gain of function) are shown by scans of the upper tooth row. The equivalent tooth phenotypes in human are explained in the lower part of the figure. Human severe hypodontia is reviewed in [91]; for minor hypodontia see [93]; for EDAR gain of function/sinodontia see [78]. HED, anhidrotic/hypohidrotic ectodermal dysplasia.

Review Trends in Genetics January 2014, Vol. 30, No. 1


Page 32: Trends in Genetics 2014 Jan

their developmental roles as well as their biochemical function are totally unknown to date but could bring useful information on the origin of the pathway [87,88].

These observations can be linked to the debate around Richard Goldschmidt’s suggestion that small genetic changes to developmentally important genes may have large phenotypic effects, producing a ‘hopeful monster’ that could be selected, thereby leading to rapid evolutionary changes [89]. The finding that a pleiotropic pathway is involved in both disease and adaptation is reminiscent of Goldschmidt’s hopeful monster. It is interesting to note in this respect that, from the various cases discussed in this review, we can track null mutations causing HED, hypomorphic EDA mutants causing selective tooth agenesis, and a gain-of-function EDAR allele in East Asian populations causing specific tooth morphology, illustrating very well the wide range of phenotypic variance associated with this pathway. In line with our prediction that the EDA pathway will be involved in other cases of adaptation, and the fact that it is one of the rare pathways for which a correction of the disease can be envisaged in humans, we would like to see the EDA pathway as a ‘hopeful pathway’.

Acknowledgments

Work in our laboratory is supported by the Ministry of Research and Education (Agence Nationale de la Recherche, ANR) programs Quenottes (ANR-06-BLAN-0216), Bouillabaisse (ANR-09-BLAN-0127), BigTooth (ANR 2011 BSV7 00803), and Convergdent (ANR 2011 JSV6 00501). A.S. holds a fellowship from Fondation ARC pour la Recherche sur le Cancer. We thank Joanne Burden for critical reading of the manuscript and Jun Kitano for the stickleback pictures.

References

1 Darwin, C. (1875) The Variation of Animals and Plants under Domestication, John Murray

2 Wisniewski, S.A. et al. (2002) Recent advances in understanding of the molecular basis of anhidrotic ectodermal dysplasia: discovery of a ligand, ectodysplasin A and its two receptors. J. Appl. Genet. 43, 97–107

3 Cluzeau, C. et al. (2011) Only four genes (EDA1, EDAR, EDARADD, and WNT10A) account for 90% of hypohidrotic/anhidrotic ectodermal dysplasia cases. Hum. Mutat. 32, 70–72

4 Colosimo, P.F. et al. (2005) Widespread parallel evolution in sticklebacks by repeated fixation of Ectodysplasin alleles. Science 307, 1928–1933

5 Kamberov, Y.G. et al. (2013) Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702

6 Kere, J. et al. (1996) X-linked anhidrotic (hypohidrotic) ectodermal dysplasia is caused by mutation in a novel transmembrane protein. Nat. Genet. 13, 409–416

7 Srivastava, A.K. et al. (1996) Fine mapping of the EDA gene: a translocation breakpoint is associated with a CpG island that is transcribed. Am. J. Hum. Genet. 58, 126–132

8 Sofaer, J.A. (1969) Aspects of the Tabby–crinkled–downless syndrome. I. The development of Tabby teeth. J. Embryol. Exp. Morphol. 22, 181–205

9 Ferguson, B.M. et al. (1997) Cloning of Tabby, the murine homolog of the human EDA gene: evidence for a membrane-associated protein with a short collagenous domain. Hum. Mol. Genet. 6, 1589–1594

10 Srivastava, A.K. et al. (1997) The Tabby phenotype is caused by mutation in a mouse homologue of the EDA gene that reveals novel mouse and human exons and encodes a protein (ectodysplasin-A) with collagenous domains. Proc. Natl. Acad. Sci. U.S.A. 94, 13069–13074

11 Headon, D.J. and Overbeek, P.A. (1999) Involvement of a novel Tnf receptor homologue in hair follicle induction. Nat. Genet. 22, 370–374

12 Monreal, A.W. et al. (1999) Mutations in the human homologue of mouse dl cause autosomal recessive and dominant hypohidrotic ectodermal dysplasia. Nat. Genet. 22, 366–369

13 Mikkola, M.L. and Thesleff, I. (2003) Ectodysplasin signaling in development. Cytokine Growth Factor Rev. 14, 211–224

14 Headon, D.J. et al. (2001) Gene defect in ectodermal dysplasia implicates a death domain adapter in development. Nature 414, 913–916

15 Shimomura, Y. et al. (2004) A rare case of hypohidrotic ectodermal dysplasia caused by compound heterozygous mutations in the EDAR gene. J. Invest. Dermatol. 123, 649–655

16 Mikkola, M.L. (2009) Molecular aspects of hypohidrotic ectodermal dysplasia. Am. J. Med. Genet. A 149A, 2031–2036

17 Chassaing, N. et al. (2010) Mutations in EDARADD account for a small proportion of hypohidrotic ectodermal dysplasia cases. Br. J. Dermatol. 162, 1044–1048

18 Tarpey, P. et al. (2007) A novel Gln358Glu mutation in ectodysplasin A associated with X-linked dominant incisor hypodontia. Am. J. Med. Genet. A 143, 390–394

19 Han, D. et al. (2008) Novel EDA mutation resulting in X-linked non-syndromic hypodontia and the pattern of EDA-associated isolated tooth agenesis. Eur. J. Med. Genet. 51, 536–546

20 Tao, R. et al. (2006) A novel missense mutation of the EDA gene in a Mongolian family with congenital hypodontia. J. Hum. Genet. 51, 498–502

21 Casal, M.L. et al. (2005) Mutation identification in a canine model of X-linked ectodermal dysplasia. Mamm. Genome 16, 524–531

22 Drögemüller, C. et al. (2001) Partial deletion of the bovine ED1 gene causes anhidrotic ectodermal dysplasia in cattle. Genome Res. 11, 1699–1705

23 Kuramoto, T. et al. (2011) A rat model of hypohidrotic ectodermal dysplasia carries a missense mutation in the Edaradd gene. BMC Genet. 12, 91

24 Houghton, L. et al. (2005) The ectodysplasin pathway in feather tract development. Development 132, 863–872

25 Drew, C.F. et al. (2007) The Edar subfamily in feather placode formation. Dev. Biol. 305, 232–245

26 Kondo, S. et al. (2001) The medaka rs-3 locus required for scale development encodes ectodysplasin-A receptor. Curr. Biol. 11, 1202–1206

27 Atukorala, A.D. et al. (2010) Scale and tooth phenotypes in medaka with a mutated ectodysplasin-A receptor: implications for the evolutionary origin of oral and pharyngeal teeth. Arch. Histol. Cytol. 73, 139–148

28 Harris, M.P. et al. (2008) Zebrafish eda and edar mutants reveal conserved and ancestral roles of ectodysplasin signaling in vertebrates. PLoS Genet. 4, e1000206

29 Mikkola, M.L. (2008) TNF superfamily in skin appendage development. Cytokine Growth Factor Rev. 19, 219–230

30 Kumar, A. et al. (2001) The ectodermal dysplasia receptor activates the nuclear factor-kappaB, JNK, and cell death pathways and binds to ectodysplasin A. J. Biol. Chem. 276, 2668–2677

31 Elomaa, O. et al. (2001) Ectodysplasin is released by proteolytic shedding and binds to the EDAR protein. Hum. Mol. Genet. 10, 953–962

32 Tucker, A.S. et al. (2000) Edar/Eda interactions regulate enamel knot formation in tooth morphogenesis. Development 127, 4691–4700

33 Yan, M. et al. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates binding to two distinct receptors. Science 290, 523–527

34 Wisniewski, S.A. and Trzeciak, W.H. (2012) A new mutation resulting in the truncation of the TRAF6-interacting domain of XEDAR: a possible novel cause of hypohidrotic ectodermal dysplasia. J. Med. Genet. 49, 499–501

35 Newton, K. et al. (2004) Myodegeneration in EDA-A2 transgenic mice is prevented by XEDAR deficiency. Mol. Cell. Biol. 24, 1608–1613

36 Koppinen, P. et al. (2001) Signaling and subcellular localization of the TNF receptor Edar. Exp. Cell Res. 269, 180–192

37 Morlon, A. et al. (2005) TAB2, TRAF6 and TAK1 are involved in NF-kappaB activation induced by the TNF-receptor, Edar and its adaptator Edaradd. Hum. Mol. Genet. 14, 3751–3757

38 Schmidt-Ullrich, R. et al. (2006) NF-kappaB transmits Eda A1/EdaR signalling to activate Shh and cyclin D1 expression, and controls post-initiation hair placode down growth. Development 133, 1045–1057

39 Ohazama, A. et al. (2004) Traf6 is essential for murine tooth cusp morphogenesis. Dev. Dyn. 229, 131–135


Page 33: Trends in Genetics 2014 Jan

40 Naito, A. et al. (2002) TRAF6-deficient mice display hypohidrotic ectodermal dysplasia. Proc. Natl. Acad. Sci. U.S.A. 99, 8766–8771

41 Zonana, J. et al. (2000) A novel X-linked disorder of immune deficiency and hypohidrotic ectodermal dysplasia is allelic to incontinentia pigmenti and due to mutations in IKK-gamma (NEMO). Am. J. Hum. Genet. 67, 1555–1562

42 Pispa, J. et al. (2008) Edar and Troy signalling pathways act redundantly to regulate initiation of hair follicle development. Hum. Mol. Genet. 17, 3380–3391

43 Dickson, K.M. et al. (2004) TRAF6-dependent NF-κB transcriptional activity during mouse development. Dev. Dyn. 231, 122–127

44 Gilmore, T.D. and Wolenski, F.S. (2012) NF-κB: where did it come from and why? Immunol. Rev. 246, 14–35

45 Shifera, A.S. (2010) The zinc finger domain of IKKγ (NEMO) protein in health and disease. J. Cell. Mol. Med. 14, 2404–2414

46 Lefebvre, S. et al. (2012) Identification of ectodysplasin target genes reveals the involvement of chemokines in hair development. J. Invest. Dermatol. 132, 1094–1102

47 Cascallana, J.L. et al. (2005) Ectoderm-targeted overexpression of the glucocorticoid receptor induces hypohidrotic ectodermal dysplasia. Endocrinology 146, 2629–2638

48 Cui, C.Y. et al. (2003) Inducible mEDA-A1 transgene mediates sebaceous gland hyperplasia and differential formation of two types of mouse hair follicles. Hum. Mol. Genet. 12, 2931–2940

49 Gaide, O. and Schneider, P. (2003) Permanent correction of an inherited ectodermal dysplasia with recombinant EDA. Nat. Med. 9, 614–618

50 Mustonen, T. et al. (2004) Ectodysplasin A1 promotes placodal cell fate during early morphogenesis of ectodermal appendages. Development 131, 4907–4919

51 Jaskoll, T. et al. (2003) Ectodysplasin receptor-mediated signaling is essential for embryonic submandibular salivary gland development. Anat. Rec. A: Discov. Mol. Cell. Evol. Biol. 271, 322–331

52 Melnik, M. et al. (2009) Salivary gland branching morphogenesis: a quantitative systems analysis of the Eda/Edar/NFkappaB paradigm. BMC Dev. Biol. 9, 32

53 Mikkola, M.L. (2011) The Edar subfamily in hair and exocrine gland development. Adv. Exp. Med. Biol. 691, 23–33

54 Voutilainen, M. et al. (2012) Ectodysplasin regulates hormone-independent mammary ductal morphogenesis via NF-κB. Proc. Natl. Acad. Sci. U.S.A. 109, 5744–5749

55 Tucker, A.S. et al. (2004) The activation level of the TNF family receptor, Edar, determines cusp number and tooth number during tooth development. Dev. Biol. 268, 185–194

56 Kangas, A.T. et al. (2004) Nonindependence of mammalian dental characters. Nature 432, 211–214

57 Harjunmaa, E. et al. (2012) On the difficulty of increasing dental complexity. Nature 483, 324–327

58 Mou, C. et al. (2006) Generation of the primary hair follicle pattern. Proc. Natl. Acad. Sci. U.S.A. 103, 9075–9080

59 Laurikkala, J. et al. (2002) Regulation of hair follicle development by the TNF signal ectodysplasin and its receptor Edar. Development 129, 2541–2553

60 Zhang, Y. et al. (2009) Reciprocal requirements for EDA/EDAR/NF-kappaB and Wnt/beta-catenin signaling pathways in hair follicle induction. Dev. Cell 17, 49–61

61 Haara, O. et al. (2012) Ectodysplasin regulates activator-inhibitor balance in murine tooth development through Fgf20 signaling. Development 139, 3189–3199

62 Barrett, R.D. et al. (2008) Natural selection on a major armor gene in threespine stickleback. Science 322, 255–257

63 Le Rouzic, A. et al. (2011) Strong and consistent natural selection associated with armour reduction in sticklebacks. Mol. Ecol. 20, 2483–2493

64 Barrett, R.D. (2010) Adaptive evolution of lateral plates in three-spined stickleback Gasterosteus aculeatus: a case study in functional analysis of natural variation. J. Fish Biol. 77, 311–328

65 Leinonen, T. et al. (2012) Multiple evolutionary pathways to decreased lateral plate coverage in freshwater threespine sticklebacks. Evolution 66, 3866–3875

66 Kitano, J. et al. (2008) Reverse evolution of armor plates in the threespine stickleback. Curr. Biol. 18, 769–774

67 Knecht, A.K. et al. (2007) Constraints on utilization of the EDA-signaling pathway in threespine stickleback evolution. Evol. Dev. 9, 141–154

68 Barrett, R.D. et al. (2009) Should I stay or should I go? The Ectodysplasin locus is associated with behavioural differences in threespine stickleback. Biol. Lett. 5, 788–791

69 Sabeti, P.C. et al. (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918

70 Carlson, C.S. et al. (2005) Genomic regions exhibiting positive selection identified from dense genotype data. Genome Res. 15, 1553–1565

71 Williamson, S.H. et al. (2005) Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl. Acad. Sci. U.S.A. 102, 7882–7887

72 Voight, B.F. et al. (2006) A map of recent positive selection in the human genome. PLoS Biol. 4, e72

73 Tang, K. et al. (2007) A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biol. 5, e171

74 Myles, S. et al. (2008) Identification and analysis of high Fst regions from genome-wide SNP data from three human populations. Ann. Hum. Genet. 72, 99–110

75 Bryk, J. et al. (2008) Positive selection in East Asians for an EDAR allele that enhances NF-kappaB activation. PLoS ONE 3, e2209

76 Fujimoto, A. et al. (2008) A replication study confirmed the EDAR gene to be a major contributor to population differentiation regarding head hair thickness in Asia. Hum. Genet. 124, 179–185

77 Kimura, R. et al. (2009) A common variation in EDAR is a genetic determinant of shovel-shaped incisors. Am. J. Hum. Genet. 85, 528–535

78 Park, J.H. et al. (2012) Effects of an Asian-specific nonsynonymous EDAR variant on multiple dental traits. J. Hum. Genet. 57, 508–514

79 Mou, C. et al. (2008) Enhanced ectodysplasin-A receptor (EDAR) signaling alters multiple fiber characteristics to produce the East Asian hair form. Hum. Mutat. 29, 1405–1411

80 Cluzeau, C. et al. (2012) The EDAR370A allele attenuates the severity of hypohidrotic ectodermal dysplasia caused by EDA gene mutation. Br. J. Dermatol. 166, 678–681

81 Hoekstra, H.E. and Coyne, J.A. (2007) The locus of evolution: evo devo and the genetics of adaptation. Evolution 61, 995–1016

82 Stern, D.L. and Orgogozo, V. (2008) The loci of evolution: how predictable is genetic evolution? Evolution 62, 2155–2177

83 Casal, M.L. et al. (2007) Significant correction of disease after postnatal administration of recombinant ectodysplasin A in canine X-linked ectodermal dysplasia. Am. J. Hum. Genet. 81, 1050–1056

84 Kowalczyk, C. et al. (2011) Molecular and therapeutic characterization of anti-ectodysplasin A receptor (EDAR) agonist monoclonal antibodies. J. Biol. Chem. 286, 30769–30779

85 Christin, P-A. et al. (2010) Causes and evolutionary significance of genetic convergence. Trends Genet. 26, 400–405

86 Martin, A. and Orgogozo, V. (2013) The loci of repeated evolution: a catalog of genetic hotspots of phenotypic variation. Evolution 67, 1235–1250

87 Pantalacci, S. et al. (2008) Conserved features and evolutionary shifts of the EDA signaling pathway involved in vertebrate skin appendage development. Mol. Biol. Evol. 25, 912–928

88 Wiens, G.D. and Glenney, G.W. (2011) Origin and evolution of TNF and TNF receptor superfamilies. Dev. Comp. Immunol. 35, 1324–1335

89 Dietrich, M.R. (2003) Richard Goldschmidt: hopeful monsters and other ‘heresies’. Nat. Rev. Genet. 4, 68–74

90 Lexner, M.O. et al. (2008) X-linked hypohidrotic ectodermal dysplasia. Genetic and dental findings in 67 Danish patients from 19 families. Clin. Genet. 74, 252–259

91 Clauss, F. et al. (2008) Dento-craniofacial phenotypes and underlying molecular mechanisms in hypohidrotic ectodermal dysplasia (HED): a review. J. Dent. Res. 87, 1089–1099

92 Charles, C. et al. (2009) Distinct impacts of Eda and Edar loss of function on the mouse dentition. PLoS ONE 4, e4985

93 Nikopensius, T. et al. (2013) Non-syndromic tooth agenesis associated with a nonsense mutation in ectodysplasin-A (EDA). J. Dent. Res. 92, 507–511

94 Gomes Rodrigues, H. et al. (2013) Roles of dental development and adaptation in rodent evolution. Nat. Commun. 4, 2504 http://dx.doi.org/10.1038/ncomms3504


Page 34: Trends in Genetics 2014 Jan

Genetics of recessive cognitive disorders

Luciana Musante and H. Hilger Ropers

Max Planck Institute of Molecular Genetics, Berlin, Germany

Most severe forms of intellectual disability (ID) have specific genetic causes. Numerous X chromosome gene defects and disease-causing copy-number variants have been linked to ID and related disorders, and recent studies have revealed that sporadic cases are often due to dominant de novo mutations with low recurrence risk. For autosomal recessive ID (ARID) the recurrence risk is high and, in populations with frequent parental consanguinity, ARID is the most common form of ID. Even so, its elucidation has lagged behind. Here we review recent progress in this field, show that ARID is not rare even in outbred Western populations, and discuss the prospects for improving its diagnosis and prevention.

ID: a major unsolved problem of healthcare

Early-onset cognitive impairment, commonly referred to as mental retardation or, more recently, ID [1], is defined as a disability ‘characterized by significant limitations both in intellectual functioning and in adaptive behavior’ and which ‘originates before the age of 18’ [2]; an IQ below 70 (= IQ 100 – 2 SD) is generally considered to be the threshold for ID. According to this definition, ID is estimated to affect 1–3% of Western populations [3] but is significantly more common elsewhere, with malnutrition, cultural deprivation, poor healthcare, and parental consanguinity as predisposing factors. Worldwide, ID is a major socioeconomic problem, the most costly of all diagnoses listed in the International Classification of Diseases (ICD10, http://www.cdc.gov/nchs/icd/icd10.htm), and the most frequent reason for referral to genetic services [4]. ID may be the only clinical symptom or it may be part of a clinically recognizable syndrome, but specific clinical features will often only be apparent when comparing several patients [5], and a sharp distinction between syndromic and non-syndromic forms (NS-ID) is not possible.

Most autosomal recessive gene defects are still unknown

Since 1991, the year when common fragile X syndrome was elucidated, more than 100 X-linked gene defects have been implicated in ID, as reported and reviewed elsewhere [6,7]. During the past decade numerous de novo and recurrent copy-number variants (CNVs) have been identified that cause or predispose to ID [8] and, more recently, sequencing of affected individuals and their healthy parents has indicated that in sporadic patients, de novo basepair changes are another important cause of ID ([9–11] and references therein). By contrast, research into autosomal recessive ID (ARID) has lagged behind, possibly because in Western societies, where most of the genetic research takes place, families are usually small, which has hampered mapping and identification of the underlying gene defects. This problem has been partly overcome by the introduction of high-throughput DNA sequencing techniques (Box 1). However, it has been shown that ARID is extremely heterogeneous, that the total number of ARID genes may run into the thousands (reviewed in [4]), and that the vast majority of these are still unknown.

Homozygosity mapping in consanguineous families

Homozygosity (or autozygosity) mapping in consanguineous families is the strategy of choice for mapping genes for recessive disorders in the human genome [12] (Box 1). Before 2002, virtually nothing was known about the molecular causes of ARID and, until 2006, no more than three genes for non-syndromic ARID had been identified, all by microsatellite-based homozygosity mapping in large consanguineous families and subsequent mutation screening of functionally plausible positional candidate genes [4] (Table 1).

The first large study employing single-nucleotide polymorphism (SNP) arrays to map ID genes [13] identified single homozygous linkage intervals in 8 of 76 consanguineous Iranian families with two or more affected children. None of these intervals overlapped, indicating that ARID is highly heterogeneous. This was confirmed by subsequent studies [14,15]. Thus, in contrast to non-syndromic recessive deafness, where 50% of the patients have mutations in a single gene (reviewed in [16]), these studies did not identify any frequent forms of ARID.

Many of the homozygous intervals in these families were large, which meant that hundreds of genes often had to be screened to identify the causative mutation. Nevertheless, systematic Sanger sequencing has led to the identification of numerous novel genes for non-syndromic ARID (Table 1) ([4] and references therein; [17,18]).
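The interval search described above can be illustrated with a minimal Python sketch. It scans the SNP genotype calls of one affected individual along a single chromosome and reports long runs of consecutive homozygous markers. The function name, the two-letter genotype encoding, the "--" code for a missing call, and both thresholds are assumptions made for illustration; they are not taken from the studies cited here, and dedicated tools additionally tolerate genotyping errors.

```python
def homozygous_runs(markers, min_markers=3, min_length_bp=1_000_000):
    """Return (start_bp, end_bp) for each run of consecutive homozygous calls.

    markers: list of (position_bp, genotype) for one chromosome, sorted by
    position; genotype is a two-letter call such as "AA", "AB" or "BB".
    Anything else (for example "--" for a missing call) ends the current run.
    """
    runs = []
    current = []  # positions of the markers in the run being extended

    def close_run():
        # Keep the run only if it has enough markers and spans enough sequence.
        if len(current) >= min_markers and current[-1] - current[0] >= min_length_bp:
            runs.append((current[0], current[-1]))
        current.clear()

    for position, genotype in markers:
        is_homozygous = (
            len(genotype) == 2 and genotype.isalpha() and genotype[0] == genotype[1]
        )
        if is_homozygous:
            current.append(position)
        else:
            close_run()
    close_run()  # flush a run that reaches the end of the chromosome
    return runs


# Hypothetical genotype calls, for illustration only.
example = [
    (100_000, "AB"), (500_000, "AA"), (900_000, "BB"), (1_400_000, "AA"),
    (2_000_000, "BB"), (2_300_000, "AB"), (2_600_000, "AA"),
    (2_700_000, "AA"), (3_000_000, "AB"),
]
print(homozygous_runs(example))
```

In a family study, the runs found in each affected sibling would then be intersected, because the causative variant is expected to lie in a segment that is homozygous in all affected individuals.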

Review

0168-9525/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tig.2013.09.008

Corresponding author: Ropers, H.H. ([email protected]).

Keywords: autosomal recessive ID; homozygosity mapping; next-generation sequencing; healthcare.


Page 35: Trends in Genetics 2014 Jan

Next-generation sequencing (NGS): a new dimension in the elucidation of ARID

The introduction of high-throughput NGS techniques (Box 2) has revolutionized the genetic dissection of ID and the identification of gene defects underlying ARID. TECR was the first gene for which a causative homozygous variant was identified by whole-exome enrichment and

Box 1. Traditional strategies to map ARID genes

Screening for disease-associated CNVs by array-comparative genomic hybridization (a-CGH)

CNVs are structural variations in the genome that consist of gains and losses of large segments of DNA sequence, ranging in length from 1000 bp to 5 Mb (cytogenetic level of resolution). Because CNVs change the structure of the genome, their functional effect could crucially depend on whether they change the sequence or relative location of specific segments of genomic DNA.

Linkage mapping in multiple-affected families

Genetic linkage is the tendency of alleles at loci close to each other on a chromosome to be inherited together during meiosis because they are less likely to be separated by a crossover event. Conversely, if loci are far apart or on different chromosomes then recombination will occur by chance in 50% of meioses. The recombination fraction ranges from 0 (tight linkage) to 0.5 (no linkage) and is a measure of genetic distance. Linkage can be used to map disease genes by typing DNA markers (e.g., SNPs) and seeing if their alleles cosegregate with the disease phenotype.
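As a worked illustration of how cosegregation is quantified, the following Python sketch computes a two-point LOD score, the standard statistic that compares the likelihood of the observed meioses at a recombination fraction θ with the likelihood under no linkage (θ = 0.5). The LOD score is not named in this box; it is introduced here as an assumption about the intended analysis, and the sketch only covers the simplest case of phase-known, fully informative meioses. Function names are illustrative.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score: log10 of L(theta) / L(0.5).

    recombinants: number of meioses in which marker and disease alleles recombined
    meioses:      total number of informative, phase-known meioses
    theta:        recombination fraction, between 0 (tight linkage) and 0.5
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0 and recombinants > 0:
        return float("-inf")  # a single recombinant excludes theta = 0
    log_linked = 0.0
    if recombinants:
        log_linked += recombinants * math.log10(theta)
    if non_recombinants:
        log_linked += non_recombinants * math.log10(1.0 - theta)
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def best_theta(recombinants, meioses):
    """Maximum-likelihood estimate of theta, capped at 0.5 (no linkage)."""
    return min(recombinants / meioses, 0.5)


# Ten informative meioses with one recombinant (hypothetical counts).
theta_hat = best_theta(1, 10)
print(theta_hat, round(lod_score(1, 10, theta_hat), 2))
```

By convention a LOD score of 3 or more is taken as evidence for linkage; ten meioses without any recombinant give a score of about 3.01 at θ = 0.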

Homozygosity mapping in consanguineous families

Consanguineous families are common in countries belonging to the ‘consanguinity belt’ that extends from Morocco to India, and in migrant communities now permanently resident in Western Europe, North America, and Australasia [96] (see also http://www.consang.net/). It is estimated that about 20% of the human population live in communities with a preference for consanguineous marriage and that at least 8.5% of children have consanguineous parents ([97] and references therein). Globally, the most common form of consanguineous union is between first cousins, who share 1/8 of their genes, and their progeny therefore show autozygosity at 1/16 of all loci. Conventionally, this is expressed as the coefficient of inbreeding (F), and for first-cousin offspring F = 0.0625 [98]. The children of consanguineous individuals will have more homozygous DNA than the offspring of an outbred marriage. This leads to an increased likelihood of rare, recessive disease-causing variants being inherited from a common ancestor via both maternal and paternal lineages. Homozygosity mapping is based on the fact that the affected offspring of consanguineous matings will not only be homozygous by descent for the causative gene defect, but also for flanking genetic markers located on the same chromosomal segment.
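The figures quoted for first cousins (1/8 of genes shared, F = 1/16 = 0.0625 in their offspring) can be reproduced with a short Python sketch that computes kinship coefficients recursively from a pedigree. The pedigree, the individual labels, and the function names are hypothetical and serve only to check the arithmetic; the inbreeding coefficient of a child is taken, as usual, to be the kinship coefficient of its parents.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother). Founders are absent.
# GF and GM are the shared grandparents; A and B are siblings; C1 and C2 are
# first cousins; X is the child of the first-cousin union.
PEDIGREE = {
    "A": ("GF", "GM"),
    "B": ("GF", "GM"),
    "C1": ("A", "SA"),
    "C2": ("SB", "B"),
    "X": ("C1", "C2"),
}


def parents(individual):
    return PEDIGREE.get(individual, (None, None))


@lru_cache(maxsize=None)
def depth(individual):
    """Generation number: 0 for founders, otherwise 1 + the deepest parent."""
    known = [p for p in parents(individual) if p is not None]
    return 0 if not known else 1 + max(depth(p) for p in known)


@lru_cache(maxsize=None)
def kinship(a, b):
    """Probability that one allele drawn from a and one from b are identical by descent."""
    if a is None or b is None:
        return 0.0
    if a == b:
        father, mother = parents(a)
        return 0.5 * (1.0 + kinship(father, mother))
    if depth(a) < depth(b):
        a, b = b, a  # recurse through the individual that cannot be an ancestor of the other
    father, mother = parents(a)
    return 0.5 * (kinship(father, b) + kinship(mother, b))


def inbreeding_coefficient(individual):
    """F of an individual equals the kinship coefficient of its parents."""
    father, mother = parents(individual)
    return kinship(father, mother)


print(2 * kinship("C1", "C2"))      # fraction of genes shared by first cousins
print(inbreeding_coefficient("X"))  # F for the offspring of first cousins
```

The same functions can be applied to other unions (for example, double first cousins or uncle and niece) by editing the pedigree dictionary.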

Table 1. NS-ARID genes identified before the NGS era (2002–2011)

| Gene(a) | HGNC ID | Number of families reported | Ethnicity | Mutation(b) | Disorder(c) | OMIM(d) | First description |
|---|---|---|---|---|---|---|---|
| PRSS12 | 9477 | 2 | Algerian | delACGT 1350-1353 | MRT1, NS-ID | #249500 | [99] |
| CRBN | 30185 | 1 | Closed population (North America) | R419X | MRT2, NS-ID | #607417 | [100] |
| CC2D1A | 30237 | 9 | Israeli Arab | G408fsX437 | MRT3, NS-ID | #608443 | [101] |
| GRIK2 | 4580 | 1 | Iranian | del/inv, Ex7–11 | MRT6, NS-ID | #611092 | [102] |
| TUSC3 | 30242 | 5 | Iranian | del120 kb; Q55X | MRT7, NS-ID | #611093 | [103,104] |
| | | | French | N263TfsX300 | | | |
| | | | Pakistani | del170 kb | | | |
| | | | Italian | del203 kb | | | |
| TRAPPC9 | 30832 | 7 | Israeli Arab | R475X | MRT13, NS-ID | #613192 | [105–107] |
| | | | Tunisian | R570X | | | |
| | | | Pakistani | R475X; c.1024+1G>T | | | |
| | | | Iranian | L772WfsX7 | | | |
| | | | Syrian | R475X | | | |
| | | | Italian | T951YfsX17 | | | |
| ZC3H14 | 20509 | 1 | Iranian | R154X | NS-ID | | [17] |
| MED23 | 2372 | 1 | Algerian | R617Q | NS-ID | #614249 | [18] |

(a) Gene symbol approved by Human Gene Nomenclature Committee, HGNC (http://www.genenames.org/).

(b) Abbreviations: c, coding region; del, deletion; fs, frameshift; inv, inversion; X, stop codon.

(c) MRT, mental retardation, autosomal recessive, phenotypic series; OMIM, Online Mendelian Inheritance in Man (http://www.ncbi.nlm.nih.gov/omim); NS-ID, non-syndromic intellectual disability.

(d) OMIM number (#), phenotypic description, molecular basis known, version 9 October 2013.

Box 2. Disease gene identification by NGS

In the past, the traditional way to identify Mendelian disease genes was Sanger sequencing of candidate genes selected by positional mapping (i.e., linkage analysis, homozygosity mapping), by their relation to other genes responsible for similar phenotypes, or because the encoded proteins were known to be physiologically or functionally relevant to the disease in question. The introduction of NGS has revolutionized the genetic dissection of monogenic diseases, allowing the identification of gene defects underlying ID in familial cases even where linkage analysis would be impossible due to insufficient family information (size of the family, number of affected per family, etc.), as well as in sporadic cases, and encompassing diverse models of inheritance. Moreover, it can be applied to the detection of CNVs. A few major NGS platforms exist (reviewed in [108]). Although they use different enzymology, chemistry, high-resolution optics, hardware, and software, they share some commonalities: they generally start with fragmented genomic DNA, which is ligated to platform-specific linkers and then selectively amplified by PCR, ready for massively parallel sequencing that yields millions of short reads. NGS can be applied to sequencing of the entire human genome (referred to as whole-genome sequencing, WGS), to the entire protein-coding sequence (known as whole-exome sequencing, WES), and also to a subset of genomic regions (i.e., exons within the homozygous loci or linkage intervals) or to a subset of target genes. Despite the advantages of NGS technologies compared to previous methods, including increased speed and reduced costs, the major challenge now resides in the interpretation of the large number of variants identified. It will be crucial to develop strategies for disease variant prioritization, including robust bioinformatics procedures to filter the relevant changes. This process could also take advantage of the development of databases of genetic variants present in affected and healthy individuals.
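The filtering step mentioned at the end of this box can be illustrated with a minimal sketch. It assumes a hypothetical variant record and illustrative thresholds; the field names, the set of consequence classes, and the 1% frequency cut-off are assumptions made for this example and are not taken from the review or from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical variant record; all field names are illustrative."""
    gene: str
    chrom: str
    pos: int
    genotype: str        # "hom" or "het"
    consequence: str     # e.g. "missense", "nonsense", "frameshift", "splice_site", "synonymous"
    control_freq: float  # allele frequency in a database of healthy individuals


# Consequence classes treated as potentially damaging (an assumption for this sketch).
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


def in_intervals(variant, intervals):
    """Return True if the variant lies inside one of the (chrom, start, end) intervals."""
    return any(
        chrom == variant.chrom and start <= variant.pos <= end
        for chrom, start, end in intervals
    )


def prioritize(variants, hom_intervals, max_control_freq=0.01):
    """Keep rare, homozygous, potentially damaging variants inside homozygous intervals."""
    return [
        v for v in variants
        if v.genotype == "hom"
        and v.consequence in DAMAGING
        and v.control_freq <= max_control_freq
        and in_intervals(v, hom_intervals)
    ]


# Toy input with placeholder gene names (not real findings).
example = [
    Variant("GENE_A", "1", 1500, "hom", "nonsense", 0.0),
    Variant("GENE_B", "1", 1600, "het", "missense", 0.0),
    Variant("GENE_C", "1", 1700, "hom", "synonymous", 0.0),
]
candidates = prioritize(example, [("1", 1000, 2000)])
print([v.gene for v in candidates])  # prints ['GENE_A']
```

A filter of this kind is deliberately strict: by construction it discards heterozygous and synonymous changes, so it cannot recover compound heterozygous or non-coding defects.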

Review Trends in Genetics January 2014, Vol. 30, No. 1






Page 36: Trends in Genetics 2014 Jan

sequencing (WES) of a large consanguineous family with NS-ID [19], and a missense mutation in this gene was recently found to be a common cause of NS-ID in Hutterites [20]. TECR codes for trans-2,3-enoyl-CoA reductase (also referred to as synaptic glycoprotein 2), which reduces trans-2,3-stearoyl-CoA to stearoyl-CoA of long and very long chain fatty acids (VLCFA). Perturbations of VLCFA metabolism have also been observed in other neurological disorders such as adrenoleukodystrophy and Zellweger syndrome, and mutations affecting FACL4, which is involved in the degradation of VLCFA and the production of key intermediates in the synthesis of complex lipids, are known to cause X-linked ID [21].

More recently, a large study highlighted the extraordinary potential of NGS for unraveling the molecular basis of ARID [22]. Instead of performing WES, these authors opted for the enrichment and sequencing of exons from homozygous linkage intervals in consanguineous Iranian families. In 78 of 136 families investigated, a single, apparently disease-causing sequence variant was identified. Of these families, 26 had homozygous mutations in 23 genes previously implicated in ID or related neurological disorders and, in addition, single homozygous mutations were found in 50 novel candidate genes for ARID, mostly in patients with apparently ‘pure’ or NS-ID (Table 2). Follow-up studies have revealed additional clinical symptoms in patients with mutations involving the same genes, thereby confirming their postulated role in ID, but also illustrating the clinical variability of these gene defects.

It is noteworthy that, in about 40% of the families studied, potentially causative gene defects could not be identified. In populations where parental consanguinity is common, not all recessive conditions are due to autozygous changes, which were the only target of this study; other

Table 2. NS-ARID (candidate) genes identified by NGS (since 2011)^a

Gene | HGNC ID | Number of families reported | Ethnicity | Mutation | Disorder | OMIM | First description
ADK | 257 | 1 | Iranian | H324R | NS-ID ASD | | [22]
ADRA2B | 282 | 1 | Iranian | R440G | NS-ID | | [22]
ASCC3 | 18697 | 1 | Iranian | S1564P | NS-ID | | [22]
ASCL1 | 738 | 1 | Iranian | A41S | NS-ID | | [22]
C11orf46 | 26798 | 1 | Iranian | R236H | NS-ID | | [22]
TTI2 | 26262 | 1 | Iranian | P367L | NS-ID | | [22]
RABL6 | 24703 | 1 | Pakistani | A562P | NS-ID | | [22]
CASP2 | 1503 | 1 | Iranian | Q392X | NS-ID | | [22]
CCNA2 | 1578 | 1 | Iranian | Splice site | NS-ID | | [22]
COQ5 | 28722 | 1 | Iranian | G118S | NS-ID | | [22]
CRADD | 2340 | 5 | Old Order Amish and Mennonite | G128R | MRT34, NS-ID | #614499 | [43]
EEF1B2 | 3208 | 1 | Iranian | Splice site | NS-ID | | [22]
ELP2 | 18248 | 2 | Lebanese | T555P | NS-ID | | [22]
 | | | Iranian | R462L | | |
ENTPD1 | 3363 | 1 | Iranian | Y65C | NS-ID | | [22]
FASN | 3594 | 1 | Iranian | R1819W | NS-ID | | [22]
HIST3H3 | 4778 | 1 | Omani | R130C | NS-ID | | [22]
INPP4A | 6074 | 1 | Iranian | D915fsX | NS-ID | | [22]
KIAA1033 | 29174 | 1 | Omani | P1019R | NS-ID | | [109]
MAN1B1 | 6823 | 3 | Pakistani | E397K | MRT15, NS-ID | #614202 | [22,41]
 | | 1 | | W473X | | |
 | | 1 | Iranian | R334C | | |
NDST1 | 7680 | 1 | Iranian | R709Q | NS-ID | | [22]
PECR | 18281 | 1 | Iranian | L57V | NS-ID ASD | | [22]
PRMT10 | 25099 | 1 | Iranian | G189R | NS-ID | | [22]
PRRT2 | 30500 | 1 | Iranian | A214fsX | NS-ID | | [22]
RALGDS | 9842 | 1 | Iranian | A706V | NS-ID | | [22]
RGS7 | 10003 | 1 | Iranian | N304fsX | NS-ID ASD | | [22]
SCAPER | 13081 | 1 | Iranian | Y118fsX | NS-ID | | [22]
ST3GAL3 | 10866 | 2 | Iranian | A13D | MRT12, NS-ID | #611090 | [39]
 | | | | D370Y | | |
TECR | 4551 | 15 | Endogamic population; Hutterites | P182L | MRT14, NS-ID | #614020 | [19]
TRMT1 | 25980 | 1 | Iranian | I230fsX | NS-ID | | [22]
UBR7 | 20344 | 1 | Iranian | N124S | NS-ID ASD | | [22]
ZCCHC8 | 25265 | 1 | Iranian | L90X | NS-ID | | [22]
ZNF526 | 29415 | 2 | Iranian | R459Q | NS-ID | | [22]
 | | | | Q539H | | |

^a Abbreviation: ASD, autism spectrum disorder; other abbreviations are given in Table 1 legend.


Page 37: Trends in Genetics 2014 Jan

defects including compound heterozygosity or mutations in intronic, promoter, or other non-coding sequences could not be detected by this approach. In other families, pathogenic changes may have been overlooked due to overly stringent filtering of sequence variants, which removed all synonymous changes.

In outbred populations, most patients with recessive forms of ID or related disorders will be sporadic cases. ID families are mostly non-consanguineous and only a small proportion have multiple affected siblings [23]. Recently, the first systematic WES study including 19 such families revealed compound heterozygous frameshift changes in the DDHD2 gene, which encodes one of the three mammalian intracellular phospholipases A(1) [24], as well as pathogenic mutations in two known X-linked ID genes. Potentially pathogenic mutations, including three compound heterozygous and two homozygous changes, were identified in five candidate genes not previously implicated in ID [23]. Thus, the diagnostic yield of this study (42%) was only slightly lower than that of the study performed in consanguineous families (57%) [22], although it remains to be seen how many of these findings will be confirmed by validation studies.
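The two diagnostic yields can be reproduced with simple arithmetic. The sketch below assumes that each finding in the outbred cohort corresponds to exactly one family (one family with DDHD2 changes, two with mutations in known X-linked ID genes, and five with candidate-gene mutations); this one-to-one mapping is an assumption made for illustration and is not stated explicitly in the text.

```latex
\[
\text{yield}_{\text{consanguineous}} = \frac{78}{136} \approx 57\%,
\qquad
\text{yield}_{\text{outbred}} = \frac{1 + 2 + 5}{19} = \frac{8}{19} \approx 42\%.
\]
```

The complement of the first figure, 58 of 136 families (about 43%), corresponds to the roughly 40% of consanguineous families in which no causative defect was found.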

Most novel candidates are bona fide ARID genes

Many of the recently reported novel candidate genes are very attractive candidates because of their synapse- or brain-specific function; others involve basic cellular processes which have been repeatedly implicated in ID, such as DNA transcription and translation, protein degradation, mRNA splicing, energy metabolism, or fatty-acid synthesis and turnover [22]. Conclusive proof for their indispensable role in the brain has been obtained for a growing number of these genes through the identification of additional mutations in unrelated families, studies in mouse or fly models, or by other means.

For example, mutations in the LARP7 gene have now been observed in two unrelated families. LARP7 encodes a negative transcriptional regulator of polymerase II genes, acting by means of the 7SK ribonucleoprotein (RNP) system [25]. After the first description [22], a second loss-of-function mutation in LARP7 was described in a family from Saudi Arabia with primordial dwarfism, intellectual disability, and dysmorphic facial features [26].

In a consanguineous family with ID, facial dysmorphisms, and cataracts, a homozygous intragenic CACNA1G deletion was described that is predicted to remove at least 20 amino acids of CACNA1G, abolishing its function. CACNA1G encodes a T-type calcium channel with a crucial role in the generation of GABAB receptor-mediated spike and wave discharges in the thalamo-cortical pathway ([27] and references therein). A second homozygous CACNA1G mutation has been found that removes a single but apparently essential amino acid (F.S. Alkuraya, Riyadh, personal communication) in several members of a previously described Arab family with a severe syndromic form of ARID [28]. Moreover, a de novo deletion removing one copy of the CACNA1G gene has been found in a male patient [29] whose clinical features closely resembled those of Iranian patients with a homozygous intragenic CACNA1G deletion [22].

Mutations in the NSUN2 gene have been identified in five unrelated consanguineous families. Together, these findings revealed the syndromic nature of this condition, which includes characteristic facial features and variable other clinical signs [30–32]. NSUN2 encodes an RNA methyltransferase which methylates cytosine to 5-methylcytosine (m5C) at position 34 of intron-containing tRNA(Leu)(CAA) precursors [33]. A Drosophila model of this condition was generated by deleting the NSUN2 ortholog, which resulted in severe short-term memory (STM) deficits, pointing to an important role of RNA methylation in cognition [30]. NSUN2 is now the third RNA-methyltransferase gene linked to ID. Previously, FTSJ1 (MRX9, MIM #309549) had been implicated in X-linked NS-ID [34], and recently TRMT1, which encodes a tRNA (G26) dimethyltransferase, was identified as a novel candidate gene for ARID [22].

ZC3H14, mutated in a consanguineous family with NS-ARID, is another gene whose indispensable role in the central nervous system has been supported by a Drosophila model. ZC3H14 is the human ortholog of the Drosophila Nab2 protein, which binds to polyadenylated mRNA and restricts the length of the poly(A) tail, and this protein was also found to be indispensable for normal behavior in the fly [17] (see Table 1). ZC3H14 is a new member of the growing list of ID genes with a role in mRNA metabolism, including FMRP, FMR2P, PQBP1, UPF3B, DYRK1A, and CDKL5 ([35] for review).

ARID is extremely heterogeneous and clinically variable

At the time of writing, 40 genes have been implicated in NS-ID (Tables 1 and 2). In 11 of these, apparently pathogenic mutations have been detected in more than one family. A mutation in the neurotrypsin gene (PRSS12) has been found in two apparently unrelated Algerian families with NS-ID (reviewed in [4] and references therein). A mutation in the CC2D1A gene, the product of which regulates expression of the serotonin receptor 1A gene in neuronal cells, had been identified in nine nuclear families and more recently in a Pakistani family ([36] and references therein). Other established ARID genes include TUSC3, which is required for cellular Mg2+ uptake, trafficking protein particle complex 9 (TRAPPC9), and ST3 β-galactoside α-2,3-sialyltransferase 3 (ST3GAL3) ([4] and references therein; [37–40]), MAN1B1, encoding an enzyme which functions in N-glycan biosynthesis [22,41], the transcriptional regulator ZNF526, and ELP2 [22], which encodes a subunit of the RNA polymerase II elongator complex [42]. Finally, CRADD has been identified as a new gene for NS-ARID in affected children from different Old Order Amish and Mennonite sibships [43]. CRADD codes for a caspase recruitment domain and death domain-containing adaptor protein that activates caspase 2 (the product of CASP2, itself a novel candidate gene for NS-ARID [22]) and is required for neuronal apoptosis [44].

For many of the recently described gene defects that give rise to ARID the clinical picture has turned out to be complex and variable. ADK deficiency may lead to NS-ARID or present with severe developmental delay, persistent hypermethioninemia, and mild liver dysfunction [45], and KIF7 mutations have been reported in two different, clinically distinguishable ID-malformation syndromes ([46] and references therein). ARID genes have also been implicated in conditions that are apparently unrelated to


Page 38: Trends in Genetics 2014 Jan

ID, pointing to pleiotropic functions of these genes. For example, overexpression of the fatty acid synthase FASN, a strong positional and functional candidate gene for ARID [22], predisposes to leiomyomatosis [47], and homozygous inactivation of FTO, which encodes an RNA demethylase and has been previously implicated in obesity [48], has been shown to result in severe developmental delay with malformations [49].

A role for recessive factors in epilepsy, autism, and other psychiatric disorders?

ID is frequently associated with psychiatric and/or neurological disorders (reviewed in [50]). Based on the International Classification of Diseases (ICD, 10th revision) it has been estimated that between 14% and 39% of individuals with ID present with comorbid psychiatric diagnoses.

Epilepsy is among the most frequently associated disorders [50], with a frequency ranging from 5.5% to 35%, which is similar to the 20–27% reported by population-based studies of children with epilepsy and some degree of ID ([51] and references therein). In patients with mild to moderate ID its frequency is 15%, but it may exceed 30% if the ID is severe or profound [52]. Moreover, epilepsy is seen in about half of the X-linked ID syndromes (reviewed in [53]). A number of well-known genetic disorders share ID, epilepsy, and autism as prominent clinical features, including tuberous sclerosis, Rett syndrome, and fragile X [54].

In recent years, the contribution of structural genome variation to epilepsy has become increasingly evident. The most common CNVs associated with epilepsy, at 15q13.3, 15q11.2, and 16p13.11, also confer susceptibility for learning disabilities (reviewed in [8] and references therein), suggesting that common genetic factors could have a causative role. NGS has also been instrumental in identifying genes for recessive syndromes encompassing epilepsy and ID. One of the earliest applications of this technology was the identification of homozygous and compound heterozygous changes in the TBC1D24 gene, which encodes a Rab GTPase activator, in an Arab family with seizures and ID [55], and in an Italian family with infantile myoclonic epilepsy (MIM #605021) [56], respectively. Since then, two additional families with recessive mutations in TBC1D24 have been described with early infantile epileptic encephalopathy 16 (MIM #615338) [57,58].

Recently, homozygous frameshift mutations in the PRRT2 (proline-rich transmembrane protein 2) gene have been identified in two families with ID and epilepsy in infancy [22,59], whereas heterozygous truncating and missense mutations were shown to cause dominant infantile epilepsy (MIM #605751) and episodic dyskinesia (MIM #128200) ([60] and references therein). These findings again highlight the stunning clinical variability of mutations involving the same gene.

In patients with ID, autistic signs are also common, and most patients with autism have some degree of cognitive impairment [61]. Numerous genetic defects have been implicated in ID and autism, including mutations in X-linked genes (e.g., NLG3, NLG4 [62,63], TMLHE [64]), CNVs (reviewed in [8]), or apparently dominant de novo mutations ([65] and references therein), and it is likely that the strong comorbidity between ID and other major psychiatric disorders [66] is also due, at least in part, to shared genetic factors (e.g., [67–69]).

In several genes for autosomal recessive ID, including KDM6B (lysine demethylase 6B), MED13L, which encodes a subunit of the mediator complex, and nudE nuclear distribution E homolog 1 (NDE1) [22,70], dominant de novo mutations or loss of one entire gene copy have been described in autistic patients [71,72]. Nevertheless, there is little direct evidence for a causative role of recessive gene defects in autism and other psychiatric disorders.

In part this may be due to the focus of autism and schizophrenia research on common genetic risk factors (e.g., [73]) and, recently, on dominant de novo mutations (e.g., [71,74–76]), which may account for about 20% of the sporadic cases [65]. Array CGH studies in consanguineous families have identified several homozygous deletions encompassing autism candidate genes, one of which was also found to be mutated in a non-consanguineous family with autism spectrum disorder (ASD) [77]. In simplex ASD families, affected individuals with IQ < 70 have longer homozygous segments in their genome than unaffected siblings, but probands with an IQ > 70 do not show this excess. Thus, long stretches of homozygosity may confer susceptibility to autism with low IQ or to low IQ alone [78].
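To make the notion of "long stretches of homozygosity" concrete, the toy sketch below scans ordered SNP genotypes along one chromosome and reports runs of consecutive homozygous calls. It is only an illustration under simple assumptions (two-character genotype strings, no missing calls, a hypothetical minimum run size); it is not the method used in [78], and the function names are invented for this example.

```python
def homozygous_runs(genotypes, positions, min_snps=3):
    """Return (start_pos, end_pos, n_snps) for runs of consecutive homozygous calls.

    genotypes: two-character strings such as "AA" or "AG", ordered along one chromosome.
    positions: matching list of base-pair coordinates.
    min_snps:  hypothetical minimum number of markers for a run to be reported.
    """
    runs = []
    start = None  # index at which the current run began, or None if not in a run
    for i, call in enumerate(genotypes):
        is_hom = len(call) == 2 and call[0] == call[1]
        if is_hom and start is None:
            start = i
        elif not is_hom and start is not None:
            if i - start >= min_snps:
                runs.append((positions[start], positions[i - 1], i - start))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions[start], positions[-1], len(genotypes) - start))
    return runs


def total_run_length(runs):
    """Sum of run lengths in base pairs."""
    return sum(end - begin for begin, end, _ in runs)
```

Comparing `total_run_length` between affected individuals and their unaffected siblings would be one crude way to express the excess described above; real analyses additionally handle missing genotypes, genotyping errors, and marker density.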

Homozygous, compound heterozygous, or homozygous hypomorphic mutations in disease genes are known to associate with monogenic autosomal or X-linked recessive neurodevelopmental disorders, and potentially causative mutations in candidate genes were found in consanguineous and outbred ASD families by WES [79,80]. This finding, together with the identification of a twofold increase in rare complete knockout mutations in ASD patients compared to controls [81], provides convincing evidence that autosomal recessive gene defects play a role in autism, but their frequency is still unknown.

Finally, consanguinity has also been suggested as a risk factor for bipolar disorder and schizophrenia [82,83], but even less is known about the contribution of autosomal recessive mutations to the pathogenesis of these diseases.

How frequent are recessive forms of ID?
In the small families of outbred Western societies, most patients with recessive forms of ID or related disorders will be sporadic cases. If couples with offspring have two children on average, which is close to the actual situation in Europe ([84]; M. Kreyenfeld, Rostock, personal communication), only one of four patients will have an affected sibling and will be identified as a familial case. In Central Europe, between 3.3 and 6% of patients with ID referred to genetic services are familial cases [10,23]. Taken at face value, this suggests that recessive forms of ID account for 13–24% of the cases in Europe. However, this may be an overestimate because it is based on the assumption that parents with a single affected child will be as likely to seek genetic advice as parents with two or more affected children, which is probably not true.
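The arithmetic behind the 13–24% figure can be made explicit with a minimal sketch. It assumes, as the paragraph does, two children per family, a sibling recurrence risk of one in four, that all familial cases are recessive, and equal referral of sporadic and familial cases; the function name is hypothetical.

```python
def implied_recessive_share(familial_share, sib_recurrence_risk=0.25):
    """Scale the observed familial share up to an implied recessive share.

    Assumption: each recessive patient has exactly one sibling, who is
    affected with probability sib_recurrence_risk, so only that fraction
    of recessive patients is recognised as familial.
    """
    return familial_share / sib_recurrence_risk


# Familial shares reported for Central Europe in the text: 3.3% and 6%.
for familial in (0.033, 0.06):
    share = implied_recessive_share(familial)
    print(f"familial {familial:.1%} -> implied recessive share {share:.1%}")
```

Dividing by the recurrence risk is what turns the 3.3–6% of familial cases into the 13–24% range; any referral bias towards multiplex families would lower the result.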

Given the low rate of parental consanguinity in developed countries, most patients with ARID are expected to be compound heterozygotes carrying two different disease-causing alleles [85]. This is in keeping with a recent study focusing on dominant de novo mutations in sporadic ID [86]. No homozygous disease-causing mutation was found in 51 sporadic patients, but some carried two allelic and probably pathogenic mutations in functional candidate genes, suggesting that a minor proportion of the cases may be due to ARID. However, the true proportion of ARID must be higher because familial cases and consanguineous families were not included in this study, some compound heterozygotes may have been overlooked because they are more difficult to detect by NGS, and mutations in non-coding DNA have not been taken into consideration. Detectable and submicroscopic chromosomal rearrangements account for approximately 25% of all individuals with severe ID, and X-linked factors are thought to be responsible for 10–12%. De novo mutations have been found in 16 and 31%, respectively, of sporadic patients [11,86], but their true frequency may be even higher. Taken together, in outbred populations ARID may account for about 10–20% of the cases, which leaves room for oligogenic/polygenic forms of ID, which have been the subject of a recent review [87].

In populations where parental consanguinity is common, autosomal recessive gene defects must be an even more important cause of ID. In families from the Middle East, autosomal recessive disorders were found to be almost threefold more frequent among inbred than among non-inbred cases [88]. In Jordan, autosomal recessive inheritance was observed in 32% of the families counseled and, of the ~27% sporadic cases without a definite diagnosis, 30% were also ascribed to autosomal recessive gene defects [89]. Thus, in these countries, ARID should be the most common genetic cause of ID – and a particularly promising target for diagnosis and prevention.

Implications for research and healthcare
Despite the remarkable progress in the elucidation of autosomal recessive forms of ID, it is likely that the several hundred genes already implicated in syndromic or non-syndromic ARID (see [90] and references therein) and related disorders are vastly outnumbered by the many ARID genes still waiting to be found. Considering that on the X chromosome alone, which carries 4% of all human genes, already more than 100 ID genes have been identified [6,7], there should be at least 2500 autosomal ID genes, and most of the novel forms of ID should be autosomal recessive, which is supported by functional considerations and evidence from model organisms.
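The extrapolation from the X chromosome can be reproduced with simple proportional scaling. The sketch below assumes that ID genes are distributed across chromosomes in proportion to overall gene content, which the argument implies but does not state; the function name and the example count of 105 are hypothetical.

```python
def extrapolate_id_genes(x_linked_id_genes, x_share_percent=4):
    """Scale the X-linked ID gene count to the whole genome.

    Assumption: the density of ID genes on the autosomes equals that on
    the X chromosome, which carries x_share_percent of all human genes.
    Returns (genome-wide estimate, autosomal estimate).
    """
    genome_wide = x_linked_id_genes * 100 / x_share_percent
    autosomal = genome_wide - x_linked_id_genes
    return genome_wide, autosomal


# 100 is the lower bound given in the text; 105 is an illustrative value
# standing in for "more than 100".
for count in (100, 105):
    total, autosomal = extrapolate_id_genes(count)
    print(f"{count} X-linked genes -> {total:.0f} genome-wide, {autosomal:.0f} autosomal")
```

With exactly 100 X-linked genes the scaling gives about 2500 ID genes genome-wide; because more than 100 have been identified, the autosomal estimate rises accordingly.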

NGS in families with two or more affected individuals has proven to be an extraordinarily effective approach for identifying novel recessive causes of ID, and international collaborations including the GENCODYS consortium (http://www.gencodys.eu/) have set out to identify the molecular causes of ARID in a systematic fashion. Although autozygosity mapping followed by targeted exon sequencing [22] is a successful and cost-effective strategy for finding causative gene defects in consanguineous families, it will only detect homozygous mutations. However, in families from Western industrialized countries, compound heterozygous mutations are common, and even in countries with frequent parental consanguinity, compound heterozygosity is not rare (H. Najmabadi, Tehran, personal communication). This argues for using WES as a more comprehensive strategy to elucidate novel causes of ARID, even though, as with targeted exon sequencing, it will miss most non-exonic mutations. Intronic changes [91] and mutations in non-coding regulatory sequences are only detectable by whole-genome sequencing (WGS), and another advantage of WGS is its more even coverage. This is why WGS does not require very high sequencing depths, and it may soon become an affordable alternative to WES.
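The difference between searching only for homozygous changes and also considering compound heterozygosity can be illustrated with a minimal sketch that screens per-gene variant calls for both recessive patterns. The input format, gene labels, and function name are hypothetical and do not correspond to any specific pipeline. Two heterozygous variants in one gene are only a candidate, because confirming compound heterozygosity requires showing that the variants lie on different parental alleles.

```python
from collections import defaultdict


def candidate_recessive_genes(variants):
    """Flag genes whose calls fit a recessive pattern in one patient.

    variants: iterable of (gene, variant_id, genotype) tuples, where
    genotype is 'hom' (homozygous) or 'het' (heterozygous); other
    genotype labels are ignored.
    """
    by_gene = defaultdict(lambda: {"hom": set(), "het": set()})
    for gene, variant_id, genotype in variants:
        if genotype in ("hom", "het"):
            by_gene[gene][genotype].add(variant_id)

    candidates = {}
    for gene, calls in by_gene.items():
        if calls["hom"]:
            candidates[gene] = "homozygous"
        elif len(calls["het"]) >= 2:
            # Phase is unknown here; parental genotypes are needed to confirm.
            candidates[gene] = "possible compound heterozygous"
    return candidates


# Hypothetical calls for a single patient.
calls = [
    ("GENE_A", "var1", "hom"),
    ("GENE_B", "var2", "het"),
    ("GENE_B", "var3", "het"),
    ("GENE_C", "var4", "het"),
]
print(candidate_recessive_genes(calls))
```

A screen restricted to homozygous calls would report only GENE_A in this example, which is the limitation of autozygosity-based strategies described above.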

Increasingly, WES has been proposed as a comprehensive diagnostic tool for detecting mutations in patients with ID and related disorders [92,93]. In a diagnostic setting, targeted NGS-based tests encompassing all genes implicated in ID or related disorders could be equally useful, but much cheaper and easier to read, and they will not yield any unsolicited results. Targeted tests of this kind have been developed for a variety of genetically heterogeneous conditions such as deafness and blindness, and a broad test for severe recessive childhood diseases is already routinely employed in healthcare [94,95].

Concluding remarks
After having been disregarded for a long time, recessive gene defects are being discovered at a rapid pace as important causes of ID. Comprehensive and affordable tests to rule out all known forms of ARID will have a major effect on the diagnosis and prevention of ID, not only in developing countries where parental consanguinity is common but also elsewhere.

Acknowledgments
We thank Hossein Najmabadi and Kimia Kahrizi, Hao Hu, Masoud Garshasbi, Andreas Kuss, Wei Chen, and Thomas Wienker for their essential contributions to our past and ongoing ARID research, and Gabriele Eder for help with the preparation of the manuscript. This work was supported by the Max Planck Society and by the European Commission Framework Program 7 (FP7) project GENCODYS, grant no. 241995 (coordinator: Hans van Bokhoven, Nijmegen).

References

1 Schalock, R.L. et al. (2007) The renaming of mental retardation: understanding the change to the term intellectual disability. Intellect. Dev. Disabil. 45, 116–124

2 American Psychiatric Association (2000) Diagnostic and Statistical Manual of Mental Disorders (4th Revision: DSM-IV-TR), American Psychiatric Association

3 Leonard, H. and Wen, X. (2002) The epidemiology of mental retardation: challenges and opportunities in the new millennium. Ment. Retard. Dev. Disabil. Res. Rev. 8, 117–134

4 Ropers, H.H. (2010) Genetics of early onset cognitive impairment. Annu. Rev. Genomics Hum. Genet. 11, 161–187

5 Kleefstra, T. and Hamel, B.C. (2005) X-linked mental retardation: further lumping, splitting and emerging phenotypes. Clin. Genet. 67, 451–467

6 Lubs, H.A. et al. (2012) Fragile X and X-linked intellectual disability: four decades of discovery. Am. J. Hum. Genet. 90, 579–590

7 Piton, A. et al. (2013) XLID-causing mutations and associated genes challenged in light of data from large-scale human exome sequencing. Am. J. Hum. Genet. 93, 368–383

8 Mefford, H.C. et al. (2012) Genomics, intellectual disability, and autism. N. Engl. J. Med. 366, 733–743

9 Veltman, J.A. and Brunner, H.G. (2012) De novo mutations in human genetic disease. Nat. Rev. Genet. 13, 565–575

10 Rauch, A. et al. (2006) Diagnostic yield of various genetic approaches in patients with unexplained developmental delay or mental retardation. Am. J. Med. Genet. A 140, 2063–2074


11 de Ligt, J. et al. (2012) Diagnostic exome sequencing in persons with severe intellectual disability. N. Engl. J. Med. 367, 1921–1929

12 Lander, E.S. and Botstein, D. (1987) Homozygosity mapping: a way to map human recessive traits with the DNA of inbred children. Science 236, 1567–1570

13 Najmabadi, H. et al. (2007) Homozygosity mapping in consanguineous families reveals extreme heterogeneity of non-syndromic autosomal recessive mental retardation and identifies 8 novel gene loci. Hum. Genet. 121, 43–48

14 Kuss, A.W. et al. (2011) Autosomal recessive mental retardation: homozygosity mapping identifies 27 single linkage intervals, at least 14 novel loci and several mutation hotspots. Hum. Genet. 129, 141–148

15 Abou Jamra, R. et al. (2011) Homozygosity mapping in 64 Syrian consanguineous families with non-specific intellectual disability reveals 11 novel loci and high heterogeneity. Eur. J. Hum. Genet. 19, 1161–1166

16 Duman, D. and Tekin, M. (2012) Autosomal recessive nonsyndromic deafness genes: a review. Front. Biosci. 17, 2213–2236

17 Pak, C. et al. (2011) Mutation of the conserved polyadenosine RNA binding protein, ZC3H14/dNab2, impairs neural function in Drosophila and humans. Proc. Natl. Acad. Sci. U.S.A. 108, 12390–12395

18 Hashimoto, S. et al. (2011) MED23 mutation links intellectual disability to dysregulation of immediate early gene expression. Science 333, 1161–1163

19 Caliskan, M. et al. (2011) Exome sequencing reveals a novel mutation for autosomal recessive non-syndromic mental retardation in the TECR gene on chromosome 19p13. Hum. Mol. Genet. 20, 1285–1289

20 Chong, J.X. et al. (2012) A population-based study of autosomal-recessive disease-causing mutations in a founder population. Am. J. Hum. Genet. 91, 608–620

21 Meloni, I. et al. (2002) FACL4, encoding fatty acid-CoA ligase 4, is mutated in nonspecific X-linked mental retardation. Nat. Genet. 30, 436–440

22 Najmabadi, H. et al. (2011) Deep sequencing reveals 50 novel genes for recessive cognitive disorders. Nature 478, 57–63

23 Schuurs-Hoeijmakers, J.H.M. (2012) Gene Identification in Intellectual Disability, Radboud University Nijmegen

24 Schuurs-Hoeijmakers, J.H. et al. (2012) Mutations in DDHD2, encoding an intracellular phospholipase A(1), cause a recessive form of complex hereditary spastic paraplegia. Am. J. Hum. Genet. 91, 1073–1081

25 Markert, A. et al. (2008) The La-related protein LARP7 is a component of the 7SK ribonucleoprotein and affects transcription of cellular and viral polymerase II genes. EMBO Rep. 9, 569–575

26 Alazami, A.M. et al. (2012) Loss of function mutation in LARP7, chaperone of 7SK ncRNA, causes a syndrome of facial dysmorphism, intellectual disability, and primordial dwarfism. Hum. Mutat. 33, 1429–1434

27 Zamponi, G.W. et al. (2010) Role of voltage-gated calcium channels in epilepsy. Pflugers Arch. 460, 395–403

28 Al-Owain, M. et al. (2011) An autosomal recessive syndrome of severe cognitive impairment, dysmorphic facies and skeletal abnormalities maps to the long arm of chromosome 17. Clin. Genet. 80, 489–492

29 Preiksaitiene, E. et al. (2012) A novel de novo 1.8 Mb microdeletion of 17q21.33 associated with intellectual disability and dysmorphic features. Eur. J. Med. Genet. 55, 656–659

30 Abbasi-Moheb, L. et al. (2012) Mutations in NSUN2 cause autosomal-recessive intellectual disability. Am. J. Hum. Genet. 90, 847–855

31 Khan, M.A. et al. (2012) Mutation in NSUN2, which encodes an RNA methyltransferase, causes autosomal-recessive intellectual disability. Am. J. Hum. Genet. 90, 856–863

32 Martinez, F.J. et al. (2012) Whole exome sequencing identifies a splicing mutation in NSUN2 as a cause of a Dubowitz-like syndrome. J. Med. Genet. 49, 380–385

33 Brzezicha, B. et al. (2006) Identification of human tRNA:m5C methyltransferase catalysing intron-dependent m5C formation in the first position of the anticodon of the pre-tRNA Leu (CAA). Nucleic Acids Res. 34, 6034–6043

34 Freude, K. et al. (2004) Mutations in the FTSJ1 gene coding for a novel S-adenosylmethionine-binding protein cause nonsyndromic X-linked mental retardation. Am. J. Hum. Genet. 75, 305–309

35 Bardoni, B. et al. (2012) Intellectual disabilities, neuronal posttranscriptional RNA metabolism, and RNA-binding proteins: three actors for a complex scenario. Prog. Brain Res. 197, 29–51

36 Noor, A. et al. (2008) CC2D2A, encoding a coiled-coil and C2 domain protein, causes autosomal-recessive mental retardation with retinitis pigmentosa. Am. J. Hum. Genet. 82, 1011–1018

37 Khan, M.A. et al. (2011) A novel deletion mutation in the TUSC3 gene in a consanguineous Pakistani family with autosomal recessive nonsyndromic intellectual disability. BMC Med. Genet. 12, 56

38 Loddo, S. et al. (2013) Homozygous deletion in TUSC3 causing syndromic intellectual disability: a new patient. Am. J. Med. Genet. A 161, 2084–2087

39 Hu, H. et al. (2011) ST3GAL3 mutations impair the development of higher cognitive functions. Am. J. Hum. Genet. 89, 407–414

40 Marangi, G. et al. (2013) TRAPPC9-related autosomal recessive intellectual disability: report of a new mutation and clinical phenotype. Eur. J. Hum. Genet. 21, 229–232

41 Rafiq, M.A. et al. (2011) Mutations in the alpha 1,2-mannosidase gene, MAN1B1, cause autosomal-recessive intellectual disability. Am. J. Hum. Genet. 89, 176–182

42 Winkler, G.S. et al. (2001) RNA polymerase II elongator holoenzyme is composed of two discrete subcomplexes. J. Biol. Chem. 276, 32743–32749

43 Puffenberger, E.G. et al. (2012) Genetic mapping and exome sequencing identify variants associated with five novel diseases. PLoS ONE 7, e28936

44 Ribe, E.M. et al. (2012) Neuronal caspase 2 activity and function requires RAIDD, but not PIDD. Biochem. J. 444, 591–599

45 Bjursell, M.K. et al. (2011) Adenosine kinase deficiency disrupts the methionine cycle and causes hypermethioninemia, encephalopathy, and abnormal liver function. Am. J. Hum. Genet. 89, 507–515

46 Ali, B.R. et al. (2012) A mutation in KIF7 is responsible for the autosomal recessive syndrome of macrocephaly, multiple epiphyseal dysplasia and distinctive facial appearance. Orphanet J. Rare Dis. 7, 27

47 Eggert, S.L. et al. (2012) Genome-wide linkage and association analyses implicate FASN in predisposition to uterine leiomyomata. Am. J. Hum. Genet. 91, 621–628

48 Frayling, T.M. et al. (2007) A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889–894

49 Boissel, S. et al. (2009) Loss-of-function mutation in the dioxygenase-encoding FTO gene causes severe growth retardation and multiple malformations. Am. J. Hum. Genet. 85, 106–111

50 Oeseburg, B. et al. (2011) Prevalence of chronic health conditions in children with intellectual disability: a systematic literature review. Intellect. Dev. Disabil. 49, 59–85

51 Berg, A.T. and Plioplys, S. (2012) Epilepsy and autism: is there a special relationship? Epilepsy Behav. 23, 193–198

52 Prince, E. and Ring, H. (2011) Causes of learning disability and epilepsy: a review. Curr. Opin. Neurol. 24, 154–158

53 Stevenson, R.E. et al. (2012) Seizures and X-linked intellectual disability. Eur. J. Med. Genet. 55, 307–312

54 Leung, H.T. and Ring, H. (2013) Epilepsy in four genetically determined syndromes of intellectual disability. J. Intellect. Disabil. Res. 57, 3–20

55 Corbett, M.A. et al. (2010) A focal epilepsy and intellectual disability syndrome is due to a mutation in TBC1D24. Am. J. Hum. Genet. 87, 371–375

56 Falace, A. et al. (2010) TBC1D24, an ARF6-interacting protein, is mutated in familial infantile myoclonic epilepsy. Am. J. Hum. Genet. 87, 365–370

57 Guven, A. and Tolun, A. (2013) TBC1D24 truncating mutation resulting in severe neurodegeneration. J. Med. Genet. 50, 199–202

58 Milh, M. et al. (2013) Novel compound heterozygous mutations in TBC1D24 cause familial malignant migrating partial seizures of infancy. Hum. Mutat. 34, 869–872

59 Labate, A. et al. (2012) Homozygous c.649dupC mutation in PRRT2 worsens the BFIS/PKD phenotype with mental retardation, episodic ataxia, and absences. Epilepsia 53, e196–e199

60 Helbig, I. and Lowenstein, D.H. (2013) Genetics of the epilepsies: where are we and where are we going? Curr. Opin. Neurol. 26, 179–185



Trends in Genetics January 2014, Vol. 30, No. 1

35 Bardoni, B. et al. (2012) Intellectual disabilities, neuronal posttranscriptional RNA metabolism, and RNA-binding proteins: three actors for a complex scenario. Prog. Brain Res. 197, 29–51

36 Noor, A. et al. (2008) CC2D2A, encoding a coiled-coil and C2 domain protein, causes autosomal-recessive mental retardation with retinitis pigmentosa. Am. J. Hum. Genet. 82, 1011–1018

37 Khan, M.A. et al. (2011) A novel deletion mutation in the TUSC3 gene in a consanguineous Pakistani family with autosomal recessive nonsyndromic intellectual disability. BMC Med. Genet. 12, 56

38 Loddo, S. et al. (2013) Homozygous deletion in TUSC3 causing syndromic intellectual disability: a new patient. Am. J. Med. Genet. A 161, 2084–2087

39 Hu, H. et al. (2011) ST3GAL3 mutations impair the development of higher cognitive functions. Am. J. Hum. Genet. 89, 407–414

40 Marangi, G. et al. (2013) TRAPPC9-related autosomal recessive intellectual disability: report of a new mutation and clinical phenotype. Eur. J. Hum. Genet. 21, 229–232

41 Rafiq, M.A. et al. (2011) Mutations in the alpha 1,2-mannosidase gene, MAN1B1, cause autosomal-recessive intellectual disability. Am. J. Hum. Genet. 89, 176–182

42 Winkler, G.S. et al. (2001) RNA polymerase II elongator holoenzyme is composed of two discrete subcomplexes. J. Biol. Chem. 276, 32743–32749

43 Puffenberger, E.G. et al. (2012) Genetic mapping and exome sequencing identify variants associated with five novel diseases. PLoS ONE 7, e28936

44 Ribe, E.M. et al. (2012) Neuronal caspase 2 activity and function requires RAIDD, but not PIDD. Biochem. J. 444, 591–599

45 Bjursell, M.K. et al. (2011) Adenosine kinase deficiency disrupts the methionine cycle and causes hypermethioninemia, encephalopathy, and abnormal liver function. Am. J. Hum. Genet. 89, 507–515

46 Ali, B.R. et al. (2012) A mutation in KIF7 is responsible for the autosomal recessive syndrome of macrocephaly, multiple epiphyseal dysplasia and distinctive facial appearance. Orphanet J. Rare Dis. 7, 27

47 Eggert, S.L. et al. (2012) Genome-wide linkage and association analyses implicate FASN in predisposition to uterine leiomyomata. Am. J. Hum. Genet. 91, 621–628

48 Frayling, T.M. et al. (2007) A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316, 889–894

49 Boissel, S. et al. (2009) Loss-of-function mutation in the dioxygenase-encoding FTO gene causes severe growth retardation and multiple malformations. Am. J. Hum. Genet. 85, 106–111

50 Oeseburg, B. et al. (2011) Prevalence of chronic health conditions in children with intellectual disability: a systematic literature review. Intellect. Dev. Disabil. 49, 59–85

51 Berg, A.T. and Plioplys, S. (2012) Epilepsy and autism: is there a special relationship? Epilepsy Behav. 23, 193–198

52 Prince, E. and Ring, H. (2011) Causes of learning disability and epilepsy: a review. Curr. Opin. Neurol. 24, 154–158

53 Stevenson, R.E. et al. (2012) Seizures and X-linked intellectual disability. Eur. J. Med. Genet. 55, 307–312

54 Leung, H.T. and Ring, H. (2013) Epilepsy in four genetically determined syndromes of intellectual disability. J. Intellect. Disabil. Res. 57, 3–20

55 Corbett, M.A. et al. (2010) A focal epilepsy and intellectual disability syndrome is due to a mutation in TBC1D24. Am. J. Hum. Genet. 87, 371–375

56 Falace, A. et al. (2010) TBC1D24, an ARF6-interacting protein, is mutated in familial infantile myoclonic epilepsy. Am. J. Hum. Genet. 87, 365–370

57 Guven, A. and Tolun, A. (2013) TBC1D24 truncating mutation resulting in severe neurodegeneration. J. Med. Genet. 50, 199–202

58 Milh, M. et al. (2013) Novel compound heterozygous mutations in TBC1D24 cause familial malignant migrating partial seizures of infancy. Hum. Mutat. 34, 869–872

59 Labate, A. et al. (2012) Homozygous c.649dupC mutation in PRRT2 worsens the BFIS/PKD phenotype with mental retardation, episodic ataxia, and absences. Epilepsia 53, e196–e199

60 Helbig, I. and Lowenstein, D.H. (2013) Genetics of the epilepsies: where are we and where are we going? Curr. Opin. Neurol. 26, 179–185

Page 41: Trends in Genetics 2014 Jan

61 Fombonne, E. (2003) Epidemiological surveys of autism and other pervasive developmental disorders: an update. J. Autism Dev. Disord. 33, 365–382

62 Jamain, S. et al. (2003) Mutations of the X-linked genes encoding neuroligins NLGN3 and NLGN4 are associated with autism. Nat. Genet. 34, 27–29

63 Laumonnier, F. et al. (2004) X-linked mental retardation and autism are associated with a mutation in the NLGN4 gene, a member of the neuroligin family. Am. J. Hum. Genet. 74, 552–557

64 Celestino-Soper, P.B. et al. (2012) A common X-linked inborn error of carnitine biosynthesis may be a risk factor for nondysmorphic autism. Proc. Natl. Acad. Sci. U.S.A. 109, 7974–7981

65 Jiang, Y.H. et al. (2013) Detection of clinically relevant genetic variants in autism spectrum disorder by whole-genome sequencing. Am. J. Hum. Genet. 93, 249–263

66 Morgan, V.A. et al. (2008) Intellectual disability co-occurring with schizophrenia and other psychiatric illness: population-based study. Br. J. Psychiatry 193, 364–372

67 Kirov, G. et al. (2009) Support for the involvement of large copy number variants in the pathogenesis of schizophrenia. Hum. Mol. Genet. 18, 1497–1503

68 de Kovel, C.G. et al. (2010) Recurrent microdeletions at 15q11.2 and 16p13.11 predispose to idiopathic generalized epilepsies. Brain 133, 23–32

69 Betancur, C. (2011) Etiological heterogeneity in autism spectrum disorders: more than 100 genetic and genomic disorders and still counting. Brain Res. 1380, 42–77

70 Guven, A. et al. (2012) Novel NDE1 homozygous mutation resulting in microhydranencephaly and not microlyssencephaly. Neurogenetics 13, 189–194

71 Iossifov, I. et al. (2012) De novo gene disruptions in children on the autistic spectrum. Neuron 74, 285–299

72 Ullmann, R. et al. (2007) Array CGH identifies reciprocal 16p13.1 duplications and deletions that predispose to autism and/or mental retardation. Hum. Mutat. 28, 674–682

73 Nicolson, R. and Szatmari, P. (2003) Genetic and neurodevelopmental influences in autistic disorder. Can. J. Psychiatry 48, 526–537

74 Sanders, S.J. et al. (2012) De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241

75 O’Roak, B.J. et al. (2012) Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250

76 Neale, B.M. et al. (2012) Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–245

77 Morrow, E.M. et al. (2008) Identifying autism loci and genes by tracing recent shared ancestry. Science 321, 218–223

78 Gamsiz, E.D. et al. (2013) Intellectual disability is associated with increased runs of homozygosity in simplex autism. Am. J. Hum. Genet. 93, 103–109

79 Yu, T.W. et al. (2013) Using whole-exome sequencing to identify inherited causes of autism. Neuron 77, 259–273

80 Chahrour, M.H. et al. (2012) Whole-exome sequencing and homozygosity analysis implicate depolarization-regulated neuronal genes in autism. PLoS Genet. 8, e1002635

81 Lim, E.T. et al. (2013) Rare complete knockouts in humans: population distribution and significant role in autism spectrum disorders. Neuron 77, 235–242

82 Mansour, H.A. et al. (2009) Association study of 21 circadian genes with bipolar I disorder, schizoaffective disorder, and schizophrenia. Bipolar Disord. 11, 701–710

83 Mansour, H. et al. (2010) Consanguinity and increased risk for schizophrenia in Egypt. Schizophr. Res. 120, 108–112

84 Dratva, J. et al. (2007) Variability of reproductive history across the Swiss SAPALDIA cohort – patterns and main determinants. Ann. Hum. Biol. 34, 437–453

85 Ten Kate, L.P. et al. (2010) Autosomal recessive disease in children of consanguineous parents: inferences from the proportion of compound heterozygotes. J. Community Genet. 1, 37–40

86 Rauch, A. et al. (2012) Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 380, 1674–1682

87 Ellison, J.W. et al. (2013) Genetic basis of intellectual disability. Annu. Rev. Med. 64, 441–450

88 Hoodfar, E. and Teebi, A.S. (1996) Genetic referrals of Middle Eastern origin in a western city: inbreeding and disease profile. J. Med. Genet. 33, 212–215

89 Hamamy, H.A. et al. (2007) Consanguinity and genetic disorders. Profile from Jordan. Saudi Med. J. 28, 1015–1017

90 Zoghbi, H.Y. and Warren, S.T. (2010) Neurogenetics: advancing the ‘next-generation’ of brain research. Neuron 68, 165–173

91 Lynch, M. (2010) Rate, molecular spectrum, and consequences of human mutation. Proc. Natl. Acad. Sci. U.S.A. 107, 961–968

92 Vissers, L.E. et al. (2010) A de novo paradigm for mental retardation. Nat. Genet. 42, 1109–1112

93 Dixon-Salazar, T.J. et al. (2012) Exome sequencing can improve diagnosis and alter patient management. Sci. Transl. Med. 4, 138ra178

94 Bell, C.J. et al. (2011) Carrier testing for severe childhood recessive diseases by next-generation sequencing. Sci. Transl. Med. 3, 65ra64

95 Kingsmore, S. (2012) Comprehensive carrier screening and molecular diagnostic testing for recessive childhood diseases. PLoS Curr. 1, 1–23 http://dx.doi.org/10.1371/e4f9877ab8ffa9 [Edition1]

96 Bittles, A.H. (2008) A community genetics perspective on consanguineous marriage. Community Genet. 11, 324–330

97 Modell, B. and Darr, A. (2002) Science and society: genetic counselling and customary consanguineous marriage. Nat. Rev. Genet. 3, 225–229

98 Wright, S. (1922) Coefficients of inbreeding and relationship. Am. Nat. 56, 330–338

99 Molinari, F. et al. (2002) Truncating neurotrypsin mutation in autosomal recessive nonsyndromic mental retardation. Science 298, 1779–1781

100 Higgins, J.J. et al. (2004) A mutation in a novel ATP-dependent Lon protease gene in a kindred with mild mental retardation. Neurology 63, 1927–1931

101 Basel-Vanagaite, L. et al. (2006) The CC2D1A, a member of a new gene family with C2 domains, is involved in autosomal recessive non-syndromic mental retardation. J. Med. Genet. 43, 203–210

102 Motazacker, M.M. et al. (2007) A defect in the ionotropic glutamate receptor 6 gene (GRIK2) is associated with autosomal recessive mental retardation. Am. J. Hum. Genet. 81, 792–798

103 Garshasbi, M. et al. (2008) A defect in the TUSC3 gene is associated with autosomal recessive mental retardation. Am. J. Hum. Genet. 82, 1158–1164

104 Molinari, F. et al. (2008) Oligosaccharyltransferase-subunit mutations in nonsyndromic mental retardation. Am. J. Hum. Genet. 82, 1150–1157

105 Mir, A. et al. (2009) Identification of mutations in TRAPPC9, which encodes the NIK- and IKK-beta-binding protein, in nonsyndromic autosomal-recessive mental retardation. Am. J. Hum. Genet. 85, 909–915

106 Philippe, O. et al. (2009) Combination of linkage mapping and microarray-expression analysis identifies NF-kappaB signaling defect as a cause of autosomal-recessive mental retardation. Am. J. Hum. Genet. 85, 903–908

107 Mochida, G.H. et al. (2009) A truncating mutation of TRAPPC9 is associated with autosomal-recessive intellectual disability and postnatal microcephaly. Am. J. Hum. Genet. 85, 897–902

108 Ratan, A. et al. (2013) Comparison of sequencing platforms for single nucleotide variant calls in a human sample. PLoS ONE 8, e55089

109 Ropers, F. et al. (2011) Identification of a novel candidate gene for non-syndromic autosomal recessive intellectual disability: the WASH complex member SWIP. Hum. Mol. Genet. 20, 2585–2590




