

Copy-number variation of functional galectin genes: Studying animal galectin-7 (p53-induced gene 1 in man) and tandem-repeat-type galectins-4 and -9

Herbert Kaltner†1, Anne-Sarah Raschta†, Joachim C Manning, and Hans-Joachim Gabius

Institute of Physiological Chemistry, Faculty of Veterinary Medicine, Ludwig-Maximilians-University Munich, Veterinärstr. 13, 80539 München, Germany

Received on May 12, 2013; revised on June 23, 2013; accepted on July 5, 2013

Galectins are potent adhesion/growth-regulatory effectors with characteristic expression profiles. Understanding the molecular basis of gene regulation in each case requires detailed information on copy number of genes and sequence(s) of their promoter(s). Our report reveals plasticity in this respect between galectins and species. We here describe occurrence of a two-gene constellation for human galectin (Gal)-7 and define current extent of promoter-sequence divergence. Interestingly, cross-species genome analyses also detected single-copy display. Because the regulatory potential will then be different, extrapolations of expression profiles are precluded between respective species pairs. Gal-4 coding in chromosomal vicinity was found to be confined to one gene, whereas copy-number variation also applied to Gal-9. The example of rat Gal-9 teaches the lesson that the presence of multiple bands in Southern blotting despite a single-copy gene constellation is attributable to two pseudogenes. The documented copy-number variability should thus be taken into consideration when studying regulation of galectin genes, in a species and in comparison between species.

Keywords: apoptosis / homology / lectin / phylogenesis / promoter

†These authors contributed equally to this work.

1To whom correspondence should be addressed: Tel: +49-89-2180-3984; Fax: +49-89-2180-992290; e-mail: [email protected]

Glycobiology vol. 23 no. 10 pp. 1152–1163, 2013; doi:10.1093/glycob/cwt052; Advance Access publication on July 9, 2013

© The Author 2013. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected]

Downloaded from https://academic.oup.com/glycob/article-abstract/23/10/1152/1988214 by guest on 30 January 2018

Introduction

The concept of the sugar code ascribes a broad range of functionalities to the glycan chains of cellular glycoconjugates (for overviews, see Gabius 2009). One route of translating carbohydrate-based information into effects is via specific receptors (lectins). This recognition process will, directly or after biosignaling, spawn diverse aspects of cell behavior, e.g., establishing cell adhesion or inducing apoptosis. The apparent physiological significance explains why lectins are widely distributed and a large number, currently 14, of protein folds have developed capacity to bind carbohydrates in animals (Gabius et al. 2011). Once an ancestral gene for such a lectin domain has arisen, genetic diversification can then become operative to start establishing a group of homologous proteins. To what extent the group size can develop has been delineated in detail, for example, for C-type lectins and galectins (Cooper 2002; Houzelstein et al. 2004; Gready and Zelensky 2009). Sequence alterations found between family members account, for instance, for differences in binding carbohydrates, as detectable by different approaches such as chemical mapping, calorimetry or cell/microarray (carbohydrates/glycoproteins) testing for galectins (Solís et al. 1996; Ahmad et al. 2002; Stowell et al. 2008; Kopitz et al. 2010; Krzeminski et al. 2011; Vokhmyanina et al. 2012), so that a set of related but, in certain aspects, distinct effectors can be built (Kasai and Hirabayashi 1996; Boscher et al. 2011; Kaltner and Gabius 2012; Liu et al. 2012). Even a loss of lectin activity can occur, as attested by the plasticity of the C-type domain and the two galectin-related proteins galectin-related inter-fiber protein and galectin-related protein (Cooper 2002; Zhou et al. 2008; Gready and Zelensky 2009).

Along with changes in the coding regions, the promoters of the lectin genes, too, can be subject to introduction of deviations. They then underlie tissue-specific expression profiles and dynamic down- or upregulation. Focusing on galectins, the increasing evidence (i) for cell-type-selective expression and (ii) pronounced context-dependent changes in transcriptional activity of certain genes, e.g., in response to activation within effector/regulatory T cell communication or hormones, to inducers of differentiation or to master regulators of cell growth (Goldstone and Lavin 1991; Lu and Lotan 1999; Saal et al. 2005; André et al. 2007; Amano et al. 2012; Ledeen et al. 2012), as well as (iii) for disease-associated manifestation of single-nucleotide polymorphisms (Pál et al. 2010, 2012; Chen et al. 2012) prompts a detailed promoter analysis and comparison, in the first step by computational means.

Access to genome data makes comprehensive gene detection and sequence comparison possible, also with respect to phylogenesis. Such data will answer questions on the extent of divergence in noncoding regions. The cases of the two homodimeric (proto-type) galectins-1 and -2 (Gal-1/-2), together with their avian orthologs, i.e., chicken (Gallus gallus) galectins (CG)-1/-2, and the paralog pair CG-1A/B, exemplify how markedly different the promoter regions of these single-copy genes have become (Abbott and Feizi 1989; Sakakura et al. 1990; Gitt and Barondes 1991; Chiariotti et al. 2004; Sturm et al. 2004; Lohr et al. 2007; Kaltner et al. 2008). An extension of regulatory potential can be achieved by gene duplication. A second sequence stretch will hereby become available as platform for tailoring new promoter characteristics. This process is a means to fine-tune gene expression. Such a case is realized for human tandem-repeat-type Gal-9 (Matsumoto et al. 1998; Lipkowitz et al. 2001; Cooper 2002), in contrast to the single-copy genes of Gal-1 and -2. In this respect, an indication for the possibility of occurrence of a second gene for homodimeric Gal-7 on human chromosome 19 in the vicinity of the known Gal-7 gene and the gene for tandem-repeat-type Gal-4 (Cooper 2002) attracted our attention to refine the understanding of gene diversity for Gal-7 in man and mammals.

Gal-7, also referred to as p53-induced gene 1 (Polyak et al. 1997), is typically present in stratified epithelia, effective in growth regulation via extra- and intracellular mechanisms, known as antigen in autoimmune diseases as well as relevant to differential diagnosis and prognosis assessment in head and neck tumors (Madsen et al. 1995; Magnaldo et al. 1995; Bernerd et al. 1999; Kuwabara et al. 2002; Kopitz et al. 2003; Sturm et al. 2004; Saussez et al. 2006; Remmelink et al. 2011; Villeneuve et al. 2011; Sarter et al. 2013). By processing genome-sequence data, we not only verified the actual presence of a second gene but also detected sequence identity in the coding region except for a single (silent) deviation. Moreover, we let two search algorithms scrutinize the proximal promoter sequences of both genes for potential transcription-factor-binding sites, after defining the transcription start point(s) (tsp's) experimentally. Because the extent of copy-number variation (CNV) for Gal-7 in mammalian phylogenesis was unknown, we next assessed the genomic representation of its orthologs in 44 species. To broaden comparison among galectins, this work includes mapping the copy number of the Gal-4 gene, located in chromosomal vicinity to the Gal-7 site. Phylogenetic analyses of these two genes were supplemented by examining Gal-9, encoded by three functional genes in the human genome. The disclosed interspecies plasticity can preclude extrapolations of expression profiles of galectins between species with CNV.

Results and discussion

Detection of a second gene for human Gal-7

Following the assumption that the region on human chromosome 19q13.1-13.2, which harbors the genes for Gal-4 and -7 as well as a cluster starting with Gal-10 (Charcot-Leyden crystal protein) followed by a series of paralogs, “may include a second copy for Gal-7” (Cooper 2002), we put this special section under scrutiny. Indeed, we detected a second, fully sequenced copy in close vicinity to the known gene (and no evidence for Gal-4 gene duplication; see below for details) (Figure 1A). The divergent orientation in the gene pair, markedly less frequent than the head-to-tail tandem (Graham 1995; Shoja and Zhang 2006), may have arisen from different mechanisms and survived the threat of duplication-deletion events (Hastings et al. 2009; Weischenfeldt et al. 2013). To exclude its nature as pseudogene, the two sequences were compared in detail, carefully checking the presence of start/stop codons and the gene architecture. The fine-structure mapping of exon/intron display of both genes, shown in Figure 1B, reveals the characteristic design with four exons, with disparity in orientation. These two coding sequences differed exclusively in position 144 (guanine (G) in lectin, galactoside-binding, soluble 7 (LGALS7), cytosine (C) in LGALS7B; no change in amino acid sequence) so that the complete set of amino acids making contact to carbohydrate ligands, as defined by X-ray crystallography and modeling (Leonidas et al. 1998; André et al. 2005), is conserved. No premature termination site was present in the LGALS7B gene. In contrast, such sites are common in the cluster mentioned above on the same chromosome, to which the Gal-13 gene belongs, here especially at the site encoding for amino acid 55 in a total of 18 pseudogenes (Than et al. 2009).

Fig. 1. (A) Schematic map for the localization of the second gene for Gal-7 and its environment on human chromosome 19. Orientation of the two genes is defined by arrows; box lengths are not drawn to scale. ACTN4, actinin α4; CAPN12, calpain 12; ECH1, enoyl CoA hydratase 1, peroxisomal. (B) Structural organization of the two human genes for Gal-7 (for relative orientation on the chromosome, see A). Lengths of exons (together with information on untranslated sections in parentheses), the complete gene and the space between the two genes are given. The carbohydrate-binding site is encoded in exon 3.

We applied the 5′-rapid amplification of cDNA ends methodology to identify the tsp's. Two respective sites were detected by analysis of 10 different clones (Figure 2). In detail, five clones each from this total of 10 presented a sequence for tsp1a or tsp1b, respectively, with no other sequence seen. The length of the 5′-untranslated stretch is thus comparable, for example, with that of human Gal-2 and -3 and other mammalian galectins (Gitt et al. 1992; Kadrofske et al. 1998; Chiariotti et al. 2004), which allows similarly sized sequence stretches to be selected for promoter analysis in silico. The nearly complete identity in the coding regions and in the intron sequences (according to Ensembl release 72 and current National Center for Biotechnology Information (NCBI) GENE entry, a difference of 8/3 bp in the untranslated part of exon 1/4, 3 T/C changes at positions 614, 691, 974 in intron 2 and 1 T/G change at position 520 in intron 3) yet precluded assessing with certainty that both genes are transcribed. Having detected no sequence change on the level of the protein, which fittingly runs as single spot in two-dimensional gel electrophoretic analysis (Madsen et al. 1995), the issue to be addressed next was thus whether and to what extent the proximal promoter sequences of the two genes have undergone divergence.
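The checks used in this section to rule out a pseudogene (start/stop codons in place, no premature termination site, and a single nucleotide difference that leaves the protein unchanged) can be illustrated with a minimal Python sketch. The function names and the toy sequences are hypothetical and chosen for illustration only; they are not the LGALS7/LGALS7B sequences, and this is not the procedure or software used in the study.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence (uppercase A/C/G/T only) codon by codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def looks_functional(cds):
    """True if the CDS starts with ATG, is a multiple of 3 in length,
    ends with a stop codon and contains no premature stop codon."""
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    protein = translate(cds)
    return protein.endswith("*") and "*" not in protein[:-1]


def coding_differences(cds_a, cds_b):
    """List (1-based position, base in a, base in b, is_silent) for every
    mismatch between two equal-length coding sequences."""
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must have the same length")
    differences = []
    for pos, (x, y) in enumerate(zip(cds_a, cds_b), start=1):
        if x != y:
            start = (pos - 1) // 3 * 3  # first base of the affected codon
            silent = CODON_TABLE[cds_a[start:start + 3]] == CODON_TABLE[cds_b[start:start + 3]]
            differences.append((pos, x, y, silent))
    return differences


# Toy example (invented sequences): a G/C exchange in the third codon position.
toy_a = "ATGGCGTAA"  # Met-Ala-stop
toy_b = "ATGGCCTAA"  # Met-Ala-stop
print(looks_functional(toy_a), looks_functional(toy_b))
print(coding_differences(toy_a, toy_b))
```

In the toy pair, the exchange affects only the third base of an alanine codon, so it is reported as silent, analogous to the G/C difference at position 144 described above.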

Computational analysis of proximal promoter sequences

Setting the experimentally defined tsp1a as zero-position (Figure 2B), the sequence stretch for analysis was defined to range from −2000 bp upstream to +260 bp downstream. Whereas the two downstream parts were identical, deviations started to occur from position −109 onwards to result in an overall sequence-stretch difference of 34.2%. Both sequences were systematically aligned to known motifs for binding transcription factors. This computational (MatInspector-based) screening led to two sets of sequences, which fit criteria for use in gene regulation and which are qualitatively differently represented in the two individual promoter regions (Table I; for full survey of hits, scores and sequence details, see Supplementary data, Table S1). The listed cases support the hypothesis for an extension of regulatory capacity of Gal-7 expression by gene duplication, compared with the single-copy genes for Gal-1 and -2. It nevertheless needs to be emphasized that experimental support is essential before functional significance can be claimed.

This reasoning is underscored by identifying nonoverlapping panels of sequence differences when using the P-Match/Match algorithms (Supplementary data, Table S2; for complete analysis, see Supplementary data, Table S3). In this sense, the computational processing provides a framework for the design of further experiments, as epitomized by tumor-suppressor-driven Gal-7 expression. With respect to the p53-induced transcriptional upregulation seen in human DLD-1 colon carcinoma cells (Polyak et al. 1997), the searches came up with putative sites depending on the nature of the program used. While the sequence around −1622/3 (+) was given a high score by two algorithms, other regions, too, scored well in the MatInspector-based process (Supplementary data, Table S4). Because the human proto-type galectins have so far been viewed as being encoded by single-copy genes, we next explored the degree of phylogenetic plasticity of this feature by defining the genomic representation of Gal-7 orthologs in a panel of mammals with increasing evolutionary distance to man.
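Two quantities used in this comparison, the percentage difference between two promoter stretches and the sets of binding-site families found in only one of the two promoters, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the text does not specify how the percentage was calculated, so a simple per-column mismatch count over a pre-aligned pair is assumed here, and the function names, toy sequences and the placeholder family `SHARED_X` are hypothetical.

```python
def percent_difference(aligned_a, aligned_b):
    """Percentage of alignment columns in which two pre-aligned,
    equal-length sequences differ (a gap '-' against a base counts
    as a difference). Assumed definition, for illustration only."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("need two non-empty aligned sequences of equal length")
    mismatches = sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)
    return 100.0 * mismatches / len(aligned_a)


def unique_families(hits_a, hits_b):
    """Return the motif families seen only in promoter a and only in
    promoter b, each as a sorted list (shared families are dropped)."""
    set_a, set_b = set(hits_a), set(hits_b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Toy aligned stretches (invented): 2 mismatching columns out of 10.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))

# Toy hit lists: family names borrowed from Table I plus a made-up shared entry.
only_a, only_b = unique_families(
    ["AARE", "Oct-1", "SHARED_X"],
    ["GRE", "SRF", "SHARED_X"],
)
print(only_a, only_b)
```

Reporting only the families unique to each promoter mirrors the qualitative layout of Table I; it deliberately ignores hit counts, scores and positions, which are covered by the supplementary tables.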

Fig. 2. Identification of the transcriptional start points. Illustration of the gel electrophoretic product analysis using a DNA ladder for calibration (A) and the sequence of the amplified touchdown PCR product (B); ligated RNA oligonucleotide and gene-specific primer sequences are given, and the translation start (ATG) is highlighted. Processing of 10 clones reveals the presence of two start points termed tsp1a/b.

Table I. Qualitative differences in the profile of putative transcription-factor-binding sites

LGALS7: AARE; BNC1; BTB/POZ; CDE; ChRE; DMRT; FAST; GABF; Gcnr; HDGF; HIFF; HNF6; HOX C; INRE; LHXF; NKXH; Olf; Oct-1; PIT-1; SIX3

LGALS7B: AP-4; ATBF; BCL6; BTBF; Cdx; Csen; DICE; GRE; HMTB; HNF1; NKX6; PARF; PBXC; PDX1; PRDF; RP58; SATB; SRF; STEM; TALE

The proximal promoter region in each case was chosen from −2000 bp upstream to +260 bp downstream relative to the gene’s tsp, which was experimentally identified and defined as +1a (see Figure 2B), and its sequence was computationally processed using the MatInspector algorithm. Only the putative binding motifs are listed which are uniquely present in the proximal promoter region of the LGALS7 gene or uniquely present in the proximal promoter region of the LGALS7B gene. Names of matrix families: AARE, amino acid-response element; AP-4, Activating enhancer-binding protein 4; ATBF, AT-binding transcription factor; BCL6, POZ domain zinc finger expressed in B-Cells; BNC1, Zinc finger protein basonuclin-1; BTBF, bromodomain and PHD domain transcription factors; BTB/POZ, broad complex, TRAMTrack, bric-a-brac/pox viruses and zinc fingers transcription factors; CDE, cell cycle regulators, cell cycle dependent element; Cdx, vertebrate caudal related homeodomain protein; ChRE, carbohydrate-response elements, consist of two E box motifs separated by 5 bp; Csen, Calsenilin, presenilin-binding protein, EF hand transcription factor; DICE, downstream immunoglobulin control element, critical to B cell activity and specificity; DMRT, DM domain-containing transcription factors; FAST, FAST-1 SMAD-interacting proteins; GABF, GA-boxes; Gcnr, germ cell nuclear receptors; GRE, glucocorticoid-responsive and related elements; HDGF, hepatoma-derived growth factor; HIFF, hypoxia-inducible factor; HMTB, human muscle-specific Mt-binding site; HNF1, Hepatic nuclear factor 1; HNF6, Onecut homeodomain factor HNF6; HOX C, HOX-PBX complexes; INRE, Core promoter initiator elements; LHXF, Lim homeodomain factors; NKX6, NK6 homeobox transcription factors; NKXH, NKX homeodomain factors; Olf, neuron-specific olfactory factor; Oct-1, Octamer-binding protein; PARF, PAR/bZIP family; PBXC, PBX1-MEIS1 complexes; PDX1, Pancreatic and intestinal homeodomain transcription factor; PIT-1, GHF-1 pituitary specific POU domain transcription factor; PRDF, positive regulatory domain I-binding factor; RP58, RP58 (ZFP238) zinc finger protein; SATB, special AT-rich sequence-binding protein; SIX3, Sine oculis homeobox homolog 3; SRF, serum-response element-binding factor; STEM, motif composed of binding sites for pluripotency or stem cell factors; TALE, TALE homeodomain class recognizing TG motifs.

Phylogenetic conservation/variation of Gal-7 gene display

A set of 44 animal species was put under scrutiny accordingly, for gene assignment to chromosome and assessment of the copy number. The origin of processed information is compiled in a phylogenetic-tree-style representation (Figure 3), with further details on species, sources and genes given in Supplementary data, Table S5. To prevent transcriptionally inactive sequence stretches from entering the compilation, only candidates with definitive presence of start/stop codons at appropriate positions, the characteristic exon/intron display and absence of nondefined regions in between were processed.

As a result, the mode in which Gal-7 is encoded was disclosed to fall into two categories. As illustrated exemplarily in Figures 3 and 4, 22 species share the two-genes-in-opposite-direction constellation with the human genome. Evidently, the copy number is also found to be confined to 1 in 20 species, based on the current status of sequencing and validity of annotation (Figures 3 and 4). In this category, both types of orientation are present (Figure 4). To preclude an unjustified generalization from this result, namely that the human genome has an increased number of genes for lectins, it should at this point be noted that CNV for lectin genes between man and rodents can also be in the opposite direction: The C-type macrophage asialoglycoprotein receptor (CD301) is represented by two genes in mice but only one gene in man and, notably, rats (Onami et al. 2002; Tsuiji et al. 2002). The same holds true for the collectin mannan (mannose)-binding lectin, for which the human genome harbors an active pseudogene taking the place of the second functional murine gene (Garred 2008). The apparent exceptions of dolphin and hedgehog genomes with three or six hits (Figure 3), respectively, should be viewed with caution, awaiting further verification by completely sequencing the regions in question.

Fig. 3. Comprehensive compilation of data-bank-stored information on human LGALS7/LGALS7B genes and orthologous genes or gene fragments within chromosomes, scaffolds or contigs in man and 44 animal species from different branches of the phylogenetic tree. The retrieved information on genes or gene fragments either in forward (+) or in reverse (−) direction is listed as chromosome, contig or scaffold number together with the number of nucleotides, characterizing the precise position (for chromosomes in the range of millions) and length of the gene.

Fig. 4. Representative overview of the presence of Gal-7 gene(s) in animals. For internal calibration, the information on human LGALS7/LGALS7B (see Figure 1B) is inserted on top; exon 3 is highlighted due to its coding of the signature sequence for ligand contact.

Because lectin activity is a hallmark of Gal-7 functionality, any deviations from the signature of carbohydrate-contacting amino acids (for detailed structural analysis of Gal-7 in crystals, see Protein Data Bank entry 4GAL, and in solution, see Nesmelova et al. 2012; Ermakova et al. 2013) can be spotted in Supplementary data, Table S5, as indicator for impairment of this capacity. When extending the comparison of protein-sequence data from the signature for lectin activity to the complete sequence, two different ways of graphically depicting scoring similarity emphasize the high level of conservation of Gal-7 between species (Supplementary data, Figures S1A and S1B). In contrast to human genes, amino acid sequence changes can evidently occur in other species, from 1 (in cat/cow) up to 25 (horse), none affecting a signature site for glycan binding. However, impact on association to nonglycan-binding partners is possible, as galectins, similar to other lectin types (Gabius et al. 1985; Gready and Zelensky 2009), are being delineated to interact also with ligands other than carbohydrates, the antiapoptotic bcl-2 being an example for Gal-7 (Villeneuve et al. 2011). Deviation in the lectin signature sequence appears to happen in partially sequenced stretches, especially in the cases of anole lizard, bushbaby, chimpanzee, Chinese hamster, dolphin, hedgehog, gibbon, killer whale, Pacific walrus, panda, pika, tarsier, tree shrew and wallaby (Supplementary data, Table S5) and may indicate neofunctionalization. These compilations on gene and protein sequences thus teach the lessons that (i) gene information for a Gal-7 protein is widely present among mammals (the only indication for an ortholog in a non-mammalian species comes from sequencing the genome of anole lizard), (ii) it is encoded as single-copy gene or by two genes, (iii) the relative orientation of the single Gal-7 gene in its chromosomal context can differ between species, (iv) the status of protein sequences of duplicated genes differs, from no change to alterations affecting the lectin signature and (v) presence in tandem is not a general phenomenon in evolution of human lectin genes, when gene duplication has occurred in mammals.

The detected variability, in turn, can have notable consequences for Gal-7 expression in a species and for interspecies comparison of expression profiles. It signifies that the combinatorial capacity for gene regulation of the same galectin can differ among species. In other words, this finding sets limits to extrapolations of patterns of gene regulation from a certain species to others, unless the gene’s copy-number status has at least been confirmed to be identical. This said, it becomes an issue to further explore the representation of galectin genes in order to answer the question whether Gal-7 is an exceptional case. To do so, we first focused on the tandem-repeat-type Gal-4.
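The practical consequence stated above, that expression patterns should only be extrapolated between species whose copy-number status has been confirmed to be identical, can be expressed as a small Python sketch. The dictionary below is a hypothetical illustration: only the human value (two genes) is taken from the text, whereas `species_A`, `species_B` and `species_C` are placeholders and not entries from Supplementary Table S5.

```python
from collections import Counter


def tally_copy_numbers(copy_number_by_species):
    """Count how many species fall into each copy-number category."""
    return Counter(copy_number_by_species.values())


def extrapolation_permitted(species_a, species_b, copy_number_by_species):
    """True only if both species have a recorded copy number and the two
    values are identical; an unknown status never permits extrapolation."""
    a = copy_number_by_species.get(species_a)
    b = copy_number_by_species.get(species_b)
    return a is not None and a == b


# Hypothetical input: human value from the text, the rest are placeholders.
gal7_copies = {"human": 2, "species_A": 2, "species_B": 1, "species_C": 3}

print(tally_copy_numbers(gal7_copies))
print(extrapolation_permitted("human", "species_A", gal7_copies))
print(extrapolation_permitted("human", "species_B", gal7_copies))
```

Treating a missing entry as "not permitted" is a deliberately conservative choice; identical copy number is a minimum requirement in the sense of the text, not proof that promoter sequences and regulation are the same.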

Phylogenetic conservation/variation of Gal-4 gene display

Coding for Gal-4, in the vicinity of the two Gal-7 genes on human chromosome 19 (Figure 1A), rests on a single-copy gene. A relevant hypothesis posits that Gal-7 “might have arisen by duplication” of Gal-4's N-terminal domain (Houzelstein et al. 2004). This two-domain lectin has special functions in apical membrane trafficking, loading distinct cargo for transport by its dual specificity to sulfatides, especially when presented by a sphingolipid with long-chain (up to C26) fatty acids, and glycoproteins rich in N-acetyllactosamine termini on complex-type N-glycans (Delacour et al. 2005; Morelle et al. 2009; Stechly et al. 2009; Velasco et al. 2013). In contrast to Gal-7, ganglioside GM1 is not a major counterreceptor for Gal-4 on neuroblastoma cell surface (Kopitz et al. 2003, 2012). Its involvement in myelination explains the tight expression control in postnatal brain (Stancic et al. 2012). Using the functionality criteria on the basis of the gene given above and the set of search tools listed in Methods to identify any member of the galectin family regardless of sugar-binding capacity, assumedly functional LGALS4 constellations are present in the same panel of species as for LGALS7 (for details, see Supplementary data, Table S6). Although orientation and intron lengths can differ, LGALS4 maintained its status as single-copy gene in interspecies comparison in this test cohort, with examples shown in Figure 5A. Including data for Xenopus, with two genomes analyzed which are devoid of an ortholog for Gal-7, disclosed evidence for a two-gene constellation (Supplementary data, Table S6). Looking at protein-sequence comparisons, hot spots for differences had been unraveled in the connecting peptide. The fine-structural comparison of the sequences of the linker connecting the two carbohydrate recognition domains (respective exons marked by asterisks in Figure 5A) found conservation limited to the peripheral sections and the central CPG motif (Cooper 2002).

A unique case of intraspecies divergence is the mouse-specific occurrence of a second Gal-4-like gene, whose conspicuous disparities especially in this part led to its designation as galectin-6 (Gal-6), the result of tandem duplication on murine chromosome 7 syntenic to 19q13.1-13.3 in man (Gitt, Colnot et al. 1998; Gitt, Xia et al. 1998). Having appeared after the mouse–rat divergence, this gene is an attractive model in our context as well. In line with the acquired alterations between promoters of the two Gal-7 genes, 13% disparities are present in the proximal promoter regions of murine Gal-4/-6 genes. Specific antibodies showed Gal-6 presence in enteroendocrine colonic cells and its absence in colonic lamina propria after inflammatory challenge, interpretable as neofunctionalization (Houzelstein et al. 2013). The approach taken herein may be helpful to relate aspects of this assumed process to sequence divergence. Intriguingly, strains and even wild individuals showed presence/absence polymorphism for the LGALS6 gene (Houzelstein et al. 2008).

So far, we have thus examined single- or double-gene(s) constellations of the human genome. Despite their spatial vicinity, the constellations of human genes for Gal-4 and -7 are different. To further test the hypothesis on promoter-sequence differences in functional genes for a human galectin, Gal-9 affords a suitable test model. Physiologically, this protein has a characteristic expression profile in relation to other galectins, mapped, for example, in skin and squamous cell carcinomas, and capacity to induce T cell apoptosis (Čada et al. 2009; Sakuishi et al. 2011; Fík et al. 2013).

Phylogenetic conservation/variation of Gal-9 gene display

The human genome contains two genes along with a Gal-9-like sequence (referred to as Gal-9C) adjacent to the Gal-9 gene at 17p11.1 (LOC 162568) (Lipkowitz et al. 2001; Cooper 2002). The product of the LGALS9B gene (hUAT2) has an identity of 94.7% to Gal-9 (Lipkowitz et al. 2001). Supporting the concept of promoter divergence after copy-number increase of a galectin gene, the respective sequences are nearly 20% different for LGALS9 vs. LGALS9B/C, this disparity being reflected in the number of unique features in MatInspector-based processing (Supplementary data, Table S7; for complete analysis, see Supplementary data, Tables S8A–C; for P-Match/Match-based data, see Solís et al. 2010). Expression profiles measured by reverse transcription-polymerase chain reaction (PCR) revealed mRNA for hUAT1 in various organs, whereas hUAT2 expression was confined to colon, detectable also in prostate and peripheral blood lymphocytes, in these instances at considerably weaker intensity than hUAT1-specific mRNA (Lipkowitz et al. 2001). In order to answer the question on gene-display modes in animals, as for Gal-4 and -7, we processed the respective sequence information (for details, see Supplementary data, Table S9).

The patterns for LGALS9 gene(s) in the tested animal genomes were not constant. Either the status with three genes


Fig. 5. Representative overview of genome entries for Gal-4 (A) confined to single-copy presence and Gal-9 (B) in man and animals. In addition to marking coding for the carbohydrate-binding site (by asterisks), the exon subject to alternative splicing in (B) to generate the two forms with linker-length difference is labeled by a black circle. Gal-9 genes in horses indicated by consecutive Arabic numbers refer to the following Ensembl Gene IDs: ENSECAG00000019932 (1), ENSECAG00000022377 (2), ENSECAG00000024755 (3). The presence of the Gal-5 gene in rats is unique.


was maintained or a one-gene-only arrangement was observed in the studied set of fully sequenced Gal-9 genes, in accord with the situation for LGALS7 gene(s) that no third scenario was realized (Figure 5B; for comparison, see Figure 4). Taking stock of a total of 60 species with evidence for a Gal-9 ortholog, 10 and 38 species, respectively, fall into these two categories (Supplementary data, Table S9). However, the fact that the not yet finished sequencing of the genomes of 12 species currently leads to two entries (Supplementary data, Table S9) gives reason to postpone reaching a definitive conclusion. Unique for the rat genome is the emergence of the single-domain galectin-5 (Gal-5), which appears to be involved in the exosomal sorting pathway during rat reticulocyte maturation (Gitt et al. 1995; Wada and Kanwar 1997; Lensch et al. 2006; Barrès et al. 2010). The C-terminal lectin domain of rat Gal-9 and Gal-5 share 86% identity in protein sequence, so that Gal-5 is the product of a short N-terminal sequence linked to this domain (Gitt et al. 1995; Wada and Kanwar 1997). On the level of the promoter sequences, signs attributable to divergence were disclosed, such as cytidine-cytidine-adenosine-adenosine-thymidine box or Octamer-binding protein-1/-2 binding motifs (Lensch et al. 2006), the extent of identity initially being perfect, then dropping to 85.7%, when examining the same stretch lengths as for the Gal-7 promoters.

The mode of coding of Gal-9 in the rat genome also taught another salient lesson. The reported detection of several bands in Southern blots had implied gene duplication (Wada and Kanwar 1997), whereas the data-bank-derived information shown in Figure 5B supports the conclusion that a single functional gene is present. In fact, further scouring this genome information for genes with Gal-9(-like) information revealed why several bands arise in Southern blots. They can be attributed to the presence of two (intronless) pseudogenes on chromosome 2 (LOC361937; contains five stop codons) and on chromosome 14 (LOC364170; contains seven stop codons). Of general relevance, the presence of pseudogenes for a galectin can complicate the interpretation of Southern-blot data. As already mentioned above, this situation can be especially daunting in the Gal-13-containing cluster on chromosome 19 (Than et al. 2009). The frequent duplication/copy-loss events (birth-and-death evolution) can be considered a likely source for new variants, to optimize adaptation to the environment (Nei and Rooney 2005). As the case of C-type/C-type-like lectins teaches, nonlectin proteins, which target peptide motifs and hereby broaden the range of counterreceptors, can originate from these genomic-level dynamics.

Finally, the question on plasticity of galectin-gene representation beyond mammals, of which Gal-7 is characteristic, is addressed. In Aves, to give a further example, no orthologs for any of the three galectins studied here were found (but those for Gal-1, -2, -3 and -8 were; Beyer et al. 1980; Oda and Kasai 1983; Kaltner et al. 2008, 2009, 2011). In addition, two paralogs for Gal-1 are present in chicken (see “Introduction” section). A gene duplication event likely not far from the separation of birds and mammals (~380 million years ago) appears to underlie their existence, and both proteins acquired characteristic features in terms of disulfide-bond formation and glycan binding (Sakakura et al. 1990; Varela et al. 1999; Wu et al. 2007; López-Lucendo et al. 2009). In terms of gene regulation, the expression profiles are markedly different (Kaltner et al. 2008), most notably with CG-1A as a potentially functional marker of chicken limb precartilage condensations at a very early stage (Bhat et al. 2011), a case of neofunctionalization.

Conclusions

Defining the copy number for each galectin gene is essential for functional analysis of gene regulation. Our study reveals CNV in the group of homodimeric proto-type galectins by detecting the second gene for human Gal-7 and promoter-sequence divergence. The search was not only guided by the signature sequence for glycan binding but included homology criteria to pick up any family member, regardless of this property, reflecting the known multifunctionality of galectins (Smetana et al. 2013). The presented data from computational analysis give direction to respective experimental work, and the documented difference in gene display between species clearly precludes immediate extrapolation of expression profiles. Among galectins, plasticity in gene representation can occur, too. Whereas the copy number was constant for the Gal-4 gene (except for the unique presence of a murine Gal-6 gene), which is in the vicinity of the Gal-7 gene(s), the situation for Gal-9 backed the take-home message of occurrence of CNV between man and popular animal models. The presented data thus support the concept of species-dependent refinements in regulation of genes for certain galectins.

Materials and methods

Processing of sequence information

Gene sequences were downloaded from the Ensembl Genome Browser (http://www.ensembl.org/index.html, Ensembl release 72, June 2013), the NCBI Genbank (www.ncbi.nlm.nih.gov/genbank/index.html) with its divisions CoreNucleotide (www.ncbi.nlm.nih.gov/nuccore), Expressed Sequence Tags database (www.ncbi.nlm.nih.gov/nucest) and Genome Survey Sequence database (www.ncbi.nlm.nih.gov/nucgss), as well as the Zebrafish Model organism database ZFIN (www.zfin.org) and the Xenopus laevis/tropicalis model database Xenbase (www.xenbase.org). Attempting to reach an accurate assessment of the copy number, whole-genome shotgun sequencing data of each species provided by the University of California Santa Cruz (UCSC) genome browser (www.genome.ucsc.edu) and NCBI Genome (www.ncbi.nlm.nih.gov/genome) as contigs, unplaced scaffolds, chromosomal genomic scaffolds and assemblies were scrutinized first for the presence of distinct exon sequences, later routinely for full-length coding sequence. In addition, the Basic Local Alignment Search Tool (BLAST) (www.blast.ncbi.nlm.nih.gov/) search algorithms were applied to cover the full range of sequences fitting stringent criteria of homology. Chromosomal locations of the hits were defined using the NCBI Map Viewer function (http://www.ncbi.nlm.nih.gov/projects/mapview), thereby also making it possible to determine the positions of proximal promoter sequences. Amino acid sequences of the predicted gene products were deduced unless available in the NCBI package retrieved by BLASTP/Position-Specific Iterative Basic Local Alignment Search Tool search algorithms, in the NCBI Protein (www.ncbi.nlm.nih.gov/protein) and the UniProt Knowledgebase (UniProtKB, Expasy Proteomics Server,


www.expasy.ch). Information on entries for orthologs of genes for human galectins in species from different branches of the phylogenetic tree was displayed applying the NCBI Taxonomy Browser (www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi) and the phylogenetic tree visualization software TreeView (www.taxonomy.zoology.gla.ac.uk/rod/treeview.html).

Genomic sequences were examined manually in each case for annotation errors, the presence of nonsequenced stretches, exon/intron boundaries and completeness of the coding sequence including the presence of start/stop codons, applying the sequence text view tool implemented in NCBI Gene (http://www.ncbi.nlm.nih.gov/gene; Maglott et al. 2011) and the export contig sequence/features tool in the Location view of Ensembl. Sequences were edited using the EditSeq software version 7.1 (DNAstar Inc., Madison, WI). Exclusively gene sequences satisfying the given criteria qualified for consideration, and this principle was rigorously followed in each case. Gene architecture is graphically displayed in the figures as follows: Exons are drawn as boxes (untranslated exon parts in white, coding exons in gray, a half black/half white pattern depicts the exon encoding the carbohydrate-binding site) and introns and 5′/3′-untranslated parts are shown as lines (size proportional to exon/intron lengths). Arabic numbers below exon boxes define the spatial order of exon occurrence. The lengths of exons (number in parentheses refers to untranslated exon part following stop codon), of introns and the distance (dashed line) between the first exons in the two-gene constellation are given in basepairs (bp).

Pairwise alignments of nucleotide or amino acid sequences were performed using EMBOSS Needle (www.ebi.ac.uk/Tools/psa/; Rice et al. 2000), multiple sequence alignments applying the ClustalW software (www.ebi.ac.uk/Tools/msa/clustalw2) (Thompson et al. 1994; Larkin et al. 2007), which colors residues according to their physicochemical properties. Clustal-format alignments were edited in JalView (version 2.7; Waterhouse et al. 2009), and residues are assigned a color according to the percentage of the residues (>80% dark blue, >60% medium blue, >40% light blue and <40% white) in each column that are in accord with the consensus sequence. Conservation is scored according to a numerical index (0–10) reflecting the level of maintaining physicochemical properties; a high-quality score (Blocks Substitution Matrix 62 based) in a column signals an impact of unfavorable changes. Consensus is a measure for classifying amino acid residues according to frequency of occurrence, given as percentage.
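The percentage-based shading of alignment columns described above can be illustrated by a minimal sketch. It is not the JalView implementation: the function names `shade_for` and `column_shades` are hypothetical, gap characters are assumed to be written as "-", the percentage is assumed to be taken over all sequences in the column, and a value of exactly 40% (a case the thresholds given above leave open) is assigned to white.

```python
from collections import Counter


def shade_for(percent):
    """Map the share of consensus residues in a column to a shade."""
    if percent > 80:
        return "dark blue"
    if percent > 60:
        return "medium blue"
    if percent > 40:
        return "light blue"
    return "white"  # assumption: exactly 40% is also shown in white


def column_shades(aligned):
    """Return (consensus residue, percent, shade) for each alignment column.

    `aligned` is a list of equally long, gap-padded sequences.
    Assumption: gaps ("-") never count as consensus, and the percentage
    is computed relative to all sequences in the column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    shades = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]
        if not residues:
            shades.append((None, 0.0, "white"))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        percent = 100.0 * count / len(column)
        shades.append((residue, percent, shade_for(percent)))
    return shades


if __name__ == "__main__":
    # Toy alignment, not taken from any galectin sequence.
    toy = ["MKV-", "MKL-", "MRLA", "MKLA", "MKLA"]
    for index, (residue, percent, shade) in enumerate(column_shades(toy), 1):
        print(index, residue, f"{percent:.0f}%", shade)
```

Because the thresholds are strict inequalities, a column in which four of five sequences agree (80%) is shown in medium blue rather than dark blue.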

Identification of tsp’s

The RNA ligase-mediated 5′-RACE technique using the GeneRacer™ kit (Invitrogen, Darmstadt, Germany) was applied to define the 5′-start sequence of Gal-7-specific mRNA using cells of the immortalized keratinocyte (HaCaT) line (kind gift of Drs. M. Quintanilla and A. Villalobo, Madrid, Spain) as source for the transcript. Briefly, a series of enzyme reactions was required to remove 5′-capping and replace it with an RNA oligonucleotide (5′-CGACUGGAGCACGAGGACACUGACAUGGACUGAAGGAGUAGAAA-3′). After reverse transcription with random primers, the 5′-start sequence was amplified by touchdown PCR using the GeneRacer™ 5′ primer and a Gal-7-gene-specific primer (5′-CTTGAAGCCGTCGTCTGACGCGATG-3′). The corresponding positions of the primer within the gene sequence are depicted in the respective figure.
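How the position of the gene-specific primer within a gene sequence can be traced is sketched below. The sketch assumes the usual 5′-RACE design, in which the gene-specific primer is antisense to the mRNA, so that its reverse complement is what occurs on the sense strand. The function names are hypothetical, and the flanking bases in the usage example are made up for illustration and are not part of the Gal-7 gene sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Gal-7-gene-specific primer as given in the text (5'->3').
GENE_SPECIFIC_PRIMER = "CTTGAAGCCGTCGTCTGACGCGATG"


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def rna_to_dna(rna):
    """Write an RNA sequence in DNA letters (U replaced by T)."""
    return rna.upper().replace("U", "T")


def find_primer_site(sense_strand, antisense_primer):
    """Return the 0-based start of the primer site on the sense strand.

    Returns -1 if the reverse complement of the primer is not found.
    """
    return sense_strand.upper().find(reverse_complement(antisense_primer))


if __name__ == "__main__":
    site = reverse_complement(GENE_SPECIFIC_PRIMER)
    print(site)
    # Hypothetical sense strand: made-up flanks around the primer site.
    toy_sense = "AAAA" + site + "TTTT"
    print(find_primer_site(toy_sense, GENE_SPECIFIC_PRIMER))
```

The same `rna_to_dna` helper can be applied to the RNA oligonucleotide given above when its DNA equivalent is to be located at the 5′ end of an amplified product.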

Computational promoter analysis

The manually edited proximal promoter regions (from −2000 bp upstream of the experimentally identified tsp1a to +260 bp downstream) of the genes for human Gal-4, -7 and -9 (Gal-7: Ensembl, Gene ID: ENSG00000205076, Transcript ID: ENST00000378626, NCBI, Gene ID: 3963, Reference Sequence (RefSeq) mRNA: NM_002307.3; Gal-7B: Ensembl, Gene ID: ENSG00000178934, Transcript ID: ENST00000314980, NCBI Gene ID: 653499, RefSeq mRNA: NM_001042507.3; Gal-4: Ensembl, Gene ID: ENSG00000171747, Transcript ID: ENST00000307751, NCBI, Gene ID: 3960, RefSeq mRNA: NM_006149.3; Gal-9: Ensembl, Gene ID: ENSG00000168961, Transcript ID: ENST00000395473, NCBI, Gene ID: 3965, RefSeq mRNA: NM_009587.2; Gal-9B: Ensembl, Gene ID: ENSG00000170298, Transcript ID: ENST00000324290, NCBI, Gene ID: 284194, RefSeq mRNA: NM_001042685.1; Gal-9C: Ensembl, Gene ID: ENSG00000171916, Transcript ID: ENST00000328114, NCBI, Gene ID: 654346, RefSeq mRNA: NM_001040078.2) were processed to trace putative sites for binding transcription factors. Analytical work in this context comprised two different search and evaluation methodologies. In one set of data processing, online versions of the programs Match™ (Kel et al. 2003) and P-Match™ (Chekmenev et al. 2005) were used, both available on the website of the BIOBASE Biologische Datenbanken GmbH (Wolfenbüttel, Germany; http://www.gene-regulation.com/pub//programs.html), with data input from the latest update of the TRANSFAC® transcription factor database (http://www.biobase.de) as source for weight matrices and consensus sequences. Settings to avoid missing factors with low-quality matrices were customized, as recently applied to promoters of the five CG genes (Kaltner et al. 2008, 2009, 2011). In parallel, the program MatInspector, which minimizes redundant matches by arranging similar and/or functionally related transcription-factor-binding sites into matrix families (Cartharius et al. 2005), also led to identification of sequence motifs that satisfy the criteria of a binding site for transcription factors. The MatInspector software (http://www.genomatix.de/index.html) identifies matches by comparing the proximal promoter sequences with weighted matrix descriptions of functional binding sites based also on the TRANSFAC® database, as above, and on the MatInspector-inherent 634 matrices grouped in 279 families. The optimized threshold of a weight matrix was defined as the matrix similarity threshold that allows a maximum of three matches in 10 kb of nonregulatory test sequences. By scanning the imported sequence against the relative frequency of each nucleotide at a particular position in the program-based distribution profile, the matrix similarity was calculated on-the-run. Only sites with a matrix similarity of >0.8 and a core similarity of 1.0, which is reached only if the sequence analyzed is completely identical to matrix-inherent sequence sections with highest degree of conservation, were considered to be of putative significance.
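The acceptance criteria stated above (matrix similarity >0.8, core similarity of 1.0) and the subsequent comparison of two promoters for unique features can be sketched as follows. This is an illustration under stated assumptions and not part of the MatInspector software: the `Match` record, the function names and the family labels in the usage example are hypothetical, and the match lists would in practice be transcribed from the program output.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One predicted binding site (hypothetical record layout)."""
    family: str              # matrix family label
    position: int            # position relative to the tsp
    core_similarity: float
    matrix_similarity: float


def is_putative_site(match, matrix_cutoff=0.8):
    """Apply the two criteria: matrix similarity >0.8, core similarity 1.0."""
    return (match.matrix_similarity > matrix_cutoff
            and math.isclose(match.core_similarity, 1.0))


def accepted_families(matches):
    """Set of matrix families with at least one accepted site."""
    return {m.family for m in matches if is_putative_site(m)}


def unique_families(promoter_a, promoter_b):
    """Families accepted in only one of the two promoters."""
    fam_a = accepted_families(promoter_a)
    fam_b = accepted_families(promoter_b)
    return fam_a - fam_b, fam_b - fam_a


if __name__ == "__main__":
    # Made-up matches with placeholder family labels, for illustration only.
    promoter_1 = [
        Match("FAM_A", -120, 1.0, 0.93),
        Match("FAM_B", -480, 1.0, 0.85),
        Match("FAM_C", -900, 0.92, 0.95),  # rejected: core similarity < 1.0
    ]
    promoter_2 = [
        Match("FAM_A", -118, 1.0, 0.91),
        Match("FAM_D", -300, 1.0, 0.88),
        Match("FAM_B", -470, 1.0, 0.80),   # rejected: not above 0.8
    ]
    print(unique_families(promoter_1, promoter_2))
```

Comparing sets of matrix families rather than individual sites is one way to count unique features between promoters of duplicated genes; whether the comparison in the Supplementary tables was made at the level of families or of individual matrices is not specified here.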

Supplementary Data

Supplementary data for this article is available online at http://glycob.oxfordjournals.org/.


Funding

This work was generously supported by the EC Seventh Framework Program (FP7/2007-2013) under grant agreement no. 2602600 (“GlycoHIT”) and grant agreement no. 317297 for the RTN network GLYCOPHARM.

Conflict of interest

None declared.

Abbreviations

A, adenine; BLAST, Basic Local Alignment Search Tool; C, cytosine; CG, chicken galectin; CNV, copy-number variation; dbGSS, Genome Survey Sequence database; G, guanine; Gal, galectin; LGALS, lectin, galactoside-binding, soluble; NCBI, National Center for Biotechnology Information; PCR, polymerase chain reaction; RefSeq, Reference Sequence; T, thymine; tsp, transcription start point; UCSC, University of California Santa Cruz.

References

Abbott WM, Feizi T. 1989. Evidence that the 14 kDa soluble β-galactoside-binding lectin in man is encoded by a single gene. Biochem J. 259:291–294.

Ahmad N, Gabius H-J, Kaltner H, André S, Kuwabara I, Liu F-T, Oscarson S, Norberg T, Brewer CF. 2002. Thermodynamic binding studies of cell surface carbohydrate epitopes to galectins-1, -3, and -7: Evidence for differential binding specificities. Can J Chem. 80:1096–1104.

Amano M, Eriksson H, Manning JC, Detjen KM, André S, Nishimura S-I, Lehtiö J, Gabius H-J. 2012. Tumour suppressor p16INK4a: Anoikis-favouring decrease in N/O-glycan/cell surface sialylation by down-regulation of enzymes in sialic acid biosynthesis in tandem in a pancreatic carcinoma model. FEBS J. 279:4062–4080.

André S, Kaltner H, Lensch M, Russwurm R, Siebert H-C, Fallsehr C, Tajkhorshid E, Heck AJR, von Knebel Doeberitz M, Gabius H-J, et al. 2005. Determination of structural and functional overlap/divergence of five proto-type galectins by analysis of the growth-regulatory interaction with ganglioside GM1 in silico and in vitro on human neuroblastoma cells. Int J Cancer. 114:46–57.

André S, Sanchez-Ruderisch H, Nakagawa H, Buchholz M, Kopitz J, Forberich P, Kemmner W, Böck C, Deguchi K, Detjen KM, et al. 2007. Tumor suppressor p16INK4a: Modulator of glycomic profile and galectin-1 expression to increase susceptibility to carbohydrate-dependent induction of anoikis in pancreatic carcinoma cells. FEBS J. 274:3233–3256.

Barrès C, Blanc L, Bette-Bobillo P, André S, Mamoun R, Gabius H-J, Vidal M. 2010. Galectin-5 is bound onto the surface of rat reticulocyte exosomes and modulates vesicle uptake by macrophages. Blood. 115:696–705.

Bernerd F, Sarasin A, Magnaldo T. 1999. Galectin-7 overexpression is associated with the apoptotic process in UVB-induced sunburn keratinocytes. Proc Natl Acad Sci USA. 96:11329–11334.

Beyer EC, Zweig SE, Barondes SH. 1980. Two lactose-binding lectins from chicken tissues. Purified lectin from intestine is different from those in liver and muscle. J Biol Chem. 255:4236–4239.

Bhat R, Lerea KM, Peng H, Kaltner H, Gabius H-J, Newman SA. 2011. A regulatory network of two galectins mediates the earliest steps of avian limb skeletal morphogenesis. BMC Dev Biol. 11:6.

Boscher C, Dennis JW, Nabi IR. 2011. Glycosylation, galectins and cellular signaling. Curr Opin Cell Biol. 23:383–392.

Čada Z, Smetana K, Jr, Lacina L, Plzáková Z, Stork J, Kaltner H, Russwurm R, Lensch M, André S, Gabius H-J. 2009. Immunohistochemical fingerprinting of the network of seven adhesion/growth-regulatory lectins in human skin and detection of distinct tumour-associated alterations. Folia Biol (Praha). 55:145–152.

Cartharius K, Frech K, Grote K, Klocke B, Haltmeier M, Klingenhoff A, Frisch M, Bayerlein M, Werner T. 2005. MatInspector and beyond: Promoter analysis based on transcription factor binding sites. Bioinformatics. 21:2933–2942.

Chekmenev DS, Haid C, Kel AE. 2005. P-Match: Transcription factor binding site search by combining patterns and weight matrices. Nucleic Acids Res. 33:W432–W437.

Chen H-j, Zheng Z-c, Yuan B-q, Liu Z, Jing J, Wan S-S. 2012. The effect of galectin-3 genetic variants on the susceptibility and prognosis of gliomas in a Chinese population. Neurosci Lett. 518:1–4.

Chiariotti L, Salvatore P, Frunzio R, Bruni CB. 2004. Galectin genes: Regulation of expression. Glycoconj J. 19:441–449.

Cooper DNW. 2002. Galectinomics: Finding themes in complexity. Biochim Biophys Acta. 1572:209–231.

Delacour D, Gouyer V, Zanetta J-P, Drobecq H, Leteurtre E, Grard G, Moreau-Hannedouche O, Maes E, Pons A, André S, et al. 2005. Galectin-4 and sulfatides in apical membrane trafficking in enterocyte-like cells. J Cell Biol. 169:491–501.

Ermakova E, Miller MC, Nesmelova IV, López-Merino L, Berbís MA, Nesmelov Y, Tkachev YV, Lagartera L, Daragan VA, André S, et al. 2013. Lactose binding to human galectin-7 (p53-induced gene 1) induces long-range effects through the protein resulting in increased dimer stability and evidence for positive cooperativity. Glycobiology. 23:508–523.

Fík Z, Valach J, Chovanec M, Mazánek J, Kodet R, Kodet O, Tachezy R, Foltynová E, André S, Kaltner H, et al. 2013. Loss of adhesion/growth-regulatory galectin-9 from squamous cell epithelium in head and neck carcinomas. J Oral Pathol Med. 42:166–173.

Gabius H-J, editor. 2009. The Sugar Code. Fundamentals of glycosciences. Weinheim, Germany: Wiley-VCH.

Gabius H-J, André S, Jiménez-Barbero J, Romero A, Solís D. 2011. From lectin structure to functional glycomics: Principles of the sugar code. Trends Biochem Sci. 36:298–313.

Gabius H-J, Springer WR, Barondes SH. 1985. Receptor for the cell binding site of discoidin I. Cell. 42:449–456.

Garred P. 2008. Mannose-binding lectin genetics: From A to Z. Biochem Soc Trans. 36:1461–1466.

Gitt MA, Barondes SH. 1991. Genomic sequence and organization of two members of a human lectin gene family. Biochemistry. 30:82–89.

Gitt MA, Colnot C, Poirier F, Nani KJ, Barondes SH, Leffler H. 1998. Galectin-4 and galectin-6 are two closely related lectins expressed in mouse gastrointestinal tract. J Biol Chem. 273:2954–2960.

Gitt MA, Massa SM, Leffler H, Barondes SH. 1992. Isolation and expression of a gene encoding L-14-II, a new human soluble lactose-binding lectin. J Biol Chem. 267:10601–10606.

Gitt MA, Wiser MF, Leffler H, Herrmann J, Xia Y-R, Massa SM, Cooper DNW, Lusis AJ, Barondes SH. 1995. Sequence and mapping of galectin-5, a β-galactoside-binding lectin, found in rat erythrocytes. J Biol Chem. 270:5032–5038.

Gitt MA, Xia Y-R, Atchison RE, Lusis AJ, Barondes SH, Leffler H. 1998. Sequence, structure, and chromosomal mapping of the mouse Lgals6 gene, encoding galectin-6. J Biol Chem. 273:2961–2970.

Goldstone SD, Lavin MF. 1991. Isolation of a cDNA clone, encoding a human β-galactoside binding protein, overexpressed during glucocorticoid-induced cell death. Biochem Biophys Res Commun. 178:746–750.

Graham GJ. 1995. Tandem repeat genes and clustered genes. J Theor Biol. 175:71–87.

Gready JE, Zelensky AN. 2009. Routes in lectin evolution: Case study on the C-type lectin-like domains. In: Gabius H-J, editor. The Sugar Code. Fundamentals of Glycosciences. Weinheim, Germany: Wiley-VCH. p. 329–346.

Hastings PJ, Lupski JR, Rosenberg SM, Ira G. 2009. Mechanisms of change in gene copy number. Nat Rev Genet. 10:551–564.

Houzelstein D, Gonçalves IR, Fadden AJ, Sidhu SS, Cooper DNW, Drickamer K, Leffler H, Poirier F. 2004. Phylogenetic analysis of the vertebrate galectin family. Mol Biol Evol. 21:1177–1187.

Houzelstein D, Gonçalves IR, Orth A, Bonhomme F, Netter P. 2008. Lgals6, a 2-million-year-old gene in mice: a case of positive Darwinian selection and presence/absence polymorphism. Genetics. 178:1533–1545.

Houzelstein D, Reyes-Gomez E, Maurer M, Netter P, Higuet D. 2013. Expression patterns suggest that despite considerable functional redundancy, galectin-4 and -6 play distinct roles in normal and damaged mouse digestive tract. J Histochem Cytochem. 61:348–361.

Kadrofske MM, Openo KP, Wang JL. 1998. The human LGALS3 (galectin-3) gene: Determination of the gene structure and functional characterization of the promoter. Arch Biochem Biophys. 349:7–20.

Kaltner H, Gabius H-J. 2012. A toolbox of lectins for translating the sugar code: The galectin network in phylogenesis and tumors. Histol Histopathol. 27:397–416.


Kaltner H, Kübler D, López-Merino L, Lohr M, Manning JC, Lensch M, Seidler J, Lehmann WD, André S, Solís D, et al. 2011. Toward comprehensive analysis of the galectin network in chicken: Unique diversity of galectin-3 and comparison of its localization profile in organs of adult animals to the other four members of this lectin family. Anat Rec. 294:427–444.

Kaltner H, Solís D, André S, Lensch M, Manning JC, Mürnseer M, Sáiz JL, Gabius H-J. 2009. Unique chicken tandem-repeat-type galectin: Implications of alternative splicing and a distinct expression profile compared to those of the three proto-type proteins. Biochemistry. 48:4403–4416.

Kaltner H, Solís D, Kopitz J, Lensch M, Lohr M, Manning JC, Mürnseer M, Schnölzer M, André S, Sáiz JL, et al. 2008. Prototype chicken galectins revisited: Characterization of a third protein with distinctive hydrodynamic behaviour and expression pattern in organs of adult animals. Biochem J. 409:591–599.

Kasai K-I, Hirabayashi J. 1996. Galectins: A family of animal lectins that decipher glycocodes. J Biochem. 119:1–8.

Kel AE, Gossling E, Reuter I, Cheremushkin E, Kel-Margoulis OV, Wingender E. 2003. MATCH: A tool for searching transcription factor binding sites in DNA sequences. Nucleic Acids Res. 31:3576–3579.

Kopitz J, André S, von Reitzenstein C, Versluis K, Kaltner H, Pieters RJ, Wasano K, Kuwabara I, Liu F-T, Cantz M, et al. 2003. Homodimeric galectin-7 (p53-induced gene 1) is a negative growth regulator for human neuroblastoma cells. Oncogene. 22:6277–6288.

Kopitz J, Ballikaya S, André S, Gabius H-J. 2012. Ganglioside GM1/galectin-dependent growth regulation in human neuroblastoma cells: Special properties of bivalent galectin-4 and significance of linker length for ligand selection. Neurochem Res. 37:1267–1276.

Kopitz J, Bergmann M, Gabius H-J. 2010. How adhesion/growth-regulatory galectins-1 and -3 attain cell specificity: Case study defining their target on neuroblastoma cells (SK-N-MC) and marked affinity regulation by affecting microdomain organization of the membrane. IUBMB Life. 62:624–628.

Krzeminski M, Singh T, André S, Lensch M, Wu AM, Bonvin AMJJ, Gabius H-J. 2011. Human galectin-3 (Mac-2 antigen): Defining molecular switches of affinity to natural glycoproteins, structural and dynamic aspects of glycan binding by flexible ligand docking and putative regulatory sequences in the proximal promoter region. Biochim Biophys Acta. 1810:150–161.

Kuwabara I, Kuwabara Y, Yang R-Y, Schuler M, Green DR, Zuraw BL, Hsu DK, Liu F-T. 2002. Galectin-7 (PIG1) exhibits pro-apoptotic function through JNK activation and mitochondrial cytochrome c release. J Biol Chem. 277:3487–3497.

Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, et al. 2007. Clustal W and Clustal X version 2.0. Bioinformatics. 23:2947–2948.

Ledeen RW, Wu G, André S, Bleich D, Huet G, Kaltner H, Kopitz J, Gabius H-J. 2012. Beyond glycoproteins as galectin counterreceptors: Tumor/effector T cell growth control via ganglioside GM1. Ann NY Acad Sci. 1253:206–221.

Lensch M, Lohr M, Russwurm R, Vidal M, Kaltner H, André S, Gabius H-J. 2006. Unique sequence and expression profiles of rat galectins-5 and -9 as a result of species-specific gene divergence. Int J Biochem Cell Biol. 38:1741–1758.

Leonidas DD, Vatzaki EH, Vorum H, Celis JE, Madsen P, Acharya KR. 1998. Structural basis for the recognition of carbohydrates by human galectin-7. Biochemistry. 37:13930–13940.

Lipkowitz MS, Leal-Pinto E, Rappoport JZ, Najfeld V, Abramson RG. 2001. Functional reconstitution, membrane targeting, genomic structure, and chromosomal localization of a human urate transporter. J Clin Invest. 107:1103–1115.

Liu F-T, Yang R-Y, Hsu DK. 2012. Galectins in acute and chronic inflammation. Ann NY Acad Sci. 1253:80–91.

Lohr M, Lensch M, André S, Kaltner H, Siebert H-C, Smetana K, Jr, Sinowatz F, Gabius H-J. 2007. Murine homodimeric adhesion/growth-regulatory galectins-1, -2 and -7: Comparative profiling of gene/promoter sequences by database mining, of expression by RT-PCR/immunohistochemistry and of contact sites for carbohydrate ligands by computational chemistry. Folia Biol (Praha). 53:109–128.

López-Lucendo MF, Solís D, Sáiz JL, Kaltner H, Russwurm R, André S, Gabius H-J, Romero A. 2009. Homodimeric chicken galectin CG-1B (C-14): Crystal structure and detection of unique redox-dependent shape changes involving inter- and intrasubunit disulfide bridges by gel filtration, ultracentrifugation, site-directed mutagenesis, and peptide mass fingerprinting. J Mol Biol. 386:366–378.

Lu Y, Lotan R. 1999. Transcriptional regulation by butyrate of mouse galectin-1 gene in embryonal carcinoma cells. Biochim Biophys Acta. 1444:85–91.

Madsen P, Rasmussen HH, Flint T, Gromov P, Kruse TA, Honore B, Vorum H, Celis JE. 1995. Cloning, expression, and chromosome mapping of human galectin-7. J Biol Chem. 270:5823–5829.

Maglott D, Ostell J, Pruitt KD, Tatusova T. 2011. Entrez gene: Gene-centered information at NCBI. Nucleic Acids Res. 39:D52–D57.

Magnaldo T, Bernerd F, Darmon M. 1995. Galectin-7, a human 14-kDaS-lectin, specifically expressed in keratinocytes and sensitive to retinoic acid.Dev Biol. 168:259–271.

Matsumoto R, Matsumoto H, Seki M, Hata M, Asano Y, Kanegasaki S, StevensRL, Hirashima M. 1998. Human ecalectin, a variant of human galectin-9, isa novel eosinophil chemoattractant produced by T lymphocytes. J BiolChem. 273:16976–16984.

Morelle W, Stechly L, André S, Van Seuningen I, Porchet N, Gabius H-J,Michalski JC, Huet G. 2009. Glycosylation pattern of brush border-associated glycoproteins in enterocyte-like cells: Involvement of complex-type N-glycans in apical trafficking. Biol Chem. 390:529–544.

Nei M, Rooney AP. 2005. Concerted and birth-and-death evolution of multi-gene families. Annu Rev Genet. 39:121–152.

Nesmelova IV, Berbís MÁ, Miller MC, Cañada FJ, André S, Jiménez-BarberoJ, Gabius H-J, Mayo KH. 2012. 1H, 13C, and 15N backbone and side-chainchemical shift assignments for the 31 kDa human galectin-7 (p53-inducedgene 1) homodimer, a pro-apoptotic lectin. Biomol NMR Assign. 6:127–129.

Oda Y, Kasai K-I. 1983. Purification and characterization of β-galactoside-binding lectin from chick embryonic skin. Biochim Biophys Acta.761:237–245.

Onami TM, Lin MY, Page DM, Reynolds SA, Katayama CD, Marth JD,Irimura T, Varki A, Varki N, Hedrick SM. 2002. Generation of mice deficientfor macrophage galactose- and N-acetylgalactosamine-specific lectin:Limited role in lymphoid and erythroid homeostasis and evidence for mul-tiple lectins.Mol Cell Biol. 22:5173–5182.

Pál Z, Antal P, Millinghoffer A, Hullám G, Pálóczi K, Tóth S, Gabius H-J,Molnár MJ, Falus A, Buzás EI. 2010. A novel galectin-1 and interleukin-2receptor β haplotype is associated with autoimmune myasthenia gravis. JNeuroimmunol. 229:107–111.

Pál Z, Antal P, Srivastava SK, Hullám G, Semsei AF, Gál J, Svébis M, Soós G,Szalai C, André S, et al. 2012. Non-synonymous single nucleotide poly-morphisms in genes for immunoregulatory galectins: Association ofgalectin-8 (F19Y) occurrence with autoimmune diseases in a Caucasianpopulation. Biochim Biophys Acta. 1820:1512–1518.

Polyak K, Xia Y, Zweier JL, Kinzler KW, Vogelstein B. 1997. A model forp53-induced apoptosis. Nature. 389:300–305.

Remmelink M, de Leval L, Decaestecker C, Duray A, Crompot E, Sirtaine N, André S, Kaltner H, Leroy X, Gabius H-J, et al. 2011. Quantitative immunohistochemical fingerprinting of adhesion/growth-regulatory galectins in salivary gland tumours: Divergent profiles with diagnostic potential. Histopathology. 58:543–556.

Rice P, Longden I, Bleasby A. 2000. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 16:276–277.

Saal I, Nagy N, Lensch M, Lohr M, Manning JC, Decaestecker C, André S, Kiss R, Salmon I, Gabius H-J. 2005. Human galectin-2: Expression profiling by RT-PCR/immunohistochemistry and its introduction as a histochemical tool for ligand localization. Histol Histopathol. 20:1191–1208.

Sakakura Y, Hirabayashi J, Oda Y, Ohyama Y, Kasai K-I. 1990. Structure of chicken 16-kDa β-galactoside-binding lectin. Complete amino acid sequence, cloning of cDNA, and production of recombinant lectin. J Biol Chem. 265:21573–21579.

Sakuishi K, Jayaraman P, Behar SM, Anderson AC, Kuchroo VK. 2011. Emerging Tim-3 functions in antimicrobial and tumor immunity. Trends Immunol. 32:345–349.

Sarter K, Janko C, André S, Muñoz LE, Schorn C, Winkler S, Rech J, Kaltner H, Lorenz HM, Schiller M, et al. 2013. Autoantibodies against galectins are associated with antiphospholipid syndrome in patients with systemic lupus erythematosus. Glycobiology. 23:12–22.

Saussez S, Cucu DR, Decaestecker C, Chevalier D, Kaltner H, André S, Wacreniez A, Toubeau G, Camby I, Gabius H-J, et al. 2006. Galectin-7 (p53-induced gene 1): A new prognostic predictor of recurrence and survival in stage IV hypopharyngeal cancer. Ann Surg Oncol. 13:999–1009.

Shoja V, Zhang L. 2006. A roadmap of tandemly arrayed genes in the genomes of human, mouse, and rat. Mol Biol Evol. 23:2134–2141.

Smetana K, Jr, André S, Kaltner H, Kopitz J, Gabius H-J. 2013. Context-dependent multifunctionality of galectin-1: A challenge for defining the lectin as therapeutic target. Expert Opin Ther Targets. 17:379–392.

Solís D, Maté MJ, Lohr M, Ribeiro JP, López-Merino L, André S, Buzamet E, Cañada FJ, Kaltner H, Lensch M, et al. 2010. N-domain of human adhesion/growth-regulatory galectin-9: Preference for distinct conformers and non-sialylated N-glycans and detection of ligand-induced structural changes in crystal and solution. Int J Biochem Cell Biol. 42:1019–1029.

Solís D, Romero A, Kaltner H, Gabius H-J, Díaz-Mauriño T. 1996. Different architecture of the combining site of the two chicken galectins revealed by chemical mapping studies with synthetic ligand derivatives. J Biol Chem. 271:12744–12748.

Stancic M, Slijepcevic D, Nomden A, Vos MJ, de Jonge JC, Sikkema AH, Gabius H-J, Hoekstra D, Baron W. 2012. Galectin-4, a novel neuronal regulator of myelination. GLIA. 60:918–935.

Stechly L, Morelle W, Dessein AF, André S, Grard G, Trinel D, Dejonghe MJ, Leteurtre E, Drobecq H, Trugnan G, et al. 2009. Galectin-4-regulated delivery of glycoproteins to the brush border membrane of enterocyte-like cells. Traffic. 10:438–450.

Stowell SR, Arthur CM, Mehta P, Slanina KA, Blixt O, Leffler H, Smith DF, Cummings RD. 2008. Galectins-1, -2, and -3 exhibit differential recognition of sialylated glycans and blood group antigens. J Biol Chem. 283:10109–10123.

Sturm A, Lensch M, André S, Kaltner H, Wiedenmann B, Rosewicz S, Dignass AU, Gabius H-J. 2004. Human galectin-2: Novel inducer of T cell apoptosis with distinct profile of Caspase activation. J Immunol. 173:3825–3837.

Than NG, Romero R, Goodman M, Weckle A, Xing J, Dong Z, Xu Y, Tarquini F, Szilagyi A, Gal P, et al. 2009. A primate subfamily of galectins expressed at the maternal-fetal interface that promote immune cell death. Proc Natl Acad Sci USA. 106:9731–9736.

Thompson JD, Higgins DG, Gibson TJ. 1994. ClustalW: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

Tsuiji M, Fujimori M, Ohashi Y, Higashi N, Onami TM, Hedrick SM, Irimura T. 2002. Molecular cloning and characterization of a novel mouse macrophage C-type lectin, mMGL2, which has a distinct carbohydrate specificity from mMGL1. J Biol Chem. 277:28892–28901.

Varela PF, Solís D, Díaz-Mauriño T, Kaltner H, Gabius H-J, Romero A. 1999. The 2.15 Å crystal structure of CG-16, the developmentally regulated homodimeric chicken galectin. J Mol Biol. 294:537–549.

Velasco S, Díez-Revuelta N, Hernández-Iglesias T, Kaltner H, André S, Gabius H-J, Abad-Rodríguez J. 2013. Neuronal Galectin-4 is required for axon growth and for the organization of axonal membrane L1 delivery and clustering. J Neurochem. 125:49–62.

Villeneuve C, Baricault L, Canelle L, Barboule N, Racca C, Monsarrat B, Magnaldo T, Larminat F. 2011. Mitochondrial proteomic approach reveals galectin-7 as a novel bcl-2 binding protein in human cells. Mol Biol Cell. 22:999–1013.

Vokhmyanina OA, Rapoport EM, André S, Severov VV, Ryzhov I, Pazynina GV, Korchagina E, Gabius H-J, Bovin NV. 2012. Comparative study of the glycan specificities of cell-bound human tandem-repeat-type galectin-4, -8 and -9. Glycobiology. 22:1207–1217.

Wada J, Kanwar YS. 1997. Identification and characterization of galectin-9, a novel β-galactoside-binding mammalian lectin. J Biol Chem. 272:6078–6086.

Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ. 2009. Jalview Version 2—A multiple sequence alignment editor and analysis workbench. Bioinformatics. 25:1189–1191.

Weischenfeldt J, Symmons O, Spitz F, Korbel JO. 2013. Phenotypic impact of genomic structural variation: Insights from and for human disease. Nat Rev Genet. 14:125–138.

Wu AM, Singh T, Liu J-H, Krzeminski M, Russwurm R, Siebert H-C, Bonvin AMJJ, André S, Gabius H-J. 2007. Activity-structure correlations in divergent lectin evolution: Fine specificity of chicken galectin CG-14 and computational analysis of flexible ligand docking for CG-14 and the closely related CG-16. Glycobiology. 17:165–184.

Zhou D, Ge H, Sun J, Gao Y, Teng M, Niu L. 2008. Crystal structure of the C-terminal conserved domain of human GRP, a galectin-related protein, reveals a function mode different from those of galectins. Proteins. 71:1582–1588.
