Duplication and evolution of the P-glycoprotein genes in pig



ELSEVIER Biochimica et Biophysica Acta 1307 (1996) 205-212


Sarah Childs, Victor Ling +

British Columbia Cancer Research Centre, 601 West 10th Ave., Vancouver, BC V5Z 1L3, Canada

Received 16 November 1995; accepted 15 February 1996

Abstract

The P-glycoproteins (Pgp's) are a small family of proteins frequently associated with the multidrug resistance phenotype in drug-selected cell lines. The number of Pgp isoforms in different mammalian species is variable, although the reason for having a larger or smaller number of isoforms is not known. Two isoform classes from human and three from rodents have been extensively characterised and have been shown to have independent expression patterns and substrate preferences. We have cloned 3' terminal genomic fragments for five members of the Pgp multigene family from the pig, which is the largest number of Pgp genes found in any mammalian species to date. Sequential duplications of one class of Pgp gene have given rise to this large gene family, since four genes show similarity to the drug resistance-causing Class I isoform of Pgp. The fifth pig Pgp gene shows similarity to the phosphatidylcholine-translocating Class III isoform. The history of the duplications creating this large gene family can be traced by atypical features which have been inherited in common. These include a mutation in the stop codon at the 3' end of four Class I Pgp genes, increasing the coding region by six amino acids, and a SINE element of the PRE1 family inserted into the 3' untranslated region of three Class I Pgp's. We demonstrate expression of Class I pgp in pig brain cultured capillary endothelial cells, and Class III pgp in the liver, two important sites of expression of Pgp in rodents and humans. Thus there appears to be strong phylogenetic conservation in mammals of both sequence and expression of these two Pgp isoforms.

Keywords: P-glycoprotein; Short interspersed repetitive element; Gene conversion; Multigene family; Evolution

1. Introduction

P-glycoproteins are a family of membrane transport proteins expressed in a large number of cell and tissue types throughout phylogeny, from protozoans to animals and plants [1]. P-glycoprotein was first identified in cultured animal cells, where its overexpression conferred resistance to a wide variety of structurally unrelated cytotoxic drugs. Homology between Pgp and the bacterial toxin transporter Haemolysin B suggested that Pgp acts in the multidrug resistance (MDR) phenotype as an ATP-dependent efflux pump, actively transporting drugs from the cell [2]. Cloning of Pgp revealed that there are three closely related isoforms in rodents and two in humans [3-5]. In animals, the Pgp isoforms have unique tissue-expression patterns [6]; furthermore, transfection studies have shown that the genes have distinct (although somewhat overlapping) substrate specificities [7]. These studies suggest that each Pgp isoform has a unique in vivo function.

Abbreviations: SINE, short interspersed repetitive element.
+ Corresponding author. Fax: +1 (604) 8776150.

0167-4781/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved
PII S0167-4781(96)00048-6

However, these natural functions are not yet completely understood.

Homozygous disruption of the Class I Pgp (mdr3 or mdr1a) in mice has demonstrated the importance of this isoform in preventing the entry of cytotoxic compounds into the brain, and thus in forming the blood-brain barrier [8]. In addition to the brain, Class I Pgp is highly expressed on the apical surface of the jejunum and colon, the brush border of the proximal tubules of kidney, the apical surface of small ductules of the pancreas, and the surface of endothelial cells in the small blood vessels of ovary and testis, and is found in lesser amounts in other tissues of both rodents and humans [9,10]. The function of Class I Pgp in tissues other than brain is not understood, as there is little phenotypic effect in the knockout mice. This may be because rodents have a second Class I-like Pgp isoform (Class II) which may be able to compensate functionally for the lack of Class I Pgp. This second isoform, the mdr1 (or mdr1b) gene in mice, is constitutively highly expressed in the adrenal cortex and markedly induced in the pregnant uterus, where it is hypothesised to play a role in steroid secretion [11]. Class II Pgp is also strongly upregulated in experimental liver tumours and in hepatocyte culture [12].

In humans and rodents, the Class III isoform of Pgp is highly expressed in hepatocytes where it is localised to the bile canalicular membrane [9,10]. Homozygous disruption of the mdr2 gene in mice results in a deficiency of phosphatidylcholine (PC) in bile and liver pathology, but no visible phenotype in other tissues [13]. Subsequent experiments utilising heterologous expression of mdr2 in a yeast vesicle system have shown that it is a phospholipid translocase specific for PC [14].

Extensive sequence homology among the Pgp genes suggests that they have arisen through a series of gene duplication events. The number of Pgp genes in mammals varies between two and five, as predicted by Southern blotting of genomic DNA from different species. Like humans, monkeys (order Primates) are predicted to have two Pgp genes, as are rabbits (Lagomorpha). Cows (Artiodactyla) are predicted to have three Pgp genes, dogs (Carnivora) four, and pigs (Artiodactyla) five [15]. This survey shows that several mammals have only two Pgp genes. On the other hand, the gene family has increased greatly in size in other lines of evolution, even among animals in the same mammalian order such as pig and cow. The reason for this is not known. Nor is it understood what the consequences are, in terms of function and expression, when there are many Pgp isoforms in a species. In this study we isolate genomic fragments for the five-membered Pgp multigene family from the pig in order to derive further insight into the continuing evolution of the family at the gene level in mammals.

2. Materials and methods

2.1. Genomic DNA and library construction

Genomic DNA was prepared from freshly excised pig liver or AuxB1 Chinese hamster ovary cells using Proteinase K digestion and ethanol precipitation [16]. For Southern analysis, 10 µg of DNA was digested overnight with EcoRI, BamHI or HindIII, electrophoresed on a 0.7% gel and transferred to Hybond N membrane (Amersham, Arlington Heights IL). The blot was blocked in 1% glycine, 5× SSPE (1× SSPE is 0.18 M NaCl, 0.01 M sodium phosphate and 1 mM EDTA, pH 7.7), 50% formamide, 10× Denhardt's solution (1× Denhardt's solution is 0.01% Ficoll, 0.01% BSA, 0.01% polyvinylpyrrolidone), 15 µg salmon sperm DNA and 5% dextran sulphate for 8 h at 42°C before labelled probe was added and allowed to hybridize overnight. The probe PEX-172 recognises all known mammalian pgp genes and is derived from base pairs 3714-3895 of the hamster pgp2 cDNA [5]. To synthesise the probe, primers (5'GTCCAGGAAGCCCTGGAC3') and (5'CCATGGAGAAATAGATGC3'), 1 ng of plasmid template, 100 µM dATP, dGTP, dTTP, 100 nM dCTP and 50 µCi [32P]dCTP were used in 25 cycles of PCR with 1 U Taq polymerase and 1× PCR buffer (10× PCR buffer is 100 mM Tris-Cl, pH 8.3, 500 mM KCl, 10 mM MgCl2, 0.5% Triton X-100). The labelled probe was separated from free radiolabel on a G-50 Sephadex spin column [16]. After hybridization the blot was washed in 2× SSC (1× SSC is 0.18 M NaCl, 0.015 M sodium citrate, pH 7.0) and 0.1% SDS for 20 min at room temperature and 40 min at 50°C.

To isolate genomic fragments at the C-terminus of the pig pgp genes, 100 µg of EcoRI-digested genomic DNA was size-fractionated according to the band sizes observed on the genomic Southern blot (0.5-5 kb or 8-23 kb). The DNA was electroeluted overnight in dialysis membrane (Spectrum Medical Industries, Los Angeles, CA), precipitated, and cloned into the EcoRI site of λgt10 (pgp1A, 1B, 1C, 1D) or EMBL4 (pgp3). The libraries were screened with the probe PEX-172. Positive clones were subcloned into pUC9 for sequencing. The sequences have been deposited in GenBank with the accession numbers U27703, U27704, U27705, U27706 and U27707.

2.2. Sequence and evolutionary analysis

The University of Wisconsin Genetics Computer Group programs, Version 8.0 (Madison WI) were used for sequence alignment and analysis. The Pgp class to which each pig gene belongs was predicted with the GCG GAP program by comparing the 3' UTR sequences of hamster and human pgp's with the predicted 3' UTRs of the pig pgp's. In the case of pgp1B, pgp1C and pgp1D, comparisons were made after removing the SINE element from the sequence, since this is not present in other species and confounds the analysis. Randomizations of the 3' UTR sequences were used to assess the level of background identity in this AT-rich region, which is approx. 50% (data not shown).
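The randomization control can be illustrated with a minimal sketch. It is not the GCG GAP program: it uses a plain Needleman-Wunsch global alignment with a linear gap penalty, the scoring values are illustrative assumptions, identity is counted over all alignment columns (gaps included), and the function names and the two toy AT-rich sequences are hypothetical rather than taken from the pig clones.

```python
import random


def global_identity(a: str, b: str, match=1, mismatch=-1, gap=-2) -> float:
    """Percent identity over a Needleman-Wunsch global alignment.

    Linear gap penalty; the scores are illustrative, not GAP's defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and all columns
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        cols += 1
    return 100.0 * same / cols if cols else 0.0


def background_identity(a: str, b: str, trials: int = 200, seed: int = 0) -> float:
    """Mean identity after shuffling both sequences, which preserves base
    composition but destroys any real homology."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        shuffled_a = "".join(rng.sample(a, len(a)))
        shuffled_b = "".join(rng.sample(b, len(b)))
        total += global_identity(shuffled_a, shuffled_b)
    return total / trials


# Hypothetical AT-rich toy sequences, for illustration only
utr_a = "ATTTATATTAAATGTTATATATTGTCTAATATTTG"
utr_b = "TTATAAATTATTCAAAATAAATTCCAATCATTCTA"
print(round(global_identity(utr_a, utr_b), 1),
      round(background_identity(utr_a, utr_b), 1))
```

An observed identity is only informative to the extent that it exceeds the shuffled baseline, which is why a background near 50% matters when reading the 3' UTR comparisons.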

2.3. Expression of the pig pgp genes

RNA was obtained from fresh pig liver or from pig brain cultured capillary endothelial cells (CCECs) [17] by the cesium chloride density gradient centrifugation method [16]. 5 µg of total RNA was reverse transcribed at 42°C by murine Moloney reverse transcriptase (Gibco BRL, Gaithersburg MD) according to the manufacturer's protocol using 100 ng oligo dT24 as a primer. PCR primers were synthesised for: Class I pig pgp's (5'AAGCGCTCATCAACTGTG3') and (5'GGCACTTTATGCAAACATTC3'); Class III pig pgp (5'GAAGCCCTGGACAAA3') and (5'AACGATTGGAATTTATTTTAAA3'); or for β2-microglobulin (5'CAGCAAGGACTGGTCTTTCTAC3') and (5'AAGCATATCAATATTAAAAAGCAA3'). It was not possible to obtain sequence of pgp1C in the region of the most 3' Class I primer since the clone ends before the 3' primer site. It is therefore not known whether the Class I primer pair will recognise the pgp1C cDNA; however, the primers will recognise pgp1A, pgp1B and pgp1D. 100 ng of each primer and 1/40 of the reverse transcriptase reaction or 0.5 ng plasmid was used in 40 cycles of synthesis (94°C 30 s, 50°C 60 s, 72°C 60 s) in a Perkin Elmer 480 Thermal Cycler in 1× PCR buffer, 100 µM each dNTP, and 1 U Taq. Products were visualised after separation on an agarose gel. The pgp1A product is 389 bp, the pgp1B/pgp1C/pgp1D product is around 660 bp, the pgp3 product is 240 bp and the β2-microglobulin product is 218 bp.
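The logic of distinguishing transcripts by product size can be sketched in silico. The example below is illustrative only: the primers are the Class I pair quoted above, but the two templates are hypothetical toy sequences (not the pig clones), the helper names are assumptions, and primers are required to match exactly, which real PCR does not demand.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, fwd: str, rev: str):
    """Length of the product primed by fwd and rev on the top strand.

    Requires exact primer matches; returns None if either primer is absent.
    """
    template = template.upper()
    start = template.find(fwd.upper())
    if start == -1:
        return None
    rev_site = revcomp(rev)  # reverse primer as it reads on the top strand
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Class I primer pair quoted in the text
FWD = "AAGCGCTCATCAACTGTG"
REV = "GGCACTTTATGCAAACATTC"

# Hypothetical toy templates: identical except for a mock 20-nt insertion
plain = FWD + "A" * 10 + revcomp(REV)
with_insert = FWD + "A" * 5 + "C" * 20 + "A" * 5 + revcomp(REV)

size_plain = amplicon_length(plain, FWD, REV)
size_insert = amplicon_length(with_insert, FWD, REV)
print(size_plain, size_insert, size_insert - size_plain)
```

The difference between the two product sizes equals the length of the sequence inserted between the primer sites. By the same reasoning, the reported products of 389 bp and around 660 bp differ by roughly 270 bp.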

3. Results

[Fig. 1 image: Southern blot. Size markers shown beside the blot: 8454, 4822 and 3675 bp. Lane labels: E, E, B, H.]

3.1. Prediction of the number of pgp genes in the pig

The PEX-172 probe is a short probe derived from the C-terminal exon of hamster pgp2, a region highly conserved across mammalian species and capable of recognising all pgp gene classes. It has previously been used to identify members of the pgp multigene family in hamster, rat and winter flounder [5,18,19]. The number of bands visualised on a Southern genomic blot using this probe will correspond to the number of pgp genes in a species, provided that the genomic sequence to which it hybridises has not been cut by a restriction enzyme. To eliminate this possibility we routinely hybridize several different restriction digests of genomic DNA, since the probability of a 172 bp fragment being cut by multiple different hexanucleotide-recognizing restriction enzymes is quite low. Fig. 1 shows a Southern blot of pig genomic DNA digested with three restriction enzymes and probed with PEX-172. Five bands are seen in each lane. Equally intense bands are seen after digestion with several restriction enzymes, suggesting that the five bands do not represent alleles of a smaller number of pgp genes. This pattern was also reproducible with the DNA of several pigs, suggesting that it is a general phenomenon of Sus scrofa and not of one individual pig (data not shown).
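The claim that a short target is unlikely to be cut by several six-base cutters can be made roughly quantitative. The following back-of-envelope sketch is not part of the original analysis: it assumes a random sequence with equal base frequencies, treats overlapping window positions and the three enzymes as independent, and uses a hypothetical function name.

```python
def p_site_in_fragment(fragment_len: int, site_len: int = 6) -> float:
    """Approximate probability that one specific recognition site of length
    site_len occurs at least once in a random fragment of fragment_len bp."""
    windows = fragment_len - site_len + 1  # possible start positions
    p_window = 0.25 ** site_len            # chance a given window matches
    return 1.0 - (1.0 - p_window) ** windows


p_one = p_site_in_fragment(172)  # one hexanucleotide enzyme, 172 bp target
p_all_three = p_one ** 3         # same target cut by all three enzymes
print(f"one enzyme: {p_one:.3f}; all three: {p_all_three:.1e}")
```

Under these assumptions a single enzyme cuts a given 172 bp target only about 4% of the time, so a band pattern that agrees across three independent digests is unlikely to be distorted by internal sites.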

3.2. Isolation and analysis of five pig pgp genomic fragments

Utilising the PEX-172 probe, five unique genomic fragments corresponding to the C-terminus of Pgp were isolated and sequenced from an EcoRI digest of pig genomic DNA (Fig. 2A). The gene fragments were named pgp1A (2.1 kb), pgp1B (2.3 kb), pgp1C (0.9 kb), pgp1D (4.0 kb) and pgp3 (15 kb) according to the similarity they showed to either Class I or Class III pgp genes in the region of their 3' UTR. The 3' UTRs of many genes have been found to be conserved cross-species, and indicative of individual isoforms. This is true of Pgp genes in human and rodent species [5]. The relationships of the pig pgp genes to those of hamster and human were determined by comparing the 3' UTR sequences of the genes using the Genetics Computer Group program GAP. The genes pgp1A, pgp1B and pgp1D show high identity to the Class I isoform of Pgp and to each other, while the pig pgp3 shows high identity to the Class III isoform. pgp1C does not show high identity to any Pgp isoform in the 3' UTR; however, it shares a number of structural features with pgp1A, pgp1B and pgp1D (Fig. 3). For this reason it has been classified as a Class I gene (Table 1).

Fig. 1. Prediction of the number of pgp genes in the genome of pig. Southern blot utilising 10 µg of AuxB1 Chinese hamster ovary cell or pig genomic DNA digested with EcoRI (E), BamHI (B) or HindIII (H), electrophoresed, transferred to nylon membrane and probed with the PEX-172 probe. PEX-172 recognises all classes of pgp and is in a region of the gene highly conserved cross species.

[Size markers shown beside the blot: 2323, 1929, 1371 and 702 bp.]

The four pig Class I isoforms show several anomalies when compared with Class I Pgp's from other species. Firstly, the four Class I isoforms share an intron sequence upstream of the final exon which is > 99% conserved among all four genes (Fig. 2A). While a database search using BLAST [20] did not identify any similar sequences in GenBank, GAP analysis shows this sequence to be most like the sequence of the intron of the hamster Class I gene [5]. This similarity suggests that it might be an intron which has recently been subjected to gene conversion. Gene conversion is a genomic process in which sequences from closely related genes can be used to 'correct' each other. The mechanism of this process is not understood, but gene conversion is thought to play a major role in the maintenance of the integrity of large multigene families [21]. In the case of the pig pgp's, this intron is more highly conserved than the adjacent coding sequence. Thus the conversion of these genes may be an extremely recent event. In contrast, the intron of the Class III gene is completely different from these Class I introns, and bears similarity to the Class III intron of hamster. It has therefore been exempt from this process.

Secondly, the four Class I isoforms share identical TGATGA to TCATCA mutations at the position equivalent to the stop codon of Class I Pgp's from other species, or Class III of pig. This results in a predicted protein which is 6 amino acids longer than Class I Pgp from other species (Fig. 2B and 2C). As this mutation is common to

A i

Consensus pgplA pgplB pgplC pgplD

90 GAATTCGGTA TTCTGAAATG ACAGAACCCT TGCTTTGTAG AATCTATCCA TATCCACTGA CACTACTGTT CAAGAAGCAC TTACTGTTGC

pgp3 a gaaca actatgcat tta tatg aaaaacaca tgcaaagc c aag_t_e_ a_gggaag a tg_aagg a t t ctt

91 178 Consensus AGAAATGAAT TGTGAGGATT GTGATGGAGG AGGAATTAAA GGGAGAGTAT TATTATGATT GGAGTCAATT TATGTGATAA -TGAACAG pgplA a c pgpln t ~ pgplC c C pgplD t - - pgp3 ttgtg g aaata t a~atttcc tttcgtt_ aaacatat ccac art ta at gg ct c t tt a C t

B Consensus

pgpIA

1 80

GTTGTCCAAG AAGCCCTGGA CAAAGCCAGA GAAGGCCGCA CCTGCATCGT GATCGCTCAC CGCCTGTCCA CCATCCAGAA

pgplB pgplC pgplD pgp3 a

Consensus

pgplA

81 160 CGCAGACTTG ATCGTGGTGA TTCAGAACGG CAAAGTCCGG GAGCATGGCA CACACCAGCA GCTACTGGCA CAGAAAGGCA

a t

pgplB aa t

pgplC a pgplD a

pgp3

Consensus pgplA pgplB pgplC pgplD pgp3

161 222 TCTATTTCTC CATGGTCAGC GTCCAGGCTG GAGCAAAGCG CTCATCAACT GTGAT-ATAT GA

e c g_ __

a a t t cc

_ga cc aa __g__g.__ cctg_t gtg

C
         1                                                                             73
pgp1A    VVQEALDKAR EGRTCIVIAH RLSTIQNADL IVVIQNGKVQ EYGTHQQLLA QKGIYFSMVS VQAGAKRSST VTM*
pgp1B    VVQEALDKAR EGRTCIMIAH RLSTIQNTDL IVVIQKSKVR EHGTHQQLLA QKGIYFSMVS VQAGAKRSST VMI*
pgp1C    VVQEALDKAR EGRTCIVIAH RLSTIQNADS IVVIQKGKVR EHGTHQQLLA QKGIYFSMVS VQAGTKHLST VMI*
pgp1D    VVQEALDKAR EGRTCIVIAH RLSTIQNTDL IVVIQNGKVR EHGTHQQLLA QKGIYFSMVS VQAGAKRSST VTI*
pgp3     IVQEALDKAR EGRTCIVIAH RLSTIQNADL IVVIQNGKVQ EHGTHQQLLA QKGIYFSMVS VQAGTPN**

Fig. 2. Sequence of five pig pgp genomic fragments. (A) DNA sequence of the intron upstream of the final exon of Pgp. A consensus sequence of the genes is presented on the top line while individual differences from the consensus are indicated. The fragments are reported from a conserved EcoRI site which is present in all Class I pig pgp genes. (B) DNA sequence of the C-terminal exon of the pig pgp genes. (C) Predicted amino acid sequence of the pig Pgp C-terminal exons. (D) 3'UTR sequences of the pig pgp's reported from the first nucleotide of the 3'UTR (pgp3), or the equivalent position in the Class I pgp's. A SINE element interrupts the 3'UTR in three pig pgp's and is boxed. Potential polyA addition signals corresponding to those utilised in Class I (△) and Class III (●) pgp genes of other species are underlined, and sites of polyA tail addition in hamster are marked (Class I (■) and Class III (○)).

D pgplA pgplB pgplC pgplD pgp3

p.gplC pgplD

0 -- -- "' 99- actgtgacca tgtgaattgt tatatgttgt ctaatatttg t~ttaaaatc tg..~. ............................. ................. actgtgatga tatgagctgt tatatactgt ctaatatttg tg~taaaata tg~=atttag gagttcccat tg~ggctaac tggtaatgaa caccagtagt actgtgatga tatgagctgt tatatattgt ttaatatttg tg~taaaata tgg~atttag gagtttcctt tgtggetcag cagtaacgaa ccagactagt actgtgacca tatgagttgt ta.aaattgt ctaacatttg tglctaaaata tg~atttag gagttccctt tgtggctcag cagtaacgaa ccagactact actcctgtta tagtgtatct tcaaaataaa ttccaatcat tctacaactt ttg~tgactt atgtaagaag tttctaagtc accataaaat atagg

"I "I00 . . . . . . 199

pgplA .................................................................................................... pgplB atctatgagg acttgggtct gatccttagc ccggttcagt gggt.aagga tct .... ggc attgcttcaa gctgcagtgt aggtcacaga cacggctagg pgplC atccatgagg atgggggttc aatccttggc ctcactcagt gggttaagga tee .... agt gttgtggtga gctgacgtgt aggttgcaga ctcatttcac pgplD atccatgaag ataggggttc aatccttqac ctcattcagt gggttaagga tccagtcagt gttgtggtga gctgccatgt aggttgcaga ctcatcttga

2[30 299 pgplA .................................................................................................... pgpIB ttccagcatt acEgtggctg tggtgtaggc ctgcagatgc agctctgatt cgaccccctg tctgggaact tccatatgcc ctgggtgtgg acctaaaaaa

arc ..... tt ggcattgctg tggtgtaggc ctgcagctgc agttctggtt atctggtgtt gctgtgccta tggcacaggc ctgcagctga agttctgatt


ggatccctag cctgggaact tccatatgcc acaggtgtgg ccctaaaaa. ggatccctag cctgggaact tccatatgcc acaggtgtgg ccctgaaaa.

300 ~. 399 pgplA .................... :.: ..... qq|catttaatca aaattttaaa agtgaacaet tactggaaaa actttctaga gttacttgtt taacatttcc pgplB taataaataa aa~aattt~t taaaatatgglaatttagtca aaatttaaaa agtgaacact tactggaaaa ac~atgtaga gttacaagtt taac.gttcc

.aataataat caaattet~t aaaaatatgg|catttaatca aaatttgaaa agtcaacact tacgggaaaa tccatgtaca ataacttgtt taacatttct pgplC .... aaaaat aaagattt~t tcaaatatggJcatttaataa taacattttc aaattttaaa agtgaacac. .actggaaaa actaggtaca gttacttgtt pgplD

400 499 pgplA tgctaccatt gaagatcatt ccaccaagtt cagattcttc agagacttta taattgaagg aaaagaaata aatatcatca agtggagtaa cataatggct pgplB tgctaccact gaagatcatt ccatcaagct gagaattctc agagacttta caatcgaagg aacggaaaga aatatcatca aatggaataa aataatggct pgplC tgctaccact gaag.tcatt ccaccaagtt cagaatgaag agagagttga aattaaactg agtttcattt actaaagatc aaattaaact gaatttcatt pgplD tgctaccact gaagatcatt ccaccaaatt tagaatcttt atagaattt ...... gaagg aacaaaaaga agtatcacta aatggaataa gataacggct

500 599 pgplA ttaattgcat tataaaatta atagaataat tcaaagtaga tttgttaata aagtttataa tttttattta tattttctta ttcacactgt aactgactac pgplB ttaattgcat ta~aaaattt acagaataaa gaagat .... tttgttaata aagtgtataa ttttaattta tattttcttt ttcagactgt aaaggattac pgplC cactgaagat cattccacca agttttctga agagac .... agtttgaatt aaagaaacaa acaaaaaaaa ttaaatggaa taaaggcttt aattgcacta pgplD ttaattgcat tataaaattt atagaattaa aa.gtg .... atttttttaa aggtgtaaat ttttatttat attttctttt ttcagactgt aactgattac

600 699 pgplA ttgctaaaag attaaaagta .... gcaaaa agtactgaat gtttgcataa agtacccata ataaaactaa acttttatat gactcgagtc atcttgtcta pgplB cttgttaaaa tattgtagaa gtaggaaaaa acgactgaat gcttgcataa agtgcccata ataaaactaa actttcatat gaattgagtc atcttgtata pgp]C taaaatttat agaattc ~ • pgplD cttc.taaaa gattatagaa gta.gcaaaa agtactgaat gtttgaataa agtgcccg~ata#aactaa acttttatat gactcaagtc atcttgtcta

Fig. 2 (continued).

all four genes, it is likely that this mutation occurred before the duplication of the Class I genes.
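The effect of the stop codon change on the predicted protein can be checked with a short translation sketch. This is an illustration under stated assumptions: the pig sequence is the last 33 nucleotides of the pgp1A coding region as read from Fig. 2B and the start of Fig. 2D, and the "ancestral-like" sequence is a hypothetical reconstruction obtained simply by putting TGATGA back in place of TCATCA; the function and variable names are not from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_to_stop(dna: str) -> str:
    """Translate in frame 1 and stop at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# End of the pig pgp1A coding region, in frame (read from Fig. 2B and 2D)
pig_pgp1a = "GGAGCAAAGCGCTCATCAACTGTGACCATGTGA"
# Hypothetical ancestral-like sequence with the TGATGA stop codons restored
ancestral_like = pig_pgp1a.replace("TCATCA", "TGATGA")

pig_tail = translate_to_stop(pig_pgp1a)
ancestral_tail = translate_to_stop(ancestral_like)
print(pig_tail, ancestral_tail, len(pig_tail) - len(ancestral_tail))
```

With the stop codons restored, translation ends after the Lys-Arg pair; with TCATCA, it reads through six further codons before reaching the next in-frame TGA, in agreement with the six-residue extension described in the text.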

Thirdly, the 3' UTRs from pgp1B, pgp1C and pgp1D all show a retroposed insertion sequence 38 bp downstream of the stop codon (Fig. 2D). The gene pgp1A does not have this insertion. A BLAST search of GenBank showed that the insertion is a SINE element of the pig PRE-1 family. This arginine tRNA-derived element is very common in the swine genome, where it is estimated there are 10^5-10^6 copies, or about one element every 12 kb of genomic sequence [22]. In general the element is flanked by direct repeats, 7-21 bp in length, derived from the insertion site. In the case of the pig SINEs, the insertion site duplication is 11 bp in length and derived from the 3' UTR of the pgp genes. The three pig SINE elements occur at identical positions in their respective genes. SINE elements are not known to be excised from genes; therefore the insertion of this element probably occurred at an intermediate evolutionary stage when there were two Class I pgp genes in the pig genome (pgp1A and an ancestor of pgp1B/pgp1C/pgp1D). The insertion of the element into one gene was likely followed by two gene duplication events.

[Fig. 3 image: schematic of the pgp1A, pgp1B, pgp1C and pgp1D (Class I) and pgp3 (Class III) fragments, with intron, exon, 3'UTR and SINE regions marked.]

Fig. 3. Cartoon of the organisation of the pig pgp genomic fragments. A conserved intron, an exon, 3' untranslated region and SINE element are demonstrated. Regions of strong sequence similarity among groups of genes are indicated by common shading.

In contrast to the other pig Pgp's, the Class III gene, pgp3, appears not to have any great differences from the Class III genes from other species. It does not share the conserved intron sequence, stop codon mutation or SINE element insertion with the Class I pgp genes. Sequence similarity (Table 1) and these physical characteristics of the pig pgp's have been used to propose an evolutionary scenario for the formation of the family (Fig. 4).
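The way shared structural features constrain the order of events can be sketched as a small compatibility check. This is a simplified illustration, not the method used to draw Fig. 4: it assumes that each feature arose once and was never lost, so the sets of genes carrying the features must be nested or disjoint, and it omits the conserved intron because the text attributes that feature to gene conversion rather than descent. The function name is hypothetical.

```python
def order_events(shared: dict) -> list:
    """Order derived features from most to least widely shared.

    Assumes each feature arose once and was never lost, so carrier sets must
    be nested or disjoint; for nested sets, the wider set is the older event.
    Raises ValueError if two carrier sets partially overlap.
    """
    items = sorted(shared.items(), key=lambda kv: len(kv[1]), reverse=True)
    for idx, (name_a, set_a) in enumerate(items):
        for name_b, set_b in items[idx + 1:]:
            if not (set_b <= set_a or set_a.isdisjoint(set_b)):
                raise ValueError(f"{name_a!r} and {name_b!r} conflict")
    return [name for name, _ in items]


# Features and their carriers as described in the text
shared_features = {
    "SINE insertion": {"pgp1B", "pgp1C", "pgp1D"},
    "stop codon mutation": {"pgp1A", "pgp1B", "pgp1C", "pgp1D"},
}

event_order = order_events(shared_features)
print(event_order)
```

Because the SINE carriers form a subset of the genes with the stop codon mutation, the two features are compatible with a single tree in which the stop codon change precedes the SINE insertion.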

3.3. Expression of the pig pgp genes

PCR primers surrounding the SINE element were designed which would recognise the pig Class I isoforms, and distinguish between expression of the typical Class I gene pgp1A and the atypical pgp1B/pgp1C/pgp1D by a size polymorphism. PCR primers specifically recognizing the Class III pgp3 were also designed. RNA from pig liver and pig cultured capillary endothelial cells from brain was chosen because of the availability of samples, and the known importance of Pgp to the functional integrity of these tissues [8,13]. RT-PCR demonstrates expression of pgp1A but not the other Class I isoforms (Fig. 5, top). pgp1A appears not to be expressed in pig liver, although it has been observed in low amounts in human and rat liver [9,23]. Similar to other species, pgp3 is expressed in the pig liver, but not in brain endothelial cells (Fig. 5, bottom).

[Fig. 4 image: tree rooted at the mammalian ancestor (Class I and Class III).]

Fig. 4. A scenario for the evolution of the pig Pgp multigene family. The topology of this tree is determined from similarity among the gene fragments at the nucleotide level, in addition to gross structural features. From the root of the tree, the numbered nodes represent the following events: 1. The duplication of an ancestral Pgp gene to form the Class I and Class III Pgp isoforms observed today in pigs, rodents and humans. This event occurred earlier than the separation of these mammalian lineages, which is marked by a horizontal line near the base of the tree. Four pig pgp genes bear similarity to Class I genes from other species in addition to sharing a common mutation at the end of their coding region. 2. A SINE element was inserted into the 3' untranslated region of the ancestral gene of pgp1B/pgp1C/pgp1D but not into pgp1A. 3. Nucleotide identity in the 3' untranslated region suggests that pgp1B and pgp1D are more closely related to each other than to pgp1C.

[Fig. 5 image: two gel panels. Top panel band labels: pgp1B/pgp1C/pgp1D, pgp1A, β2M; lanes 1-5. Bottom panel band labels: pgp3, β2M; lanes 1-7.]

Fig. 5. Expression of the Class I pig pgp genes in brain endothelial cells and in liver. RT-PCR analysis using primers specific for pig Class I and Class III pgp's in pig cultured capillary endothelial cells and liver. The Class I primers were chosen to differentiate between the 389 bp product of pgp1A and the 658 bp product of pgp1B or pgp1D. Both sets of reactions were compared with the constitutive expression of β2-microglobulin. Top panel, Class I pgp expression: lane 1, pig liver; lane 2, pig brain capillary endothelial cells; lane 3, no template control; lane 4, pgp1A plasmid control; lane 5, pgp1B plasmid control. Bottom panel, Class III pgp expression: lanes 1, 2, 5 and 7 use primers for Class III pgp; lanes 3, 4 and 6 use primers for β2-microglobulin. Lanes 1 and 3, pig liver; lanes 2 and 4, pig brain capillary endothelial cells; lanes 5 and 6, no template controls; lane 7, pgp3 plasmid control.

4. Discussion

Genomic Southern blotting using the short, highly conserved PEX-172 probe has previously shown remarkable variation in the size of the Pgp multigene family in mammals [15]. The largest predicted Pgp multigene family occurs in the pig, where it has been suggested that there are five genes. By cloning genomic DNA for each of these pig Pgp genes, we demonstrate the uniqueness of each fragment as an independent Pgp gene. Four pig Pgp genes are derived from repeated duplications of Class I Pgp while the fifth is Class III Pgp.

Genomic Southern blotting suggests that the minimum number of Pgp genes in any mammalian species is two. This pattern can be seen in different mammalian orders, suggesting that it is the ancestral condition. Furthermore, by cloning, Class I and Class III genes have been identified in rodents, primates and artiodactyls (pig). The minimum of two Pgp genes in several mammals and the presence of Class I and Class III in different mammalian orders suggest that these two classes were present in the mammalian ancestor and will probably be found today in all mammals.

In both pig and rodent lineages, there are additional genes derived by duplication of a Class I gene. Are these additional genes related? Sequence comparisons of 3' UTR regions do not find any pig Pgp isoform significantly more related to either hamster Class I or Class II Pgp. In

Table 1
Similarity of the 3'UTR sequences of pig pgp family members with MDR and pgp genes from human and hamster

         HuMDR1  HuMDR3  chpgp1  chpgp2  chpgp3  pgp1B  pgp1C  pgp1D  pgp3
pgp1A    76.7    50.0    65.2    65.7    50.0    84.5   61.1   83.7   46.0
pgp1B    74.3    48.1    66.1    65.0    54.0    -      63.4   87.0   52.9
pgp1C    52.9    50.0    52.8    51.1    54.0    -      -      63.3   51.0
pgp1D    71.1    50.0    61.2    60.2    54.0    -      -      -      56.9
pgp3     49.0    76.5    54.1    47.1    84.0    -      -      -      -

3'UTR sequences from human (HuMDR1, HuMDR3) and Chinese hamster (chpgp1, chpgp2, chpgp3) were compared against the predicted 3'UTRs of the pig Pgp genes (pgp1A-D, pgp3). In the case of pgp1B-D, comparisons were made after the removal of the SINE element from the sequence. Identities at the nucleic acid level are expressed as percentages.


addition, we demonstrate a stop codon mutation present in all four pig Class I genes which probably occurred prior to their duplication. Since this mutation is not present in either the Class I or the Class II genes of rodents, it suggests that the pig and rodent Class I/Class II genes arose from duplication events which occurred independently. The duplications are related only in that they both started from an ancestral Class I Pgp gene. The pig pgp genes have been designated as Class I genes with reference to humans, a species where there is a single Class I gene. Each of the pig pgp's is as closely related to the human Class I gene as the others, hence the 'A,B,C,D' designation. There is no 'Class II' gene in pigs because the duplication which formed Class II Pgp was restricted (as far as we can tell) to the rodent lineage. These findings imply that the Pgp multigene family has been expanding independently in pigs, rodents, and perhaps also many other lines of mammalian evolution where there are more than two Pgp genes in a species. Although frequent and independent duplication of a gene is a non-parsimonious evolutionary scenario, it appears that this has occurred in the Pgp multigene family.
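The comparison with the hamster genes can be checked directly against the values in Table 1. The following is a minimal sketch, not the analysis used in this study: it copies the percent identities of the pig Class I 3'UTRs to the hamster genes from Table 1 and reports, for each pig gene, the difference between its identity to chpgp1 and to chpgp2. It assumes that chpgp1 and chpgp2 are the hamster Class I and Class II genes respectively; the function and variable names are hypothetical.

```python
# Percent identities of 3'UTRs copied from Table 1 (pig gene -> hamster gene).
# Assumption: chpgp1 = hamster Class I, chpgp2 = hamster Class II.
TABLE1_HAMSTER = {
    "pgp1A": {"chpgp1": 65.2, "chpgp2": 65.7},
    "pgp1B": {"chpgp1": 66.1, "chpgp2": 65.0},
    "pgp1C": {"chpgp1": 52.8, "chpgp2": 51.1},
    "pgp1D": {"chpgp1": 61.2, "chpgp2": 60.2},
}


def class1_vs_class2_gap(table):
    """Return {pig gene: identity to chpgp1 minus identity to chpgp2}.

    Values are percentage points, rounded to one decimal place.
    """
    return {
        gene: round(row["chpgp1"] - row["chpgp2"], 1)
        for gene, row in table.items()
    }


gaps = class1_vs_class2_gap(TABLE1_HAMSTER)
largest_gap = max(abs(gap) for gap in gaps.values())

for gene, gap in gaps.items():
    print(f"{gene}: chpgp1 - chpgp2 = {gap:+.1f} percentage points")
print(f"largest absolute gap: {largest_gap:.1f} percentage points")
```

On these values no pig Class I gene differs by more than 1.7 percentage points between the two hamster genes, which is in line with the statement that no pig isoform is significantly more related to either of them.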

The Pgp multigene family also varies widely in size in non-mammals. The winter flounder has two Pgp genes, Drosophila three, Caenorhabditis elegans four, Entamoeba histolytica six and Arabidopsis thaliana two [19,24-27]. To date none of the genes from any one of these species, in either expression or sequence, has been found to resemble any Pgp gene from other species more closely than it does the Pgp genes within its own species. One explanation is that Pgp family size and isoform function within a species may be malleable, changing to suit the needs of the organism. On a large evolutionary scale, the function of Pgp genes may not be conserved. It also suggests that the Pgp gene family is undergoing frequent and independent gene duplication in many lines of evolution since the number of Pgp's is not constant in different lineages. Depending on the environment of an organism, or its metabolism, possessing a greater or lesser number of Pgp genes may be selectively advantageous. On the other hand, variation in Pgp gene family size might reflect genomic structure, if for instance these highly related genes are physically linked, promoting the frequency of gene duplication by unequal crossing-over. However, while there is close linkage of the Pgp genes in humans and rodents, Pgp genes in C. elegans and Drosophila are scattered in different chromosomal locations [28,29,24,25].

Complete cDNA's of the Pgp multigene family have been cloned from humans (two isoforms) and rodents (three isoforms). Within mammals, Pgp isoforms are conserved across species, while each multigene family member is distinct from the others in its sequence, function and expression within a species. Expression of Class I Pgp has been previously demonstrated in the capillary endothelial cells of human and rodent brain [30,10,8]. Using a set of primers surrounding the SINE element in the 3'UTRs of pgp1A, pgp1B, pgp1C and pgp1D, we show that one of the pig Class I Pgp's, pgp1A, is expressed in pig brain cultured capillary endothelial cells. The brain endothelial cells used here have been previously shown to express a 170 kDa protein reactive with the Pgp monoclonal antibody C219 and verapamil-reversible rhodamine-123 efflux activity, suggestive of functional Pgp [31]. This is consistent with our finding of pig pgp in these cells. Expression of the Class III isoform of Pgp has been demonstrated in the liver of rodents and humans [6,32]. We show that Class III Pgp is also expressed in the pig liver. Thus there appears to be strong phylogenetic conservation of these two sites of expression in three independent mammalian orders.
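Because the Class I primers flank the SINE insertion site, the isoforms can be told apart by product size alone. The sketch below illustrates that logic using the two product sizes given in the Fig. 5 legend (389 bp for pgp1A, 658 bp for pgp1B or pgp1D). It is only an illustration: the function name and the 25 bp tolerance are hypothetical and are not part of the protocol used here.

```python
# Expected RT-PCR product sizes in bp, taken from the Fig. 5 legend.
EXPECTED_PRODUCTS = {
    389: "pgp1A",
    658: "pgp1B or pgp1D",
}


def call_product(band_bp, tolerance_bp=25):
    """Assign an observed band to the nearest expected product.

    Returns the isoform label, or None when no expected product lies
    within tolerance_bp of the observed size. The tolerance is an
    arbitrary illustrative value.
    """
    nearest = min(EXPECTED_PRODUCTS, key=lambda size: abs(size - band_bp))
    if abs(nearest - band_bp) <= tolerance_bp:
        return EXPECTED_PRODUCTS[nearest]
    return None


# Difference in length between the two expected products.
size_difference = 658 - 389

print(call_product(392))
print(call_product(650))
print(call_product(500))
print(size_difference)
```

The 269 bp difference between the two products is large enough to be resolved on a standard agarose gel, which is what allows a single primer pair to distinguish pgp1A from pgp1B or pgp1D.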

Although we did not find any evidence for expression of pgp1B, pgp1C, or pgp1D, it is possible that they are expressed in tissues not examined in this study. For instance, in rodents the Class I isoform but not the Class II isoform is expressed in brain endothelial cells. The Class II isoform is functional in other tissues. It is also possible that the genes are not expressed at all. Since the genomic fragments we have cloned for pgp1B, pgp1C and pgp1D contain an intron, it is not likely that these 'new' genes are processed pseudogenes. However, we have isolated only one exon from the 3' end of the gene and have not been able to examine sequences further upstream. We cannot rule out the possibility that the genomic fragments isolated here are partial duplications of the typical Class I gene (pgp1A), or mutationally inactivated full-length genes.

The functional consequences of having five recently duplicated, closely related Pgp genes in a genome could be several. Duplication of a gene will cause an immediate increase in the expression levels of the gene message, and perhaps protein product. If high expression levels are detrimental to the organism, the 'extra' genes need to be rapidly down-regulated. Perhaps not coincidentally, the insertion of a SINE element into the 3'UTRs in three of the genes may affect their expression levels, since the occurrence of SINEs in transcripts has been shown to affect the expression of genes in which they occur [33-35]. The most interesting fate of the pig Pgp genes would be if the new gene family members have not been silenced, but have been free to assume new functions and expression patterns. We do not yet know if the pig Pgp isoforms have acquired unique roles, but determination of the functions and expression of the 'new' pig Pgp's will help in our understanding of how duplicated genes evolve new functions.

Acknowledgements

S.C. is the recipient of a Steve Fonyo Studentship from the National Cancer Institute of Canada. This research was supported by a grant from the National Cancer Institute of


Canada. We thank Fang Zhang for critical reading of the manuscript.

References

[1] Childs, S. and Ling, V. (1994) in Important Advances in Oncology, 1994 (DeVita, V.T., Hellman, S. and Rosenberg, S.A., eds), pp. 21-36, J.B. Lippincott, Philadelphia, PA.

[2] Gerlach, J.H., Endicott, J.A., Juranka, P.F., Henderson, G., Sarangi, F., Deuchars, K.L. and Ling, V. (1986) Nature 324, 485-489.

[3] Chen, C., Chin, J.E., Ueda, K., Clark, D.P., Pastan, I., Gottesman, M.M. and Roninson, I.B. (1986) Cell 47, 381-389.

[4] Van der Bliek, A.M., Kooiman, P.M., Schneider, C., Borst, P. (1988) Gene 71, 401-411.

[5] Ng, W.F., Sarangi, F., Zastawny, R.L., Veinot-Drebot, L., Ling, V. (1989) Mol. Cell. Biol. 9, 1224-1232.

[6] Croop, J.M., Raymond, M., Haber, D., Devault, A., Arceci, R.J., Gros, P. and Houseman, D.E. (1989) Mol. Cell. Biol. 9, 1346-1350.

[7] Devault, A., Gros, P. (1990) Mol. Cell. Biol. 10, 1652-1663.

[8] Schinkel, A.H., Smit, J.J.M., Van Tellingen, O., Beijnen, J.H., Wagenaar, E., Van Deemter, L., Mol, C.A.A.M., Van der Valk, M.A., Robanus-Maandag, E.C., Te Riele, H.P.J., Berns, A.J.M. and Borst, P. (1994) Cell 77, 491-502.

[9] Thiebaut, F., Tsuro, T., Hamada, H., Gottesman, M.M., Pastan, I. and Willingham, M.C. (1987) Proc. Natl. Acad. Sci. USA 84, 7735-7738.

[10] Bradley, G., Georges, E. and Ling, V. (1990) J. Cell. Physiol. 145, 398-408.

[11] Arceci, R.J., Croop, J.M., Horowitz, S.B. and Houseman, D. (1988) Proc. Natl. Acad. Sci. USA 85, 4350-4354.

[12] Lee, C.H., Bradley, G. and Ling, V. (1994) Cold Spring Harbour Symposia on Quantitative Biology, LIX, 607-615.

[13] Smit, J.J.M., Schinkel, A.H., Oude Elferink, R.P.J., Groen, A.K., Wagenaar, E., Van Deemter, L., Mol, C.A.A.M., Ottenhoff, R., Van der Lugt, N.M.T., Van Roon, M.A., Van der Valk, M.A., Offerhaus, G.J.A., Berns, A.J.M. and Borst, P. (1993) Cell 75, 451-462.

[14] Ruetz, S. and Gros, P. (1994) Cell 71, 1071-1081.

[15] Ling, V., Bradley, G., Veinot, L.M., Hiruki, T. and Georges, E. (1992) in Bristol-Myers Squibb Cancer Symposia (Tsuro, T. and Ogawa, M., eds.) 1992, pp. 117-128, Academic Press, New York.

[16] Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

[17] Tonsch, U. and Bauer, H.-C. (1989) Microvasc. Res. 37, 148-161.

[18] Deuchars, K.L., Duthie, M. and Ling, V. (1992) Biochim. Biophys. Acta 1130, 157-165.

[19] Chan, K.M., Davies, P.L., Childs, S., Veinot, L. and Ling, V. (1992) Biochim. Biophys. Acta 1171, 65-72.

[20] Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) J. Mol. Biol. 215, 403-410.

[21] Liskay, R.M., Letsou, A. and Stachelek, J.L. (1987) Genetics 115, 161-167.

[22] Ellegren, H. (1993) Mammalian Genome 4, 429-434.

[23] Lee, C.H., Bradley, B., Zhang, J.-T. and Ling, V. (1993) J. Cell. Physiol. 157, 392-402.

[24] Gerrard, B., Stewart, C. and Dean, M. (1993) Genomics 17, 83-88.

[25] Lincke, C.R., The, I., Van Groenigen, M. and Borst, P. (1992) J. Mol. Biol. 228, 701-711.

[26] Descoteaux, S., Ayala, P., Orozco, E. and Samuelson, J. (1992) Mol. Biochem. Parasitol. 54, 201-212.

[27] Dudler, R. and Hertig, C. (1992) J. Biol. Chem. 267, 5882-5888.

[28] Chin, J.E., Soffir, R., Noonan, K.E., Choi, K. and Roninson, I.B. (1989) Mol. Cell. Biol. 9, 3808-3820.

[29] Raymond, M., Rose, E., Housman, D.E. and Gros, P. (1990) Mol. Cell. Biol. 10, 1642-1651.

[30] Cordon-Cardo, C., O'Brian, J.P., Casals, D., Rittman-Grauer, L., Biedler, J.L., Melamed, M.R. and Bertino, J.R. (1989) Proc. Natl. Acad. Sci. USA 86, 695-698.

[31] Hegmann, E.J., Bauer, H.C. and Kerbel, R.S. (1992) Cancer Res. 52, 6969-6975.

[32] Smit, J.J.M., Schinkel, A.H., Mol, C.A.A.M., Majoor, D., Mooi, W.J., Jongsma, P.M., Lincke, C.R. and Borst, P. (1994) Lab. Invest. 71, 638-649.

[33] Adeniyi-Jones, S. and Zasloff, M. (1985) Nature 317, 81-84.

[34] McKinnon, R.B., Shinnick, T.M. and Sutcliffe, J.G. (1986) Proc. Natl. Acad. Sci. USA 83, 3751-3755.

[35] Vidal, F., Mougneau, E., Glaichenhaus, N., Baigot, P., Darmon, M. and Cuzin, F. (1993) Proc. Natl. Acad. Sci. USA 90, 208-212.