
THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 264, No. 12, Issue of April 25, pp. 7017-7024, 1989 © 1989 by The American Society for Biochemistry and Molecular Biology, Inc. Printed in U.S.A.

Partial Primary Structure of the 48- and 90-Kilodalton Core Proteins of Cell Surface-associated Heparan Sulfate Proteoglycans of Lung Fibroblasts PREDICTION OF AN INTEGRAL MEMBRANE DOMAIN AND EVIDENCE FOR MULTIPLE DISTINCT CORE PROTEINS AT THE CELL SURFACE OF HUMAN LUNG FIBROBLASTS*

(Received for publication, August 10, 1988)

Peter Marynen‡, Ji Zhang, Jean-Jacques Cassiman, Herman Van den Berghe, and Guido David§

From the Center for Human Genetics, University of Leuven, Campus Gasthuisberg O & N, Herestraat, B-3000 Leuven, Belgium

Heparitinase treatment of cell surface-associated heparan sulfate proteoglycans (HSPG) of human lung fibroblasts reveals core proteins with apparent Mr values of 125,000, 90,000, 64,000, 48,000, and 35,000 (Lories, V., De Boeck, H., David, G., Cassiman, J.-J., and Van den Berghe, H. (1987) J. Biol. Chem. 262, 854-859). The 90- and 48-kDa core proteins share the epitope of the monoclonal antibody 6G12, which was used to screen a human lung fibroblast expression cDNA library.

Rescreening of the libraries yielded clone 48K5 with an insert of 3439 base pairs. Polyclonal antibodies were raised in rabbits against a fragment of the protein encoded by the 48K5 cDNA different from the part carrying the 6G12 epitope. These antibodies specifically recognize the 90- and 48-kDa core proteins on Western blots of total cellular extracts of human lung fibroblast HSPG. The specific reactivity of the polyclonal antiserum confirms the identity of the 48K5 clone and further distinguishes the 48- and the 90-kDa core proteins, which do share the 6G12-defined epitope and at least one additional antigenic determinant with the 48K5 cDNA-encoded protein, from the 125-, 64-, and 35-kDa core proteins of cell surface HSPG of human lung fibroblasts, which do not react with either antibody preparation. The protein encoded by the 48K5 clone contains a stop-transfer sequence indicative of an integral membrane protein and three potential glycosaminoglycan attachment sites.

The 48K5 clone detects two major poly(A)+ RNA species in human lung fibroblasts, presumably generated by the use of alternative polyadenylation signals. The 48K5 gene was mapped to chromosome 8q23 by in situ hybridization and hybridization to DNA of somatic cell hybrids.

* This investigation was supported in part by Grants 3.0066.87 and 3.0088.88 of the Fonds voor Geneeskundig Wetenschappelijk Onderzoek, Belgium, by United States Public Health Service Research Grant HI-31750 (to G. D.), and by the Inter-university Network for Fundamental Research sponsored by the Belgian Government (1987-1991). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J04621.

‡ Bevoegdverklaard Navorser of the Nationaal Fonds voor Wetenschappelijk Onderzoek, Belgium.

§ Onderzoeksleider of the Nationaal Fonds voor Wetenschappelijk Onderzoek, Belgium. To whom correspondence should be addressed.

Cell surface-associated proteoglycans appear to be involved in a variety of biological processes. These pertain to the mechanisms of cell-matrix (Koda et al., 1985) and cell-cell (Cole et al., 1985) adhesion, to the control of cell growth (Fritze et al., 1985; Ratner et al., 1985), the activation of proteinase inhibitors (Low et al., 1981), the regulation of receptor function (Fransson et al., 1984), antigen presentation (Sant et al., 1985), and even the generation of biological rhythms (Jackson et al., 1986).

In the execution of these processes, different modes of association of the proteoglycans with the cell surface seem to exist. Some forms are peripherally membrane-bound to receptors for the glycosaminoglycan chains (Kjellén et al., 1980) or for structures on the core protein (Glössl et al., 1983). In the latter instance, the membrane association may only be very transient, representing proteoglycan destined for internalization and further processing (Bienkowski and Conrad, 1984; Ishihara et al., 1986). Other cell surface proteoglycans, however, seem endowed with properties that allow for a more direct insertion into the membrane; some of these may be linked to the plasma membrane through an inositol phospholipid anchor (Ishihara et al., 1987), but others are presumed to possess hydrophobic core protein segments that span the membrane (Rapraeger and Bernfield, 1985). Indirect evidence for the existence of integral membrane forms also stems from the tentative identification of some core proteins as alternatively processed membrane glycoproteins: the class II (Ia) histocompatibility antigen-associated invariant chain in lymphocytes (Giacoletto et al., 1986) and thrombomodulin in endothelial cells (Jackman et al., 1986).

For a large part, the functional properties of these proteoglycans seem to depend on the charge and the structure of the glycosaminoglycan moieties. The latter comprise mostly heparan sulfate but also chondroitin sulfate/dermatan sulfate chains (see Fransson, 1987). These seem to mediate or, to the contrary, interfere with the binding of a number of different ligands: linking cells to collagen (Koda and Bernfield, 1984) and fibronectin (Laterra et al., 1983) in the matrix, catalytically inactivating proteinases of the coagulation cascade (Marcum et al., 1986), or shielding receptor structure in confluent cells (Cöster et al., 1986). The transformation-associated changes of these complex carbohydrates (Höök et al., 1984; David and Van den Berghe, 1983) may therefore have profound implications for proteoglycan function and cellular behavior. Besides providing a means to concentrate and anchor glycosaminoglycan chains at the cell surface, the core proteins may also have a functional role in their own right, e.g. as cofactor for the activation of protein C (Bourin et al., 1986) or by providing a transmembrane link between


the cytoskeleton and the extracellular matrix (Rapraeger and Bernfield, 1982; Woods et al., 1985). The evidence, however, is indirect, and the information on core protein properties is too limited for an evaluation of these possibilities.

Accumulating data suggest, perhaps not surprisingly, that this functional versatility of the cell surface proteoglycans is accompanied by a pronounced structural diversity (see Fransson, 1987). Moreover, we have obtained evidence that structural heterogeneity occurs among membrane proteoglycans of a single class within a single cell type. Indeed, unreduced preparations of hydrophobic heparan sulfate proteoglycan of confluent human lung fibroblasts harbor multiple core protein forms with apparent Mr values of 125,000, 90,000, 68,000, 48,000, and 35,000 after heparitinase digestion (Lories et al., 1987). The origin of this heterogeneity is not clear, but it cannot simply be accounted for by the heparitinase treatment itself (Lories et al., 1987, 1989). Comparative peptide maps and the reactivity of a panel of monoclonal antibodies imply that more than one single proteoglycan species must exist (Lories et al., 1989). However, the two distinct epitopes which are recognized by the monoclonal antibodies F58-6G12 and F58-10H4 are common to both the 90- and the 48-kDa core proteins, suggesting that the latter may be related. The present report describes the molecular cloning of a partial cDNA sequence, derived from human lung fibroblasts, which seems to encode a peptide that is common to both these 48/90-kDa core proteins. The sequence suggests the existence of a carboxyl-terminal intracytoplasmic domain, a stop-transfer sequence with a hydrophobic transmembrane segment, and an extracellular domain with several potential glycosaminoglycan chain acceptor sequences. The recombinant core protein cross-reacts immunologically with the 48/90-kDa core proteins but not with the other forms, which further supports the existence of multiple cell surface heparan sulfate proteoglycan species in these lung fibroblasts.

MATERIALS AND METHODS

Isolation and Analysis of Poly(A)+ RNA-The isolation of total RNA from human lung fibroblasts was modified from Maniatis et al. (1982). Typically, 2 X 10' fibroblasts were solubilized with 25 ml of 5 M guanidine isothiocyanate, 5 mM sodium citrate, 0.2 M 2-mercaptoethanol, 0.5% N-lauroylsarcosine. DNA was sheared by repeated passage through a 20-gauge needle, 10 g of CsCl was added, and 7-ml fractions of the extracts were layered on 5.5 ml of a cesium trifluoroacetate solution (Pharmacia LKB Biotechnology Inc.) with a density of 1.51 and spun at 30,000 rpm for 24 h in a SW 41 rotor (Beckman Instruments). The RNA pellet was solubilized in 10 mM Tris, 1 mM EDTA, 5% phenol, 5% N-lauroylsarcosine, phenol-extracted once, and ethanol-precipitated twice. Poly(A)+ RNA was isolated by two rounds of chromatography on oligo(dT)-cellulose (Bethesda Research Laboratories) according to Aviv and Leder (1972) and stored as an ethanol precipitate until further use. The absence of degradation of the RNA in the preparations was checked by Northern blotting. For Northern analysis, poly(A)+ RNA was denatured and separated on 1.2% agarose gels containing 0.2 M 3-(N-morpholino)propanesulfonic acid, 0.05 M sodium acetate, pH 7.0, 10 mM EDTA, and 6% formaldehyde as described in Maniatis et al. (1982). RNA was transferred to Nytran membranes (Schleicher and Schuell) following the specifications of the manufacturers and cross-linked to its support by exposure to UV light (Church and Gilbert, 1984).

Preparation and Screening of Libraries-The initial (48K1) cDNA clone was isolated from a human lung fibroblast cDNA library cloned into the expression vector λgt11 (Young and Davis, 1983) obtained from commercial sources (Clontech, Palo Alto, CA). The phages were grown in Y1090 r- host cells, induced with isopropyl β-D-thiogalactopyranoside, and selected by blotting and immunostaining of the nitrocellulose filters with conditioned culture medium of the F58-6G12 hybridoma and alkaline phosphatase-conjugated affinity-purified goat anti-mouse antibodies (Promega Biotec). Additional clones were obtained by screening the fibroblast cDNA library and a commercially available placenta λgt11 library (Clontech) using random primer-labeled (Feinberg and Vogelstein, 1983) cDNA inserts and DNA hybridization. The cDNA for the λZAP library was prepared from 2 μg of lung fibroblast poly(A)+ RNA with a cDNA synthesis kit (Bethesda Research Laboratories). After methylation of internal EcoRI sites, ligation of EcoRI linkers, and digestion with EcoRI, the cDNA was separated by gel permeation chromatography on a Bio-Gel A-50m (Bio-Rad) column cast in a siliconized glass 1-ml pipette (Huynh et al., 1985). The fractions containing the largest cDNAs were pooled (25% of the total cDNA), ethanol-precipitated, and ligated into dephosphorylated λZAP arms (Promega Biotec). The ligated DNA was packaged, plated onto BB4 host cells, and screened without further amplification.

Analysis of the cDNA Clones-The inserts of λgt11 clones were subcloned into pGEM-3Z (Promega Biotec). Bluescript SK- plasmids were obtained from the λZAP clones after superinfection with R408 helper phage as described by the manufacturer.

Sequences of the inserts were obtained using the dideoxy chain termination method (Sanger et al., 1977) by direct sequencing of supercoiled plasmid (Chen and Seeburg, 1985) with SP6 and T7 primers for pGEM-3Z and SK and KS primers (Genofit, Geneva, Switzerland) for Bluescript SK- plasmids. To obtain the full sequence of both strands, exonuclease III/mung bean nuclease deletion subclones were prepared as described by Henikoff (1984). To alleviate the severe compression problems resulting from the GC-rich regions at the 5' end of clones 48K5 and 48K3, sequencing of these fragments was done using 7-deaza-GTP (Mizusawa et al., 1986) and Klenow enzyme and, after subcloning into M13, with dITP and Sequenase (United States Biochemical Co.).

Gene Localization-The selection and karyotyping of human-mouse somatic hybrids has been described elsewhere (Zhang et al., 1988). Somatic hybrids were rekaryotyped at the moment of harvest for DNA purification. The isolation, digestion, and blotting of DNA were according to established procedures (Maniatis et al., 1982). Human DNA, digested with EcoRI, yields two fragments of 23 and 3.2 kilobase pairs when hybridized with the 48K3 probe. Under similar hybridization conditions, murine DNA digested with EcoRI reveals an easily distinguishable pattern of DNA fragments of, respectively, 12.5, 5.1, 4.4, and 3.1 kilobase pairs.

In situ hybridization of metaphase chromosome spreads of human white blood cells with [³H]dCTP-labeled 48K3 was as described by Harper and Saunders (1981).

Construction of Expression Plasmids and Isolation of the Hybrid Proteins-The coding EcoRI-PstI fragment of cDNA clone 48K1 (the equivalent of base 954-1385 of clone 48K5) and the BamHI-PstI fragment of clone 48K3 (base 602-1385 of 48K5) were ligated into the plasmid expression vector pEX2 (Genofit) digested with the corresponding restriction enzymes. Transformed POP 2136 cells were grown overnight at 34 °C on ampicillin plates. Colonies containing recombinant plasmids were detected by screening temperature-induced replica cultures for the production of β-galactosidase-core protein fusion products by immunostaining with Mab¹ F58-6G12 and by DNA hybridization with oligolabeled cDNA insert.

Fusion proteins for immunization and for use in affinity chromatography were isolated from exponentially growing cultures in LB medium, started from single colonies of transformed POP 2136 cells and induced by shifting the culture temperature for 2 h from 34 to 42 °C. Purification of the fusion proteins encoded by both recombinant plasmids was facilitated by the poor solubility of these components. After pelleting (10,000 × g(av), 15 min at 4 °C) and rinsing in 0.15 M NaCl, 50 mM Tris-HCl, pH 8.0, by resuspension and centrifugation, the cells were sonicated for 2 min in ice-cold rinse buffer containing 1 mg/ml lysozyme. The sonicated pellet was diluted in 10 volumes of ice-cold 0.5% Triton X-100, 6 M urea, 50 mM Tris-HCl, pH 8.0, supplemented with 1 μg/ml pepstatin A, 25 μg/ml leupeptin, 5 mM EDTA, 25 mM 6-aminohexanoic acid, 1 mM phenylmethylsulfonyl fluoride and extracted for 10 min at 4 °C. After centrifugation (25,000 × g(av), 30 min, 4 °C), the residual pellet was further extracted for 48 h at 4 °C in 4 M guanidine hydrochloride, 10 mM Tris-HCl, pH 8.0. This 4 M guanidine hydrochloride extract was cleared by centrifugation (40,000 × g(av), 30 min, 4 °C) and was shown by Western blotting and immunostaining with Mab F58-6G12 to contain large amounts of fusion proteins and, in comparison with the urea extract, low amounts of bacterial proteins. The fusion proteins were further purified from these 4 M guanidine hydrochloride extracts by

¹ The abbreviations used are: Mab(s), monoclonal antibody(ies); bp, base pair(s); HSPG, heparan sulfate proteoglycan.


gel filtration chromatography on Sepharose CL-4B or -6B in 4 M guanidine hydrochloride and by ion exchange chromatography on Mono Q (Pharmacia) in 0.5% Triton X-100, 6 M urea, 50 mM Tris-HCl, pH 8.0. Elution was monitored by immunodot spotting using Hybond-N membranes (Amersham Corp.), Mab F58-6G12 (20 μg/ml), and peroxidase-conjugated rabbit anti-mouse immunoglobulins (diluted 1:50, Dakopatts).

Preparation of Polyclonal Antirecombinant Core Protein Antibodies-Purified pEX2-48K3-encoded fusion protein was suspended in phosphate-buffered saline, mixed with an equal volume of Freund's complete adjuvant, and injected subcutaneously into rabbits. After three additional injections with fusion protein-Freund's incomplete adjuvant mixtures, immune sera were obtained from ear bleedings. Pooled sera were incubated overnight with purified pEX2-48K3 fusion protein coupled to CNBr-activated Sepharose. After rinsing the beads, bound immunoglobulins were eluted with 0.15 M NaCl, 0.1 M glycine HCl, pH 2.0, and collected in tubes containing 0.1 ml of 2 M Tris-HCl, pH 8.0. To remove antibacterial specificities, the affinity-purified immunoglobulins were mixed with 0.2 volume of 0.5% casein in phosphate-buffered saline and incubated overnight with CNBr-activated Sepharose beads substituted with extracts from POP cells transformed with nonrecombinant pEX2 plasmids. Nonbound immunoglobulins were further absorbed on pEX2-48K1 fusion protein coupled to CNBr-activated Sepharose to obtain an antiserum specific for (core) protein epitopes other than the F58-6G12 epitope used for the initial selection.

RESULTS

Molecular Cloning of a Presumptive 48/90-kDa Core Protein cDNA-A human lung fibroblast cDNA library, cloned into the expression vector λgt11, was screened with the monoclonal antibody 6G12. One positive clone, 48K1, was plaque-purified, and the 1317-bp insert was subcloned into pGEM-3Z and sequenced (Figs. 1 and 2). The orientation of the insert in the original λgt11-48K1 clone was determined from the restriction map, and an open reading frame of 265 base pairs, continuous with the open reading frame of the λgt11 lacZ


gene and therefore coding for the 6G12 epitope, was found. A probe generated from the 5' end of 48K1 was used to screen both the original fibroblast cDNA library and a placental cDNA library. One clone, 48K3, with a 3373-bp insert, did contain the complete sequence of clone 48K1 and other 48K clones selected as determined by restriction mapping and Southern hybridizations. Sequencing did not reveal an initiator AUG codon at the 5' end of clone 48K3, and the open reading frame did code for a protein with a molecular mass of 41 kDa. Therefore, a size-selected cDNA library was constructed from poly(A)+ RNA isolated from human lung fibroblasts. The cDNA was size-selected by gel permeation chromatography and cloned into λZAP vector using EcoRI linkers. About 10⁶ independent plaques of the unamplified library were screened with a 500-bp probe isolated from the 5' end of clone 48K3. Of the 13 clones isolated, one clone, 48K5, with a 3439-bp insert, contained additional information at the 5' end. The sequence of this clone, obtained by sequencing both strands, is shown in Fig. 2. 48K5 has a poly(A) tail preceded by the AATAAA polyadenylation signal 20 bp upstream. Two GC-rich regions were found extending from base 50 to 120 and base 260 to 560. No particular homologies were detected by searching GenBank release 54.
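As an illustration of how GC-rich stretches such as those mentioned above can be quantified, the following minimal Python sketch computes the GC fraction of the first 90 nucleotides of clone 48K5 (the first sequence row of Fig. 2) and of sliding windows along it. The function names, the 30-nucleotide window, and the 10-nucleotide step are illustrative assumptions and are not taken from the paper.

```python
# First 90 nucleotides of clone 48K5, copied from the first sequence row of Fig. 2.
ROW1 = (
    "GGC AGG AGG GAG GGA GCC AGA GGA AAA GAA GAG GAG GAG AAG GAG "
    "GAG GAC CCG GGG AGG GAG GCG CGG CGC GGG AGG AGG AGG GGC GCA"
).replace(" ", "")


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq: str, window: int = 30, step: int = 10):
    """Return (1-based start, GC fraction) for every full window.

    The window and step sizes are arbitrary choices for this sketch.
    """
    return [
        (start + 1, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print(f"{len(ROW1)} nt, overall GC fraction {gc_fraction(ROW1):.2f}")
    for start, frac in windowed_gc(ROW1):
        print(f"window starting at base {start}: GC fraction {frac:.2f}")
```

Applied to the full 3439-bp sequence, the same windowed scan would be one way to delimit regions of elevated GC content; the threshold used to call a region GC-rich would remain a matter of choice.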

Characteristics of the 48K5 Gene Product-Clone 48K5 also lacks an initiator AUG codon. Translation of the 1191-bp open reading frame at the 5' end (Fig. 2) predicts a protein of 43 kDa with properties compatible with its tentative identification as part of the 48/90-kDa core proteins.
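The relation between the length of the open reading frame and the predicted mass can be checked with simple arithmetic. The sketch below is only a rough estimate: it assumes an average residue mass of 110 Da, a common rule of thumb that is not stated in the paper, whereas an exact value would require summing the masses of the individual residues.

```python
ORF_BP = 1191            # open reading frame length reported for clone 48K5
AVG_RESIDUE_DA = 110.0   # assumed rule-of-thumb average residue mass (not from the paper)


def orf_to_residues(orf_bp: int) -> int:
    """Number of codons (residues) encoded by an open reading frame."""
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame must contain whole codons")
    return orf_bp // 3


def approx_mass_kda(n_residues: int, avg_residue_da: float = AVG_RESIDUE_DA) -> float:
    """Approximate polypeptide mass in kDa from the residue count."""
    return n_residues * avg_residue_da / 1000.0


if __name__ == "__main__":
    residues = orf_to_residues(ORF_BP)
    print(residues, "residues, about", round(approx_mass_kda(residues), 1), "kDa")
```

Under this assumption the 1191-bp frame corresponds to 397 residues and a mass close to the 43 kDa quoted in the text.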

Indeed, near the carboxyl terminus there is one long stretch of 24 hydrophobic amino acids followed by a short stretch rich in basic residues (4 out of 6) (see Fig. 2). This domain has the structure of a stop-transfer signal (Sabatini et al., 1982) and could thus constitute the membrane-spanning do-

FIG. 1. Analysis of the 48K clones. The position of the 48K clones is shown relative to the 48K5 sequence (scale, 0 to 3000 bp). The positions of the TAA stop codon and of the AATAAA sequences, possibly used as alternative polyadenylation signals, are marked. The recognition sites for selected restriction enzymes are shown. (A)n indicates the presence of a poly(A) sequence. The thick lines delimit the GC-rich regions in the 48K sequence. The cross-hatched boxes show the 48K1 and 48K3 fragments used for the production of fusion proteins. The open boxes delimit the probes used for Northern analysis.


GGC AGG AGG GAG GGA GCC AGA GGA AAA GAA GAG GAG GAG AAG GAG GAG GAC CCG GGG AGG GAG GCG CGG CGC GGG AGG AGG AGG GGC GCA G R R E G A R G K E E E E K E E D P G R E A R R G R R R G A

GCC GCG GAG CCA GTG GCC CCG CTT GGA CGC GCT GCT CTC CAG ATA CCC CCG GAG CTC CAG CCG CGC GGA TCG CGC GCT CCC GCC GCT CTG A A E P V A P L G R A A L Q I P P E L Q P R G S R A P A A L

CCC CTA AAC TTC TGC CGT AGC TCC CTT TCA AGC CAG CGA ATT TAT TCC TTA AAA CCA GAA ACT GAA CCT CGG CAC GGG AAA GGA GTC CGC P L N F C R S S L S S Q R I Y S L K P E T E P R H G K G V R

GGA GGA GCA AAA CCA CAG CAG AGC AAG AAG AGC TTC AGA GAG CAG CCT TCC CGG AGC ACC AAC TCC GTG TCG GGA GTG CAG AAA CCA ACA G G A K P Q Q S K K S F R E Q P S R S T N S V S G V Q K P T

AGT GAG AGG GCG CCG CGT TCC CGG GGC GCA GCT GCG GGC GGC GGG AGC AGG CGC AGG AGG AGG AAG CGA GCG CCC CCG AGC CCC GAG CCC S E R A P R S R G A A A G G G S R R R R R K R A P P S P E P

GAG TCC CCG AGC CTG AGC CGC AAT CGC TGC GGT ACT CTG CTC CGG ATT CGT GTG CGC GGG CTC GCC GAG CGC TGG GCA GGA GGC TTC GTT E S P S L S R N R C G T L L R I R V R G L A E R W A G G F V

TTG CCC TGG TTG CAA GCA GCG GCT GGG AGC AGC CGG TCC CTG GGG AAT ATG CGG CGC GCG TGG ATC CTG CTC ACC TTG GGC TTG GTG GCC L P W L Q A A A G S S R S L G N M R R A W I L L T L G L V A

TGC GTG TCG GCG GAG TCG AGA GCA GAG CTG ACA TCT GAT AAA GAC ATG TAC CTT GAC AAC AGC TCC ATT GAA GAA GCT TCA GGA GTG TAT C V S A E S R A E L T S D K D M Y L D N S S I E E A S G V Y

CCT ATT GAT GAC GAT GAC TAC GCT TCT GCG TCT GGC TCG GGA GCT GAT GAG GAT GTA GAG AGT CCA GAG CTG ACA ACA ACT CGA CCA CTT P I D D D D Y A S A S G S G A D E D V E S P E L T T T R P L

CCA AAG ATA CTG TTG ACT AGT GCT GCT CCA AAA GTG GAA ACC ACG ACG CTG AAT ATA CAG AAC AAG ATA CCT GCT CAG ACA AAG TCA CCT P K I L L T S A A P K V E T T T L N I Q N K I P A Q T K S P

GAA GAA ACT GAT AAA GAG AAA GTT CAC CTC TCT GAC TCA GAA AGG AAA ATG GAC CCA GCC GAA GAG GAT ACA AAT GTG TAT ACT GAG AAA E E T D K E K V H L S D S E R K M D P A E E D T N V Y T E K

CAC TCA GAC AGT CTG TTT AAA CGG ACA GAA GTC CTA GCA GCT GTC ATT GCT GGT GGA GTT ATT GGC TTT CTC TTT GCA ATT TTT CTT ATC H S D S L F K R T E V L A A V I A G G V I G F L F A I F L I

CTG CTG TTG GTG TAT CGC ATG AGA AAG AAG GAT GAA GGA AGC TAT GAC CTT GGA GAA CGC AAA CCA TCC AGT GCT GCT TAT CAG AAG GCA L L L V Y R M R K K D E G S Y D L G E R K P S S A A Y Q K A

CCT ACT AAG GAG TTT TAT GCG TAA AAC TCC AAC TTA GTG TCT CTA TTT ATG AGA TCA CTG AAC TTT TCA P@A TAA *C TTT TGC ATA GAA P T K E F Y A *

TAA TGA AGA TCT TTG TTT TTT GTT TTC ATT AAA GGA CCA TTC TGG CAC TTT AAT GAT AAA ATC CCA TTG TAT TTA AAA CAT TTC ATG TAT TTC TTT AGA ACA ACA TAA AAT TAA AAT TTA ACA TCT GCA GTG TTC TGT GAA TAG CAG TGG CAA AAT ATT ATG TTA TGA AAA CCC TCG ATG TTC ATG GAA TTG GTT TAA ACT TTT ATG CGC AAA TAC AAA ATG ATT GTC TTT TTC CTA TGA CTC AAA GAT GAA AGC TGT TTC ATT TGT GTC AGC ATG TCT CAG ATT GAC CTT ACC AAG TTG GTC TTA CTT TGT TAA TTT ATC TGT TGT CCC CTT CCT CTC CTC TGC CCT CCC TTC TTG TGC CCT TAA AAC CAA ACC CTA TGC CTT TTG TAG CTG TCA TGG TGC AAT TTG TCT TTG GAA AAT TCA GAT AAT GGT AAT TTA GTG TAT ATG TGA TTT TCA AAT ATG TAA ACT TTA ACT TCC ACT TTG TAT AAA TTT TTA AGT GTC AGA CTA TCC ATT TTA CAC TTG CTT TAT TTT TCA TTA CCT GTA GCT TTG GGC AGA TTT GCA ACA GCA AAT TAA TGT GTA AAA TTG GAT TAT TAC TAC AAA ACC GTT TAG TCA TAT CTA TCT AAT CAG ATC TTC TTT TGG GAG GAT TTG ATG TAA GTT ACT GAC AAG CCT CAG CAA ACC CAA AGA TGT TAA CAG TAT TTT AAG AAG TTG CTG CAG ATT CCT TTG GCC ACT GTA TTT GTT AAT TTC TTG CAA TTT GAA GGT ACG AGT AGA GGT TTA AAG AAA AAT CAG TTT TTG TTC TTA AAA ATG CAT TTA

AAA AAA AAG TTG GTA TTT TAT AAG CAC AGA CAA TTC TAA TGG TAA CTT TTG TAG TCT TAT GAA TAG ACA TAA ATT GTA ATT TGG GAA CAT AGT TGT AAA CGT CTT TTT AAG CCT TTG AAG TGC CTC TGA TTC TAT GTA ACT TGT TGC AGA CTG GTG TTA ATG AGT ATA TGT AAC AGT TTA

AAA AAC TAC Tq-m CAT GTG GCC TAA TAT TGA AAA TGT CAC TGT TAT AAA TTT TGT ACA TTT TTG ATC AAA TGT ACA TCT CCC CTT TGC TAA CGG CCG TCT GCT CTC AGG TTG ACG TGG GTT TGA TTT CTA AGT GTT TCA CAG TGT CTG TAA ATC AAG ACC AAA GAG CCT GTC GAT GAG ACT GTT TAT TAC CAG ATT CAC TTC TGA ATT GGC CAG AGG AAA TCT GAA TGT ATT ATC CTG TGT GTG TCT AGG TAG AGA TAT TGG AAG GCT GCC AGG GGA TTT CGA AGT TTG CAA CCT TTA TAG GAT AAC TGA TGG CAA TAT TAA GAC AGA CGC CTG CTT TTG CAA ATA ACT TAC AAG ACT GTA AAT TCC AAA GAT CTG AAT GGG GCT TTC CTG ATG TTG GTA TCT AAG GCT TAG GCC TAT AGA TTG ATT TAC CTT TGG AAT TGT GCT CCA AAT GTC TAC TGA AGC TTA ACC GAA GAA CTA ATA AAT GGA CTA CAG TAG CTC ACG TTA CAG GGA ACC ACC CTA GGC AGG GAG GCT CTG TGT GTT AAA ATG AGG GTC TCA CTG CTT TAG GAT TGA AGT GGC TGG AAA GAG TGA TGC CTG GGG AAG GAG ATG GAG TTA TGA GGG TAC TGT GGC TGG TAC TTT CTG TAC TAA ACA TTT CCT TTT TCT ATT TTA CCA CTA ATT TTC TTT TAA ACT GTG AGC CGT CCA AGT CAG AAG AAG ACA GCA AAA AAA GCA ACT TTT CCA ACA TAC AAT TTA CTT TTA ATA AAG TAT GAA TAT TTC ATT TTG AGA ACA TTC CCT GGA ATT GCC ACA TAA TTC ATT AAA AAC ATT TTT TTA AGC AAC ACT TGG AAC AGT GTT TAC TTT AAA TCC TTA ATG GCC TTA ATT AAT TCT CAG ATT CCT GCC CCA TCA CTT ACA GAA CCA ATT CAC TTT AGA GTG ACT AAA AGG AAA CGA TAG CCT AGC TTT CTA AAG CCA CGC TGT GTC CCT CAA TTA CAG AGG GTA GGA ATG GNT ATA CCT CTA ACT GTG CAA AGC AGA GTG AAA TTC AAT TCA TAG AAT AAC AAC TGC TGG GAA TAT CCG TGC CAG GAA AAG AAA AAT TTC TGG CAA ATA TTT TGT CAC TGC TGT AAA GCA AAA TAT TTG TGA AAG TGC C A A l n ] G T C TGT CAT GCC AAA AGT AAA AAA A A A A A A A A A A A A A A A A A A

FIG. 2. Sequence and derived primary structure of clone 48K5. Clone 48K5 was sequenced as indicated under “Materials and Methods.” Canonical polyadenylation signals are boxed. The presumptive transmembrane domain is underlined. Thick lines underscore the potential glycosaminoglycan attachment sites. An N-glycosylation site is indicated with a broken line.


main of an integral membrane protein. The stop-transfer domain is followed by 33 amino acids, which would then form the cytoplasmic domain up to the carboxyl terminus. Assuming requirements similar to those for the synthesis of chondroitin sulfate, two Ser-Gly dipeptides and one Ser-Gly-Ser-Gly tetrapeptide in the extracellular NH2-terminal domain form three potential attachment sites for the heparan sulfate side chains of the HSPG. This attachment site has been defined as consisting of a few acidic residues closely followed by a Ser-Gly-X-Gly motif, with X denoting any amino acid (Bourdon et al., 1987). This is indeed the case for the Ser-Gly-Ser-Gly at position 251, which is closely preceded by 4 aspartic acids. According to the same model, the Ser-Gly at position 114 is unlikely to be an efficient glycosaminoglycan attachment site. The sequence around the Ser-Gly dipeptide at position 237, however, displays a striking similarity to the sequence found around the glycosaminoglycan attachment site of PG40, the core protein of a chondroitin/dermatan sulfate proteoglycan of human fibroblasts (Krusius and Ruoslahti, 1986) (Fig. 3). Although the second glycine of the Ser-Gly-X-Gly motif is lacking here, it should be noted that peptides lacking this second glycine still act as a substrate, albeit less efficiently, of xylosyl transferases in vitro (Bourdon et al., 1987). On the other hand, studies using model peptides


  1 - D  E  A  S  G  I  G  P - 8     PG40
      |  || || || || |     ||
234 - E  E  A  S  G  V  Y  P - 241   48K5

FIG. 3. Sequence similarity between the glycosaminoglycan attachment site of PG40 and a potential attachment site of 48K5. Identical amino acids and conservative substitutions are indicated. The sequence numbering of PG40 is from Krusius and Ruoslahti (1986). The numbering of 48K5 is as in Fig. 2.

[Fig. 4 dot blot image: strips 1 to 5; spotted samples: 48K1 fusion protein, 48K3 fusion protein, and POP 2136 pEX2.]

FIG. 4. Specificity of the rabbit anti-48K3 antibodies. Total cell extract of E. coli (POP 2136) carrying the pEX2 plasmid and purified 48K1 and 48K3 fusion proteins (see "Results") were spotted as indicated. The dot blot strips were then challenged with (1) the monoclonal antibody F58-10H4 (negative control); (2) the monoclonal antibody F58-6G12; (3) rabbit anti-48K3 fusion protein; (4) absorbed rabbit anti-48K3 specificities minus 48K1 specificities; and (5) preimmune antisera. The blots were then developed with the appropriate horseradish peroxidase-conjugated second antibodies and substrate.

[Fig. 5 immunoblot image: lanes A, B, C, and D; Mr marker positions include 97, 68, 42, 18, and 15.]

FIG. 5. Specific binding of the absorbed anti-48K3 antibodies to the 48- and 90-kDa core proteins. Cell surface-associated heparan sulfate proteoglycans, heparitinase-treated (lanes A, B, and C) or undigested (lane D), were separated on a 6-26% polyacrylamide gel under denaturing conditions and transferred onto nylon membranes. Different slices of the blot were developed with the monoclonal antibody F69-3G10, which binds all core proteins (A), the monoclonal antibody F58-6G12 (B), and the absorbed rabbit anti-48K3 minus 48K1 antibodies (C and D).

[Fig. 6 Northern blot image: strips hybridized with 48K5 and with probes I, II, III, and IV; 28 S and 18 S positions marked.]

FIG. 6. Northern analysis with 48K5-derived probes. Human lung fibroblast poly(A)+ RNA was separated on a denaturing formaldehyde agarose gel and blotted onto nylon membranes. The membrane strips were then hybridized with probes derived from 48K5 as indicated (see also Fig. 1).

[Fig. 7 histogram image: distribution of autoradiographic grains over the chromosomes (x axis: CHROMOSOMES).]

FIG. 7. In situ hybridization with the 48K3 probe. Metaphase chromosome spreads from human white blood cell cultures were hybridized with tritium-labeled 48K3 insert and subjected to autoradiography. 14% of all grains were present on chromosome 8q, which is 4.2 times the amount expected on the basis of a random distribution of the grains.

do not take into account the potential effects of the secondary structure of the protein.
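The attachment-site criteria discussed above (a few acidic residues closely followed by Ser-Gly-X-Gly; Bourdon et al., 1987) can be illustrated with a minimal Python sketch. It is not the procedure used in this work. The function name `find_gag_sites`, the 8-residue upstream window, and the toy peptide are all assumptions made for illustration, and the sketch ignores secondary structure entirely.

```python
import re

ACIDIC = frozenset("DE")  # Asp and Glu, one-letter codes


def find_gag_sites(seq, window=8):
    """List every Ser-Gly dipeptide with simple context features.

    Returns tuples (position, motif, n_acidic, has_second_gly):
    position is the 1-based index of the serine, motif is the serine plus
    up to three following residues, n_acidic counts Asp/Glu in the
    `window` residues upstream (window size is an assumption), and
    has_second_gly is True for a full Ser-Gly-X-Gly.
    """
    sites = []
    # Zero-width lookahead so that overlapping sites (SGSG) are all reported.
    for match in re.finditer(r"(?=SG)", seq):
        i = match.start()
        upstream = seq[max(0, i - window):i]
        n_acidic = sum(residue in ACIDIC for residue in upstream)
        has_second_gly = i + 3 < len(seq) and seq[i + 3] == "G"
        sites.append((i + 1, seq[i:i + 4], n_acidic, has_second_gly))
    return sites


# Hypothetical toy peptide, not the 48K5 sequence.
toy_peptide = "KKDDDDLSGSGAA"
for site in find_gag_sites(toy_peptide):
    print(site)
```

In such a scan a site with several upstream acidic residues and a second glycine would rank as the more likely acceptor, while a Ser-Gly lacking the second glycine would be kept as a weaker candidate rather than discarded.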

A potential N-glycosylation site (Asn-X-Ser) is found at position 230. Resistance toward N-glycanase, however, suggests that the 48/90-kDa core proteins are not N-glycosylated.*
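A scan for potential N-glycosylation sites can be sketched in the same way. The snippet below assumes the general sequon Asn-X-Ser/Thr with X being any residue except proline, which is slightly broader than the Asn-X-Ser form named in the text. The function name `find_sequons` and the test peptide are hypothetical, and a sequence match says nothing about whether a site is actually used.

```python
import re


def find_sequons(seq):
    """Return (position, tripeptide) for each Asn-X-Ser/Thr, X not Pro.

    Positions are 1-based and refer to the asparagine.
    """
    # Lookahead keeps overlapping sequons from being skipped.
    return [
        (m.start() + 1, seq[m.start():m.start() + 3])
        for m in re.finditer(r"(?=N[^P][ST])", seq)
    ]


# Hypothetical peptide: NGS and NAT qualify, NPS does not.
print(find_sequons("AANGSAANPSNAT"))
```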

48K5 Codes for Part of the 48/90-kDa Core Protein-To confirm the identity of clone 48K5 as coding for part of the 48/90-kDa core protein, the coding sequence of 48K1 (corresponding to bases 954-1385 of 48K5, see Fig. 1) and part of the coding sequence of clone 48K3 (corresponding to bases 602-1385 of 48K5) were subcloned into the expression plasmid pEX2 (Stanley and Luzio, 1984). Synthesis of the β-galactosidase fusion proteins was induced in the Escherichia coli strain POP 2136, and the appropriate fusion proteins were purified. A polyclonal antiserum was obtained by immunizing rabbits with the larger 48K3 fusion protein. Anti-48K3 antibodies were affinity-purified on a 48K3 fusion protein column and absorbed with E. coli extract coupled to CNBr-activated Sepharose.

The anti-48K3 antibodies were then absorbed with the shorter 48K1 fusion protein, which carries the 6G12 epitope. Complete removal of the 48K1 specificities was monitored on dot blots with both 48K1 and 48K3 fusion proteins (Fig. 4). These absorbed polyclonal antibodies react with the 48K3 fusion protein but not with the 48K1 fusion protein and therefore recognize at least one epitope different from the 6G12 epitope, which is located on the 48K1 fusion protein. The absorbed antibodies (48K3 specificities minus 48K1 specificities) react with diffuse high molecular weight species on a Western blot of detergent-extracted lung fibroblast heparan sulfate proteoglycans (Fig. 5, lane D). After heparitinase treatment of the samples, Western blots developed with the absorbed polyclonal antiserum reveal the same 90- and 48-kDa species as detected by 6G12 (Fig. 5, lanes B and C). This demonstrates that the 90-

* G. David, unpublished results.


TABLE I Chromosome mapping of 48K5

DNA isolated from human/mouse hybrids was digested with EcoRI. The fragments were separated by size on an agarose gel, blotted onto a nylon membrane, and hybridized with a 32P-labeled 48K5 cDNA probe. A c-myc probe, mapping to 8q24 (HGM9), was used as a control. + denotes the presence of a chromosome in at least 2 out of 10 metaphase spreads or the presence of human-specific fragments revealed with the c-myc and the 48K5 probes.

[Table I body: the hybrids F49D5S1, LN5S2F43, LN3S3F31, M37, LN5F31C4, LN3BF49S1, F49D3S1, LN3F31D2, V27, F49D3S2, F49C4, FLSN9, LN5G5F31, and FL5N35 were scored + or - for human chromosomes 1 to 22 and X and for the 48K5 and c-myc probes. The individual + and - entries are not recoverable from this copy.]

and 48-kDa core proteins and the 48K5 protein share, in addition to the 6G12 epitope, other distinct antigenic determinants detected by the polyclonal antiserum and that in human lung fibroblasts these epitopes are unique for the 90/48-kDa core protein of membrane-associated HSPG. We therefore conclude that 48K5 is a partial cDNA clone for both the 90- and 48-kDa core proteins.

Northern Analysis-Upon hybridization of Northern blots of human lung fibroblast poly(A)+ RNA with 48K5, two main RNA species of 4200 and 2900 bases, respectively, were highlighted (Fig. 6). Longer exposures sometimes revealed other minor species. When smaller probes were generated by restriction enzyme digestion of 48K5, all probes derived from upstream of the PstI site at base 1969 lit up both RNA species, while a probe reaching from the XmaIII site at base 2343 to the 3' end of 48K5 hybridized exclusively to the 4200-base mRNA. This indicates that the different mRNA species differ at least in their noncoding 3' ends.

The 48K mRNAs of Different Size Are Generated by Alternative Use of Polyadenylation Signals-To investigate possible heterogeneity of the 48K clones, the 5' and 3' ends of all clones isolated from the λZAP human lung fibroblast cDNA library in the last selection round were sequenced. No sequence divergences were found at the 3' end. The 3' ends were, however, clearly clustered (Fig. 1). 48K3, 48K5, and two more 48K clones possess the AATAAA signal at base 3390, followed 20 bases downstream by a poly(A) sequence. Clone 48K1 and four other 48K clones stop around base 2250 of the 48K5 sequence, and another three 48K clones all end around base 1350 of the 48K5 sequence. Inspection of the 48K5 sequence reveals an AATAAA polyadenylation signal starting at base 1240 and one more starting at base 2262 of clone 48K5 (Fig. 2). In addition, one clone of each cluster carries a short poly(A) stretch not accounted for in the 48K5 sequence. From this evidence and from the results of the Northern analysis reported above, it appears that alternative polyadenylation signals present in the 48K5 sequence are being recognized in fibroblasts to produce mRNAs of different sizes. It should be noted that the size difference between mRNAs generated using the 3390 and the 2262 polyadenylation signals is large enough to account for the size difference of the two major mRNA species detected.
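The reasoning in this paragraph can be retraced with a short Python sketch. It is an illustration only: the helper names, the 25-base cleavage offset, and the assumption that all transcripts share the same 5' region are ours, not results from this work. With the positions quoted above, the two signals lie 3390 - 2262 = 1128 bases apart, against roughly 1300 bases between the two major bands, and only a probe starting at the XmaIII site (base 2343) would be expected to miss the shorter transcript.

```python
POLYA_SIGNAL = "AATAAA"


def find_polya_signals(seq):
    """Return 1-based start positions of every AATAAA hexamer."""
    seq = "".join(seq.upper().split())  # tolerate spaces and line breaks
    positions = []
    start = seq.find(POLYA_SIGNAL)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(POLYA_SIGNAL, start + 1)  # allow overlapping hits
    return positions


def probe_detects(probe_start, transcript_end):
    """A probe can hybridize if it begins before the transcript's 3' end.

    ASSUMPTION: every transcript shares the same 5' region.
    """
    return probe_start <= transcript_end


# Signal positions and mRNA sizes quoted in the text.
signal_starts = {"proximal": 2262, "distal": 3390}
predicted_difference = signal_starts["distal"] - signal_starts["proximal"]
observed_difference = 4200 - 2900

# ASSUMPTION: cleavage roughly 25 bases past the start of each signal.
CLEAVAGE_OFFSET = 25
transcript_ends = {
    name: pos + CLEAVAGE_OFFSET for name, pos in signal_starts.items()
}

# Probe start coordinates: base 1 stands in for the probes upstream of PstI.
probes = {"upstream of PstI": 1, "XmaIII to 3' end": 2343}
detection = {
    probe: sorted(
        name for name, end in transcript_ends.items()
        if probe_detects(start, end)
    )
    for probe, start in probes.items()
}

print(predicted_difference, observed_difference)
print(detection)
```

The gap between the predicted and observed differences is within what gel-based size estimates and poly(A) tail length could plausibly absorb, which is all the text itself claims.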

The 5' ends of the 48K clones clearly cluster in the GC-rich regions at the 5' end of clone 48K5. It appears that the problems encountered in obtaining full-length clones, even from unamplified libraries, might stem from the apparent difficulty of the Moloney murine leukemia virus reverse transcriptase in copying these sequences. 48K clones containing the 5' end of 48K5 were even totally absent from an avian myeloblastosis virus reverse transcriptase-generated cDNA library (result not shown).

Genomic Mapping of the 48K5 Gene-In situ hybridization with 3H-labeled 48K5 yielded a strong signal on chromosome 8 (Fig. 7). A total of 896 grains were scored on 429 metaphases; 15% of these were present on chromosome 8. The majority of the grains (98 out of 136 grains on chromosome 8) mapped to 8q23. Southern hybridization with a panel of human/mouse somatic cell hybrids is concordant only with the unique localization of the 48K5 gene on chromosome 8 (Table I). Southern hybridization of the 48K5 probe to DNA obtained from random individuals and digested with several restriction enzymes showed simple patterns with a limited number of fragments (not shown), consistent with the presence of a unique 48K5 gene in the human genome.
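The grain counts and the concordance argument can be checked with a few lines of Python. The counts are those quoted above. The hybrid panel in the second half is entirely hypothetical and stands in for Table I, whose entries are not reproduced here, and the function name `concordance` is an assumption made for illustration.

```python
# Grain counts quoted in the text.
grains_total = 896
grains_chr8 = 136
grains_8q23 = 98

fraction_chr8 = grains_chr8 / grains_total   # share of all grains on chr 8
fraction_8q23 = grains_8q23 / grains_chr8    # share of chr 8 grains at 8q23
print(f"{fraction_chr8:.1%} of grains on chromosome 8")
print(f"{fraction_8q23:.1%} of those at 8q23")


def concordance(probe_calls, chromosome_calls):
    """Count hybrids where probe signal and chromosome presence agree.

    Returns (concordant, discordant).
    """
    concordant = sum(p == c for p, c in zip(probe_calls, chromosome_calls))
    return concordant, len(probe_calls) - concordant


# HYPOTHETICAL five-hybrid panel (True = present), not the data of Table I.
panel = {
    "chr7": [True, False, True, True, False],
    "chr8": [True, True, False, True, False],
    "chr9": [False, True, False, False, True],
}
probe_signal = [True, True, False, True, False]

# A chromosome stays a candidate only if no hybrid is discordant.
candidates = [
    chrom for chrom, calls in panel.items()
    if concordance(probe_signal, calls)[1] == 0
]
print(candidates)
```

A single discordant hybrid is enough to exclude a chromosome in this strict form. In practice a control probe of known location, such as the c-myc probe used here, helps confirm that the panel itself is scored correctly.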

DISCUSSION

The molecular cloning of a partial cDNA for the 48/90-kDa core protein of HSPG from human lung fibroblasts is presented here. The evidence is indirect but conclusive: we have shown that the protein coded for by 48K5 and the 48/90-kDa core proteins share at least two independent antigenic sites, i.e. the epitope of 6G12, which is unique for the 48/90-kDa core proteins in human lung fibroblasts, and the determinant(s) of the polyclonal antiserum, which was raised against a different fragment of the 48K5 protein and which selectively stains the 48/90-kDa core protein on Western blots of total cellular extracts of HSPG of human lung fibroblasts.

The present work further establishes the occurrence of multiple cell surface HSPG on human lung fibroblasts. It has been documented that the occurrence of 125-, 90-, 64-, 48-, and 35-kDa core proteins is not an artifact generated during purification of these core proteins (De Boeck et al., 1987; Lories et al., 1986, 1987). Monoclonal antibodies detect different epitopes on each of these core proteins, and at least some of these epitopes seem to be uniquely defined by the protein moiety of the HSPG (De Boeck et al., 1987; Lories et al., 1989). The polyclonal antiserum generated against the 48K3


protein does recognize the 48- and 90-kDa core proteins but not the 125-, 64-, and 35-kDa proteins. It is therefore unlikely that one gene product is processed to yield the 125-, 90-, 64-, 48-, and 35-kDa core proteins.

The 90- and 48-kDa core proteins, however, are clearly related, although the exact nature of this relation remains to be elucidated. Several hypotheses are still possible. It is unlikely but not excluded that two different genes code for the two proteins. The data presented here indeed indicate that several epitopes detected on the 48-kDa protein are present on the 90-kDa protein, thereby suggesting that the 48-kDa protein sequences are contained in the 90-kDa protein. Sequencing of the 5' end of all clones isolated in the last selection round and restriction mapping of all clones isolated did not detect any heterogeneity in the coding part of the 48K5 sequence. Furthermore, the in situ hybridization and the Southern hybridization to human/mouse somatic cell hybrids indicate that the 48K5 gene maps exclusively to chromosome 8q23, while the simple patterns obtained upon Southern hybridization suggest the presence of a single 48K5 gene.

On the other hand, one gene could yield mRNAs for the 48- and 90-kDa proteins by differential processing of RNAs, or one mRNA could yield both proteins by posttranslational processing. Cloning of the 5' end of the mRNA should allow us to elucidate this. To this end, primer-extended cDNA libraries with primers designed to hybridize upstream of the GC-rich regions of 48K5 will be constructed.

The 48K5 probe detects two main and possibly some minor mRNA species upon Northern blotting of human lung fibroblast poly(A)+ RNA. The present experiments show that, at least for the two main species of 4200 and 2900 bases, respectively, the difference could be generated by differential use of polyadenylation signals in the 48K5 sequence. Indeed, sequences 3' of the XmaIII site at bp 2343 are exclusively present in the 4200-nucleotide species, and one clone (48-6D4, see Fig. 1) was shown to possess a poly(A) tail downstream of the potential polyadenylation signal at bp 2262. The alternative use of different polyadenylation signals is a known mechanism whereby one gene can generate multiple mRNAs (for review, see Leff et al., 1986; Breitbart et al., 1987). In some cases, the alternative use of polyadenylation signals seems to be linked to differential splicing and thus to the synthesis of different gene products, such as the generation of secreted or membrane-bound forms of immunoglobulins (Perry and Kelley, 1979) or the synthesis of calcitonin hormone in thyroid cells and of calcitonin-related peptide in the brain (Amara et al., 1982). It should be noted, however, that all potential polyadenylation signals occur downstream from the stop codon of 48K5 and that no variability was detected in the coding region. The alternative use of the different polyadenylation signals therefore seems not to be linked to protein heterogeneity, at least not for the carboxyl-terminal 43 kDa of the molecule. An intriguing example of differential polyadenylation was recently reported by Powell et al. (1987). According to these authors, a single gene generates the message for apoB-100 in the liver and for apoB-48 in the intestine by some novel mechanism of RNA editing. In addition, in the intestine but not in the liver, the alternative use of polyadenylation signals leads to size variability of the mRNAs, and a possible relation between this differential use of polyadenylation signals and the RNA-editing mechanism was hypothesized (Powell et al., 1987).

The most distinctive feature of the 48K5 protein is the presence of a stop-transfer sequence, suggesting that 48K5 is an integral membrane protein with the extracellular NH2 terminus carrying the glycosaminoglycan side chain(s) and a small cytoplasmic domain with the carboxyl terminus. This feature clearly distinguishes 48K5 from the lung fibroblast chondroitin/dermatan sulfate core protein PG40 (Krusius and Ruoslahti, 1986), with which it seems to share only a glycosaminoglycan attachment site, and from the core protein of the large molecular weight lung fibroblast chondroitin sulfate proteoglycan (Krusius et al., 1987). Also, no sequence similarity was detected with the sequences of thrombomodulin (Jackman et al., 1986) and the class II (Ia) histocompatibility antigen-associated invariant chain of HLA (Giacoletto et al., 1986), integral membrane proteins which have been defined as "part time" proteoglycans by Fransson (1987).

Binding of integral membrane forms of proteoglycans to actin (Rapraeger and Bernfield, 1982) and the cytoskeleton (Woods et al., 1985) has been reported. Cell surface proteoglycans of rat fibroblasts codistribute with stress fibers in fully spread cells and with actin bundles during cell spreading and rounding (Woods et al., 1984). Proteoglycans with integral membrane core proteins would therefore act as a link between the extracellular matrix components and the cytoskeleton, thereby contributing to processes such as cell attachment, cell spreading, and the maintenance of cell shape.

The availability of data on the primary structure of the extracellular, the transmembrane, and the cytoplasmic domains of a cell surface-associated proteoglycan will allow us to define further the structure-function relationships within this class of integral membrane proteins.

Acknowledgments-We thank Hilde Braeken, An RayC, Magda Dehaen, and Marleen Willems for their expert technical assistance.

REFERENCES

Amara, S. G., Jonas, V., Rosenfeld, M. G., Ong, E. S., and Evans, R. M. (1982) Nature 298, 240-244
Aviv, H., and Leder, P. (1972) Proc. Natl. Acad. Sci. U. S. A. 69, 1408-1412
Bienkowski, M. J., and Conrad, H. E. (1984) J. Biol. Chem. 259, 12989-12996
Bourdon, M. A., Krusius, T., Campbell, S., Schwartz, N. B., and Ruoslahti, E. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 3194-3198
Bourin, M. C., Boffa, M. C., Bjork, I., and Lindahl, U. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 5924-5928
Breitbart, R. E., Andreadis, A., and Nadal-Ginard, B. (1987) Annu. Rev. Biochem. 56, 467-495
Chen, E. J., and Seeburg, P. H. (1985) DNA (N Y) 4, 165-170
Church, G. M., and Gilbert, W. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 1991-1995
Cole, G. J., Schubert, D., and Glaser, L. (1985) J. Cell Biol. 100, 1192-1199
Coster, L., Carlstedt, I., Kendall, S., Malmstrom, A., Schmidtchen, A., and Fransson, L. A. (1986) J. Biol. Chem. 261, 12079-12088
David, G., and Van den Berghe, H. (1983) J. Biol. Chem. 258, 7338-7344
De Boeck, H., Lories, V., David, G., Cassiman, J.-J., and Van den Berghe, H. (1987) Biochem. J. 247, 765-771
Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
Fransson, L. A. (1987) Trends Biochem. Sci. 12, 406-411
Fransson, L. A., Carlstedt, I., Coster, L., and Malmstrom, A. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 5657-5661
Fritze, L. M. S., Reilly, C. F., and Rosenberg, R. D. (1985) J. Cell Biol. 100, 1041-1049
Giacoletto, K. S., Sant, A. J., Bono, C., Gorka, J., O'Sullivan, D. M., Quaranta, V., and Schwartz, B. D. (1986) J. Exp. Med. 164, 1422-1439
Glössl, J., Schubert-Prinz, R., Gregory, J. P., Damle, S., von Figura, K., and Kresse, H. (1983) Biochem. J. 215, 295-301
Harper, M. E., and Saunders, G. G. (1981) Chromosoma (Berl.) 83, 431-439
Henikoff, S. (1984) Gene (Amst.) 28, 351-359
Hook, M., Kjellén, L., Johansson, S., and Robinson, J. (1984) Annu. Rev. Biochem. 53, 847-869
Huynh, T. V., Young, R. A., and Davis, R. W. (1985) in DNA Cloning (Glover, D. M., ed) Vol. I, pp. 49-78, IRL Press Ltd., Oxford, Great Britain
Ishihara, M., Fedarko, N. S., and Conrad, H. E. (1986) J. Biol. Chem. 261, 13575-13580
Ishihara, M., Fedarko, N. S., and Conrad, H. E. (1987) J. Biol. Chem. 262, 4708-4716
Jackman, R. W., Becher, D. L., Van De Water, J., and Rosenberg, R. D. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 8834-8838
Jackson, F. R., Bargiello, T. A., Yun, S.-H., and Young, M. W. (1986) Nature 320, 185-188
Kjellén, L., Oldberg, A., and Hook, M. (1980) J. Biol. Chem. 255, 10407-10413
Koda, J. E., and Bernfield, M. (1984) J. Biol. Chem. 259, 11763-11770
Koda, J. E., Rapraeger, A., and Bernfield, M. (1985) J. Biol. Chem. 260, 8157-8162
Krusius, T., and Ruoslahti, E. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 7683-7687
Krusius, T., Gehlsen, K. R., and Ruoslahti, E. (1987) J. Biol. Chem. 262, 13120-13125
Laterra, J., Silbert, J. E., and Culp, L. (1983) J. Cell Biol. 96, 112-123
Leff, S. E., Rosenfeld, M. G., and Evans, R. M. (1986) Annu. Rev. Biochem. 55, 1091-1117
Lories, V., David, G., Cassiman, J.-J., and Van den Berghe, H. (1986) Eur. J. Biochem. 158, 351-360
Lories, V., De Boeck, H., David, G., Cassiman, J.-J., and Van den Berghe, H. (1987) J. Biol. Chem. 262, 854-859
Lories, V., Cassiman, J.-J., Van den Berghe, H., and David, G. (1989) J. Biol. Chem. 264, 7009-7016
Low, D. A., Baker, J. B., Koone, W. C., and Cunningham, D. D. (1981) Proc. Natl. Acad. Sci. U. S. A. 78, 2340-2344
Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Marcum, J. A., Atha, D. H., Fritze, L. M. S., Nawroth, P., Stern, D., and Rosenberg, R. D. (1986) J. Biol. Chem. 261, 7507-7517
Mizusawa, S., Nishimura, S., and Seela, F. (1986) Nucleic Acids Res. 14, 1319-1324
Perry, R. P., and Kelley, D. E. (1979) Cell 18, 1333-1339
Powell, L. M., Wallis, S. C., Pease, R. J., Edwards, Y. H., Knott, T. J., and Scott, J. (1987) Cell 50, 831-840
Rapraeger, A. C., and Bernfield, M. (1982) in Extracellular Matrix (Hawkes, S., and Way, J. L., eds) pp. 265-267, Academic Press, New York
Rapraeger, A. C., and Bernfield, M. R. (1985) J. Biol. Chem. 260, 4103-4109
Ratner, N., Bunge, R. P., and Glaser, L. (1985) J. Cell Biol. 101, 744-754
Sabatini, D. D., Kreibich, G., Morimoto, T., and Adesnik, M. (1982) J. Cell Biol. 92, 1-22
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Sant, A. J., Cullen, S. E., and Schwartz, B. D. (1985) J. Immunol. 135, 416-422
Stanley, K., and Luzio, K. (1984) EMBO J. 3, 1429-1434
Woods, A., Hook, M., Kjellén, L., Smith, C. G., and Rees, D. A. (1984) J. Cell Biol. 99, 1743-1753
Woods, A., Couchman, J. P., and Hook, M. (1985) J. Biol. Chem. 260, 10872-10879
Young, R. A., and Davis, R. W. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 1194-1198
Zhang, Y., Saison, M., Spaepen, M., De Strooper, B., Van Leuven, F., David, G., Van den Berghe, H., and Cassiman, J.-J. (1988) Somatic Cell Mol. Genet. 14, 99-104