Isolation and Molecular Evolutionary Analysis of a Cytochrome c ...


Isolation and Molecular Evolutionary Analysis of a Cytochrome c Gene from Oryza sativa (Rice)¹

Elizabeth C. Kemmerer,*,² Ming Lei,* and Ray Wu*,†
*Field of Botany and †Section of Biochemistry, Cornell University

A cytochrome c gene, OsCc-1, from rice (Oryza sativa) has been isolated and analyzed. The OsCc-1 gene encodes a cytochrome c protein that is typical of higher-plant cytochrome c proteins. OsCc-1 consists of three exons separated by two introns that are 817 and 747 bp in length, respectively. From genomic DNA hybridization analysis, OsCc-1 appears to be one of possibly two cytochrome c genes in several Asian, American, and Indian rice species and varieties surveyed. A single, unique cytochrome c gene appears to be present in one African cultivated rice species. We performed comparative molecular evolutionary analyses of OsCc-1 and other cytochrome c genes. We calculated a unit evolutionary period of 19.4 Myr for cytochrome c DNA sequences, which agrees closely with previous estimates based on protein sequence comparisons.

Introduction

Cytochrome c is a small heme protein located on the outer surface of the inner mitochondrial membrane. It is a protein in the respiratory chain that transfers electrons from cytochrome c reductase to cytochrome c oxidase. This process is responsible for generating most of the proton gradient across the inner mitochondrial membrane that drives adenosine triphosphate synthesis in all eukaryotic cells (Hinkle and McCarty 1978).

The cytochrome c protein sequence has been used extensively in molecular evolutionary comparisons because it is present in all eukaryotes and easily purified (Dayhoff et al. 1972; Schwartz and Dayhoff 1978; Baba et al. 1981; Syvanen et al. 1989). There are advantages to using DNA sequences instead of amino acid sequences. The degeneracy of the genetic code and lower constraints on noncoding regions allow for analysis of closely related species that have identical amino acid sequences. Such may be the case for the identical cytochrome c amino acid sequences in Brassica napus L. and B. oleracea L. (Scogin 1981). In addition, the phylogenetic tree based on the molecular evolutionary study of cytochrome c DNA sequences in vertebrates is in striking agreement with that based on the rich vertebrate fossil record (Scogin 1981). Hence, cytochrome c DNA sequences are excellent candidates for a comparative molecular evolutionary study of organisms that have inadequate fossil records, such as angiosperms.

As a first step toward the analysis of plant cytochrome c genes, we have previously isolated and analyzed an Arabidopsis thaliana cytochrome c gene (Kemmerer et al.,

1. Key words: cladograms, cytochrome c, introns, phenograms, unit evolutionary period.

2. Current address: USDA-ARS-BARC West, Plant Molecular Biology Laboratory, Building 006, Beltsville, Maryland 20705.

Address for correspondence and reprints: Ray Wu, Section of Biochemistry, Molecular and Cell Biology, Biotechnology Building, Cornell University, Ithaca, New York 14853.

Mol. Biol. Evol. 8(2):212-226. 1991. © 1991 by The University of Chicago. All rights reserved. 0737-4030/91/08024004$02.00

Downloaded from https://academic.oup.com/mbe/article-abstract/8/2/212/1134106 by guest on 05 February 2018


accepted). This gene, although structurally characteristic of dicotyledonous plant nuclear genes, encodes a cytochrome c protein atypical of other higher-plant cytochrome c proteins that were determined by protein sequencing. Here we report the isolation and analysis of a cytochrome c gene from rice (Oryza sativa), the first cytochrome c gene cloned from a monocotyledonous plant. The rice cytochrome c gene, designated “OsCc-1,” encodes a protein characteristic of a plant cytochrome c.

In a molecular evolutionary analysis, we present both phenograms and cladograms constructed from the coding-region DNA sequence of this and other cytochrome c genes. We have also calculated a unit evolutionary period (UEP) for cytochrome c DNA sequences.

Material and Methods

Isolation of Oryza sativa Genomic DNA

Rice seeds (Oryza sativa, cv. Indica IR36) were sterilized in 1% sodium hypochlorite for 12-16 h. The seeds were rinsed thoroughly with sterile, deionized water and spread on sterile 3-mm Whatman paper. A sterile fungizone (GIBCO Laboratories, Grand Island, N.Y.) solution was added such that the filter paper was soaked. The seeds were grown for 2 wk in the dark at 37°C. The fungizone solution was replaced approximately every 4 d. Genomic DNA was purified using the procedure of Moon et al. (1987), followed by cesium chloride purification and dialysis against TE (10 mM Tris-HCl, 1 mM ethylenediaminetetraacetate, pH 8.0).

Polymerase Chain Reaction (PCR)

Two primers were synthesized that annealed to the DNA sequence encoding highly conserved regions of the cytochrome c protein. The 5’ primer, named 2-5’, was 5’-GAGAATTCTTCAAGACYAAGTGCGCYCARTGCCACA-3’, where Y = C or T and R = A or G. The first two 5’ bases in this primer have no function except to extend the primer two more bases to ensure the integrity of the ends of the PCR products. The next six bases constitute an EcoRI site to facilitate cloning of PCR products. The remaining 28 bases represent the possible DNA sequence encoding amino acids Phe at position 10 to part of Thr at position 19 of cytochrome c. The 3’ primer, named 3-3’, was 5’-CTAAGCTTASACCATCTTRGTRCCRGGDATGTACTTC-3’, where S = C or G, R = A or G, and D = A or G or T. The first two 5’ bases have the same role as described above. The next six bases constitute a HindIII site to facilitate cloning of PCR products. The next 29 bases represent the complement of the possible DNA sequence encoding amino acids Lys at position 72 to part of Phe at position 82 of cytochrome c. For both primers, the codon preferences employed by plant nuclear genes (Murray et al. 1989) were considered, in order to keep the number of unique primers in each mixture low.
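The number of unique primers in each mixture follows directly from the degenerate positions in the two sequences given above. The following is a minimal sketch, not part of the original methods: the function names `degeneracy` and `expand` are illustrative, and only the IUPAC codes that appear in the primers are included.

```python
from itertools import product

# IUPAC codes that occur in the two primers (Y, R, S, D) plus the plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "S": "CG", "D": "AGT",
}

PRIMER_5 = "GAGAATTCTTCAAGACYAAGTGCGCYCARTGCCACA"   # primer 2-5'
PRIMER_3 = "CTAAGCTTASACCATCTTRGTRCCRGGDATGTACTTC"  # primer 3-3'


def degeneracy(primer):
    """Number of unique sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every unique sequence in the degenerate primer mixture."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]


for name, primer, site in (("2-5'", PRIMER_5, "GAATTC"),
                           ("3-3'", PRIMER_3, "AAGCTT")):
    # Bases 3-8 (after the two extension bases) should be the cloning site.
    print(name, len(primer), "nt;", degeneracy(primer), "unique primers;",
          "cloning site present at bases 3-8:", primer[2:8] == site)
```

Under these definitions the 5’ primer mixture contains 8 unique sequences and the 3’ primer mixture contains 48, which is relevant to the "0.1 ug of each unique primer" used in the reaction.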

For the PCR reaction, 1 µg of rice genomic DNA was added to 0.1 µg of each unique primer in each mixture in a 100-µl reaction buffer containing 50 mM KCl, 10 mM Tris-HCl, pH 8.3, 1 mM MgCl2, 100 µg gelatin/ml, 200 µM of each dNTP (dATP, dGTP, dCTP, and dTTP), and 2.5 units Thermus aquaticus DNA polymerase (United States Biochemicals, Cleveland). This was topped with three drops of mineral oil in an Eppendorf tube. The reaction was run in an Ericomp programmable cycle reactor instrument (San Diego). The initiation cycle was 92°C for 2 min, 60°C for 1 min, and 72°C for 3 min. The reaction cycle (92°C for 0.5 min, 60°C for 1 min, and 72°C for 2 min) was repeated 35 times. This was followed by a termination cycle that was 92°C for 0.5 min, 60°C for 1 min, and 72°C for 10 min. After cooling to room temperature, the aqueous phase was drawn out with a capillary pipette and transferred to a different tube. This was extracted with an equal volume of phenol and then with an equal volume of chloroform. The double-stranded DNA was preferentially precipitated by the addition of 0.5 vol of 7.5 M ammonium acetate and of 2 vol ethanol, followed by incubation at room temperature for 5 min. The DNA was collected by centrifugation at 12,000 g for 15 min and resuspended in 40 µl TE. The PCR products were resolved on a 1% low-melting-point agarose gel. After being photographed, the desired bands were cut out of the gel, melted at 65°C for 5 min, phenol extracted twice, and precipitated by the addition of 0.1 vol 3 M sodium acetate and of 2 vol ethanol, followed by incubation at -20°C for at least 2 h. The DNA was collected by centrifugation at 12,000 g for 15 min, dissolved in water, digested with EcoRI and HindIII, ligated to EcoRI-and-HindIII-cut pBluescript vectors (Stratagene, Palo Alto, CA), and transformed into JM101 competent cells.
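As a quick check on the cycling program described above, the programmed hold times can be summed. This is only a sketch: the variable and function names are illustrative, and the time the instrument spends ramping between temperatures is not stated in the text and is ignored here.

```python
# Each step is (temperature in °C, hold time in minutes), as given in the text.
INITIATION = [(92, 2.0), (60, 1.0), (72, 3.0)]
REACTION = [(92, 0.5), (60, 1.0), (72, 2.0)]
TERMINATION = [(92, 0.5), (60, 1.0), (72, 10.0)]
REACTION_REPEATS = 35


def hold_minutes(cycle):
    """Total programmed hold time of one cycle, in minutes."""
    return sum(minutes for _, minutes in cycle)


def total_hold_minutes():
    """Programmed hold time of the whole run, excluding ramp times."""
    return (hold_minutes(INITIATION)
            + REACTION_REPEATS * hold_minutes(REACTION)
            + hold_minutes(TERMINATION))


print(f"Programmed hold time: {total_hold_minutes():.1f} min")
```

The actual run would take longer than this figure by however long the instrument needs to change temperature at each step.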

Genomic Library Screening, Hybridization Analysis, and DNA Sequencing

A genomic library containing partially digested Sau3A fragments of O. sativa (cv. IR36) genomic DNA in lambda-DASH (Stratagene, La Jolla, CA) was a gift from Drs. Steve Kay and Nam-Hai Chua. Approximately 1 × 10⁶ plaques were screened with a random-hexamer 32P-labeled PCR-1 probe. Prehybridization and then hybridization were carried out at 42°C for 24 h in 1% (w/v) SDS (sodium dodecyl sulfate), 1 M NaCl, and 10% (w/v) dextran sulfate. The filters were washed successively at 42°C for 20 min each in the following solutions: 2 × SSC (1 × SSC = 0.15 M sodium chloride, 0.015 M sodium citrate), 0.1% (w/v) SDS; 1 × SSC, 0.1% (w/v) SDS; and 0.5 × SSC, 0.1% (w/v) SDS.

For the genomic DNA hybridization analysis, the rice genomic DNA was prehybridized for 12 h and then hybridized for 24 h at 60°C in 1% (w/v) SDS, 1 M NaCl, and 10% (w/v) dextran sulfate. The filter was washed successively at 60°C for 20 min in each of the following solutions: 1 × SSC, 0.1% (w/v) SDS; 0.5 × SSC, 0.1% (w/v) SDS; 0.1 × SSC, 0.1% (w/v) SDS; and 0.01 × SSC, 0.1% (w/v) SDS. Three 32P-labeled probes were used separately: the 4-kb EcoRI fragment containing the entire protein coding region, introns, and some noncoding regions of the OsCc-1 gene; the 2.4-kb SalI-EcoRI fragment containing most of the coding region, both introns, and the 3’ flanking region; and the 1.6-kb EcoRI-SalI fragment containing some of exon I, no intron sequences, and the 5’ noncoding region. Each probe was stripped from the filter before reprobing by treating the filter with 0.4 N NaOH for 15 min, followed by neutralization in 0.1 × SSC, 0.1% (w/v) SDS, 0.2 M Tris-HCl, pH 7.5 for 10 min.
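The two wash series differ in temperature and in how far the salt concentration is lowered. The sketch below tabulates the salt content of each wash, assuming the standard definition of 1 × SSC as 0.15 M sodium chloride and 0.015 M sodium citrate; the names `SSC_1X_MM` and `ssc_composition` are illustrative.

```python
# Composition of 1 x SSC in millimolar (standard definition, assumed here).
SSC_1X_MM = {"sodium chloride": 150.0, "sodium citrate": 15.0}

# SSC strengths of the successive washes described in the text.
LIBRARY_SCREEN_WASHES = [2.0, 1.0, 0.5]        # washes at 42°C
GENOMIC_BLOT_WASHES = [1.0, 0.5, 0.1, 0.01]    # washes at 60°C


def ssc_composition(strength):
    """Salt concentrations (mM) in a wash of the given SSC strength."""
    return {salt: round(mm * strength, 3) for salt, mm in SSC_1X_MM.items()}


for label, washes in (("library screen, 42°C", LIBRARY_SCREEN_WASHES),
                      ("genomic blot, 60°C", GENOMIC_BLOT_WASHES)):
    for strength in washes:
        comp = ssc_composition(strength)
        print(f"{label}: {strength} x SSC = "
              f"{comp['sodium chloride']} mM NaCl, "
              f"{comp['sodium citrate']} mM sodium citrate")
```

The final genomic blot wash is both hotter and lower in salt than the final library screen wash, which makes it the more stringent of the two.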

Various restriction fragments of OsCc-1 were subcloned into M13 or pBluescript and sequenced by the dideoxynucleotide chain-termination method of Sanger et al. (1977). Both strands were sequenced in coding regions.

Computer Analysis and Molecular Evolution

Cytochrome c DNA sequences were obtained from Smith et al. (1979), Montgomery et al. (1980), Scarpulla et al. (1981), Russell and Hall (1982), Limbach and Wu (1983, 1985a, 1985b), Swanson et al. (1985), Evans and Scarpulla (1988), Virbasius and Scarpulla (1988), and Kemmerer et al. (accepted). The input matrices for cytochrome c were built by aligning the sequences from the codon for the universally conserved Gly at position 1 to the codon for residue 103.

The number of nucleotide substitutions per 100 sites, corrected for multiple hits, for the first base in each codon (K1) was calculated using the program of Li et al. (1985). Phenograms were constructed using the FITCH program in PHYLIP (PHYLogeny Inference Package, version 3.3) by Felsenstein (1981). Cladograms were constructed using the DNAPARS program in PHYLIP version 3.2.

The UEP was calculated by plotting K1 against estimated evolutionary divergence times, followed by finding the slope of the best-fitting straight line by using the Cricket Graph 1.3™ program (Great Valley Corporate Center, Malvern, Penn.).
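The slope-to-UEP step can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the original calculation: the fit is an ordinary least-squares line with a free intercept (the text does not say whether the line was forced through the origin), and the K1 values are synthetic placeholders generated from a known rate, not the values in table 1. Only the divergence times are taken from the estimates listed with figure 7.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def unit_evolutionary_period(times_myr, k1_percent):
    """Time (Myr) for a 1% change: the reciprocal of the slope in % per Myr."""
    return 1.0 / least_squares_slope(times_myr, k1_percent)


# Divergence times in Myr (mouse/rat, mammal/bird, vertebrate/insect,
# animals/plants/fungi, vertebrate/yeast).
times = [15.0, 270.0, 600.0, 1000.0, 1050.0]

# SYNTHETIC K1 values (substitutions per 100 sites), generated from an
# assumed rate of 1% per 20 Myr purely to exercise the calculation.
k1 = [t / 20.0 for t in times]

print(f"UEP = {unit_evolutionary_period(times, k1):.1f} Myr")
```

Because K1 is expressed per 100 sites, the slope is in percent change per Myr, so its reciprocal has the units of the UEP.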

Results

Isolation of a Rice Cytochrome c Gene

PCR was carried out on Oryza sativa genomic DNA as described. Two bands of ~210 and ~250 bp were reproducibly amplified. DNA from these two bands was cloned. Sequence analysis showed that one clone with a 216-bp insert encodes part of a cytochrome c-like protein. This clone was designated “PCR-1.” The sequence of PCR-1 was far from identical to the amino acid sequence of rice cytochrome c determined previously by protein sequencing (Ochi et al. 1983) and is somewhat more similar to cytochrome c sequences from molds and Arabidopsis (Kemmerer et al., accepted) than to those from most plants (fig. 1). It appears that a second rice cytochrome c gene or pseudogene may have been selectively amplified during PCR. Alternatively, a fungal cytochrome c sequence may have been selectively amplified, presumably because of the fungal contamination of the rice seedlings.

The insert of PCR-1 was labeled and used as a probe to screen an O. sativa (cv. Indica, IR36) genomic library. Two different groups of lambda clones were isolated and determined to be overlapping by restriction mapping (data not shown). A 4-kb EcoRI fragment was subcloned and designated “pOsCc-1.” The restriction map of this clone is shown in figure 2.

Appropriate restriction fragments of pOsCc-1 were subcloned and sequenced (fig. 3). The coding region consists of three exons interrupted by two introns. The exons encode a cytochrome c protein identical to the amino acid sequence of rice cytochrome c determined previously by protein sequencing (Ochi et al. 1983). The codon bias and GC content of the coding region are typical for a monocotyledonous plant nuclear gene (data not shown; Murray et al. 1989). Several sequences resembling regulatory motifs can be found in the 5’ noncoding regions. A putative TATA box in the 5’ noncoding region is underlined in figure 3. A purine-rich region is also present in the 5’ noncoding region that may be a ribosome binding sequence (marked by asterisks in fig. 3). In the 3’ flanking region there is a putative polyadenylation signal (underlined in fig. 3).

Analysis of Intron Sequences

Intron I is 817 bp in length, and intron II is 747 bp in length. Both introns contain conserved sequence elements required for proper splicing. The 5’ splice junction is AAG/gtgagc for intron I and AAG/gtatga for intron II, which are 89% and 78% identical to the consensus sequence MAG/gtragt, respectively (M = A or C; r = a or g; Black et al. 1985; Jacob and Gallinaro 1989). Likewise, both introns have a sequence resembling the ytray consensus sequence for the branch point site, which lies between positions -55 and -20: ttgat at positions -33 to -29 for intron I and ctgag at positions -26 to -22 for intron II (y = c or t; in these cases, the numbers refer to nucleotides 5’ of the 3’ splice junction; Langford et al. 1984; Keller and Noon 1985). A pyrimidine-rich region located 5-20 bases 5’ of the 3’ splice junction is also present: ctctttc at positions -9 to -3 for intron I and ttttctc at positions -16 to -10 for intron II. Finally, the 3’


[Figure 1: alignment not reproduced. Similarity to PCR-1 over the 52 central residues: rice, 46.2%; Arabidopsis, 75.0%; plant (10), 46.2%; mold (1), 75.0%; yeast (2), 59.6%; mammal (9), 69.2%; insect (4), 67.3%.]

FIG. 1.-Alignment of PCR-1 amino acid sequence, deduced from DNA sequence, with other cytochrome c protein sequences. The cytochrome c sequences include rice, Arabidopsis, 10 additional plants, one mold, two yeasts, nine mammals, and four insects (sequences were obtained from Dickerson 1971; Ochi et al. 1983; Kemmerer et al., accepted). Numbers in parentheses are the number of species compared in generating the consensus sequence. The numbers on the right refer to the percentage of similarity between the protein sequences and the PCR-1 sequence for the 52 central amino acid residues. The first 10 and the last 10 amino acids corresponding to the two PCR primers are not included because they may not represent the actual amino acids encoded by the PCR-1 gene or pseudogene.


[Figure 2: restriction map not reproduced. Labeled sites include SalI, ClaI, and XbaI; scale bar, 100 bp.]

FIG. 2.-Restriction map of OsCc-1. Solid boxes represent exons, empty boxes represent introns, and thin lines represent flanking regions.

splice junctions for both introns are tag/, which agrees with the yag/ consensus sequence (Keller and Noon 1985).
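The identity percentages quoted for the splice signals can be checked by scoring each site against its degenerate consensus, one position at a time. The sketch below is illustrative rather than the authors' procedure; `percent_identity` is a hypothetical helper, and a base counts as a match when it is one of the bases allowed by the consensus code at that position.

```python
# Bases permitted by each consensus symbol used in the text (case-insensitive).
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t",
         "m": "ac", "r": "ag", "y": "ct"}


def percent_identity(site, consensus):
    """Percentage of positions at which `site` fits the degenerate consensus."""
    site = site.replace("/", "").lower()
    consensus = consensus.replace("/", "").lower()
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    matches = sum(base in IUPAC[code] for base, code in zip(site, consensus))
    return 100.0 * matches / len(consensus)


# 5' splice junctions against MAG/gtragt
print("intron I  5' junction:", round(percent_identity("AAG/gtgagc", "MAG/gtragt")))
print("intron II 5' junction:", round(percent_identity("AAG/gtatga", "MAG/gtragt")))

# Branch-point candidates against ytray, and the 3' junction against yag/
print("intron I  branch site:", round(percent_identity("ttgat", "ytray")))
print("intron II branch site:", round(percent_identity("ctgag", "ytray")))
print("3' junction tag/:     ", round(percent_identity("tag/", "yag/")))
```

Scored this way, the two 5’ junctions match 8 of 9 and 7 of 9 consensus positions, which rounds to the 89% and 78% given in the text.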

The positions of the introns within the OsCc-1 coding region, located within the codon for Gly at position 29 and after the codon for Lys at position 73, are unique.

FIG. 3.-[...] are in uppercase letters. A putative TATA box is underlined in the 5’ flanking region. A purine-rich region that may be important for ribosome binding is indicated by asterisks. A polyadenylation signal is underlined in the 3’ flanking region.


strongly to a 5.8-kb band and weakly to a 6.7-kb band, for all of the O. sativa rice cultivars from Asia, India, and America that we surveyed (fig. 4, lanes 1-5). Oryza glaberrima, an African cultivated species, yielded only one band, at 7.1 kb (fig. 4, lane 6). All of the probes hybridized with the 5.8- and 6.7-kb bands for a wild species of rice found in Asia and America (fig. 4, lane 7). Since there are no HindIII sites in the pOsCc-1 clone (see the restriction map of OsCc-1), only one band was expected to hybridize with each of the probes.

Molecular Evolutionary Analysis

A dissimilarity matrix for cytochrome c genes was constructed (table 1). This matrix was used as input data to construct the phenogram shown in figure 5. K1 values (Kimura 1981) were chosen for several reasons: (1) KA values were not used, because previous analyses have been performed using cytochrome c amino acid sequence similarities and thus KA values would provide no new information. Because not every base change in the first base of the codon results in an amino acid change and because not all codons with identical bases in the first position encode the same amino acid, our calculated K1 values based on DNA sequences are different from previous calculations performed using protein sequences. (2) K2 values were not used because almost all substitutions at the second position in a codon are nonsynonymous. (3) KS and K3 values were not used, because the species we included in our analysis are very divergent and because the computer analysis indicated that the number of substitutions at synonymous sites was saturated. When more plant cytochrome c genes are isolated, KS and K3 values will provide more useful information.
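The statement that not every first-position change alters the amino acid can be made concrete by enumerating the cases. The sketch below assumes the standard nuclear genetic code, written in the usual TCAG ordering; the function name is illustrative.

```python
from itertools import combinations, product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_first_position_pairs():
    """Sense-codon pairs that differ only at position 1 yet encode the same amino acid."""
    pairs = []
    for second, third in product(BASES, repeat=2):
        for b1, b2 in combinations(BASES, 2):
            c1, c2 = b1 + second + third, b2 + second + third
            if CODON_TABLE[c1] == CODON_TABLE[c2] and CODON_TABLE[c1] != "*":
                pairs.append((c1, c2))
    return pairs


for c1, c2 in synonymous_first_position_pairs():
    print(f"{c1} <-> {c2} both encode {CODON_TABLE[c1]}")
```

Only a handful of leucine and arginine codon pairs qualify, so first-position changes are mostly but not entirely nonsynonymous, which is why K1 tracks amino acid divergence closely without duplicating it.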

The resulting phenogram (fig. 5) shows that OsCc-1 does not group together with the Arabidopsis cytochrome c gene, designated “AtCc-1.” However, the AtCc-1 sequence is an abnormal representative for higher plants (Kemmerer et al., accepted).

Wagner maximum-parsimony cladograms were also constructed (fig. 6A). Since this cladogram includes gene duplications, a second cladogram was constructed from an abbreviated data set. The cytochrome c gene from Arabidopsis was excluded from this analysis because we have previously shown that this gene encodes a cytochrome c protein unlike those of other higher plants (Kemmerer et al., accepted). The M. sexta and D. melanogaster DC4 cytochrome c genes were excluded because they have been shown to encode thoracic muscle-specific cytochrome c proteins (Swanson et al. 1985). The testis-specific cytochrome c genes from mouse and rat were also excluded (Virbasius and Scarpulla 1988). Both of the Saccharomyces cerevisiae genes were included because the yeasts form a group separate from the other organisms, so the inclusion of both would not affect the plant and animal groups of the second tree. The resulting cladogram is shown in figure 6B. For both cladograms, branch lengths correspond to the number of base differences.

The K1 data used for the phenogram in figure 5 were plotted against the estimated divergence times for the various organisms (fig. 7). The slope of the best-fitting line through the data points yielded a UEP (time required for a 1% change to occur) of 19.4 Myr, which agrees quite well with previous estimates of 22.6 Myr (Margoliash and Smith 1965) and 20.0 Myr (Dickerson 1971), which were calculated independently from amino acid sequence data. This plot was repeated for the KA data, which yielded a UEP of 19.5 Myr (not shown).


Table 1
Number of Synonymous and Nonsynonymous Substitutions per 100 Sites, Corrected for Multiple Hits, for Pairwise Comparisons of Cytochrome c Genes, First Codon Position (K1)

Source                      (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9) (10) (11) (12) (13)
(1)  Arabidopsis
(2)  Rice                   291
(3)  S. pombe               196  291
(4)  S. cerevisiae iso-1    262  288  215
(5)  S. cerevisiae iso-2    241  336  217   84
(6)  Drosophila DC3         323  334  360  339  357
(7)  Drosophila DC4         206  328  201  306  314  193
(8)  Manduca                198  285  208  285  294  207   54
(9)  Chicken                296  284  215  307  341  257  144  140
(10) Mouse somatic cell     271  285  215  287  319  210  115  106   50
(11) Rat somatic cell       271  285  215  287  319  210  115  106   50    0
(12) Mouse testis           306  259  271  328  350  253  185  162  112  101  101
(13) Rat testis             291  252  279  336  358  260  191  177  118  113  113   25
(14) Human                  263  250  245  267  306  252  162  141   A7   50   51  135  135


[Figure 5: phenogram not reproduced. Legible tip labels include S. cerevisiae iso-1, S. cerevisiae iso-2, S. pombe, Arabidopsis, rice, and Drosophila.]

FIG. 5.-Phenogram based on K1 values. The data in table 1 were used with the FITCH program of PHYLIP version 3.3. Numbers refer to relative distances. The average percentage of standard deviation is 7.7.


[Figure 6: cladograms not reproduced. Legible tip labels include Drosophila DC3, Arabidopsis, mouse somatic, mouse testis, S. cerevisiae, rice, and S. pombe.]

FIG. 6.-Wagner maximum-parsimony cladograms. A, Cladogram for all cytochrome c genes sequenced to date. The DNAPARS program of the PHYLIP (version 3.2) package was employed. B, Cladogram for abbreviated data set.

Discussion

We have isolated and sequenced a cytochrome c gene, OsCc-1, from rice. The three short exons are interrupted by two introns, one of 817 bp and one of 747 bp, that are longer than the mean length of 249 bp for plant coding-region introns (Hawkins 1988). The OsCc-1 introns share no statistically significant sequence similarity with either the AtCc-1 introns or the introns in animal cytochrome c genes. This does not


[Figure 7: plot not reproduced. Horizontal axis: divergence (Mya), 0-1,200; fitted line annotated UEP = 19.4 Myr.]

FIG. 7.-Estimate of unit evolutionary period (UEP). The K1 values of the organisms used in the tree shown in fig. 6B were plotted against estimated evolutionary divergence times. Divergence time estimates (and their sources) are as follows: vertebrate/yeast, 1,050 Mya (Margoliash and Smith 1965); animals/plants/fungi, 1,000 Mya (Shih et al. 1986); vertebrate/insect, 600 Mya (Shih et al. 1986); mammal/bird, 270 Mya (Shih et al. 1986); mouse/rat, 15 Mya (Li et al. 1987).

eliminate the possibility that short lengths of sequence similarity are present that may have biological significance, such as being regulatory elements.

The coding regions of the animal and yeast cytochrome c genes have either no introns or only one intron that is in a different location from the OsCc-1 introns. The OsCc-1 intron locations are also different from those of the Arabidopsis cytochrome c gene, AtCc-1. Because the AtCc-1 and OsCc-1 introns are conserved neither in location nor in sequence, it is not possible, without examining more plant cytochrome c genes, to determine when and where the introns arose or were lost.

OsCc-1 from rice and AtCc-1 from Arabidopsis are the only plant cytochrome c genes sequenced to date. OsCc-1 encodes a protein that is identical to rice cytochrome c determined by protein sequencing (Ochi et al. 1983). The rice cytochrome c amino acid sequence shows high sequence identity with the plant cytochrome c amino acid consensus sequence (fig. 1). Phenograms based on cytochrome c amino acid sequences show that rice cytochrome c is a typical plant cytochrome c (Syvanen et al. 1989). We have shown that AtCc-1 encodes a protein atypical of other plant cytochrome c proteins (Kemmerer et al., accepted). We believe that AtCc-1 and OsCc-1 are paralogous genes for several reasons. (1) It is generally accepted that all higher plants belong to one monophyletic group, when compared with fungi and animals. However, the phenogram (fig. 5) based on the K1 values of cytochrome c genes shows that AtCc-1 is separated from OsCc-1. Therefore, the simplest explanation for this inconsistency is that AtCc-1 and OsCc-1 are paralogous genes. (2) The deduced amino acid sequence of AtCc-1 is more similar to the cytochrome c from Neurospora and yeasts than to those from OsCc-1 or other plants (Kemmerer et al., accepted). Moreover, the intron positions are not conserved between OsCc-1 and AtCc-1. Therefore, it is highly unlikely that these two genes are orthologous. (3) Our rice genomic DNA hybridization results indicate that there may be a second type of cytochrome c gene in most species and varieties of rice, which must differ in sequence from OsCc-1, since the OsCc-1 probe hybridizes to this second, higher-molecular-weight band with less intensity. The second type of rice cytochrome c may turn out to be the orthologue of AtCc-1. We have shown elsewhere, by genomic DNA hybridization analysis (Kemmerer et al., accepted), that Arabidopsis has only one cytochrome c gene. We believe that most likely there are two types of cytochrome c genes in higher plants such as rice and that Arabidopsis lost the OsCc-1 type of cytochrome c gene during the process of genome reduction that occurred relatively recently in its evolution. Again, more plant cytochrome c genes must be isolated to provide further information.

Previous molecular evolutionary analyses indicate that cytochrome c genes generally evolve at a fairly linear rate. Although cytochrome c gene evolution may be slightly faster in primates and snakes, the rate of evolution in plant and vertebrate cytochrome c genes is comparable (Syvanen et al. 1989). The analyses based on protein sequence data were drawn from a data base larger than that used for our DNA sequence analysis, so the linearity of our K1 versus divergence-time graph may improve when more DNA sequence data are added. The introns may prove to be a valuable tool in the molecular evolutionary analysis of plant cytochrome c genes. In general, there are fewer constraints on changes in sequence for noncoding regions, so we expect that intron sequences would have evolved at a faster rate, providing more information for comparisons between closely related species. When more cytochrome c genes have been sequenced from plants, a molecular evolutionary analysis of intron sequences may help to resolve the problem of the homoplasy and lack of resolution that are often seen in plant molecular evolutionary analyses (Syvanen et al. 1989).

Sequence Availability

The sequence for OsCc-1 has been deposited in GenBank under accession number M35051.

Acknowledgments

The authors would like to thank Jeff Doyle, Walter Fitch, and David McElroy for critical reading of the manuscript; Ti-yun Wu and Xing-ping Zhao for genomic DNA from several rice varieties; and James Mecca and Yong Xu for technical contributions. E.C.K. was supported in part by a fellowship from the Cornell Biotechnology Program, which is sponsored by the New York State Science and Technology Foundation (a consortium of industries), the U.S. Army Research Office, and the National Science Foundation. This work was supported by research grant 29179 from the National Institutes of Health, USPHS.

LITERATURE CITED

BABA, M. L., L. L. DARGA, M. GOODMAN, and J. CZELUSNIAK. 1981. Evolution of cytochrome c investigated by the maximum parsimony method. J. Mol. Evol. 17:197-213.

BLACK, D. L., B. CHABOT, and J. A. STEITZ. 1985. U2 as well as U1 small nuclear ribonucleoproteins are involved in premessenger RNA splicing. Cell 42:737-750.

DAYHOFF, M. O., C. M. PARK, and P. J. MCLAUGHLIN. 1972. Building a phylogenetic tree: cytochrome c. Pp. 7-16 in M. O. DAYHOFF, ed. Atlas of protein sequence and structure. National Biomedical Research Foundation, Washington, D.C.

DICKERSON, R. E. 1971. The structure of cytochrome c and the rates of molecular evolution. J. Mol. Evol. 1:26-45.

EVANS, M. J., and R. C. SCARPULLA. 1988. The human somatic cytochrome c gene: two classes of processed pseudogenes demarcate a period of rapid molecular evolution. Proc. Natl. Acad. Sci. USA 85:9625-9629.

FELSENSTEIN, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368-376.

HAWKINS, J. D. 1988. A survey on intron and exon lengths. Nucleic Acids Res. 16:9893-9908.

HINKLE, P. C., and R. E. MCCARTY. 1978. How cells make ATP. Sci. Am. 238:104-123.

IYENGAR, G. A. S., J. P. GADDIPATI, and S. K. SEN. 1979. Characteristics of nuclear DNA in the genus Oryza. Theor. Appl. Genet. 54:219-224.

JACOB, M., and H. GALLINARO. 1989. The 5' splice site: phylogenetic evolution and variable geometry of association with U1 RNA. Nucleic Acids Res. 17:2159-2180.

KELLER, E. B., and W. A. NOON. 1985. Intron splicing: a conserved internal signal in introns of Drosophila pre-mRNAs. Nucleic Acids Res. 13:4971-4981.

KEMMERER, E. C., M. LEI, and R. WU. Structure and molecular evolutionary analysis of a plant cytochrome c gene: surprising implications for Arabidopsis thaliana. J. Mol. Evol. (accepted).

KIMURA, M. 1981. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA 78:454-458.

LANGFORD, C. J., F. J. KLINZ, C. DONATH, and D. GALLWITZ. 1984. Point mutations identify the conserved, intron-contained TACTAAC box as an essential splicing signal sequence in yeast. Cell 36:645-653.

LI, W.-H., M. TANIMURA, and P. M. SHARP. 1987. An evaluation of the molecular clock hypothesis using mammalian DNA sequences. J. Mol. Evol. 25:330-342.

LI, W.-H., C.-I. WU, and C.-C. LUO. 1985. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2:150-174.

LIMBACH, K. J., and R. WU. 1983. Isolation and characterization of two alleles of the chicken cytochrome c gene. Nucleic Acids Res. 11:8931-8950.

-. 1985a. Characterization of two Drosophila melanogaster cytochrome c genes and their transcripts. Nucleic Acids Res. 13:631-644.

-. 1985b. Characterization of a mouse somatic cytochrome c gene and three cytochrome c pseudogenes. Nucleic Acids Res. 13:617-630.

MARGOLIASH, E., and E. L. SMITH. 1965. Structural and functional aspects of cytochrome c in relation to evolution. Pp. 221-242 in V. BRYSON and H. J. VOGEL, eds. Evolving genes and proteins. Academic Press, New York.

MONTGOMERY, D. L., D. W. LEUNG, M. SMITH, P. SHALIT, G. FAYE, and B. D. HALL. 1980. Isolation and sequence of the gene for iso-2-cytochrome c in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 77:541-545.

MOON, E., T.-H. KAO, and R. WU. 1987. Rice chloroplast DNA molecules are heterogeneous as revealed by DNA sequences of a cluster of genes. Nucleic Acids Res. 15:611-630.

MURRAY, E. E., J. LOTZER, and M. EBERLE. 1989. Codon usage in plant genes. Nucleic Acids Res. 17:477-499.

OCHI, H., Y. HATA, N. TANAKA, M. KAKUDO, T. SAKURAI, S. AIHARA, and Y. MORITA. 1983. Structure of rice ferricytochrome c at 2.0 Å resolution. J. Mol. Biol. 166:407-418.

RUSSELL, P. R., and B. D. HALL. 1982. Structure of the Schizosaccharomyces pombe cytochrome c gene. Mol. Cell. Biol. 2:106-116.

SANGER, F., S. NICKLEN, and A. R. COULSON. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.

SCARPULLA, R. C., K. M. AGNE, and R. WU. 1981. Isolation and structure of a rat cytochrome c gene. J. Biol. Chem. 256:6480-6486.

SCHWARTZ, R. M., and M. O. DAYHOFF. 1978. Cytochromes. Pp. 29-44 in M. O. DAYHOFF, ed. Atlas of protein structure and function. National Biomedical Research Foundation, Washington, D.C.


SCOGIN, R. 1981. Amino acid sequence studies and plant phylogeny. Pp. 19-42 in D. A. YOUNG and D. S. SEIGLER, eds. Phytochemistry and angiosperm phylogeny. Praeger, New York.

SHIH, M.-C., G. LAZAR, and H. M. GOODMAN. 1986. Evidence in favor of the symbiotic origin of chloroplasts: primary structure and evolution of tobacco glyceraldehyde-3-phosphate dehydrogenases. Cell 47:73-80.

SMITH, M., D. W. LEUNG, S. GILLAM, and C. R. ASTELL. 1979. Sequence of the gene for iso-1-cytochrome c in Saccharomyces cerevisiae. Cell 16:753-761.

SWANSON, M. S., S. M. ZIEMINN, D. D. MILLER, E. A. E. GARBER, and E. MARGOLIASH. 1985. Developmental expression of nuclear genes that encode mitochondrial proteins: insect cytochromes c. Proc. Natl. Acad. Sci. USA 82:1964-1968.

SYVANEN, M., H. HARTMAN, and P. F. STEVENS. 1989. Classical plant ambiguities extend to the molecular level. J. Mol. Evol. 28:536-544.

VIRBASIUS, J. V., and R. C. SCARPULLA. 1988. Structure and expression of rodent genes encoding the testis-specific cytochrome c. J. Biol. Chem. 263:6791-6796.

ALAN WEINER, reviewing editor

Received May 14, 1990; revision received October 29, 1990

Accepted October 29, 1990
