Calculation of the free energy of association for protein complexes

Protein Science (1992), 1, 169-181. Cambridge University Press. Printed in the USA. Copyright © 1992 The Protein Society 0961-8368/92 $5.00 + .00

NANCY HORTON AND MITCHELL LEWIS
The Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104

(RECEIVED August 5, 1991; ACCEPTED September 5, 1991)

Reprint requests to: Mitchell Lewis, The Johnson Research Foundation, Department of Biochemistry and Biophysics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104.

Abstract

We have developed a method for calculating the association energy of quaternary complexes starting from their atomic coordinates. The association energy is described as the sum of two solvation terms and an energy term to account for the loss of translational and rotational entropy. The calculated solvation energy, using atomic solvation parameters and the solvent accessible surface areas, has a correlation of 96% with experimentally determined values. We have applied this methodology to examine intermediates in viral assembly and to assess the contribution isomerization makes to the association energy of molecular complexes. In addition, we have shown that the calculated association can be used as a predictive tool for analyzing modeled molecular complexes.

Keywords: hydrophobicity; protein structure; solvation

Specific interactions between macromolecules are responsible for the assembly of complex biological structures and are essential to the regulation of events within a cell or organism. The association of molecules to form higher ordered oligomers is in many respects analogous to the block condensation model for protein folding, where prefolded units associate to form higher order structures (Richmond & Richards, 1978). To understand the structural basis of recognition we must be able to relate solution measurements of the association process to the structure of the macromolecular complex. This paper considers the problem of calculating the free energies of forming protein complexes from preformed subunits as derived from crystallographic data and relating these values to experimentally obtained association constants.

Kauzmann (1959) suggested that a major factor in the stable formation of protein complexes is a consequence of the hydrophobic effect. Using an empirical correlation between the accessible surface area and free energies of transfer of amino acids from water to octanol, Chothia and Janin (1975) found that the free energy required to form a stable complex was directly related to the amount of surface area buried in the interface. Eisenberg and McLachlan (1986) recognized that it is an oversimplification to base the energy of association on surface area alone; polarity and charge must also be considered. By introducing five atomic solvation parameters, to account for the polar or apolar character of each atom type most frequently found in proteins, they could more accurately relate surface area to the free energy of transfer. Moreover, the solvation energy was shown to be useful for assessing protein stability. We assume that the forces that govern the association between two molecules are the same as the forces that are responsible for the folding of a protein in water. As such, the solvation energy should be equally useful as a gauge for evaluating the association energies of quaternary structures.

The association of protein molecules in solution may be complicated by the formation of intermediates. Jaenicke and Rudolph (1986) first described a general pathway for the association of the hypothetical protein molecules a and b as follows:

$$a + b \rightarrow a' + b' \rightarrow (ab)' \rightarrow ab \qquad (1)$$

For two molecules, a and b, to form a stable complex, one or both of the molecules may adopt altered conformations. Only the activated molecules, a' and b', form the complex (ab)', which may isomerize further to form the structure ab. The free energy of association for this reaction is described by an association constant $K^{\circ}$, where $\Delta G^{\circ}_{assoc} = -RT \ln K^{\circ}$ in the standard state.
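The relation between the association constant and the standard free energy can be illustrated numerically. The following is a minimal sketch, assuming a temperature of 298.15 K, a 1 M standard state, and a hypothetical dissociation constant of 1e-13 M chosen only for illustration; the function names are not from the original work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def association_free_energy(k_assoc, temperature=298.15):
    """Standard free energy of association (kcal/mol) from an
    association constant expressed relative to a 1 M standard state."""
    return -R_KCAL * temperature * math.log(k_assoc)


def free_energy_from_kd(k_dissoc, temperature=298.15):
    """Same quantity, starting from a dissociation constant (K_a = 1/K_d)."""
    return association_free_energy(1.0 / k_dissoc, temperature)


# Hypothetical tight-binding complex with K_d = 1e-13 M (assumed value).
dg = free_energy_from_kd(1e-13)
print(round(dg, 1))  # -17.7
```

Each tenfold change in the dissociation constant shifts the free energy by about 1.4 kcal/mol at this temperature.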



The actual pathway of association for a particular protein complex may contain monomeric (a' and b'), dimeric (ab)', or no intermediates. In Figure 1 we have plotted the free energy of association for the assembly of a dimer aa that contains only monomeric intermediates. The conformation of the free monomers, a', differs from that of the monomers a and of the subunits of the dimer aa, whereas the conformations of the monomers a and of the subunits of the dimer aa are the same. A complex will form when the monomers, a', isomerize into an intermediate structure, a, overcoming the isomerization barrier, an activation free energy $\Delta G^{\ddagger}$. This intermediate structure will associate to form the dimer when there is sufficient energy to overcome the diffusional barrier, a second activation free energy $\Delta G^{\ddagger}$. The energy associated with this kinetic pathway can be partitioned into three terms: an interaction energy, an isomerization term, and the energy required to fix one molecule relative to the other:

$$\Delta G^{\circ}_{assoc} = \Delta G^{\circ}_{interaction} + \Delta G^{\circ}_{isomerization} + \Delta G^{\circ}_{rot,trans} \qquad (2)$$

The driving force for the association is the interaction energy, which results from nonbonded interactions of both polar and apolar atoms. $\Delta G^{\circ}_{interaction}$ has an enthalpic component due to van der Waals interactions, hydrogen bonds, and charged electrostatic interactions, as well as an entropic component that results from the liberation of bound water molecules from the interface. When molecules associate via the general pathway, energy is required for one or both monomers to adopt a conformation that will form a complex. $\Delta G^{\circ}_{isomerization}$ is the energy that is required for the free monomers to form an activated complex and for the activated complex to form the final observed quaternary structure. $\Delta G^{\circ}_{rot,trans}$ accounts for the loss of translational and rotational entropy when two molecules associate. For molecules to form a stable association, the interaction energy must be sufficiently large to overcome these opposing energies.

Fig. 1. Association pathway. This association pathway for a homodimer is a simplified version of the general pathway. Only one intermediate (a) occurs, rather than the dimeric and monomeric intermediates a', b', and (ab)'. The individually solvated molecules, a', must first change conformation, a, and then overcome a diffusional barrier.

Materials and methods

In this paper we have extended the methodology developed by Eisenberg and McLachlan (1986) and reformulated the solvation model for protein folding to accurately describe the energy of molecular associations. The solvation free energy of folding is calculated by summing the differences in surface area between each atom in the folded state, $A_i$, and in a reference state, $A_i^{\mathrm{ref}}$, multiplied by an atomic solvation parameter, $\Delta\sigma_i$:

$$\Delta G^{\circ}_{solv} = \sum_{i=1}^{\text{all atoms}} \Delta\sigma_i \,(A_i - A_i^{\mathrm{ref}}) \qquad (3)$$

The solvation parameters for polar and charged atoms have negative values that reflect the fact that these atoms prefer to be exposed to solvent when the protein folds.


Consequently, removing polar atoms from contact with the solvent ($A_i - A_i^{\mathrm{ref}} < 0$, the folded-state area minus the reference-state area) contributes unfavorably to the solvation energy ($\Delta G^{\circ}_{solv} > 0$). Richmond and Richards (1978), however, pointed out that when proteins fold, half of the buried surface area is polar. The vast majority of these polar atoms are involved in hydrogen bonding between backbone atoms in secondary structure and should therefore add to the stability of the structure. An uncritical use of the equation for the solvation energy would therefore imply that sequestering polar atoms from the solvent is energetically unfavorable to folding, whether or not the atom is involved in an electrostatic interaction.
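The sign argument above can be checked with a small calculation. This is a minimal sketch of the solvation sum, assuming the EM atomic solvation parameters of Table 1 are in cal/(mol Å²); the atom records and the 20 Å² of buried area are hypothetical and serve only to show the signs.

```python
# Atomic solvation parameters, EM row of Table 1; units assumed to be
# cal/(mol*A^2).
ASP = {"C": 18.0, "N/O": -9.0, "O-": -37.0, "N+": -38.0, "S": -5.0}


def solvation_energy(atoms):
    """Sum of ASP * (folded area - reference area) over atoms.

    `atoms` is an iterable of (atom_type, area_folded, area_reference)
    with areas in A^2. Returns the solvation free energy in kcal/mol.
    """
    total_cal = sum(ASP[kind] * (a_fold - a_ref) for kind, a_fold, a_ref in atoms)
    return total_cal / 1000.0


# Hypothetical atoms that each lose 20 A^2 of accessible surface on folding.
buried_carbon = [("C", 0.0, 20.0)]
buried_polar = [("N/O", 0.0, 20.0)]

print(solvation_energy(buried_carbon))  # -0.36 (favorable)
print(solvation_energy(buried_polar))   # 0.18 (unfavorable)
```

Burying the carbon lowers the free energy, whereas burying the polar atom raises it regardless of whether that atom has found a hydrogen-bonding partner, which is the shortcoming the text describes.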

When the formation of quaternary complexes is diffusion limited and there is no detection of intermediates, the general pathway is reduced to $a + b \rightarrow ab$. In this special case, molecules associate without a significant conformational change, such that $\Delta G^{\circ}_{isomerization} = 0$. For these simpler associations the free energy of association is reduced to

$$\Delta G^{\circ}_{assoc} = \Delta G^{\circ}_{interaction} + \Delta G^{\circ}_{rot,trans} \qquad (4)$$

Kinetic data have shown that several macromolecular complexes associate without observed intermediates, and the assembly of these molecules is analogous to bringing together two rigid objects.

The driving force for the association (Equation 4) is the interaction energy term. The interaction energy has both an enthalpic and an entropic component and can be related to the solvation energy. In Figure 2 we illustrate the interaction event as two discrete sequential processes. The first step involves the entropically driven hydrophobic effect, where solvent molecules are released from the contacting surfaces. In the second step, van der Waals contacts, electrostatic pairings, and hydrogen bonds are formed. Both steps contribute favorably to the association process and are responsible for the specificity of the interaction. We will show that the two steps can be described as the sum of two suitably weighted solvation energy terms. Although the two steps of the association most likely occur concomitantly, quantitatively dissecting the interaction into polar and apolar parts provides a convenient way to account for buried polar atoms when calculating a solvation energy.

Fig. 2. Solvated proteins associate through hydrophobic interactions in step 1. Each molecule loses overall rotational and translational degrees of freedom relative to the other due to binding. Specific contacts such as electrostatic pairings and hydrogen bonds occur between subunits in step 2. Adapted from Ross and Subramanian (1981). [Panels: individually solvated species; step 1; hydrophobically associated complex; step 2; specifically interacting complex.]

We have defined the interaction energy, as illustrated in Figure 2, as the sum of two solvation terms:

$$\Delta G^{\circ}_{interaction} = \alpha\,\Delta G^{\circ}_{apolar} + \beta\,\Delta G^{\circ}_{polar} \qquad (5)$$

Both the apolar and the polar components are calculated by summing, over all atoms, the difference between the solvent-accessible surface area of each atom in the complex, $A_i^{c}$, and its surface area in the uncomplexed state, $A_i^{s}$, multiplied by the appropriate atomic solvation parameter, $\Delta\sigma_i$. The apolar term encompasses those atoms that are not involved in hydrogen bonds or salt bridges:

$$\Delta G^{\circ}_{apolar} = \sum_{i=1}^{\text{all atoms}} \Delta\sigma_i \,(A_i^{c} - A_i^{s}) \quad \text{if atom } i \text{ is not bonded.} \qquad (6a)$$

Atoms that are involved in hydrogen bonds and salt bridges are accrued in the polar term. The criterion for these electrostatic interactions is determined by atom type and distance from an appropriate partner (Schulz & Schirmer, 1979):

$$\Delta G^{\circ}_{polar} = \sum_{i=1}^{\text{all atoms}} \Delta\sigma_i \,(A_i^{c} - A_i^{s}) \quad \text{if atom } i \text{ is bonded.} \qquad (6b)$$

Those polar atoms that are buried in the interface but are not involved in ionic interactions are included in the apolar energy term.
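The partition into apolar and polar terms can be sketched as follows. This is an illustration only: it assumes the surface-area difference is taken as complex minus uncomplexed, uses the EM parameters of Table 1 in cal/(mol Å²), and takes the hydrogen-bond or salt-bridge assignment as a precomputed flag rather than deriving it from atom type and distance. The interface atoms are hypothetical.

```python
# Atomic solvation parameters, EM row of Table 1; units assumed to be
# cal/(mol*A^2).
ASP = {"C": 18.0, "N/O": -9.0, "O-": -37.0, "N+": -38.0, "S": -5.0}


def partition_solvation(atoms):
    """Split the interface solvation energy into (apolar, polar) in kcal/mol.

    `atoms` is an iterable of
    (atom_type, area_uncomplexed, area_complex, bonded), where `bonded`
    is True for atoms in a hydrogen bond or salt bridge across the interface.
    """
    apolar = 0.0
    polar = 0.0
    for kind, a_free, a_complex, bonded in atoms:
        term = ASP[kind] * (a_complex - a_free) / 1000.0
        if bonded:
            polar += term   # bonded atoms accrue in the polar term
        else:
            apolar += term  # includes buried but unpaired polar atoms
    return apolar, polar


# Hypothetical interface atoms (areas in A^2).
interface = [
    ("C", 30.0, 5.0, False),
    ("N/O", 15.0, 0.0, True),
    ("N/O", 10.0, 0.0, False),  # buried polar atom with no partner
    ("N+", 20.0, 2.0, True),
]
apolar, polar = partition_solvation(interface)
print(round(apolar, 3), round(polar, 3))  # -0.36 0.819
```

The unpaired N/O atom lands in the apolar sum, where its negative solvation parameter makes it an unfavorable contribution, while the bonded atoms are kept separate so they can be weighted independently.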

Polar and apolar solvation energies were calculated for 24 protein complexes for which we could find experimentally determined dissociation constants. The atomic coordinates for the complexes used in this study were taken from the Brookhaven Protein Data Bank (Bernstein et al., 1977). Quaternary complexes were prepared by applying the necessary symmetry transformations to the structure of the protomer or monomer. The accessible surface areas (Lee & Richards, 1971) were calculated for individual atoms using a probe radius of 1.4 Å and atomic radii for unified atoms (Singh et al., 1986). Atomic solvation parameters (Eisenberg et al., 1989) are listed in Table 1.

Table 1. Atomic solvation parameters determined with different surface descriptors (a)

Type of surface    C         N/O        O-         N+         S
MS                 24 ± 2    -28 ± 3    -71 ± 6    -83 ± 5    -14 ± 5
SAS                12 ± 1    -13 ± 4    -38 ± 6    -29 ± 5    -9 ± 5
VV                 24 ± 2    -14 ± 3    -61 ± 8    -73 ± 7    -14 ± 5
EM                 18 ± 1    -9 ± 3     -37 ± 7    -38 ± 4    -5 ± 6

(a) Atomic solvation parameters calculated using solvent-accessible surfaces (SAS), molecular surfaces (MS), and Voronoi volumes (VV). The values used in calculating association energies are those of Eisenberg and McLachlan (EM).

Kinetic data found in the literature reported that 15 of these complexes associate and form quaternary structures without intermediates. Substituting Equation 5 into Equation 4 results in the functional form of the equation we used to calculate the association energy:

$$\Delta G^{\circ}_{assoc} = \alpha\,\Delta G^{\circ}_{apolar} + \beta\,\Delta G^{\circ}_{polar} + \Delta G^{\circ}_{rot,trans} \qquad (7)$$

The coefficients α and β are dimensionless quantities used to scale the solvation energies to the observed dissociation constants. The values of α, β, and $\Delta G^{\circ}_{rot,trans}$ in Equation 7 were fit to the data in Table 2 using a recursive least-squares procedure.
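Equation 7 can be evaluated directly once the coefficients are known. The sketch below uses the fitted values quoted in the analysis of coefficients (α = 1.4, β = -1.2, and 6.2 kcal/mol for the rotational and translational term). It assumes the polar term is the raw sum of Equation 6b, which is positive when bonded polar atoms are buried; Table 2 appears to list this term with the opposite sign (the Figure 6 caption plots -1 times the polar contribution), so the 2PTC entry of -8.65 is entered here as +8.65. That sign reading is an assumption, not something the text states.

```python
ALPHA = 1.4          # fitted apolar coefficient (dimensionless)
BETA = -1.2          # fitted polar coefficient (dimensionless)
DG_ROT_TRANS = 6.2   # kcal/mol, loss of rotational/translational entropy


def association_energy(dg_apolar, dg_polar,
                       alpha=ALPHA, beta=BETA, dg_rot_trans=DG_ROT_TRANS):
    """Equation 7: weighted apolar and polar solvation terms plus the
    rotational/translational term. All energies in kcal/mol."""
    return alpha * dg_apolar + beta * dg_polar + dg_rot_trans


# Trypsin-BPTI (2PTC): apolar term from Table 2; polar term with the
# sign assumed as described in the lead-in.
dg_2ptc = association_energy(dg_apolar=-11.54, dg_polar=8.65)
print(round(dg_2ptc, 1))  # -20.3
```

The result is close to the tabulated value of -19.9 kcal/mol for 2PTC; the remaining difference is presumably due to the rounding of the quoted coefficients.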

Results and discussion

The magnitude of the hydrophobic energy can be calculated from the partition coefficients derived from solubility measurements of hydrocarbons and amino acid derivatives in polar and apolar solvents. These measured free energies of transfer correlate well with solvent-accessible surface areas (Chothia, 1974) and are related by a proportionality constant that ranges from 16 to 47 cal/Å². As noted by Sharp et al. (1991), the variation of the coefficient is due in part to different measurements of the molecular surface. To verify the reliability of the atomic solvation parameters (ASP), we recalculated the values using coordinates of the amino acids from the PROLSQ standard groups dictionary (Konnert, 1976). The ASP values were determined by solving a set of simultaneous equations using (1) the solvent-accessible surface, (2) the molecular surface, and (3) the Voronoi volume as the independent variable. Regardless of the independent parameter, the observed free energies of transfer correlated well with the calculated values (Fig. 3).

The ASP values are proportionality constants that provide a relationship between two measured quantities, the surface or the volume of the molecule, and energies of transfer. The magnitude of the solvation parameters depends upon how the surface or volume is calculated, but the relative values are less sensitive and similar to those reported by Eisenberg and McLachlan (1989). Interestingly, using volumes to obtain ASP values (cal/Å³) produces the best correlation, suggesting that cavity formation is an important parameter in calculating the free energy of transfer. As will be shown, in calculating the association energy, the magnitudes of the values are not as important as the signs of the parameters and their relative values. Therefore, in the calculation that follows we chose to use the ASP values determined by Eisenberg and McLachlan (1989) and solvent-accessible surfaces.

Table 2. Coordinates, surface areas, and free energies (a)

PDB file   Surface area   ΔG apolar     ΔG polar      ΔG assoc      ΔG obs        Ref.
           (Å²)           (kcal/mole)   (kcal/mole)   (kcal/mole)   (kcal/mole)
2PTC       1,465          -11.54        -8.65         -19.9         -18.1         a
1TPA       1,474          -10.82        -7.09         -17.1         -17.8         b
2KAI       1,471          -9.74         -4.74         -12.8         -12.4         c
4CPA       1,420          -10.13        -1.80         -10.0         -10.0         d
3CPA       594            -4.18         -5.04         -5.4          -5.3          e
3SGB       1,322          -11.02        -2.72         -12.3         -14.7         f
2SEC       1,547          -11.97        -2.72         -13.6         -13.1         g
1CSE       1,541          -12.14        -3.75         -15.0         -13.1         g
1CHO       1,547          -12.22        -3.55         -14.9         -15.7         h
2TPI       601            -5.56         -3.91         -6.1          -5.8          i
2TPI       1,457          -10.78        -7.36         -17.3         -18.1         i
1INS       1,305          -10.33        -1.24         -9.6          -7.4          j
2SSI       1,594          -12.78        -3.28         -15.4         -16           k
2HFL       1,775          -9.63         -4.64         -12.6         -14.2         l
1HBS       760            -7.84         -0.01         -4.7          -4.8          m

(a) Simple associations. The complexes are identified by the Brookhaven PDB code: 2PTC, trypsin-bovine pancreatic trypsin inhibitor (BPTI); 1TPA, anhydrotrypsin-BPTI; 2KAI, kallikrein A-BPTI; 4CPA, carboxypeptidase A-potato carboxypeptidase A inhibitor (PCI); 3CPA, carboxypeptidase A-glycyltyrosine; 3SGB, proteinase B-third domain of the turkey ovomucoid inhibitor (OMTKY3); 2SEC, subtilisin Carlsberg-N-acetyl Eglin C; 1CSE, subtilisin Carlsberg-Eglin C; 1CHO, α-chymotrypsin-OMTKY3; 2TPI, trypsinogen (+BPTI)-isoleucylvaline (IV); 2TPI, trypsinogen (+IV)-BPTI; 1INS, insulin OP contact; 2SSI, Streptomyces subtilisin inhibitor dimer; 2HFL, lysozyme-Fab; 1HBS, sickle cell deoxyhemoglobin mol 1-2 contact.

The experimental association energies were obtained from the literature. a, Vincent and Lazdunski (1972); b, Vincent et al. (1974); c, Chen and Bode (1983); d, Hass and Ryan (1980); e, Bunting and Myers (1975); f, Read et al. (1983); g, Ascenzi et al. (1988); h, Empie and Laskowski (1982); i, Bode (1979) and Bolognesi et al. (1982); j, Pekar and Frank (1972); k, Akasaka et al. (1982); l, Sheriff et al. (1987); m, Ross et al. (1977).

Fig. 3. The observed free energy of transfer vs. calculated free energy of transfer (kcal/mole) for the 20 amino acids using derived atomic solvation parameters (ASP) values (Table 1). The diamond-shaped figures result from the ASP values calculated using Voronoi volumes, the open squares correspond to molecular surfaces, and the solid squares were calculated using solvent-accessible surfaces.

Chothia and Janin (1975) first suggested that the free energy of association was related to the amount of surface area that was buried in the interface. Figure 4 is a plot of the observed free energy of association vs. calculated surface area for 24 protein complexes where the experimentally determined dissociation constants are known. The open squares represent those proteins for which the association is diffusion limited and where there is no detection of intermediates in the assembly. The closed circles represent those proteins that are known to isomerize prior to or after associating (Fig. 4 and references therein). The correlation of surface area with the free energies of association for all 30 complexes is 10%. Differentiating the proteins by their kinetic paths shows that proteins that must isomerize in the association process have a correlation of 17%, whereas proteins that associate as rigid objects have a correlation of 79%. Thus, for rigid-body associations, there exists a relationship between the experimentally derived association energies, $\Delta G^{\circ}_{obs}$, and the calculated solvent-accessible surface area. The kinetic pathway of assembly is critical for relating the thermodynamic measurements to the atomic structures.
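The correlation for the rigid-body complexes can be recomputed from the tabulated values. The sketch below takes the buried surface areas and observed free energies as transcribed from Table 2 and computes a plain Pearson coefficient; whether the percentages quoted in the text are |r| or some other statistic is not stated, so the comparison is only indicative.

```python
import math

# (PDB code, buried surface area in A^2, observed free energy in kcal/mol),
# transcribed from Table 2 (the 15 complexes that associate as rigid objects).
RIGID_COMPLEXES = [
    ("2PTC", 1465, -18.1), ("1TPA", 1474, -17.8), ("2KAI", 1471, -12.4),
    ("4CPA", 1420, -10.0), ("3CPA", 594, -5.3), ("3SGB", 1322, -14.7),
    ("2SEC", 1547, -13.1), ("1CSE", 1541, -13.1), ("1CHO", 1547, -15.7),
    ("2TPI", 601, -5.8), ("2TPI", 1457, -18.1), ("1INS", 1305, -7.4),
    ("2SSI", 1594, -16.0), ("2HFL", 1775, -14.2), ("1HBS", 760, -4.8),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


areas = [area for _, area, _ in RIGID_COMPLEXES]
observed = [dg for _, _, dg in RIGID_COMPLEXES]
r = pearson(areas, observed)
print(f"r = {r:.2f}")
```

The coefficient is negative, since more buried area goes with a more negative free energy, and its magnitude comes out near 0.8, in line with the 79% quoted for proteins that associate as rigid objects.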

It has long been thought that hydrophobicity is the major factor that stabilizes protein-protein associations; therefore the calculated solvation energies should simply recapitulate the observed relationship between surface area and free energies. However, the solvation free energies, calculated for the 15 proteins that assemble as rigid objects, correlate poorly (12%) with the experimental free energies (Fig. 5). Polar atoms must contribute favorably to the association energies and provide more than just specificity. If the solvation energy is broken up into the two components $\Delta G^{\circ}_{polar}$ and $\Delta G^{\circ}_{apolar}$, the correlation reappears with values of 63% and 77%, respectively (Fig. 6). When the two terms are added together and weighted by the coefficients α and β, the correlation extends to 96% (Fig. 7), indicating that the polar and apolar terms are additive and both contribute favorably to the association energy.

Fig. 4. [Plot not reproduced; x-axis: Surface Area (Å²).]

Fig. 5. The observed free energy of association (kcal/mol) is plotted vs. the calculated solvation energy (kcal/mole). The open squares are those proteins that are known to associate as rigid objects, which are a subset of the points used in Figure 4.

Analysis of coefficients

The coefficient, α, for the apolar term (1.4 ± 0.2) is consistent with values that are used for correlating binding energies of substrates to hydrophobicity (Fersht, 1985). The coefficient is greater than unity, indicating that proteins are more hydrophobic than octanol or that there is a difference in the standard states of the two measurements. Protein association experiments are typically performed at higher ionic strength than the amino acid partitioning experiments. In addition, a value of α greater than unity may reflect the required formation of cavities in partitioning experiments that need not occur in the association of protein molecules (Fersht, 1985). Interestingly, the product of our calculated coefficient α and the ASP value for apolar atoms ($\alpha \cdot \Delta\sigma(\mathrm{C})$) has a value of 25 cal/Å², consistent with the original value of Chothia (1974).

The coefficient, β, associated with the polar term (-1.2 ± 0.2) indicates that polar interactions contribute favorably to the association. An analysis of the polar interactions that contribute to the association energy suggests that the strength of hydrogen bonds is slightly less than values determined by mutation experiments (Fersht, 1987). The energies we calculated for the observed 131 hydrogen bonds range from 0.0 to -0.71 kcal/mol, with an average of -0.24 kcal/mol, compared to -0.8 to -1.5 kcal/mol determined experimentally. The 44 charged hydrogen bonds contribute 0.0 to -3.0 kcal/mol, with an average of -1.9 kcal/mol, and the 12 salt bridges range between 0.0 and -3.0 kcal/mol, with an average value of -2.5 kcal/mol. Mutational data estimate that charged hydrogen bonds and salt bridges contribute 3-6 kcal/mol of stabilization energy. The calculated energies for these polar interactions, based upon changes in surface area and the atomic solvation parameters, agree well with expected electrostatic energies.

The free energy lost due to fixing one subunit relative to the other, $\Delta G^{\circ}_{rot,trans}$, is 6.2 ± 2.2 kcal/mol. This value is less than those calculated by statistical mechanics, which estimates the energy to be on the order of 15 kcal/mol (for proteins the size of a protease and a protease inhibitor). Experimental studies (Erickson & Pantaloni, 1981; Erickson, 1989) suggest that the entropy loss at room temperature is between 7 and 11 kcal/mol, closer to our calculated value but still greater than the estimated error.

Fig. 6. The observed free energy of association (kcal/mol) is plotted vs. the calculated free energy of association independently for the polar and apolar components of the association energy (x-axis: Calculated Polar and Apolar Energy, kcal/mole). The open squares represent the energy that is attributed to the apolar interactions and the solid diamonds are the result of -1 times the polar contribution.

The contribution that any atom makes to the association energy depends upon the change in the accessibility of its surface area to the solvent. Atoms that are buried in the formation of the complex lose more surface area and therefore contribute more free energy. Polar atoms that become buried in the interface and do not form ionic interactions contribute to the apolar term. Due to the sign of the solvation parameter this contribution is unfavorable. By defining the association energy as the sum of two solvation terms, polar atoms contribute favorably to the association only when hydrogen bonds or salt links are formed and contribute unfavorably if the atom is unpaired. The ratio $\Delta G^{\circ}_{polar} : \Delta G^{\circ}_{apolar}$ is in all cases less than 1, which demonstrates that the apolar atoms contribute more to the total free energy than do the polar atoms. However, the contribution of the polar atoms can be appreciable.

Assessment of $\Delta G^{\circ}_{isomerization}$

The general pathway describes how intermediates may occur prior to the association. When such intermediates do occur, the calculation of the association energy using only the crystal structure of the complex is not possible. Several examples of complexes that associate by this general pathway are known. In some cases, $\Delta G^{\circ}_{isomerization}$ can be estimated from the difference between the observed and calculated free energies. This is possible only when the intermediates are monomeric (Fig. 1). The contribution $\Delta G^{\circ}_{isomerization}$ can make to the association energy is illustrated by the binding of the bovine pancreatic trypsin inhibitor (BPTI) to trypsinogen.

The active cleft residues of trypsinogen are disordered in the native state as determined by crystallography (Bolognesi et al., 1982). Activation of this zymogen involves the cleavage of the 15 N-terminal residues and a conformational change of the substrate-binding cleft. This allows trypsin to bind substrate or inhibitor with high affinity. Trypsinogen does, however, bind inhibitor with a relatively low affinity. The crystallographic structures show that the trypsinogen-BPTI association is similar to trypsin-BPTI, and the active-site residues are ordered as a result of the association with the inhibitor. The association energy, calculated using Equation 7 and the crystallographic coordinates of trypsinogen and BPTI, is -17 kcal/mol. The calculated energy is more negative than the observed association energy (-7 kcal/mol) and is much closer in magnitude to our calculated value for trypsin-BPTI (-18 kcal/mol). The difference between the observed and calculated energies (10 kcal/mol) is due, at least in part, to the energy necessary to order the active-site residues (Bode, 1979). The conformational energy $\Delta G^{\circ}_{isomerization}$ is positive, making the observed association energy less negative. Part of the interaction energy is used to overcome the conformational energy that is lost when these two molecules associate.

Fig. 7. The observed free energy of association (kcal/mol) is plotted vs. the calculated free energy of association (kcal/mole) from Equation 7. The points are labeled with the four-character code from the Brookhaven Protein Data Bank.
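The estimate of the isomerization term amounts to a single subtraction, since the calculated energy omits that term while the observed energy includes it. A minimal sketch with the trypsinogen-BPTI values quoted above; the function name is illustrative only.

```python
def isomerization_energy(dg_observed, dg_calculated):
    """Estimate the isomerization term (kcal/mol) as the observed
    association energy minus the energy calculated for rigid association.
    Only meaningful when the intermediates are monomeric."""
    return dg_observed - dg_calculated


# Trypsinogen-BPTI: observed -7 kcal/mol, calculated -17 kcal/mol.
dg_isom = isomerization_energy(dg_observed=-7.0, dg_calculated=-17.0)
print(dg_isom)  # 10.0
```

A positive result means the conformational change costs energy, so that the complex binds more weakly than its interface alone would suggest.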

Examining assembly intermediates

Many icosahedral viruses self-assemble in vitro, and in some cases intermediates in the assembly process accumulate to detectable levels. Intermediates are assemblies that become caught in local minima on the assembly pathway. Energetically, we define an intermediate as any assembly whose association energy is more favorable than that of any other assembly with the same stoichiometry. In the case of viruses, some particular assemblages of coat proteins are likely to occur more frequently than others. These particular structures may have stronger contacts than any other assemblies containing the same number of monomers.

The picornavirus, human rhinovirus 14 (HRV-14), is a small RNA virus with T = 3 icosahedral symmetry (Rossmann et al., 1987). Assembly studies of this virus have shown that particles associate to form two stable intermediates, a 6S and a 14S particle (Rueckert, 1986). To search for the structures that correspond to these stable intermediates, we first calculated the pairwise interaction energies that result from associations within the icosahedral shell. Then, using a combinatorial approach, we could assess if a particular assembly is more stable than any other assembly with the same number of monomers.
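The combinatorial search can be sketched on a toy example. The code below is an illustration under stated assumptions, not the procedure used for the viral shells: it enumerates every subset of n subunits (connected or not), scores each subset by summing its pairwise interaction energies, and reports the difference between the two most favored assemblies. The four subunits and their pair energies are hypothetical.

```python
from itertools import combinations


def assembly_energy(members, pair_energy):
    """Sum of pairwise interaction energies (kcal/mol) within an assembly.
    Pairs absent from `pair_energy` are treated as non-contacting (0.0)."""
    return sum(pair_energy.get(frozenset(pair), 0.0)
               for pair in combinations(members, 2))


def stability_gap(subunits, pair_energy, n):
    """Energy of the most favored n-mer minus that of the next most favored.
    A large negative gap marks a candidate assembly intermediate."""
    energies = sorted(assembly_energy(members, pair_energy)
                      for members in combinations(subunits, n))
    return energies[0] - energies[1]


# Hypothetical shell fragment: A, B, and C contact each other strongly,
# D touches only C weakly.
PAIR_ENERGY = {
    frozenset("AB"): -10.0,
    frozenset("BC"): -10.0,
    frozenset("AC"): -10.0,
    frozenset("CD"): -2.0,
}

print(stability_gap("ABCD", PAIR_ENERGY, 2))  # 0.0
print(stability_gap("ABCD", PAIR_ENERGY, 3))  # -18.0
```

In this toy case no dimer stands out, whereas one trimer is far more stable than any other. Exhaustive enumeration grows combinatorially with the number of subunits, so a calculation on a full icosahedral shell would need to restrict the search, for example to connected assemblies.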

A plot of the difference in the association energy between the two most favored assemblies of n chains as a function of the number of chains shows large differences in association energy at intervals of 3n. Figure 8 depicts the growth of the viral shell from dimers to the pentamer cap (Fig. 8B). The addition of a monomer to the many possible dimers results in many possible trimers. The striped trimer has an association energy that is much more negative and consequently more favored than any other possible trimer. The absolute difference between its association energy and that of the next most favored trimer is large and gives the peak in Figure 9. The peaks that correspond to a larger difference in energy (more negative) occur at intervals of n = 3. The first stable intermediate corresponds to the 6S particle. The shell continues to grow by the addition of trimers, rather than monomers. Again a particular hexamer (Fig. 8C), made as dimers of trimers, is much more favored than any other combination of two trimers or six monomers. The absolute difference between this hexamer and the next most favored gives the second peak of Figure 9. The trimers continue to assemble about the fivefold axis to produce the pentameric cap (Fig. 8E), which corresponds to the peak in Figure 9 at n = 15. This intermediate assembly is structurally the same as that proposed for the 14S intermediate from chemical cross-linking and electron microscopy studies (Rossmann et al., 1987).

Fig. 8. Structural identity of the proposed assembly intermediates for human rhinovirus 14. A: Illustration of some of the possible two-subunit interactions (dimers) that may occur as the virus assembles. No one dimer is significantly more stable than any other. B: Representation of some of the possible trimers that may associate. The crossed trimer is significantly more stable than any other trimer that may form. C and D portray two of the many hexamers that may form. The hexamer in C is significantly more stable than any other hexamer. Finally, E depicts the most stable 15-mer, which corresponds to the pentameric cap.

Fig. 9. The difference in association free energy between the two most favored assemblies of the human rhinovirus 14 shell vs. the number of chains in the assembly. The association energies were calculated for all possible dimers, trimers, ..., 15-mers. For each n-mer the energies were sorted, and the difference between the most favored n-mer and the next most favored is plotted versus n. When the difference is a large negative number a particular assembly is more stable and likely to be an intermediate in the assembly process, which occurs at intervals of 3n. [x-axis: Number of Chains in the Assembly; series: HRV14, SBMV.]

As a control, we searched for assembly intermediates in another icosahedral virus, southern bean mosaic virus (SBMV) (Rossmann et al., 1983). Assembly studies performed on the structurally related turnip crinkle virus indicate that there are no intermediates larger than dimers and that assembly involves the continuous addition of chains to the growing viral shell (Sorger et al., 1986). The pairwise interaction energies were calculated, and the combinatorial approach was used to identify intermediates in assembly. The calculations for SBMV, however, found no combination of n chains that was markedly more favorable than any other, in contrast to the results for HRV-14. Therefore, no intermediates are predicted for SBMV, in agreement with experimental evidence.

Modeling protein complexes

The association energy can be used as a predictive tool for analyzing modeled molecular complexes. Equilibrium, thermodynamic, and kinetic parameters are known for several enzyme-inhibitor complexes for which the structures of the isolated inhibitor and enzyme are known to high resolution. In some instances, however, detailed structural information about the complex formed is not available. To assess the usefulness of the solvation energies, we modeled several quaternary associations and correlated the observed association energies with our calculated energies.

The initial calculations were performed on the trypsin-BPTI complex. The structures of trypsin and BPTI were superimposed on the framework of the experimentally determined structure of the complex. A simple minimization was performed, using PROLSQ (Konnert, 1976), to eliminate bad contacts (root-mean-square deviation < 0.2 Å). The calculated association energy, -14.1 kcal/mol, is close to the experimentally observed association energy, which is -14.5 kcal/mol. A similar calculation was performed on the subtilisin-Streptomyces subtilisin inhibitor (SSI) complex. Using the α-carbon coordinates (2SIC) of the complex, we positioned subtilisin and SSI in a quaternary complex. Using the same procedure, the computed energy is -14.2 kcal/mol, in good agreement with the observed value of -13.5 kcal/mol. These enzyme-inhibitor complexes have strong associations, and kinetic data have shown that these molecules associate without having to isomerize. As a control, we randomly docked these proteins and calculated the association energies. The mean computed association energy for 20 improper complexes was 1.3 ± 2.4 kcal/mol.
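To put these energies in perspective, the short Python sketch below converts an association free energy into a dissociation constant through ΔG = RT ln Kd and expresses the modeled complexes in units of the spread of the randomly docked controls. The temperature of 298.15 K, the 1 M standard state, the reading of ± 2.4 kcal/mol as a standard deviation, and the function names are assumptions made for illustration; none of them is stated in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def dissociation_constant(delta_g_assoc, temperature=298.15):
    """Kd in mol/L from an association free energy (kcal/mol).

    Uses delta_g_assoc = R * T * ln(Kd); the temperature is an assumed value.
    """
    return math.exp(delta_g_assoc / (R_KCAL * temperature))


def separation_from_controls(energy, control_mean, control_sd):
    """Distance of an energy from the control mean, in units of control_sd."""
    return (energy - control_mean) / control_sd


if __name__ == "__main__":
    # Calculated and observed energies quoted in the text (kcal/mol).
    complexes = {
        "trypsin-BPTI": (-14.1, -14.5),
        "subtilisin-SSI": (-14.2, -13.5),
    }
    for name, (calculated, observed) in complexes.items():
        ratio = dissociation_constant(calculated) / dissociation_constant(observed)
        z = separation_from_controls(calculated, 1.3, 2.4)
        print(f"{name}: Kd(calc)/Kd(obs) = {ratio:.2f}, separation = {z:.1f} SD")
```

Under these assumptions the 0.4 kcal/mol discrepancy for trypsin-BPTI corresponds to roughly a twofold difference in Kd, and both modeled complexes lie more than six control standard deviations below the mean of the improper complexes.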

Conclusion

When proteins associate to form quaternary structures or when inhibitors bind to an enzyme, the driving force for the association comes from van der Waals interactions, hydrogen bonds, and salt bridges. Potential energy functions can be used to calculate pairwise nonbonded interactions and estimate the enthalpic component of the association energy; however, the solvation free energy provides a novel expression to calculate free energies without having to explicitly define the enthalpic and entropic components.

Exploring the molecular basis of recognition requires knowledge of the three-dimensional structure of the molecular complexes, kinetic analyses of association pathways, and thermodynamic measurements of the association. Quaternary interactions are necessary to maintain the native conformation of many proteins. For those assemblies where the structures of the isolated monomers differ from their structures in the oligomeric state, the association process is part of the folding pathway. For these molecules the calculated association energies, and indirectly the surface area, do not correlate well with observed dissociations. Our model for protein association and the calculated solvation energies accurately model only those interactions where the isolated molecules have structures similar to those found in the complex. Therefore, successfully modeling molecular associations requires, in addition to three-dimensional structures, detailed kinetic and thermodynamic measurements.

Acknowledgments

We thank Professors F.E. Cohen, D. Eisenberg, S.W. Englander, I.D. Kuntz, and D.C. Rees for helpful discussions and critically reading this manuscript at various stages. This work was supported by NIH grants GM-44617 and GM-39526.

References

Akasaka, K., Fujii, S., Hayashi, F., Rokushika, S., & Hatano, H. (1982). A novel technique for the detection of dissociation-association equilibrium in highly associable macromolecular systems. Biochem. Int. 5, 637-642.

Ascenzi, P., Amiconi, G., Menegatti, E., Guarneri, M., Bolognesi, M., & Schnebli, H.P. (1988). Binding of the recombinant proteinase inhibitor Eglin C from leech Hirudo medicinalis to human leukocyte elastase, bovine α-chymotrypsin and subtilisin Carlsberg: Thermodynamic study. J. Enzyme Inhibition 2, 167-172.

Ashmarina, L.I., Muronetz, V.I., & Nagradova, N.K. (1982). Evidence for a change in catalytic properties of glyceraldehyde 3-phosphate dehydrogenase monomers upon their association in a tetramer. FEBS Lett. 144, 43-46.

Azuma, T., Kobayashi, O., Goto, Y., & Hamaguchi, K. (1978). Monomer-dimer equilibria of a Bence Jones protein and its variable fragment. J. Biochem. 83, 1485-1492.

Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F., Jr., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., & Tasumi, M. (1977). The Protein Data Bank: A computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535-542.

Bode, W. (1979). The transition of bovine trypsinogen to a trypsin-like state upon strong ligand binding. II. The binding of the pancreatic trypsin inhibitor and of isoleucine-valine and of sequentially related peptides to trypsinogen and to p-guanidinobenzoate-trypsinogen. J. Mol. Biol. 127, 357-374.

Bolognesi, M., Gatti, G., Menegatti, E., Guarneri, M., Marquart, M., Papamokos, E., & Huber, R. (1982). Three-dimensional structure of the complex between pancreatic secretory trypsin inhibitor (Kazal type) and trypsinogen at 1.8 Å resolution. J. Mol. Biol. 162, 839-868.

Bunting, J.W. & Myers, C.D. (1975). Reversible inhibition of carboxypeptidase A. IV. Inhibition of specific esterase activity by hippuric acid and related species and other amino acid derivatives and a comparison with substrate inhibition. Can. J. Chem. 53, 1993-2004.

Cassman, M. & King, R.C. (1972). Subunit interactions and ligand binding in supernatant malic dehydrogenase. Cooperative binding of reduced nicotinamide adenine dinucleotide associated with a monomer-dimer equilibrium of the protein. Biochemistry 11, 4937-4944.

Chen, Z. & Bode, W. (1983). Refined 2.5 Å X-ray crystal structure of the complex formed by porcine kallikrein A and the bovine pancreatic trypsin inhibitor. J. Mol. Biol. 164, 283-311.

Chothia, C. (1974). Hydrophobic bonding and accessible surface area in protein. Nature 248, 338-339.

Chothia, C. & Janin, J. (1975). Principles of protein-protein recognition. Nature 256, 705-708.

Eisenberg, D. & McLachlan, A.D. (1986). Solvation energy in protein folding and binding. Nature 319, 199-203.

Eisenberg, D., Wesson, M., & Yamashita, M. (1989). Interpretation of protein folding and binding with atomic solvation parameters. Chemica Scripta 29A, 217-221.

Empie, M.W. & Laskowski, M., Jr. (1982). Thermodynamics and kinetics of single residue replacements in avian ovomucoid third domains: Effect on inhibitor interactions with serine proteinases. Biochemistry 21, 2274-2284.

Erickson, H.P. (1989). Co-operativity in protein-protein association. The structure and stability of the actin filament. J. Mol. Biol. 206, 465-474.

Erickson, H.P. & Pantaloni, D. (1981). The role of subunit entropy in cooperative assembly. Nucleation of microtubules and other two-dimensional polymers. Biophys. J. 34, 293-309.

Fersht, A. (1985). Enzyme Structure and Mechanism, pp. 305-307. W.H. Freeman and Company, New York.

Fersht, A.R. (1987). The hydrogen bond in molecular recognition. Trends Biochem. Sci. 12, 301-304.

Gerschitz, J., Rudolph, R., & Jaenicke, R. (1978). Refolding and reactivation of liver alcohol dehydrogenase after dissociation and denaturation in 6 M guanidine hydrochloride. Eur. J. Biochem. 87, 591-599.

Hass, G.M. & Ryan, C.A. (1980). Cleavage of the carboxypeptidase inhibitor from potatoes by carboxypeptidase A. Biochem. Biophys. Res. Commun. 97, 1481-1486.

Hearn, R.P., Richards, F.M., Sturtevant, J.M., & Watt, G.D. (1971). Thermodynamics of the binding of S-peptide to S-protein to form ribonuclease S. Biochemistry 10, 806-817.

Hermann, R., Rudolph, R., & Jaenicke, R. (1979). Kinetics of in vitro reconstitution of oligomeric enzymes by cross-linking. Nature 277, 243-245.

Jaenicke, R. & Rudolph, R. (1986). Refolding and association of oligomeric proteins. Methods Enzymol. 131, 218-250.

Kauzmann, W. (1959). Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14, 1-63.

Konnert, J.H. (1976). A restrained-parameter structure factor least-squares refinement procedure for large asymmetric units. Acta Crystallogr. A32, 614-617.

Lee, B. & Richards, F.M. (1971). The interpretation of protein structures: Estimation of static accessibility. J. Mol. Biol. 55, 379-400.

Maeda, H., Engel, J., & Schramm, H.J. (1976). Kinetics of dimerization of the variable fragment of the Bence-Jones protein Au. Eur. J. Biochem. 69, 133-139.

Markert, C.L. & Massaro, E.L. (1968). Lactate dehydrogenase isozymes: Dissociation and denaturation by dilution. Science 162, 695-697.

Ovadi, J., Batke, J., Bartha, F., & Keleti, T. (1979). Effect of association-dissociation on the catalytic properties of glyceraldehyde 3-phosphate dehydrogenase. Arch. Biochem. Biophys. 193, 28-33.

Pekar, A.H. & Frank, B.H. (1972). Conformation of proinsulin. A comparison of insulin and proinsulin self-association at neutral pH. Biochemistry 11, 4013-4016.

Read, R.J., Fujinaga, M., Sielecki, A.R., & James, M.N.G. (1983). Structure of the complex of Streptomyces griseus protease B and the third domain of the turkey ovomucoid inhibitor at 1.8 Å resolution. Biochemistry 22, 4420-4433.

Richmond, T.J. & Richards, F.M. (1978). Packing of α-helices: Geometrical constraints and contact areas. J. Mol. Biol. 119, 537-555.

Ross, P.D., Hofrichter, J., & Eaton, W.A. (1977). Thermodynamics of gelation of sickle cell deoxyhemoglobin. J. Mol. Biol. 115, 111-134.

Ross, P.D. & Subramanian, S. (1981). Thermodynamics of protein association reactions: Forces contributing to stability. Biochemistry 20, 3096-3102.

Rossmann, M.G., Abad-Zapatero, C., Hermodson, M.A., & Erickson, J.W. (1983). Subunit interactions in southern bean mosaic virus. J. Mol. Biol. 166, 37-83.

Rossmann, M.G., Arnold, E., Griffith, J.P., Kamer, G., Luo, M., Smith, T.J., Vriend, G., Rueckert, R.R., Sherry, B., McKinlay, M.A., Diana, G., & Otto, M. (1987). Common cold viruses. Trends Biochem. Sci. 12, 313-318.

Rudolph, R., Heider, I., & Jaenicke, R. (1977). Mechanism of reactivation and refolding of glyceraldehyde 3-phosphate dehydrogenase from yeast after denaturation and dissociation. Eur. J. Biochem. 81, 563-570.

Rueckert, R.R. (1986). Picornaviruses and their replication. In Fundamental Virology (Fields, B. & Knipe, D.K., Eds.), pp. 357-390. Raven Press, New York.

Sharp, K.A., Nicholls, A., Fine, R.F., & Honig, B. (1991). Reconciling the magnitude of the microscopic and macroscopic hydrophobic effect. Science 252, 106-109.

Sheriff, S., Silverton, E.W., Padlan, E.A., Cohen, G.H., Smith-Gill, S.J., Finzel, B.C., & Davies, D.R. (1987). Three-dimensional structure of an antibody-antigen complex. Proc. Natl. Acad. Sci. USA 84, 8075-8079.

Shultz, G.E. & Schirmer, R.H. (1979). In Principles of Protein Structure (Cantor, C.R., Ed.), p. 35. Springer-Verlag, New York.

Singh, C.U., Weiner, P.K., Caldwell, J., & Kollman, P. (1986). Assisted model building with energy refinement, Version 3.0. University of California, San Francisco.

Sorger, P.K., Stockley, P.G., & Harrison, S.C. (1986). Structure and assembly of turnip crinkle virus II. Mechanism of reassembly in vitro. J. Mol. Biol. 191, 639-658.

Vincent, J.P. & Lazdunski, M. (1972). Trypsin-pancreatic trypsin inhibitor association. Dynamics of the interaction and role of disulfide bridges. Biochemistry 11, 2967-2977.

Vincent, J.-P., Peron-Renner, M., Pudles, J., & Lazdunski, M. (1974). The association of anhydrotrypsin with the pancreatic trypsin inhibitors. Biochemistry 13, 4205-4211.

Waley, S.G. (1973). Refolding of triose phosphate isomerase. Biochem. J. 135, 165-172.

Zabori, S., Rudolph, R., & Jaenicke, R. (1980). Folding and association of triose phosphate isomerase from rabbit muscle. Z. Naturforsch. 35, 999-1004.

Zettlmeissl, G., Rudolph, R., & Jaenicke, R. (1982). Rate-determining folding and association reactions on the reconstitution pathway of porcine skeletal muscle lactic dehydrogenase after denaturation by guanidine hydrochloride. Biochemistry 21, 3946-3950.