Principles of Molecular Recognition


Edited by

A.D. BUCKINGHAM

Department of Chemistry University of Cambridge

A.C. LEGON and S.M. ROBERTS

Department of Chemistry University of Exeter

Springer-Science+Business Media B.V.

First edition 1993

© 1993 Springer Science+Business Media Dordrecht. Originally published by Chapman & Hall in 1993. Softcover reprint of the hardcover 1st edition 1993.

Typeset in 10/12pt Times by Thomson Press (India) Ltd, New Delhi

ISBN 978-94-010-4959-7

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the UK Copyright Designs and Patents Act, 1988, this publication may not be reproduced, stored, or transmitted, in any form or by any means, without the prior permission in writing of the publishers, or in the case of reprographic reproduction only in accordance with the terms of the licences issued by the Copyright Licensing Agency in the UK, or in accordance with the terms of licences issued by the appropriate Reproduction Rights Organization outside the UK. Enquiries concerning reproduction outside the terms stated here should be sent to the publishers at the Glasgow address printed on this page.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication data

Principles of molecular recognition / edited by A.D. Buckingham, A.C. Legon, and S.M. Roberts. -- 1st ed.

p. cm. Includes bibliographical references and index. ISBN 978-94-010-4959-7 ISBN 978-94-011-2168-2 (eBook) DOI 10.1007/978-94-011-2168-2 1. Molecular recognition. I. Buckingham, A.D. (Amyand David)

II. Legon, A.C. III. Roberts, S.M. (Stanley M.) QP517.M67P75 1993 547.7'044242--dc20 93-1459

CIP

Printed on acid-free text paper, manufactured in accordance with ANSI/NISO Z39.48-1992 (Permanence of Paper).


Preface

The importance of molecular recognition in chemistry and biology is reflected in a recent upsurge in relevant research, promoted in particular by high-profile initiatives in this area in Europe, the USA and Japan. Although molecular recognition is necessarily microscopic in origin, its consequences are de facto macroscopic. Accordingly, a text that starts with intermolecular interactions between simple molecules and builds to a discussion of molecular recognition involving larger scale systems is timely. This book was planned with such a development in mind.

The book begins with an elementary but rigorous account of the various types of forces between molecules. Chapter 2 is concerned with the hydrogen bond between pairs of simple molecules in the gas phase, with particular reference to the preferred relative orientation of the pair and the ease with which this can be distorted. This microscopic view continues in chapter 3 wherein the nature of interactions between solute molecules and solvents or between two or more solutes is examined from the experimental standpoint, with various types of spectroscopy providing the probe of the nature of the interactions.

Molecular recognition is central to the catalysis of chemical reactions, especially when bonds are to be broken and formed under the severe constraint that a specific configuration is to result, as in the production of enantiomerically pure compounds. This important topic is considered in chapter 4. The origin of the catalytic power of enzymes is examined in chapter 5 where methods of simulating details of the interaction between an enzyme and its substrate are described, with special reference to the catalytic reaction of staphylococcal nuclease. It is then a natural step to address the question of drug discovery in the context of molecular recognition (chapter 6). Finally, the role of the dynamical motion of proteins in determining their functionality and properties is illustrated in chapter 7 through the example of metmyoglobin in water using the technique of computer simulation.

The editors are grateful to the distinguished scientists who have contributed to this book and hope that their efforts will be helpful to students and to those beginning research in this exciting and challenging field.

A.D.B. A.C.L. S.M.R.

Contributors

Dr J. Aqvist
    Department of Molecular Biology, Uppsala Biomedical Centre, Box 590, S-75124 Uppsala, Sweden

Dr J.M. Brown
    Dyson Perrins Laboratory, University of Oxford, South Parks Road, Oxford OX1 3QY, UK

Professor A.D. Buckingham
    Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK

Dr L.A. Findsen
    Department of Medicinal and Pharmaceutical Chemistry, University of Toledo, Toledo, OH 43606, USA

Dr P.J. Guiry
    Dyson Perrins Laboratory, University of Oxford, South Parks Road, Oxford OX1 3QY, UK

Professor A.C. Legon
    Department of Chemistry, University of Exeter, Stocker Road, Exeter EX4 4QD, UK

Dr V. Lounnas
    Department of Chemistry, University of Houston, Houston, Texas 77204-5641, USA

Professor D.J. Millen
    Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, UK

Professor B.M. Pettitt
    Department of Chemistry, University of Houston, Houston, Texas 77204-5641, USA

Professor S.M. Roberts
    Department of Chemistry, University of Exeter, Stocker Road, Exeter EX4 4QD, UK

Dr J. Saunders
    Glaxo Group Research Limited, Greenford Road, Greenford, Middlesex UB6 0HE, UK

Dr S. Subramanian
    Department of Biophysics, University of Illinois, Urbana, IL 61801, USA

Professor M.C.R. Symons
    Department of Chemistry, The University, Leicester LE1 7RH, UK

Professor A. Warshel
    Department of Chemistry, University of Southern California, Los Angeles, California, 90089-1062, USA

Dr A. Wienand
    Dyson Perrins Laboratory, University of Oxford, South Parks Road, Oxford OX1 3QY, UK


Contents

Preface

Contributors

1 Intermolecular forces
A.D. BUCKINGHAM
  1.1 Introduction
  1.2 The Born-Oppenheimer approximation
  1.3 Molecules and forces
  1.4 The hydrophobic effect
  1.5 Classification of intermolecular forces
    1.5.1 Electrostatic energy
    1.5.2 Induction energy
    1.5.3 Dispersion energy
    1.5.4 Resonance energy
    1.5.5 Magnetic interactions
    1.5.6 Short-range interactions
  1.6 Vibrational contributions to intermolecular forces
  1.7 Magnitudes of contributions to the interaction energy
  1.8 Forces between macroscopic bodies
  1.9 The effect of a medium
  1.10 The hydrogen bond
  References

2 Molecular recognition involving small gas-phase molecules
A.C. LEGON and D.J. MILLEN
  2.1 Introduction
  2.2 How to determine the angular geometry and strength of intermolecular binding for an isolated dimer
  2.3 Empirical observations about angular geometries in the series B···HX
  2.4 An electrostatic model for the hydrogen bond interaction: the Buckingham-Fowler model
  2.5 The electrostatic model and non-bonding electron pairs
  2.6 A point-charge representation of non-bonding electron pairs
  2.7 Isomerism in weakly bound dimers: incipient molecular recognition
  2.8 Dimers with two interaction sites
  2.9 Consequences of the rules for angular geometries in the solid state
  References

3 Spectroscopic studies of solvents and solvation
M.C.R. SYMONS
  3.1 Introduction
    3.1.1 History
  3.2 Background
    3.2.1 Hydrogen bonding
    3.2.2 Hydrophobic bonding
    3.2.3 Comments on some common solvent systems
  3.3 Ultraviolet spectroscopy
    3.3.1 Neutral solutes
    3.3.2 Ions
  3.4 ESR spectroscopy
    3.4.1 ESR studies of ion pairing
    3.4.2 Solvation of aromatic nitro-anions
    3.4.3 Solvation of neutral nitroxides
    3.4.4 Gain and loss of solvation
  3.5 Nuclear magnetic resonance studies
    3.5.1 Solute shifts
    3.5.2 Use of ¹H NMR shifts to study solvation of ions
    3.5.3 Relaxation studies
  3.6 Vibrational chromophoric probes
    3.6.1 Triethylphosphine oxide
    3.6.2 Cyanomethane
    3.6.3 Acetone
  3.7 Near infrared studies
    3.7.1 Free OH groups
    3.7.2 Some consequences of the 'free-group' postulate
    3.7.3 Use of overtone infrared (NIR) to study solvation of ions
  3.8 Use of results from vibrational spectroscopy to interpret magnetic resonance data
    3.8.1 NMR shifts
    3.8.2 ESR data
    3.8.3 Why are solvation numbers for solutes greater in water than in other protic solvents?
  3.9 Solvation in biological systems
    3.9.1 Solvation changes
    3.9.2 NMR spectroscopy
    3.9.3 Solvation of small biomolecules
  References

4 Origins of enantioselectivity in catalytic asymmetric synthesis
J.M. BROWN, P.J. GUIRY and A. WIENAND
  4.1 Introduction
  4.2 Homogeneous hydrogenation with rhodium complexes
    4.2.1 Catalytic kinetic resolution and directed hydrogenation
  4.3 Hydrogenation with ruthenium complexes
  4.4 Carbon-carbon bond formation through cross-coupling
  4.5 Carbon-carbon bond formation through allylic alkylation
  References

5 Molecular recognition in the catalytic action of metallo-enzymes
J. AQVIST and A. WARSHEL
  5.1 Introduction
  5.2 Methods for simulating reactions in enzymes and solution
    5.2.1 Molecular orbital approach
    5.2.2 The EVB model
  5.3 Application to the staphylococcal nuclease reaction
    5.3.1 Free energy profile for the SNase reaction
    5.3.2 Effects of metal ion substitutions
  5.4 Concluding remarks
  Acknowledgements
  References

6 Drug discovery
J. SAUNDERS
  6.1 Introduction
  6.2 Receptors as targets for drug design
    6.2.1 Alzheimer's disease and the muscarinic receptor
    6.2.2 Angiotensin-II antagonists in hypertension
  6.3 Enzymes as targets for drug design
    6.3.1 HIV protease inhibitors as anti-AIDS drugs
    6.3.2 Emphysema and elastase
  6.4 Drug discovery by screening: concluding remarks
  Acknowledgements
  References

7 Time scales and fluctuations of protein dynamics: metmyoglobin in aqueous solution
L.A. FINDSEN, S. SUBRAMANIAN, V. LOUNNAS and B.M. PETTITT
  7.1 Introduction
  7.2 Methods
  7.3 Spatial and temporal fluctuations
    7.3.1 The approach to equilibrium
    7.3.2 Structure and dynamics
  7.4 Conclusions
  Acknowledgements
  References

Index


1 Intermolecular forces A.D. BUCKINGHAM

1.1 Introduction

The fundamental basis for molecular recognition is provided by the potential energy surface that represents the interaction energy of two or more molecules in a cluster as a function of their mutual separation and orientation.

Molecules attract one another when they are far apart, since liquids and solids exist. They repel one another when close, since the densities of liquids and solids have the values they do under normal conditions of temperature and pressure. Figure 1.1 illustrates this important truth and shows a typical interaction energy $u(R)$ of two spherical molecules as a function of their separation $R$. For two argon atoms, the well-depth $\varepsilon$ is $0.198 \times 10^{-20}$ J ($\varepsilon/k = 143$ K) and the equilibrium separation $R_e$ is $3.76 \times 10^{-10}$ m [1].
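The shape of the curve in Figure 1.1 can be sketched numerically. The example below is only an illustration: it assumes a 12-6 (Lennard-Jones) functional form, which the text does not specify, and takes only the well-depth and equilibrium separation for argon from the paragraph above. The names `u` and `force` are hypothetical helpers.

```python
# Illustrative 12-6 (Lennard-Jones) model of Figure 1.1.
# ASSUMPTION: the functional form is chosen for illustration only;
# EPSILON and R_E are the argon values quoted in the text.
EPSILON = 0.198e-20   # well depth, J
R_E = 3.76e-10        # equilibrium separation, m
K_B = 1.380649e-23    # Boltzmann constant, J/K


def u(R):
    """Interaction energy in J at separation R (in m)."""
    x = (R_E / R) ** 6
    return EPSILON * (x * x - 2.0 * x)


def force(R):
    """Force -du/dR in N; positive values are repulsive."""
    x = (R_E / R) ** 6
    return 12.0 * EPSILON / R * (x * x - x)


if __name__ == "__main__":
    print(f"epsilon/k = {EPSILON / K_B:.1f} K")
    for factor in (0.9, 1.0, 1.5, 3.0):
        R = factor * R_E
        print(f"R = {factor:.1f} R_e: u = {u(R):+.3e} J, force = {force(R):+.3e} N")
```

In this model the energy is $-\varepsilon$ and the force vanishes at $R = R_e$; the force is repulsive inside $R_e$ and attractive outside it, as the paragraph describes.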

The number of independent variables upon which the intermolecular energy depends increases as the molecular size increases. For two atoms there is only one variable $R$ (Figure 1.1), and for an atom interacting with a diatomic there are the three variables $R, \theta, r$ where $\theta$ is the angle between the internuclear axis of the diatomic and the line joining the atom to the centre of mass of the diatomic, and $r$ is the separation of the nuclei in the diatomic. For two diatomics there are six ($R, \theta_1, \theta_2, \phi, r_1, r_2$), where $\phi$ is the angle between the planes containing the line of centres and the internuclear axis of each molecule. In the general case, for molecules containing $N_1$ and $N_2$ nuclei, there are $3(N_1 + N_2) - 6$ independent variables of which $3N_1 - 6$ and $3N_2 - 6$ are vibrational coordinates in each molecule and the remaining six ($R, \theta_1, \chi_1, \theta_2, \chi_2, \phi$) (Figure 1.2) determine the relative positions and orientations of the molecules; $\chi_1$ and $\chi_2$ give the orientation of molecules 1 and 2 about their axes at angles $\theta_1$ and $\theta_2$ to the line of centres. The intermolecular potential surface of the water dimer (H2O)2 has twelve variables, six of which are related to the vibrational coordinates of the two H2O molecules.
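The counting in the paragraph above can be checked with a short sketch. The function names are hypothetical, and the treatment of atoms and linear molecules (where the general $3N - 6$ rule does not apply) is an assumption added here so that the special cases quoted in the text come out correctly.

```python
def internal_coordinates(n_nuclei, linear=False):
    """Vibrational coordinates of one isolated molecule."""
    if n_nuclei == 1:
        return 0                      # an atom has no internal coordinate
    if linear or n_nuclei == 2:
        return 3 * n_nuclei - 5       # linear molecules, including diatomics
    return 3 * n_nuclei - 6           # non-linear molecules


def pair_variables(n1, n2, linear1=False, linear2=False):
    """Return (total, vibrational, intermolecular) variable counts for a pair."""
    # Two atoms form a linear system with the single variable R;
    # otherwise the pair is treated as a non-linear system of n1 + n2 nuclei.
    total = 1 if n1 + n2 == 2 else 3 * (n1 + n2) - 6
    vibrational = internal_coordinates(n1, linear1) + internal_coordinates(n2, linear2)
    return total, vibrational, total - vibrational


if __name__ == "__main__":
    cases = {
        "atom + atom": (1, 1),
        "atom + diatomic": (1, 2),
        "diatomic + diatomic": (2, 2),
        "water dimer": (3, 3),
    }
    for label, (n1, n2) in cases.items():
        print(label, pair_variables(n1, n2))
```

For the water dimer this gives twelve variables in total, six vibrational and six intermolecular, in agreement with the text.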

The six relative translational and orientational degrees of freedom of an interacting pair of non-linear polyatomic molecules generally fluctuate slowly compared to the intramolecular vibrations. For some purposes, such as rotational relaxation, it may be sufficient to average $u$ over the vibrational motion, thereby reducing the number of variables upon which $u$ depends to just six. For vibrational relaxation of a particular mode, it may sometimes be reasonable to average over the other vibrational modes, thus reducing the effective dimensionality of the problem.

Figure 1.1 The interaction energy $u$ as a function of the separation $R$ of two atoms.

Figure 1.2 The six variables $R, \theta_1, \chi_1, \theta_2, \chi_2, \phi$ describing the relative positions and orientations of two interacting non-linear molecules.

1.2 The Born-Oppenheimer approximation

The concept of a potential energy function $u(R)$ is dependent upon the Born-Oppenheimer approximation [2]. The potential energy $u(R, \theta, r)$ is the interaction energy for fixed positions of all the nuclei, i.e. it is the difference between the energy of the system in that particular configuration and its value when the intermolecular separation $R \to \infty$. The effect of nuclear momentum on the electronic structure is therefore ignored. There are interesting effects resulting from the breakdown of the Born-Oppenheimer approximation (particularly when there are electronic degeneracies [3]) but for the purposes of studying liquids and solids we may safely employ it. The accuracy of the approximation may be gauged from the following:

(i) The Rydberg constant for the H atom is reduced by 0.054% on changing the nuclear mass from infinity to that of the proton. This energy change of 59.8 cm$^{-1}$ is equal to the mean kinetic energy of the nucleus.

(ii) The clamped-nuclei non-relativistic dissociation energy $D_e$ for H2 is 38292.83 cm$^{-1}$ [4]. The relativistic correction takes $D_e$ to 38292.30 cm$^{-1}$ [5] and the experimental value is 38295.6 cm$^{-1}$ < $D_e$ < 38297.6 cm$^{-1}$ [6].

(iii) The dipole moment of HD, as determined from pure-rotational absorption intensities in the far infrared [7], is $9 \times 10^{-4}$ D $= 3 \times 10^{-33}$ C m, and it arises entirely from the breakdown of the approximation, since HD is electrically centrosymmetric in the clamped-nuclei approximation. The dipole has the sense H$^+$D$^-$ and is an order of magnitude smaller than the dipole of CH3D which is $5.64 \times 10^{-3}$ D [8] and is attributable to the different mean bond lengths in CH3D [9].

(iv) The rotational magnetic moment of a molecule, which results from the distortion of the electronic structure by the angular momentum of the nuclei, is proportional to the quantum number $M$ that gives the space-fixed component of the rotational angular momentum; the constant of proportionality is $\sim 10^{-3}\mu_B$ where $\mu_B = e\hbar/2m_e$ is the Bohr magneton [10]. In a linear molecule in a $^1\Sigma$ electronic state, e.g. H2, the rotational magnetic moment may be thought of as arising from a small admixture of $^1\Pi$ character induced by the rotating nuclei. Similarly, vibrating nuclei cause fluctuations in the electronic current density which lead to a transition magnetic dipole moment that is important in vibrational circular dichroism [11].
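The size of the effect in item (i) can be estimated from the reduced-mass correction to the Rydberg constant. The sketch below is a rough check, not a reproduction of the author's calculation; the mass ratio and the infinite-mass Rydberg constant are rounded modern values supplied here as assumptions, and `reduced_mass_shift` is a hypothetical helper name.

```python
# ASSUMED constants (rounded modern values, not taken from the text).
M_E_OVER_M_P = 1.0 / 1836.15267   # electron-to-proton mass ratio
R_INF = 109737.316                # Rydberg constant for infinite nuclear mass, cm^-1


def reduced_mass_shift(mass_ratio=M_E_OVER_M_P, r_inf=R_INF):
    """Return (fractional reduction, reduction in cm^-1) of the Rydberg constant.

    With a finite nuclear mass the electron mass is replaced by the reduced
    mass, so the constant is lowered by the fraction m_e / (m_e + m_p).
    """
    fraction = mass_ratio / (1.0 + mass_ratio)
    return fraction, fraction * r_inf


if __name__ == "__main__":
    fraction, shift = reduced_mass_shift()
    print(f"fractional reduction = {100 * fraction:.4f} %")
    print(f"energy change        = {shift:.1f} cm^-1")
```

With these inputs the reduction is about 0.054% and a little under 60 cm$^{-1}$, in line with the figures quoted in item (i).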

1.3 Molecules and forces

Since we shall be concerned with intermolecular forces, we should consider what we mean by a molecule and what we mean by a force. Two argon atoms form a bound diatomic Ar2 but we do not normally consider the species Ar2 as a molecule, since the binding energy is only about $\tfrac{1}{2}kT$ at room temperature. Collisions may easily dissociate Ar2, and there would normally be many thermally populated vibration-rotation states, $\psi_{vJ}$, each with a different mean bond length $\bar{R} = \langle\psi_{vJ}|R|\psi_{vJ}\rangle / \langle\psi_{vJ}|\psi_{vJ}\rangle$ and a large uncertainty $[\langle\psi_{vJ}|(R - \bar{R})^2|\psi_{vJ}\rangle / \langle\psi_{vJ}|\psi_{vJ}\rangle]^{1/2}$ in $R$. We prefer to speak of Ar2 as a dimer of argon atoms, or a Van der Waals molecule. Similarly H4, formed on cooling gaseous hydrogen to about 20 K at 1 atmosphere, is an infrared-active species in which two bonds are very similar to that in H2 [12]; we prefer to think of H4 as (H2)2, i.e. as the hydrogen molecule dimer. So by a molecule we mean a group of atoms (or a single atom) whose binding energy is large compared to $kT$ at room temperature. A molecule therefore interacts with its environment without losing its identity. In some non-rigid molecules, such as NH3, 1,2-dichloroethane (ClCH2-CH2Cl), or a polypeptide, there may be only a small change in energy with a large change in an internal coordinate; the influence of the environment in producing changes in the energy surface involving this coordinate may be of interest [13]. A Van der Waals molecule is a weakly bound cluster of molecules, such as Ar2, (H2)2, (H2O)2, (HF)2, (HF)3, etc. There are large zero-point oscillations about the equilibrium structure and significant changes of structure with vibrational and rotational excitation. The characteristic feature of a Van der Waals molecule is that the constituent molecules retain their identity, even though their geometry and electronic structure may be perturbed; this means that 'long-range' intermolecular force theory, in which $u(R)$ is expressed in terms of the properties of the non-interacting molecules, applies right in to near $R_e$. For chemically bound systems, such as H2, long-range theory fails at $R$ values $\sim 5R_e$.

And what do we mean by a force? In Figure 1.1 the force is $-\mathrm{d}u/\mathrm{d}R$ and there is no difficulty here. The concept may easily be extended to a many-dimensional surface as in Figure 1.2. But what is the effective force on two ions in aqueous solution? It is convenient to consider the potential of average force $A(R)$ which is a Helmholtz free energy and is the mean interaction energy of the two ions at a fixed separation $R$, averaged over all configurations of all the other molecules and ions in the solution. $A(R)$ is the sum of $u(R)$ and $-TS(R)$ where both $u(R)$ and the entropy $S(R)$ are functions of the temperature $T$. The entropic contribution may be supposed to arise from the change in the order in the environment resulting from the interaction of the pair. The attractive force in a stretched rubber band is attributable to a decrease in entropy on stretching; and the hydrophobic effect that appears to produce an attractive force between hydrocarbon chains in aqueous media depends on $S(R)$, for the decrease in entropy in forming a cage of water molecules [14] is presumably less in the case of a close pair of chains than when they are far apart. It should be possible to obtain direct evidence for the 'structure-making' around a CH4 molecule in water, and eventually for the hydrophobic attraction of two CH4 molecules, by utilizing the differential neutron scattering of solutions of CH4 and CD4 in water.

Intramolecular vibrational motion is too rapid to permit adjustment of the relative positions and orientations of neighbouring molecules, so its contribution to the entropic force $T\,\partial S(R)/\partial R$ may normally be neglected.
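As a brief worked step, the mean force on the pair can be split into an energetic and an entropic part. This is a sketch at fixed temperature that uses only the decomposition of the potential of average force into $u(R)$ and $-TS(R)$ given in the preceding discussion:

```latex
\begin{align}
A(R) &= u(R) - T\,S(R), \\
-\frac{\partial A}{\partial R} &= -\frac{\partial u}{\partial R}
  + T\,\frac{\partial S}{\partial R}.
\end{align}
```

The second term on the right is the entropic force: it is attractive wherever the entropy of the surroundings increases as the pair approaches, which is the situation described for hydrocarbon chains in water.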

1.4 The hydrophobic effect

Much is known of the hydrophobic effect from experimental studies of solutions of hydrocarbons in water and from computer simulations [15-17], and it remains an area of active research.

Since the hydrophobic forces arise from a general dislike of water by hydrocarbons, which leads to their low solubility and to phase separation, it might be thought that they can play no significant part in molecular recognition, which requires specific and coherent attraction between different atomic groups in the interacting pair. Nonetheless, it is widely believed by molecular biologists that hydrophobic forces do play a key role in protein folding. In the Introductory Lecture to the Faraday Discussion on 'Structure and Activity of Enzymes', Perutz [18] said:

Most water-soluble proteins are waxy inside and soapy outside, because their larger hydrophobic amino acid residues shy away from water and coalesce. Van der Waals interactions make a large, but insufficient, contribution to the stability of the hydrophobic core thus formed. The main contribution comes from an entropic effect discovered by Frank and Evans (1945)¹ in a classic paper on the solubility of hydrocarbons in water. Near room temperature, the enthalpy of dissolution of gaseous non-polar atoms or molecules in water is always negative and proportional to the surface area of the solute molecule; the absolute value of that enthalpy decreases with rising temperature. Privalov and Gill (1988)² demonstrated that dissolution of non-polar molecules in water raises the heat capacity of the water; that rise is also proportional to the surface area of the solute and also decreases with increasing temperature. The entropy of solution is negative, and its magnitude drops with rising temperature. Frank and Evans concluded that the non-polar atoms and molecules become solvated, such that their surface becomes covered with a layer of partially ordered water molecules which they likened to icebergs. Kauzmann (1959)³ recognized the importance of Frank and Evans' hydrophobic effect for the stability of proteins. He suggested that the water molecules' anarchic distaste for the orderly regimentation imposed upon them by the hydrophobic sidechains of the protein forces these sidechains to shy away from water and congregate in the centre of the protein. His prediction was borne out in the same year by Kendrew's structure of myoglobin. Direct experimental evidence for Frank and Evans' icebergs was first found by Hendrickson and Teeter (1981)⁴ in the structure of crambin where they saw ordered water molecules covering the surface of a leucine sidechain. The hydrophobic effect stabilizes proteins only near ambient temperatures. With increasing temperature, the loss of entropy due to water adhering to the unfolded protein diminishes, which destabilizes the folded structure. When the temperature drops, the stability of the hydrated hydrocarbons in the unfolded polypeptide chain begins to exceed that of the compact hydrophobic core in the native protein, and the protein unfolds with the release of heat. Privalov and Gill (1988)² used microcalorimetry to demonstrate this effect in myoglobin.

The magnitude of this contribution to the free energy of interaction of hydrocarbons in water is estimated to be $0.017 \times 10^{-20}$ J for every square ångström of buried hydrophobic surface [19,20].
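For comparison with molar quantities, the per-molecule figure above can be converted to molar units. The sketch below is a simple unit conversion; the helper names and the 100 Å$^2$ example area are hypothetical and are not taken from the text.

```python
N_A = 6.02214076e23      # Avogadro constant, 1/mol
J_PER_CAL = 4.184        # joules per thermochemical calorie
G_PER_A2 = 0.017e-20     # J per square angstrom of buried surface (value quoted in the text)


def per_mole(g_per_a2=G_PER_A2):
    """Return the estimate as (kJ/mol, cal/mol) per square angstrom."""
    j_per_mol = g_per_a2 * N_A
    return j_per_mol / 1000.0, j_per_mol / J_PER_CAL


def burial_free_energy(area_a2, g_per_a2=G_PER_A2):
    """Magnitude of the free energy (kJ/mol) for burying area_a2 square angstroms."""
    return per_mole(g_per_a2)[0] * area_a2


if __name__ == "__main__":
    kj, cal = per_mole()
    print(f"{kj:.3f} kJ/mol per A^2, {cal:.1f} cal/mol per A^2")
    # Hypothetical example: 100 square angstroms of buried surface
    print(f"{burial_free_energy(100.0):.1f} kJ/mol for 100 A^2")
```

The quoted value corresponds to roughly 0.1 kJ mol$^{-1}$ (about 24 cal mol$^{-1}$) per square ångström.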

Computer simulations of methane in water have provided a potential of average force for two CH4 molecules in water and the entropy of association [21]. A change of temperature from 275 to 317 K leads to a large increase in the clustering probability [22]. Such entropy-driven attraction may play an important role in molecular recognition of flexible molecules in aqueous solution.

¹ H.S. Frank and M.W. Evans (1945) J. Chem. Phys. 13, 507-532; ² P.L. Privalov and S.J. Gill (1988) Adv. Protein Chem. 39, 191-234; ³ W. Kauzmann (1959) Adv. Protein Chem. 14, 1-63; ⁴ W.A. Hendrickson and M.M. Teeter (1981) Nature 290, 107-113.

1.5 Classification of intermolecular forces

The significant forces between molecules have an electric origin. It is true that there are also magnetic and gravitational interactions, but these can normally be neglected. In considering the nature of intermolecular potentials, it can be helpful to separate various contributions. The primary separation is a division of the interactions into two classes, long-range and short-range. The former decrease as $R^{-m}$ at large $R$ where $m$ is a positive integer. Thus the interaction energy of two ions varies as $R^{-1}$ and that of two dipoles as $R^{-3}$ at large $R$; the corresponding forces vary as $R^{-2}$ and $R^{-4}$. Short-range interactions decrease approximately as $\exp(-aR)$ times a polynomial in $R$ and result from overlap of the electronic wavefunctions describing the isolated molecules. At large separations, this overlap is negligible and it is possible to consider the electrons as belonging to one molecule or another, and the $n$-electron wavefunction, where $n = n_1 + n_2$, need not be antisymmetrized with respect to exchange of electrons between molecules 1 and 2; such antisymmetrization leads to short-range forces. Long-range forces can be related by perturbation theory to properties of the free molecules such as charge densities and polarizabilities [23].

Short-range forces may be attractive or repulsive, although for small R they are always repulsive. They arise from the Coulomb and exchange energies [24]. Long-range forces can also be attractive or repulsive but for pairs of inert-gas atoms in their ground states, the long-range force is attractive.

The Hellmann-Feynman theorem [25,26] requires that the forces on the nuclei may be evaluated by classical electrostatics from the charge distribution. The attractive force between two inert-gas atoms at long range is associated with a slight build-up of electronic charge in the region between the nuclei. Each atom acquires a dipole moment proportional to $R^{-7}$ at large $R$ but the dipoles cancel in a homonuclear pair such as Ar2. The attractive force varying as $R^{-7}$ results from the force exerted on each nucleus by the distorted electron cloud of its own atom, but evaluation of the interaction energy does not require such a detailed knowledge of the charge distribution [27]. The interaction at long range results from intermolecular electron correlation. In the short-range overlap region, it is not necessary to invoke a redistribution of charge to explain the force, although such a redistribution does occur and tends to reduce the strength of the repulsion.

A secondary classification of long-range interactions into several distinct types can be helpful. Table 1.1 shows these interactions and whether they are additive in the sense that $u_{123} = u_{12} + u_{23} + u_{31}$; it also shows whether the forces are attractive or repulsive.

Table 1.1 Classification of molecular interaction energies

Range   Type                             Attractive (-) or repulsive (+)   Additive or non-additive
Short   Overlap (Coulomb and exchange)   ±                                 Non-additive
Long    Electrostatic                    ±                                 Additive
Long    Induction                        -                                 Non-additive
Long    Dispersion                       -                                 Nearly additive
Long    Resonance                        ±                                 Non-additive
Long    Magnetic                         ±                                 (Weak)

1.5.1 Electrostatic energy

The simplest and, for systems such as polar gases or electrolyte solutions, the most important, long-range interaction is the electrostatic energy. It is strongly direction-dependent and therefore crucial for molecular recognition. It is the interaction energy of the unperturbed charge distributions of the molecules, and may be evaluated by performing an integration over the space of each molecule. If the separation between the molecules is large compared to their dimensions, the multipole expansion may be employed. At closer separations, a system of distributed multipoles provides a more rapidly convergent series that can conveniently be used to compute the electrostatic energy [28].

The electrostatic energy has a major role in hydrogen bonding [29].

1.5.2 Induction energy

The induction energy is the energy resulting from the distortion of one molecule by the mean electric field due to the other molecules. Like the electrostatic energy, it is absent in the case of a pair of inert-gas atoms. The main contribution to the induction energy is due to the electric dipole induced in the $i$th molecule by the field $\mathbf{F}^{(i)}$ resulting from the charge distribution of the other molecules.

$$u_{\text{induction}} = -\tfrac{1}{2}\sum_i \alpha^{(i)}_{\alpha\beta} F^{(i)}_{\alpha} F^{(i)}_{\beta} - \cdots \tag{1.1}$$

where $\alpha^{(i)}$ is the static polarizability tensor of molecule $i$. Thus, in the interaction of an ion of charge $q$ with a spherical atom, the induction energy is $-\tfrac{1}{2}\alpha q^2 R^{-4}(4\pi\varepsilon_0)^{-2}$ where $q^2 R^{-4}(4\pi\varepsilon_0)^{-2}$ is the square of the field strength at the atom distant $R$ from the ion. If $\alpha^{(i)}$ is isotropic, $\alpha^{(i)}_{\alpha\beta} = \alpha^{(i)}\delta_{\alpha\beta}$ and

$$u_{\text{induction}} = -\tfrac{1}{2}\sum_i \alpha^{(i)} F^{(i)2} - \cdots \tag{1.2}$$

$\alpha^{(i)}$ is positive for a molecule in its ground electronic state, so $u_{\text{induction}} \le 0$. The induction energy is not additive since

$$\mathbf{F}^{(i)} = \sum_{j \ne i} \mathbf{F}^{(ij)}$$

$$F^{(i)2} = \sum_{j \ne i} \mathbf{F}^{(ij)} \cdot \sum_{k \ne i} \mathbf{F}^{(ik)} = \sum_{j \ne i} F^{(ij)2} + \sum_{j \ne i}\sum_{k \ne i,j} \mathbf{F}^{(ij)} \cdot \mathbf{F}^{(ik)} \tag{1.3}$$

The second term on the right-hand side of equation (1.3) is responsible for the non-additivity. Thus the dipole induction energy of an atom midway between two ions of charge $q$ is zero (since the field vanishes at that point), although the induction energy of the atom with each of the ions separately is $-\tfrac{1}{2}\alpha q^2 R^{-4}(4\pi\varepsilon_0)^{-2}$, where $\alpha$ is the polarizability of the atom and $2R$ the separation of the ions.
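The example of an atom midway between two ions can be made concrete with a small numerical sketch. The polarizability (an argon-like polarizability volume of about 1.64 Å$^3$), the ion charge and the distance $R$ are illustrative assumptions, not values from the text, and the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19    # elementary charge, C
# ASSUMED illustrative polarizability: volume of about 1.64 cubic angstroms
ALPHA = 4.0 * math.pi * EPS0 * 1.64e-30   # C^2 m^2 / J


def ion_field(q, ion_x, atom_x):
    """Signed field (V/m) along x at atom_x due to a point charge q at ion_x."""
    d = atom_x - ion_x
    return q * d / (4.0 * math.pi * EPS0 * abs(d) ** 3)


def induction_energy(alpha, field):
    """Dipole induction energy -(1/2) * alpha * F^2 in J (isotropic atom)."""
    return -0.5 * alpha * field ** 2


def midway_atom(q=E_CHARGE, R=3.0e-10, alpha=ALPHA):
    """Return (sum of the two pair energies, energy in the total field) in J."""
    f_left = ion_field(q, -R, 0.0)    # ion at x = -R, atom at the origin
    f_right = ion_field(q, +R, 0.0)   # ion at x = +R
    pairwise = induction_energy(alpha, f_left) + induction_energy(alpha, f_right)
    actual = induction_energy(alpha, f_left + f_right)
    return pairwise, actual


if __name__ == "__main__":
    pairwise, actual = midway_atom()
    print(f"sum of pair energies : {pairwise:.3e} J")
    print(f"energy in total field: {actual:.3e} J")
```

The pairwise sum is twice the single-ion value, while the energy evaluated from the total field is zero because the two fields cancel at the midpoint; the difference is the cross term in equation (1.3).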

1.5.3 Dispersion energy

Dispersion forces act between all molecules, although they are absent in the interaction of a proton and an atom. They result from intermolecular correlations in the fluctuations of the electronic coordinates of the molecules, and are a consequence of the quantum-mechanical nature of the electron.

If the electron were a classical particle, its position could be specified and there would be an electrostatic energy for each electronic configuration. For two spherical atoms, this classical electrostatic energy would average to zero in first order but would lead to a temperature-dependent average attractive energy in second order because of the Boltzmann favouring of the configurations of lower energy. Temperature-dependent forces of this nature were discussed by Keesom [30]. The origin of the binding energy of the liquid and solid inert gases remained a mystery until Wang [31] and London [32,33] showed that there is an attraction due to an energy varying as $R^{-6}$ between two spherical atoms. London pointed to a link between his second-order perturbation theory for this energy and optical dispersion and hence introduced the name dispersion energy. The dispersion energy varying as $R^{-6}$ can be expressed rigorously in terms of an integral over all imaginary frequencies of the product of the polarizabilities $\alpha(\mathrm{i}f)$ of the molecules at the imaginary frequency $\mathrm{i}f$ [34, 35]. An approximate formula for this contribution to the dispersion energy, due to London, is given in section 1.6.

1.5.4 Resonance energy

The resonance energy is the additional interaction energy that results from the lifting of degeneracy by the interaction. The degeneracy may arise because one of the molecules is in a degenerate state, as in the interaction of an H atom with principal quantum number 2 with an ion or polar molecule. The degeneracy might also result from the exchange of excitation between identical molecules, as in the case of a vibrating molecule having one quantum of excitation in its $i$th mode ($v_i = 1$) near an identical molecule with $v_i = 0$. The lifting of the degeneracy by the interaction produces two or more potential surfaces which lie above and below zero; a sum over all the surfaces produces zero in first order in the long-range limit, although in any particular collision the resonance energy produces either an attractive or repulsive interaction, according to the quantum numbers describing the state of the pair.

1.5.5 Magnetic interactions

Since magnetic dipoles are of the order of 1 Bohr magneton = 0.9274 × 10⁻²⁰ e.m.u. = 0.9274 × 10⁻²³ A m², while electric dipoles are ~1 D = 10⁻¹⁸ e.s.u. = 3.336 × 10⁻³⁰ C m, magnetostatic energies are typically 10⁻⁴ of electrostatic energies. If the magnetic moments are transitory, as in a non-spherical diamagnetic molecule, then the magnetic energies are smaller still and can normally be neglected. In optically active species, where the molecules exist in right- and left-handed forms, there is a coupling of the fluctuating electric and magnetic moments, giving rise to a weak dispersion energy that is dependent on the handedness of the molecules. This weak dispersion force varies as $R^{-7}$ and is attractive between similar species (i.e. left with left and right with right) but repulsive between dissimilar species (left with right) [36]. However, it is probable that this difference is negligible and that the important discriminatory forces are of short range and dependent on the shape of the molecules [37].
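The estimate that magnetostatic energies are about 10⁻⁴ of electrostatic energies can be reproduced with a few lines of arithmetic. The sketch below assumes two dipoles of each kind in the same geometry and at the same separation, so that the geometric factors and the separation cancel; the function name is hypothetical.

```python
C_LIGHT = 2.99792458e8       # speed of light, m s^-1
BOHR_MAGNETON = 9.274e-24    # A m^2 (0.9274e-23 A m^2, as quoted in the text)
DEBYE = 3.336e-30            # C m


def magnetic_to_electric_energy_ratio(m=BOHR_MAGNETON, mu=DEBYE):
    """Ratio of magnetic to electric dipole-dipole energies at equal geometry.

    E_mag scales as (mu0 / 4 pi) m^2 / R^3 and E_el as mu^2 / (4 pi eps0 R^3).
    Because mu0 * eps0 = 1 / c^2, the ratio is (m / (c * mu))^2, independent of R.
    """
    return (m / (C_LIGHT * mu)) ** 2


if __name__ == "__main__":
    print(f"E_mag / E_el = {magnetic_to_electric_energy_ratio():.1e}")
```

For one Bohr magneton against one debye the ratio comes out a little under 10⁻⁴, consistent with the order of magnitude stated above.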

1.5.6 Short-range interactions

When the overlap of the electron clouds is significant it is essential that the total wavefunction be antisymmetric with respect to exchange of all pairs of electrons, in accord with the Pauli principle [38].

One important route to short-range interaction energies is through applications of self-consistent-field theory to the interacting system at fixed nuclear positions and to the free molecules. The interaction energy is then the difference between the calculated energies but, unlike the total energy, it is not in general bounded. This approach can give useful results for short-range energies, but if only a single configuration is employed there can be no electron correlation and hence no dispersion energy at long range. Since the dispersion energy is the sole source of attraction between inert-gas atoms, it is to be expected that the Hartree-Fock potential curve for these systems should have no minimum. Minima have sometimes been obtained, but these result from a basis set superposition error (BSSE) and can be eliminated by the 'counterpoise' technique, in which 'ghost' orbitals are introduced in calculations on the separate molecules to compensate for the extension of the basis set of the pair [39]. However, there may be residual problems resulting from the unbalanced basis set [40].
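The bookkeeping behind the counterpoise technique is simple enough to sketch. The example below assumes that the energies of the pair and of each monomer (with and without the partner's ghost orbitals) have already been computed by some electronic-structure program; the function names and all numerical energies are invented for illustration and do not come from the text.

```python
def uncorrected_interaction(e_pair, e_a_own, e_b_own):
    """Interaction energy with each monomer computed in its own basis set."""
    return e_pair - e_a_own - e_b_own


def counterpoise_interaction(e_pair, e_a_ghost, e_b_ghost):
    """Interaction energy with each monomer computed in the full basis of the pair.

    'Ghost' means the partner's basis functions are present but carry no
    nuclei or electrons.
    """
    return e_pair - e_a_ghost - e_b_ghost


def bsse(e_a_own, e_b_own, e_a_ghost, e_b_ghost):
    """Basis set superposition error: the spurious lowering of the monomer energies."""
    return (e_a_ghost - e_a_own) + (e_b_ghost - e_b_own)


if __name__ == "__main__":
    # Hypothetical energies in hartree for two identical closed-shell atoms.
    e_pair = -5.723500
    e_own = -2.861600      # atom in its own basis
    e_ghost = -2.861750    # atom in the basis of the pair
    print("uncorrected :", uncorrected_interaction(e_pair, e_own, e_own))
    print("counterpoise:", counterpoise_interaction(e_pair, e_ghost, e_ghost))
    print("BSSE        :", bsse(e_own, e_own, e_ghost, e_ghost))
```

With these made-up numbers the apparent binding of the uncorrected calculation is entirely BSSE and vanishes after correction, which mirrors the spurious Hartree-Fock minima described above.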

In the region of electron overlap, the identity of the interacting molecules is lost and they are merged into a 'supermolecule'. It is therefore unlikely to be helpful to seek a general theory of short- and intermediate-range forces which relates the interaction to the properties of the free molecules. However, in the long-range region such a theory is fruitful and provides a practical route to intermolecular potentials.

One way to ensure the antisymmetry is to choose a basis set for a variational calculation that is antisymmetric with respect to exchange of all pairs of electrons; such a basis set for the interacting pair $ab$ could be the antisymmetrized product functions $\mathcal{A}\psi_a\psi_b$, where $\mathcal{A}$ is the operator which antisymmetrizes with respect to intermolecular exchange of electrons and $\psi_a$ and $\psi_b$ are eigenfunctions of the unperturbed Hamiltonian of the isolated molecules $a$ and $b$. The set of functions $\mathcal{A}\psi_a\psi_b$ are not orthogonal at separations at which overlap is significant and cannot therefore be eigenfunctions of a Hamiltonian. Normal quantum-mechanical perturbation theory is therefore not applicable and, because the basis is overcomplete, there is no unique transformation to an orthogonal set. It is possible to perform a variational calculation with a trial function which is a sum of a finite number of terms of the set $\mathcal{A}\psi_a\psi_b$. A simpler technique is to use self-consistent-field theory to obtain the best one-electron wavefunctions for the interacting molecules and to evaluate the interaction energy by subtracting the energy computed for the separate molecules.

This 'supermolecule' technique for evaluating interaction energies can incorporate electron correlation either through many-body perturbation theory (e.g. MP2) or through configuration-interaction (CI) computations. Such computations are improving but they suffer from a number of difficulties:

(i) The relative smallness of $u(R)$ in comparison to total energies.

(ii) Basis set deficiencies, leading particularly to basis set superposition errors (BSSE). This is most serious in correlated calculations, where basis set requirements are more stringent. As an illustration of the difficulties, even using large basis sets and after performing 'counterpoise' computations, BSSE causes in (H₂O)₂ an uncertainty of ~10 cm⁻¹ in $u_{\mathrm{electrostatic}}$ and of at least 1% in the well depth $\varepsilon$, which is ~1750 cm⁻¹ [41].

(iii) The need for size consistency. MP2 is size consistent but CISD is not; in CISD, single and double excitations from the SCF wavefunction are included in the CI, so for a single helium atom CISD includes all excitations (since He has only two electrons) but for He₂ the triple and quadruple excitations are excluded; thus at large $R$ the CISD computation on He₂ does not yield twice the energy of one He atom.

(iv) Calculations must be repeated at a large number of relative positions and orientations.

1.6 Vibrational contributions to intermolecular forces

The zero-point vibrational energies of molecules are affected by interactions, and these may lead to higher or to lower vibrational frequencies. At very high densities, the shifts are likely to be to higher frequency, as the repulsive forces will tend to increase the force constant.

Vibrations contribute positively to the static polarizability of all molecules except homonuclear diatomics, and this vibrational polarizability is associated with a change in the equilibrium structure due to an external electric field. The effect is generally small [42]. There is therefore a small increase in the attraction due to vibrational contributions to the induction energy. However, there are more important vibrational contributions associated with the intermolecular modes of vibration, of which there are up to six for each additional molecule in the cluster (see section 1.1). The zero-point vibrational motion significantly reduces the binding energy and may favour one conformation over another. Thus it is found that the deuterium bond is generally stronger than the hydrogen bond, and this is attributed to the greater amplitude of the perpendicular oscillations of the H relative to the D nucleus in the hydrogen bond [43]. These perpendicular vibrations tend to weaken the hydrogen bond by reducing the collinearity of the proton with its two adjacent electronegative atoms X and Y in X-H···Y. The red shift of the X-H stretching vibration is approximately $\sqrt{2}$ times that of the X-D vibration, and this increases the relative strength of the H-bond; however, this effect is generally outweighed by the bending modes, leaving the D-bond a few percent stronger than the H-bond.
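The factor of roughly $\sqrt{2}$ between X-H and X-D stretching quantities reflects the harmonic scaling of a stretching frequency with the inverse square root of the reduced mass. The sketch below assumes a diatomic-like oscillator with the same force constant for both isotopes; that model, the choice of X = O and the function names are illustrative assumptions rather than anything stated in the text.

```python
import math

# Atomic masses in unified atomic mass units (standard tabulated values).
MASS_H = 1.007825
MASS_D = 2.014102
MASS_O = 15.9949


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def frequency_ratio_h_over_d(mass_x):
    """Harmonic frequency ratio nu(X-H) / nu(X-D) for a common force constant."""
    return math.sqrt(reduced_mass(mass_x, MASS_D) / reduced_mass(mass_x, MASS_H))


if __name__ == "__main__":
    print("O-H / O-D        :", round(frequency_ratio_h_over_d(MASS_O), 3))
    print("heavy-X limit    :", round(math.sqrt(MASS_D / MASS_H), 3))
```

For a finite mass of X the ratio falls a little below $\sqrt{2}$, approaching it as X becomes very heavy; a frequency shift that is proportional to the frequency scales in the same way.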

Resonance energy may play a significant role in the interaction of vibrationally excited molecules with identical partners, leading to an 'exciton' splitting, as in single crystals [44], and the sharing of the vibrational energy between the two molecules.

1.7 Magnitudes of contributions to the interaction energy

It can be useful to know the approximate magnitude of the various contributions to the intermolecular potential. The relative importance of each varies from system to system. Thus electrostatic and induction energies are zero in the inert gases, in which the dispersion force is the sole source of attraction between the atoms, whereas in hydrogen-bonded systems the electrostatic energy is predominant.

The energy of interaction of singly charged positive and negative ions at a separation $R$ is $-e^2(4\pi\varepsilon_0 R)^{-1}$, which is −46 × 10⁻²⁰ J (= −280 kJ mol⁻¹) for $R$ = 5 × 10⁻¹⁰ m. This could be substantially reduced by the presence of a polar medium. The energy of two collinear dipoles $\mu$ ($\mu = \sum_i e_i z_i$) of magnitude 1 D (1 D = 3.336 × 10⁻³⁰ C m) separated by $R$ = 5 × 10⁻¹⁰ m is $-2\mu^2(4\pi\varepsilon_0 R^3)^{-1}$ = −0.16 × 10⁻²⁰ J (= −0.98 kJ mol⁻¹), and that of two perpendicular linear quadrupoles $\Theta$ (e.g. in a T-shaped arrangement) of magnitude 10⁻²⁶ e.s.u. = 3.336 × 10⁻⁴⁰ C m² ($\Theta = \tfrac{1}{2}\sum_i e_i(3z_i^2 - r_i^2)$) is $-3\Theta^2(4\pi\varepsilon_0 R^5)^{-1}$ = −0.010 × 10⁻²⁰ J (= −0.058 kJ mol⁻¹); these electrostatic interactions could also be substantially reduced by the presence of a medium.
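These order-of-magnitude figures are straightforward to recompute. The sketch below evaluates the three expressions quoted above (ion pair, collinear dipoles, perpendicular quadrupoles) in SI units at a separation of 5 × 10⁻¹⁰ m; the function names are hypothetical and the physical constants are standard values rather than figures from the text.

```python
COULOMB_K = 8.9875517923e9    # 1 / (4 pi eps0), N m^2 C^-2
E_CHARGE = 1.602176634e-19    # elementary charge, C
AVOGADRO = 6.02214076e23      # mol^-1
DEBYE = 3.336e-30             # C m
QUADRUPOLE = 3.336e-40        # C m^2 (1e-26 e.s.u.)


def ion_pair_energy(r):
    """-e^2 / (4 pi eps0 R) for singly charged ions of opposite sign, in J."""
    return -COULOMB_K * E_CHARGE**2 / r


def collinear_dipole_energy(mu, r):
    """-2 mu^2 / (4 pi eps0 R^3) for two equal collinear dipoles, in J."""
    return -2.0 * COULOMB_K * mu**2 / r**3


def perpendicular_quadrupole_energy(theta, r):
    """-3 Theta^2 / (4 pi eps0 R^5) for two equal perpendicular linear quadrupoles, in J."""
    return -3.0 * COULOMB_K * theta**2 / r**5


def to_kj_per_mol(energy_joule):
    """Convert an energy per pair in J to kJ per mole of pairs."""
    return energy_joule * AVOGADRO / 1000.0


if __name__ == "__main__":
    r = 5e-10  # m
    for label, u in [
        ("ion pair", ion_pair_energy(r)),
        ("dipoles", collinear_dipole_energy(DEBYE, r)),
        ("quadrupoles", perpendicular_quadrupole_energy(QUADRUPOLE, r)),
    ]:
        print(f"{label:12s} {u:.3e} J  {to_kj_per_mol(u):.3f} kJ/mol")
```

The computed values are close to those quoted in the text; small differences in the last digit reflect rounding of the constants. Each successive multipole interaction is smaller by one to two orders of magnitude at this separation.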


The dispersion energy can be approximated by London's formula [33],

$$u_{\mathrm{dispersion}} = -\frac{3}{2}\,\frac{I_1 I_2}{I_1 + I_2}\,\frac{\alpha_1 \alpha_2}{(4\pi\varepsilon_0)^2 R^6},$$

where $I_1$, $I_2$ and $\alpha_1$, $\alpha_2$ are the first ionization energies and static polarizabilities of the interacting molecules at a separation $R$, and $\varepsilon_0$ is the permittivity of free space ($4\pi\varepsilon_0$ = 1.11265 × 10⁻¹⁰ C V⁻¹ m⁻¹ = 1 e.s.u.). The dispersion energy between a pair of -CH₂- groups separated by 5 × 10⁻¹⁰ m is approximately −0.060 × 10⁻²⁰ J (= −0.3 kJ mol⁻¹) [45]. For two long parallel linear chains, each containing $n$ -CH₂- groups at a separation $d$, the total dispersion energy varies as $nd^{-5}$ and for $d$ = 5 × 10⁻¹⁰ m is equal to −0.3$n$ × 10⁻²⁰ J = −1.7$n$ kJ mol⁻¹ [45]. These forces provide a simple explanation of differences in the cohesive energy of cis-unsaturated fatty acids as compared to trans-unsaturated or saturated fatty acids [45-47].
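A short numerical sketch shows the size of the London estimate for a simple case. It assumes the usual form of London's formula, $u = -\tfrac{3}{2}\,[I_1 I_2/(I_1 + I_2)]\,\alpha_1'\alpha_2'/R^6$, written with polarizability volumes $\alpha' = \alpha/(4\pi\varepsilon_0)$. The argon values used (ionization energy about 15.76 eV, polarizability volume about 1.64 Å³, separation 3.8 Å) are approximate literature figures chosen for illustration, not data from the text, and the function name is hypothetical.

```python
EV_TO_KJ_PER_MOL = 96.485  # 1 eV per pair expressed in kJ mol^-1


def london_energy_ev(i1_ev, i2_ev, alpha1_vol, alpha2_vol, r):
    """London estimate of the dispersion energy in eV.

    alpha1_vol and alpha2_vol are polarizability volumes alpha / (4 pi eps0);
    they and r must use one consistent length unit (for example A^3 and A).
    """
    mean_energy = i1_ev * i2_ev / (i1_ev + i2_ev)
    return -1.5 * mean_energy * alpha1_vol * alpha2_vol / r**6


if __name__ == "__main__":
    # Assumed, approximate values for a pair of argon atoms.
    ionization_ev = 15.76
    alpha_vol = 1.64   # A^3
    separation = 3.8   # A
    u_ev = london_energy_ev(ionization_ev, ionization_ev, alpha_vol, alpha_vol, separation)
    print(f"u = {u_ev:.4f} eV = {u_ev * EV_TO_KJ_PER_MOL:.2f} kJ/mol")
```

The result is of the order of −1 kJ mol⁻¹, the same order as the pairwise dispersion energies quoted above. Because of the $R^{-6}$ dependence, the estimate is very sensitive to the separation chosen.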

The heat of sublimation of crystalline carbon dioxide at 0 K is 27 kJ mol⁻¹; of this approximately 45% is due to the electrostatic quadrupole-quadrupole interactions ($\Theta$ = 14 × 10⁻⁴⁰ C m² for CO₂ [48]) and 55% to the dispersion forces.

1.8 Forces between macroscopic bodies

The interaction of two macroscopic bodies can sometimes be obtained by summing the dispersion energy between all pairs of molecules or unit cells in the two bodies. There is no electrostatic or induction contribution when the material is uncharged and isotropic. If the separation of the units is large compared to the reduced wavelength $\bar{\lambda}$ associated with the strong electronic transitions, the dispersion interaction is retarded and therefore weakened; it varies as $R^{-7}$ rather than $R^{-6}$ [34]. If the dispersion energy between the units is

$$u(R) = -CR^{-6} \quad \text{for } R \ll \bar{\lambda}, \qquad u(R) = -KR^{-7} \quad \text{for } R \gg \bar{\lambda},$$

the interaction of macroscopic bodies may be written in terms of $n_1 n_2 C$ or $n_1 n_2 K$, where $n_1$ and $n_2$ are the numbers of units per unit volume of bodies 1 and 2. Some representative energies are shown in Table 1.2 (see [49, 50]).
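The pairwise-summation idea can be illustrated for the non-retarded case. The sketch below assumes the standard results of integrating $-CR^{-6}$ over simple geometries (atom and flat half-space, two spheres at close approach, two flat half-spaces), and checks the atom-flat result by brute-force numerical integration over the half-space. The function names are hypothetical and the closed forms are quoted as commonly given in the literature, so they should be compared with Table 1.2 rather than taken as authoritative.

```python
import math


def atom_flat(c, n2, d):
    """Atom at distance d from a flat half-space of number density n2."""
    return -math.pi * n2 * c / (6.0 * d**3)


def sphere_sphere(c, n1, n2, r1, r2, d):
    """Two spheres of radii r1, r2 with surface separation d much smaller than r1, r2."""
    return -(math.pi**2) * n1 * n2 * c * (r1 * r2 / (r1 + r2)) / (6.0 * d)


def flat_flat_per_area(c, n1, n2, d):
    """Energy per unit area of two flat half-spaces separated by d."""
    return -math.pi * n1 * n2 * c / (12.0 * d**2)


def atom_flat_numeric(c, n2, d, m=400):
    """Sum -c / r^6 over a half-space by midpoint quadrature in cylindrical coordinates.

    Depth z runs from d to infinity (z = d / (1 - t)); ring radius rho runs
    from 0 to infinity (rho = z * s / (1 - s)); both t and s lie in (0, 1).
    """
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        t = (i + 0.5) * h
        z = d / (1.0 - t)
        dz_dt = d / (1.0 - t) ** 2
        ring = 0.0
        for j in range(m):
            s = (j + 0.5) * h
            rho = z * s / (1.0 - s)
            drho_ds = z / (1.0 - s) ** 2
            ring += 2.0 * math.pi * rho * (-c) / (z * z + rho * rho) ** 3 * drho_ds
        total += n2 * ring * h * dz_dt
    return total * h


if __name__ == "__main__":
    print(atom_flat(1.0, 1.0, 1.0), atom_flat_numeric(1.0, 1.0, 1.0))
```

The numerical sum reproduces the closed form, and the weaker distance dependence of the macroscopic results ($d^{-3}$, $d^{-2}$, $d^{-1}$) compared with $d^{-6}$ for a single pair shows why dispersion forces between extended bodies are of much longer range.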

1.9 The effect of a medium

A medium of relative permittivity, or dielectric constant, $\varepsilon_r$ reduces the electrostatic interaction force of two molecules immersed in it by a factor $\varepsilon_r$. The effects of the medium on dispersion energy have been examined [51-53]; it is convenient to introduce an 'effective' or 'excess' polarizability $\alpha^*$ which may be used to give the effective intermolecular energy. The dispersion force between any two similar spherical systems is always attractive, regardless of the nature of the medium, so that two bubbles or two colloidal particles attract one another.

Table 1.2 Dispersion energies for macroscopic bodies

System                                  Non-retarded                                                                 Retarded
Atom-atom                               $u = -Cd^{-6}$                                                               $u = -Kd^{-7}$
Atom-flat                               $u = -\tfrac{1}{6}\pi n_2 C d^{-3}$                                          $u = -\tfrac{1}{10}\pi n_2 K d^{-4}$
Sphere-sphere ($d \ll R_1, R_2$)        $u = -\tfrac{1}{6}\pi^2 n_1 n_2 \frac{R_1 R_2}{R_1 + R_2} C d^{-1}$          $u = -\tfrac{1}{30}\pi^2 n_1 n_2 \frac{R_1 R_2}{R_1 + R_2} K d^{-2}$
Flat-flat                               $u/\mathrm{area} = -\tfrac{1}{12}\pi n_1 n_2 C d^{-2}$                       $u/\mathrm{area} = -\tfrac{1}{30}\pi n_1 n_2 K d^{-3}$

The presence of polarizable matter between interacting molecules may increase their mutual potential energy. For example, if a sphere of polarizability $\alpha$ is at the point midway between a pair of charges $+q$ and $-q$ at a separation $R$ the interaction energy is

$$u(R) = -q^2(4\pi\varepsilon_0 R)^{-1}\left[1 + 32\alpha(4\pi\varepsilon_0 R^3)^{-1}\right].$$

However, the sphere would not change the potential energy of two charges of the same sign, for which $u(R) = q^2(4\pi\varepsilon_0 R)^{-1}$. If spheres of polarizability $\alpha$ are at a fixed distance $d$ beyond each of the charges $q$ and $-q$, the magnitude of the force between the charges is reduced, and takes the value

$$-q^2(4\pi\varepsilon_0 R^2)^{-1}\left[1 - 4\alpha d^{-2}R^{-1}(1 + 2dR^{-1})(1 + dR^{-1})^{-5}(4\pi\varepsilon_0)^{-1}\right].$$

If the two charges have the same sign, the force of repulsion is enhanced to

$$q^2(4\pi\varepsilon_0 R^2)^{-1}\left[1 + 4\alpha d^{-2}R^{-1}(1 + 2dR^{-1} + 2d^2R^{-2})(1 + dR^{-1})^{-5}(4\pi\varepsilon_0)^{-1}\right].$$
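The first of these results, for a polarizable sphere midway between charges $+q$ and $-q$, can be checked from elementary electrostatics: the two charges produce fields at the midpoint that add, and the induction energy is $-\tfrac{1}{2}\alpha E^2$. The sketch below compares that direct calculation with the closed form containing the factor 32. The polarizability volume of 1.5 Å³ and the separation of 5 Å are hypothetical illustrative values, and the function names are not from the text.

```python
COULOMB_K = 8.9875517923e9    # 1 / (4 pi eps0), N m^2 C^-2
E_CHARGE = 1.602176634e-19    # elementary charge, C


def energy_direct(q, r, alpha_vol):
    """Coulomb energy of +q and -q plus the induction energy of a midpoint sphere, in J.

    alpha_vol is the polarizability volume alpha / (4 pi eps0) in m^3.
    """
    coulomb = -COULOMB_K * q * q / r
    # Each charge is r/2 from the midpoint; for opposite charges the fields add.
    field = 2.0 * COULOMB_K * q / (r / 2.0) ** 2
    alpha_si = alpha_vol / COULOMB_K  # alpha = 4 pi eps0 * alpha_vol
    induction = -0.5 * alpha_si * field**2
    return coulomb + induction


def enhancement_factor(alpha_vol, r):
    """Bracketed factor 1 + 32 alpha / (4 pi eps0 R^3)."""
    return 1.0 + 32.0 * alpha_vol / r**3


def energy_closed_form(q, r, alpha_vol):
    """Closed-form energy -q^2 / (4 pi eps0 R) * [1 + 32 alpha / (4 pi eps0 R^3)], in J."""
    return -COULOMB_K * q * q / r * enhancement_factor(alpha_vol, r)


if __name__ == "__main__":
    alpha_vol = 1.5e-30   # m^3, hypothetical
    r = 5e-10             # m
    print(energy_direct(E_CHARGE, r, alpha_vol), energy_closed_form(E_CHARGE, r, alpha_vol))
    print("enhancement factor:", enhancement_factor(alpha_vol, r))
```

For like charges the two fields cancel at the midpoint, so the induction term vanishes, which is why the sphere leaves that energy unchanged.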

1.10 The hydrogen bond

The hydrogen bond is an attractive interaction between a proton donor and a proton acceptor. The donor and acceptor may be in the same or in a different molecule, and we call them intramolecular or intermolecular hydrogen bonds accordingly. The name appears to have been coined by Latimer and Rodebush [54].

The hydrogen bond plays a central role in determining the structure and energetics of biopolymers and is likely to be of great importance in molecular recognition because of its strength and directionality. In the α-helix structure of proteins the amide >N-H group serves as the proton donor and the O=C< group as the acceptor, or base. The combined effects of many hydrogen bonds provide the major driving force for the tertiary structure of biomacromolecules. A valuable early review of the hydrogen bond, including its role in molecular biology, is provided by Pimentel and McClellan [55]. Another useful reference is the three-volume set of books edited by Schuster et al. [56].

The hydrogen bond has some characteristic features, particularly in infrared and NMR spectra. It causes a substantial red shift, a large enhancement in the intensity and a broadening of the hydrogen stretching vibration of the proton donor. This proton also experiences a large decrease in its nuclear magnetic shielding, amounting to several parts per million, as a result of hydrogen bonding to a base. Other important manifestations of the hydrogen bond include a shorter distance between the two electronegative atoms involved in the bond, as found by X-ray and neutron diffraction of crystals [57], and a profound effect on the properties of liquids.

Hydrogen bond dissociation energies are typically in the range 10-30 kJ mol⁻¹ (1.7-5.0 × 10⁻²⁰ J per bond). The FHF⁻ anion has a dissociation energy to FH + F⁻ of 214 kJ mol⁻¹ and is sometimes said to be the strongest hydrogen bond, although Emsley et al. [58] claimed that HCOOH···F⁻ had a larger $\Delta E$ of 250 kJ mol⁻¹. However, the hydrogen bond in this system is better described as HCOO⁻···HF with a $\Delta E$ of 105 kJ mol⁻¹ [59, 60], illustrating the point that a hydrogen bond X···H···Y can dissociate to XH + Y or to X + HY. It is probably wise not to call FHF⁻ a hydrogen-bonded species, for its dissociation must involve the breaking of a strong HF bond. The term hydrogen bond is probably better kept for systems XH···Y, such as FH···FH, where the molecules or groups XH and Y retain their integrity; that is, they resemble free XH and free Y, although the electronic structure and equilibrium bond lengths and angles in XH and in Y may change somewhat from those in the free state. Thus in FH···FH, the HFs are identifiable as perturbed HF molecules but in FHF⁻ they are not.

The hydrogen bond should be thought of as a strong Van der Waals interaction, and the essential feature of a Van der Waals molecule is that its attractive intermolecular potential energy surface can be described by long-range intermolecular force theory, i.e. by a combination of electrostatic, induction and dispersion energies. The short-range repulsive forces in Van der Waals molecules come from the exchange interaction when the charge clouds overlap significantly; they can be approximated from knowledge of the unperturbed charge densities of the free molecules and reflect the size and shape of the monomers.

References

1. G.C. Maitland, M. Rigby, E.B. Smith and W.A. Wakeham (1981) Intermolecular Forces: Their Origin and Determination, Oxford University Press.
2. M. Born and K. Huang (1954) Dynamical Theory of Crystal Lattices, Oxford University Press.
3. H.C. Longuet-Higgins (1961) Advances in Spectroscopy, 2, 429-472.
4. W. Kołos and L. Wolniewicz (1968) J. Chem. Phys. 49, 401-410.
5. W. Kołos and L. Wolniewicz (1964) J. Chem. Phys. 41, 3663-3673.
6. G. Herzberg (1970) J. Mol. Spectrosc. 33, 147-168.
7. J.B. Nelson and G.C. Tabisz (1982) Phys. Rev. Lett. 48, 1393-1396.
8. S.C. Wofsy, J.S. Muenter and W. Klemperer (1970) J. Chem. Phys. 53, 4005-4014.
9. F.A. Gangemi (1963) J. Chem. Phys. 39, 3490-3496.
10. C.H. Townes and A.L. Schawlow (1955) Microwave Spectroscopy, McGraw-Hill, New York.
11. A.D. Buckingham, P.W. Fowler and P.A. Galwas (1987) Chem. Phys. 112, 1-14.
12. A.R.W. McKellar and H.L. Welsh (1974) Can. J. Phys. 52, 1082-1089.
13. A.D. Buckingham (1980) Pure Appl. Chem. 52, 2253-2260.
14. J.A.V. Butler (1937) Trans. Faraday Soc. 33, 229-236.
15. C. Tanford (1973) The Hydrophobic Effect, Wiley, New York.
16. A. Ben-Naim (1980) Hydrophobic Interactions, Plenum Press, New York.
17. W.L. Jorgensen (1991) Chemtracts: Organic Chemistry, 4, 91-119.
18. M. Perutz (1992) Discuss. Faraday Soc. 93, 1-11.
19. M. Perutz (1992) Discuss. Faraday Soc. 93, 107.
20. D.S. Eisenberg, M. Wesson and M. Yamashita (1989) Chem. Scripta, 29A, 217-221.
21. D.E. Smith, L. Zhang and A.D.J. Haymet (1992) J. Am. Chem. Soc. 114, 5875-5876.
22. N.T. Skipper (1993) Chem. Phys. Lett. 207, 424-429.
23. A.D. Buckingham (1967) Adv. Chem. Phys. 12, 107-142.
24. H. Margenau and N.R. Kestner (1971) Theory of Intermolecular Forces, 2nd edn. Pergamon Press, Oxford.
25. H. Hellmann (1937) Einführung in die Quantenchemie, Deuticke, Leipzig, p. 285.
26. R.P. Feynman (1939) Phys. Rev. 56, 340-343.
27. J.O. Hirschfelder and M.A. Eliason (1967) J. Chem. Phys. 47, 1164-1169.
28. A.J. Stone and M. Alderton (1985) Molec. Phys. 56, 1047-1064.
29. A.D. Buckingham, P.W. Fowler and J.M. Hutson (1988) Chem. Rev. 88, 963-988.
30. W.H. Keesom (1921) Physik. Z. 22, 129-141.
31. S.C. Wang (1928) Physik. Z. 28, 663-666.
32. F. London (1930) Z. Physik. Chem. 11, 222-251.
33. F. London (1942) J. Phys. Chem. 46, 305-316.
34. H.B.G. Casimir and D. Polder (1948) Phys. Rev. 73, 360-372.
35. A.D. Buckingham (1978) In Molecular Interactions: From Diatomics to Biopolymers, ed. B. Pullman, Wiley, Chichester, pp. 3-67.
36. C. Mavroyannis and M.J. Stephen (1962) Molec. Phys. 5, 629-638.
37. S.F. Mason (1982) Molecular Optical Activity and the Chiral Discriminations, Cambridge University Press.
38. P. Claverie (1978) In Molecular Interactions: From Diatomics to Biopolymers, ed. B. Pullman, Wiley, Chichester, pp. 69-305.
39. S.F. Boys and F. Bernardi (1970) Molec. Phys. 19, 553-566.
40. P.W. Fowler and A.D. Buckingham (1983) Molec. Phys. 50, 1349-1361.
41. A.J. Stone (1990) In Dynamics of Polyatomic Van der Waals Complexes, eds N. Halberstadt and K.C. Janda, Plenum Press, New York, pp. 329-341.
42. D.M. Bishop (1990) Rev. Mod. Phys. 62, 343-374.
43. S.A.C. McDonald and A.D. Buckingham (1991) Chem. Phys. Lett. 182, 551-555.
44. D.P. Craig and S.H. Walmsley (1968) Excitons in Molecular Crystals: Theory and Applications, Benjamin, New York.
45. L. Salem (1962) Canadian J. Biochem. Physiol. 40, 1287-1298.
46. H.J. Deuel, Jr. (1951) The Lipids, Interscience, New York, p. 52.
47. F.R.N. Gurd (1960) In Lipid Chemistry, ed. D.J. Hanahan, Wiley, New York, p. 222.
48. A.D. Buckingham and R.L. Disch (1963) Proc. Roy. Soc. A. 273, 275-289.
49. J.H. de Boer (1936) Trans. Faraday Soc. 32, 10-37.
50. J.N. Israelachvili and D. Tabor (1973) Prog. Surface Membrane Sci. 7, 1-55.
51. A.D. McLachlan (1965) Discuss. Faraday Soc. 40, 239-245.
52. N.R. Kestner and O. Sinanoglu (1963) J. Chem. Phys. 38, 1730-1739.
53. J.N. Israelachvili (1985) Intermolecular and Surface Forces, Academic Press, New York.
54. W.M. Latimer and W.H. Rodebush (1920) J. Am. Chem. Soc. 42, 1419-1433.
55. G.C. Pimentel and A.L. McClellan (1960) The Hydrogen Bond, W.H. Freeman, San Francisco.
56. P. Schuster, G. Zundel and C. Sandorfy (eds) (1976) The Hydrogen Bond: Recent Advances in Theory and Experiment, Vols. I, II, III, North-Holland, Amsterdam.
57. R. Taylor and O. Kennard (1984) Acc. Chem. Res. 17, 320-326.
58. J. Emsley, O.P.A. Hoyte and R.E. Overill (1977) J. Chem. Soc. Perkin Trans. 2, 2079-2082.
59. W.J. Bouma and L. Radom (1979) Chem. Phys. Lett. 64, 216-218.
60. J. Emsley and R.E. Overill (1979) Chem. Phys. Lett. 65, 616-617.


2 Molecular recognition involving small gas-phase molecules

A.C. LEGON and D.J. MILLEN

2.1 Introduction

The essential aspect of molecular recognition is the specific nature of the interaction of one or more parts of a molecule with one or more parts of another molecule. The important characteristics of the interaction are the strength of the interaction(s) and its angular dependence. Presumably, there is a very specific arrangement of the two molecules at which the energy is lower than for other possible orientations. Clearly, it is important to identify such interaction sites and to develop an understanding of the factors that determine site selectivity in dimer formation. Consequently, any discussion of the fundamentals of molecular recognition requires a detailed understanding of the preferred angular geometry at the interaction site, how easily the geometry is distorted and how much energy is required to break the bond. Thus, the objectives in understanding the weak interaction are: (i) to identify the interaction site; (ii) to characterise the equilibrium angular and radial geometries; and (iii) to measure the two quantities that define the strength of the interaction, namely the intermolecular stretching force constant (which is related to the energy required for a unit infinitesimal extension of the weak bond) and the dissociation energy (which is the energy required for an infinite extension of the bond). The lowest energy angular geometry is also characterised by another measure of the strength of binding of the two component molecules, i.e. the force constants associated with the intermolecular bending motion of the dimer. It turns out that the objectives outlined can be readily and precisely achieved from investigations of the rotational spectrum or the vibration-rotation spectrum of a dimer molecule (that is, one composed of two monomer species interacting through a bond much weaker than normal chemical bonds).

In this chapter, we discuss some generalisations that can be made about the angular geometries of dimers from studies of their rotational spectra. These generalisations have been developed into a successful quantitative model for angular geometries that is based on a simple electrostatic interaction between the pair of molecules. We give an account of this development, emphasising the physical aspects, and suggest simplified models for various molecules or groups that will reconcile the pictorial models familiar in chemistry and the full version of the electrostatic model. We then consider in some detail the factors determining the selection of interaction site on dimer formation. This is followed by an examination of the consequences of the presence of two interaction sites within a dimer. Attention is then turned to the effects of a secondary interaction site on the properties of a particular isomer.

2.2 How to determine the angular geometry and strength of intermolecular binding for an isolated dimer

As mentioned in the introduction to this chapter, the pure rotational spectrum (or the high-resolution vibration-rotation spectrum) of a weakly bound dimer is rich in information about the equilibrium angular geometry and the intermolecular potential energy function. In this section, we give an outline of the spectroscopic observables and their interpretation in terms of the dimer properties of interest. Details of the various experimental techniques used to obtain rotational spectra are given elsewhere [1-3]. Broadly, they fall into two categories: those in which an equilibrium mixture of the two interacting components (e.g. B and HX, if a hydrogen bond is being investigated) is cooled to a sufficiently low temperature to give a detectable concentration of dimers, and those in which a mixture of B and HX seeded in, for example, argon is expanded adiabatically through a nozzle into a vacuum and the resulting dimer-rich, low-temperature gas is then probed spectroscopically.

A molecule that, as we shall see, has been central in the understanding of the angular geometries of hydrogen-bonded dimers is H₂O···HF. Moreover, the rotational spectrum of this dimer has been extensively investigated [4] by both the equilibrium gas mixture method and the supersonic expansion method. The angular geometry of the dimer H₂O···HF presents a particularly difficult problem even for rotational spectroscopy because of the very small changes in energy that accompany large variations of the angular geometry. Normally, rotational constants (which are inversely proportional to principal moments of inertia) are sufficient to settle the relative disposition of the two components (assumed rigid) in space with good precision. For example, the dimer oxirane···HF has a pyramidal arrangement at the oxygen atom in (CH₂)₂O···HF [5] and this can be clearly demonstrated from the observed moments of inertia alone. Because of the small contribution of the two hydrogen atoms of H₂O to the moments of inertia of H₂O···HF, a distinction between a pyramidal and a planar arrangement at O is not possible in this way.

Water is obviously important in molecular recognition in the following sense: does an approaching molecule HX, say, which forms a hydrogen bond to H₂O, prefer to complete a planar or a pyramidal arrangement at oxygen? It is just because the difference in energy between these arrangements is so small that it is so difficult to come to a decision. Nevertheless, it has been shown from the rotational spectrum that the pyramidal configuration at oxygen in H₂O···HF is energetically preferred and hence that there is a small but specific angular preference when binding at oxygen [4]. So small is the energy difference that in the liquid or solid phases it could be masked by other effects arising from solvent molecules etc. This underlines the importance of investigating isolated dimers from the point of view of understanding what determines angular geometry in hydrogen bonding. The essence of the problem for H₂O···HF is illustrated by the variation of the potential energy $V(\phi)$ with the angle $\phi$, which is defined in Figure 2.1(a). A plot of $V(\phi)$ versus $\phi$ determined experimentally from the rotational spectrum of H₂O···HF [4] is shown in Figure 2.2, onto which are drawn the vibrational energy levels associated with the mode $\nu_{\beta(o)}$, also shown schematically in Figure 2.1(b). It is clear that in the zero-point state, H₂O···HF inverts readily between the two equivalent pyramidal forms, more rapidly even than for the well known case of the ammonia molecule. The equilibrium geometry, on the other hand, definitely has a pyramidal arrangement at oxygen.

Figure 2.1 (a) Definition of the angle $\phi$ in H₂O···HF. (b) Schematic representation of the intermolecular bending modes $\nu_{\beta(o)}$, $\nu_{\beta(i)}$, $\nu_{B(o)}$, $\nu_{B(i)}$, $\nu_\sigma$ and $\nu_s$ in H₂O···HF. The modes are classified according to the point group $C_{2v}$ of the potential surface. (Redrawn from [4] with permission from the Royal Society.)

Figure 2.2 Experimentally determined potential energy function $V(\phi)$ for H₂O···HF showing vibrational energy levels associated with $\nu_{\beta(o)}$ (see Figure 2.1 for definition of $\phi$). The horizontal axis is $\phi$/deg, running from -80 to 80. (Redrawn from [4] with permission from The Royal Society.)

The vibrational spacings associated with the mode $\nu_{\beta(o)}$ in H₂O···HF and the potential energy function (Figure 2.2) that governs this mode were in fact determined from the rotational spectrum of H₂O···HF in an equilibrium gas mixture of the two components. Each rotational transition of H₂O···HF consists of a strong feature associated with the vibrational ground state accompanied by vibrational satellites corresponding to the same rotational transition but in vibrationally excited states of the molecule. At the temperature of the experiment, only the energy levels associated with the intermolecular modes are sufficiently populated to give rise to satellites. The intensity of a satellite relative to the ground state transition leads, via the Boltzmann factor, to the energy of the state in question above that of the ground state. Vibrational separations in $\nu_{\beta(o)}$ determined in this way are seen to be strongly anharmonic. A corresponding irregular variation of the position of the satellites with the vibrational quantum number $v_{\beta(o)}$ confirmed the presence of a double minimum potential function, the quantitative form of which was then determined by fitting all available spectroscopic data.
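The step from a relative satellite intensity to a vibrational energy is a direct inversion of the Boltzmann factor. The sketch below assumes that the satellite and ground-state transitions have equal degeneracies and equal intrinsic line strengths, so that the intensity ratio equals $\exp(-E/kT)$; those assumptions, the example ratio and temperature, and the function name are illustrative and not taken from the text.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34 # Planck constant, J s
C_CM = 2.99792458e10      # speed of light, cm s^-1


def energy_from_intensity_ratio(ratio, temperature_k):
    """Energy of the excited vibrational state above the ground state, in cm^-1.

    ratio is I(satellite) / I(ground state), assumed equal to exp(-E / kT).
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must lie in (0, 1] for a thermally populated state")
    energy_joule = -K_B * temperature_k * math.log(ratio)
    return energy_joule / (H_PLANCK * C_CM)


if __name__ == "__main__":
    # Hypothetical measurement: satellite half as intense as the ground-state line at 200 K.
    print(f"{energy_from_intensity_ratio(0.5, 200.0):.1f} cm^-1")
```

Because the energy depends only logarithmically on the ratio, modest uncertainties in measured intensities translate into fairly small uncertainties in the derived vibrational separations.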

The important conclusion that the equilibrium geometry of H₂O···HF is pyramidal (not planar) at oxygen was thereby established. For the related molecule (CH₂)₂O···HF, an examination of the appropriate vibrational satellites established that this dimer too is pyramidal, with a higher and wider (see later) potential energy barrier at the planar conformation [5]. This conclusion also follows (see above) from ground-state moments of inertia because for the more massive oxirane molecule the inversion is too small to observe.

The strength of the hydrogen bond in H₂O···HF can, as indicated above, be discussed in terms either of the intermolecular stretching force constant $k_\sigma$ or the dissociation energy $D_e$ of the dimer. Both of these can be obtained from intensity measurements, although the first is better determined from the effect of centrifugal distortion on the rotating molecule. When the rotational energy of H₂O···HF increases, the intermolecular bond increases in length as a result of the centrifugal force. Each rotational state is therefore associated with a slightly different intermolecular distance. Spectroscopically, this is taken up in the centrifugal distortion constant $D_J$. If the H₂O and HF molecules are treated as rigid, it is possible to establish a simple relationship between $D_J$ and $k_\sigma$. The value thereby established for $k_\sigma$ in H₂O···HF is 24.9 N m⁻¹ [6]. Alternatively, if the intensity of the vibrational satellite arising from the state $v_\sigma = 1$, where $\sigma$ signifies the hydrogen bond stretching mode (see Figure 2.1), relative to that of the ground-state transition is measured, $\nu_\sigma$ can be determined and hence $k_\sigma$ from the simple harmonic expression $\nu_\sigma = (2\pi c)^{-1}(k_\sigma/\mu)^{1/2}$. For a few dimers, $\nu_\sigma$ has been determined directly by infrared spectroscopy, although not yet for H₂O···HF. The ease of angular distortion of the dimer for very small displacements is measured by the hydrogen bond bending force constants $k_\beta$ and $k_B$, where $\beta$ and $B$ are labels identifying the low-frequency and high-frequency intermolecular bending modes, respectively. For H₂O···HF, there are two bending modes of each type, conveniently described as the in- and out-of-plane bending modes $\nu_{\beta(i)}$, $\nu_{\beta(o)}$ and $\nu_{B(i)}$, $\nu_{B(o)}$, as shown schematically in Figure 2.1. There are two sources of $k_\beta$ and $k_B$ values: relative intensities of vibrational satellites in the rotational spectrum (see above) and direct observation of transitions associated with these modes in the infrared spectrum.
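The simple harmonic expression linking the stretching wavenumber and the force constant can be evaluated directly. The sketch below assumes a pseudo-diatomic picture in which $\mu$ is the reduced mass of H₂O and HF treated as point masses; that interpretation of $\mu$, the molecular masses used and the function names are assumptions made for illustration, while the force constant of 24.9 N m⁻¹ is the value quoted above.

```python
import math

C_CM = 2.99792458e10     # speed of light, cm s^-1
AMU = 1.66053907e-27     # unified atomic mass unit, kg
MASS_H2O = 18.0106       # u
MASS_HF = 20.0062        # u


def reduced_mass_kg(m1_u, m2_u):
    """Reduced mass in kg of two point masses given in atomic mass units."""
    return m1_u * m2_u / (m1_u + m2_u) * AMU


def stretch_wavenumber(k_sigma, mu_kg):
    """Harmonic wavenumber in cm^-1: nu = (2 pi c)^-1 (k / mu)^(1/2)."""
    return math.sqrt(k_sigma / mu_kg) / (2.0 * math.pi * C_CM)


def force_constant(wavenumber_cm, mu_kg):
    """Inverse relation: k = mu (2 pi c nu)^2, in N m^-1."""
    return mu_kg * (2.0 * math.pi * C_CM * wavenumber_cm) ** 2


if __name__ == "__main__":
    mu = reduced_mass_kg(MASS_H2O, MASS_HF)
    nu = stretch_wavenumber(24.9, mu)
    print(f"nu_sigma ~ {nu:.0f} cm^-1 for k_sigma = 24.9 N/m")
```

Under these assumptions the stretching wavenumber comes out at a couple of hundred cm⁻¹, far below the wavenumbers of the monomer stretching modes, as expected for a bond much weaker than a normal chemical bond.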

The other measure of strength of binding, D_e, can be obtained by measuring the absolute intensity of a rotational transition of H2O···HF and of H2O in an equilibrium gas mixture of H2O, HF and H2O···HF. The intensity leads to the number density n_{0,0}(B) of the component B in its v = 0, J = 0 state and thence to D_0 from the relation

If sufficient is known of the contribution of the various vibrational modes of H2O···HF to the zero-point energy, D_0 can be corrected to give D_e. The experimentally determined values are D_0 = 34.3(3) kJ mol⁻¹ and D_e = 42.9(8) kJ mol⁻¹ [7].


2.3 Empirical observations about angular geometries in the series B···HX

In the preceding section, it was established that the dimer H2O···HF has a pair of equivalent conformations with a pyramidal arrangement at the oxygen atom (see Figure 2.1). The potential energy V(φ) varies with φ as illustrated in Figure 2.2 and presents only a low barrier to the planar (φ = 0) dimer. The equilibrium angle φ_e = 46(8)° is not far from half the tetrahedral angle (54° 44'). The value of φ_e and the facile inversion implied by Figure 2.2 suggest a very simple model for the equilibrium geometry of H2O···HF, namely one in which the HF molecule lies along the axis of one of the two equivalent non-bonding electron pairs on the oxygen atom (see Figure 2.3). Extension of this simple model to other dimers B···HF is straightforward and leads to equilibrium angular geometries in good agreement with those obtained experimentally. A summary of the angular geometries of a selected group of key dimers B···HF, as determined by rotational spectroscopy, is given in Table 2.1 [8]. These fall into four main groups. The first is constituted by the series B = H2O, 2,5-dihydrofuran, oxetane, oxirane and H2S, in each of which the acceptor atom is conventionally viewed as carrying two equivalent non-bonding electron pairs that do not lie in the molecular plane. Secondly, there is an example of a molecule (H2CO) in which the acceptor atom carries two equivalent n-pairs trigonally disposed and lying in the molecular plane. Thirdly, SO2 is an example of a case where the acceptor atom carries two inequivalent n-pairs. Finally, the group B = ethyne, ethene and cyclopropane contains molecules which carry no n-pairs but for which the π- or pseudo-π electron pairs might fulfil the role of n-pairs when forming dimers B···HX.

The experimental angular geometries of dimers recorded in Table 2.1 can be predicted with the aid of a set of simple rules suggested [8, 9] by the model for H2O···HF discussed above and shown in Figure 2.3. The gas-phase equilibrium geometry of a dimer B···HX can be obtained by assuming that:

1. the axis of the HX molecule coincides with the supposed axis of a non-bonding electron pair as conventionally envisaged, or, if B has no non-bonding electron pairs but has π-bonding electron pairs,

Figure 2.3 Non-bonding pair model for H2O···HF. (Redrawn from [8] with permission from the Royal Society of Chemistry.)

Table 2.1 Experimental and predicted angular geometries of some key dimers B···HF

B                   Experimental angular geometry
H2O                 Pyramidal at O
2,5-Dihydrofuran    Pyramidal at O
Oxetane             Pyramidal at O
Oxirane             Pyramidal at O
H2S                 Pyramidal at S
H2CO                Trigonal at O
SO2                 Trigonal, cis arrangement (a)
Ethyne              T-shaped
Ethene              HF lies along the perpendicular C2 axis of ethene
Cyclopropane        HF lies along the extension of a median of the cyclopropane equilateral triangle

The structural drawings of B, of the experimental geometries and of the geometries predicted by the rules (final column) are not reproduced here.

(a) The rules also predict the arrangement (not shown) with the HF molecule trans to the S=O bond.


2. the axis of the HX molecule intersects the internuclear axis of the atoms forming the π-bond and is perpendicular to the plane of symmetry of the π-bond.

3. Rule (1) is definitive when B has both non-bonding and π-bonding pairs.
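The three rules can be read as a small decision procedure. The sketch below is only an illustration of that reading: the function name and the returned strings are hypothetical, and the rules themselves do not say which of two inequivalent n-pairs is preferred.

```python
def hx_axis_rule(has_nonbonding_pairs: bool, has_pi_pairs: bool) -> str:
    """Return which of the rules fixes the HX axis for an acceptor B."""
    if has_nonbonding_pairs:
        # Rules 1 and 3: a non-bonding pair takes precedence over pi pairs.
        return "rule 1: HX lies along the axis of a non-bonding pair"
    if has_pi_pairs:
        # Rule 2: only pi (or pseudo-pi) pairs are available.
        return ("rule 2: HX intersects the internuclear axis of the pi bond, "
                "perpendicular to the plane of symmetry of the pi bond")
    return "rules not applicable: B has neither non-bonding nor pi pairs"

print(hx_axis_rule(True, False))   # e.g. H2O
print(hx_axis_rule(False, True))   # e.g. ethene
print(hx_axis_rule(True, True))    # e.g. H2CO, where rule 1 is definitive
```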

The angular geometries of the set of key dimers abstracted in Table 2.1 are then readily rationalised. The geometries predicted by application of the rules are given in the final column of Table 2.1. The agreement with experiment in each case is remarkable for so simple a model. We note in particular that the angle COC decreases along the series 2,5-dihydrofuran, oxetane and oxirane and therefore, presumably, the angle between the n-pairs on oxygen should increase. This is borne out in the experimental angular geometries of the corresponding dimers B···HF by the fact that the angle φ increases from 49°, through 58° to 72° along the series [8]. Evidently, the HF molecule can be viewed as probing the direction of a non-bonding electron pair in each case, as required by the rules. Likewise, the observed perpendicular geometry of H2S···HF is reproduced if the n-pairs on sulphur are assumed to occupy sp hybrid orbitals and the S-H bond pairs are formed using pure 3p orbitals. The planar arrangement with an angle COF = 120° predicted by the assumption of sp2 hybridisation at oxygen in formaldehyde (see Table 2.1) is also in good agreement with the experimentally observed geometry.
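The link between the bond angle at the acceptor atom and the direction of its n-pairs can be made semi-quantitative with the standard orthogonal-hybrid relation cos θ = −s/(1 − s), where s is the fractional s character of two equivalent hybrids separated by the angle θ. This relation, and the bookkeeping below (two bond hybrids and two n-pair hybrids sharing a single s orbital), are assumptions added here for illustration and are not part of the rules. The picture also breaks down for bond angles below 90°, where bent bonds must be invoked.

```python
import math

def n_pair_half_angle(bond_angle_deg):
    """Half-angle (deg) between two equivalent n-pair hybrids on an atom
    that also forms two equivalent bond hybrids at bond_angle_deg."""
    c = math.cos(math.radians(bond_angle_deg))
    s_bond = c / (c - 1.0)              # s character of each bond hybrid
    if s_bond < -1e-12:
        raise ValueError("bond angles below 90 deg need bent bonds")
    s_pair = 0.5 - s_bond               # one s orbital shared by four hybrids
    cos_pair = s_pair / (s_pair - 1.0)
    cos_pair = max(-1.0, min(1.0, cos_pair))   # guard against rounding
    return 0.5 * math.degrees(math.acos(cos_pair))

for angle in (109.47, 100.0, 90.0):     # illustrative bond angles only
    print(f"bond angle {angle:6.2f} deg -> phi = {n_pair_half_angle(angle):5.1f} deg")
```

In this simple picture a tetrahedral bond angle gives a half-angle of about 54.7°, and a 90° bond angle (pure p bond orbitals, sp hybrid n-pairs) gives 90°, so the n-pairs open out as the bond angle closes.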

When the n-pairs carried by the acceptor atom are inequivalent, as for the oxygen atoms of SO2, the rules are non-committal. In fact, the observed angular geometry of SO2···HF indicates that the HF molecule lies in the plane of the SO2 molecule and along a direction which almost coincides with the direction of the axis of the n-pair that is cis to the S=O double bond. This may be confirmed by examination of the n-pair model of SO2 shown in the final column of Table 2.1.

The final group of acceptor molecules of interest are those carrying only π-electron pairs. The π-electron density models of ethyne and ethene shown in Table 2.1 lead to the prediction of a T-shaped planar and a perpendicular geometry, respectively, for the dimers with, e.g. HCl. The agreement with experiment is again good. The Coulson-Moffitt [10] model of cyclopropane assumes sp3 hybridisation of the carbon atoms. If C-C bonds in cyclopropane are to be formed by overlap of the sp3 hybrid orbitals on adjacent carbon atoms, the bonds will be bent, i.e. the internuclear line and the line of greatest electron density do not coincide. The result is a pseudo-π type bond (see Table 2.1) which is like the π-bond in ethene and which accounts for the unsaturated behaviour of cyclopropane. The rules then predict that HCl should lie along the extension of a median of the cyclopropane equilateral triangle, as observed.

The angular geometries of a very large number of hydrogen-bonded dimers B··· HX are now available and almost all of these are in accord with the above rules [8].


2.4 An electrostatic model for the hydrogen bond interaction: the Buckingham-Fowler model


The experimental developments that led to the empirical rules for angular geometries, as outlined above, were accompanied by much theoretical activity. In the sense that the rules rely on the HX molecule seeking the direction of greatest electron density and on the notion that one sub-unit does not perturb the other, the rules can be viewed as electrostatic in origin, i.e. the interaction between B and HX is that which minimises the simple electrostatic energy of the system. An important result of the theoretical activity has been the recognition that the electrostatic contribution dominates the angular dependence of the molecular interaction [11, 12]. Of particular importance for the chemical understanding of angular geometries of weakly bound dimers is the model introduced by Buckingham and Fowler [12] which takes advantage of a method of representing the electric charge distribution of a molecule that satisfies chemical intuition. At the same time, the model is kept computationally simple by replacing continuous charge densities by point multipoles.

In the Buckingham-Fowler model, each monomer electric charge distribution is described by a set of point multipoles (charges, dipoles and quadrupoles) located on the atoms and, sometimes, additionally at bond midpoints. The values of the point multipoles are determined by the so-called distributed multipole analysis (DMA) of an ab initio wavefunction. This multicentric representation of the charge distribution shows superior convergence behaviour to the one-centre molecular multipoles when calculating the electrostatic potential around a molecule.

The second important contribution to the Buckingham-Fowler model is concerned with the choice of short-range repulsive potential which governs the repulsive force between two molecules sufficiently close together and which therefore defines the shape and size of the molecules. Buckingham and Fowler use hard spheres with van der Waals radii placed on atomic centres in each molecule. The procedure in calculating the electrostatic energy of the two molecules as a function of relative orientation is then to place the monomers in van der Waals contact and allow one molecule to roll over the other until a minimum in the interaction energy is achieved. This is repeated for all possible contacts to find the global minimum and thereby establish the angular geometry of the most stable form of the dimer.
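The rolling procedure can be sketched in a few lines if the acceptor is reduced to point charges on hard spheres and the probe molecule to a single charged hard sphere. The example below is a toy version under loud assumptions: the atom-centred charges and coordinates of H2O are those quoted later in Table 2.2 (with the oxygen z coordinate read as 0.065 Å), the radii are illustrative values chosen here rather than those used by Buckingham and Fowler, the probe is a bare charge of +0.54 e standing in for the H atom of HF, and only one plane of approach is scanned. It is not the Buckingham-Fowler model itself, which uses distributed multipoles and a full orientational search.

```python
import math

K_KJ_MOL = 1389.35      # e^2/(4*pi*eps0) in kJ mol^-1 Angstrom
E_CHARGE = 1.602177     # elementary charge in units of 1e-19 C

# (position / Angstrom, charge / e, hard-sphere radius / Angstrom); radii assumed
ATOMS = [
    ((0.0, 0.0, 0.065), -1.285 / E_CHARGE, 1.4),     # O
    ((0.758, 0.0, -0.522), 0.642 / E_CHARGE, 1.0),   # H
    ((-0.758, 0.0, -0.522), 0.642 / E_CHARGE, 1.0),  # H
]
PROBE_CHARGE = 0.54     # e, stands in for H of HF
PROBE_RADIUS = 0.3      # Angstrom, assumed

def contact_point(direction, origin):
    """Probe centre where a ray from origin first clears every hard sphere."""
    t_contact = 0.0
    for pos, _, radius in ATOMS:
        rel = [p - o for p, o in zip(pos, origin)]
        b = sum(d * r for d, r in zip(direction, rel))
        disc = b * b - sum(r * r for r in rel) + (radius + PROBE_RADIUS) ** 2
        if disc >= 0.0:                       # the ray meets this sphere
            t_contact = max(t_contact, b + math.sqrt(disc))
    return [o + t_contact * d for o, d in zip(origin, direction)]

def energy(point):
    """Coulomb energy (kJ/mol) of the probe charge at point."""
    return sum(K_KJ_MOL * q * PROBE_CHARGE / math.dist(point, pos)
               for pos, q, _ in ATOMS)

def roll(step_deg=1.0):
    """Roll the probe over the molecule in the yz plane; return (phi, energy)."""
    origin = ATOMS[0][0]
    best = None
    phi = -90.0
    while phi <= 90.0:
        d = (0.0, math.sin(math.radians(phi)), math.cos(math.radians(phi)))
        e = energy(contact_point(d, origin))
        if best is None or e < best[1]:
            best = (phi, e)
        phi += step_deg
    return best

phi_min, e_min = roll()
print(f"lowest contact energy {e_min:.1f} kJ/mol at phi = {phi_min:.0f} deg")
```

With atom-centred charges alone the scan settles on the planar arrangement (φ = 0), which illustrates why higher multipoles, or some other description of the charge anisotropy at oxygen, are needed before the pyramidal geometry can emerge.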

The Buckingham-Fowler model, as described above, has been especially successful in predicting the quantitative angular geometries of a wide range of hydrogen-bonded (and other weakly bound) dimers. Figure 2.4 gives the model geometries [12] for a selection of complexes B···HF which can be compared with the observed counterparts as given in Table 2.1. The angles O···H-F in the B···HF of Table 2.1 are shown as 180° because of the difficulty of locating the H atom by rotational spectroscopy. On the other hand, the Buckingham-Fowler model finds small deviations of this system

Figure 2.4 Angular geometries predicted for the dimers H2O···HF, H2CO···HF, H2S···HF and ethene···HF by the Buckingham-Fowler electrostatic model.

from linearity. The reasons why the electrostatic model is so successful have been discussed by Hurst et al. [13] who examined the variation of the different contributions (electrostatic, polarisation, charge transfer and exchange repulsion) to the interaction energy and found that all but the first of these are not strongly dependent on angle. Moreover, the small angular dependencies shown by exchange repulsion and charge transfer approximately cancel.

2.5 The electrostatic model and non-bonding electron pairs

It has been established in the two preceding sections that the empirical rules (which rely on the conventional view of non-bonding electron pairs and the notion of a simple non-perturbing electrostatic interaction of B and HX) and the Buckingham-Fowler electrostatic model both give a successful account of observed angular geometries of B···HF. On the other hand, when the electronic charge distribution near to, for example, the oxygen atom in H2O is examined there is no evidence for other than tiny deviations from a hemispherical distribution in the vicinity normally associated with non-bonding electron pairs (see Figure 2.5 which shows, for example, the electron density contour diagram for H2O resulting from a recent ab initio SCF calculation [14]).

A question which then arises naturally is: why are the rules successful? This can be answered (following the approach of a recent review [8]) by examination of the angular variation of electrostatic potential about the appropriate atom in an acceptor molecule B in B···HF. The reason for this approach is evident from the fact that the electric charge distribution of the HF molecule

Figure 2.5 Electron density contour diagrams for H2O (a) in the molecular plane and (b) in the perpendicular plane from the ab initio SCF calculation given in [14]. (Redrawn from [14] with permission from the American Chemical Society.)

can be represented in acceptable approximation by a very simple model in which a charge of +0.54 e is placed on H and a charge of −0.54 e is situated on F [12]. In one further degree of approximation, the charge on F can be ignored since F will be further from B than will H in B···HF [8]. We then seek the angular variation of the electrostatic potential energy of a non-perturbing point charge at the appropriate distance [r(B···HX)] from B. In what follows, we use for convenience a non-perturbing charge of magnitude e instead of 0.54 e. The effect of changing from e to 0.54 e and including the charge on F (which then makes HF an extended electric dipole) is discussed later.

The approach outlined above can be illustrated through its application to H2O···HF by plotting the electrostatic potential energy V(φ) of the point charge +e at a fixed distance r = 1.74 Å from the oxygen atom as a function of the angle φ, where r is the experimental distance from O to H in H2O···HF. Such a plot is shown in Figure 2.6 where the angle φ is defined again. The quantity V(φ) has been accurately calculated by using the DMA of H2O given by Buckingham and Fowler [12] together with the electrostatic formalism set out by Buckingham [15]. We note that the curve is of the double-minimum type (φ_min = +30° and φ_min = −30°) with a potential energy barrier to the φ = 0 position (point charge in the plane) of only 0.8 kJ mol⁻¹. This type of diagram clearly and simply reveals the directionality of the electrostatic potential. Moreover, the similarity of Figure 2.6 to Figure 2.2 is striking and indicates that, for H2O···HF, the zeroth approximation in which HF is considered as merely a point positive charge is not too unsatisfactory.

We now examine why the angle between the minima in Figure 2.6 is 2 × ~30° instead of the 2 × ~54° expected on the basis of the non-bonding pair model of H2O. The reason is that the electrostatic potential at any point is determined not only by the non-bonding pair in question but also by the resultant partial positive charge on the protons of H2O and the negative

Figure 2.6 Electrostatic potential energy V(φ) of a point charge +e at r = 1.74 Å from oxygen in H2O. (Redrawn from [8] with permission from the Royal Society of Chemistry.)

charge of the other non-bonding pair. The partial positive charge always acts to decrease the potential of the point positive charge for a given angle φ and becomes more effective as φ increases. On the other hand, the effect of the other non-bonding pair is that V(φ) changes more slowly than expected as φ is reduced from φ_min to zero. Hence, φ_min will always be less than that expected on the basis of an isolated non-bonding pair. This effect is more serious in H2O than in any other example that we consider because, first, the angle between the non-bonding pairs is smaller than in the other cases and, secondly, the O-H bond is short and strongly polar. As we shall see, for example in H2S, where 2φ_min is large and the S-H bond is longer and less polar, φ_min is much closer to the expected value.

An analogous calculation of V(φ) as a function of φ has been performed for H2S but for several different values of r [8]. The results are shown in the composite diagram in Figure 2.7. While the height of the potential energy barrier between φ = 0 and φ_min is sensitive to r, the value of φ_min changes only slowly with r when varied from r = 2 Å, through r = 2.33 Å (the observed S···H distance in H2S···HF), to r = 4.24 Å, that is from φ_min = 80° to 65°. The

Figure 2.7 Electrostatic potential energy V(φ) of a point charge +e at various distances r from sulphur in H2S. (Redrawn from [8] with permission from the Royal Society of Chemistry.)

value φ_min = 80° for r = 2.33 Å should be compared with the experimental angle φ_min = 89° for H2S···HF (see Table 2.1) and with the angle (90°) expected on the basis of the rules and the familiar hybridisation model of H2S. The relative insensitivity of 2φ_min and the barrier height to r for H2S (in contrast to the case of H2O) arises because of the large angle between the non-bonding pairs on S; as a result, the effect of each on a point positive charge is essentially isolated.

Formaldehyde is another case in which the non-bonding pairs are expected to be separated by a large angle (~120°). The corresponding diagram of V(θ) for formaldehyde is given in Figure 2.8 for r = 1.79 Å (the experimental r(O···H) in H2CO···HF), where θ is defined as shown. The point charge at a distance r from O is confined to the plane of the formaldehyde molecule. We find again that V(θ) has a double minimum and that the angle 2θ_min is 80°. We would predict from the usual non-bonding pair model of formaldehyde a value of 2θ_min = 120°; cf. the experimental angle C=O···F in H2CO···HF of 110°.

A similar approach for SO2 leads to the potential energy curve V(θ) shown in Figure 2.9 for r = 1.89 Å (the experimental r(O···H) in SO2···HF). The two

Figure 2.8 Electrostatic potential energy V(θ) of a point charge +e at r = 1.79 Å from oxygen in the plane of H2CO. (Redrawn from [8] with permission from the Royal Society of Chemistry.)

inequivalent minima fall at 150° and 230°. The simple non-bonding pair model of SO2 has the axes of the pairs occurring at 120° and 240° while the experimental result for SO2···HF corresponds to θ = 215°.

The few representative examples of n-pair acceptors discussed above (H2O, H2S, H2CO and SO2) make it clear that the variation of the electrostatic potential at a fixed r with the angle φ or θ is a good semi-quantitative method of establishing the existence and directionality of non-bonding electron pairs in acceptor molecules B.

Finally, a similar approach is possible for π-bonding acceptors such as acetylene and ethylene. Appropriate graphs of V(θ) versus θ show a single minimum at the expected angle θ = 90°.

As discussed earlier, the next order of approximation is to consider not only the point positive charge on H in HF but also the point negative charge on F. Then HF is treated as an extended electric dipole, and we need to calculate the electrostatic potential energy of this dipole in the electric field due to B. In practice, we do this by evaluating the potential energy of two point charges +0.54 e and −0.54 e and adding the two contributions to give V(φ) at a given angle φ and fixed distance r. For H2O···HF, the result is shown in Figure 2.10 [8]. We note that the potential energy minimum for the system is now at φ_min = 55°, which is within experimental error of the observed value φ_min = 46(8)°. A similar approach for H2S···HF leads to φ_min = 85°

Figure 2.9 Electrostatic potential energy V(θ) of a point charge +e at a distance of r = 1.89 Å from one of the oxygen atoms and in the plane of SO2. (Redrawn from [8] with permission from the Royal Society of Chemistry.)

and for H2CO···HF to θ_min = 130°. Both results are in good agreement with experiment and with the predictions of the zeroth-order model.
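In symbols, and assuming only that HF is represented by the two point charges quoted above, the extended-dipole energy can be written in terms of the electrostatic potential of the unperturbed acceptor. The symbols Φ_B, r_H and r_F (the potential due to B and the positions of the H and F nuclei at a given angle) are notation introduced for this sketch.

```latex
V(\phi) = V_{\mathrm{H}}(\phi) + V_{\mathrm{F}}(\phi)
        = q_{\mathrm{H}}\,\Phi_{\mathrm{B}}\bigl(\mathbf{r}_{\mathrm{H}}(\phi)\bigr)
        + q_{\mathrm{F}}\,\Phi_{\mathrm{B}}\bigl(\mathbf{r}_{\mathrm{F}}(\phi)\bigr),
\qquad q_{\mathrm{H}} = -q_{\mathrm{F}} = 0.54\,e .
```

The zeroth-order model keeps only the first term and, for convenience, replaces 0.54 e by e.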

The discussion in this section shows that, to a useful approximation, the angular geometries of B···HF are determined by the angles at which the minima in the electrostatic potential energy of a non-perturbing point positive charge are found. An even better approximation involves a similar approach but finds the minimum potential of HF taken as an extended electric dipole. Of course, if the full electric charge distribution of HF is used, the Buckingham-Fowler model is recovered [12]. Given that the electrostatic potential around B reflects the existence and disposition of n-pairs, it seems reasonable to seek an even simpler model for B which reproduces the electrostatic potential in its vicinity. Ideally, the representation of the full charge distribution of B should be simplified so that it leads to physical insight and facilitates the computation of electrostatic energies and therefore the prediction of angular geometries when even quite large components interact.

2.6 A point-charge representation of non-bonding electron pairs

In this section, we develop simple point-charge models for representing n-pairs on the acceptor atom in each of three molecules B = H2O, H2S and H2CO. We show that, when taken with the extended electric dipole model of HF, the

Figure 2.10 Electrostatic potential energy V_H(φ) and V_F(φ) of charges +0.54 e and −0.54 e, respectively, at the experimental distances of the H and F nuclei of HF in H2O···HF. See Figure 2.1 for definition of φ. The curve V_H(φ) + V_F(φ) is the potential energy of the HF extended electric dipole as a function of φ. (Redrawn from [16] with permission from the Canadian Journal of Chemistry.)

simple point-charge model of B allows a description of the angular variation of the electrostatic potential energy of interaction between B and HF that is in good agreement with the full Buckingham-Fowler result and with experiment.

Several electrostatic descriptions of the interaction between molecules B and, for example, HF have been discussed [8]. The most important contribution to the electrostatic energy results from the interaction between the distributed charges (assigned by the model to B) and those of HF. Thus, the simplest possible model consists of point charges placed at atom centres and implies a spherical charge distribution about each nucleus. Although this model is quite good for reproducing the electrostatic interaction energy, it fails completely to predict the correct angular geometries for dimers such as H2O···HF, H2CO···HF and H2S···HF. One way the correct geometries can be obtained is, for example, by placing in addition point dipoles and quadrupoles on the atoms according to the distributed multipole analysis (DMA).

Figure 2.11 Definitions of r, δ and α used to define point-charge models of H2O, H2CO and H2S. (Redrawn from [16] with permission from the Canadian Journal of Chemistry.)

This approach forms the basis of the Buckingham-Fowler method [12] discussed above. An alternative model [16], which is set out here, begins with the point charges on the atoms but accounts for the deviations from sphericity about atom centres by translating small fractions of the point charges from the acceptor atom centre of B along directions conventionally associated with non-bonding pairs.

The procedure for constructing the model in the case of H2O, for example, is to begin with the point charges at the atom centres appropriate to the DMA given in [12], although these could be generated in other ways, such as by use of the molecular electric dipole moment. Small fractional point charges δq are then removed from the DMA charge on the acceptor atom (e.g. O in H2O) and translated through a distance r along the directions α (see Figure 2.11) normally associated with the axes of non-bonding electron pairs in, for example, the Gillespie-Nyholm VSEPR model [17]. The point charges q_i appropriate to the DMAs of H2O and HF are recorded in Table 2.2 [12]. The potential energy V(φ) = V_H(φ) + V_F(φ) of the HF molecule is then calculated at the experimental distance r(O···F) = 2.662 Å for the range of angle φ = 0 to 90° (see Figure 2.1). This procedure is repeated for various sets of values δq, r, α.

The set of δq, r, α chosen is then that which best reproduces the potential energy barrier obtained when the full DMA of B and the extended dipole model of HF are used, as displayed in Figure 2.10 and discussed in section 2.5.
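The procedure can be followed numerically with a short script. The sketch below evaluates V(φ) = V_H(φ) + V_F(φ) for H2O···HF from Coulomb's law, using the atom-centred charges, δq, r and α listed for H2O in Table 2.2 and charges of ±0.54 e on HF. Several details are assumptions made for this illustration: the oxygen z coordinate is read as 0.065 Å (which gives a sensible O-H distance), the H atom of HF is placed 1.74 Å and the F atom 2.662 Å from O on a common line, φ is scanned in the plane that contains the C2 axis and is perpendicular to the molecular plane, and all function names are hypothetical.

```python
import math

COULOMB = 541.243               # kJ/mol for two charges of 1e-19 C held 1 Angstrom apart
Q_HF = 0.54 * 1.602177          # |charge| on H and on F of HF, in 1e-19 C
O_POS = (0.0, 0.0, 0.065)       # oxygen position in Angstrom (z value assumed)

def water_charges(dq, r, alpha_deg):
    """Point-charge model of H2O: list of (position / Angstrom, charge / 1e-19 C).

    dq is removed from oxygen twice and placed at distance r along the two
    n-pair directions, at +/- alpha from the C2 axis in the yz plane.
    """
    a = math.radians(alpha_deg)
    sites = [
        (O_POS, -1.285 + 2.0 * dq),
        ((0.758, 0.0, -0.522), 0.642),
        ((-0.758, 0.0, -0.522), 0.642),
    ]
    if dq:
        for sign in (1.0, -1.0):
            pos = (0.0, sign * r * math.sin(a), O_POS[2] + r * math.cos(a))
            sites.append((pos, -dq))
    return sites

def v_total(phi_deg, sites, r_h=1.74, r_f=2.662):
    """V_H + V_F (kJ/mol) for HF approaching O at angle phi to the C2 axis."""
    p = math.radians(phi_deg)
    u = (0.0, math.sin(p), math.cos(p))
    h = tuple(o + r_h * c for o, c in zip(O_POS, u))
    f = tuple(o + r_f * c for o, c in zip(O_POS, u))
    return sum(COULOMB * q * Q_HF * (1.0 / math.dist(h, pos) - 1.0 / math.dist(f, pos))
               for pos, q in sites)

def scan(sites):
    """Return (phi_min / deg, barrier / kJ mol^-1) on a 1-degree grid, 0 to 90."""
    energies = [v_total(phi, sites) for phi in range(0, 91)]
    phi_min = min(range(0, 91), key=energies.__getitem__)
    return phi_min, energies[0] - energies[phi_min]

print("atom-centred charges only:", scan(water_charges(0.0, 1.0, 54.0)))
print("with n-pair charges      :", scan(water_charges(0.061, 1.0, 54.0)))
```

Under these assumptions the atom-centred charges alone give a single minimum at the planar geometry, whereas adding the two small n-pair charges produces a double minimum with a barrier of a few kJ mol⁻¹, close to the point-charge entries for H2O in Table 2.2. Repeating the scan over a grid of δq, r and α values gives a simple way to explore how the barrier responds to each parameter.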

It is found [16] that physically reasonable solutions are generated for only a limited range of values of δq and r in the region of 0.04 e and 1 Å, respectively. Indeed, for r ~ 0.5 Å, the double minimum apparent in the full calculation

Table 2.2 Distributed point charges and their coordinates for H2O, H2CO and H2S; comparison of barrier heights in B···HF for point-charge model of B with values calculated using full DMA for B

B in B···HF             H2O                H2CO                        H2S
Atom                    O        H         O        C        H         S        H
x/Å                     0        0.758     0        0        0.935     0        0.957
z/Å                     0.651    -0.522    0.601    -0.602   -1.181    0.055    -0.866
10^19 q/C               -1.285   0.642     -0.868   1.088    0.109     0.053    -0.026
r/Å                     1.0                1.0                         1.75
10^19 δq/C              0.061              0.080                       0.040
α/deg                   54                 60                          90
h/kJ mol⁻¹
  Point charge (a)      3.7                6.5                         18.9
  Full DMA (a)          3.6                5.9                         19.4
φ_min or θ_min/deg
  Point charge          45                 50                          90
  Full DMA              55                 65                          80

(a) The values quoted for h in Ref. [16] were too large by a factor of 1/0.54 as a result of an oversight which assigned charges of +e and −e to H and F, respectively, in HF instead of +0.54 e and −0.54 e, respectively.

(Figure 2.10) is absent. On the other hand, when r is in the region of 1.5 Å, the barrier height becomes very large for even very small δq. Physical acceptability therefore constrains r to be in the region of 1 Å, which we note is similar to the O-H bond length in H2O. The important conclusion is that δq is found to be a relatively small fraction of the electronic charge. The results [16] for δq and α when r = 1 Å that give the same barrier height and θ_min as are obtained when using the full DMA of B are given in Table 2.2.

Similar analyses have been executed [16] for H2CO···HF and H2S···HF (see Figure 2.11 for definitions) at the appropriate experimental distances with the results shown in Table 2.2. The angles α and θ for H2CO···HF are defined in a similar way but lie in the molecular plane rather than the perpendicular plane. Again reasonable solutions are obtained only when δq ≈ 0.04 e and r ≈ 1 Å for H2CO···HF, while for H2S···HF longer distances r of δq from the S atom are required. This is not inconsistent with the notion that the H2S molecule is generally more extensive than the H2O molecule. Again, for both H2S and H2CO, δq is found to be only a few percent of the full electronic charge.

The main advantages of this model are its easy visualisation and its obvious affinity with the valence-shell-electron pair repulsion model. The difference of this approach from many earlier methods of representing non-bonding pairs by point charges is that it needs only relatively tiny charges (e.g. a few hundredths of an electronic charge) at reasonable distances (r ~ 1 Å). These small charges are generated naturally from the condition that the 'observed' barrier


heights (i.e. those from the full DMA calculation or from experiment in the case of H2O···HF and H2CO···HF) must be reproduced. Another advantage of the model is that it is in accord with the nearly spherical nature of the electronic charge distribution in H2O (see Figure 2.5) in which there are only very small deviations from hemisphericity at oxygen. Moreover, such a model is very easy to use in situations where large clusters are being discussed, for the electrostatic calculations involve only point charges rather than higher poles. Finally, the model is consistent with the ideas recently advanced by Bader et al. [18] who show that a small localisation of electron density along the non-bonding pair directions in the valence-shell-electron pair repulsion model results from the ligand field operating in concert with the Pauli exclusion principle.

The point-charge models allow the development of a qualitative understanding of the factors underlying molecular recognition by a single hydrogen bond in a dimer. It is convenient to begin by taking H2O···HF as an example and to consider first the electrostatic energy V_H(φ) of the H atom of HF in the potential that results from the point-charge model of H2O. The angular dependence of the interaction energy is very similar to that shown in Figure 2.10 but its origin can now be viewed in terms of the point-charge model for H2O. It will be recognised straight away that the dominant term in V_H(φ) that largely determines the sharply rising walls of the well is the repulsion between the H atom of HF and the H atoms of H2O. This by itself would provide a single minimum well whose depth is determined mainly by attractive interaction between the point charges on the O atom and the H atom of HF. The remaining term contributing to V_H(φ) arises from the small charge concentrations in the non-bonding pair directions and, although this term is small, it leads to the important double minimum in the base of the well and to considerable broadening of the well width. By themselves, the two point charges δq would lead to a function with a maximum at φ = 0°, minima at φ = 54° and a barrier height of 13 kJ mol⁻¹. The effect of combining this with the repulsive part of the potential is to reduce the barrier height to 1 kJ mol⁻¹ and to move the minima to a smaller angle. Finally, the effect of including the F atom in these considerations in order to obtain V(φ) leads, through attraction of the F atom for the H atoms of H2O, to a reduction in the rate of rise of the repulsive part of the potential and to some consequent increase in barrier height.

A similar interpretation can be given for the in-plane double-minimum potential function for H2CO···HF. In this case, the walls of the repulsive potential rise less steeply and because the bottom of the well is flatter, the effect of the point charges δq, which are similar in magnitude to those for H2O, is more pronounced and leads to a larger barrier height. Again, and for the same reason, the minima occur at a smaller angle than the angle subtended by δq at O.

The conclusions reached from the above discussion can readily be summarised. For ethereal oxygen, pyramidal geometries are favoured by: (1) large δq; and (2) a flat-bottomed repulsive potential. For H2O···HF, the repulsive potential rises sufficiently rapidly that although the equilibrium geometry is pyramidal the barrier is so low that the effective zero-point geometry is planar. On the other hand, (CH2)2O···HF evidently satisfies the conditions well, for it has recently been shown [19] that the barrier is high and the geometry at O is for all practical purposes pyramidal. Analogous conditions favour bent geometries at carbonyl oxygen over those with a linear C=O···H-X geometry. By contrast, for hydrogen-bonded dimers formed at a single non-bonding pair on nitrogen, e.g. HCN···HF, CH3CN···HF, the preference for an axially symmetric geometry will be greatest when: (1) δq is large; and (2) the repulsive potential rises steeply.

2.7 Isomerism in weakly bound dimers: incipient molecular recognition

So far we have discussed only hydrogen-bonded dimers and in particular the angular geometry of the isomer that corresponds to the lowest energy of interaction. This is because the spectroscopic techniques used to identify and characterise dimers B···HX operate at low temperature and favour the lowest energy form. In fact, any molecule will have one or more regions that are relatively nucleophilic and, conversely, regions that are relatively electrophilic. As a result, two or more interaction sites can be available to the two molecules involved in dimer formation. The possibility of isomerism in a weakly bound dimer then exists, one isomer corresponding to the interaction of a nucleophilic region on B and an electrophilic region on, e.g., HX, with the other corresponding to the reverse type of interaction. A convenient example of such isomerism has been identified experimentally in a mixture of ethyne and HCN. Initially, the T-shaped isomer, in which the proton of HCN acts as the electrophile and the π-bond of ethyne serves as the nucleophile (see Figure 2.12), was identified through its rotational spectrum [20]. Recently, however, another (higher energy), linear isomer, in which the n-pair of HCN is the nucleophile and a proton of ethyne is the electrophile (see Figure 2.12), has been characterised through high-resolution vibrational spectroscopy [21]. This observation has important implications for a rudimentary form of molecular recognition, i.e. is it possible to predict from the properties of the individual molecules which isomer is favoured energetically?

[Figure 2.12: the T-shaped isomer, with the H of HCN directed at the C≡C bond of ethyne, and the linear isomer HCN···HC≡CH.]

Figure 2.12 Two observed isomers of the (ethyne, HCN) dimer.

Ideally the above question could be answered by estimating the interaction energy of a pair of molecules from, for example, electrophilicities and nucleophilicities assigned to different parts of each of the molecules. Unfortunately, too few binding energies are known experimentally to allow progress in this direction. However, the other measure of strength of binding is more promising. The intermolecular stretching force constant $k_\sigma$ is a measure of the energy required for a unit infinitesimal extension of the weak bond and this quantity is now available for a wide range of dimers B···HX from the effect of centrifugal distortion in stretching the intermolecular bond, which is readily observable in the rotational spectrum [6,22]. The approach used in these two references is followed closely in the discussion given below.

Collected in Table 2.3 are observed $k_\sigma$ values for a number of dimers B···HX. In a column of Table 2.3, X is fixed and B varies, while in a row X varies and B is fixed. A simple relationship exists among the values along rows and down columns. Clearly, $k_\sigma$(B···HF)/$k_\sigma$(B···HCl) has the nearly constant value of approximately 2 for all B. Moreover, $k_\sigma$(B···HF)/$k_\sigma$(B···HCN) and $k_\sigma$(B···HF)/$k_\sigma$(B···HBr) are each approximately constant and independent of B. This observation suggests that a nucleophilicity N can be assigned to the acceptor region in each B and an electrophilicity E to the proton region in each HX. Because the quantities are derived by considering weakly bound dimers in the gas phase, we refer to N and E as limiting, gas-phase nucleophilicities and electrophilicities, respectively. The strength of the hydrogen bond, as measured by $k_\sigma$, is then given by

$$k_\sigma = cEN \tag{2.1}$$

where c is a constant of proportionality. If the arbitrary assignments N = 10 for H₂O and E = 10 for HF are made, the value c = 0.25 N m⁻¹ results [6]. The next step is to establish the N values for all other B from the $k_\sigma$ of the series B···HF and E = 10 for HF. Finally, the N values for this set of B are used with the $k_\sigma$ values to obtain E for HCl, HCN, HBr, HCCH and HCF₃. The set of N and E values thereby generated is shown in Table 2.4. The $k_\sigma$ calculated from eqn. (2.1) using the appropriate E and N of Table 2.4 is shown for each B···HX in parentheses in Table 2.3. The agreement of the calculated and observed values is remarkably good given the simplicity of eqn. (2.1). Hence, the values of N and E can be used reliably to predict $k_\sigma$ values where these have not been determined. In particular, they can be used to explore the relative binding strengths of the structural isomers possible for a pair of monomers B and HX.

Table 2.3 Observed and calculated intermolecular stretching force constants $k_\sigma$ (N m⁻¹) for dimers B···HX ᵃ

                                       HX
B          HF            HCl           HCN           HBr           HC≡CH       HCF₃
N₂         5.5 (5.5)     2.5 (2.8)     2.3 (2.3)     - (2.3)       b           b
CO         8.5 (8.5)     3.9 (4.2)     3.3 (3.6)     3.0 (3.6)     - (2.0)     b
PH₃        10.9 (11.0)   5.9 (5.5)     4.3 (4.7)     5.0 (4.6)     - (2.6)     - (2.1)
H₂C=CH₂    - (11.8)      5.9 (5.9)     4.5 (5.0)     - (4.9)       - (2.8)     - (2.2)
HC≡CH      - (12.8)      6.4 (6.4)     5.2 (5.4)     - (5.4)       - (3.1)     - (2.4)
H₂S        12.0 (12.0)   6.8 (6.0)     4.7 (5.1)     5.9 (5.0)     - (2.9)     - (2.3)
(CH₂)₃     - (16.0)      8.0 (8.0)     6.3 (6.8)     - (6.7)       - (3.8)     - (3.0)
HCN        18.2 (18.3)   9.1 (9.1)     8.1 (7.8)     7.3 (7.7)     - (4.4)     3.5 (3.5)
CH₃CN      20.1 (20.3)   10.7 (10.1)   9.8 (8.6)     - (8.5)       4.7 (4.9)   - (3.9)
H₂O        24.9 (25.0)   12.5 (12.5)   11.1 (10.6)   - (10.5)      6.5 (6.0)   - (4.8)
NH₃        - (28.8)      18.2 (14.4)   12.2 (12.2)   13.1 (12.1)   7.0 (6.9)   - (5.5)

ᵃ See [6,22] for origin of experimental values.
ᵇ Predicted values of $k_\sigma$ < 2.0 N m⁻¹ are considered to be too small to be reliable.
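The use of eqn. (2.1) can be illustrated with a short numerical sketch. The Python fragment below is not part of the treatment in [6]; it simply evaluates $k_\sigma$ = cEN with c = 0.25 N m⁻¹ and a subset of the N and E values of Table 2.4, and inverts the relation in the way described above to recover N or E from an observed force constant. The dictionary and function names are illustrative assumptions.

```python
# Minimal sketch of eqn. (2.1), k_sigma = c * E * N.
# Names are hypothetical; numerical values are those quoted in Table 2.4.

C = 0.25  # N m^-1, constant of proportionality fixed by N(H2O) = E(HF) = 10

# Limiting gas-phase nucleophilicities N of some acceptors B
N = {"N2": 2.2, "CO": 3.4, "PH3": 4.4, "H2S": 4.8,
     "HCN": 7.3, "CH3CN": 8.1, "H2O": 10.0, "NH3": 11.5}

# Limiting gas-phase electrophilicities E of the donors HX
E = {"HF": 10.0, "HCl": 5.0, "HCN": 4.25, "HBr": 4.2,
     "HCCH": 2.4, "HCF3": 1.9}


def k_sigma(b, hx):
    """Predicted stretching force constant (N m^-1) of the dimer B...HX."""
    return C * E[hx] * N[b]


def nucleophilicity_from_hf(k_obs_hf):
    """N of an acceptor B from the observed k_sigma of B...HF (E(HF) = 10)."""
    return k_obs_hf / (C * E["HF"])


def electrophilicity_from_k(k_obs, b):
    """E of a donor HX from the observed k_sigma of B...HX and a known N(B)."""
    return k_obs / (C * N[b])
```

With these definitions, `k_sigma("H2O", "HF")` returns 25.0 N m⁻¹, the parenthesised entry for H₂O···HF in Table 2.3, and `nucleophilicity_from_hf(18.2)` gives a value close to the N = 7.3 listed for HCN.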

Table 2.4 Values of nucleophilicities N and electrophilicities E of B and HX

Molecule B    N ᵃ       Molecule HX    E
N₂            2.2       HF             10.0
CO            3.4       HCl            5.0
PH₃           4.4       HCN            4.25
H₂C=CH₂       4.7       HBr            4.2
H₂S           4.8       HC≡CH          2.4
HC≡CH         5.1       HCF₃           1.9
(CH₂)₃        6.4
HCN           7.3
CH₃CN         8.1
H₂O           10.0
NH₃           11.5

ᵃ See [6] for values of N for B = H₂CO, (CH₃)₃P, (CN)₂, etc.

A compact way of discussing relative binding strengths for structural isomers is demonstrated in Table 2.5, where $k_\sigma$ values calculated from eqn. (2.1) are set out. The labels for the rows consist of the molecules B (with the N values in parentheses) listed in order of decreasing N. The labels for the columns are the same molecules but viewed now as the donor HX and taken in the same order. The quantities in parentheses in this case are then the E values of these electrophiles. Thus the first row of $k_\sigma$ entries gives values for fixed B, namely H₂O, while HX varies from H₂O through to HCl. The entries are calculated using the appropriate N and E values in eqn. (2.1). Values for isomers that have been characterised spectroscopically are presented in italics. We note that the observed isomers preponderate on and above the diagonal. The predicted $k_\sigma$ values for the two isomers of the (ethyne, HCN) dimer set out in Figure 2.12 are 5.4 N m⁻¹ and 4.4 N m⁻¹ when ethyne is the proton acceptor and donor, respectively. The corresponding observed values are 5.2 N m⁻¹ [20] and 4.6 N m⁻¹ [21], the latter value having been observed only some time after the above value of 4.4 N m⁻¹ had been originally predicted [6]. The N value for HF in Table 2.5 was obtained from $k_\sigma$ for HF···HF, while that of HCl was correspondingly derived from $k_\sigma$ for HCl···HCl. We note that the two isomers HF···HCl and HCl···HF are predicted to have $k_\sigma$ = 6.0 and 7.8 N m⁻¹, respectively, while the observed values are 5.7 N m⁻¹ and 7.7 N m⁻¹, respectively [23].

Table 2.5 Matrix of calculated $k_\sigma$ values for dimers HY···HX and HX···HY displaying geometrical isomerism ᵃ

                                    HX (E)
HY (N)         H₂O (5.0)   HCN (4.25)   HCCH (2.4)   HF (10.0)   HCl (5.0)
H₂O (10.0)     12.5        10.6         6.0          25.0        12.5
HCN (7.3)      9.1         7.8          4.4          18.3        9.1
HCCH (5.1)     6.4         5.4          3.1          12.8        6.4
HF (4.8)       6.0         5.1          2.9          12.0        6.0
HCl (3.1)      3.9         3.3          1.9          7.8         3.9

ᵃ Numbers in italics refer to calculated $k_\sigma$ for isomers that have been observed.

The success with the prediction of $k_\sigma$ for the pair of isomers of (HCCH, HCN) and the pair of isomers of (HF, HCl) gives some confidence that the most strongly bound isomer (as measured by $k_\sigma$) is the one having the larger product EN.
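The comparison of the two products EN for a pair of molecules can be sketched in a few lines of Python. This is only an illustration of the argument above, using the N and E values quoted in Table 2.5; the names `SITES`, `isomer_force_constants` and `favoured_isomer` are hypothetical, and the labels follow the convention acceptor...donor.

```python
# Sketch: compare the two isomers of a dimer through k_sigma = c * E * N.
# (N, E) pairs are the values given in parentheses in Table 2.5.

C = 0.25  # N m^-1

SITES = {
    "H2O": (10.0, 5.0),
    "HCN": (7.3, 4.25),
    "HCCH": (5.1, 2.4),
    "HF": (4.8, 10.0),
    "HCl": (3.1, 5.0),
}


def isomer_force_constants(m1, m2):
    """Return predicted k_sigma (N m^-1) for both isomers, keyed 'acceptor...donor'."""
    n1, e1 = SITES[m1]
    n2, e2 = SITES[m2]
    return {
        f"{m1}...{m2}": C * n1 * e2,  # m1 is the nucleophile, m2 the proton donor
        f"{m2}...{m1}": C * n2 * e1,  # roles reversed
    }


def favoured_isomer(m1, m2):
    """Label of the isomer with the larger product EN."""
    k = isomer_force_constants(m1, m2)
    return max(k, key=k.get)
```

For the two cases discussed in the text, `favoured_isomer("HCCH", "HCN")` selects the isomer in which ethyne is the proton acceptor, and `favoured_isomer("HF", "HCl")` selects HCl···HF.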

2.8 Dimers with two interaction sites

In the preceding section, we have seen that the dimer (HC≡CH, HCN), for example, has two structural isomers, each of which presumably corresponds to a minimum on the interaction potential energy surface. The lower and higher energy minima can be referred to as arising from the primary (p) and secondary (s) interactions, respectively, between the component molecules. Thus, for component molecules labelled 1 and 2, $k_\sigma^p = cN_1E_2 > k_\sigma^s = cN_2E_1$. When $k_\sigma^p \gg k_\sigma^s$, only the primary isomer is observed in low temperature jet experiments. On the other hand, when $k_\sigma^p \approx k_\sigma^s$, two possibilities exist: either two separate isomers are observed (see above) or, if the geometries of the two components permit, advantage is taken of both interactions to give a dimer held together by a two-site interaction. An example will clarify the general argument.

The series of dimers (H₂CO, HX) has been investigated through rotational spectroscopy for X = F, Cl, CN and CCH [24]. It is known that the strength of the primary interaction $k_\sigma^p$ decreases along the series X = F, Cl and CN, as expected. This is reflected in the increase of the angle COX from 110° when X = F (see Table 2.1) to 138° when X = CN. Moreover, evidence of inversion doubling is available from the spectrum of H₂CO···HCN, indicating that the barrier to the $C_{2v}$ conformation in which CO···HCN are collinear is now sufficiently low. The expectation on further weakening the primary interaction by replacing HCN by HCCH was of an angle even closer to 180°. In fact, the observed geometry [24] of the dimer (H₂CO, HCCH) is as shown in Figure 2.13. The primary interaction H₂CO···HCCH is now so weak that the dimer can be further stabilised by bending the hydrogen bond to allow the secondary interaction between the H of H₂CO and the π-bond of ethyne to come into play. On the other hand, for dimers like H₂CO···HF, the loss of energy incurred on bending the hydrogen bond so that the H atom moves away from the O···F line is too high (see [8] for a discussion of this type of hydrogen bond bending) to be compensated by the gain in energy from the weak secondary interaction. We note, however, that even for H₂CO···HF, the Buckingham-Fowler model [12] predicts the O···H-F angle to deviate slightly (≈ 8°) from 180° in the direction expected from electrostatic interaction of the secondary type involving an n-pair of fluorine and an H of CH₂ (see Figure 2.4).

Occasionally, there are two equivalent primary interactions, as in the carboxylic acid dimers where at each site the rules are satisfied, i.e. the C=O···O angle of 120° is in accord with sp² non-bonding electron pairs on the carbonyl oxygen atom, as shown schematically in Figure 2.14. The well known strong interaction then results.

In summary, three classes can be recognised according to whether the strength of the secondary interaction relative to the primary one is: (i) weak; (ii) moderate; or (iii) equivalent. The consequences are, respectively: (i) a slight deviation from the geometry expected on the basis of the primary interaction (e.g. Figure 2.4); (ii) a geometry which is a compromise between the demands of the two interactions; and (iii) a dimer geometry in which there are two equivalent hydrogen bonds and, as a consequence, a particularly strong interaction (e.g. Figure 2.14).

Figure 2.13 Observed geometry of the (ethyne, formaldehyde) dimer. (Redrawn from [22] with permission from the Royal Society of Chemistry.)

[Figure 2.14: cyclic dimer in which the O-H group of each HCOOH molecule is hydrogen-bonded to the carbonyl O of the other.]

Figure 2.14 Two equivalent hydrogen bonds in formic acid dimer.

2.9 Consequences of the rules for angular geometries in the solid state

All of the discussion so far has applied to isolated dimers. Observations made about O···H-O hydrogen bonds in the solid state can, however, be rationalised on the basis of the rules given above for angular geometries and the relative magnitudes of the force constants associated with bending the hydrogen bond in two ways.

The only example where the hydrogen bond bending force constants have been determined in detail is the $C_{3v}$ dimer CH₃CN···HF [25]. Bending force constants are also available, but less well determined, for HCN···HF [1] and the 'in-plane' bending motion of H₂O···HF [26]. In each case, bending at the acceptor atom (e.g. N or O) is relatively weakly resisted by comparison with bending at the hydrogen bond hydrogen atom. This general finding provides a basis for understanding deviations from the predictions of the n-pair model that have been observed in the solid state [8]. Analyses of large numbers of X-ray and neutron diffraction investigations of systems containing the O···H-O bond show that, when proper statistical weighting is applied, only a small proportion of the O···H-O bonds are bent at the hydrogen atom, while relatively large deviations are commonly observed from the angle expected when using the n-pair model of the acceptor molecule. Nevertheless, there is some preference for hydrogen bonding in the plane containing the axes of the non-bonding pairs even though there is no favoured angle within this plane.

References

1. A.C. Legon, D.J. Millen and S.C. Rogers (1980) Proc. R. Soc. London, Ser. A 370, 213-237.
2. A.C. Legon (1983) Annu. Rev. Phys. Chem. 34, 275-300.
3. T.R. Dyke (1984) Top. Curr. Chem. 120, 85-113.
4. Z. Kisiel, A.C. Legon and D.J. Millen (1982) Proc. R. Soc. London, Ser. A 381, 419-442; A.C. Legon and L.C. Willoughby (1982) Chem. Phys. Lett. 92, 333-338; A.C. Legon and D.J. Millen (1992) Chem. Soc. Rev. 21, 71-78.
5. A.S. Georgiou, A.C. Legon and D.J. Millen (1980) Proc. R. Soc. London, Ser. A 373, 511-526; A.S. Georgiou, D.J. Millen, Z. Kisiel and A.C. Legon (1989) Chem. Phys. Lett. 155, 447-454.
6. A.C. Legon and D.J. Millen (1987) J. Am. Chem. Soc. 109, 356-358.
7. A.C. Legon, D.J. Millen and H.M. North (1987) Chem. Phys. Lett. 135, 303-306.
8. A.C. Legon and D.J. Millen (1987) Chem. Soc. Rev. 16, 467-498 and refs. cited therein.
9. A.C. Legon and D.J. Millen (1982) Faraday Discuss. Chem. Soc. 73, 71-87.
10. C.A. Coulson and W.E. Moffitt (1949) Philos. Mag. 40, 1-35.
11. T.J. Brobjer and J.N. Murrell (1983) J. Chem. Soc. Faraday Trans. 2 79, 1455-1464.
12. A.D. Buckingham and P.W. Fowler (1985) Can. J. Chem. 63, 2018-2025.
13. G.J.B. Hurst, P.W. Fowler, A.J. Stone and A.D. Buckingham (1986) Int. J. Quantum Chem. 29, 1223-1229.
14. J. Bicerano, D.S. Marynick and W.N. Lipscomb (1978) J. Am. Chem. Soc. 100, 732-739.
15. A.D. Buckingham (1978) In Intermolecular Interactions: from Diatomics to Biopolymers, ed. B. Pullman, Wiley, New York, pp. 3-67.
16. A.C. Legon and D.J. Millen (1989) Can. J. Chem. 67, 1683-1686.
17. R.J. Gillespie and R.S. Nyholm (1957) Q. Rev. Chem. Soc. 11, 339-380.
18. R.F.W. Bader, R.J. Gillespie and P.J. MacDougall (1988) J. Am. Chem. Soc. 110, 7329-7336.
19. A.C. Legon, A.L. Wallwork and D.J. Millen (1991) Chem. Phys. Lett. 178, 279-284.
20. P.D. Aldrich, S.G. Kukolich and E.J. Campbell (1983) J. Chem. Phys. 78, 3521-3530.
21. P.A. Block, K.W. Jucks, L.G. Pedersen and R.E. Miller (1989) Chem. Phys. 139, 15-30.
22. A.C. Legon (1990) Chem. Soc. Rev. 19, 197-237.
23. Values of $k_\sigma$ for (HF)₂, (HCl)₂, HF···HCl and HCl···HF are given by G.T. Fraser and A.S. Pine (1989) J. Chem. Phys. 91, 637-645.
24. A summary of the observations about angular geometry for the series H₂CO···HX (X = F, Cl, CN, CCH) is given in the paper on H₂CO···HCCH by N.W. Howard and A.C. Legon (1988) J. Chem. Phys. 88, 6793-6800.
25. J.W. Bevan, A.C. Legon, D.J. Millen and S.C. Rogers (1980) Proc. R. Soc. London, Ser. A 370, 239-255.
26. Z. Kisiel, A.C. Legon and D.J. Millen (1984) J. Mol. Struct. 112, 1-8.

3 Spectroscopic studies of solvents and solvation

M.C.R. SYMONS

3.1 Introduction

This chapter is concerned with the nature of strong interactions between solute molecules and solvent, and sometimes between two or more solute molecules, at the molecular level. It is not concerned with bulk properties of solutions, and the challenge remains to use data such as those outlined here as an aid in interpreting bulk properties. In our own work, we have used a range of spectroscopic techniques, looking at both solute and solvent molecules, the solute species being neutral or ionic. Since spectroscopy was first used to study solvation, a great deal of work has been carried out. For convenience, and for reasons of space, I draw attention primarily to work done in my own laboratories as being illustrative of the different types of information available. Thus, this chapter is, in a sense, a retrospective review of work carried out by my group and I apologise to all those whose work should also have been discussed.

3.1.1 History

Early work was based on the ultraviolet spectra of halide (especially iodide) ions [1, 2]. It was already known that intense p → s type spectra were exhibited by these ions in alkali-halide crystals and in aqueous solutions. We found that these bands were remarkably sensitive to changes in solvent, and hence could be used as a method for studying primary solvation.

Another early development was the discovery that ion-pair equilibria and rates could be precisely studied by electron spin resonance (ESR or EPR) spectroscopy. These results, together with others using this specialised technique, are discussed in section 3.4. Similarly, it was found that the NMR chemical shifts for nuclei of ions such as ¹⁹F or ²³Na varied markedly with solvent changes, as did the ¹H resonances of protic solvents. However, as shown in section 3.5, structural information was difficult to obtain and there has been much speculation regarding the correct interpretation of the observed shifts (see section 3.8).

More recently, vibrational spectroscopy has been used systematically, especially for neutral solutes. Certain polar groups in solutes (such as >C=O, for example) are markedly solvent sensitive. Also, solvent groups, especially O-H or N-H, show solvent-sensitive spectra. These are discussed in sections 3.6 and 3.7 respectively.

In section 3.8 an attempt is made to link results from this wide range of studies, and to show how they overlap with results from other 'molecular level' studies. Finally, in section 3.9, a brief introduction to the applications of these techniques in biological systems is given.

Throughout, attention is focused primarily on structural parameters rather than on relaxation or rate processes. Spectroscopic studies are, of course, of very great importance in the latter field, but it seemed better to concentrate here on structure, since a good understanding of structure must precede a proper interpretation of rates.

3.2 Background

In this section, I outline some of the concepts that run through this chapter, and briefly discuss some of the more important solvents.

3.2.1 Hydrogen bonding

Much of this chapter is concerned with hydrogen bond making or breaking. This is because, for polar solutes in protic media, such bonds dominate solvation. Also, anions in protic solvents are solvated by hydrogen bonds. This is not true for cations, but when a cation is solvated it uses H-bond sites (lone-pairs of electrons), which changes hydrogen bonding in the solvent.

There are many ways of describing hydrogen bonds, and for some purposes a simple dipolar model suffices. However, from a spectroscopist's view, a useful model is one of partial proton transfer. Thus, comparing A-H, A-H···B and A⁻···H⁺-B, A-H···B can be thought of as having moved slightly towards the limiting case of A⁻···H⁺-B. Hence the A-H bond is slightly weakened, which explains the shift in ν(A-H) to low frequencies. There is an increase in the δ− and δ+ charges on A and H, which explains the increase in oscillator strength with increasing H-bond strength. Also, the increase in δ+ on the proton explains the down-field shifts in the ¹H NMR spectra as H-bonding increases. The correlation between H-bond strength and Δν(A-H) takes the form indicated in Figure 3.1. This helps to explain the very large width increments found for ν(A-H) in strong hydrogen-bonded systems. Thus, small fluctuations in bond strength which occur in fluid media cause large fluctuations in Δν. This figure shows that vibrational spectroscopy is not a very sensitive technique for studying weak H-bonding. Fortunately, ¹H NMR shifts do not have the same limitation, and can therefore be used to help distinguish between weak bonds and 'free' groups.

Figure 3.1 Trend in the O-H vibration ($\nu_{OH}$) for HOD in D₂O for various OH···O hydrogen-bonded units, as a function of the O···O separation. Note the insensitivity for very weak hydrogen bonds.

3.2.1.1 Cooperativity and anti-cooperativity. In order to understand solvation by hydrogen bonding, it is necessary to bear in mind that multiple bonding can strengthen or weaken a given bond. Examples are given throughout the chapter. They can be illustrated by the simple example for methanol given in Scheme 3.1. Consider bond α. This has strength α₁ in the dimer (a). In the trimer (b), the α-bond is strengthened considerably to α₂ and α₃, the increments being about equal. In the tetramer (c) these effects combine to give an even stronger bond, α₄. However, in the trimer (d) the bonds work against each other, and α₅ is weaker than α₁. Thus α₅ < α₁ < α₂ ≈ α₃ < α₄.

In my view, solvents themselves usually avoid anti-cooperativity situations. However, whenever a solute is involved, if a given group has a solvation number greater than one, anti-cooperativity occurs. This is especially the case for anions such as the halide ions. The charge of the ion is so effective that high solvation numbers are exhibited despite the fact that each bond is, as a consequence, quite weak. Of course, this anti-cooperativity for primary solvent molecules is greatly reduced because of the cooperative effects of secondary solvation.

Scheme 3.1 Hydrogen-bonded methanol units: (a) dimer, bond α₁; (b) trimer, bonds α₂ and α₃; (c) tetramer, bond α₄; (d) trimer, bonds α₅.

3.2.1.2 Functionality. This useful concept is taken from polymer chemistry. In terms of hydrogen-bonded systems, it relates to the number of proton-donor groups (x) and the number of proton-acceptor groups (y), expressed as (x:y). Thus, for example, acetone is (0:2), methanol is (1:2) and water is (2:2) functional.

3.2.2 Hydrophobic bonding

This is a very difficult term to define. To biologists, it simply relates to the tendency of non-polar groups in biomolecules to stay together. This is, of course, a total effect for lipids in water, but for proteins, which comprise a polar backbone with side groups that are often non-polar, there are regions in the tertiary structures in which non-polar units are in close contact. This situation is caused, in a negative way, by water, which cannot solvate these regions, and in a positive way by attractive van der Waals forces.

To chemists dealing with small molecules in water, the concept is linked to clathrate cage formation. Solid water clathrates comprise an array of small molecules such as methane held together with a filigree of water molecules all tetrahedrally hydrogen bonded to each other. It is reasonable to suppose that similar cages can occur in liquid water and, indeed, this is the most probable way for water to surround non-polar groups. However, these water molecules will be subject to the same fluctuations as others, being closely linked to bulk water, and they are not readily detected by spectroscopic methods.

There are two ways in which 'bonding' can occur: one is by 'cage sharing' by pairs or even clusters of guest molecules, and the other is by 'cage pairing', which is thought to be favourable because cages breed cages, as in solid-state clathrates.


3.2.3 Comments on some common solvent systems

3.2.3.1 Water. Water is by far the most important solvent and, in many ways, the most difficult to understand. The special structure of liquid water, which arises because of its (2:2) functionality, is essentially a three-dimensional network of hydrogen bonds, each molecule generally forming four bonds directed roughly tetrahedrally (structure (1)). All molecules diffuse very rapidly and these bonds have short lifetimes. Some are strong, some are weak, some are bent and, presumably, some are broken. Much of this chapter relates to how spectroscopy has been used to 'shed light' on this subject.

[Structure (1): a water molecule forming four hydrogen bonds directed roughly tetrahedrally.]

3.2.3.2 Alcohols. Methanol and ethanol are important solvents, and are extensively used as mixed solvents with water. They mimic water in the way they solvate but, having a (1 :2) functionality, they tend to form linear structures, with the capacity of branching. Comparisons between water and methanol are drawn extensively herein.

3.2.3.3 Ethers and esters. These are relatively poor solvents, especially for electrolytes, since their functionality is (0:2). They solvate cations quite well, and polyethers are important in this respect. They are only sparingly soluble in water, so mixed solvents are not used.

3.2.3.4 Amines. Although liquid ammonia is used widely for certain chemical reductions (which hinge on the solubility of alkali-metals therein), amines are not widely used. Ammonia has (3:1) functionality, ethylamine (2:1), etc. They are quite basic via the nitrogen lone-pair, but only very weakly acidic, and many N-H groups are essentially 'free' in the pure solvents and in aqueous solutions. Thus they are good cation, but poor anion, solvators.

3.2.3.5 Amides. Several are important solvents, both pure and in aqueous solution. The C=O group is quite basic and forms two hydrogen bonds in water. The N-H group, if present, is more acidic than for amines, and participates in hydrogen bonding. Thus, formamide has a (2:2) functionality and is strongly associated. It is a good solvent for electrolytes.


3.2.3.6 Aprotic solvents. Several other aprotic solvents which are miscible with water are widely used, one of the most important being dimethylsulphoxide (DMSO). The molecule is pyramidal with a large δ+ charge on sulphur and a large δ− charge on oxygen. With this exposed dipole, it tends to associate strongly. It is quite basic, is a good cation solvator, and probably forms three hydrogen bonds to oxygen in water. In contrast, trimethylphosphate is relatively weakly basic, as is cyanomethane.

Spectroscopic studies of a number of these solvents are described here.

3.3 Ultraviolet spectroscopy

3.3.1 Neutral solutes

Electronic spectra of molecules are often strongly dependent upon the nature of the solvent. For example, the n → π* transition for >C=O groups shifts considerably to the red on going from protic to aprotic media. Such shifts have been used, in particular, in the construction of empirical solvent scales. Use has been made of systems with strongly dipolar transitions, as for example, the betaine (2) used by Dimroth and Reichardt [3] to construct the $E_T$ scale and (3) used by Kosower [4] to give the Z-value scale. Much has been written about these scales and their utility [5]. However, they are empirical and do not provide direct structural information.

[Structures (2) and (3): (2) is the betaine dye; (3) is an N-ethylpyridinium cation carrying a CO₂Me substituent.]

As in infrared studies (section 3.6), we need to distinguish between band shifts and the gain and loss of discrete bands. The former behaviour is generally observed for a given ultraviolet chromophore in mixed solvents, whereas the latter is frequently observed in comparable infrared studies. The infrared studies show that discrete species are usually involved, whereas the shift behaviour seems to imply a more general change without specificity. The difference probably arises because the ultraviolet bands are very broad, and the shifts for individual components are relatively small. As with NMR shifts discussed below, it is very difficult to interpret such shifts in isolation.

An important exception is the electronic spectrum for nitroxide radicals [6]. In this case, solvent shifts for ν(N-O) in the infrared are too small to give reliable information, but shifts in the first electronic band are large enough to give separate components. These studies revealed that nitroxides in water form two strong hydrogen bonds to oxygen, whereas in methanol, the major component is the mono-hydrogen-bonded solvate. These results have provided key information for the interpretation of solvent effects on the ESR spectra for nitroxides (section 3.4).

3.3.2 Ions

As mentioned above, the ultraviolet spectra for Cl⁻, Br⁻ and I⁻ are remarkably solvent-sensitive [1,2]. Of these, the most studied is I⁻ since its spectrum is most accessible ($\lambda_{\max}$ is ca. 220 nm (45500 cm⁻¹) in water and shifts to ca. 250 nm (40000 cm⁻¹) in aprotic solvents). Even so, because the bands are broad, clear gain and loss of bands has only been observed in a few cases [7]. Shifts are so large that the solvent must be able to modify ground and/or excited states quite drastically. We suggested that the effect is primarily an excited-state phenomenon, and although this is an over-simplification, it helps to give an understanding of the observed shifts. There are no clear np → (n + 1)s atomic type transitions for these ions in the gas-phase, because photo-ionization occurs preferentially at relatively low energies. The large shifts to high energies for I⁻ in solvents can be understood in terms of a large solvent barrier to photo-ionization caused by the shell of solvent molecules around the parent ions. In fact, photo-ionization is a rare event in the liquid phase. Thus the solvent provides a barrier and a site for electron binding. The excited state is then seen as a combination of an outer s-type orbital on the halogen and an s-type 'square-well' orbital defined by the primary solvent shell. This unique type of excitation has been described as charge-transfer to solvent and is frequently represented as a CTTS transition [1,2] (Figure 3.2). We suggested that the outer part of the wavefunction for the excited electron closely resembles that of the solvated electron. This comparison was supported by the correlation between the optical spectra for I⁻ and e⁻ in a range of solvents [8]. (The solvated electron can be visualised by removing the halogen atom from the cavity after excitation.)
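The wavelengths and wavenumbers quoted above are related by the reciprocal, and the size of the solvent shift is perhaps easier to appreciate in energy units. The short Python sketch below performs the conversion; the function names are illustrative, and the factor 1.239841984 × 10⁻⁴ eV per cm⁻¹ is the standard value of hc.

```python
# Sketch: convert band maxima between wavelength, wavenumber and energy.

EV_PER_WAVENUMBER = 1.239841984e-4  # hc expressed in eV per cm^-1


def nm_to_wavenumber(wavelength_nm):
    """Wavenumber in cm^-1 for a wavelength given in nm (1 cm = 1e7 nm)."""
    return 1.0e7 / wavelength_nm


def wavenumber_to_ev(wavenumber_cm):
    """Photon energy in eV for a wavenumber given in cm^-1."""
    return wavenumber_cm * EV_PER_WAVENUMBER


water = nm_to_wavenumber(220.0)    # iodide band maximum in water (approximate)
aprotic = nm_to_wavenumber(250.0)  # iodide band maximum in aprotic solvents (approximate)
shift_ev = wavenumber_to_ev(water - aprotic)
print(f"shift = {water - aprotic:.0f} cm^-1 = {shift_ev:.2f} eV")
```

On these approximate figures the band moves by roughly 5500 cm⁻¹ between water and aprotic solvents, which is a large fraction of an electronvolt.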

A simple, localised, square-well model seemed to give a reasonable explanation for the shifts [1,2] and this has been supported by recent, far more sophisticated, calculations. The strongly localised model is favoured because of the close packing of the first solvation layer and the high energy of antibonding solvent orbitals.

Figure 3.2 Simple representation of a CTTS transition for iodide ions in water.

The results show that there is a marked break between solvation by protic solvents (water, alcohols) and aprotic solvents (MeCN, etc.), and suggest that ammonia and amines resemble aprotic rather than protic solvents. There is a shift to high energy in Emax on going from H2O to MeOH, which was interpreted in terms of a higher solvation number for I⁻ in H2O than for I⁻ in MeOH, and hence a larger 'cavity' for water. I suggest below that such a situation is common for many aqueous versus methanolic solutions.
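The link between cavity size and band energy can be illustrated with the crudest possible version of the square-well picture. The sketch below assumes an infinitely deep spherical well, which is much simpler than the finite well defined by a real solvent shell, and the two radii are hypothetical values chosen only to show the trend. It evaluates the lowest s-state energy, E = π²ħ²/(2mR²).

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron-volt


def well_ground_state_ev(radius_angstrom: float) -> float:
    """Lowest s-state energy (eV) of an electron in an infinite spherical well."""
    radius_m = radius_angstrom * 1.0e-10
    energy_j = (math.pi ** 2) * HBAR ** 2 / (2.0 * M_E * radius_m ** 2)
    return energy_j / EV


# Hypothetical cavity radii (angstrom), for illustration only
for radius in (3.0, 3.5):
    print(f"R = {radius:.1f} A  ->  E = {well_ground_state_ev(radius):.2f} eV")
```

Because the energy varies as 1/R², the larger cavity gives the lower energy, which is the sense of the argument used above for water relative to methanol. The absolute numbers should not be taken seriously.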

There is a marked broadening and shift to low energies when solutions are heated, as is illustrated in the interesting correlation shown in Figure 3.3. This is expected on the square-well excited-state model. It is unfortunate that separate bands cannot be resolved using mixed solvent systems (as is found for many infrared chromophore systems discussed below). This probably arises because individual shifts for each of the many expected solvates are small compared with the large bandwidths. It would be interesting to study some of these systems at low temperatures in the hope of resolving individual bands.

[Figure 3.3: vertical axis ticks 102 to 126; horizontal axis dEmax/dT (cal/deg), 0 to 120; labelled solvents include EtOH, dioxan, 1,3-dioxolane and EtCN.]

Figure 3.3 Variation of the shift in ν for I⁻ ions as a function of the temperature coefficient for this band. Note that as the frequency gets smaller, so the temperature sensitivity increases.

3.4 ESR spectroscopy

This technique is currently used very widely to study the behaviour of 'spin labels', especially in biological systems. Spin labels are stable nitroxide radicals (R2NO•) which can be attached to large molecules such as proteins, lipids or drugs, which can then be monitored by ESR techniques. Generally, it is line width changes that are of interest, since these give a measure of correlation times [9].

However, the first use of ESR spectroscopy in the field of solvation was in the study of ion pairs, and this remains perhaps the most important step forward achieved by ESR spectroscopists in solvation studies.

3.4.1 ESR studies of ion pairing

Despite the limitation that only paramagnetic species can be studied with this technique, its use in the study of ion-pair and ion-multiplet formation has been unique in its precision and lack of ambiguity [10]. The first observation of ion-pair formation was made by Weissman and co-workers, who quickly recognised the major significance of their results [11]. The essence of these results is that, in solvents of low dielectric constant, a range of aromatic radical anions show extra hyperfine splittings clearly associated with the cation nuclei (such as ²³Na), thereby establishing that each anion is associated with one cation. Subsequent work [10] established that reaction (3.1) can be studied quantitatively over a range of temperatures, thereby establishing equilibrium constants and derived thermodynamic parameters.

A•⁻ + M⁺ ⇌ (A•⁻···M⁺)    (3.1)

In 'slow exchange' under controlled conditions, separated features for A•⁻ and the ion-pair (A•⁻···M⁺) are detectable. As the temperature is increased and reaction rates increase, the features broaden and accurate reaction rates can be estimated from the width increments. In fast exchange, the ²³Na splitting is averaged and lost, and the spectra become similar to those of the 'free' anions.

These results came at a time of intense controversy amongst physical chemists. Conductivity, ion mobility and related studies clearly indicated that ionic association occurred, the extent increasing as the dielectric constant of the solvent decreased. However, the structures of the ion pairs and clusters were unknown. Some considered that a pair of oppositely charged ions, within a certain distance, were effectively neutral, behaving as large dipoles. Others considered that only contact ion pairs should be significant. Yet others postulated specific pairs, but with intervening solvent molecules [12] (Figure 3.4).

[Figure 3.4: schematic structures of (a) a contact ion-pair, (b) a solvent-shared ion-pair, (c) a methanol-shared ion-pair and (d) a solvent-separated ion-pair.]

Figure 3.4 Solvation is retained in (a) except in the region of contact. In (b) and (c) one solvent is shared between the ions, and in (d) both ions retain their full primary solvation.

The ESR results unambiguously detected contact ion pairs. Comparison with conductivity results showed that, especially in protic + aprotic solvent mixtures, these were not the only types of pairs present. Careful analysis of certain ESR spectra confirmed that, on adding protic solvents to solutions containing ion pairs in aprotic solvents, new species showing either no cation hyperfine splitting, or very reduced splitting, were present, having ¹H hyperfine parameters which differed from those for the 'free' ions. These were assigned to 'solvent-shared' or 'solvent-separated' ion pairs [10,12].

In these ways, conductivity results could be fully explained in terms of ESR structural and kinetic data [13], and some of the extensive controversies were settled.

3.4.1.1 Ion-pair fluctuations. Other, surprisingly detailed, information about ion pairing has been forthcoming from these studies. It relates to radical anions with two alternative cation binding sites such as semiquinones or dinitrobenzene anions (Figure 3.5). There is strong coulombic attraction between the cation and the δ⁻ oxygen or –NO2 units, which are therefore favoured cation sites. At low temperatures, these asymmetric structures have lifetimes that are long on the ESR time-scale and this is reflected in a reduction in the symmetry of the anion spectra. Thus, for example, the 5-line spectrum found for solvated benzosemiquinone anions, from hyperfine coupling to the four equivalent protons, changes to a 9-line spectrum from two inequivalent sets of two protons for these ion pairs. As the temperature is increased, or the solvent changed, migration between the two equivalent sites increases in rate, giving a broadening of alternate lines. These lines ultimately vanish, leaving a 5-line spectrum similar to that of the free ion. However, because the same cation is involved in the flip-flop process, the cation hyperfine features remain narrow.

Figure 3.5 Fluctuational movement of Na⁺ between the two negative oxygen sites for p-benzosemiquinone anions. This is relative movement, probably involving considerable solvent reorganisation.
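The 5-line and 9-line patterns quoted for the benzosemiquinone anion follow from the usual first-order counting rule: each set of n equivalent nuclei of spin I contributes 2nI + 1 lines, and the contributions of inequivalent sets multiply. The following is a minimal sketch of that rule; the function name is my own, and accidental overlap of lines is ignored.

```python
from fractions import Fraction


def hyperfine_line_count(groups):
    """First-order number of hyperfine lines.

    groups is an iterable of (number_of_equivalent_nuclei, nuclear_spin).
    Each group contributes 2*n*I + 1 lines; inequivalent groups multiply.
    """
    lines = 1
    for n, spin in groups:
        lines *= int(2 * n * Fraction(spin) + 1)
    return lines


HALF = Fraction(1, 2)        # spin of 1H
THREE_HALVES = Fraction(3, 2)  # spin of 23Na

# Free anion: four equivalent protons
print(hyperfine_line_count([(4, HALF)]))
# Ion pair in slow exchange: two inequivalent sets of two protons
print(hyperfine_line_count([(2, HALF), (2, HALF)]))
# The same ion pair with a resolved splitting from one 23Na nucleus
print(hyperfine_line_count([(2, HALF), (2, HALF), (1, THREE_HALVES)]))
```

The first two calls reproduce the 5 and 9 lines discussed above; the third shows how a single ²³Na nucleus would split each of the nine proton lines into four.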

A similar change from 9 to 5 lines results sometimes when a simple salt with a common cation is added. In this case reaction (3.2) occurs, so that the lifetimes of the asymmetric units are reduced, but different sodium ions (with varying nuclear spins) are involved, so that the cation hyperfine splitting is lost at the same rate. As usual, at intermediate rates, lines become broad and from the width increments, rates and activation energies and entropies can be obtained.

In certain cases, especially when the anion of the added salt is large and does not compete for cations, triple ions such as that shown in Scheme 3.2 can be detected. For these long-lived species, hyperfine features from two equivalent cations are resolved on each of the ¹H anion features [14]. These results show that the displacement mechanism is associative rather than dissociative. They also suggest that small monatomic anions such as chloride can be directly involved in these displacements. There is no doubt that ESR spectroscopy remains the most powerful technique for studying such intimate details of ion pairing.

M⁺ ···· O–C6H4–O ···· M⁺

Scheme 3.2


3.4.1.2 Electron transfer and exchange processes. Studies with stable radical ions have also been used to obtain kinetic information on electron transfer processes that has been of great use in probing these biologically important reactions. The reactions (3.3) and (3.4) are symmetrical, with no net change, but because nuclear spins can change each time an unpaired electron is transferred, ¹H hyperfine lines broaden, and are finally lost, leaving a single averaged line. If ion pairs are involved, the cation invariably moves with the electron (for symmetry reasons) and hence cation hyperfine splitting is retained.

A•⁻ + A ⇌ A + A•⁻    (3.3)

A•⁻···M⁺ + A ⇌ A + A•⁻···M⁺    (3.4)

Activation energies and entropies are quite large. The barriers involved are mainly shape changes and solvation changes. Thus, the electron will not move from a site favoured by the correct shape and solvent orientation and will wait until these are equivalent within the reaction intermediate. This is a rare event, so the reactions are not as fast as might be expected.

Another reaction that is uniquely studied by ESR spectroscopy is spin exchange (3.5).

A•⁻(α) + A•⁻(β) ⇌ A•⁻(β) + A•⁻(α)    (3.5)

This also results in line-broadening and ultimate loss of hyperfine splitting, but clearly, the barriers discussed for reactions (3.3) and (3.4) are no longer present so the process is very much faster. If ion pairs are present, the triple-ion intermediate A•⁻M⁺A•⁻ may have sufficient lifetime to be detected. This will appear as a triplet state, each electron being delocalised over both A units.

3.4.2 Solvation of aromatic nitro-anions

Provided ion pairing is avoided, changes in solvation are nicely reflected in changes in ¹⁴N hyperfine splitting for these ions [15]. The results cannot be unambiguously interpreted, however, because solvation changes are always very fast, and only monotonic shifts are observed. There is, however, one fascinating and revealing exception to this statement. The radical anion of meta-dinitrobenzene is normally symmetrical in aprotic solvents, provided there is no ion pairing. This is, of course, expected on symmetry grounds, there being no quantum mechanical reason for asymmetry. However, in water at room temperature, the anions are completely asymmetric, the ESR spectra showing that the spin density is largely confined to one NO2 group [16]. This cannot be due to ion pairing, which is not expected for aqueous solutions. We therefore suggested that it is caused by long-lived asymmetric solvation, which is encouraged by the small energy gap between the symmetrical and two alternative asymmetric states. We suggest that such an asymmetric structure, once formed, is fixed by two processes. One is rapid and extensive solvation at the anionic –NO2 unit, involving, say, four hydrogen bonds to oxygen. Also, any bonds to the other NO2 group must be rapidly lost, possibly with the build-up of a partial clathrate cage about this side of the anion. This fixes the asymmetric structure long enough for it to be detected as a unique entity by ESR spectroscopy. We estimate lifetimes of ca. 1 μs for these units at 25 °C.

For methanolic solutions, the two nitrogens are apparently equivalent, but there is broadening of alternate lines (the MI = ±1 components) indicating a rapid fluctuation between asymmetric forms. So the asymmetry is also induced, but the solvent 'flip' occurs far more rapidly in methanol. It is noteworthy that methanol cannot develop clathrate cage type structures.

3.4.3 Solvation of neutral nitroxides

Nitroxides have become of very great importance to biologists, but nevertheless have not been extensively studied by solvation chemists. They are widely used as 'spin labels', first introduced by McConnell, because of their stability, resistance to dimerisation or disproportionation, and large ¹⁴N hyperfine anisotropy [17]. This last property controls the widths of the lines, especially the MI = −1 feature, the width being a function of tumbling rate. Hence correlation times can be estimated over a very wide time span (ca. 10⁻¹¹–10⁻⁵ s).

However, they have another important property, namely that the isotropic hyperfine coupling is solvent dependent. Thus, for example, for (Me3C)2NO, one of the most studied of the many hundreds of nitroxides available, the ¹⁴N hyperfine splitting for aqueous solutions is ca. 17 G, that for methanolic solutions is ca. 16 G and that for solutions in aprotic solvents is ca. 15 G. We have assigned these differences to changes in hydrogen bonding to oxygen [15]. The value for Aiso changes monotonically for mixed solvent systems and shifts can be reasonably interpreted in terms of preferential solvation. However, the usual ambiguities remain. For example, is Aiso larger for aqueous than for methanolic solutions because the H-bonds are stronger, or because there are more of them? This question is addressed in section 3.8. As with NMR spectroscopy, the problem is that the rate of exchange of solvent molecules (making and breaking of H-bonds) is very fast on the ESR time-scale, so that only time-averaged spectra are observed. Fortunately, this is generally not the case for vibrational spectra (sections 3.6 and 3.7).
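The ambiguity inherent in a time-averaged coupling can be made concrete with a small calculation. The sketch below assumes, purely for illustration, a two-state model in which the observed Aiso is the population-weighted mean of a 'free' value and a 'bonded' value; the limiting values of 15 G and 17 G are the aprotic and aqueous figures quoted above, and the function names are my own.

```python
def averaged_coupling(fractions, couplings):
    """Population-weighted mean coupling observed in fast exchange."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(f * a for f, a in zip(fractions, couplings))


def bonded_fraction(a_obs, a_free, a_bonded):
    """Fraction of the bonded form implied by a_obs in a two-state model."""
    return (a_obs - a_free) / (a_bonded - a_free)


# Assumed limits (gauss): aprotic solvent and water, as quoted in the text
A_FREE, A_BONDED = 15.0, 17.0

# An equal mixture of the two limiting forms averages to 16 G ...
print(averaged_coupling([0.5, 0.5], [A_FREE, A_BONDED]))
# ... but so does a single species whose own coupling is 16 G
print(averaged_coupling([1.0], [16.0]))
# Two-state reading of a 16 G observation
print(bonded_fraction(16.0, A_FREE, A_BONDED))
```

Both descriptions give the same averaged value, so the spectrum alone cannot decide between fewer bonds and weaker bonds. This is the ambiguity referred to above.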

In marked contrast with R2NO molecules, the dianion, (O3S)2NO²⁻, has an isotropic ¹⁴N hyperfine splitting that is remarkably insensitive to changes in solvation [18]. Also, the electronic spectrum is insensitive, in contrast with those for neutral nitroxides. This difference seems to be largely due to very strong hydrogen bonding to the two –SO3⁻ units in protic solvents, which dominates the overall solvation.


3.4.4 Gain and loss of solvation

In two unique temperature-resolved experiments, it has been possible to follow the slow, quantised, build-up of solvation of an unsolvated anion, and the similar loss of solvation from a cation after charge neutralisation. These ESR experiments nicely complement very recent time-resolved studies of similar processes [19]. In the first, very dilute solutions of dioxygen in alcohol glasses were irradiated at 4 K. Absence of any solvent-trapped electrons established that addition to oxygen to give ·O2⁻ ions occurred efficiently. However, no discernible ESR features for these ions were detectable. Note that for unsolvated ·O2⁻ ions, none would be expected. Thus at 4 K, there is not enough energy available to break solvent-solvent H-bonds and the ·O2⁻ ions remain in their original solvent cavities with no specific interaction [20]. However, on controlled annealing, a broad, low-field (parallel) feature grew in, clearly due to weakly solvated ·O2⁻. Two others grew in, in turn, on further warming, each being closer to the free-spin region (g = 2) and narrower than the first. The final, sharp line at g ≈ 2.08 was identical to that for freely solvated ·O2⁻ generated at room temperature and frozen to 77 K in alcohol. There can be little doubt that step-wise solvation is occurring to give, finally, a tetrasolvate in which the solvent probably concentrates in the electron-rich plane (structure (4)).

[Structure (4): ·O2⁻ ion hydrogen-bonded to four solvent O–H groups.]

The other example is for solvated silver ions, after electron addition to give neutral silver atoms [21]. Frozen dilute solutions of silver perchlorate in cyanomethane (MeCN) gave, on exposure to ⁶⁰Co γ-rays, an ESR spectrum resembling that for silver atoms, but with a considerably reduced spin density on silver, and with well-defined isotropic hyperfine coupling to four equivalent ¹⁴N nuclei. This result nicely established the previously unknown structure of the silver cation solvate as Ag⁺(NCMe)4 with four tetrahedrally arranged MeCN ligands coordinated via nitrogen. On e⁻ addition at low temperatures, loss of charge is not sufficient to cause loss of this solvation. However, on warming, desolvation occurred giving, ultimately, 'free', unperturbed, silver atoms, with very sharp features and no ¹⁴N splitting.

Both these processes nicely illustrate our 'principle of anti-cooperativity' [22]. For solvation of ·O2⁻, the energy gain for adding one ROH hydrogen bond is greater than that for adding the second, and so forth. Thus the four equivalent bonds for ·O2⁻(ROH)4 are much weaker than the one for ·O2⁻(ROH)1 although the total strength added over the four bonds is obviously greater. Similarly, the first MeCN molecule is lost from the Ag⁰(MeCN)4 unit far more readily than the last.
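The bookkeeping behind the anti-cooperativity principle can be set out in a few lines. In the sketch below the cumulative stabilisation energies are hypothetical numbers in arbitrary units, chosen only to show the pattern; they are not measured values, and the function names are my own.

```python
def stepwise_increments(cumulative):
    """Energy gained on adding each successive bond, from cumulative totals."""
    previous = 0.0
    steps = []
    for total in cumulative:
        steps.append(total - previous)
        previous = total
    return steps


def mean_bond_strength(cumulative):
    """Strength per bond when the n bonds of the n-solvate are equivalent."""
    return [total / n for n, total in enumerate(cumulative, start=1)]


def is_anti_cooperative(cumulative):
    """True if every added bond still stabilises, but by less than the one before."""
    steps = stepwise_increments(cumulative)
    shrinking = all(later < earlier for earlier, later in zip(steps, steps[1:]))
    return shrinking and all(step > 0 for step in steps)


# Hypothetical cumulative stabilisation (arbitrary units) for 1 to 4 bonds
example = [10.0, 18.0, 24.0, 28.0]

print(stepwise_increments(example))
print(mean_bond_strength(example))
print(is_anti_cooperative(example))
```

With these illustrative numbers each of the four equivalent bonds in the tetrasolvate is weaker than the single bond in the monosolvate, yet the total stabilisation is greater, which is the pattern described above.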

3.5 Nuclear magnetic resonance studies

Although NMR is a powerful tool for studying solvation, it suffers the same limitation outlined above for ESR spectroscopy, namely, that systems are almost always in fast exchange. In our experience, the only way to reach unambiguous interpretations of NMR shift data is to link the results with infrared studies, when possible. (Shifts are by far the most widely studied parameters: coupling constants are generally relatively insensitive to solvent changes.)

As usual, one can study changes in the shifts for selected nuclei in solutes, neutral or ionic, or one can study solvent shifts, generally the sensitive ¹H resonances of protic solvents.

3.5.1 Solute shifts

3.5.1.1 Neutral solutes. Perhaps the most informative studies have been those for nuclei that are directly a part of an active chromophore that can be studied by vibrational spectroscopy. Examples are the ¹³C resonances for carbonyl groups in ketones, esters and amides, and the ³¹P resonances for phosphine oxides, alkyl phosphates, etc. Shifts are usually monotonic and can be (and have been) interpreted in a variety of ways. I indicate, in section 3.8, how they can be uniquely rationalised using infrared spectroscopy.

3.5.1.2 Ions. There was extensive early work on concentration shifts for alkali-metal and halide ion resonances. These are strongly dependent upon salt concentration and on the nature of the gegenions [23]. Clearly, dehydration and ion pairing are both involved in these shifts, which tend toward the pure salt values. However, no firm structural information has been forthcoming.

For the anions, solvation occurs by hydrogen bonding and it is often claimed that the shifts reflect the strength of such bonding. However, it has recently been shown that there are linear relationships between the ¹⁹F, ³⁵Cl and ¹²⁹Xe shifts for F⁻, Cl⁻ and Xe in a range of solvents [24] (Figure 3.6). Since Xe is not expected to form hydrogen bonds in protic solvents, this correlation seems to rule out hydrogen bonding as the controlling factor. These relationships also suggest that solvation of fluoride ions is unlikely to involve very strong hydrogen bonds, despite recent claims.

[Figure 3.6: panel (a), δ(¹⁹F) (ppm) against δ(³⁵Cl) (ppm); panel (b), δ(¹²⁹Xe) (ppm) against δ(³⁵Cl) (ppm); labelled solvents include formamide, NMF, NMA, DMA, DMSO, HMPA, H2O, MeOH, EtOH and CF3CH2OH.]

3.5.2 Use of ¹H NMR shifts to study solvation of ions

Proton resonance shifts for salts in water have been studied for many years [25,26], but have proven difficult to interpret. Apart from the other problems associated with time-averaged shifts, this arises because of the problem of assigning separate shifts to cations and anions in a non-empirical fashion. Because of its relative simplicity, we have studied methanol rather than water [27] and endeavoured to extrapolate to aqueous solutions [28].

Fortunately, for methanolic solutions, the first problem can be simplified by using Mg²⁺ salts at low temperatures such that the band of Mg(MeOH)6²⁺ units is separately resolved because of slow exchange [29]. It was originally argued that shifts in the 'bulk' MeOH resonance are then purely due to the anions [26,27]. However, this cannot be the case if the effects of changes in concentrations of (OH)free and (LP)free groups are also considered [30,31]. When this is done, a reasonably self-consistent set of results is obtained. Using shift values for (OH)free and (LP)free units taken from infrared correlations, a set of individual ion molal shifts has been obtained. These are the weighted averages of the real shifts for the protons in the first solvent shells of these ions and those in the bulk (pure) solvent (methanol). The link is the solvation number of the ion. On the basis of correlations between infrared data and NMR data for ions of known solvation number, reasonable values have been obtained for a range of cations and anions [30,31] (Table 3.1 and Figure 3.7).
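The way in which the solvation number links the averaged shift to the 'real' shift of the first-shell protons can be sketched as follows. This is a deliberately simplified two-site picture of my own: it ignores the (OH)free and (LP)free corrections discussed above, treats a single ion in isolation, and uses the molar mass of methanol (32.04 g/mol) to count solvent molecules per kilogram. The shift values in the example are used only as illustrative inputs.

```python
METHANOL_MOLAR_MASS = 32.04  # g/mol


def bound_fraction(molality, solvation_number, molar_mass=METHANOL_MOLAR_MASS):
    """Fraction of solvent molecules held in the first shell of the ion."""
    solvent_moles_per_kg = 1000.0 / molar_mass
    return solvation_number * molality / solvent_moles_per_kg


def averaged_shift(molality, solvation_number, shell_shift, bulk_shift):
    """Fast-exchange shift: weighted mean of first-shell and bulk protons."""
    x = bound_fraction(molality, solvation_number)
    if not 0.0 <= x <= 1.0:
        raise ValueError("more solvent bound than is present")
    return x * shell_shift + (1.0 - x) * bulk_shift


def shell_shift_from_molal_shift(molal_shift, solvation_number, bulk_shift):
    """Invert the average: first-shell shift implied by a molal shift."""
    solvent_moles_per_kg = 1000.0 / METHANOL_MOLAR_MASS
    return bulk_shift + molal_shift * solvent_moles_per_kg / solvation_number


# Illustrative inputs: shell shift 6.75 ppm, bulk 4.5 ppm, six solvent molecules
observed = averaged_shift(0.1, 6, 6.75, 4.5)
molal = (observed - 4.5) / 0.1
print(f"averaged shift at 0.1 mol/kg: {observed:.3f} ppm")
print(f"recovered shell shift: {shell_shift_from_molal_shift(molal, 6, 4.5):.2f} ppm")
```

The point of the inversion is that a molal shift alone fixes only the product of the solvation number and the shell shift; one of the two must come from another source, such as the infrared correlations mentioned above.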

In addition to studying proton resonance shifts for water and aqueous solutions, there have been extensive studies of ¹⁷O shifts and relaxation effects. Early work has again been thoroughly reviewed by Deverell [23]. Difficulties remain in attempts to interpret the shifts in terms of structural effects. However, since oxygen is directly involved in bonding to cations, cations induce relatively large shifts, and for certain di- and trivalent cations, separate resonances can be resolved indicating relatively slow exchange [32]. Subsequent studies using this method gave solvation numbers of 4.1 (Be²⁺), 5.95 (Al³⁺) and 6 (Ga³⁺) [33]. These values are in good agreement with results of more recent studies, and with aquation in crystal hydrates.

3.5.3 Relaxation studies

Linewidth effects and relaxation time studies have been very extensive in this field. These are, of course, primarily studies of rates, which are not my present concern. However, structural inferences are often drawn from such studies. Such inferences are rather indirect and may be open to criticism, and indeed, there are several instances in which they have been revealed as faulty. It is always difficult to interpret rate data unless there is definite structural information from other sources.

Figure 3.6 Trends in chemical shifts for (a) F⁻ and (b) Xe in a range of solvents relative to those for Cl⁻ ions in the same solvents. Cations were Na⁺, Li⁺ or Bu4N⁺, and shifts were largely independent of the concentrations used. All shifts (ppm) are given relative to aqueous solutions. The value for F⁻ in DMSO is uncertain, because of the difficulty of removal of traces of water.

Table 3.1 Proton resonance shifts for various salts and individual ions in methanol, together with solvation numbers required to fit the correlation of Figure 3.7

Key     Salt/ion      Shift (ppm)ᵃ,ᵇ    Solvation number

        Bu4NI         0.2nᵈ
        Bu4NBr        0.1nᵈ
        Bu4NClO4      0.240ᵈ
i       ClO4⁻         2.0               2
ii      ClO4⁻         3.0               3
iii     ClO4⁻ ᶜ       2.6               2
iv      (LP)free      3.3
v       I⁻            3.6               4
vi      Br⁻           3.9               4
vii     (LP)free      4.0
viii    I⁻ ᶜ          4.1               4
ix      Br⁻ ᶜ         4.3               4
x       MeOH bulk     4.5
        Cl⁻           4.5               4
xi      Li⁺           5                 4
xii     Na⁺           ca. 4.5           5 ± 1
xiii    MeOH bulk     5.4
xiv     Li⁺ ᶜ         6.1               4
xv      Mg²⁺          6.75              6
xvi     Mg²⁺ ᶜ        7.1               6

ᵃ From the value for MeOH monomers in CCl4.
ᵇ At 25 °C unless otherwise stated.
ᶜ At −70 °C.
ᵈ From [30,31]; these are experimental shifts from the bulk methanol value; to convert to shifts from the monomer value, subtract from 4.5.

3.6 Vibrational chromophoric probes

I use the word 'probe' to refer to solutes whose spectra are studied in a range of pure and binary mixed solvents, the solute concentration being held as low as possible [34]. Although this has always been common practice for ions, it has not been normal procedure for neutral species, the usual practice being to treat the solute being studied as one component of a binary system. This is a far less informative procedure, as I try to show here. Generally, in our studies, the solvent can be treated as binary, and solute-solute interactions can be ignored.

[Figure 3.7: proton resonance shift (ppm) plotted against Δν(i.r.)/cm⁻¹, with points labelled i to xvi.]

Figure 3.7 Correlation between infrared shifts Δν (cm⁻¹) and proton resonance shifts Δν′ (ppm) for solutions of salts in methanol. The origin is the point for monomeric methanol in tetrachloromethane. The key, together with the solvation number required to fit the correlation, is given in Table 3.1.

Rather than attempt to generalise, I give one example in depth, and then briefly describe the range of systems so far studied, with results.

3.6.1 Triethylphosphine oxide

I have selected Et3PO as my example because of the important work of Meyer and Gutmann who used the ³¹P chemical shift for this molecule as a measure of the 'acceptor numbers' of solvents [35]. As it turns out, infrared spectroscopic studies have shed considerable light on the significance of these acceptor numbers [36]. The P–O stretch chromophore is very sensitive to solvent changes but, in contrast with the ³¹P shift, the infrared bands do not shift much but are gained and lost successively as solvation changes. This is because the lifetimes of the different solvates are long on the infrared time-scale. On going from pure water to a pure aprotic solvent such as MeCN (cyanomethane), band (3) (the pure water band) falls and a new band, (2), grows in (Figure 3.8). As [MeCN] increases and that of water falls, band (2) decreases, and a third band, (1), develops. This, in turn, gives way to band (0), characteristic of Et3PO in pure MeCN. The way these bands rise and fall with change in mole fraction (MF) is shown in Figure 3.9. The obvious and, hopefully, correct interpretation is that in pure water, Et3PO forms three hydrogen bonds to water O–H protons. As MeCN is added, these are progressively lost. Thus band (3) is assigned to Et3PO(H2O)3, (2) to Et3PO(H2O)2, (1) to Et3PO(H2O) and (0) to Et3PO with no hydrogen bonds.
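Once each band has been assigned to a solvate with a definite number of hydrogen bonds, an average solvation number follows directly from the fractions of solute in each form. The sketch below assumes that these fractions are already known; converting band areas to fractions requires relative oscillator strengths, which are not considered here. The fractions used are hypothetical, chosen only to show how non-integral values such as 1.4 or 0.5 can arise.

```python
def mean_solvation_number(solvate_fractions):
    """Average number of hydrogen bonds per solute molecule.

    solvate_fractions maps the number of hydrogen bonds in a solvate
    to the fraction of solute present as that solvate.
    """
    total = sum(solvate_fractions.values())
    if total <= 0:
        raise ValueError("fractions must have a positive sum")
    weighted = sum(n * f for n, f in solvate_fractions.items())
    return weighted / total


# All solute present as the tri-solvate, as proposed for Et3PO in pure water
print(mean_solvation_number({3: 1.0}))
# Hypothetical mixture of mono- and di-solvates
print(mean_solvation_number({1: 0.6, 2: 0.4}))
# Hypothetical mixture of unsolvated and mono-solvated molecules
print(mean_solvation_number({0: 0.5, 1: 0.5}))
```

A fractional solvation number therefore need not mean a fractional bond: it can simply reflect a mixture of solvates, each with a whole number of hydrogen bonds.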

[Figure 3.8: absorbance (arbitrary units) against wavenumber/cm⁻¹ (1150 to 1050), with bands labelled (0), (1), (2) and (3).]

Figure 3.8 Infrared spectra for dilute solutions of Et3PO in water + MeCN systems. Key (in mole fractions): 1, 0.00; 2, 0.03; 3, 0.07; 4, 0.16; 5, 0.36; 6, 0.65; 7, 0.79; 8, 0.95; 9, 0.97; 10, 0.99; 11, 1.00. Band maxima are indicated by broad arrows which span their shifts, the numbers indicating the postulated number of water molecules.

[Figure 3.9: band intensity against mole fraction of MeCN.]

Figure 3.9 Trends in the intensities of the ν bands for individual solvates in water + MeCN: 3 H2O, 2 H2O, 1 H2O and no H2O.


The mode of solvation for aprotic solvents is largely dipole-dipole in nature and is very strongly controlled by steric factors. Otherwise, solvent is simply present because of packing considerations. Generally dimethylsulphoxide, Me2S(δ+)–O(δ−), which is pyramidal and hence has an exposed dipole, produces the largest shift whereas non-dipolar solvents such as hexane or tetrachloromethane (CCl4) are treated as causing zero shifts for condensed systems such as these. Often, the shifts are to lower frequencies when hydrogen bonds form, as in this case. The shift from zero is proportional to the total H-bond strength, but the individual shifts decrease as the number of bound water molecules increases. This is another example of the anti-cooperativity principle.

It is also noteworthy that individual bands, although clearly distinct from one another, shift considerably as the solvent composition is changed. Thus bands (1), (2) and (3) shift to high frequency (weaker bonding) as the water content falls. This is thought to be a consequence of the steady weakening of the H-bonds formed by water as the reinforcing (cooperative) secondary structure is lost. These effects can be used as a measure of secondary solvation [37], but it must be remembered that, for water, the situation is complicated because of bonding to aprotic solvent molecules. Thus, in the unit Et3PO···HOH···B (where B is the aprotic solvent), when B is a strong base, the bond to Et3PO will be weaker than it is when B is a weak base (anti-cooperativity). No such complications arise for MeOH-aprotic solvent systems, which are therefore easier to interpret.

So far as I know, this is one of the best methods for estimating primary solvation numbers experimentally. In principle, neutron diffraction should be superior, although it gives somewhat different information, but it has not yet been used very successfully for systems such as those under consideration here. Using these procedures, solvation numbers for a range of 'probe' solutes have been obtained (Table 3.2). It is noteworthy that the solvation number for water is always a maximum, and greater than that for methanol, even though single hydrogen bonds formed by water and methanol molecules are of comparable strength. Also, water generally solvates all solute molecules 'fully', that is, they are all maximally solvated. However, methanol generally contains a mixture of solvates, the spectra being characterised by two bands.

Table 3.2 Solvation numbers for a range of solutes in water and in methanol, estimated from infrared spectroscopic data

Solute        Water       Methanol

MeCN          1ᵃ          0.5ᵇ
Me2CO                     ca. 1.4ᵇ
HCONMe2       2
MeCONMe2      2           ca. 1.4
MeCO2Me       ca. 1.5     ca. 0.5
(Me3C)2NO     2           ca. 1
Et3PO         3           ca. 1.8ᵇ

ᵃ This result is not as well established as the remainder (see text).
ᵇ 0.5 implies that some molecules form no H-bonds and some form one; 1.4 implies some form one and some form two, etc.

3.6.2 Cyanomethane

This is the weakest basic probe that we have studied so far. It is, of course, an extremely important solvent, and is completely miscible with water despite its inability to form strong hydrogen bonds. Results, displayed in Figure 3.10, show that, in contrast to all the other systems studied, hydrogen bonding causes a shift of the C≡N chromophore to high frequencies, whilst strong dipolar interactions still show the usual low frequency shifts. Reasons for this contrast have been outlined [38]. Of major interest, however, is the behaviour in mixed solvents. On going from water to, say, DMSO, there is just a monotonic shift as found in NMR shift studies. A series of bands does not seem to form, even though the bands are relatively narrow compared with the shifts. This is especially surprising since water only forms one or possibly two hydrogen bonds to MeCN [38], so the expected gain and loss of bands should be readily detected.

[Figure 3.10: absorbance (vertical axis ticks 200, 600, 1000) against νmax(C≡N)/cm⁻¹ (2270 to 2250) for MeCN in a range of solvents.]

Figure 3.10 Changes in absorbance (εmax) and in ν(C≡N) for MeCN in a range of pure protic and aprotic solvents. Since MeOH solutions exhibit two bands, the true absorbance for the high-frequency band would be greater if only this species were present. This is indicated by the dashed line.

For methanol + aprotic solvent systems, the same effect is observed at ca. 40 °C, but at low temperatures well-defined, separate bands are detected as is usually observed for other systems. We have tentatively suggested that the molecules involved are in slow exchange on the infrared time-scale at low temperatures, but approach the fast exchange situation on warming, and exist only in the fast exchange state for mixed aqueous systems. In a simplified sense, this means that reaction (3.6) can occur so rapidly that the individual bands are averaged.

MeCN···HOR ⇌ MeCN + HOR (3.6)

This, in turn, implies that the mean lifetime of the MeCN···HOR unit is less than ca. 10⁻¹² s. This lifetime agrees quite well with that estimated from molecular dynamics studies [38] but much more work is required on this and related systems.
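As a rough consistency check on the picosecond estimate, the sketch below applies the standard two-site coalescence condition for equally populated sites, k = πΔν/√2, to a pair of infrared bands. Using this relation for the infrared case is an assumption made here for illustration, and the band separations (in cm⁻¹) are hypothetical values, not measurements from this work.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s

def coalescence_lifetime(separation_cm1: float) -> float:
    """Mean lifetime (s) at which two equally populated bands separated by
    `separation_cm1` (in cm^-1) just merge, using k = pi * delta_nu / sqrt(2)."""
    delta_nu_hz = C_CM_PER_S * separation_cm1   # band separation converted to Hz
    k = math.pi * delta_nu_hz / math.sqrt(2)    # exchange rate at coalescence, s^-1
    return 1.0 / k

# Hypothetical band separations, chosen only to show the order of magnitude.
for sep in (5.0, 10.0, 20.0):
    print(f"{sep:5.1f} cm^-1 -> lifetime below about {coalescence_lifetime(sep):.1e} s")
```

Under these assumptions, separations of the order of 10 cm⁻¹ correspond to lifetimes of the order of 10⁻¹² s, which is the time-scale quoted above.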

3.6.3 Acetone

Our studies of acetone as a probe gave well-resolved bands for the >C=O chromophore showing clearly that a di-hydrate is present in aqueous solutions [39]. The results for frozen aqueous solutions are of particular interest since, under controlled conditions, freezing results in clathrate formation. The resulting spectra are shown in Figure 3.11. These show clearly that in the clathrate, acetone is quite free, the Lorentzian bandwidth being governed by rotation within the cages of fully hydrogen-bonded water. In contrast, once the solutions melt, water forms two hydrogen bonds to the oxygen atom of the carbonyl group, giving a Gaussian band, strongly shifted to low frequency. The important, negative conclusion is that one cannot argue that because a

Figure 3.11 Fundamental C=O stretching band for dilute solutions of acetone in hexane and in water. (Absorbance against wavenumber/cm⁻¹, 1750 to 1650.)


given molecule is encaged in a solid clathrate, it will be similarly 'solvated' in fluid aqueous solutions.

3.7 Near infrared studies

It is important to realise that the actual distribution of O-H hydrogen bond strengths in solvents such as water or alcohols is not reflected in the infrared spectra. As stressed above, the oscillator strength and low-frequency shift increase strongly as the hydrogen bond strength increases (Figure 3.1). Thus, weak bonds, although present in significant amounts, may make a negligible contribution to the observed spectra. Fortunately, this situation is, for various reasons, reversed in the first and second overtone regions. Here, very weak hydrogen-bonded units have much greater relative oscillator strengths, and as the bands remain narrow, their contributions are readily detected. Indeed, there is often a clear break between free or weakly bonded OH units, and strongly bonded groups. Thus the NIR is the region of choice if interest is in weak hydrogen bonds, whereas the fundamental region is preferable for studying strong bonds (Figure 3.12).

3.7.1 Free OH groups

There is some controversy regarding the possibility that at least part of the narrow high-frequency band in the NIR spectra for water and aqueous

Figure 3.12 First overtone (2ν(OH)) for HOD in D2O, over a range of temperatures. (Absorbance against wavelength/nm, 1400 to 1600.)


solutions is due to 'free' O-H units, that is O-H units that are not involved in any hydrogen bonding. (Note that, for water and aqueous solutions, it helps to use solutions of HOD in D2O for studying the O-H groups. This is because in HOH, there is strong coupling giving ν1, ν3 and (ν1 + ν3) components. Thus the spectrum is greatly simplified using HOD where coupling can be ignored, and only one O-H is involved.)

In our view [40], the narrow band at ca. 7100 cm⁻¹ has contributions from (OH)free and weakly bound O-H groups, but others maintain that all O-H units are bonded, this band being a measure of all weakly bonded units [41]. The following general points seem to support the (OH)free postulate: (i) Since there must be lots of very weak OH···O hydrogen bonds, I find it hard to understand how the presence of 'free' groups can be avoided. The three-dimensional structure of water is such that one would expect occasional sites where O-H groups are surrounded by strongly bonded molecules somewhat resembling part of a clathrate cage, so that formation of a bond is momentarily prohibited. The lifetime of such units in fluid water must surely be short because of rapid fluctuations but, nevertheless, they are expected to have a real existence. (ii) The overtone spectrum for methanol has only a very weak shoulder in the 'free' region. It is hard to understand why fluid methanol should have so many fewer weakly bonded units than water. However, it is quite reasonable that water should have many more broken bonds, because of its three-dimensional structure. Indeed, without many broken bonds, the low viscosity of liquid water would not be easy to understand.

The other major reason why methanol is expected to have very low concentrations of free OH groups is that each molecule has one site (lone-pair of electrons) not taken up by hydrogen bonding in the normal 'linear structure' (5). Any one of these will 'scavenge' (OH)free groups, thereby keeping their concentration very low. For water, all bonding sites are normally occupied, the majority of molecules forming four hydrogen bonds as in ice. It is noteworthy that on going to t-butyl alcohol, a narrow 'free' OH band is again seen. This is, in our view, because it is sterically difficult to form branched chain units, so not all the 'free' OH groups can be scavenged.

····O(Me)-H····O(Me)-H····O(Me)-H····O(Me)-H···· (5)

Each oxygen in the chain retains one free bonding site (lone pair), marked by an arrow in the printed structure.

Another factor (iii) which indicates the presence of 'free' (OH) groups in liquid water is that, on adding basic co-solvents capable of forming strong H-bonds, the 'free' band is rapidly reduced in intensity, although only ca. 50% of the intensity is lost. Since there is no reason why the range of weakly bonded OH groups should be greatly modified, we argue [42] that this loss is due to


reaction with 'free' OH units. I stress that if there are (OH)free groups, there must be an equal number of free oxygen sites, which I have termed free lone-pair or (LP)free groups in pure liquid water (Scheme 3.2). For water, only the (OH)free groups are readily followed spectroscopically, but for methanol, it is the (LP)free groups that are most readily monitored [43].

The link between these studies and those outlined in the previous section can be seen from Figure 3.13. Instead of attempting to 'count' (OH)free groups directly (because of concomitant changes in the weakly bound units) we have calibrated the trends shown in Figure 3.13 against known systems. For example, it is well established that chloride ions in water have a solvation number of 6 [44]. Also, tetraalkylammonium ions do not form hydrogen bonds. Hence, when R4N+Cl- salts are added to water, the decrease in the band assigned to (OH)free groups is largely a consequence of hydrogen bonding to Cl- ions. Given that the slope of the line for R4N+Cl- salts corresponds to the uptake of 6 (OH)free per Cl-, then the slope for Et3PO, which is just half that of Cl-, corresponds to the uptake of 3 (OH)free and that for, say, dimethylformamide, to the uptake of 2 (OH)free groups per molecule. These data, and many others, agree well with the quite distinct results of these solutes in the fundamental IR region discussed above (Table 3.2). Because the latter studies seem to be compelling, I consider that this agreement offers strong support for the concept of (OH)free and (LP)free units in liquid water.
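The calibration argument amounts to scaling each solute's slope against that of the chloride reference. A minimal sketch follows; the function name and the slope values are hypothetical, and only the slope ratios (one half for Et3PO, and one third for dimethylformamide as implied by an uptake of 2 against 6) follow from the text.

```python
def uptake_per_solute(slope: float, reference_slope: float,
                      reference_uptake: float = 6.0) -> float:
    """Scale a solute's NIR slope against the calibrant
    (R4N+Cl-, taken as 6 (OH)free groups per Cl-)."""
    return reference_uptake * slope / reference_slope

# Slopes in arbitrary units; only the ratios matter. These are illustrative
# ratios consistent with the text, not measured values.
slopes = {"R4N+Cl-": 1.0, "Et3PO": 0.5, "dimethylformamide": 1.0 / 3.0}

for solute, slope in slopes.items():
    n = uptake_per_solute(slope, slopes["R4N+Cl-"])
    print(f"{solute}: about {n:.1f} (OH)free per solute")
```

The design choice here mirrors the text: no attempt is made to count (OH)free groups absolutely, so every estimate inherits whatever uncertainty attaches to the reference value of 6 for chloride.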

3.7.2 Some consequences of the 'free-group' postulate

If it is accepted that there are significant concentrations of these 'free' groups, it is worthwhile considering some consequences. If the results of 'titrations' with

Figure 3.13 Near infrared estimates of relative changes in the concentration of (OH)free groups as a function of the concentrations of added salts and of Et3PO.


basic co-solvents are correct, the concentrations of these groups must be approximately 10% at room temperature. (The inaccuracies involved are such that it is not, at present, worth trying to make a more accurate estimate.) That is, some 10% of the maximum possible number of hydrogen bonds are broken at any time, each 'broken bond' contributing one (OH)free and one (LP)free group. These are completely uncorrelated, that is, they do not move in pairs but as single units. (This is an important point, since many discussions of broken bonds are centred on some arbitrary cut-off for weak, but real hydrogen bonds. Such discussions artificially constrain the (OH)free and (LP)free units to be close together.)
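To put the 10% figure on a concentration scale, the short calculation below converts it to mol dm⁻³. The molarity of pure water, the count of two donor O-H groups per molecule, and the comparison with H3O+ at pH 7 are assumptions introduced here for illustration; only the 10% fraction comes from the text.

```python
WATER_MOLARITY = 55.5    # mol dm^-3, pure water near room temperature (assumed)
OH_PER_MOLECULE = 2      # each H2O can donate two hydrogen bonds
BROKEN_FRACTION = 0.10   # approximate fraction of broken bonds quoted in the text
H3O_NEUTRAL = 1.0e-7     # mol dm^-3 at pH 7 (assumed, for comparison only)

def free_group_molarity(fraction: float = BROKEN_FRACTION) -> float:
    """Concentration of (OH)free, equal to that of (LP)free, in mol dm^-3."""
    return fraction * OH_PER_MOLECULE * WATER_MOLARITY

oh_free = free_group_molarity()
print(f"(OH)free is roughly {oh_free:.0f} mol dm^-3")
print(f"ratio to H3O+ at pH 7 is roughly {oh_free / H3O_NEUTRAL:.0e}")
```

On these assumptions the 'free' groups are present at roughly molar concentrations, many orders of magnitude above those of the ions formed by self-ionisation.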

3.7.2.1 Analogy with ionisation. Perhaps the best way to understand this point and, indeed, the potential importance of these 'free' groups, is the link with the process of self-ionisation (3.7).

2H2O ⇌ H3O+ + OH-

2H2O ⇌ (LP)free + (OH)free (3.7)

The connection between these reactions is depicted in Scheme 3.3. This shows an (OH)free group separated (arbitrarily) from an (LP)free unit by two water molecules. Three concerted proton transfers leave these two intermediate molecules fully hydrogen bonded. The OH- ion is generated from (OH)free directly, with three potentially strong hydrogen bonds already present. It is known that the proton for OH- ions is free, and the solvation number is probably four on average, so that a value of three is quite reasonable for the

Scheme 3.3 [A chain of hydrogen-bonded water molecules linking an (OH)free group to an (LP)free site; concerted proton transfer along the chain converts these into OH- and H3O+.]


system. Similarly, the H3O+ ion, derived from (LP)free, is directly solvated correctly, with three good bonds to the three hydrogens and no bond to oxygen. Thus a single concerted act generates well-solvated ions and no major reorganisation is required. I suggest that this is the preferred route to the formation of these ions. An important consequence of this theory is that factors that reduce the concentrations either of (OH)free or of (LP)free groups should decrease the equilibrium concentration of ions. So far, I have found no exceptions to this rule [45].

There must be many bulk properties of water such as diffusion, viscosity, etc., that depend, in part, on the concentrations of (OH)free and (LP)free units. Also, in my view, there are important chemical consequences [46, 47]. Just as OH- and H3O+ are often important reagents in chemical reactions, so also, I suggest, are the far more abundant (OH)free and (LP)free groups. Two types of reactions seem to illustrate this suggestion very well. One is the SN1 type reaction typified by reactions (3.8) and (3.9).

(3.8)

(3.9)

The first reaction can be written as an attack on the Cl unit by (OH)free groups. We have adduced spectroscopic evidence that R-hal molecules are partially hydrogen bonded in methanol [48]. Using the generalisation that the solvation number for a given solute is greater in water than in methanol, we conclude that there are many RCl···HOH units already present. In that case, only one or possibly two more hydrogen bonds need to be formed to achieve ionisation (3.8). Analysis of the results suggests that two (OH)free groups are involved. Reaction (3.8) is rate limiting but reversible, reaction of the carbocation being very fast (3.9). Indeed, it is probable that this occurs before the ions have fully separated. This does not alter my conclusion that the active agent is (OH)free. (I stress that the reason for this contention is that fully hydrogen-bonded water molecules have no reactive sites, whereas (OH)free has a very strong hydrogen-bonding capability and the RCl molecule has vacant hydrogen-bonding sites.) If this is correct, then additives that increase or decrease the concentration of (OH)free should also increase or decrease the rates of SN1 type reactions. This is indeed the case [49].

A second example is the hydrolysis of arylsulphonylmethyl perchlorates [50]. The first, and rate-determining, step is thought to be the ionisation (3.10). In this case the active agent is thought to be (LP)free, and catalysis should occur when additives induce an increase in the concentration of these groups. This is again found to occur in the predicted fashion. So far as I know, these concepts provide the simplest and most direct explanation for these quite disparate reactions. It seems a pity that this approach is not used in mechanistic studies.

(3.10)


3.7.3 Use of overtone infrared (NIR) to study solvation of ions

The simple model proposed in this chapter requires that ions are solvated as indicated in Scheme 3.4. Anions form normal 'linear' hydrogen bonds as in (a), rather than bifurcated bonds as in (b). Cations, at least in crystalline hydrates, can solvate as in (c) via one (LP) group, or as in (d) via both. The former is favoured by large univalent ions such as K+ and the latter by small polyvalent ions such as Mg2+. It is important to realise that ionic solvation is an example of 'anti-cooperativity'. Solvation is relatively strong because of the localised charge, but each time a solvent molecule is added, bonds to the new shell of primary solvent molecules all get weaker [51]. However, for cations, on going from (c) to (d) the bond to the cation becomes stronger since anti-cooperativity is reduced. Also, secondary solvation is cooperative, and increases the strengths of primary solvent bonds.

Scheme 3.4 [Modes of ionic solvation: (a) linear X-···H-O hydrogen bond to an anion; (b) bifurcated bonding to an anion; (c) cation M+ bound via one lone pair of the water oxygen; (d) cation M+ bound via both lone pairs.]

These concepts nicely explain trends in the intensity of the NIR HOD band associated with (OH)free groups as electrolytes are added (Figure 3.13). The 'normal' salts such as NaCl have little effect, but Na+BPh4- causes a marked increase in the (OH)free band whilst R4N+hal- salts cause a marked decrease in the band. The results agree very well with the theory, provided BPh4- and R4N+ ions do not form hydrogen bonds or bonds to oxygen (LP) groups. (These ions may cause water 'cages' to grow around them at low temperatures, but this has no major influence on the concentrations of (LP)free or (OH)free groups.) In that case, Na+ ions take up (LP)free groups and the equilibrium shifts, with consequent gain in (OH)free groups as observed. Similarly, hal- ions take up (OH)free groups, and this loss is again seen in the spectral change.


If these are added together for Na+BPh4- and R4N+Cl-, the shift for Na+Cl- is well reproduced. Thus the small observed shift does not mean that Na+Cl- has only a small effect, but that these effects nearly cancel. Markedly different results for Na+Cl- and R4N+Cl- have been observed in a number of different studies, but they are usually interpreted in terms of some special property of R4N+ ions rather than being attributed to chloride ions.

In this simple model for ionic hydration, there is no reason to postulate any major disruption of the secondary solvent shell around ions. This is a popular concept, but it seems to me to be quite unnecessary, and certainly the spectroscopic evidence is against the concept. Such a disordered region would have to give rise to an increase in the number of broken bonds and hence in the concentrations of (LP)free and (OH)free units. This is clearly not observed.

In principle, it should be possible to estimate primary solvation numbers from the trends found (Figure 3.13). In practice, as stressed above, this is difficult because of the unknown contribution from very weak hydrogen bonds to the sharp spectral feature in the NIR. It is easier to argue from one 'known' value. Thus, inverting the argument given above, we can say that our 'probe' results for Et3PO indicate a solvation number of 3. This leads to a value of 6 for Cl- ions which agrees well with neutron scattering studies [44]. A 'solvation' number of ca. 7 is then predicted for Na+ ions, but we are measuring changes in (OH)free groups, and if some water molecules act as in (c) rather than (d) (Scheme 3.4), this will require a reduced solvation number. Thus, if the true value is, say, 6, then one of these solvent molecules should be bonded as in (c), the others forming three hydrogen bonds to neighbouring water molecules (d).
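The additivity argument for salts can be written out as a small bookkeeping exercise. In the sketch below the per-ion values are expressed as (OH)free groups gained (positive) or lost (negative) per ion. The values of 6 for Cl- and ca. 7 for Na+ are taken from the text, reading the apparent Na+ number as a gain in (OH)free, which is an interpretation adopted here; the zero entries for BPh4- and R4N+ encode the stated assumption that these ions are inert, and the function name is hypothetical.

```python
# Net (OH)free groups gained (+) or lost (-) per ion, in the simple model.
ION_EFFECT = {"Na+": +7.0, "Cl-": -6.0, "BPh4-": 0.0, "R4N+": 0.0}

def salt_effect(cation: str, anion: str) -> float:
    """Net change in (OH)free per formula unit, assuming that cation and
    anion contributions are simply additive."""
    return ION_EFFECT[cation] + ION_EFFECT[anion]

na_bph4 = salt_effect("Na+", "BPh4-")   # marked increase in the (OH)free band
r4n_cl = salt_effect("R4N+", "Cl-")     # marked decrease in the (OH)free band
nacl_from_sum = na_bph4 + r4n_cl        # sum of the two 'one-sided' salts
nacl_direct = salt_effect("Na+", "Cl-")

print(f"NaBPh4 {na_bph4:+.0f}, R4NCl {r4n_cl:+.0f}, NaCl {nacl_from_sum:+.0f}")
```

The small net value for NaCl arises from two large contributions of opposite sign, which is the point made in the text about the apparently 'normal' behaviour of this salt.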

These results can be compared with those based on NMR 1 H shifts caused by added electrolytes [30, 31].

3.8 Use of results from vibrational spectroscopy to interpret magnetic resonance data

From the outlines given above, it is clear that NMR and ESR results give precise data, but being time-averages, do not provide structural data completely unambiguously. In contrast, infrared or Raman spectroscopies give broad lines, but often these can be resolved into individual components and hence provide structural information. From plots depicting the gain and loss of individual bands which have been assigned to different solvates in mixed solvent systems, it is possible to obtain estimates of the expected NMR shifts in certain cases. These are cases for which NMR shifts are large, and are for nuclei which are a part of the chromophore studied by vibrational spectroscopy.


3.8.1 NMR shifts

In such cases, plots of NMR shifts against infrared band maxima are linear for many pure solvents. A good example is that for Et3PO where the 31P shift is correlated with ν(P-O) (Figure 3.14). The only data that fail to fit the linear correlation are those for methanol and ethanol. However, for these solvents two distinct infrared bands are resolved. If the weighted mean of these bands is used, the correlation with the NMR data is good. In order to interpret mixed solvent NMR shifts, we make the assumption that this plot can be used to estimate NMR shifts for specific solvates from νmax(P-O). It is on this basis that NMR solvent shifts can be reconstructed. The comparison with experimental shifts, shown in Figure 3.15, is remarkable. Inaccuracies arise particularly because of the extra shifts induced in the infrared bands as a result of changes in secondary solvation. Since the NMR results are very precise, it should be possible to use this precision to improve the less certain infrared data. However, we are satisfied that the interpretations offered are broadly valid, and that the ambiguities involved in the interpretations of NMR shifts have thereby been removed.

Figure 3.14 Correlation between 31P resonance shifts and ν(P-O) for Et3PO in dilute solution in a range of pure solvents. (Et3N, triethylamine; THF, tetrahydrofuran; Hex, hexane; DMSO, dimethylsulphoxide; diol, ethane-1,2-diol.)
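The reconstruction of NMR shifts from infrared data can be sketched in two steps: fit the linear correlation for pure solvents, then take the population-weighted mean of the shifts predicted for each solvate. All numbers below are hypothetical and chosen only to lie in the general range of the correlation plot; the function names are illustrative, and equating solvate populations with relative infrared band areas is a further assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def averaged_shift(populations, band_maxima, a, b):
    """Fast-exchange NMR shift: population-weighted mean of the shifts
    predicted for each solvate from its infrared band maximum."""
    total = sum(populations)
    return sum(p * (a + b * nu) for p, nu in zip(populations, band_maxima)) / total

# Hypothetical calibration: nu(P-O) in cm^-1 against 31P shift in ppm.
nu_pure = [1195.0, 1175.0, 1160.0, 1140.0, 1120.0]
shift_pure = [44.0, 49.0, 52.75, 57.75, 62.75]
a, b = fit_line(nu_pure, shift_pure)

# Hypothetical mixed solvent: unbonded, 1(MeOH) and 2(MeOH) solvates.
populations = [0.2, 0.5, 0.3]          # relative band areas (assumed)
band_maxima = [1180.0, 1160.0, 1140.0]  # cm^-1 (assumed)
print(f"slope {b:.3f} ppm per cm^-1, "
      f"predicted shift {averaged_shift(populations, band_maxima, a, b):.2f} ppm")
```

Repeating the second step at each solvent composition, with populations read from the infrared spectra, would give a reconstructed shift curve to set beside the measured one.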

Figure 3.15 31P resonance shifts (ppm) for Et3PO in methanol + base systems as a function of the mole fraction of base. The points marked 2(MeOH) and 1(MeOH) were deduced from the infrared spectra and the correlation of Figure 3.14; (a) experimental values; (b) reconstruction using IR data.


3.8.2 ESR data

Similarly, the ESR results for nitroxide radicals can now be interpreted. For example, the problem outlined above (section 3.4.3) for solutions of DTBN in methanol and water is resolved. The increase in Aiso(14N) on going from methanol to water is caused by an increase in solvation number rather than in hydrogen bond strength. This knowledge helps us to understand the fact that Aiso for phospholipids spin-labelled at the head group is about equal to that assigned to molecules having one hydrogen bond to water rather than two. We suggest that such >NO groups can only add one water molecule under the constraints of the membrane surface [6].

3.8.3 Why are solvation numbers for solutes greater in water than in other protic solvents?

The hydrogen bond donating powers of water molecules and alcohols are comparable, both for the monomers in inert solvents and for the bulk solvents. Nevertheless, our results show that the primary solvation number for hydrogen bond acceptor solutes (such as Me2CO, MeCONMe2 or anions) is usually at a maximum for aqueous solutions, but considerably less for methanolic solutions.

A reasonable solution to this problem comes directly from the concept of (OH)free and (LP)free groups in water. Such groups are present in considerable numbers (ca. 10%). They are not present to anything like this extent in liquid methanol because each 'linearly' hydrogen-bonded methanol molecule has an unused lone-pair and can accept an (OH)free unit. Thus, when hydrogen bonds break in methanol, the resulting (OH)free units are immediately scavenged. Hence a dilute solute in water will take up as many (OH)free groups as possible (depending on the number of basic sites) whereas, in methanol, there is competition for OH groups between solute and solvent, and, generally, there is a compromise.

3.9 Solvation in biological systems

There is no room here for anything more than a brief reference to the fact that these spectroscopic tools are very powerful potential weapons in the study of bio-solvation. One aspect is to extend the work described above, especially in sections 3.5-3.7, to biological systems. This means that good chromophores such as >C=O in amides, esters, etc., or P-O in phosphates or -SH and -S-S- units in proteins can be studied, preferably by a combination of vibrational and NMR spectroscopies. This information is leading to a better understanding of the solvation of many active groups.


3.9.1 Solvation changes

Another area of great importance is the way in which solvation (aquation) changes when molecules interact. For example, when a substrate enters a protein pocket and is prepared for reaction by the enzyme, how does the solvation change? What is the role of water? An important contribution has been made in this area, using infrared spectroscopy, by Wharton and co-workers [52, 53].

3.9.2 NMR spectroscopy

Modern NMR techniques are, of course, widely used as a complement to X-ray crystallography to give high resolution structural information on relatively small proteins, and are continually being extended upwards. This method is also extremely powerful in studying 'substrate' molecules in the active sites of proteins. Such work has more in common with crystallography than with the type of spectroscopic studies considered here. It has been very widely reviewed in recent years.

3.9.3 Solvation of small biomolecules

Yet another question is how are small biologically important molecules normally aquated? This is important when considering the distribution of such molecules within cells. It is probable that molecules such as O2 or NO are in clathrate-type cages in water. There is some evidence for weak hydrogen bonding for CO2 in water [54], although the normal view is that these molecules are also enclathrated [55].

Another potentially important example in which solvation can play a major role is the conversion of oxygen into the superoxide ion, O2•-, mentioned in section 3.4.4. If the oxygen is in aqueous solution, aquation of O2•- will occur on the picosecond time-scale, and the active agent will be fully aquated O2•-. However, oxygen is very lipid soluble, and if O2•- formation occurs in that phase, the ion will remain unsolvated for a relatively long time. Such 'dry' O2•- ions are very much more reactive than the solvated ions with respect to their electron-donating power, their free-radical nature, and also their ability to act as nucleophiles [56].

These are just a few examples. The field of bio-solvation is surely one of extreme importance, and I am sure that spectroscopic methods will play an important and exciting role in its exploration.

References

1. M. Smith and M.C.R. Symons (1958) Trans. Faraday Soc. 54, 338-345.
2. M. Smith and M.C.R. Symons (1958) Trans. Faraday Soc. 54, 346-352.
3. K. Dimroth and C. Reichardt (1965) Angew. Chem. Int. Ed. 4, 29.
4. E.M. Kosower (1958) J. Am. Chem. Soc. 80, 3253; E.M. Kosower and P.E. Klinedinst (1956) J. Am. Chem. Soc. 78, 3493.
5. V. Gutmann (1976) Electrochim. Acta 21, 659.
6. M.C.R. Symons and A.S. Pena-Nunez (1985) J. Chem. Soc., Faraday Trans. 1 81, 2421.
7. M.C.R. Symons and S.E. Jackson (1979) J. Chem. Soc., Faraday Trans. 1 75, 1919.
8. R. Catterall (1971) Nature 229, 10; M.J. Blandamer, R. Catterall, L. Shields and M.C.R. Symons (1964) J. Chem. Soc. 4357.
9. See, for example, D. Marsh (1985) Spectroscopic Studies of Dynamic Molecular Biological Systems, Academic Press, New York, p. 209.
10. J.H. Sharp and M.C.R. Symons (1972) in Ions and Ion-Pairs in Solution, ed. M. Schwarc, Wiley, New York.
11. T. Tuttle, J. Ward and S. Wiseman (1956) J. Chem. Phys. 25, 189.
12. T.R. Griffiths and M.C.R. Symons (1960) Mol. Phys. 3, 90.
13. B.E. Conway (1981) Ionic Hydration in Chemistry and Biophysics, Elsevier, Amsterdam.
14. T. Gough and G.R. Hindle (1970) Trans. Faraday Soc. 66, 2420.
15. Y.Y. Lim, E.A. Smith and M.C.R. Symons (1976) Trans. Faraday Soc. 72, 2876.
16. D. Jones and M.C.R. Symons (1971) Trans. Faraday Soc. 67, 961.
17. H. McConnell and P. Rey (1977) J. Am. Chem. Soc. 99, 1637.
18. N.A. Malik, E.A. Smith and M.C.R. Symons (1989) J. Chem. Soc., Faraday Trans. 1 85, 3245.
19. See, for example, G.W. Robinson, P.J. Thistlethwaite and J. Lee (1986) J. Phys. Chem. 90, 4224.
20. M.C.R. Symons and J.M. Stephenson (1981) J. Chem. Soc., Faraday Trans. 1 77, 1579; M.C.R. Symons, G.W. Eastland and L.R. Denny (1980) J. Chem. Soc., Faraday Trans. 1 76, 1868.
21. D.R. Brown, G.W. Eastland and M.C.R. Symons (1979) Chem. Phys. Lett. 61, 92.
22. M.C.R. Symons (1972) Nature (London) 239, 257; (1975) Philos. Trans. R. Soc. London, Ser. B 272, 13.
23. C. Deverell (1961) Nuclear magnetic resonance studies of electrolyte solutions. Prog. Nucl. Magn. Reson. Spectrosc. 4, 235.
24. C. Carmona, G. Eaton and M.C.R. Symons (1987) J. Chem. Soc., Chem. Commun. 873.
25. M.C.R. Symons and J. Burgess (1968) Q. Rev. 22, 276.
26. R.N. Butler and M.C.R. Symons (1969) Trans. Faraday Soc. 65, 945.
27. R.N. Butler, J. Davies and M.C.R. Symons (1970) Trans. Faraday Soc. 66, 2426; S. Ormondroyd, E.A. Phillpot and M.C.R. Symons (1971) J. Chem. Soc., Faraday Trans. 1 67, 1253.
28. J. Davies, S. Ormondroyd and M.C.R. Symons (1971) J. Chem. Soc., Faraday Trans. 1 67, 3465.
29. J.H. Swinehart and H. Taube (1962) J. Chem. Phys. 37, 1579.
30. M.C.R. Symons (1983) J. Chem. Soc., Faraday Trans. 1 79, 1273.
31. H. Robinson and M.C.R. Symons (1985) J. Chem. Soc., Faraday Trans. 1 81, 2131.
32. J.A. Jackson, J.F. Lemour and H. Taube (1960) J. Chem. Phys. 32, 553.
33. R.E. Connick and D.N. Fiat (1966) J. Am. Chem. Soc. 80, 4754.
34. M.C.R. Symons (1986) Pure Appl. Chem. 58, 1121.
35. V. Gutmann and R. Schmid (1974) Coord. Chem. Rev. 12, 263.
36. M.C.R. Symons and G. Eaton (1982) J. Chem. Soc., Faraday Trans. 1 78, 3033.
37. G. Eaton and M.C.R. Symons (1988) J. Chem. Soc., Faraday Trans. 1 84, 3459.
38. G. Eaton, A.S. Pena-Nunez, M.C.R. Symons, M. Ferrario and I.R. McDonald (1988) J. Chem. Soc., Faraday Trans. 1 85, 237.
39. G. Eaton and M.C.R. Symons (1982) Faraday Symp. Chem. Soc. 17, 31.
40. J.D. Worley and I.M. Klotz (1969) J. Chem. Phys. 45, 2868; W.A.P. Luck and W. Ditter (1966) Ber. Bunsenges. Phys. Chem. 70, 1113; M.C.R. Symons (1981) Acc. Chem. Res. 14, 179.
41. Y.Y. Efimov and Y.I. Naberukhin (1988) Faraday Discuss. Chem. Soc. 85, 117.
42. M.C.R. Symons, J.M. Harvey and S.E. Jackson (1980) J. Chem. Soc., Faraday Trans. 1 76, 256.
43. M.C.R. Symons, V.K. Thomas, N.J. Fletcher and N.G.M. Pay (1981) J. Chem. Soc., Faraday Trans. 1 77, 1899.
44. G.W. Neilson and J.E. Enderby (1979) Annu. Rep. Prog. Chem. 76, 185.
45. Unpublished results.
46. M.C.R. Symons (1978) J. Chem. Soc., Chem. Commun. 419.
47. M.C.R. Symons (1978) J. Chem. Res. (S) 140.
48. M.C.R. Symons, N.G.M. Pay and G. Eaton (1982) J. Chem. Soc., Faraday Trans. 1 78, 1841.
49. E. Grunwald and S. Winstein (1948) J. Am. Chem. Soc. 70, 846.
50. L. Menninga and J.B.F.N. Engberts (1976) J. Am. Chem. Soc. 98, 7652.
51. I.M. Strauss and M.C.R. Symons (1977) J. Chem. Soc., Faraday Trans. 1 73, 1796.
52. A.J. White and C.W. Wharton (1990) Biochem. J. 270, 627.
53. P.J. Tonge, M. Pusztai, A.J. White, C.W. Wharton and P.R. Cavey (1991) Biochemistry 30, 4790.
54. G. Eaton and M.C.R. Symons (1990) J. Mol. Liquids 46, 197.
55. D.W. Davidson (1973) in Water: A Comprehensive Treatise, ed. F. Franks, Plenum Press, New York.
56. P.J. Boon, M.T. Olm and M.C.R. Symons (1988) J. Chem. Soc., Faraday Trans. 1 84, 3334.


4 Origins of enantioselectivity in catalytic asymmetric synthesis

J.M. BROWN, P.J. GUIRY and A. WIENAND

4.1 Introduction

The catalytic application of organometallic compounds in organic synthesis dates back over 50 years but their use in asymmetric synthesis is much more recent. The most favourable examples rival or exceed the power of enzymes in specific cases, and several procedures are working on a large scale in industry [1].

Low-valent transition metal complexes are most effective in catalysing reactions which involve retention or reduction of the oxidation level of the reactant. Thus, homogeneous hydrogenation, normally by Rh or Ru complexes, is probably the most widely applied example of asymmetric catalysis, and also the best understood. A related development, likewise based on C-H bond formation, is the asymmetric isomerisation reaction of olefins in which the double bond shift creates a new chiral centre. Asymmetric hydroformylation (the conversion of an olefin into its homologous aldehyde by co-addition of H2 and CO) is known but much less well developed. The reaction may be viewed as the formal addition of the C-H bond of formaldehyde to an olefin, simultaneously producing a new C-H and a new C-C bond. Several catalytic procedures are known which involve the formal coupling of a carbon electrophile and a carbon nucleophile to form a single new C-C bond, generating an asymmetric centre in the process. Among these, cross-coupling of a racemic Grignard reagent (or the related organozinc compound) with a vinyl or aryl halide catalysed by a Ni or Pd complex is the most widely known. Allylic alkylation with soft nucleophiles such as malonates or β-diketonates catalysed by palladium complexes is also a potential asymmetric reaction, but that potential has proved quite difficult to realise in practice*.

The chapter concentrates on those reactions where the mechanism is at least partly understood, for a wholly pragmatic reason; speculative comments on the origin of enantioselection are hardly justified if the catalytic pathway is ill-defined.

*But see O. Reiser (1993) Angew. Chem. Int. Ed. 32, 547.


4.2 Homogeneous hydrogenation with rhodium complexes

Almost as soon as Wilkinson and others had shown the homogeneous hydrogenation of olefins using ClRh(PPh3)3 to be a practically useful reaction [2], attempts were made to develop asymmetric analogues through the reduction of prochiral reactants. These initial attempts relied on the configurational stability of resolved chiral P(III) compounds, and although the principle was readily demonstrated, enantiomer excesses in these early reactions were uniformly low. The first breakthrough came from two laboratories, those of Knowles at Monsanto, St. Louis and Kagan in Orsay; both were attempting to prepare enantiomerically pure amino acids from dehydroamino acid derivatives. The former used a P-chiral phosphine 'PAMP' (closely related to a Monsanto agrochemical intermediate) and the latter a C2-symmetric chelating diphosphine 'DIOP' in which the chirality resided in the backbone rather than at phosphorus [3,4]. This latter prescription has been widely followed and there are currently well in excess of a hundred examples of chelating diphosphines R2P{Q}PR2 in which {Q} represents a chiral element; central, axial and planar chirality are all well represented in ligand synthesis. The group R is most commonly phenyl, but other aryl and, more occasionally, bulky alkyl groups have also been utilised. If the two R groups at one or both phosphorus nuclei are distinct, then that phosphorus is chiral; very few examples are known beyond the original Knowles chemistry (DIPAMP was first prepared in the mid-1970s). If R = Ph and the backbone is symmetrical then the ligand will have a C2 symmetry axis, and this has been the most commonly defined type of ligand structure [5]. Almost all of these chelating diphosphines form rhodium complexes which are effective for the reduction of N-acyldehydroamino acids or esters in > 70% e.e. In some of the best cases, exemplified by Figure 4.1, the optical yield is close to 99%, the reaction is rapid at normal or slightly elevated pressure of hydrogen and the catalyst can frequently permit 10^3-10^5 turnovers before deactivation.
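As a rough guide to what these optical yields mean energetically, the following sketch converts an enantiomer excess into an enantiomer ratio and then into the free-energy difference between the two competing transition states. It assumes kinetic control and a temperature of 298.15 K, neither of which is stated in the text, and the function names are purely illustrative.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def enantiomer_ratio(ee_percent: float) -> float:
    """Major:minor enantiomer ratio for a given enantiomer excess (in %)."""
    ee = ee_percent / 100.0
    return (1.0 + ee) / (1.0 - ee)


def ddg_kj_per_mol(ee_percent: float, temperature_k: float = 298.15) -> float:
    """Free-energy gap (kJ/mol) between the competing transition states.

    Assumes kinetic control, so that ddG = R * T * ln(major/minor).
    """
    ratio = enantiomer_ratio(ee_percent)
    return R_GAS * temperature_k * math.log(ratio) / 1000.0


if __name__ == "__main__":
    for ee in (70.0, 96.0, 99.0):
        print(f"{ee:5.1f}% e.e. -> ratio {enantiomer_ratio(ee):6.1f}:1, "
              f"ddG = {ddg_kj_per_mol(ee):4.1f} kJ/mol")
```

On these assumptions, moving from 70% to 99% e.e. corresponds to only a few kJ/mol of additional discrimination, which is why small structural changes in the ligand can matter so much.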

Our knowledge of the reaction mechanism stems from three sources. Firstly, and particularly through the work of Halpern and co-workers in Chicago, the reaction kinetics are well defined [6]. Secondly, and through experiments carried out in Oxford, the structure, dynamics and stereochemistry of catalytic intermediates have been delineated by NMR experiments [7]. Thirdly, the crystal structures of several chelate biphosphine rhodium complexes relevant to the catalytic cycle have been determined [8]. Taking this evidence together permits the construction of the reaction scheme shown in Figure 4.2 for the Monsanto ligand DIPAMP. The most salient features of this are as follows:

1. The resting state of the catalytic reaction is a dynamic equilibrium (much faster than the addition of H2 under enantioselective conditions) between a solvate complex (the reaction is normally run in methanol) and the two possible diastereomeric enamide complexes in which the pro-R or pro-S face of the olefin is bound to rhodium, respectively. The equilibrium favours the enamide complexes, and is strongly biased so that one predominates; in this case by around 15:1, but up to 100:1 elsewhere.

Figure 4.1 [Scheme not reproduced: hydrogenation of an N-acyldehydroamino acid, creating a new asymmetric centre. The ligand can be any of a range of chelate chiral ligands with the asymmetric centre in the backbone or, exceptionally, at phosphorus. The optical yields printed beside the nine ligand structures range from 84 to 98%, with the product configuration (R or S) given for each.]

2. The overall kinetics of the catalytic reaction are consistent with hydrogen addition to the rhodium enamide complex in the rate-limiting step. This process is irreversible, since it is not accompanied by the concomitant interconversion of ortho- and para-hydrogen [9]. At high hydrogen pressures, this situation no longer holds and the enantioselectivity decreases. In several cases, the enantioselectivity increases with increasing temperature.

3. In the several cases where it has been established, the less favoured of the two enamide diastereomers has the same configuration as the preferred hydrogenation product, assuming cis-delivery of H2 via rhodium. Thus, when hydrogen addition is carried out at low temperature, an alkylrhodium hydride derived from the minor diastereomer is the first observable intermediate, and this decomposes with release of the saturated product above -50 °C. The alkyliridium hydride analogue is thermally stable at ambient temperature, and its absolute configuration has been established by NMR.

Figure 4.2 [Scheme not reproduced: reaction scheme for the DIPAMP rhodium catalyst. Legible labels: '7%'; a product bearing CH3, HO2C and NHCOMe groups, '96% S'; and 'NEVER OBSERVED !!'.]
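The three points above imply that the minor enamide diastereomer must react with hydrogen far faster than the major one. A minimal sketch of that arithmetic follows, assuming that the diastereomers interconvert much faster than either adds H2 (point 1), that each diastereomer leads cleanly to one product enantiomer (point 3), and taking the 15:1 equilibrium ratio from the text together with the 96% e.e. label legible in Figure 4.2. The function name is hypothetical.

```python
def relative_rate_minor_vs_major(product_ee_percent: float,
                                 major_to_minor: float) -> float:
    """Estimate k_minor / k_major for H2 addition.

    With rapid pre-equilibrium, product via minor : product via major
    equals (k_minor * [minor]) / (k_major * [major]), so the rate ratio
    is the product ratio multiplied by the [major]/[minor] ratio.
    """
    ee = product_ee_percent / 100.0
    product_ratio = (1.0 + ee) / (1.0 - ee)  # favoured : disfavoured product
    return product_ratio * major_to_minor


if __name__ == "__main__":
    ratio = relative_rate_minor_vs_major(96.0, 15.0)
    print(f"minor diastereomer reacts roughly {ratio:.0f} times faster")
```

The estimate is only as good as its assumptions; it ignores any leakage between pathways and any pressure dependence of the selectivity.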

Figure 4.2 represents the simplest construction consistent with this information, and is the mechanism most widely accepted for Rh asymmetric hydrogenation. It still needs to be treated with some caution because of the dynamic nature of the ground state, and the possibility of direct interconversion of enamide diastereomers without dissociation. For example, the kinetics and spectroscopic observations cannot rule out an alternative in which hydrogen adds to a part-dissociated enamide complex, which reverts to the preferred intermediate by rapid olefin dissociation-recombination prior to internal hydride transfer. If hydrogen adds reversibly to the solvate complex (which is present at low concentration) to produce a transient η2-intermediate without interconversion of ortho- and para-H2, then this must react irreversibly with substrate in the rate-determining step to accord with the observed kinetics. These alternative possibilities can only be discriminated by further experiment.

In the absence of detailed knowledge of the transition-state structure for asymmetric hydrogenation, speculative approaches help us to understand why these reactions are uniformly enantioselective. Given the large body of crystallographically defined square-planar biphosphine rhodium complexes, some of which relate to enamide complexes [8, 10], molecular modelling of the approach of hydrogen to rhodium is facilitated. Such studies are necessarily qualitative, particularly in transition-metal complexes where the parameterisation necessary for good quality molecular mechanics is only just being developed [11]. For the two cases where asymmetric hydrogenation has been studied by this technique [12], the ligand CHIRAPHOS was utilised, since an X-ray crystal structure of the enamide complex exists; Figure 4.3 shows the critical interactions between ligand and bound reactant in a fragment derived from this X-ray analysis. The main interaction is between one PPh2 unit and the O-C-O region of the carboxylic acid or ester. Evidently the transition state is very different from the ground state in this respect because the preferred stereochemistry is reversed. Simulating the hydrogen addition step requires eight different sets of calculations, four for each diastereomer, because orthogonal approach can occur from above or below the coordination plane and along either the P-Rh-C or P-Rh-O vector. For an enamide structure with fixed bond lengths and angles in which rotation about unconstrained bonds was permitted, and the square-planar enamide relaxed to octahedral during the addition, both Bosnich and Brown discovered that one of these eight pathways (the same one) was strongly favoured on steric grounds; several of the alternatives engendered impossible levels of non-bonded interaction during the approach of H2. The favoured pathway involved the reactive minor diastereomer, with H2 approaching along the P-Rh-C vector. Nevertheless, the non-dissociated enamide complex does seem to be rather crowded and heavily constrains the addition of dihydrogen.

Figure 4.3 [Image not reproduced: fragment of the CHIRAPHOS rhodium enamide complex derived from the X-ray analysis.]
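The count of eight calculations follows directly from combining two diastereomers, two faces of the coordination plane and two approach vectors. The short sketch below simply enumerates those combinations; the labels are taken from the text, while the tuple layout and function name are illustrative choices with no connection to the software actually used in the cited studies.

```python
from itertools import product

DIASTEREOMERS = ("major enamide complex", "minor enamide complex")
FACES = ("above the coordination plane", "below the coordination plane")
VECTORS = ("P-Rh-C", "P-Rh-O")


def approach_pathways():
    """Return every (diastereomer, face, vector) combination for H2 approach."""
    return list(product(DIASTEREOMERS, FACES, VECTORS))


if __name__ == "__main__":
    for number, (complex_, face, vector) in enumerate(approach_pathways(), 1):
        print(f"{number}. {complex_}: H2 from {face}, along the {vector} vector")
```

According to the text, only one of these eight, involving the minor diastereomer with approach along the P-Rh-C vector, was found to be sterically favoured.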

This very crude method at least provides a pictorial representation of the region of the energy surface which is difficult to define by experiment. It takes little account of the possible variability in the mechanism of hydrogen addition to rhodium; whether η2-dihydrogen complexes [13] play any part in the reaction, for example. This criticism is generally true of the molecular mechanics approach to reaction mechanism; unless the system is sufficiently simple to permit the incorporation of good quality ab initio calculations on the bond-making and bond-breaking components of energy change, purely steric considerations may produce misleading results. In addition, and as was commented on earlier, there are few accurate data to permit the parameterisation of heavy atoms, so that torsional, bending and stretching changes in Rh-P and related bonds are not readily incorporated. This leaves the bare bones of a set of calculations based on the minimisation of van der Waals interactions, necessarily a gross approximation.

4.2.1 Catalytic kinetic resolution and directed hydrogenation

A simple extension of the principles of asymmetric hydrogenation involves the reduction of chiral reactants. Consider first the reaction indicated in Figure 4.4 in which the product could be either the anti (threo) or syn (erythro) diastereomer. Since the enantioselectivity in asymmetric hydrogenation of enamides is in part governed by secondary binding conferred by the amide oxygen, it is reasonable to expect that coordination of the hydroxyl oxygen to rhodium during catalysis will affect the configuration of the product. In practice, using an achiral Rh catalyst to convert a racemic starting material into a racemic product, this stereochemical control is very high indeed, with > 99% of the anti-isomer produced in the cited case. Several related examples involving different binding groups, including (α-carbomethoxyalkyl)acrylates, (α-amidoalkyl)acrylates and (α-hydroxyalkyl)vinyl sulphones, all behave in the same way, and the hydrogenation is rapid and quantitative to give a single product under mild conditions [14]. This internal (substrate-induced) diastereoselectivity is fairly general for di- and trisubstituted allylic alcohols with a variety of substituents at the α- and β-carbons of the double bond, and several useful applications in synthesis have been reported [15].

Since the sense of diastereoselective hydrogen addition is common to almost all the examples tested, it should conform to a predictive model which can be set in the general context of addition reactions of acyclic olefins with an α-chiral centre. Much effort has been expended on explaining such addition reactions with electrophilic reagents, which follow a predictable pattern in several cases. The relevant factors are the energies of the possible ground-state conformers of the reactant and their relative reactivity towards the electrophile. Since more is known about the conformational energies of allylic alcohols than of other reactants in this class, with both an experimental and a theoretical basis [16], most authors have concentrated their efforts on them and on their ethers and silyl ethers. In the simplest case, 2-buten-1-ol, there are two low-energy conformations, as indicated in Figure 4.5, with the R or OR eclipsing the double bond, respectively. Approach of a reagent E+ from the less hindered side of a suitably substituted olefin gives the erythro-isomer (syn) in one case and the threo-isomer (anti) in the other. For hydroboration and iodoacetoxylation of several examples, the direction is consistently as indicated in Figure 4.6(a), with the level of stereoselectivity dependent on the olefin substitution pattern, but most commonly in the range 5:1 to 50:1. This pattern holds for a number of related addition reactions [17]. To offer a contrasting example, asymmetric epoxidation by titanium tartrate complexes occurs with the electrophile approaching from the opposite face, but again with high diastereoselectivity (cf. Figure 4.6(b)). Several other addition reactions of olefins fall into this latter class [18]. Results for directed hydrogenation are clear-cut; reaction occurs with the same stereochemical course as Ti epoxidation and the reverse course to hydroboration. The underlying rule appears to be that reactions which involve 'open' transition states prefer to react via the low-energy ground-state conformation with the torsion angle C=C-C-H circa 0° and the reagent approaching from the same face as the smaller OH or OR group. In contrast, those which involve 'closed' chelate transition states with the OH bonded to a metal reagent or catalyst react via the conformation with the torsion angle C=C-C-H circa 180°, an unfavourable conformation for the uncoordinated species. Molecular mechanics calculations clearly demonstrate that this path is followed as a consequence of the minimising of internal steric interactions in the coordinated reactant. In the route to the favoured diastereomer with the α-hydroxy or other substituent coordinated to the metal, the α'-substituent on the double bond should be eclipsed by the smaller of the two free substituents at the α-position. X-ray analysis of an iridium chelate complex [19] shows that a chelate geometry exists in accord with these principles.

Figure 4.4 [Scheme not reproduced: hydrogenation with a Rh catalyst; the R-enantiomer of the racemate is drawn; the two diastereomeric products are labelled >99% and <1%.]

Figure 4.5 [Structures not reproduced.]

Figure 4.6 [Structures not reproduced; the reaction lists are as follows.]
(a) Approach from below (non-chelate): hydroboration with 9-BBN; peracid epoxidation; Simmons-Smith cyclopropanation; cyclopropanation with CH2I2/Sm(Hg); cyclopropanation with CCl2; iodoacetoxylation.
(b) Approach from above: catalytic hydroboration (catecholborane/Rh); catalytic osmylation; titanium-promoted epoxidation; catalytic hydrosilylation with (Me2HSi)2NH; Rh- or Ru-directed hydrogenation.

Plate 1 A possible sequence of intermediates involved in asymmetric cross-coupling of α-methylbenzylmagnesium chloride with p-methoxy-β-bromostyrene, showing first the oxidative addition of the halide, and then the formation of a dialkylpalladium complex with reductive elimination in the final step. The molecular models were produced in Chem3D Plus, using literature crystallographic coordinates to generate the initial complex.

Plate 2 As Plate 1, with α-(trimethylsilyl)benzylmagnesium chloride as the nucleophilic reagent.

Now consider the situation which arises when an olefin with a directing group at a racemic chiral centre is hydrogenated with an enantiomerically pure catalyst. In this case the two hands of the starting material must react via diastereomeric transition states, and so react at intrinsically different rates. To take the cases of Figure 4.7 as an example, there is a sufficient difference in the rates of reaction of R- and S-enantiomers to make this a useful kinetic resolution, and for a range of cases the starting material is recovered in > 90-95% e.e. at around 60-65% reaction. It should be pointed out that the P-chiral ligand DIPAMP is superior to others in this reaction; the kinetic resolution selectivity parameter S (roughly the relative reactivity of the rapidly and slowly consumed enantiomers) is in the range 10-25. Both hands of starting material give the anti-isomer of product, showing that the substrate exerts stronger stereochemical control on the course of reaction than does the catalyst [20].

Figure 4.7 [Scheme not reproduced: kinetic resolution by hydrogenation (H2, MeOH, Rh catalyst) of a racemic hydroxy ester. The fast-reacting enantiomer gives product that is >99.5% anti; the slow-reacting enantiomer is recovered in 95% e.e. at 63% reaction, and its product is 98% anti. A cationic DIPAMP rhodium complex is shown as the asymmetric catalyst.]
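The selectivity parameter S can be estimated from the conversion and the e.e. of the recovered starting material. The sketch below uses the standard first-order kinetic-resolution expression, S = ln[(1 - c)(1 - ee)] / ln[(1 - c)(1 + ee)]; this relation is not given in the text and is brought in here as an assumption, as is the premise that both enantiomers are consumed with first-order kinetics. The input values are the ones legible in Figure 4.7.

```python
import math


def selectivity_factor(conversion: float, ee_recovered: float) -> float:
    """Kinetic-resolution selectivity S from conversion and recovered-substrate e.e.

    Both arguments are fractions (0.63 for 63%, 0.95 for 95% e.e.).
    Assumes first-order consumption of each enantiomer.
    """
    if not 0.0 < conversion < 1.0:
        raise ValueError("conversion must lie strictly between 0 and 1")
    if not 0.0 <= ee_recovered < 1.0:
        raise ValueError("ee_recovered must lie in [0, 1)")
    remaining = 1.0 - conversion
    return (math.log(remaining * (1.0 - ee_recovered))
            / math.log(remaining * (1.0 + ee_recovered)))


if __name__ == "__main__":
    s = selectivity_factor(conversion=0.63, ee_recovered=0.95)
    print(f"S is approximately {s:.1f}")
```

For 95% e.e. at 63% reaction this gives a value of about 12, which sits inside the range of 10-25 quoted above.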

4.3 Hydrogenation with ruthenium complexes

Until 1985, rhodium had been pre-eminent in studies of asymmetric hydrogenation, and other metal complexes gave inferior results. The situation then changed with the synthesis and application of ruthenium complexes based on the atropisomerically chiral ligand BINAP [21], which in the first publication detailing Ru complexes in asymmetric hydrogenation seemed comparable in efficiency to the corresponding rhodium complex for the asymmetric hydrogenation of enamides. More dramatic results followed a couple of years later, and subsequently, such that the ligand has become closely associated with Noyori [22], and the versatility and enantioselectivity of its applications clearly outstrip rhodium chemistry. It is useful to compare Rh and Ru asymmetric hydrogenations, since there are many cases where the same ligand has been employed (usually BINAP) for the reduction of a common substrate; a consistent pattern emerges. For the hand of Rh complex which gives predominantly R-product in catalytic hydrogenation, the corresponding Ru catalyst will give the S-enantiomer, often with greater optical efficiency. This is comprehensively illustrated through the examples of Tables 4.1-4.3. These results present the first pointer towards mechanistic divergence between the two metals.

Table 4.1 Chiral phosphine ligand: (R)-BINAP

Conditions printed beside the two catalyst structures: 2 mol% cat., THF/EtOH, NEt3, 2 atm H2, 35 °C, 24 h; and 1 mol% cat., EtOH, 3-4 atm H2, 25 °C, 48 h. [Structures of catalysts A and B, of the substrates and of the main products are not reproduced; substrates are numbered in order of appearance. Substrate 2 carries an NHCOR group.]

Substrate   Catalyst   R     % o.p.   Configuration   Reference
1           A          -     76       R               a
1           B          -     67       S               b
2           A          Me    86       R               a
2           A          Ph    92       R               a
2           B          Me    84       S               b
2           B          Ph    100      S               b
3           A          -     65       R               b
3           B          -     87       S               b

a H. Kawano, T. Ikariya, Y. Ishii, M. Saburi, S. Yoshikawa, Y. Uchida and H. Kumobayashi (1989) J. Chem. Soc., Perkin Trans. 1 1571.
b A. Miyashita, H. Takaya, T. Souchi and R. Noyori (1984) Tetrahedron Lett. 40, 1245.

A second interesting feature emerges when the full range of ruthenium-catalysed reactions is considered. For many catalytic asymmetric reactions, including Sharpless asymmetric epoxidation, it is possible to relate the configuration of the product to the configuration of reagent or catalyst in such a way as to make general predictions. Thus, for the Rh asymmetric hydrogenation of dehydroamino acids, it is known that substituents in the ligand which enforce a λ-configuration of the chelate ring lead to the S-enantiomer whilst a δ-configuration gives the R-enantiomer. The situation in ruthenium chemistry, where a much wider range of substrates has been reduced in high enantiomer excess, is more complicated, and consequently much more difficult to predict, as indicated in Tables 4.4 and 4.5 and in Figure 4.8. Although all of these reactions proceed in high enantiomer excess, the configuration of hydrogenation product formed is not obviously related to the relative spatial disposition of the metal-binding group (OH or CO2H in most cases) and the prochiral olefinic double bond. It is slightly comforting to observe that at least the sense of asymmetric synthesis is constant within a closely defined family of reactants.

Table 4.2 Chiral phosphine ligand: (S,S)-CHIRAPHOS

Conditions printed beside the two catalyst structures: 0.1 mol% cat., DMA, 1 atm H2, 30 °C, 24 h; and 1 mol% cat., EtOH, 1 atm H2, 25 °C, 24 h. [Structures of catalysts C and D, of the substrate and of the main product are not reproduced.]

Catalyst   % e.e.   Configuration   Reference
C          97       S               c
D          89       R               d

c B.R. James, A. Pacheco, S.J. Rettig, J.S. Thorburn, R.G. Ball and J.A. Ibers (1987) J. Mol. Catal. 41, 147; cf. J.P. Genet, C. Pinel, S. Mallart, S. Juge, S. Thorimbert and J.A. Laffitte (1991) Tetrahedron Asymmetry 2, 555-567.
d M.D. Fryzuk and B. Bosnich (1977) J. Am. Chem. Soc. 99, 6262.

With the additional complexities thus evident in ruthenium asymmetric hydrogenation, it is not surprising that the origin of enantioselection is poorly understood here at present, and rigorous mechanistic studies are quite recent, relating only to the hydrogenation of unsaturated carboxylic acids. The most significant piece of work is due to Ashby and Halpern [23], who have shown that the kinetics of tiglic acid reduction fit the mechanism shown in Figure 4.9. The evidence for this is that the rate equation has the form

k(obs) = 2k[Ru]tot[H2] / {[S] + [P]}

where [S] and [P] represent the concentrations of reactant and product, respectively, and the anticipated inhibition by excess carboxylic acid (either reactant or product) is indeed observed. The implication is that one inert carboxylate moiety is always coordinated to ruthenium, in keeping with the conclusions from NMR observations on the catalyst system (at about 100-fold higher concentration!) in the presence of excess reactant and in the absence of H2.

Table 4.3 Chiral phosphine ligand: (R,R)-DIOP/(S,S)-DIOP

[Substrate and main product structures are not reproduced; only the groups legible in the transcript are listed.]

Substrate (legible groups)   Catalyst             Conditions                                     % e.e.   Configuration   Reference
HO2C, CO2H                   [R,R-DIOP]2Ru2Cl4    0.1 mol%, DMA, 1 atm H2, 30 °C, 3 days         56       S               c
HO2C, CO2H                   HRh[S,S-DIOP]2       0.2 mol%, t-BuOH/Tol (1:1), 1 atm H2, 30 °C    20       R               e
HO2C, NHCOMe                 [S,S-DIOP]2Ru2Cl4    1 mol%, DMA, 1 atm H2, 60 °C                   59       S               e
HO2C, NHCOMe                 [R,R-DIOP]Rh+        0.3 mol%, EtOH/C6H6 (1:1)                      81       R               (not legible)
HO2C, Ph                     [S,S-DIOP]2Ru2Cl4    1 mol%, DMA, 1 atm H2, 60 °C                   40       R               e
HO2C, Ph                     [R,R-DIOP]Rh+        0.3 mol%, EtOH/C6H6 (1:1)                      64       S               f

c As in Table 4.2.
e B.R. James, R.S. McMillan, R.H. Morris and D.K.W. Wang (1978) Adv. Chem. Ser. 167, 122.
f P. Aviron-Violet, Y. Colleuille and J. Varagnat (1979) J. Mol. Catal. 5, 41.

Table 4.4 Hydrogenations with ruthenium complexes

Catalysts: A: [(R)-BINAP]Ru(OAc)2; B: [(R)-tolbinap]Ru(OCOCF3)2; C: [(R)-tolbinap]Ru(OAc)2; D: [(S)-BINAP]Ru(OAc)2. [Substrate and main product structures are not reproduced.]

Entry   Catalyst   R       atm H2          % e.e. (% d.e.)   Reference
1       A          H       4               99.5              g
2       A          Me      4               99.5              g
3       B          -       100             98                h
4       A          CO2H    100             74                (not legible)
5       C          CH2OH   (not legible)   99.9              (not legible)
6       D          -       112             92                (not legible)

Conditions: 0.5-1 mol% cat., 15-30 °C, MeOH or EtOH, 24-48 h.
g R. Noyori, M. Ohta, Y. Hsiao, M. Kitamura, T. Ohta and H. Takaya (1986) J. Am. Chem. Soc. 108, 7117.
h M. Kitamura, Y. Hsiao, R. Noyori and H. Takaya (1987) Tetrahedron Lett. 28, 4829.
i T. Ohta, H. Takaya, M. Kitamura, K. Nagai and R. Noyori (1987) J. Org. Chem. 51, 3174.
j M. Kitamura, R. Nagai, Y. Hsiao and R. Noyori (1990) Tetrahedron Lett. 31, 549.

Table 4.5 Further hydrogenations with ruthenium complexes

Catalysts: A: [(R)-BINAP]Ru(OAc)2; B: [(R)-BINAP]2Ru2Cl4NEt3. [Substrate and main product structures are not reproduced. The reference column reads 'k' and 'a, l', but its assignment to individual rows is not recoverable.]

Entry   R    R'   Catalyst   % e.e. (A) or % o.p. (B)
1       -    -    A          91
2       -    -    A*         85
3       -    -    A*         98
4       H    H    B          88
5       H    Ph   B          90
6       Me   H    B          60
7       Me   Ph   B          72
8       H    H    B          79
9       Me   H    B          68
10      Me   Ph   B          (not legible)

Conditions: 0.2-2 mol% cat., 15-35 °C, MeOH or EtOH, 2-4 atm H2, 12-24 h. * 100 atm H2, 70 h.
a and i As in Tables 4.1 and 4.4.
k H. Takaya, T. Ohta, N. Sayo, H. Kumobayashi, S. Akutagawa, S. Inoue, I. Kasahara and R. Noyori (1987) J. Am. Chem. Soc. 109, 1596.
l H. Kawano, Y. Ishii, T. Ikariya, M. Saburi, S. Yoshikawa, Y. Uchida and H. Kumobayashi (1987) Tetrahedron Lett. 28, 1905.

Figure 4.8 [Scheme not reproduced: substrates hydrogenated with (S)-BINAP Ru(OAc)2, each drawn with its donor group X marked, divided into Class A and Class B.]
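The inhibition by carboxylic acid described for the tiglic acid rate equation can be made concrete with a few lines of code. The sketch below takes the expression k(obs) = 2k[Ru]tot[H2] / ([S] + [P]) at face value as it is printed in this chapter; the numerical values are arbitrary illustrations, not data from the cited work, and the function name is hypothetical.

```python
def k_obs(k: float, ru_total: float, h2: float,
          substrate: float, product: float) -> float:
    """Observed rate constant from the printed expression.

    k_obs = 2 * k * [Ru]tot * [H2] / ([S] + [P]); units are left to the caller.
    """
    total_acid = substrate + product
    if total_acid <= 0.0:
        raise ValueError("total carboxylic acid concentration must be positive")
    return 2.0 * k * ru_total * h2 / total_acid


if __name__ == "__main__":
    k, ru, h2 = 1.0, 1.0e-3, 2.0e-3   # arbitrary illustrative values
    s0 = 0.10                         # initial substrate concentration
    # [S] + [P] stays equal to s0 as substrate is converted to product,
    # so k_obs does not drift during a single run.
    for conversion in (0.0, 0.5, 0.9):
        s, p = s0 * (1.0 - conversion), s0 * conversion
        print(f"conversion {conversion:.1f}: k_obs = {k_obs(k, ru, h2, s, p):.3e}")
    # Doubling the total acid halves k_obs.
    print(f"doubled acid:   k_obs = {k_obs(k, ru, h2, 2.0 * s0, 0.0):.3e}")
```

On this reading, the inhibition shows up when extra acid is added at the start, rather than as a slowing down within one run.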

The most unusual feature of Halpern's proposal is the heterolytic activation of molecular hydrogen, so that an anionic Ru hydride and a protonated solvent molecule are formed concomitantly. This carries with it the implication that the addition of D2 will lead to the incorporation of a single deuterium, β to the carboxylic acid, if the 5-ring alkylruthenium chelate shown in Figure 4.9 is the true intermediate. The second will be pooled with exchangeable protons of the solvent. Experimental results for a variety of carboxylic acids [24] support this postulate, since the predominant product is always the monodeuterated saturated acid derived from a 5-ring O-C chelate intermediate. Nevertheless, the same results could be obtained through the intermediacy of a Ru dihydride (dideuteride) which transferred a single hydrogen to the coordinated substrate. If the resulting alkyl monohydride then had sufficient lifetime (and a sufficiently fast exchange rate) to permit essentially complete exchange of Ru-H with solvent O-H before the elimination step, the product would be regioselectively monodeuterated. A reaction pathway of this type avoids heterolytic cleavage of the H-H or D-D bond and provides a more closely similar reaction pathway to that of Rh asymmetric hydrogenation. It is pertinent to mention that recent studies on hydrogenation of dehydroamino acids by rhodium complexes in water [25] give rise to a very similar result, in that the predominant product is that of formal H-D addition with the exchange taking place exclusively at the alkylrhodium hydride stage.

Figure 4.9 [Scheme not reproduced: proposed mechanism; legible labels include the donor group X and MeOH2+.]

Figure 4.10 [Scheme not reproduced: ketone hydrogenations with (S)-BINAP RuBr2; legible substrate fragments end in OMe, SEt and OH.]

The ability of ruthenium BINAP complexes to effect the catalytic asymmetric hydrogenation of ketones [26] has captured the imagination of the synthetic organic community to an even greater extent. Results are delineated in Figure 4.10; the most significant feature is that high enantiomer excess is only obtained in those cases where a secondary binding group is in proximity to the reacting carbonyl group. Unlike the olefin case, all examples to date fall into a single stereochemical pattern, indicated in Figure 4.10. The best results have been obtained with BINAP RuBr2 and related species; presumably Br- is more easily displaced from Ru than is OAc-. At the time of writing there is no mechanistic work of any significance on this topic. It will be worthwhile, however, to consider some elegant experiments which enhance the synthetic utility of ketone reduction. The first of these is illustrated in Figure 4.11, and involves double asymmetric induction in the hydrogenation of pentane-2,4-dione [27]. The net result is that the diol is formed in extremely high enantiomer excess; if the ratio of the two enantiomers formed in the first reduction step is 50:1 then the ratio of pure enantiomers of the final product is 2500:1. The reason is that the disfavoured product of the first step is predominantly reduced to the meso-product, which can then be separated chemically. The second experiment involves dynamic kinetic resolution [28], as shown in Figure 4.12. A disadvantage of conventional catalytic kinetic resolutions (and of resolutions in general) is that the resolved starting material can never be recovered enantiomerically pure in > 50% yield, even with perfect selectivity. A potentially superior experiment is to take the unreactive enantiomer and epimerise it rapidly under the conditions of catalysis, so that a single stereochemically pure material is produced from a racemic reactant. This can be realised in practice through the reduction of some unsymmetrical β-dicarbonyl compounds containing a single substituent at the α-carbon. The RuBINAP-catalysed reduction of simple alkylated β-ketoesters leads to the formation of two diastereomeric products, both in high e.e., because the isomerisation of the reactant is slow compared with hydrogenation in MeOH. This contrasts with the situation for α-acylamino-β-ketoesters, for which the isomerisation is fast, particularly in CH2Cl2, so that the ideal of dynamic kinetic resolution is attained. Other examples involving cyclic β-diketones follow the same trend [28].

Figure 4.11 [Scheme not reproduced: stepwise hydrogenation (H2, MeOH) of the diketone to the hydroxy ketone and then to the diol. Legible annotations: '96 E.e', '100 E.e.', 'mainly SS; some RS but almost no RR' and '2%'.]

Figure 4.12 [Scheme not reproduced: hydrogenation of an ethyl ketoester bearing an NHBoc group and an acidic hydrogen; annotation 'hydrogenation is much slower than racemisation'; product > 99% syn, R,S.]
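The 2500:1 figure quoted for the double asymmetric induction in the pentane-2,4-dione reduction is simply the square of the single-step selectivity. A minimal sketch follows, assuming that both reduction steps show the same catalyst-controlled selectivity and that the first-formed stereocentre does not perturb the second step; both are simplifications, and the function names are hypothetical.

```python
from fractions import Fraction


def diol_distribution(selectivity):
    """Fractions of SS, meso and RR diol for a given per-step selectivity.

    'selectivity' is the favoured:disfavoured ratio of each reduction step
    (50 for 50:1), assumed identical for both carbonyl groups.
    """
    s = Fraction(selectivity)
    total = (s + 1) ** 2
    return {
        "SS": s * s / total,    # favoured sense in both steps
        "meso": 2 * s / total,  # one favoured and one disfavoured step
        "RR": 1 / total,        # disfavoured sense in both steps
    }


def chiral_diol_ee(selectivity):
    """Enantiomer excess of the chiral diol once the meso form is removed."""
    d = diol_distribution(selectivity)
    return (d["SS"] - d["RR"]) / (d["SS"] + d["RR"])


if __name__ == "__main__":
    d = diol_distribution(50)
    print("SS : meso : RR =", ", ".join(f"{float(v):.4f}" for v in d.values()))
    print("SS:RR ratio =", d["SS"] / d["RR"])
    print(f"e.e. of chiral diol = {float(chiral_diol_ee(50)) * 100:.2f}%")
```

The same bookkeeping shows where the minor enantiomer goes: most of the material that takes the disfavoured course in one step ends up as the separable meso diol rather than as the RR enantiomer.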

In summary, asymmetric hydrogenation by Ru complexes is of proven synthetic worth, although at this stage much remains to be done before an adequate mechanistic understanding of the reaction pathway is available. The work on rhodium catalysts was considerably aided by the fact that the resting state of the catalyst appears to carry an olefin bound to rhodium, so that direct NMR observations of the catalytic cycle, and of the kinetics of interconversion of relevant intermediates, are possible. This guides and constrains speculation about the way in which enantioselection might arise. In ruthenium chemistry, the nature of reactive intermediates is much less well understood, since the 18-electron Ru diacetate complexes normally utilised do not complex reversibly with olefins. This lack of clear evidence on the structure of catalytic intermediates makes speculation about the origin of enantioselectivity premature, particularly when the complexities alluded to earlier are taken into account.

4.4 Carbon-carbon bond formation through cross-coupling

The catalytic cross-coupling of a Grignard reagent with a vinyl or aryl halide was discovered independently by Corriu and Kumada in 1972. Initially, nickel biphosphine complexes were employed, but it was discovered that their palladium analogues were also effective [29]. Since it is known that Grignard reagents derived from secondary racemic halides undergo inversion of configuration quite readily [30], asymmetric catalysis via dynamic kinetic resolution is possible. This was recognised early on, but using the chiral biphosphines conventionally employed for asymmetric hydrogenation, the enantiomer excesses were low. Much better results were obtained when chiral chelating phosphinamines were employed [31], and in the best cases (Figure 4.13) enantiomer excesses of up to 95% were obtained. Two types of ligand proved to be successful. The first of these is based on the planar chirality of 1,2-disubstituted (aminoalkyl)ferrocenylphosphines, as illustrated, and the second on β-aminoethylphosphines derived from amino acids.

Figure 4.13 [Scheme not reproduced: Pd-catalysed cross-coupling (0 °C, 48 h) involving a trimethylsilyl-substituted reagent, 95% e.e.; the catalyst structure is shown.]

Figure 4.14 [Scheme not reproduced: Ni-catalysed biaryl coupling; the catalyst structure is shown; the conditions and enantiomer excess are not reliably legible in the transcript.]

Examples of the introduction of axial chirality through catalytic asymmetric synthesis are rare. In this context, biaryl coupling of an electrophilic aryl halide and a nucleophilic aryl magnesium halide provides a testing ground [32]. This reaction can be effected in high e.e., but only by utilising the nickel phosphinoether complex in the reaction shown in Figure 4.14. It is not yet clear whether the reaction proceeds through a chelate or whether the ether functionality has an ancillary role in binding to the magnesium of the Grignard reagent as it approaches a nickel complex carrying the electrophilic component.

Figure 4.15 Intermediates in palladium-catalysed cross-coupling. Panel annotations: 'rearranges readily'; 'observable at or below 0 °C'; 'stable, fully characterised'; 'observable only for R = PhCH2, not aryl'.

For simpler achiral cases of palladium-catalysed cross-coupling, the mechanistic course seems straightforward, and is supported by direct observations of the relevant intermediates shown in Figure 4.15 [33]. The cis-diphosphine catalyst is normally introduced into the reaction as a Pd(II) dihalide complex which is itself inactive. Initially, this complex is reduced to Pd(0); this involves double displacement of halide by Grignard reagent to form a Pd(II) dialkyl which subsequently undergoes reductive elimination with C-C bond formation. The coordinatively unsaturated palladium(0) species thus obtained complexes with the electrophilic vinyl or aryl halide; in the case of a vinyl halide an olefin η²-complex is observable, but decays rapidly to the η¹,η¹-isomer. This step creates a Pd-X bond which is much more reactive towards organometallic reagents than is its C-X precursor, and the ensuing displacement step creates a dialkylpalladium complex with the two reacting entities (electrophile and nucleophile) in proximity. When one or both of these is unsaturated, the complex is quite labile and decomposes to form product and Pd(0) at -80 to 0 °C, depending on the substituents and the palladium ligand. This last stage regenerates the catalyst for further reaction.

Insofar as the closely related asymmetric synthesis with an asymmetric P-N chelate is concerned, the crucial step is the one in which the chiral Grignard reagent reacts with the palladium (or nickel)-haloalkyl complex with displacement of halide. This leads to one of several diastereomers of the cis-PdC2 intermediate; either the R- or S-enantiomer of the Grignard reagent reacts in preference, and the new alkyl can be cis to N or cis to P. It is most likely (although not formally proven) that the subsequent steps of catalysis are irreversible and that the configuration of the product is defined in this addition step, with the subsequent elimination step occurring with retention of configuration. If this is indeed the case, then asymmetric synthesis arises from the discrimination by the catalyst between the two hands of the Grignard reagent.

In our attempts to observe this addition step directly, the experiments related in Figure 4.16 were carried out [34]. Using the ferrocenylphosphine-palladium catalyst precursor, a Pd(0) cyclooctatetraene complex A could be observed in solution by ³¹P NMR, but on addition of the vinyl halide only the η¹,η¹-species was observed, without intervention of an η²-olefin complex with a significant lifetime. This was apparently a single diastereomer B, because of the absence of a strong P-C trans-coupling in the ³¹P NMR spectrum when vinyl ¹³C-labelled precursor was employed. This demonstrates that the phosphorus and carbon are cis-related and that the second possible regioisomer of this square-planar intermediate is disfavoured. There is a general tendency for the more electronegative pair of groups in a square-planar complex to be cis rather than trans-related [35]. When this complex B is reacted with benzylmagnesium chloride in ether, two new diastereomeric complexes C and D are seen from the outset. Taken alongside other evidence [36], this indicates that the initially formed dialkylpalladium intermediate (presumed to be C) is capable of rapid stereomutation prior to the elimination step.

Figure 4.16 Direct observation of the addition steps. Electrophile addition stage: ¹³C-labelling shows only one diastereomer. Nucleophile addition stage (PhCH2MgCl, -60 to -30 °C): both diastereomers formed in comparable proportions.

The apparent requirement for a chelating phosphinamine ligand in order to achieve a high enantiomer excess is indicative of one of two constraints. Possibly the N atom of the ligand needs to coordinate to magnesium during the nucleophilic transfer step of the Grignard reagent. Alternatively, the unsymmetrical nature of the P-N ligand fulfils an electronic role, and ensures that a single diastereomeric pathway is preferred over alternatives; the single regioisomer B observed at the alkenyl halide addition step makes this viable. In assessing possible reaction pathways, the preferred conformation of the bound reactant entity is important, and there is good precedent for considering that the palladium-vinyl group is orthogonal to the square plane, for both electronic and steric reasons [37].

Since the X-ray structure of a relevant ferrocenylphosphinamine palladium dichloride complex has been published, a basis exists for molecular modelling of the nucleophilic addition step. This may at least provide some basis for discriminating between the two possible roles suggested for the amino group. The procedures employed are illustrated in the series of diagrams encompassed by Plate 1. The product derived from the oxidative addition of p-methoxy-β-bromostyrene on to a Pd(0) intermediate complex is shown in Plate 1(a), as the preferred regioisomer. If the approach of Grignard reagent (α-methylbenzylmagnesium chloride in Plate 1(b) et seq.) is then considered to occur from an orthogonal direction, some further constraints need to be imposed. If the nitrogen atom remains bonded to palladium, then it is reasonable to expect that the formation of a new Pd-C bond will be concerted with bromide transfer from palladium to magnesium. This encourages approach of the Grignard carbon to Pd such that a cyclic transition state with a 4-membered ring in which Br links the two metal atoms is formed. At the transition state the Pd-C distance of the new bond is assumed to be 0.30 nm, although this value is not critical in defining the outcome. For a given rotameric form of the complex, freezing the torsion about the existing vinyl-palladium bond, there are six different ways in which a single enantiomeric molecule of the Grignard reagent can approach palladium, three above and three below the square plane; within each set of three, the individual species are interconverted by 120° rotation about the developing C-Pd bond.

Models were constructed for each of these in Chem3D Plus [38] and then the minimum energy region found by rotating about the P-Ph and C-Ph groups until non-bonded contacts, mostly C-C, C-H and H-H, were qualitatively at a minimum. The approach of the now 5-coordinate Grignard reagent was modelled so that the initial Pd-C bond distance was 0.45 nm with the Pd-Br and Mg-C bond distances unchanged from their optimal distances of 0.24 nm and 0.213 nm, respectively. In this manner, two of the three possible approaches from above the plane were immediately discounted as they led to considerable non-bonded interactions with the phosphinyl phenyl group, despite attempts to minimise these interactions. But exactly the same observation was made for the alternative enantiomer of the reagent. Stage 2 of the modelling procedure involved shortening the Pd-C bond distance to 0.35 nm and lengthening the Pd-Br and Mg-C bond distances to 0.26 nm, respectively. This failed to demonstrate any appreciable difference between the two RMgCl enantiomers. However, further alterations to bond distances, Pd-C (0.3 nm), Pd-Br (0.27 nm) and Mg-C (0.25 nm), resulted in unfavourable interactions between the Grignard and phosphinyl phenyl groups for one of the two enantiomers. At a more advanced stage of Pd-C bond formation close to the putative transition state, Pd-C (0.25 nm), Pd-Br (0.29 nm) and Mg-C (0.27 nm), these phenyl-phenyl interactions became more pronounced. For the other enantiomer, the level of steric repulsion was much lower, leading to the supposition that this represents the preferred pathway.

Plate 1 illustrates the preferred reaction pathway. The Pd alkenyl bromide complex formed in the initial oxidative addition step is shown in Plate 1(a). In Plate 1(b) and then (c), the R-enantiomer of the Grignard reagent approaches Pd with the previously discussed constraints in operation. The dialkylpalladium intermediate Plate 1(d) is then formed with elimination of MgBrCl, and C-C coupling in turn leads to Plate 1(e). Olefin displacement by the electrophile then regenerates Plate 1(a) and the catalytic cycle recommences. Because of the relative simplicity and qualitative nature of this method, no attempt has been made to suggest any value for the energetic difference between the two pathways. Nevertheless, it does predict the correct S-enantiomer of the reaction product. Scheme 4.1 indicates two inferior candidates for the critical catalytic intermediate, which experience more repulsive contacts during the approach than does the set of intermediates in Plate 1. Thus, despite the rather crude approximations, attack from above the plane as defined appears to be favoured, with only one enantiomer of the reagent lacking severe non-bonded interactions.
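The bookkeeping behind this procedure can be sketched as follows. This is only an illustrative enumeration, not the Chem3D Plus workflow itself: it lists the six approach geometries per enantiomer (two faces of the square plane, three rotamers 120° apart) and the staged bond distances quoted in the text. The names are hypothetical, and the stage 2 Mg-C distance is left as None because the text does not state it unambiguously.

```python
from itertools import product

FACES = ("above", "below")       # side of the square plane being approached
ROTAMERS_DEG = (0, 120, 240)     # rotation about the developing C-Pd bond
ENANTIOMERS = ("R", "S")         # the two hands of the Grignard reagent

# Distances in nm as quoted in the text; None marks a value the text
# does not give unambiguously (stage 2, Mg-C).
STAGES = (
    {"Pd-C": 0.45, "Pd-Br": 0.24, "Mg-C": 0.213},
    {"Pd-C": 0.35, "Pd-Br": 0.26, "Mg-C": None},
    {"Pd-C": 0.30, "Pd-Br": 0.27, "Mg-C": 0.25},
    {"Pd-C": 0.25, "Pd-Br": 0.29, "Mg-C": 0.27},
)


def approaches(enantiomer):
    """The six approach geometries available to one enantiomer."""
    return [(enantiomer, face, angle)
            for face, angle in product(FACES, ROTAMERS_DEG)]


def all_models():
    """Every (approach, stage index) pair that would need inspecting."""
    return [(approach, stage_index)
            for enantiomer in ENANTIOMERS
            for approach in approaches(enantiomer)
            for stage_index in range(len(STAGES))]


def is_monotonic(key, increasing):
    """Check that a quoted distance changes in one direction over the stages."""
    values = [stage[key] for stage in STAGES if stage[key] is not None]
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(later > earlier for earlier, later in pairs)
    return all(later < earlier for earlier, later in pairs)
```

Laid out this way, the model amounts to twelve approach geometries followed along a four-point reaction coordinate in which the Pd-C distance shortens while the Pd-Br and Mg-C distances lengthen, consistent with the concerted bromide transfer assumed in the text.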

Experimentally, α-(trimethylsilyl)benzylmagnesium chloride gives higher enantiomer excesses in asymmetric cross-coupling than does α-methylbenzylmagnesium chloride [31]. The same modelling procedure as described above for α-methylbenzylmagnesium chloride was carried out with the silyl Grignard reagent. Attack from above the plane was again favoured, and the approach having the more bulky trimethylsilyl Grignard substituent in the least crowded region of space (as in Plate 2(a)) resulted in the formation of the key diorganometal complex, Plate 2(d), which reductively eliminates coupled product of (R) configuration. This correlates with the product configuration observed experimentally. The full cycle of steps is illustrated in Plate 2(a)-(e).

Scheme 4.1 Disfavoured pathways in approach of the Grignard reagent. (a) Lowest energy approach of the enantiomeric form of the Grignard reagent leading to the disfavoured enantiomer of product. (b) Lowest energy approach of the Grignard reagent from the opposite face of the square plane.

For the alternative interpretation in which the Grignard reagent coordinates to the N atom of the phosphinamine ligand as the Grignard reagent approaches palladium, models in Chem3D Plus were constructed as above, but with the four-membered metallocyclic ring involving Pd coordination to N rather than Br. The basic features were similar to those described above: for each enantiomer of Grignard reagent, only one spatial arrangement of the substituents has a low level of non-bonded interactions. Again, four of the six possible diastereomeric approaches were readily discounted, leaving the two intermediates shown in Figure 4.17(a, b). Figure 4.17(a) involves the enantiomer of Grignard reagent whose configuration corresponds to that of Plate 1(b), and Figure 4.17(b) corresponds to the less favourable intermediate shown in Scheme 4.1. In the present case, however, the energetics are reversed relative to Plate 1, as there are considerable phenyl-phenyl interactions in Figure 4.17(a). The alternative, Figure 4.17(b), which is a precursor of the R-configuration of cross-coupled product, is thus favoured, leading to an incorrect prediction of the experimentally observed product configuration.

This rather crude simulation of the reaction pathway provides the conclusion that the P-N ligand stays bound to palladium during the nucleophilic addition step of cross-coupling, rather than dissociating so that N can coordinate to the incoming Mg. This does not, of course, define the mechanism as such, but it does permit a hypothesis to be advanced with a little more confidence. In addition, it provides a basis for detailed analysis of ligand design in asymmetric cross-coupling.

Figure 4.17 (a), (b) The two remaining intermediates for approach of the Grignard reagent with coordination to N.

4.5 Carbon-carbon bond formation through allylic alkylation

Since Trost and co-workers demonstrated that the C-C bond forming reaction between an allylic acetate and a 'soft' nucleophile such as sodio diethyl malonate was catalysed by Pd(0) complexes [39], many attempts have been made to develop a catalytic asymmetric synthesis based on this principle. A particular problem is that the catalytic reaction involves nucleophilic attack on the allyl from the side remote from Pd so that the influence of the chiral ligand is severely curtailed. The most successful early work was the reaction type exemplified in Figure 4.18(a) described by Bosnich and co-workers [40]. The essential features are the combination of a CHIRAPHOS Pd catalyst with an allylic acetate reactant carrying aryl substituents at both terminal positions. This combination gave rise to products of allylic alkylation with a typical e.e. of 80%.

An advance on these procedures using ferrocene-based diphosphinamines was introduced by Hayashi and Ito [41]. The characteristic feature is that the catalyst ligand carries an extended side-chain adjacent to phosphorus on one of the ferrocene rings, typically CH(Me)N(Me)CH2CH2N(CH2CH2OH)2. For reactions using this method, as exemplified in Figure 4.18(b), Hayashi and Ito [41] propose that the high enantiomeric efficiency observed is due to a direct interaction of this side-chain with an incoming malonate or diketonate anion. They claim in the original paper that this involves the Na+ counterion of the nucleophile and an OH of the side-chain. Although H-bonding between this OH group and an oxygen of the acetylacetonate nucleophile is possibly more likely, this is a point of detail; if the side-chain does interact with the incoming nucleophile then it will control which terminus of the allyl undergoes attack and hence the enantiomer formed.

The postulate has been evaluated by us employing molecular modelling based on the X-ray crystal structure of the PdCl2 complex of the ligand [42]. The allyl ligand was modified to give the diphenyl-substituted allyl intermediate proposed in the catalysed reaction between racemic (E)-1,3-diphenyl-3-acetoxy-1-propene and sodium acetylacetonate. Assuming that the nucleophile approaches the terminal carbon of the π-allyl in the complexation plane and anti to Pd, a reasonable transition-state distance would be 0.28 nm. At this distance, the side-chain hydroxyl is ca. 0.30 nm from one carbonyl oxygen of the carbanion, sufficiently close to engage in H-bonding. In this fashion, the ferrocenyl side-chain guides the incoming nucleophile to only one of the two diastereotopic terminal π-allyl carbon atoms. The resulting alkene has (S) stereochemistry, which is in agreement with experiment. This further illustrates the power of Chem3D Plus to simulate the course of a catalytic cycle once a relevant X-ray structure is available. The models are shown in Figure 4.19, with Figure 4.19(a) representing the ground state of the complex before anion approach, and Figure 4.19(b) representing the approach of the nucleophile along the path leading to the preferred enantiomer of product.

Figure 4.18 (a) Allylic alkylation of a racemic allylic acetate (1 mol% Pd catalyst). (b) Alkylation with NaCH(COMe)2 (1 mol% Pd catalyst); 90% e.e.

Figure 4.19

Page 117: Principles of Molecular Recognition

106 PRINCIPLES OF MOLECULAR RECOGNITION

References

1. J. Crosby (1991) Tetrahedron 47, 4789; S. Otsuka and K. Tani (1991) Synthesis 665.

2. F.H. Jardine (1981) Prog. Inorg. Chem. 28, 63.

3. W.S. Knowles, M.J. Sabacky and B.D. Vineyard (1972) J. Chem. Soc., Chem. Commun. 10.

4. H.B. Kagan and T.-P. Dang (1972) J. Am. Chem. Soc. 94, 6429.

5. J.K. Whitesell (1989) Chem. Rev. 89, 1581.

6. C.R. Landis and J. Halpern (1987) J. Am. Chem. Soc. 109, 1746.

7. J.M. Brown, P.A. Chaloner and G.A. Morris (1987) J. Chem. Soc., Perkin Trans. 2 1583; J.M. Brown and P.J. Maddox (1987) J. Chem. Soc., Chem. Commun. 1276.

8. A.S.C. Chan, J.J. Pluth and J. Halpern (1980) J. Am. Chem. Soc. 102, 5952; N.W. Alcock, J.M. Brown and A.R. Lucy (1985) J. Chem. Soc., Chem. Commun. 575; N.W. Alcock, J.M. Brown and P.J. Maddox (1986) J. Chem. Soc., Chem. Commun. 1532; B. McCulloch, J. Halpern, M.R. Thompson and C.R. Landis (1990) Organometallics 9, 1392.

9. J.M. Brown, L.R. Canning, A.J. Downs and A.R. Forster (1983) J. Organomet. Chem. 255, 103.

10. Earlier examples are tabulated in [8, 12a]; see also, for example, M.J. Burk, J.E. Feaster and R.L. Harlow (1991) Tetrahedron Asymmetry 2, 569; R. Schmid, J. Foricher, M. Cereghetti and P. Schonholzer (1991) Helv. Chim. Acta 74, 370; R. Schmid, M. Cereghetti, B. Heiser, P. Schonholzer and H.-J. Hansen (1988) Helv. Chim. Acta 71, 897.

11. L.A. Castonguay, A.K. Rappe and C.J. Casewit (1991) J. Am. Chem. Soc. 113, 7177; V.S. Allured, C.M. Kelly and C.R. Landis (1991) J. Am. Chem. Soc. 113, 1.

12. (a) J.M. Brown and P.L. Evans (1988) Tetrahedron 44, 4905; (b) P.L. Bogden, J.J. Irwin and B. Bosnich (1989) Organometallics 8, 1450.

13. R.H. Crabtree and D.G. Hamilton (1988) Adv. Organomet. Chem. 28, 299.

14. J.M. Brown (1987) Angew. Chem., Int. Ed. 26, 190.

15. For example, A. Villalobus and S. Danishevsky (1990) J. Org. Chem. 55, 2776; D.A. Evans, R.L. Dow, T.L. Shih, J.M. Takacs and R. Zahler (1990) J. Am. Chem. Soc. 112, 5290; Y. Hamada, A. Kawai, T. Matsui, O. Hara and T. Shiori (1991) Tetrahedron 46, 4823.

16. S.D. Kahn and W.J. Hehre (1987) J. Am. Chem. Soc. 109, 666 and refs. therein.

17. Hydroboration: K.N. Houk, N.G. Rondan, Y.-D. Wu, J.-T. Metz and M.N. Paddon-Row (1984) Tetrahedron 40, 2257; Peracid epoxidation: P. Kocovsky and I. Stary (1990) J. Org. Chem. 55, 3236 and refs. therein; P. Chautemps and J.-L. Pierre (1976) Tetrahedron 32, 549; Simmons-Smith cyclopropanation (Zn): M. Ratier, M. Castaing, J.-Y. Godet and M. Pereyre (1978) J. Chem. Res. (S) 179; (Sm): G.A. Molander and L.S. Harring (1990) J. Org. Chem. 54, 3525; Dichlorocyclopropanation: F. Mohmahdi and W.C. Still (1986) Tetrahedron Lett. 27, 893; Iodooxygenation: M. Labelle and Y. Guindon (1989) J. Am. Chem. Soc. 111, 2204; A.R. Chamberlin and R.L. Mulholland (1984) Tetrahedron 40, 2297. For a general discussion see S.D. Kahn, C.F. Pau, A.R. Chamberlin and W.J. Hehre (1987) J. Am. Chem. Soc. 109, 650, and refs. therein.

18. Directed hydrogenation: R. Naik and J.M. Brown (1982) J. Chem. Soc., Chem. Commun. 348; M. Kitamura, I. Kasahira, K. Manabe, R. Noyori and H. Takaya (1988) J. Org. Chem. 53, 708; M. Takagi and K. Yamamoto (1991) Tetrahedron 47, 8869-8882; [14], [15]; Osmylation: D.A. Evans and S.W. Kaldor (1990) J. Org. Chem. 55, 1698 and refs. therein; Titanium-promoted epoxidation: Y. Gao, R.M. Hanson, J.M. Klunder, S.Y. Ko, H. Masamune and K.B. Sharpless (1987) J. Am. Chem. Soc. 109, 5769; V. Jager, D. Schroter and B. Koppenhoefer (1991) Tetrahedron 47, 2195; M. Bailey, I. Staton, P.R. Ashton, I.E. Marko and W.D. Ollis (1991) Tetrahedron Asymmetry 2, 495; Directed hydrosilylation: K. Tamao, N. Inui, O. Nakayama and Y. Ito (1990) Tetrahedron Lett. 31, 7333; Catalytic hydroboration: D.A. Evans, G.C. Fu and A.H. Hoveyda (1988) J. Am. Chem. Soc. 110, 6917-6918.

19. N.W. Alcock, J.M. Brown, A. Conn and R.I. Taylor, in preparation.

20. J.M. Brown and I. Cutting (1985) J. Chem. Soc., Chem. Commun. 578; N.W. Alcock, J.M. Brown and P.J. Maddox (1986) J. Chem. Soc., Chem. Commun. 1532; J.M. Brown and A.P. James (1987) J. Chem. Soc., Chem. Commun. 181; J.M. Brown, A.P. James and L.M. Prior (1987) Tetrahedron Lett. 28, 2179; J.M. Brown, I. Cutting and A.P. James (1988) Bull. Soc. Chim. France 211.

21. T. Ikarya, Y. Ishii, H. Kawano, T. Arai, M. Saburi, S. Yoshikawa and S. Akutagawa (1985) J. Chem. Soc., Chem. Commun. 922.


22. R. Noyori (1989) Chem. Soc. Rev. 18, 187; R. Noyori and M. Kitamura (1989) Modern Synthetic Methods 6, 131; R. Noyori (1990) Science 248, 1194.

23. M.T. Ashby and J. Halpern (1991) J. Am. Chem. Soc. 113, 589.

24. T. Ohta, H. Takaya and R. Noyori (1990) Tetrahedron Lett. 31, 7189.

25. Y. Amrani, L. Lecomte, D. Sinou, J. Bakos, I. Toth and B. Heil (1989) Organometallics 8, 542.

26. M. Kitamura, T. Ohkuma, S. Inoue, N. Sayo, H. Kumobayashi, S. Akutagawa, T. Ohta, H. Takaya and R. Noyori (1988) J. Am. Chem. Soc. 110, 629.

27. R. Noyori and H. Takaya (1990) Accounts Chem. Res. 345.

28. R. Noyori, T. Ikeda, T. Ohkuma, M. Widhalm, M. Kitamura, H. Takaya, S. Akutagawa, N. Sayo, T. Saito, T. Taketomi and H. Kumobayashi (1988) J. Am. Chem. Soc. 110, 629; M. Kitamura, T. Ohkuma, M. Tokunaga and R. Noyori (1990) Tetrahedron Asymmetry 1, 1.

29. R.J.P. Corriu and J.P. Masse (1972) J. Chem. Soc., Chem. Commun. 144; K. Tamao, K. Sumitani and M. Kumada (1972) J. Am. Chem. Soc. 94, 4374.

30. For example, G. Consiglio and C. Botteghi (1973) Helv. Chim. Acta 56, 460; Y. Kiso, K. Tamao, N. Miyake, K. Yamamoto and M. Kumada (1974) Tetrahedron Lett. 15, 3; H. Brunner and M. Probster (1980) J. Organomet. Chem. 209, C1.

31. T. Hayashi, M. Konishi, M. Fukushima, T. Mise, M. Kagotani, M. Tajika and M. Kumada (1982) J. Am. Chem. Soc. 104, 180; T. Hayashi, M. Konishi, M. Fukushima, K. Kanehira, T. Hioki and M. Kumada (1983) J. Org. Chem. 48, 2195; T. Hayashi, T. Mise, M. Fukushima, M. Kagotani, N. Nagashima, Y. Hamada, A. Matsumoto, S. Kawakani, M. Yamamoto and M. Kumada (1986) J. Org. Chem. 51, 3772; T. Hayashi, A. Yamamoto, M. Hojo and Y. Ito (1989) J. Chem. Soc., Chem. Commun. 495.

32. T. Hayashi, K. Hayashizaki, T. Kiyoh and Y. Ito (1988) J. Am. Chem. Soc. 110, 8153; T. Hayashi, K. Hayashizaki and Y. Ito (1989) Tetrahedron Lett. 30, 215.

33. J.M. Brown and N.A. Cooley (1990) Organometallics 9, 353.

34. K.V. Baker, J.M. Brown, N.A. Cooley, G.D. Hughes and R.I. Taylor (1989) J. Organomet. Chem. 370, 397.

35. G.R. Clark and G.J. Palenik (1970) Inorg. Chem. 9, 2754.

36. E.g. M.J. Hampden-Smith and H. Ruegger (1989) Magn. Reson. Chem. 27, 1107.

37. M.J. Calhorda, J.M. Brown and N.A. Cooley (1991) Organometallics 10, 1431.

38. Chem3D Plus is produced by Cambridge Scientific Computing, Cambridge, MA, 02139, USA.

39. B.M. Trost, P.E. Strege, L. Weber, T.J. Fullerton and T.J. Dietsche (1978) J. Am. Chem. Soc. 100, 3407, 3416, 3426; B.M. Trost and T.R. Verhoeven (1978) J. Am. Chem. Soc. 100, 3435.

40. P.R. Auburn, P.B. Mackenzie and B. Bosnich (1985) J. Am. Chem. Soc. 107, 2033; P.B. Mackenzie, J. Whelan and B. Bosnich (1985) J. Am. Chem. Soc. 107, 2046.

41. T. Hayashi, A. Yamamoto and Y. Ito (1987) Chem. Lett. 177. See also T. Hayashi (1988) Pure Appl. Chem. 60, 7; Y. Okada, T. Minami, Y. Umezu, S. Nishikawa, R. Mori and Y. Nakayama (1991) Tetrahedron Asymmetry 2, 667.

42. F.H. van der Steen and I.A. Kanters (1986) Acta Crystallogr. Sect. C 42, 547.

5 Molecular recognition in the catalytic action of metallo-enzymes
J. ÅQVIST and A. WARSHEL

5.1 Introduction

In this chapter, the origin of the enormous catalytic power of enzymes is examined. It is argued that many hypotheses which address this issue cannot be verified on a detailed molecular level without a model that translates the three-dimensional protein structure to activation free energies and reaction rates. The empirical valence bond method in combination with free energy perturbation simulations is shown, by examining the specific case of the staphylococcal nuclease catalysed reaction, to provide a practical model for structure-catalysis correlation. The calculations reproduce the overall catalytic power of the enzyme and elucidate the roles of different catalytic factors, in particular the importance of electrostatic effects. Changes in the activation barrier caused by amino acid mutations as well as metal ion substitutions are evaluated and found to be in accord with experimentally observed trends. It is found that the reaction free energy barrier has its minimum with Ca2+ as the catalytic ion and that for more electrophilic ions, the barrier increases rapidly as the ion becomes smaller. The same type of free energy relationships that give rise to a rate optimum for Ca2+ in staphylococcal nuclease seem to be valid in other cases as well, and may indicate more general free energy relationships for metalloenzyme-catalysed reactions.

Some of the most pertinent examples of 'molecular recognition' are encountered among the interactions between enzymes and their substrate molecules. This is reflected not only by a high affinity in the initial substrate binding step, but also in the apparent velocity of the reaction. That is, the enormous enhancements of rate constants, compared to solution reactions, which enzymes can achieve are principally due to their ability to lower the relative free energy of the rate-limiting transition state. This can be viewed as a consequence of enzymes having a higher affinity or binding propensity towards their transition states than a regular solvent such as water. Another way of phrasing it is that one would expect the enzyme active site to be complementary, in some sense, to the rate-limiting transition state configuration, which is what Pauling anticipated many years ago [1] (although he apparently had shape complementarity in mind rather than electrostatic complementarity, as proposed here).
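The link between transition-state stabilisation and rate enhancement can be made concrete with a minimal sketch based on standard transition state theory. The Eyring expression with a transmission coefficient of unity is an assumption introduced here for illustration and is not taken from the chapter; the function names are hypothetical.

```python
import math

R_GAS = 8.314462618          # gas constant, J mol^-1 K^-1
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J K^-1
H_PLANCK = 6.62607015e-34    # Planck constant, J s


def eyring_rate(dg_act_kj, temperature_k=298.15):
    """First-order rate constant (s^-1) for an activation free energy in kJ/mol.

    Assumes a transmission coefficient of 1 (illustrative assumption).
    """
    prefactor = K_BOLTZMANN * temperature_k / H_PLANCK
    return prefactor * math.exp(-dg_act_kj * 1000.0 / (R_GAS * temperature_k))


def rate_enhancement(barrier_lowering_kj, temperature_k=298.15):
    """Ratio k_enzyme / k_solution when the barrier is lowered by the given amount."""
    return math.exp(barrier_lowering_kj * 1000.0 / (R_GAS * temperature_k))


if __name__ == "__main__":
    # Each ~5.7 kJ/mol of transition-state stabilisation at 298 K gives a factor of ten.
    for lowering in (5.7, 28.5, 57.0):
        print(lowering, rate_enhancement(lowering))
```

On this picture every 5.7 kJ/mol or so by which the enzyme lowers the activation free energy relative to the solution reaction multiplies the rate constant by about ten at room temperature, which is why modest differences in transition-state binding translate into very large catalytic effects.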


In order to be able to answer the question of how an enzyme stabilises its transition state and what the more precise physical origin of the catalytic effect is, we must, however, look into the details of the molecular interactions between protein and substrate. During the last 20 years, a number of hypotheses addressing the fundamental mechanisms of enzyme catalysis have been proposed. One can tentatively classify the different proposals according to the major factor that is suggested to be responsible for catalysis in each case. Thus, we have witnessed, for example, stereochemical strain [2], entropic factors such as effective concentration [3a] and orientational effects [3b] (see also discussions in [3c-e]), electrostatic stabilisation [4] and desolvation (or gas-phase chemistry) [5] as candidates for the origin of enzyme catalysis. It is probably not an exaggeration to say that a consensus opinion on the matter is still lacking and that the issue of enzyme mechanism remains a challenging open question in biochemistry.

With the impressive advances in genetic engineering it has now become possible to, almost at will, modify particular amino acid residues of the protein and measure how such changes manifest themselves in terms of catalytic activity. Interpreting the kinetics of genetically modified enzymes is, however, not totally unproblematic. That is, one is always faced with the question of how extensive structural perturbations have been introduced by a given mutation, and without accompanying structural studies, it can be difficult to separate 'purely catalytic' effects from the possibility that the protein is falling apart. Perhaps more important is the fact that the overall catalytic effect of an enzyme cannot, in general, be partitioned into (additive) contributions from specific groups. After all, what one measures is the total effect of the protein (native or mutant) microenvironment on the reaction.

It seems to us that the complicated interrelations between structure and function in enzyme catalysis cannot be fully understood without a model that takes all the relevant interactions into account. If one can devise sufficiently accurate schemes for simulating enzymatic reactions and reproducing the observed rate constants, it would be possible to examine the different contributions to the calculated activation energy and evaluate their relative importance. This would also make it possible to explore the detailed mechanisms of enzyme catalysis in a way that is not accessible to direct experimental methods (e.g. with a reliable computational simulation scheme, the relative importance of such factors as strain and electrostatics can be readily evaluated). However, it is important to realise that in order for a theoretical framework to be really useful in this context, it should be able to give semi-quantitative or quantitative information, rather than just providing an exercise in computational quantum chemistry at the qualitative level.

These aspects are discussed in section 5.2, where we review different methodological approaches to describing reactions in enzymes as well as in solution. The emphasis is given to the empirical valence bond (EVB) method, which is presented in some detail. In section 5.3, we examine the catalytic reaction of staphylococcal nuclease, a Ca2+-dependent metalloenzyme, using EVB simulations. These calculations are divided into two parts; the first deals with evaluating the overall activation barrier of the enzyme and examining where the catalytic effect comes from in this specific case. In the second set of calculations, the dependence of the enzyme's activation barrier on the properties of the bound metal ion is studied. These calculations demonstrate how the enzyme can become optimised for a particular ion, and some interesting generalisations regarding free energy relationships in metalloenzymes also emerge as results.

5.2 Methods for simulating reactions in enzymes and solution

The two main routes available for obtaining quantum mechanical potential energy surfaces for molecular systems are provided by the molecular orbital (MO) and valence bond (VB) approaches. In modern (gas-phase) computational quantum chemistry, the former method has become the favoured choice because of its more convenient algorithmic implementation for computing. However, in this chapter, we are mainly concerned with the VB approach, which has the advantage that empirical data can easily be incorporated to give asymptotically correct energy surfaces at minimal computational effort; this is what we call the empirical valence bond (EVB) method [6]. It is nevertheless useful and instructive to compare the MO and VB formulations for implementation in free energy calculations, since some of the fundamental problems become easily identifiable.

5.2.1 Molecular orbital approach

The effect of the environment (which is denoted by s) surrounding the reacting system (r, referred to as solute) can be included in MO calculations in a fairly straightforward way if the overlap integrals between the solute and solvent are neglected [7]. The effect of the solvent polarisation on the solute electronic states can be estimated within the zero differential overlap approximation by rewriting the diagonal matrix elements of r as

$$F_{\nu\nu} = F^{0}_{\nu\nu} + \sum_{k} V_{\nu,k} \tag{5.1}$$

where $F^{0}$ is the Fock matrix of the system r in a vacuum and $V_{\nu,k}$ is the potential at the site of the $\nu$th orbital due to the permanent charge distribution and electronic polarisation of the atoms (k) in s. The eigenvectors of the modified Fock matrix give the solute molecular orbital vectors, C, and the corresponding charges, $Q_i = Z_i - \sum_{\nu \in i} \sum_j n_j C_{j\nu}^2$ (where Z is the nuclear charge and $\nu$ and j enumerate the atomic and molecular orbitals, respectively). The total potential energy of the system (r + s) can then be written as

$$E = E^{0} + \sum_i \sum_k Q_i q_k / r_{ik} - \frac{1}{2} \sum_k \alpha_k \xi_k^2 + V^{rs}_{vdW} + V^{s} \tag{5.2}$$

where $E^{0}$ is the vacuum energy of the quantum system (r) and the second and third terms denote interactions with permanent charges and induced dipoles in s. $q_k$, $\alpha_k$ and $\xi_k$ are the partial charge, electronic polarisability and electric field on the kth atom in s, respectively. Since the charge distributions of r and s are coupled through the second and third terms of eqn. (5.2), one must solve the equation self-consistently. The fourth term of eqn. (5.2) denotes the van der Waals interaction between r and s, while the last term represents the interaction energy of the surrounding system itself. These potential energy terms are most conveniently represented by standard classical force fields. More importantly, eqns. (5.1) and (5.2) and the approach of [7] correspond to a description in which the solvent is explicitly represented and which does not rely on unknown parameters such as a cavity radius or an effective dielectric constant. This allows one to actually examine the validity of various hypotheses about enzyme catalysis, rather than to determine the effective parameters that reproduce the different hypotheses. It can be noted that the first three terms of eqn. (5.2) are formally equivalent (within the CNDO approximation) to the Hamiltonian in Tapia and Johannin's reaction field approach [8], but that eqn. (5.2) is probably easier to implement computationally since it does not involve any integration over electric fields.

Calculation of the potential surface of the reaction according to eqn. (5.2) using iterative SCF procedures in combination with energy minimisation of the surrounding system does not pose any major difficulties [7]. However, the computational scheme should also involve an iterative evaluation of the solvent polarisation in response to the solute (r) charge distribution (which requires additional computing time) and the resulting change in the solute charge distribution. Furthermore, in order to evaluate the actual free energy surface one must be able to perform statistical averaging over many configurations of the system. This is most conveniently achieved with the free energy perturbation technique described in section 5.2.2.3. A problem with the MO formulation in this context is that it is difficult to rigorously devise a proper mapping potential which allows one to gradually force the system to move from reactants to products. That is, one must be able to make the total (mapping) Hamiltonian an explicit function of a control parameter, $\lambda$ (cf. section 5.2.2.3), without constraining the coordinates to any particular value, and still be able to monitor the actual ground-state surface. Note also that the iterative calculation, mentioned above, of the quantum (r) charge distribution and the polarisation of the surrounding medium is likely to slow down the convergence of any MO/MD procedure. Finally, and perhaps most importantly, it should be mentioned that MO approaches do not readily lend themselves to a convenient calibration by experimental information about bond breaking and bond forming reactions. In this respect, one should keep in mind that even for a simple process such as homolytic bond cleavage, SCF calculations can give very large errors if double configuration interaction is not included, and it is difficult to obtain analytical derivatives for such corrections.

5.2.2 The EVB model

One of the appealing features of VB methods is that their basic concepts, namely bond functions and ionic terms, have a simple and clear physical meaning. Consequently, it is, at least conceptually, easy to define different states along a chemical reaction path in terms of VB configurations. Furthermore, as will be emphasised in what follows, the definition of VB resonance structures with a clear physical meaning allows for incorporating experimental information as constraints on the potential surfaces. With this procedure, i.e. the EVB model, one can ensure that the reaction surface asymptotically behaves in accordance with experimental facts. This is important since semi-empirical or ab initio calculations for large reacting fragments with many electrons can be associated with considerable errors. In order to outline the ideas behind the EVB approach, we start by examining a simple test case.

5.2.2.1 VB potential surface for proton transfer reactions in solutions. Let us consider a proton transfer reaction in solution, which can be written as

$$\mathrm{RXH + YR' \rightarrow RX^{-} + HY^{+}R'} \tag{5.3}$$

In the language of VB theory, such a reaction can be described by the three resonance structures

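$$\begin{aligned}
\phi_1 &= \mathrm{RX{-}H \quad YR'} \\
\phi_2 &= \mathrm{RX^{-} \quad H{-}Y^{+}R'} \\
\phi_3 &= \mathrm{RX^{-} \quad H^{+} \quad YR'}
\end{aligned} \tag{5.4}$$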

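The electrons involved in the actual reaction (referred to here as the active electrons) can be treated according to the general prescription of the four-electron three-orbital problem with the VB wave functions [9]:

$$\begin{aligned}
\Phi_1 &= N_1 \{ |X \bar{H} Y \bar{Y}| - |\bar{X} H Y \bar{Y}| \} \chi_1 = \phi_1 \chi_1 \\
\Phi_2 &= N_2 \{ |X \bar{X} H \bar{Y}| - |X \bar{X} \bar{H} Y| \} \chi_2 = \phi_2 \chi_2
\end{aligned} \tag{5.5}$$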

where X, Y and H designate atomic orbitals on the corresponding atoms, the $N$s are normalisation constants and the $\chi$s are the wave functions of the inactive electrons moving in the field of the active electrons. We have thus partitioned the molecular electronic space into active and inactive parts and assumed no interaction between these parts. The three resonance structures, $\Phi_1$, $\Phi_2$ and $\Phi_3$, can be treated by the approach detailed in [9] and can be reduced to an effective two-state problem, where one state is mostly $\Phi_1$ and the other is mostly $\Phi_2$. The corresponding matrix elements can be evaluated by standard quantum chemical methods but this evaluation is very tedious. Instead we can exploit the simple physical picture of $\Phi_1$ and $\Phi_2$ and describe $H_{11}$ and $H_{22}$ by analytical potential functions that can be calibrated by both experimental information and accurate quantum mechanical calculations. That is, the function $H_{11}$ will be given, in the range where the X···Y distance is large compared to the X-H bond length, by a Morse potential function that depends on the distance $R_{X-H}$. When the H atom approaches Y, we have a repulsive van der Waals interaction between these atoms. We can describe both of these forces using analytical potential energy terms (see below). The same argument applies to $H_{22}$. As far as $H_{12}$ is concerned, we can approximate it by an exponential term and fit the parameters in this term to the experimental information on the gas-phase potential energy surface of the reaction or, if needed, to accurate gas-phase calculations.

Thus we describe the gas-phase potential by

$$\begin{aligned}
\varepsilon_1^{0} &= H_{11}^{0} = \Delta M(b_1) + U_{nb}^{(1)} + (K/2)(\theta_1 - \theta_1^{0})^2 + U_{inact}^{(1)} \\
\varepsilon_2^{0} &= H_{22}^{0} = \Delta M(b_2) + U_{nb}^{(2)} + (K/2)(\theta_2 - \theta_2^{0})^2 + \alpha_2^{0} + U_{inact}^{(2)} \\
H_{12}^{0} &= A \exp\{-\mu (r_3 - r_3^{0})\}
\end{aligned} \tag{5.6}$$

where $b_1$, $b_2$ and $r_3$ are, respectively, the X-H, H-Y and X···Y distances, $\theta_1$ and $\theta_2$ are, respectively, the R-X-H and H-Y-R' bond angles, $\Delta M$ is a Morse potential function taken relative to its minimum value ($\Delta M(b) = M(b) - D$) and $U_{nb}^{(i)}$ is the repulsive non-bonded interaction in the given configuration. The parameter $\alpha_2^{0}$ expresses the difference between the energy of $\psi_1$ and $\psi_2$, where the fragments of each resonance structure are at infinite separation. The potentials $U_{inact}^{(i)}$ represent the interaction with the inactive part of the reacting system and are described by

$$\begin{aligned}
U_{inact}^{(i)} &= \langle \chi_i | H_{inact} | \chi_i \rangle \\
&= \frac{1}{2} \sum_{bonds} K_b^{(i)} (b - b_0^{(i)})^2 + \frac{1}{2} \sum_{angles} K_\theta^{(i)} (\theta - \theta_0^{(i)})^2 \\
&\quad + \sum_{torsions} K_\phi^{(i)} [1 + \cos(n^{(i)} \phi^{(i)} - \delta^{(i)})] + U_{nb,inact}^{(i)}
\end{aligned} \tag{5.7}$$

where b, $\theta$ and $\phi$ are, respectively, the bond length, bond angle and dihedral angle in the fragments R and R'.
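The gas-phase matrix elements of eqn. (5.6) can be illustrated with a minimal sketch. The Morse form $M(b) = D[1 - \exp(-a(b - b_0))]^2$, the function names and all parameter names (`d_e`, `a`, `b0`, `k_theta`, `amp`, `mu`, `r3_0`) are assumptions introduced here for illustration; the text itself only specifies that $\Delta M(b) = M(b) - D$ and that $H_{12}^{0}$ is a single exponential in the X···Y distance.

```python
import math


def morse_relative(b, d_e, a, b0):
    """Delta M(b) = M(b) - D, assuming M(b) = D * (1 - exp(-a * (b - b0)))**2.

    With this (assumed) form the function equals -D at b = b0 and tends to
    zero when the bond is fully dissociated.
    """
    m = d_e * (1.0 - math.exp(-a * (b - b0))) ** 2
    return m - d_e


def diagonal_energy(b, theta, d_e, a, b0, k_theta, theta0,
                    u_nb=0.0, alpha=0.0, u_inact=0.0):
    """Diagonal gas-phase element in the spirit of eqn. (5.6).

    u_nb, alpha and u_inact are passed in as precomputed numbers; alpha is
    zero for the first resonance structure.
    """
    bend = 0.5 * k_theta * (theta - theta0) ** 2
    return morse_relative(b, d_e, a, b0) + u_nb + bend + alpha + u_inact


def off_diagonal(r3, amp, mu, r3_0):
    """Off-diagonal element H12 = A * exp(-mu * (r3 - r3_0))."""
    return amp * math.exp(-mu * (r3 - r3_0))
```

In this sketch both diagonal energies vanish at infinite fragment separation apart from the shift `alpha`, which is why a single number per state is enough to fix their relative asymptotic energies.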

The effect of the solvent (or, more generally, the surrounding medium) on our reaction Hamiltonian is obtained by adding the corresponding energies to the diagonal matrix elements,

$$\begin{aligned}
\varepsilon_1 &= H_{11} = H_{11}^{0} + \Delta V_{sol}^{(1)} \\
\varepsilon_2 &= H_{22} = H_{22}^{0} + \Delta V_{sol}^{(2)} \\
H_{12} &= H_{12}^{0}
\end{aligned} \tag{5.8}$$

where $\Delta V_{sol}^{(i)}$ is the interaction potential between the solute atoms in the ith VB configuration and the surrounding solvent. The ground-state potential surface for the reaction is then obtained by solving the secular equation $\mathbf{H}\mathbf{C} = E_g \mathbf{C}$.
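For the effective two-state problem the secular equation has a closed-form solution, $E_g = \frac{1}{2}(\varepsilon_1 + \varepsilon_2) - \frac{1}{2}\sqrt{(\varepsilon_1 - \varepsilon_2)^2 + 4H_{12}^2}$. The following is a minimal sketch of that diagonalisation; the function names are illustrative and not taken from any particular EVB program.

```python
import math


def evb_ground_state(eps1, eps2, h12):
    """Lowest eigenvalue of the 2x2 Hamiltonian [[eps1, h12], [h12, eps2]]."""
    mean = 0.5 * (eps1 + eps2)
    half_gap = 0.5 * (eps1 - eps2)
    return mean - math.sqrt(half_gap ** 2 + h12 ** 2)


def evb_weights(eps1, eps2, h12):
    """Squared ground-state coefficients (c1**2, c2**2)."""
    if h12 == 0.0:
        # No mixing: the ground state is simply the lower diabatic state.
        return (1.0, 0.0) if eps1 <= eps2 else (0.0, 1.0)
    e_g = evb_ground_state(eps1, eps2, h12)
    # First row of (H - E_g) C = 0 gives the unnormalised vector (h12, E_g - eps1).
    c1, c2 = h12, e_g - eps1
    norm = c1 * c1 + c2 * c2
    return c1 * c1 / norm, c2 * c2 / norm
```

When the two diagonal energies are equal the ground state lies exactly $|H_{12}|$ below them and both resonance structures contribute equally, which is the situation near the transition state.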

5.2.2.2 The EVB calibration procedure. A key feature of the EVB method is its unique calibration possibilities, making use of reliable experimental information. That is, returning to our proton transfer example, after evaluating the free energy surface with the initial parameter $\alpha_2^{0}$, we can use the fact that the free energy of the proton transfer reaction is given by

$$\begin{aligned}
\Delta G_{PT} &= \Delta G[(\mathrm{A{-}H + B}) \rightarrow (\mathrm{A^{-} + HB^{+}})] \\
&= 2.3RT[\mathrm{p}K_a(\mathrm{A{-}H}) - \mathrm{p}K_a(\mathrm{B^{+}{-}H})]
\end{aligned} \tag{5.9}$$

Now we can adjust $\alpha_2^{0}$ until the calculated and observed $\Delta G_{PT}$ coincide. This calibrated surface can then be used with confidence for studying the reaction in different solvents (or environments, such as an enzyme's active site), since $\alpha_2^{0}$ remains unchanged and only $\Delta g_{sol}^{(i)}$ is recalculated. In this way, the error associated with the evaluation of $H_{ii}^{0}$ does not affect the calculations of the relative effects of different surrounding media.
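As a worked illustration of eqn. (5.9), the sketch below converts a pair of pKa values into a proton transfer free energy and applies a simple one-step correction to the gas-phase shift. The function names are hypothetical, and the update rule assumes that the calculated free energy responds one-to-one to a change in the shift parameter, which is only approximately true and would in practice be iterated.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def proton_transfer_free_energy(pka_donor, pka_acceptor_acid, temperature=298.15):
    """Eqn. (5.9): dG_PT = ln(10) * RT * [pKa(A-H) - pKa(B+-H)] in kcal/mol."""
    return math.log(10.0) * R_KCAL * temperature * (pka_donor - pka_acceptor_acid)


def recalibrated_shift(alpha_initial, dg_calc, dg_obs):
    """One correction step for the gas-phase shift parameter.

    Assumes (as a simplification) that shifting alpha by x shifts the
    calculated reaction free energy by the same x.
    """
    return alpha_initial + (dg_obs - dg_calc)
```

For example, with illustrative textbook values of 15.7 for water and 4.3 for a carboxylic acid (numbers not taken from this chapter), `proton_transfer_free_energy(15.7, 4.3)` gives roughly 15.6 kcal/mol.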

In general, when one deals with a more complicated reaction, for which it is hard to obtain gas-phase estimates of $\alpha_i^{0}$, it is convenient to use solution experiments to obtain the first estimate of $\alpha_i^{0}$. This is done by using

$$\varepsilon_i(\infty) - \varepsilon_1(\infty) \approx \alpha_i^{0} + \Delta g_{sol}^{(i)}(\infty) - \Delta g_{sol}^{(1)}(\infty) = \alpha_i^{0} + \Delta\Delta g_{sol}^{(i)} \approx \Delta G_i(\infty)_{obs} \tag{5.10}$$

where $\varepsilon_i(\infty)$ is the energy of the ith resonance structure when the corresponding fragments are held at infinite separation. $\Delta g_{sol}^{(i)}$ is the solvation energy of the ith resonance structure and can be estimated by the free energy perturbation approach that is described below or by simpler models [6b]. Similarly, $\Delta G_i(\infty)_{obs}$ is the free energy involved in forming the ith configuration from the first configuration where the fragments in each configuration are held at infinite separation; this leads to the useful estimate

$$\alpha_i^{0} \approx \Delta G_i(\infty)_{obs} - \Delta\Delta g_{sol}^{(i)} \tag{5.11}$$

The off-diagonal matrix elements $H_{ij}$ can also be determined by ab initio calculations (see discussion in [9]) or by semi-empirical procedures (e.g. [6b]). Here, however, we follow the simplified procedure of [9] and describe $H_{ij}$ by a simple function

$$H_{ij} = \sum_{(k,l)} A_{ij}^{(k,l)} \exp\{-\mu_{ij}^{(k,l)} r_{kl}\} \tag{5.12}$$

where the atom pair (k, l) is chosen according to the specific $H_{ij}$. This function is fitted to experimental information about the activation free energy for the different steps of the relevant reference reaction in solution, using

(5.13)


This procedure is far less reliable than that used for the diagonal energies and can benefit from ab initio calculations on the gas-phase reaction (see [9]), which can be used as extra constraints on the parameters of eqn. (5.6). However, the calculated difference between the free energy surface in solution and in the enzyme is not very sensitive to the exact value of the $H_{ij}$'s. It has previously been demonstrated [9] that the dependence of $\Delta g^{\ddagger}$ on the reaction free energy is almost linear. Moreover, the relation between $\Delta g^{\ddagger}$ and $\Delta G^{0}$ is virtually independent of the magnitude of the particular $H_{ij}$ (this is why linear free energy relationships were found to be so powerful in physical organic chemistry [10]).

Some readers might feel uncomfortable with the approach above since it may appear to be an arbitrary empirical parametrisation procedure. However, the present approach is aimed at understanding enzymes and not at speculating about possible effects. This requires that one be able to draw conclusions from the simulations with a minimum degree of uncertainty. Hence, the procedure we use here does not involve a single adjustable parameter in the enzyme calculations, but it does involve solid parametrisation with respect to data for the corresponding processes in solution. This makes our conclusions about the difference between the enzyme and solution reactions quite unbiased. Lastly, it is quite likely that when ab initio calculations reach the stage of being able to handle enzymatic reactions, the results will have to be compared with the same information (e.g. pKa values) as used in the EVB method.

5.2.2.3 Free energy perturbation technique. Knowing how to calculate the ground-state potential surface, we must now be able to perform statistical averaging over the configurational space in order to obtain free energies. The strategy for this involves the free energy perturbation (FEP) technique [11], which involves a non-physical transformation from one resonance structure to another. That is, by introducing the mapping potential

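$$\varepsilon_m = \lambda_1^{m} \varepsilon_1 + \lambda_2^{m} \varepsilon_2 - 2|H_{12}| \sqrt{\lambda_1^{m} \lambda_2^{m}} \tag{5.14}$$

and changing the mapping vector $\lambda = (\lambda_1, \lambda_2)$ by small increments in an MD simulation from (1, 0) to (0, 1), we can drive the system from the reactants via the transition state to the products. For $\lambda = (\frac{1}{2}, \frac{1}{2})$ the mapping potential will approach the true ground-state potential at the transition state region, $E_g^{\ddagger} = \frac{1}{2}(\varepsilon_1 + \varepsilon_2) - |H_{12}|$.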
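The behaviour of the mapping potential at its end points and at the midpoint can be checked with a short sketch. It assumes that the two components of the mapping vector sum to one (consistent with the end points (1, 0) and (0, 1) quoted above, though not stated explicitly), and the function names are illustrative.

```python
import math


def mapping_potential(eps1, eps2, h12, lam1):
    """Eqn. (5.14) with lam2 = 1 - lam1 (an assumption of this sketch)."""
    lam2 = 1.0 - lam1
    return lam1 * eps1 + lam2 * eps2 - 2.0 * abs(h12) * math.sqrt(lam1 * lam2)


def ground_state(eps1, eps2, h12):
    """Lower root of the two-state secular equation."""
    return 0.5 * (eps1 + eps2) - math.sqrt(0.25 * (eps1 - eps2) ** 2 + h12 ** 2)


def mapping_schedule(n_windows):
    """Evenly spaced mapping vectors from (1, 0) to (0, 1)."""
    step = 1.0 / (n_windows - 1)
    return [(1.0 - m * step, m * step) for m in range(n_windows)]
```

At the end points the mapping potential reduces to the pure diabatic energies, and at the midpoint it coincides with the ground-state energy wherever the two diabatic energies are equal.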

The free energy associated with changing E1 to E2 can be obtained from

$$\delta G(\lambda_m \rightarrow \lambda_{m'}) = -RT \ln \langle \exp\{-(\varepsilon_{m'} - \varepsilon_m)/RT\} \rangle_m \tag{5.15}$$

$$\Delta G(\lambda_n) = \Delta G(\lambda_0 \rightarrow \lambda_n) = \sum_{m=0}^{n-1} \delta G(\lambda_m \rightarrow \lambda_{m'}) \tag{5.16}$$

where the average $\langle\ \rangle_m$ is evaluated on the potential surface $\varepsilon_m$. Since $\Delta G(\lambda_m)$ represents the free energy associated with moving on the constraint potential $\varepsilon_m$, we still need to obtain the free energy, $\Delta g$, corresponding to the trajectories moving on the ground-state potential $E_g$. This is done using the relationship [9, 12]

$$\exp\{-\Delta g(x_n)/RT\} \approx \exp\{-\Delta G(\lambda_m)/RT\} \langle \exp\{-[E_g(x_n) - \varepsilon_m(x_n)]/RT\} \rangle_m \tag{5.17}$$

where the reaction coordinate $x_n$ can be defined in terms of the energy gap, $\Delta\varepsilon = \varepsilon_2 - \varepsilon_1$, between the potential surfaces [9, 12]. This means that we calculate the energy difference between the mapping potential and the ground-state potential (given by eqn. (5.8)) at each point of the MD trajectory and use the Boltzmann average of this difference to correct the free energy obtained on the mapping potential. Using $\Delta g(x_n)$ we determine the values of $(\Delta g^{\ddagger}_{1 \rightarrow 2})_{calc,w}$ and $(\Delta G_{1 \rightarrow 2})_{calc,w}$ and adjust $\alpha_2^{0}$ and $H_{12}^{0}$ until the calculated and observed values of these free energies coincide and satisfy eqns. (5.10) and (5.13). We now apply the computational framework outlined above to a specific enzymatic reaction.
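The window-by-window summation of eqns. (5.15) and (5.16) can be illustrated on a toy model. The sketch below is not the procedure used in the chapter: it replaces the MD trajectory by exact Gaussian sampling, uses two one-dimensional harmonic states of equal curvature, and sets the off-diagonal element to zero so that the mapping potential is a simple linear mixture. All names and numerical values are hypothetical. For this model the exact free energy difference equals the energy offset between the two minima, which gives a direct check on the estimate.

```python
import math
import random

RT = 0.593  # kcal/mol, roughly room temperature


def eps1(x, k):
    """Toy reactant state: harmonic well centred at x = -1."""
    return 0.5 * k * (x + 1.0) ** 2


def eps2(x, k, offset):
    """Toy product state: harmonic well centred at x = +1, raised by offset."""
    return 0.5 * k * (x - 1.0) ** 2 + offset


def eps_map(x, lam2, k, offset):
    """Linear mapping potential (off-diagonal term omitted in this toy model)."""
    return (1.0 - lam2) * eps1(x, k) + lam2 * eps2(x, k, offset)


def fep_free_energy(k=10.0, offset=3.0, n_windows=11, n_samples=20000, seed=0):
    """Sum of delta G over adjacent windows, eqns. (5.15)-(5.16)."""
    rng = random.Random(seed)
    sigma = math.sqrt(RT / k)  # width of the Boltzmann distribution in each window
    lams = [m / (n_windows - 1) for m in range(n_windows)]
    total = 0.0
    for lam, lam_next in zip(lams[:-1], lams[1:]):
        centre = 2.0 * lam - 1.0  # minimum of the mapping potential at this lambda
        acc = 0.0
        for _ in range(n_samples):
            x = rng.gauss(centre, sigma)  # stands in for an MD configuration
            gap = eps_map(x, lam_next, k, offset) - eps_map(x, lam, k, offset)
            acc += math.exp(-gap / RT)
        total += -RT * math.log(acc / n_samples)
    return total
```

Calling `fep_free_energy()` with the default settings should return a value close to the chosen offset of 3 kcal/mol; the residual deviation reflects the finite sampling in each window, which is the toy-model analogue of the convergence error discussed for the FEP/MD calculations.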

5.3 Application to the staphylococcal nuclease reaction

Staphylococcal nuclease (SNase) is a single peptide chain enzyme consisting of 149 amino acid residues. It catalyses the hydrolysis of both DNA and RNA at the 5' position of the phosphodiester bond, yielding a free 5'-hydroxyl group and a 3'-phosphate monoester [13]:

$$\mathrm{H_2O + 5'{-}OP(O_2){-}O{-}3' \rightarrow 5'{-}OH + (OH)P(O_2){-}O{-}3'} \tag{5.18}$$

The enzyme requires one Ca2+ ion for its action and shows little or no activity when Ca2+ is replaced by other divalent cations [14]. A crystallographic structure at 1.5 Å resolution of SNase in complex with the inhibitor pdTp has been determined by Cotton and co-workers [15]. The active site is located at the surface of the protein, with the pyrimidine ring of pdTp fitting into a hydrophobic pocket while the 3'- and 5'-phosphate groups interact with several charged groups. In particular, the two arginine residues 35 and 87 donate hydrogen bonds to the 5'-phosphate, thereby partly neutralising its double negative charge. The Ca2+ ion is ligated by the carboxylate groups of Asp 21 and Asp 40, the carbonyl oxygen of Thr 41, two water molecules and one of the 5'-phosphate oxygens (Figure 5.1).

Based on this inhibited structure, a reaction mechanism for the enzyme has been postulated [15b]: (i) General base catalysis by Glu 43, which accepts a proton from a (crystallographically observed) water molecule in the second ligand sphere of the Ca2+ ion, yielding a free hydroxide ion. (ii) Nucleophilic attack by the OH- ion on the phosphorus atom in line with the 5'-O-P ester bond, leading to the formation of a trigonal bipyramidal (i.e. penta-coordinated) transition state or metastable intermediate. (iii) Breakage of the 5'-O-P bond and formation of products (cf. Figure 5.2). This mechanism would be rate limited by the second step, which in solution corresponds to an activation free energy barrier of 33 kcal/mol [16]. This mechanism has been confirmed by kinetic experiments on several mutant versions of SNase [17].

Figure 5.1 Stereo view of the active site of SNase. Coordinates are from the dataset 2SNS [15] of the Brookhaven Protein Databank. The positions of the two Ca2+-ligated waters and the reacting water molecule have been indicated with small circles.

The major difference between the inhibitor pdTp and a true substrate (in the reactant state) is the extra negative charge on the 5'-phosphate of pdTp. It therefore seems reasonable to believe that the SNase-pdTp-Ca2+ structure resembles the activated complex associated with the hydrolysis step rather than the reactant conformation (although it is not strictly a transition state analogue). Serpersu et al. [17] have examined the kinetic properties of several different SNase mutants. Their data confirm the general aspects of the proposed mechanism, indicating the importance of Glu 43 as a general base catalyst and of the Ca2+ ion, Arg 35 and Arg 87 in stabilising the transition state. However, it was found [17b] that while both the R35G and R87G mutants reduce $k_{cat}$ by factors of about 35000, only R35G affects substrate binding appreciably. They therefore suggested a slightly modified mechanism in which Arg 87 interacts primarily with the trigonal bipyramidal transition state (or metastable intermediate) and not with the 5'-phosphate group in the reactant conformation.

The overall catalytic rate constant of SNase is $k_{cat} = 95\ \mathrm{s^{-1}}$ at T = 297 K, corresponding to a total free energy barrier of $(\Delta g^{\ddagger}_{cat})_{obs} = 14.9$ kcal/mol. This should be compared to the pseudo-first-order rate constant for non-enzymatic hydrolysis of phosphodiester (with a water molecule as the attacking nucleophile), which is $2 \times 10^{-14}\ \mathrm{s^{-1}}$, corresponding to $\Delta g^{\ddagger}_{(aq)} = 36$ kcal/mol [16]. The rate acceleration accomplished by the enzyme is thus $10^{15}$ to $10^{16}$, which is quite impressive.

Figure 5.2 The three EVB resonance structures ($\Psi_1$, $\Psi_2$, $\Psi_3$) used to describe the catalytic reaction of SNase. The first two steps of the mechanism are described by $\Psi_1 \rightarrow \Psi_2 \rightarrow \Psi_3$.
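The conversion between rate constants and activation free energies quoted above can be sketched with the standard transition state theory expression $k = (k_B T / h) \exp(-\Delta g^{\ddagger}/RT)$. The use of this particular prefactor with a transmission coefficient of one is an assumption of the sketch; the text does not state which expression was used, so small differences from the quoted barriers are to be expected.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
R_KCAL = 1.987204e-3       # gas constant, kcal/(mol K)


def barrier_from_rate(k, temperature):
    """Activation free energy (kcal/mol) from a first-order rate constant (1/s)."""
    prefactor = K_B * temperature / H_PLANCK
    return R_KCAL * temperature * math.log(prefactor / k)


def rate_from_barrier(dg, temperature):
    """Inverse relation: first-order rate constant (1/s) from a barrier (kcal/mol)."""
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-dg / (R_KCAL * temperature))


def rate_acceleration(k_enzyme, k_solution):
    """Ratio of the enzymatic and non-enzymatic rate constants."""
    return k_enzyme / k_solution
```

With the rate constants given in the text, `barrier_from_rate(95.0, 297.0)` and `barrier_from_rate(2e-14, 297.0)` land within a few tenths of a kcal/mol of the quoted 14.9 and 36 kcal/mol, and `rate_acceleration(95.0, 2e-14)` falls between $10^{15}$ and $10^{16}$.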

5.3.1 Free energy profile for the SNase reaction

The first two steps of the SNase reaction, of which the second one is rate-limiting, can be described by the three EVB resonance structures of Figure 5.2. Here, $\Psi_1^{p}$ represents the reactant state, with Glu 43 negatively charged and the 5'-phosphate group in tetrahedral conformation. The state resulting from the general base catalysis step, where Glu 43 has been protonated by the adjacent water molecule, is denoted by $\Psi_2^{p}$, and the pentacoordinated phosphate group formed after nucleophilic attack by the OH- ion is denoted $\Psi_3^{p}$. The atoms depicted in the figure are considered as our solute system (r), while the rest of the protein/water environment constitutes the 'solvent' (s) for the enzyme reaction. Although the Ca2+ ion does not actually 'react', it is included in the reacting system for convenience. The reaction free energy profile was evaluated by the EVB + FEP/MD procedure outlined above, in which the system consisting of the enzyme, substrate and 60 water molecules (surrounding the active site) was slowly driven from the state $\Psi_1^{p}$, via $\Psi_2^{p}$, to $\Psi_3^{p}$ (see [18] for details).

A discussion of the relevant reference reactions (in aqueous solution) pertaining to the SNase mechanism is given in [18]. The experimental data on these reactions, which are used to calibrate our EVB parameters, are summarised in Figure 5.3. The calibrated free energy curve for the first two (reference) reaction steps in solution obtained from the EVB + FEP/MD calculations is denoted by $\Delta g_s$ and is shown in Figure 5.4. We should emphasise that $\Delta g_s^{\ddagger}$ does not correspond to the activation free energy associated with the normal hydrolysis of phosphodiester in solution, but to the free energy barrier that would result if the non-enzymatic reaction proceeded through the same steps as in the enzyme. The free energy profile $\Delta g_p$ in Figure 5.4 is the result of transferring the reference reactions from a water cage into the protein environment. The state $\Psi_1^{p}$ corresponds to the FEP mapping vector $\lambda = (1, 0, 0)$, while $\lambda = (0, 1, 0)$ and $\lambda = (0, 0, 1)$ denote the states $\Psi_2^{p}$ and $\Psi_3^{p}$, respectively. The free energy curves in Figure 5.4 are calculated from the MD trajectories using eqn. (5.17) with $\lambda$ taken as the reaction coordinate. In this case, we consider only a step-wise mechanism, but examination of concerted pathways is straightforward. However, kinetic experiments on the Glu 43 → Ser 43 mutant [17c] give no evidence for a concerted mechanism.

Figure 5.3 Free energy diagram based on experimental kinetic data (see text for references) for the reference solution reactions corresponding to the mechanism proposed for SNase. [Axes: $\Delta G$ (kcal/mol) against reaction coordinate.]

Figure 5.4 Calculated free energy profiles for the reference reaction(s) in solution (after calibration of $\alpha_i^{0}$ and $H_{ij}$), $\Delta g_s$, and for the enzyme reaction, $\Delta g_p$. [Axes: free energy (kcal/mol, 0 to 60) against $\lambda$, from (1,0,0) through (0,1,0) to (0,0,1).]

5.3.1.1 Proton transfer step. The transfer of a proton from the water molecule in the second ligand sphere of the Ca2+ ion to its observed hydrogen bonding partner, Glu 43, in the first step of the reaction was suggested by Cotton et al. [15b]. Based on their previous 2 Å structure of the SNase-pdTp-Ca2+ complex, which involved Glu 43 as a direct ligand to the calcium, these authors [19] had initially proposed that a hydroxide ion directly bound to the Ca2+ ion could serve as the nucleophile for the second step. However, this hypothesis was not supported by any electron density (in the 2 Å map) corresponding to the proposed Ca2+-bound hydroxide ion (or water molecule) adjacent to the phosphate group. Since a water molecule positioned between the 5'-phosphate and Glu 43 was observed in the refined 1.5 Å resolution map, this seemed to be a good candidate for a nucleophile, probably requiring general base catalysis [15b]. In this context, one should bear in mind the fact that pdTp has an extra negative charge on the 5'-phosphate group as compared to substrates. This could make a discrimination between the two possibilities difficult because of effects of this charge on the affinity of the Ca2+ ion for an OH- ligand.

The barrier of the proposed proton transfer to Glu 43 must be reduced by the enzyme by at least 3 kcal/mol, since the barrier in solution for this step is already about 18 kcal/mol (cf. Figure 5.3), while the overall barrier in the enzyme is ~15 kcal/mol. In fact, the hydroxide ion must be stabilised by much more than 3 kcal/mol (as compared to the solution reaction) in order to provide an energetically reasonable starting point for the second reaction step. Our calculated free energy profile (Figure 5.4) shows that the enzyme can fulfil both of these requirements. The energetic cost of the general base catalysis step is reduced by almost 15 kcal/mol to only about 1 kcal/mol and the activation barrier by about 14 kcal/mol. This demonstrates that the proposed first step is indeed an energetically acceptable way of forming the OH- ion. However, the major factor responsible for the free energy reduction in this step is the electrostatic field from the Ca2+ ion. The resulting OH- ion is stabilised in our simulation by becoming directly ligated to the calcium (cf. Figure 5.5). This result seems very reasonable, since the Ca2+ ion provides the only positive charge close enough to significantly affect the proton transfer step. It seems most unlikely that the required stabilisation of the OH- ion could be achieved without its interaction with some positively charged group. Both Arg 35 and Arg 87 as well as Lys 48 appear to be too far away to exert any sizable electrostatic influence on this reaction step. The importance of the Ca2+ ion in stabilising the OH- ion, indicated by our calculations, is also consistent with the observation that at lower Ca2+ concentrations, the pH optimum of the catalytic reaction is shifted towards higher pH values [14].

5.3.1.2 Formation of the penta-coordinated transition state. It is evident that SNase must provide a very efficient catalytic environment in order to reduce the activation barrier of the nucleophilic attack by the OH- ion on the 5'-phosphate group. The observed barrier for this reaction in solution, $(\Delta g^{\ddagger}_{2 \rightarrow 3})_{obs,w} = 33$ kcal/mol [16], is 11 kcal/mol higher than that for a nucleophilic attack by OH- on a neutral phosphate [16]. This difference is most likely due to the increased electrostatic repulsion between the OH- ion and negatively charged phosphate oxygens. The crystallographic structure of the enzyme-inhibitor complex [15] provides a plausible explanation for how the enzyme reduces this electrostatic repulsion and catalyses the reaction. The two arginine residues Arg 35 and Arg 87 interact closely with the doubly negatively charged 5'-phosphate of pdTp and one of the phosphate oxygens is also ligated to the calcium ion. These positive charges should be able to at least partially neutralise the double negative charge associated with the formation of the transition state in the hydrolysis step.

Our calculated reaction free energy profile (Figure 5.4) shows a reduction of the free energy barrier associated with the second step of about 19 kcal/mol, which is indeed impressive. It is also encouraging to note that the total free energy barrier (for both reaction steps) resulting from the EVB calculations, $\Delta g^{\ddagger}_{cat} = 15.1$ kcal/mol, is well in accord with the experimentally observed overall rate of the enzyme, $(\Delta g^{\ddagger}_{cat})_{obs} = 14.9$ kcal/mol. However, it should be pointed out that this good agreement is somewhat fortuitous, as our error range for the absolute value of $\Delta g^{\ddagger}$ is about 5 kcal/mol. This error range is mainly determined by the errors in the experimentally evaluated reaction free energies [16] and by the convergence errors in the FEP/MD calculations.

Figure 5.5 shows an MD snapshot of the active site structure at $\lambda = (0, 1, 0)$ ($\Psi_2^{p}$) and at $\lambda = (0, 0.6, 0.4)$ (the transition state for the second step). The simulations support the suggestion [17b] that Arg 87 primarily interacts with the 5'-phosphate group in the transition state rather than with the reactant state. As can be seen from Figure 5.5, Arg 87 does not interact as closely with the 5'-phosphate group in the $\Psi_2^{p}$ state as in the transition state, where it donates two strong hydrogen bonds to the O5' atom and one of the free phosphate oxygens. The second arginine residue in the active site (Arg 35), however, is hydrogen bonded to the 5'-phosphate group in both the reactant ($\Psi_2^{p}$) and transition states, the interaction being somewhat strengthened at the transition state. This is consistent with the finding [17b] that Arg 35 affects substrate binding substantially more than Arg 87. The calculations therefore support the idea that the SNase-pdTp-Ca2+ ternary complex resembles the activated complex of the reaction (although it is not a transition state analogue), rather than the reactant conformation with a bound substrate.

Figure 5.5 MD snapshot of the active site structure after the proton transfer step (state $\Psi_2^{p}$, dashed lines) and at the transition state of the second reaction step (full lines). The Ca2+ and OH- ions are represented by dotted circles in the $\Psi_2^{p}$ structure and solid circles at the transition state.

It appears that the excellent 'recognition' of the transition state by SNase can be described almost completely as an electrostatic effect. The loss of interaction energy between the Ca2+ ion and the hydroxide ion in moving towards the penta-coordinated structure is compensated for mainly by increased interaction between the Ca2+ ion and the 5'-phosphate oxygen ligand. The accumulating negative charge (-1 → -2) on the phosphate group is effectively neutralised by closer interactions with Arg 35 and Arg 87. In particular, Arg 87 appears to be an important factor, as it hydrogen bonds strongly with two of the phosphate oxygens in the transition state but not in the reactant state.

As an additional check, we have also performed calculations [18] on the Asp 21 → Glu 21 (D21E) mutation, which corresponds to an enlargement of one of the Ca2+ ligands. While this modification has very little effect on both substrate and ion binding, the catalytic activity of the D21E mutant is a factor of 1500 lower than that of the native enzyme [17b]. This effect is fairly well reproduced by our simulations, which give an overall increase in the activation barrier of $\Delta\Delta G^{\ddagger}_{cat}(\mathrm{D21 \rightarrow E21}) = 3.8$ kcal/mol. It appears that the reduced catalytic activity of this mutant is mainly due to a displacement of the Ca2+ ion caused by enlarging the Asp 21 ligand, which gives a less efficient solvation of the negatively charged transition state.

5.3.2 Effects of metal ion substitutions

We now turn to the calculation of free energy changes of the enzyme-substrate-ion-water system (at different configurations along the reaction path) associated with substituting other divalent cations for Ca2+. For this purpose, FEP/MD simulations are most useful since they allow one to gradually transform a given ion into another while recording the free energy change of the system. For a given configuration (see below), the only parameters that need to be varied are the non-bonded parameters of the ion interactions, and the free energy can be obtained directly as a function of these. In order to examine the catalytic effects of 'mutating' the Ca2+ ion at its site in the protein, we basically need to monitor the free energy change for the three different states $\Psi_1$, $\Psi_2$ and $\Psi^{\ddagger}_{2\to3}$, where $\Psi^{\ddagger}_{2\to3}$ denotes the rate-limiting transition state associated with forming the penta-coordinated phosphate group (we drop the p superscripts since we will now deal exclusively with the states in the protein environment). If we denote the free energy of changing Ca2+ to Me2+ (where the change can be in either the Sr2+ or the Mg2+ direction) for these states by $\Delta\Delta G_1(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$, $\Delta\Delta G_2(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$ and $\Delta\Delta G^{\ddagger}_{2\to3}(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$, the change in the overall activation barrier is given by

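$$
\Delta\Delta G^{\ddagger}_{\mathrm{cat}}(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}}) = \Delta\Delta G^{\ddagger}_{2\to3}(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}}) - \min\left[\Delta\Delta G_1(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}}),\ \Delta\Delta G_2(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}}) + \Delta G_{1\to2}(\mathrm{Ca^{2+}})\right] \tag{5.19}
$$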

where $\Delta G_{1\to2}(\mathrm{Ca^{2+}}) \approx 1$ kcal/mol.

The potential parameters used in the present calculations are identical to those in [18]. The ion interacts via electrostatic and Lennard-Jones terms, $V_{Ij} = 332\,Q_I q_j/r_{Ij} + A_I A_j/r_{Ij}^{12} - B_I B_j/r_{Ij}^{6}$, where the subscript $I$ denotes the ion and $j$ is another atom with which it interacts. In order to calibrate the calculated hydration energies for different ions against the non-bonded parameters, we first carry out FEP/MD simulations of a solvated ion in water. In these calculations, the Lennard-Jones parameters of the ion were gradually changed from $(A_I, B_I) = (2340.0, 25.0)$ to $(A_I, B_I) = (5.0, 1.0)$. The calibration calculations in water covered a total simulation time of 40 ps for both the forward and backward transformation directions.
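A minimal sketch of this pair potential follows, written with the usual convention that like charges give a positive Coulomb energy (332 kcal Å/mol per squared unit charge). The water-oxygen charge and Lennard-Jones values are assumed, SPC-like illustrative numbers and are not taken from the text; only the two ion parameter sets are the endpoints quoted above.

```python
COULOMB = 332.0  # kcal * angstrom / (mol * e^2); like charges repel


def ion_pair_energy(r, q_ion, q_j, a_ion, a_j, b_ion, b_j):
    """V_Ij in kcal/mol for an ion I and an atom j separated by r (angstroms)."""
    electrostatic = COULOMB * q_ion * q_j / r
    repulsion = a_ion * a_j / r**12
    attraction = b_ion * b_j / r**6
    return electrostatic + repulsion - attraction


def optimum_distance(q_ion, q_j, a_ion, a_j, b_ion, b_j,
                     lo=1.0, hi=5.0, step=0.001):
    """Distance on a simple grid at which V_Ij is lowest."""
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    return min(grid, key=lambda r: ion_pair_energy(
        r, q_ion, q_j, a_ion, a_j, b_ion, b_j))


# Assumed water-oxygen site (illustrative SPC-like values, not from the text).
Q_O, A_O, B_O = -0.82, 793.3, 25.01

if __name__ == "__main__":
    # The two (A_I, B_I) endpoints of the calibration run for a +2 ion.
    for a_ion, b_ion in [(2340.0, 25.0), (5.0, 1.0)]:
        r_min = optimum_distance(2.0, Q_O, a_ion, A_O, b_ion, B_O)
        print(f"A_I={a_ion:7.1f}  B_I={b_ion:5.1f}  r_min={r_min:.2f} A")
```

Decreasing the repulsive parameter moves the ion-oxygen energy minimum to shorter distance, which is how the perturbation sweeps from a large to a small effective ion.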

The same calculation is then performed with the ion in the active site of SNase for each of the three states $\Psi_1$, $\Psi_2$ and $\Psi^{\ddagger}_{2\to3}$. Hence, with the previously calculated energy profile for Ca2+, we can directly obtain the difference in $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ from the free energy changes of the three states as a function of the metal ion parameters.
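The bookkeeping of eqn. (5.19) can be sketched as a small function. The last term inside the minimum is read here as ΔG(1→2) for Ca2+, about 1 kcal/mol, as the accompanying "where" clause indicates; the input numbers in the usage lines are hypothetical and only show how the minimum selects the relevant ground state.

```python
def activation_barrier_change(ddg1, ddg2, ddg_ts, dg12_ca=1.0):
    """Change in the overall barrier (kcal/mol) for Ca2+ -> Me2+, eqn (5.19).

    ddg1, ddg2 : free energy changes of states 1 and 2 on ion substitution
    ddg_ts     : free energy change of the rate-limiting transition state
    dg12_ca    : free energy of state 2 relative to state 1 with Ca2+ bound
    """
    lowest_ground_state = min(ddg1, ddg2 + dg12_ca)
    return ddg_ts - lowest_ground_state


if __name__ == "__main__":
    # Reference ion: nothing changes, so the barrier change is zero.
    print(activation_barrier_change(0.0, 0.0, 0.0))
    # Hypothetical smaller ion: state 2 is stabilised more than the transition state.
    print(activation_barrier_change(-40.0, -50.0, -44.0))
    # Hypothetical larger ion: all three states shift by similar amounts.
    print(activation_barrier_change(30.0, 30.0, 32.0))
```

For the hypothetical smaller ion the minimum picks state 2 (shifted by the 1 kcal/mol offset), while for the larger ion it picks state 1, so the reference ground state can change with the ion.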

As mentioned above, we must start by calibrating the ion interaction parameters versus the free energy of hydration (see [20] for details). Without the calibration of these parameters in water, it would not be possible to make meaningful comparisons with experimental data. If one were to use, for example, ab initio potentials for the ion interactions without verifying that these reproduce the observed hydration energies, attempts to make quantitative comparisons with kinetic data would involve significant uncertainties.

The calibration procedure is quite straightforward for the alkaline earth metals, and these ions can be reasonably well modelled by simple charged Lennard-Jones spheres. That is, the non-bonded parameters can be adjusted so that the solvation energy and the first peak of the radial distribution function (RDF) in water coincide with experimental values. For transition metals, however, the situation becomes more complicated and we return to this issue below.
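The free energy bookkeeping behind such a calibration can be sketched with the generic exponential-averaging (Zwanzig) estimator used in free energy perturbation. This is a textbook formulation under stated assumptions, not the authors' code: the energy gaps would come from MD sampling, the temperature is assumed to be near 298 K, and all names are illustrative.

```python
import math

KT = 0.593  # kcal/mol, assuming a temperature near 298 K


def window_free_energy(energy_gaps, kt=KT):
    """Exponential-averaging estimate for one perturbation window.

    energy_gaps: sampled values of V(next parameters) - V(current parameters),
    in kcal/mol, collected while simulating with the current parameters.
    """
    shift = min(energy_gaps)  # subtracting the minimum keeps exp() well behaved
    mean = sum(math.exp(-(g - shift) / kt) for g in energy_gaps) / len(energy_gaps)
    return shift - kt * math.log(mean)


def total_free_energy(windows, kt=KT):
    """Sum the window estimates along the path from one ion to the other."""
    return sum(window_free_energy(w, kt) for w in windows)


def combine_directions(forward, backward):
    """Average a forward estimate with a backward one run in the reverse direction.

    Returns the combined estimate and the hysteresis between the two runs.
    """
    estimate = 0.5 * (forward - backward)
    hysteresis = abs(forward + backward)
    return estimate, hysteresis
```

Running the transformation in both directions, as described for the 40 ps calibration runs, gives a hysteresis that serves as a rough convergence check on the calibrated hydration free energy.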


5.3.2.1 Free energy relationships related to metal ion catalysis. After establishing the scale for our ion potential parameters, we may start exploring the catalytic effect of substituting the active site Ca2+ ion in SNase. As a first step, it is useful to examine the role of the metal in a simple two-state reaction model in which the state $\Psi_2$ is of lower energy than $\Psi_1$ for all divalent ions. It is then only the free energies of $\Psi_2$ and $\Psi^{\ddagger}_{2\to3}$ that determine the rate of the catalytic reaction, if we assume that ion binding is always 'downhill'. This simplified reaction model (which will later be extended to the actual reaction) allows us to present the main conceptually limiting cases in terms of simple linear free energy relationships. In other words, with the two states $\Psi_2$ and $\Psi_3$ we can examine the correlation between the activation free energy of the reaction (the free energy of $\Psi^{\ddagger}_{2\to3}$ in this model) and the energies of $\Psi_2$ and $\Psi_3$ in terms of the diagram given in Figure 5.6. As seen from the figure, the transition state of the ground-state potential surface (obtained from the mixing of $\Psi_2$ and $\Psi_3$) will be strongly correlated with the height of the intersection between the free energy functionals corresponding to the diabatic surfaces ($\Delta g_2$ and $\Delta g_3$) of $\Psi_2$ and $\Psi_3$. If the minimum of $\Delta g_2$ is pushed down relative to that of $\Delta g_3$ then $\Delta G^{\ddagger}_{2\to3}$ (which corresponds to the adiabatic ground state $E_g$) will increase, while if the minimum of $\Delta g_3$ is pushed down, it will decrease. The difference in solvation energy between different ions will, of course, result in a shift of the absolute free energies of $\Psi_2$ and $\Psi_3$ (given by the curves $\Delta g_2$ and $\Delta g_3$) if a given ion is replaced by another. Moreover, it is likely that the energies of the two states are not equally affected by the ion substitution, and this would therefore also give a relative shift between the two free energy curves, as indicated in Figure 5.6.

[Figure 5.6: diabatic free energy curves, including $\Delta g_2$ ($E_2$), plotted against the reaction coordinate.]

Figure 5.6 Demonstration of the correlation between the ground-state activation free energy ($\Delta G^{\ddagger}$) and the height of the intersection between free energy functionals of the diabatic surfaces in a two-state reaction model. If the curve $\Delta g_3$, which describes the free energy of the state $\Psi_3$, is shifted relative to $\Delta g_2$ as shown in the figure, the activation energy (i.e. the relative energy of the transition state $\Psi^{\ddagger}_{2\to3}$) will increase by approximately $\Delta\Delta G^{\ddagger}_{2\to3}[E_g(X)] \approx \Delta g_3'(X) - \Delta g_3(X) - \Delta\Delta g_3(X^{\ddagger}\to X^{\ddagger\prime})$, since $E_g$ is given by $E_g \approx \tfrac{1}{2}(E_2 + E_3) - \tfrac{1}{2}\sqrt{(E_2 - E_3)^2 + 4H_{23}^2}$ ($X$ denotes the reaction coordinate).

[Figure 5.7: each panel pairs a plot of $\Delta\Delta G$ against $1/r_{\mathrm{ion}}$ for $\Psi_2$ and $\Psi_3$ (left) with the diabatic reaction profiles for Mg2+, Ca2+ and Ba2+ along the reaction coordinate (right).]

Figure 5.7 Schematic description of how the dependence of the free energy of the two states $\Psi_2$ and $\Psi_3$ on the size of the bound metal ion can give rise to different rate 'selectivities'. In each of the four cases (a-d) the left hand diagram shows how the free energy of each state changes (relative to an arbitrary ion, which we call Ca2+) as a function of ion radius. The right hand diagrams demonstrate how the particular free energy dependences of $\Psi_2$ and $\Psi_3$ give rise to different relations between the reaction profiles for different ions (only the diabatic energy surfaces are drawn).
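The two-state picture of Figure 5.6 can be made concrete with the ground-state expression quoted in its caption. The sketch below assumes parabolic diabatic curves, a reorganisation energy of 80 kcal/mol and a coupling of 5 kcal/mol; these are illustrative choices, not values from the text.

```python
import math


def ground_state_energy(e2, e3, h23):
    """Adiabatic ground state of a two-state model with coupling h23."""
    return 0.5 * (e2 + e3) - 0.5 * math.sqrt((e2 - e3) ** 2 + 4.0 * h23 ** 2)


def activation_barrier(shift3, reorganisation=80.0, h23=5.0, n=2001):
    """Barrier from the reactant well to the top of the ground-state profile.

    The diabatic curves are assumed parabolic in a coordinate x in [0, 1]:
    state 2 has its minimum at x = 0, state 3 at x = 1, and `shift3` moves the
    minimum of state 3 up (positive) or down (negative) relative to state 2.
    """
    xs = [i / (n - 1) for i in range(n)]
    profile = [
        ground_state_energy(reorganisation * x * x,
                            reorganisation * (x - 1.0) ** 2 + shift3,
                            h23)
        for x in xs
    ]
    reactant = min(profile[: n // 2])  # well on the state-2 side
    return max(profile) - reactant     # top of the profile is the transition state


if __name__ == "__main__":
    for shift in (10.0, 0.0, -10.0):
        print(f"shift of state 3 = {shift:+5.1f}  barrier = {activation_barrier(shift):.1f}")
```

Raising state 3 relative to state 2 (equivalent to pushing the minimum of state 2 down) increases the barrier, and lowering state 3 decreases it, which is the correlation the figure is meant to convey.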


We would thus expect a few basic types of 'selectivity' patterns for the rate constant (Figure 5.7(a-d)) as a consequence of the different sensitivities of the two states to the ion properties. The changes in 'solvation' free energy of the states $\Psi_2$ and $\Psi_3$ as a function of metal ion are plotted in Figure 5.7 relative to an arbitrary ion. In order to make the discussion less abstract, we call this reference ion Ca2+, but this choice is completely arbitrary and does not involve any assumptions concerning the nature of the two states; it simply serves to define a scale for the ion size. The four cases depicted correspond to different relations between the curvatures of $\Delta\Delta G_2(1/r_{\mathrm{ion}})$ and $\Delta\Delta G_3(1/r_{\mathrm{ion}})$. In Figure 5.7(a), $\Psi_3$ is less sensitive to the ion size than $\Psi_2$ over the entire range of $r_{\mathrm{ion}}$. Hence, the larger the ion the higher the rate constant will be, k(Ba2+) > k(Ca2+) > k(Mg2+). If, on the other hand, $\Psi_2$ is less sensitive to the ion radius, we will obtain the opposite ordering between the rates, k(Ba2+) < k(Ca2+) < k(Mg2+) (Figure 5.7(b)). As a third case, one can imagine the possibility that $\Psi_2$ is more sensitive to larger ions while $\Psi_3$ is more sensitive to smaller ions. This case is depicted in Figure 5.7(c) and would lead to a maximum of the activation barrier for the intermediate ion, k(Ba2+) > k(Ca2+) < k(Mg2+). The only case which could give a minimum barrier for the intermediate ion is shown in Figure 5.7(d), in which the sensitivities of the states in Figure 5.7(c) have been reversed. Here, the ordering between the rate constants would be k(Ba2+) < k(Ca2+) > k(Mg2+) and the enzyme could thus be said to be optimised for the intermediate ion (for a related discussion, in the context of selectivity in ion channels, see [21]). Although not included in our examples above, one can also imagine 'anomalous' selectivity patterns which would result if one of the $\Delta\Delta G$ curves has an inflection point (i.e. a change of sign of the curvature).
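The four selectivity patterns can be reproduced with a toy linear free energy model. The sketch assumes piecewise-linear sensitivities of each state to the ion size (in arbitrary units) and a transition state whose free energy shift is a weighted mix of the shifts of the two states; the weight of 0.4 and all sensitivities are illustrative assumptions.

```python
def barrier_shift(ddg2, ddg3, weight3=0.4):
    """Barrier change when the transition state is a weighted mix of both states."""
    ddg_ts = (1.0 - weight3) * ddg2 + weight3 * ddg3
    return ddg_ts - ddg2  # equals weight3 * (ddg3 - ddg2)


def state_shift(x, sens_large, sens_small):
    """Free energy shift of one state; x = 1/r_ion - 1/r_Ca in arbitrary units.

    Smaller ions (x > 0) stabilise the state, larger ions (x < 0) destabilise it.
    """
    sensitivity = sens_small if x > 0 else sens_large
    return -sensitivity * x


def rate_pattern(sens2, sens3):
    """sens2 and sens3 are (large-ion, small-ion) sensitivities of states 2 and 3."""
    shifts = {}
    for ion, x in (("Ba", -1.0), ("Mg", 1.0)):
        shifts[ion] = barrier_shift(state_shift(x, *sens2), state_shift(x, *sens3))
    left = ">" if shifts["Ba"] < 0 else "<"   # lower barrier means faster than Ca
    right = ">" if shifts["Mg"] > 0 else "<"  # higher barrier means slower than Ca
    return f"k(Ba) {left} k(Ca) {right} k(Mg)"


if __name__ == "__main__":
    cases = {
        "a": ((2.0, 2.0), (1.0, 1.0)),  # state 3 less sensitive everywhere
        "b": ((1.0, 1.0), (2.0, 2.0)),  # state 2 less sensitive everywhere
        "c": ((2.0, 1.0), (1.0, 2.0)),  # 2 tracks large ions, 3 tracks small ions
        "d": ((1.0, 2.0), (2.0, 1.0)),  # sensitivities of case c reversed
    }
    for label, (sens2, sens3) in cases.items():
        print(label, rate_pattern(sens2, sens3))
```

In this toy model only case (d) places the fastest rate at the reference ion, in line with the statement that it is the only case giving a minimum barrier for the intermediate ion.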

In Figure 5.8, the results of actual FEP/MD calculations for the states $\Psi_2$ and $\Psi^{\ddagger}_{2\to3}$ along the reaction coordinate are shown. Instead of plotting the free energy of $\Psi_3$, which does not directly give us the change in activation barrier, the free energy of the transition state $\Psi^{\ddagger}_{2\to3}$ is depicted in Figure 5.8. However, this does not change the general picture outlined above, since $\Psi^{\ddagger}_{2\to3}$ is given by $0.6\,\Psi_2 + 0.4\,\Psi_3$ (cf. Figure 5.4). The quantity $\Delta\Delta G_{\mathrm{sol}}(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$ is plotted versus the ion repulsive non-bonded parameter, which effectively determines the radius of the ion. (Although the RDF for each set of non-bonded parameters can be calculated, we have chosen to represent the ion size by $A_{\mathrm{ion}}^{-1/12}$ since these are explicit parameters in the FEP/MD simulations. The value of $A_{\mathrm{ion}}$ is approximately related to the ion size by $A_{\mathrm{ion}}^{1/12} \approx k\,[r_{\mathrm{ion}} + r_{\mathrm{water}}]$.) It can be seen that the relationship between $\Delta\Delta G_2$ and $\Delta\Delta G^{\ddagger}_{2\to3}$ most resembles the last of the four cases considered above. The influence of the state $\Psi_1$ (not shown) on $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ is rather small since $\Delta\Delta G_1 \approx \Delta\Delta G_2$ for ions larger than Ca2+ and $|\Delta\Delta G_1| < |\Delta\Delta G_2|$ for the smaller ions.

[Figure 5.8: plot of $\Delta\Delta G_{\mathrm{sol}}$ (vertical axis, −200 to 100) against $A_{\mathrm{ion}}^{-1/12}$ (horizontal axis, 0.5 to 0.9), with the positions of Ba, Ca and Mg marked.]

Figure 5.8 Calculated free energy changes (relative to the case with Ca2+ bound) of $\Psi_2$ and the rate-limiting transition state $\Psi^{\ddagger}_{2\to3}$, drawn with different symbols, as a function of the repulsive non-bonded parameter.
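As a quick check on the size coordinate, the repulsive parameter can be converted to the inverse-twelfth-root form used on the horizontal axis. The snippet uses only the two endpoint values of the repulsive parameter quoted for the calibration run (2340.0 and 5.0); the function name is illustrative.

```python
def size_coordinate(a_ion):
    """Return A_ion**(-1/12), the ion-size coordinate used on the plot axis."""
    return a_ion ** (-1.0 / 12.0)


if __name__ == "__main__":
    for a_ion in (2340.0, 5.0):
        print(f"A_ion = {a_ion:7.1f}  ->  A_ion^(-1/12) = {size_coordinate(a_ion):.3f}")
```

The two endpoints map to roughly 0.52 and 0.87, which is consistent with the 0.5 to 0.9 range of the horizontal axis in Figures 5.8 and 5.9; larger values correspond to smaller ions.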

The origin of the dependencies of $\Delta\Delta G_2$ and $\Delta\Delta G^{\ddagger}_{2\to3}$ on $r_{\mathrm{ion}}$ can be rationalised in the following way. When the smaller metals are bound to the enzyme, the free energy of $\Psi_2$ will be lowered considerably more than that of the transition state (as well as $\Psi_3$) since in the former the OH− ion is free to interact with or ligate the metal, while it is becoming partially bound to the 5'-P atom at the transition state with accompanying charge delocalisation. On the other hand, when the metal ion becomes too large, it has less ability to perform its other major catalytic role (besides stabilising the hydroxide ion in the first reaction step), namely, solvating the developing double negative charge on the phosphate group. That is, for the larger ions the state $\Psi^{\ddagger}_{2\to3}$ would be more sensitive to the ion size than $\Psi_2$ because of the less efficient solvation of the phosphate group.

After calculating the ion size dependence of $\Delta\Delta G$, we can evaluate the effect of metal substitution on the catalytic rate. By inserting the calculated quantities $\Delta\Delta G_1(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$, $\Delta\Delta G_2(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$ and $\Delta\Delta G^{\ddagger}_{2\to3}(\mathrm{Ca^{2+}}\to\mathrm{Me^{2+}})$ in eqn. (5.19) we obtain the overall change in activation energy (relative to Ca2+) as a function of the ion size. The resulting free energy diagram is presented in Figure 5.9, where the locations of Sr2+, Ba2+, Ca2+ and Mg2+ have been indicated on the curve. Two main conclusions can be drawn from the dependence of $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ on the ion radius. First, there is a clear minimum in the neighbourhood of Ca2+, which suggests that the enzyme has been optimised to work exactly with bound calcium. Second, it can be noted that the calculated effect on the catalytic rate is more pronounced when a smaller ion, such as Mg2+, replaces Ca2+ than is the case for the larger Sr2+ and Ba2+ ions. This appears mainly to be due to the fact that $\Delta\Delta G_2$ departs from the corresponding free energy change of the two other states for smaller ions, while the free energies of all three states show a more commensurable behaviour for larger ions. The general trend in $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ appears to be in agreement with experimental observations when available (see discussion in [20]).

5.3.2.2 An empirical model for transition metals. In addition to the alkaline earth metal ions, which lack d-orbital valence electrons, it is important to try to extend the applicability of our model to include transition metals. Unfortunately, the hydration energies of the transition metal ions cannot be well modelled by a simple Lennard-Jones sphere with a charge in the centre; in order to reproduce the observed hydration energy, the ion radius must be made too small to be compatible with the observed one. Manifestations of ligand field effects lead to a more complicated pattern of solvation energies for the transition metal ions (the differences in hydration energy among these ions can be directly related to the crystal field stabilisation energies). All of the divalent transition metals have lower (more negative) hydration energies than an alkaline earth ion of the same radius would have (we say would, since the latter type of ion only comes in a limited number of versions and one must interpolate between these). It is thus clear that in order to model these ions in microscopic simulations, one must resort to more sophisticated models than simple charged spheres. We have attempted in a preliminary way to model the Mn2+ ion in order to obtain a comparison with the experiments of [17b].

Our model for Mn2+ (which has some theoretical justification, as will be argued below) corresponds to adding six fractional positive charges, $+\delta$, in octahedral geometry at a distance $r_b$ from the centre of the ion and a compensating charge of $-6\delta$ at the centre, thus retaining a net charge of +2. The ion then interacts with one Lennard-Jones centre (at the centre of the ion) and seven electrostatic centres; the geometry of the six peripheral fractional charges is rigid, but overall rotation of the six-centre frame about the nucleus is allowed and no internal forces are associated with such rotations. The model parameters used here for Mn2+ are: $(A_I, B_I) = (145.0, 25.0)$, $\delta = 0.35$, $r_b = 0.9$ Å. These parameters reproduce the observed hydration energy and simultaneously give a radial distribution function in accordance with the observed ionic radius. Apparently this empirical model retains some of the basic characteristics of the Mn2+ ion in solution (and also in the SNase active site environment [22]). That is, the model simulates the preference of transition metals (relative to simple spherical ions) for an octahedral geometry by its specific charge distribution. Furthermore, this type of charge distribution leads to an increase of the (absolute value of the) calculated hydration energy compared to an alkaline earth ion with the same effective radius. On the basis of ligand field theory, one would expect the electron density to be partly shifted to the d-orbitals along the bisectors between the ligands, so that the screening of the nuclear charge becomes smaller in the ligand directions than in the case of a 'spherical' ion. This situation is described by our model in a rather crude fashion (one could perhaps achieve an improvement by representing all 18 bisector and ligand directions with explicit charges, which would be determined according to ligand field theory prescriptions, but for the present purpose the 7-centre model seems satisfactory).
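The charge layout of the 7-centre model can be sketched as below. The reading that the central site carries +2 − 6δ (so that the seven charges sum to +2) follows from the description above; the probe distance of 2.1 Å and the function names are illustrative assumptions.

```python
import math


def seven_centre_ion(delta=0.35, r_b=0.9, net_charge=2.0):
    """Return [(position, charge)] for the centre plus six octahedral sites."""
    axes = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    # The centre carries the net charge minus the six peripheral charges.
    sites = [((0.0, 0.0, 0.0), net_charge - 6.0 * delta)]
    sites += [((r_b * x, r_b * y, r_b * z), delta) for x, y, z in axes]
    return sites


def potential(sites, point):
    """Electrostatic potential at `point` in e/angstrom.

    Multiplying by 332 gives kcal/mol per unit probe charge.
    """
    return sum(q / math.dist(pos, point) for pos, q in sites)


if __name__ == "__main__":
    sites = seven_centre_ion()
    d = 2.1  # assumed ion-ligand distance in angstroms (illustrative)
    s = d / math.sqrt(2.0)
    print("net charge        :", round(sum(q for _, q in sites), 6))
    print("along a ligand    :", potential(sites, (d, 0.0, 0.0)))
    print("along a bisector  :", potential(sites, (s, s, 0.0)))
    print("point charge of +2:", 2.0 / d)
```

At the same distance the potential is stronger along the octahedral (ligand) directions than for a bare +2 point charge and weaker along the bisectors, which is the qualitative preference for octahedral coordination that the model is meant to capture.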

The calibration of the Mn2+ parameters in solution was done by gradually perturbing a Ca2+ ion into Mn2+ and calculating the change in hydration free energy associated with the transformation. The experimental value for $\Delta G_{\mathrm{sol}}(\mathrm{Mn^{2+}}) - \Delta G_{\mathrm{sol}}(\mathrm{Ca^{2+}})$ is −57 kcal/mol (see, e.g. [23]), while the corresponding calculated number in water for our model is −59 kcal/mol. These calibrated parameters were then used to calculate the effect of a corresponding ion transformation, for each of the three relevant states along the reaction coordinate, in the protein environment. The calculated effect of a Mn2+ ion on the overall activation barrier of the enzyme is shown in Figure 5.9. The value of $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ for Mn2+ is plotted above that of the corresponding spherical ion which has the same radius as Mn2+ (i.e. the value of the repulsive Lennard-Jones parameter refers to the spherical ion and not to Mn2+).

[Figure 5.9: plot of $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ (vertical axis, −5 to 20) against $A_{\mathrm{ion}}^{-1/12}$ (horizontal axis, 0.5 to 0.9).]

Figure 5.9 Calculated effect of metal ion substitutions on the overall activation barrier relative to the case with Ca2+ bound. Each point is calculated according to eqn. (5.19) from the quantities $\Delta\Delta G_1$, $\Delta\Delta G_2$, $\Delta\Delta G^{\ddagger}_{2\to3}$ and $\Delta G_{1\to2}(\mathrm{Ca^{2+}})$. The value of $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ for Mn2+ is plotted above that of a 'spherical' ion with the same radius (as given by the RDF), so that the value of $A_{\mathrm{ion}}$ corresponds to that of the latter ion and not to the actual value used for Mn2+. The observed values for Sr2+ and Ca2+ are denoted by circles; experimentally estimated limits for Ba2+, Mg2+ and Mn2+ are also marked.

As discussed above, the main reason for the relatively sharp increase in activation energy as Ca2+ is transformed to smaller ions (cf. Figure 5.9) is that the free energy of $\Psi_2$ is lowered more than that of the transition state. The effect is much more pronounced for Mn2+ (as represented by our model) than for a corresponding hypothetical alkaline earth-like ion of the same radius. Furthermore, if one considers a spherical ion with the same calculated hydration energy as that of Mn2+, the latter gives a value of $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}$ which is about 2 kcal/mol higher. The calculated change in the overall activation barrier, $\Delta\Delta G^{\ddagger}_{\mathrm{cat}}(\mathrm{Ca^{2+}}\to\mathrm{Mn^{2+}})$, is about 6 (±1) kcal/mol. This value agrees fairly well with the experimental observation [17a, b] that Mn2+ is at least a factor of 36000 less efficient as an activator than Ca2+.

5.3.2.3 A correlation between metal and general base strength? The finding that SNase appears to have its turnover optimum for the ion which it in fact uses in nature may, of course, be considered not terribly surprising. However, the free energy relationships leading to a rate optimisation are quite interesting and point towards more general features that may also pertain to other metallo-enzymes, with similar as well as quite different catalytic reactions. In [20], a number of different metallo-enzymes are discussed, whose sensitivity to ion species could be interpreted in terms similar to those presented here for SNase. The apparent optimisation of the SNase activation barrier for Ca2+ suggests that the electrophilicity of the metal should be correlated with the strength of the base in the proton transfer step. If we consider this type of proton transfer reaction in solution under the influence of a base as the proton acceptor and a metal ion assisting as a catalyst, we can write

$$
\mathrm{R{-}OH + B \;\overset{M}{\rightleftharpoons}\; R{-}O^{-} + BH^{+}} \tag{5.20}
$$

where B is a base, which can be either a water molecule or a stronger base, and M denotes a metal ion if present, otherwise simply a water molecule. The energetics of eqn. (5.20) (in solution) can be described by Figure 5.10(a), which shows the influence of some prototypes for B and M on the reaction free energy. The approximate numerical values in Figure 5.10(a) are calculated from observed pK shifts in solution. If we think of Figure 5.10(a) as defining a sort of free energy surface for the solution reaction shown in eqn. (5.3), it is interesting to examine to what extent this picture is reflected by enzymatic reactions of the same type. In Figure 5.10(b), a number of enzymes with well-characterised reaction mechanisms are 'plotted' according to their catalytic machinery for handling eqn. (5.20). Although it is clear that the actual free energy values of Figure 5.10(a) cannot apply strictly to Figure 5.10(b), e.g. because of different dielectric properties of the environments, it is suggestive that the 'high energy' region appears to be avoided in Figure 5.10(b). It would thus appear that enzymes have not managed to handle eqn. (5.20) completely 'at will', but must resort to particular energetically reasonable combinations of metal and/or base. The types of general base catalysts represented by a carboxylate ion and an imidazole ring in Figure 5.10 can also be tentatively viewed as two different types of charge redistribution associated with the proton transfer reaction (eqn. (5.20)). That is, when a carboxylate ion acts as a general base, one unit of negative charge is simply 'moved' from the base to the nucleophile ([− 0] → [0 −]). On the other hand, with an imidazole ring as the proton acceptor, an ion pair is created by the reaction ([0 0] → [+ −]). In the latter case, the ion pair is often 'pre-stabilised' by the charge distribution of the active site, e.g. by the serine proteases' Asp-His pair. It would appear that such a pre-stabilisation of the resulting configuration (relative to the reactant state) is easier to achieve in the [0 0] → [+ −] case, since the character of the charge distribution changes more drastically. With this simple type of charge redistribution picture in mind, one could perhaps attempt to examine more distantly related enzymatic proton transfer mechanisms (for instance, the proton transfer step in lysozyme would correspond to the [0 0] → [+ −] case with pre-stabilisation by Asp 52).

[Figure 5.10: panels (a) and (b) share the axes B (base) and M ('metal'), with H2O marked on each axis. Enzymes labelled in panel (b) include carbonic anhydrase, alcohol dehydrogenase, alkaline phosphatase, thermolysin, carboxypeptidase A, DNase I, phospholipase, lactate dehydrogenase, serine proteases and RNase.]

Figure 5.10 (a) Diagram showing the effects of metal ion and general base catalysts on the reaction of eqn. (5.20). The abscissa denotes increasing general base strength, represented by a water molecule, a carboxylate ion and an imidazole ring. The ordinate represents increasing metal ion electrophilicity, where a water molecule denotes the case where no metal is present. The energy values, $\Delta G_{ij}$, correspond to the proton transfer reaction $(\mathrm{M}_i^{2+})\mathrm{H_2O} + \mathrm{B}_j \rightleftharpoons (\mathrm{M}_i^{2+})\mathrm{OH^-} + \mathrm{B}_j\mathrm{H}^{+}$, where each entry is obtained from $\Delta G_{ij} = -2.3RT(\mathrm{p}K_j - \mathrm{p}K_i)$. For example, in the case of a metallo-enzyme using Ca2+ and glutamate, respectively, as M and B, $i$ denotes Ca2+(H2O) and $j$ denotes (Glu)-COOH (the pKa values for metal-bound water are taken from [24]). (b) A number of different enzymes that catalyse reactions involving a proton transfer step like eqn. (5.20), plotted according to their use of metal ion and general base catalysis (SNase, DNase I and RNase denote staphylococcal nuclease, deoxyribonuclease I and ribonucleases (e.g. A and T1), respectively).
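The relation given in the caption of Figure 5.10 for the proton transfer free energy, ΔG = −2.3RT(pK of the protonated base − pK of the proton donor), can be evaluated with a short sketch. The pKa values below are approximate textbook numbers chosen for illustration and a temperature of 298.15 K is assumed; they are not the values used to construct the figure.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def proton_transfer_free_energy(pk_donor, pk_acceptor, temperature=298.15):
    """Free energy (kcal/mol) of moving a proton from donor to acceptor.

    pk_donor    : pKa of the (metal-bound) water acting as the acid
    pk_acceptor : pKa of the protonated general base
    """
    return -math.log(10.0) * R_KCAL * temperature * (pk_acceptor - pk_donor)


# Assumed, approximate pKa values (illustrative only).
DONORS = {"H2O (no metal)": 15.7, "Ca2+-bound H2O": 12.9}
ACCEPTORS = {"carboxylate": 4.0, "imidazole": 7.0}

if __name__ == "__main__":
    for donor, pk_i in DONORS.items():
        for base, pk_j in ACCEPTORS.items():
            dg = proton_transfer_free_energy(pk_i, pk_j)
            print(f"{donor:16s} + {base:11s}: {dg:5.1f} kcal/mol")
```

A more electrophilic metal lowers the pKa of the bound water and a stronger base raises the pKa of its conjugate acid; both changes reduce the free energy cost, which is the trade-off the diagram is meant to map.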

The discussion in the preceding paragraph should be regarded as a some­what speculative attempt to rationalise the relations between the roles of metal and general base catalysis in enzymes as far as only proton transfer is concerned. In reality there are, of course, several additional factors that determine the efficiency of a particular catalytic configuration, notably the nature of the rate-limiting transition state (as our calculations on SNase show) and the requirement for proper binding of substrates. It should also be realised that the active site microenvironment, as a whole, determines the catalytic properties of a given enzyme rather than just one or two catalytic groups. Future calculations on the enzymatic reactions discussed above can hopefully reveal some of the rationale behind the particular choices of metal ions, but it seems likely that the general free energy relationships leading to optimisation for Ca2+ in SNase will also be relevant in other cases.

5.4 Concluding remarks

In this chapter, we have reviewed the basic elements of the empirical valence bond approach for simulating chemical reactions in enzymes and in solutions. The alternative molecular orbital treatment has also been outlined and the differences between the two approaches discussed. As far as calculations of free energy profiles in enzymes are concerned, we conclude that the former method is far more convenient and accurate since it allows for the incorporation of experimental information about the relevant energy surfaces, e.g. in aqueous solution. This point deserves to be emphasised in view of the common belief that only ab initio quantum calculations (as opposed to those based on some degree of empirical parametrisation) can provide accurate answers to chemical questions (for a related discussion, see [24]); this is particularly untrue for reactions in liquid phases and in proteins. Like semi-empirical MO schemes, the EVB method is semi-empirical, but it is parametrised on information that is more relevant as far as bond breaking/forming processes in condensed phases are concerned.

We feel that our study of the phosphodiester hydrolysis reaction catalysed by staphylococcal nuclease and related studies (e.g. [9b, c]) give reason for optimism concerning the prospects of being able to examine enzymic reactions in detail (and with some accuracy) by computer simulation approaches. The calculated free energy profile for the SNase reaction, although not yet within the 1 kcal/mol accuracy one would desire, does in fact provide a semi-quantitative picture of the entire catalytic effect of SNase. The roles of the main catalytic factors in this case, namely the general base (Glu 43), the calcium ion and the two active site arginines (35 and 87), are clearly elucidated by the EVB simulations, and the overall agreement with available experimental data is quite satisfactory.

The study of SNase that has been presented here as a specific example of the EVB simulation approach indicates that the electrostatic interaction between the metal ion and the reacting system is the key to the reduction of the activation barrier relative to the corresponding reference reaction in a solvent cage (in which the effect of the general base is already included). This conclusion is in accord with previous simulations of other enzymatic reactions [9], in which electrostatic complementarity between the enzyme's active site and the change in charge distribution during the reaction was identified as the most important factor for catalysis. Regardless of whether this is a completely general finding or whether entropic factors are also important, it appears that free energy calculations focusing on electrostatic effects provide the simplest way to correlate enzyme structure and activity [4b, 25].

Finally, it is interesting to note that the free energy relationships elicited in this work might have quite general implications for other enzyme reactions. In fact, the validity of such relationships in enzymes and solutions can be examined by computer simulation methods as has been illustrated in several preliminary studies from this laboratory [9, 12b]. It appears that polar sites in enzymes obey to some extent the linear response approximation (the system polarisation is proportional to the applied local field) and therefore follow linear free energy relations.

Acknowledgements

J. Aqvist gratefully acknowledges support from the Swedish Natural Science Research Council. Support from the Office of Naval Research (Grant N00014-91-3-1318) is also acknowledged.

References

1. L. Pauling (1948) Nature (London) 161, 707-709.
2. D.C. Phillips (1966) Sci. Am. 215 (5), 78-90.
3. (a) M.I. Page and W.P. Jencks (1971) Proc. Natl. Acad. Sci. USA 68, 1678-1683; (b) D.R. Storm and D.E. Koshland, Jr. (1970) Proc. Natl. Acad. Sci. USA 66, 445-452; (c) T.C. Bruice (1976) Annu. Rev. Biochem. 45, 331-373; (d) F.M. Menger (1985) Acc. Chem. Res. 18, 128-134; (e) A.J. Kirby (1980) Adv. Phys. Org. Chem. 17, 183.
4. (a) A. Warshel (1978) Proc. Natl. Acad. Sci. USA 75, 5250-5254; (b) A. Warshel (1981) Acc. Chem. Res. 14, 284-290.
5. (a) S.G. Cohen, V.M. Vaidya and R.M. Schultz (1970) Proc. Natl. Acad. Sci. USA 66, 249-256; (b) R. Wolfenden (1983) Science 222, 1087-1093; (c) M.J.S. Dewar and D.M. Storch (1985) Proc. Natl. Acad. Sci. USA 82, 2225-2229.
6. (a) A. Warshel and R.M. Weiss (1980) J. Am. Chem. Soc. 102, 6218-6226; (b) A. Warshel and S.T. Russell (1984) Q. Rev. Biophys. 17, 283-422.
7. A. Warshel and M. Levitt (1976) J. Mol. Biol. 103, 227-249.
8. O. Tapia and G. Johannin (1981) J. Chem. Phys. 75, 3624-3635.
9. (a) J.-K. Hwang, G. King, S. Creighton and A. Warshel (1988) J. Am. Chem. Soc. 110, 5297-5311; (b) A. Warshel, F. Sussman and J.-K. Hwang (1988) J. Mol. Biol. 201, 139-159; (c) A. Warshel, S.T. Russell and F. Sussman (1986) Isr. J. Chem. 27, 217-224.
10. (a) G.S. Hammond (1955) J. Am. Chem. Soc. 77, 334-338; (b) J.W. Albery and M.M. Kreevoy (1978) Adv. Phys. Org. Chem. 16, 87-157.
11. J.P. Valleau and G.M. Torrie (1977) in Modern Theoretical Chemistry, ed. B. Berne, Vol. 5, Plenum, New York, pp. 169-194.
12. (a) A. Warshel (1982) J. Phys. Chem. 86, 2218-2224; (b) A. Warshel (1984) Pontif. Acad. Sci. Ser. Varia 55, 59-81.
13. P.W. Tucker, E.E. Hazen Jr. and F.A. Cotton (1978) Mol. Cell. Biochem. 22, 67-77.
14. P. Cuatrecasas, S. Fuchs and C.B. Anfinsen (1967) J. Biol. Chem. 242, 1541-1547.
15. (a) P.W. Tucker, E.E. Hazen Jr. and F.A. Cotton (1979) Mol. Cell. Biochem. 23, 67-86; (b) F.A. Cotton, E.E. Hazen Jr. and M.J. Legg (1979) Proc. Natl. Acad. Sci. USA 76, 2551-2555.
16. J.P. Guthrie (1977) J. Am. Chem. Soc. 99, 3991-4001.
17. (a) E.H. Serpersu, D. Shortle and A.S. Mildvan (1986) Biochemistry 25, 68-77; (b) E.H. Serpersu, D. Shortle and A.S. Mildvan (1987) Biochemistry 26, 1289-1300; (c) E.H. Serpersu, D.W. Hibler, J.A. Gerlt and A.S. Mildvan (1989) Biochemistry 28, 1539-1548.
18. J. Aqvist and A. Warshel (1989) Biochemistry 28, 4680-4689.
19. F.A. Cotton, C.J. Bier, V.W. Day, E.E. Hazen Jr. and S. Larsen (1971) Cold Spring Harbor Symp. Quant. Biol. 36, 243-249.
20. J. Aqvist and A. Warshel (1990) J. Am. Chem. Soc. 112, 2860.
21. G. Eisenman and R. Horn (1983) J. Membr. Biol. 76, 197-225.
22. E.H. Serpersu, J. McCracken, J. Peisach and A.S. Mildvan (1988) Biochemistry 27, 8034-8044.
23. M.A. Burgess (1978) Metal Ions in Solution, Ellis Horwood, Chichester, UK.
24. M.J.S. Dewar (1985) J. Phys. Chem. 89, 2145-2150.
25. (a) A. Warshel and J. Aqvist (1989) Chem. Scripta 29A, 75; (b) A. Warshel and J. Aqvist (1991) Ann. Rev. Biophys. Biophys. Chem. 20, 267.


6 Drug discovery J. SAUNDERS

6.1 Introduction

For over a decade now, the major players in the pharmaceutical league have had at their disposal an array of modern tools to assist in the discovery of new drugs; ample time, one would think, for the products of such research to move through the various stages of clinical evaluation and on to the market. These tools include computerised QSAR analyses, high resolution NMR spectroscopy, vastly improved X-ray crystallographic methods, molecular modelling and the power of molecular biology. In addition, an often undervalued and sometimes derided tool is the ability to search the 'chemical universe' systematically and screen representative molecules from each sector against new biological assays. In order to judge the impact of these modern techniques, it is interesting to compare the discoveries of the last 25 years with those recently emerged from the secret vaults of the top-ranked companies. This is the focus of this chapter.

Twenty of the most successful drugs in 1992 are listed in Table 6.1 [ll Clearly many have been the result of innovative pharmacological research with lead generation and optimisation resulting from either serendipity and/or from traditional medicinal chemical ploys. With possibly two exceptions, ACE inhibitors and clavulanic acid, this process relied only upon a knowledge of the active entity ('the hand'), there being little or no structural information available concerning the drug target ('the glove'). It is the concept of using information from both sources, that is knowing the shape, style ('style' may be likened to the fluctuation of electrostatic potential over the surface of the molecule) and size of the glove as well as the properties of the hand, that is creating excitement within the research base. It is on these terms that drug design should now be assessed, although the formidable list of potential new drugs would suggest that such information, whilst desirable, is certainly not essential. Serendipity still has a major role to playas will be illustrated below; the successful companies will establish the correct balance between the two extreme approaches.

A vailable targets for drugs can be divided superficially into three major categories (Figure 6.1). It is impractical to give comprehensive coverage in a chapter of this size; representative examples have been chosen which have made use of the tools alluded to above to weigh up their relevance alongside


Table 6.1 Top drugs, 1992

Drug               Company              Activity (utility)                                             Year discovered  Comments
Ranitidine         Glaxo/Sankyo         H2 antagonist (anti-ulcer)                                     1978             2nd generation
Captopril          B-M Squibb           ACE inhibitor (antihypertensive)                               1977             Breakthrough drug
Enalapril          Merck                ACE inhibitor (antihypertensive)                               1980             2nd generation
Nifedipine         Bayer/Takeda/Pfizer  Calcium channel blocker (anti-angina; anti-hypertensive)       1968             Novel mode of action
Atenolol           ICI                  β-Blocker (antiangina; antihypertensive)                       1970             Line extension from propranolol, itself a breakthrough drug
Cimetidine         SB                   H2 antagonist (anti-ulcer)                                     1974             Breakthrough drug
Diclofenac sodium  Ciba-Geigy           Cyclooxygenase inhibitor (anti-inflammatory)                   1966             Line extension
Cefaclor           Eli Lilly            Cephalosporin (antibacterial)                                  1974             2nd generation cephalosporin
Ciprofloxacin      Bayer                DNA gyrase inhibitor (antibacterial)                           1983             Breakthrough drug
Albuterol          Glaxo                β2-agonist (anti-asthmatic)                                    1968             Breakthrough drug
Fluoxetine         Eli Lilly            5HT-uptake inhibitor (antidepressant)                          1975
Lovastatin         Merck                HMG-CoA inhibitor (anti-hypercholesterolemic)                  1980             Breakthrough drug
Diltiazem          MMD                  Calcium channel blocker (antianginal)                          1969             Novel structural class
Clavulanic acid    SB                   β-Lactamase inhibitor (combination with β-lactam antibiotics)  1975             Breakthrough drug
Acyclovir          Wellcome             DNA polymerase inhibitor (anti-herpetic)                       1974             Breakthrough drug
Naproxen           Syntex               Prostaglandin biosynthesis inhibitor (anti-inflammatory)       1968             Second generation
Ceftriaxone        Roche                β-Lactam antibiotic (antibacterial)                            1979             Third generation cephalosporin
Piroxicam          Pfizer               Non-steroidal anti-inflammatory                                1970
Terfenadine        MMD                  H1-receptor antagonist (antihistamine)                         1973
Omeprazole         Astra                Proton pump inhibitor (anti-ulcer)                             1986             Breakthrough drug


Figure 6.1 Categories of drug targets. [Labels legible in the figure: G-protein linked; growth factors; chemically gated ion channels.]

traditional methods. Of necessity, discussion focuses on the design of agents interacting with specific proteins since design for DNA targets is still in its infancy. It has to be acknowledged, however, that this situation could change rapidly within the next few years as the now fashionable antisense culture is explored [2]. In each of the four case histories selected, a lead drug candidate has emerged although all have yet to prove utility beyond preclinical studies. It will be illuminating to follow progress in each of these fields over the next few years.

6.2 Receptors as targets for drug design

Originating from the pioneering studies of Dale, Ahlquist, Gaddum and others, the number of pharmacologically characterised types and subtypes of receptors for each of the classical neurotransmitters has continued to grow [3]. Criteria for characterisation have traditionally relied on tissue distribution, ligand binding properties and signal transduction mechanisms. However, recent molecular biological approaches have revealed that all receptors cloned


and sequenced to date can be assigned to one of four super-families [4,5]. The largest gene family is that which interacts with guanine nucleotide regulatory proteins (G-proteins) (Figure 6.2) and is represented by such pharmacologically diverse receptors as adrenergic, muscarinic cholinergic, dopaminergic, serotonergic and various peptidergic receptors (including angiotensin-II [6] and the tachykinins [7]). Even more of a surprise is that these receptors are also homologous to the visual pigments [8], the opsins and bacteriorhodopsin, and the yeast mating factor receptors [9]. The medicinal chemist is

[Figure 6.2: panel (a), cholinergic pathways in the brain, with the contents of a box magnified in panel (b); panel (b), presynaptic fibre, synapse and postsynaptic fibre, showing the nerve impulse, acetylcholine, ACHE, the G-protein cycle, PLC, the conversion of PIP2 to IP3 and DAG, Ca2+ mobilisation and protein kinase C activation.]

Figure 6.2 (a) Cholinergic pathways in the brain. (b) Sites for enhancement of cholinergic transmission as therapies in Alzheimer's dementia. (1) Inhibitors of acetylcholinesterase (ACHE) will potentiate the effect of endogenous neurotransmitter. (2) Antagonists at the presynaptic autoreceptor, believed to be of the M2 subtype, will increase transmitter release through blockade of the negative feedback mechanism. (3) Occupancy of postsynaptic sites by muscarinic agonists will stimulate production of the active subunit (Gp(α)·GTP) of the G-protein (Gp(αβγ)·GDP) and thence, by the cascade summarised, will lead to signal transduction. A, amygdala; C, cerebral cortex; DB, nucleus of the diagonal band; Hi, hippocampus; MS, medial septum; NB, nucleus basalis of Meynert; PLC, phospholipase-C; PIP2, phosphatidyl inositol-4,5-bisphosphate; IP3, inositol triphosphate; DAG, diacylglycerol. Reproduced with permission from Emson, P.C. and Lindvall, O. (1986), British Medical Bulletin, Special Issue, Alzheimers Disease and Related Disorders, eds. M. Roth and L.L. Iverson, Figure 2, p. 59.


beginning to realise that not only can structural information derived for one receptor system be applied to another within this family but also that new ligands may be designed on the basis of what is already known for homologous receptors. Like the G-protein linked receptors, two other super-families of receptors are also extracellular: the growth factor receptors [10] for prolactin, insulin and epidermal growth factor amongst others and the chemically gated ion channels [11, 12] exemplified by the nicotinic cholinergic, γ-aminobutyric acid and glycine receptors. The cytosolic receptors for steroid and thyroid hormones represent the fourth family. We discuss two receptors within the G-protein linked family, the muscarinic acetylcholine receptor and the receptor for the peptide hormone angiotensin-II, in order to illustrate how homology information may increasingly be used. Clearly such a strategy may also be applied to the other receptor super-families.

6.2.1 Alzheimer's disease and the muscarinic receptor

Agonists for the muscarinic receptor (MR) have been much studied [13, 14] in recent years in connection with their potential role in Alzheimer's disease (AD). The so-called cholinergic hypothesis [15] for AD is based on a wide body of clinical and neurochemical evidence which indicates that the marked deficits in cognitive function that accompany AD can be most consistently related to selective degeneration of cholinergic neurones projecting from the nucleus basalis of Meynert into cortical and hippocampal regions (Figure 6.2). Two clear-cut strategies to accentuate cholinergic transmission have been evaluated extensively in the clinic: (i) inhibition of acetylcholinesterase (ACHE) to potentiate the effects of endogenous neurotransmitter; and (ii) directly acting agonists at postsynaptic MRs in the cortex. Experience with directly acting agonists such as arecoline, RS-86 and pilocarpine has yielded either disappointing or equivocal results [16]. Only arecoline showed a marginal improvement in a rather limited patient population, but its extremely short duration of action renders its widespread clinical use impractical. In contrast, inhibitors of ACHE such as tacrine act to prolong the effect of the endogenous full agonist, that is acetylcholine itself, within the synaptic cleft and this may account for their relatively more clear-cut clinical benefit in AD patients [17]. These and other results may be interpreted in terms of the low efficacy at cortical MRs of the agonists used in trials; this was the hypothesis that led to the discovery of 'super-agonists' as discussed below. Whilst there are several high efficacy MR agonists known, they all bear a quaternary ammonium group, such as that present in acetylcholine itself, and will, therefore, not penetrate into the brain when given systemically. This latter point represented the real challenge for the medicinal chemist since established dogma [18] until recently was that the quaternary ammonium head group was an obligatory feature for full agonist behaviour at the cortical site.

What exactly is meant by 'efficacy'? For the cortical muscarinic receptor, the


ability of the ligand to stimulate the hydrolysis of phosphatidyl inositol (PI) may be used as a measure of this property [19]. Note that efficacy must be distinguished from affinity, affinity being a measure of how tightly bound a particular ligand is to the receptor. Ligands with extremely high affinity may induce no response at all, that is they have zero efficacy and are thus antagonists. In the cerebral cortex, those MRs that are linked to PI turnover appear to have only limited receptor reserve and thus only highly efficacious agonists are capable of stimulating a good PI response. Until 1988 or thereabouts, it was widely accepted that the muscarinic receptor located post-synaptically in cerebral cortex was of the pirenzepine-sensitive M1 subtype but there are now at least three subtypes known, M1, M3 and M5, all positively coupled to PI hydrolysis [20]. Binding experiments and functional data recorded in cortical slices are thus an 'average' of the properties of these three closely related sites.

Cortical efficacy of a given ligand for the receptor may be conveniently predicted from radiolabelled antagonist ([3H]-N-methylscopolamine, NMS) and agonist ([3H]-oxotremorine-M, OXO-M) displacement measurements in rat cortical slices [21]. It was found that the log of the binding ratio, Kapp(NMS)/Kapp(OXO-M), correlated directly with the ability of the test ligand to stimulate cortical PI hydrolysis. Four broad categories of muscarinic ligand may be defined according to efficacy: antagonists show equal affinity in both binding assays and thus have ratios close to unity; weak partial agonists and partial agonists have intermediate ratios; whereas at the other end of this continuum, full agonists display a ratio in excess of 800-1000 through preferential binding to the agonist labelled state of the receptor (Table 6.2).

Table 6.2 Activity of novel muscarinic ligands

Compound       Binding data (Kapp, μM)      Ratio(c)  Ligand class
               [3H]NMS(a)    [3H]OXO-M(b)
Atropine       0.0010        0.00048        2.1       Antagonist
Pilocarpine    4.0           0.040          100       Weak partial agonist
Arecoline (1)  6.2           0.011          560       Partial agonist
Carbachol      24            0.0058         4100      Full agonist
(2)            1.8           0.0046         390       Partial agonist
(3)            0.00012       0.000067       1.8       Antagonist
(4)            0.44          0.00090        490       Partial agonist
(5)            0.032         <0.00004       >1000     Full agonist
(6)            1.4           0.0013         1100      Full agonist

Values were determined by displacement of tritiated ligand from rat cortical homogenates, expressed as affinity constants corrected for ligand occupancy. (a) Displacement of N-[3H]methylscopolamine. (b) Displacement of [3H]oxotremorine-M. (c) Ratio of Kapp (NMS) to Kapp (OXO-M).
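
The ratio column of Table 6.2 can be reproduced directly from the two binding constants. The short Python sketch below does this and then sorts each ligand into a class. The lower cut-off of 800 for a full agonist follows the range quoted in the text; the cut-off of 10 for an antagonist is purely an assumption made for illustration (the text says only 'close to unity'), and the function and variable names are hypothetical. Compound (5) is omitted because only an upper limit is given for its OXO-M value.

```python
import math

# Apparent affinity constants (Kapp, micromolar) transcribed from Table 6.2,
# stored as (NMS value, OXO-M value). Compound (5) is left out because its
# OXO-M value is reported only as an upper limit.
TABLE_6_2 = {
    "Atropine": (0.0010, 0.00048),
    "Pilocarpine": (4.0, 0.040),
    "Arecoline (1)": (6.2, 0.011),
    "Carbachol": (24.0, 0.0058),
    "(2)": (1.8, 0.0046),
    "(3)": (0.00012, 0.000067),
    "(4)": (0.44, 0.00090),
    "(6)": (1.4, 0.0013),
}

# Illustrative cut-offs. 800 is the lower end of the range quoted in the text
# for full agonists; 10 is an assumed stand-in for "close to unity".
ANTAGONIST_MAX = 10.0
FULL_AGONIST_MIN = 800.0


def binding_ratio(k_nms, k_oxo_m):
    """Ratio of Kapp(NMS) to Kapp(OXO-M)."""
    return k_nms / k_oxo_m


def ligand_class(ratio):
    """Assign a broad efficacy class from the binding ratio."""
    if ratio < ANTAGONIST_MAX:
        return "antagonist"
    if ratio >= FULL_AGONIST_MIN:
        return "full agonist"
    return "partial agonist"


def summarise(table=TABLE_6_2):
    """Return {name: (ratio, log10 of ratio, class)} for every ligand."""
    rows = {}
    for name, (k_nms, k_oxo_m) in table.items():
        ratio = binding_ratio(k_nms, k_oxo_m)
        rows[name] = (ratio, math.log10(ratio), ligand_class(ratio))
    return rows


if __name__ == "__main__":
    for name, (ratio, log_ratio, cls) in summarise().items():
        print(f"{name:14s} ratio={ratio:8.1f}  log10={log_ratio:5.2f}  {cls}")
```

With these assumed cut-offs the sketch does not separate 'weak partial' from 'partial' agonists, since the text gives no boundary between them.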


Many groups have chosen the partial agonist arecoline (1) as the starting point in the design of full agonists with the combined objectives of increasing efficacy and replacing the hydrolytically susceptible ester bond [22, 23]. Replacement of the methyl ester group of arecoline by methyl oxadiazole [24] afforded a compound (2) with a modest improvement in affinity but without the anticipated increase in efficacy (Table 6.2). Almost simultaneously, a series of quantum mechanical calculations was undertaken to understand more precisely the distribution of electrostatic charge on a quaternary methylammonium group. It was concluded that the spherically symmetrical distribution of charge should be best mimicked by a protonated 1-azabicyclic system such as is found in quinuclidines. In addition, it was hypothesised that the function of the ring heteroatoms in the oxadiazole was to participate in H-bonding interactions with the agonist binding site of the receptor and that factors that enhance this capability would improve ligand efficacy. This line of reasoning eventually led to the synthesis [24, 25] of a range of oxadiazoles having high affinity for the MR and which span the efficacy range from antagonist (3) through partial agonist (4) to the full agonist (5). That the aminooxadiazole (5) behaves as a full agonist was confirmed [19] by the magnitude of its PI response, which was measured at 150-170% of that seen with acetylcholine, and the effect was completely atropine sensitive. Further in vivo characterisation of these molecules showed that they readily penetrated into the CNS as witnessed by their ability to induce centrally mediated physiological effects (for example, hypothermia) at similar doses to peripheral muscarinic effects such as salivation when given systemically. Thus both criteria, high efficacy and rapid equilibration into the CNS, had been met, allowing the cholinergic hypothesis to be rigorously tested. Interestingly, the indirectly acting full agonist tacrine is shortly expected to receive approval from the regulatory authorities.

[Structures of compounds (3), (4) and (5)]


Electrostatic potential mapping using the DENPOT procedure allowed alternative heterocyclic systems to the oxadiazole to be identified. Experience with the oxadiazoles had shown that predicted efficacy correlated directly with the magnitude of the negative electrostatic potential in the vicinity of the oxadiazole ring nitrogen atoms. A similar analysis of a variety of other heteroaromatic systems [26,27] predicted that both thiadiazole (6) and a correctly substituted pyrazine ring (7) should have potential wells at least as deep as the oxadiazole and should therefore display full efficacy at the cortical MR. Both predictions were subsequently verified experimentally.

From the SAR developed during this work and the known homology of all G-protein linked receptors for monoamine neurotransmitters [28], it is possible to differentiate between ligands of varying efficacy in general terms as follows: the cationic head group is an essential element for all ligands. Full agonists utilise H-bonding interactions, usually two, whose nature (that is acceptor or donor) and topology vary depending upon the receptor type. Agonists are small hydrophilic molecules. Antagonists rely on lipophilic binding energy to stabilise binding to the receptor and are therefore usually large molecules with some region of high bulk lipophilicity. This would also imply that the design of antagonists from agonists, or vice versa, should be straightforward and, indeed, that ligands for one receptor may be used to design ligands for a second receptor. In confirmation of this latter point is the discovery [29] of potent, quinuclidine-based antagonists (8) for the neurokinin peptide receptor (NK1) bearing the same lipophilic motif as is present in muscarinic antagonists (compare (3) with (8)). Additionally, a selective angiotensin receptor antagonist (9) bearing the diphenyl motif has also been reported [30], thereby providing a convenient link with the next part of the story.

[Structures of compounds (6), (7) and (8)]


6.2.2 Angiotensin-II antagonists in hypertension

The renin-angiotensin system (RAS) (Figure 6.3) is responsible for the production of the hormones angiotensin-II (A-II) and A-III which are essential for the maintenance of normal blood pressure through biochemical responses such as vasoconstriction (both directly and indirectly by enhancing noradrenaline release), increased sodium reabsorption in the kidney, and stimulation of aldosterone production [31]. The development of A-II blockers for the treatment of essential hypertension has been anticipated for well over a decade and such compounds are now being progressed into clinical trials. The discovery of the angiotensin converting enzyme (ACE) inhibitor, captopril, demonstrated that interruption of the RAS, ultimately preventing the generation of the potent vasoconstrictor A-II, was a highly efficacious method of blood pressure control in a wide patient population. The design process has been widely documented already [32] and is a superb example of rational drug design using mechanistic and structural information available for a homologous enzyme (carboxypeptidase-A) in the absence of such data for ACE itself. Two such compounds are now billion dollar products (Table 6.1) and have been accepted in some centres as first line therapy for hypertension.

On theoretical grounds, inhibition of the R-A cascade at either the renin or A-II level should be even more therapeutically beneficial. Because renin is the rate-limiting enzyme in the RAS, it is considered to be a better target than ACE, since inhibition of the latter may be surmountable through build-up of

Precursor (renin substrate): Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu-Val-Ile-His-Lys-...
    | 1
Angiotensin-I (A-I): Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu
    | 2
Angiotensin-II (A-II): Asp-Arg-Val-Tyr-Ile-His-Pro-Phe
    | 3
Angiotensin-III (A-III): Arg-Val-Tyr-Ile-His-Pro-Phe

Effects shown: direct vasoconstriction, noradrenaline release, aldosterone release and increased sodium reabsorption, leading to elevated blood pressure; CNS effects (?).

Figure 6.3 The renin-angiotensin system. Interventions that prevent the production of A-II will reduce elevated blood pressure by the mechanisms shown. 1, renin; 2, angiotensin converting enzyme; 3, aminopeptidases.
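
The three cleavage steps of Figure 6.3 can be followed on the printed sequences with a few lines of Python. This is only a bookkeeping sketch: the function names and the `blocked` argument are hypothetical, and each 'enzyme' is reduced to the trimming operation needed to turn one printed sequence into the next.

```python
# N-terminal sequence of the renin substrate as printed in Figure 6.3.
PRECURSOR = "Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu-Val-Ile-His-Lys".split("-")


def renin(residues):
    """Step 1: release the N-terminal decapeptide, angiotensin-I."""
    return residues[:10]


def ace(a1):
    """Step 2: remove the C-terminal dipeptide (His-Leu) to give angiotensin-II."""
    return a1[:-2]


def aminopeptidase(a2):
    """Step 3: remove the N-terminal residue (Asp) to give angiotensin-III."""
    return a2[1:]


def cascade(precursor=PRECURSOR, blocked=()):
    """Return the peptides formed when the named enzymes are not blocked.

    `blocked` may contain "renin" and/or "ACE" to mimic an inhibitor.
    """
    products = {}
    if "renin" in blocked:
        return products
    a1 = renin(precursor)
    products["A-I"] = "-".join(a1)
    if "ACE" in blocked:
        return products
    a2 = ace(a1)
    products["A-II"] = "-".join(a2)
    products["A-III"] = "-".join(aminopeptidase(a2))
    return products


if __name__ == "__main__":
    for name, seq in cascade().items():
        print(f"{name:6s} {seq}")
```

Calling `cascade(blocked=("ACE",))` returns A-I only, which is the situation an ACE inhibitor such as captopril is intended to create; a receptor antagonist acts after A-II has formed and is therefore not represented here.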


substrate, A-I [33]. Interaction at the receptor for A-II may well provide a more direct and specific pharmacological response. A-II blockers may overcome the side effects profile and perceived deficiencies of ACE inhibitors [34] such as the inability to use such agents in kidney-compromised patients, the cough syndrome possibly associated with potentiation of bradykinin action (ACE also degrades this inflammatory peptide) and the need to supplement these agents with diuretics for more complete blood pressure control [35].

Table 6.3 Renin inhibitors(a)

[Compound column: structure drawings of seven peptide-derived inhibitors, not recoverable from the extracted text.]

Structural class     Inhibition of human renin, IC50 (nM)
Reduced amide        10
Statine-like         1.0
Hydroxyethylene      0.22
Hydroxyethylamino    230
Keto amide           3.1
Difluorostatone      0.52
Diol                 0.35

(a) Data taken from [33]; see this review for original references. Single letter codes for amino acids are used as follows: F, phenylalanine; H, histidine; I, isoleucine; K, lysine; P, proline; R, arginine. Boc, Amp and Z are abbreviations for t-butyloxycarbonyl, 2-aminomethylpyridine and benzyloxycarbonyl, respectively.


How is it then that the development of both renin inhibitors and A-II blockers has been so singularly unsuccessful? The answer for renin inhibitors is simply pharmacokinetics [33]. The abundance of structural information available for closely homologous proteins within the same family, the aspartate proteases, and, more recently, the three-dimensional structure of human renin itself, has led to several structurally diverse series of sub-nanomolar inhibitors [33] of the isolated enzyme (Table 6.3). However, many of these compounds retain their peptide nature, other than replacement of the renin-sensitive bond by mimics of the putative tetrahedral intermediate, and are thus subject to rapid metabolic degradation. In the few instances where this problem has been overcome, the size of the inhibitor has precluded useful oral bioavailability; where improved intestinal permeability has been achieved, biliary excretion limited adequate plasma concentrations. Phase-I clinical trials have been conducted with several renin inhibitors [36]. Whilst such compounds given i.v. to human volunteers have shown efficacy comparable to ACE inhibitors, there has been no clear-cut demonstration of oral activity. In this modern computerised era, it is likely that solving the pharmacodynamic problem will cease to be the real challenge in medicinal chemistry. It is interesting to speculate how much further drug discovery would have been advanced if the resources allocated to 'rational design' had been diverted to analyse the problem of gastrointestinal transport and absorption.

All A-II receptor antagonists prior to 1982 were limited to angiotensin-like peptides, typified by saralasin, Sar1-Ile8-A-II, which shared the same pharmacokinetic limitations as renin inhibitors. However, another problem not shared by enzyme inhibitors was also apparent; many had partial agonist properties and this efficacy (see section 6.2.1) could induce a hypertensive response in some situations [37]. In the absence of any solid structural information, both for the receptor and even for the peptide agonist, progress was doomed to be slow. Other than in a few well-described instances, the medicinal chemist has performed appallingly badly in the objective of transforming a peptide into a low molecular weight, effective, specific and orally active drug.

Fortunately, most well-managed drug discovery programs also allow for an empirical approach to operate where, provided that robust in vitro assays can be developed to handle high throughput screening, serendipity can play an important role. The value of this approach cannot be overstated and the chances of success would be even greater if corporate collections of compounds were to be analysed in a more sophisticated manner. In the case of A-II antagonists, the origin [38] of the first non-peptide lead stems from a combination of chemical synthesis and random screening. A novel synthesis of the imidazole-5-carboxaldehyde (10) led to many derivatives, random screening of which showed CV-2198 (11) to have strong diuretic activity, although anti-inflammatory activity had been anticipated [39]. It was subsequently found that (11) showed hypotensive action at a lower dose than that at which


[Structures of compounds (10) to (14). Substituent key: (12) X = 2-Cl, CV-2947 (R = H), S-8307 (R = Na); (13) X = 2-NO2, CV-2961 (R = H), S-8308 (R = Na); (14) X = 4-CO2Na, R = Na.]

diuresis had been observed. During the pharmacological analysis that followed, (11) was found to inhibit A-II induced contractions in the isolated rabbit aorta. Further optimisation gave S-8307 (12) and S-8308 (13) which were reported [40] to inhibit this response by greater than 50% at concentrations lower than 0.1 μM and also showed activity in vivo when given i.v. at 0.1-0.5 mg/kg to anaesthetised rats made hypertensive by infusion of A-II.

Several researchers from other institutions have attempted to exploit these pioneering observations, most notably the Du Pont group [41]. Having confirmed that S-8307 and S-8308 were specific antagonists at A-II receptors and devoid of partial agonist activity, albeit with only weak potency (pA2 of 5.49 and 5.74, respectively, against A-II induced contractions in the isolated rabbit aorta; saralasin has an apparent pA2 = 9.47), a lead optimisation programme was initiated. Rather than use the in vitro pharmacological assay described above, dissociation constants were determined for test compounds using a radioligand binding assay with 3H-labelled A-II as the displaceable ligand in rat adrenal cortical microsomes. Radiolabelled agonist displacement studies can often provide anomalous SAR unless the test compound is devoid of agonist properties. However, of the compounds progressed into the normotensive or renally hypertensive rat, none showed a pressor response, indicating their lack of intrinsic efficacy.
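
To put the quoted pA2 values on a concentration scale, the sketch below converts each one to an apparent antagonist dissociation constant and compares the non-peptide leads with saralasin. It assumes the usual relation pA2 = -log10(KB), which holds strictly only for simple competitive antagonism with a Schild slope of unity; the saralasin figure is described in the text as 'apparent', so the comparison is indicative only. The function names are hypothetical.

```python
# pA2 values quoted in the text (isolated rabbit aorta).
PA2 = {"S-8307": 5.49, "S-8308": 5.74, "saralasin": 9.47}


def pa2_to_kb_molar(pa2):
    """Apparent antagonist dissociation constant in mol/L, taking pA2 = -log10(KB)."""
    return 10.0 ** (-pa2)


def potency_gap(pa2_weaker, pa2_stronger):
    """Fold difference in apparent KB between two antagonists."""
    return 10.0 ** (pa2_stronger - pa2_weaker)


if __name__ == "__main__":
    for name, value in PA2.items():
        print(f"{name:10s} pA2={value:.2f}  KB={pa2_to_kb_molar(value):.2e} M")
    gap = potency_gap(PA2["S-8307"], PA2["saralasin"])
    print(f"saralasin vs S-8307: about {gap:.0f}-fold")
```

On this reading the two leads have micromolar apparent affinities, several thousand-fold weaker than the peptide, which is consistent with the 'only weak potency' noted above.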

An interesting analysis [41] of the comparison between S-8307 and A-II followed which allowed for additional functionality of the peptide antagonists to be built into the non-peptide lead (Figure 6.4(a)). The carboxyl group of S-8307 was aligned with the C-terminal acid group of A-II known to be critical for good affinity and indicative of an electrostatic interaction with a putative positively charged residue on the receptor. This would be the reverse of the situation found in muscarinic, beta and the rest of the monoamine neurotransmitter receptors and would also suggest that residues Ile5-Phe8 are more likely to be in intimate contact with the receptor if an analogous binding mode is operative. Not surprisingly, the imidazole rings were seen as commonalities and the benzyl group of S-8307 was directed towards the N-terminus of A-II. As a working hypothesis, the β-carboxylic acid group of Asp1 in A-II together with the phenolic group of Tyr4 were assumed to provide negative charge in a


Plate 3 Hypothetical superposition of S-8307 with a conformation of A-II predicted by NMR experiments. Reprinted with permission from Duncia, J.V., Chiu, A.T., Carini, D.J. et al. (1990) Discovery of potent non-peptide angiotensin II receptor antagonists; a new class of antihypertensives, Journal of Medicinal Chemistry 33, 1312-1329. Copyright (1990) American Chemical Society.


Plate 4 Crystal structure of HIV-1 protease with A-74704 bound at the active site. Reproduced with permission from Erickson, J., Neidhart, D.J., VanDrie, J. et al. (1990) Design, activity and 2.8 Å crystal structure of a C2 symmetric inhibitor complexed to HIV-1 protease, Science 249, 527-533. Copyright 1990 by the AAAS.


[Figure 6.4: panels (a), (b) and (c); legible labels include 'Tyr4 aromatic ring' and 'His6'.]

Figure 6.4 Hypothetical correspondence between receptor binding elements of A-II and non-peptide antagonists. (a) Pharmacophore of (12) proposed by Duncia et al. [41]; (b) alternative proposal by Weinstock et al. [43]; (c) model suggested for (19) [43].


[Structures of compounds (15) and (16). Substituent key: (15) R = COOH; (16) R = tetrazolyl.]

region which could be accessed by substitution into the benzyl group of S-8307. This rather bold hypothesis was based on just one of several solution conformations predicted [42] for A-II from NMR experiments (Plate 3). Either in confirmation of the theory, or by complete coincidence, it worked! Introducing a para-COOH group into S-8307 yielded an antagonist (14) with a hundred-fold improvement in activity (IC50 = 1.2 μM; S-8307 IC50 = 150 μM as inhibitors of A-II binding) and, by further extension of the COOH-linking group, (15) was discovered with a further five-fold increase in affinity (IC50 = 0.28 μM). Subsequently, however, it was shown that a simple H-bonding interaction was probably involved since this acidic group could be replaced by such groups as CH2OCH3 and CH2OH. An additional bonus was that in going from S-8307 to (15), oral activity had been incorporated into the molecule and accordingly had overcome the hurdle that had been the downfall of renin inhibitors.

Isopharmacophoric replacement of the acidic functionality in (15) by the tetrazolyl moiety gave a further advance in that (16) (DuP 753) showed even greater affinity for the A-II receptor (IC50 = 19 nM) and this activity was reflected in its potency as an antagonist of the A-II induced constriction of rabbit aorta (pA2 = 8.48). In the renal hypertensive rat (single artery ligation), DuP 753 interestingly showed greater potency in reducing blood pressure when given orally (ED50 = 0.59 mg/kg) relative to i.v. (ED50 = 0.78 mg/kg), possibly suggesting a different metabolic fate when given by the two routes. This compound is now being evaluated in man.
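
The successive gains in affinity along the series S-8307, (14), (15) and (16) can be checked with simple arithmetic on the IC50 values quoted above. The sketch assumes that all four values refer to the same A-II binding assay, which the text implies but does not state for (16); the names used are hypothetical.

```python
# IC50 values for inhibition of A-II binding, in micromolar, as quoted in the text.
IC50_UM = [
    ("S-8307 (12)", 150.0),
    ("(14)", 1.2),
    ("(15)", 0.28),
    ("(16), DuP 753", 0.019),
]


def fold_improvements(series):
    """Return (from, to, fold) for each consecutive pair in the series."""
    steps = []
    for (name_a, ic_a), (name_b, ic_b) in zip(series, series[1:]):
        steps.append((name_a, name_b, ic_a / ic_b))
    return steps


def overall_fold(series):
    """Fold improvement from the first to the last compound in the series."""
    return series[0][1] / series[-1][1]


if __name__ == "__main__":
    for start, end, fold in fold_improvements(IC50_UM):
        print(f"{start} -> {end}: {fold:.1f}-fold")
    print(f"overall: {overall_fold(IC50_UM):.0f}-fold")
```

The first two steps come out at 125-fold and a little over four-fold, in line with the 'hundred-fold' and 'five-fold' descriptions in the text.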

Over the last year, many of the competitors in this field have revealed their end-points using strategies which only marginally diverge from that related above. Imidazole-5-acrylic acid derivatives emerged [43] from alternative suggestions for the binding mode of (12) (S-8307) in which the N-benzyl and carboxyl groups of (12) correspond to the A-II Tyr4 aromatic ring and Phe8 carboxyl group respectively (Figure 6.4(b)), an overlay inconsistent with the NMR predictions given above (Plate 3). As with all hypothetical models, almost any chemically reasonable prediction is acceptable provided it may be tested experimentally and modified to accommodate new data. Even with sophisticated modelling techniques, this approach is indispensable for drug discovery; fortunately medicinal chemists are skilled in this art. As a consequence, it was decided to extend the acid side-chain in (12) to approximate the Tyr4-Phe8 separation more closely but with limited conformational freedom. The trans acrylic system (17) was the chosen candidate. This produced a modest improvement in binding (IC50 = 8.9 μM versus 43 μM for S-8307 using 125I-labelled A-II as the radioligand in rat mesenteric membranes) which was markedly enhanced by incorporation of an α-thienyl group to mimic the Phe8 side chain (18, IC50 = 0.4 μM). Finally, from a series of analogs with varying substituents in the benzyl ring, the 4-COOH derivative (19) was shown to have a dramatic increase in affinity (IC50 = 1.0 nM). To take these findings into account, the revised model places the acrylic acid carboxyl and the thienyl groups in alignment with the corresponding elements of Phe8, the 2-butyl group lies in a hydrophobic region near to Ile5, and N-C-N of the imidazole and the acrylic acid double bond represent the original peptide backbone (Figure 6.4(c)).

[Structures of compounds (17) to (19). Substituent key: (18) X = 2-Cl; (19) X = 4-COOH.]

Compound (19) is highly selective for A-II type 1 receptors located in the vascular bed; it is at least 100-fold less active at the A-II type 2 receptor present in the CNS and human uterus. In the normotensive rat, (19) inhibited the pressor response to A-II (ID50 = 0.08 mg/kg i.v.) and in normotensive dogs made hypertensive by infusion of A-I, the compound was active both i.v. (3 mg/kg) and orally (10 mg/kg), reducing blood pressure by almost equal amounts and for greater than 6 h. It has been selected as a candidate for clinical evaluation for the treatment of hypertension.

6.3 Enzymes as targets for drug design

Several features of enzymes as drug targets make them far easier prey for the medicinal chemist than receptors. First, there is a wealth of crystallographic data available for representatives of many of the therapeutically important classes of enzymes [44]. Accordingly, there is always a high degree of expectation that a three-dimensional structure of a newly identified enzyme will become available at some time during a synthetic chemical campaign; indeed most champions of such a new project will write this objective into the research proposal. Even today, however, one may reasonably ask whether such information really does help the discovery process and it may be noted that not one of today's leading drugs, or those to be released this year, has directly utilised enzyme structural data! A more fundamental distinction, and one that is indispensable, is that an enzyme is involved directly in a chemical reaction, the bare outlines of which are known, and the details may be established relatively straightforwardly through kinetic analysis. Since the 'holy grail' in enzyme inhibitor design remains a knowledge of the transition state(s) [45,46] because of the inevitable gain in binding energy, analysis of the reaction coordinate both to identify the catalytic apparatus and to detect likely intermediates is the most fruitful approach. This information allows the medicinal chemist to exploit the complementarity of the enzyme active site to the transition state of the reaction and thus design so-called 'slow, tight binding' inhibitors which are often several orders of magnitude more potent than classical inhibitors based on substrates and/or products. Such inhibitors are now known for over 50 enzymes, many of which are therapeutically relevant [47].
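The link between 'orders of magnitude' in potency and binding energy can be made concrete with the standard relation between a ratio of inhibition constants and a free energy difference, ΔΔG = RT ln(Ki,weak / Ki,tight). The sketch below is only illustrative: the temperature of 298.15 K and the function name are assumptions, and it treats Ki as a true equilibrium dissociation constant.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_energy_gain(fold_tighter: float, temp_k: float = 298.15) -> float:
    """Extra binding free energy (kcal/mol) implied by a fold decrease in Ki."""
    return R_KCAL * temp_k * math.log(fold_tighter)


for orders in (1, 2, 3, 4):
    gain = binding_energy_gain(10.0 ** orders)
    print(f"{orders} order(s) of magnitude in Ki: about {gain:.1f} kcal/mol")
```

Each factor of ten in Ki corresponds to about 1.4 kcal/mol at room temperature, so an inhibitor three orders of magnitude more potent than a substrate analogue has captured roughly 4 kcal/mol of additional binding energy.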

The two examples that have been chosen are members of the aspartate and serine protease families of enzymes for which more crystal structures are available than for any other homologous group of enzymes. Additionally, representatives of both families have been widely studied mechanistically so, for both reasons, inhibitor design should be relatively straightforward.

6.3.1 HIV protease inhibitors as anti-AIDS drugs

The growing importance of acquired immunodeficiency syndrome (AIDS) and the discovery of human immunodeficiency virus (HIV) as the aetiologic agent [48] have prompted a thorough study of the genetic structure of the virus with a view to identifying molecular targets of therapeutic potential [49, 50]. Although a number of investigational compounds have been evaluated in man, to date only one drug, zidovudine (AZT), has been approved for use [51]. Clinical utility of this agent is limited by associated toxicity problems such as bone marrow suppression and by the emergence of AZT-resistant strains of HIV isolated from patients receiving AZT therapy [49, 52]. Since this compound, and several others in clinical development [53], work at the level of reverse transcriptase (RT, see below), it is considered desirable to design agents that suppress viral replication by other mechanisms.

HIV has the basic organisation of a typical retrovirus [54] with three major genes called gag (group antigen), pol (polymerase) and env (envelope) and has six additional genes that encode various regulatory proteins (Figure 6.5(a)). The cycle of replication begins when an HIV particle binds to the host cell (such as a T4 lymphocyte or a cell of the nervous system) and fuses with the cell membrane (Figure 6.5(b)). After injection of the HIV core containing two copies of the viral single-stranded RNA, the genetic information is integrated into host DNA through the successive action of virally encoded RT and integrase. The resulting provirus lies dormant until the host cell is activated; this may also trigger the transcription and translation of viral DNA into genetic material, structural proteins and enzymes for a new generation of virus.

(a) Gene labels shown: LTR, pol, env, vif, vpr, vpu, tat, rev, nef.

(b)
EVENT                              DETAILS
1. Binding                         Viral envelope protein, gp120, binding to CD4 receptors on helper T-cells
2. Uncoating                       Release of viral ss-RNA and RT
3. Reverse Transcription           Viral ss-RNA -> heteroduplex (DNA polymerase) -> viral ss-DNA (RNAase-H) -> viral ds-DNA (DNA polymerase)
4. Integration                     Viral ds-DNA stitched into host cell DNA
   TIME LAPSE ... HOST CELL ACTIVATION
5. Transcription                   Viral mRNA produced under the control of regulatory gene products TAT, REV, NEF etc.
6. Translation                     Production of viral proteins
7. Post-translational Processing   Transport to cell membrane; specific cleavage of polypeptide precursors to generate functional and structural viral proteins; glycosylation and trimming
8. Assembly                        New virus particle prepared
9. Budding                         Release of new virion to complete the infection process

Figure 6.5 (a) The genome of HIV-1. (b) Replicative cycle of HIV.


The new virions are assembled from multiple copies of gag and gag-pol gene products synthesised in a ratio of about 20:1 as well as the env encoded envelope proteins. It is a single frame shift that results in the expression of the gag-pol polyprotein. All three polypeptides and two strands of viral RNA are transported to the surface of the cell where they become attached to the cell membrane and to each other to form the new virion after the budding process. Before this package can be completed, however, the polyproteins must undergo processing through limited and specific proteolysis to generate the active forms of the viral structural and replicative proteins.

The proteinase responsible is contained within the gag-pol product and cuts itself free of the peptide chain prior to action; this enzyme is called HIV protease [55, 56]. HIV protease, in common with other retroviral proteinases, has been shown to belong to a family of enzymes known as aspartate proteinases which also includes renin, the rate-limiting enzyme involved in processing of angiotensinogen (see above). Activation of this proteinase appears to coincide with maturation of virions as witnessed by the change in morphology during the budding process. Transformation of cells infected with the HIV genome containing either a single point mutation [57] or deletion mutations [58] within the protease coding region produced non-infectious virions highlighting this enzyme as an efficacious target for drug intervention.

Based on this understanding of the pivotal role of HIV protease in viral maturation, an orally active and specific inhibitor of this enzyme has been sought for several years now. It is interesting to consider why medicinal chemists entered this particular battle at all given that most had been so badly mauled, and so recently, by renin! Two points may be made in defence. First, the substrate binding site of the HIV enzyme may be less extended than that in renin and hence a smaller molecular weight inhibitor may be found. Second, and probably more important, is the nature of the disease and the availability of alternative treatments. For hypertension, a range of drugs, albeit not perfect, acting via different mechanisms (diuretics, beta adrenergic receptor blockers, ACE inhibitors, calcium channel blockers, for example) is already in routine clinical practice. A new entity targeted at this disease would have to represent a real advance, say in side effect profile, and certainly be readily bio-available orally. A completely different picture exists for AIDS so that a compound with a better side effect profile than AZT but possibly only poorly bio-available would be most acceptable at this stage. It must always be remembered that the real importance of any drug discovery can only be measured in terms of 'what is out there already', not simply in terms of its scientific elegance. This is aptly demonstrated in the current status of HIV protease inhibitors.

Almost simultaneously, protein and medicinal chemists began to characterise HIV protease by structural studies and inhibitor synthesis, respectively [59,60]. In the aspartate protease family, gene duplication gives rise to two structurally similar domains each having a conserved sequence, Asp-Thr-Gly, with both Asp side-chains contributing to the active site apparatus (Figure 6.6). Since this sequence occurs only once in HIV protease, it was predicted [61] that the two domains may be simulated through dimerisation of two identical subunits. Using both chemical synthesis and over-expression of the protein in E. coli, relatively large quantities of the pure enzyme have become available in many laboratories for both kinetic and crystallisation studies and for inhibitor assays. Several structures have been solved by X-ray crystallography at high resolution both of the native enzyme [62] and with inhibitor bound [63] and have confirmed that the enzyme is a homodimer as predicted. In principle, all the structural information that a medicinal chemist could ever want has become available during the early stages of inhibitor design. In the meantime, highly potent inhibitors of the enzyme have been prepared, in most cases by simply 'dusting off' dipeptide isosteres [59] that were successful in renin programmes and remodelling them to correspond to HIV protease cleavage sites in gag-pol (Table 6.4).

Figure 6.6 Catalytic mechanism proposed for HIV protease.

Table 6.4 Inhibitors of HIV protease (a)

Structural class        Inhibition of HIV protease, Ki (nM)
Reduced amide           780
Statine-like            810
Hydroxyethylene         70
Hydroxyethylamino       0.24
Difluorostatone         160
Phosphinic acid         0.4

[Compound structures not reproduced]

(a) Data are taken from [59,60]. Single letter codes for amino acids are used as in Table 4 and as follows: T, threonine; Q, glutamine; S, serine; V, valine; L, leucine; N, asparagine. Tba and Ac are abbreviations for t-butylacetic acid and acetyl, respectively.

A more intellectual approach was also considered [64], the key feature being the hypothesis that the native protein would have exact two-fold rotational symmetry, a prediction that was subsequently confirmed by crystallographic examination. In addition, it was recognised that more drastic changes in the peptide-like structure of the HIV protease inhibitors described above would have to be made in order to address such issues as oral absorption and metabolic stability. From this analysis, an entirely symmetrical inhibitor was designed to match the C2-symmetrical active site having indistinguishable subsites and a symmetrical arrangement of the catalytic apparatus. Additionally, such symmetric inhibitors would be expected to display high specificity for the retroviral over mammalian aspartic proteases since the latter have a less symmetric arrangement of subsites. Having said that, it must be remembered that the HIV enzyme-substrate complex itself is asymmetric so that whilst the side-chains of any symmetrical inhibitor may readily be accommodated in the enzyme subsites, the backbone H-bonds to the enzyme would require some consideration. Design of the C2-symmetric inhibitor [64, 65] was based on dissecting the tetrahedral intermediate for the cleavage of Phe-Pro (20) with the symmetry axis placed either on the carbonyl group or through the middle of the scissile bond to generate four core units (21)-(24). Further analysis using the homologous RSV enzyme, whose crystal structure was known at the start of the programme, led to the synthesis of the diamine (21) which, not surprisingly, proved to be only a weak inhibitor of the HIV protease (Table 6.5) and was without anti-HIV activity in vitro. Extending the compound to incorporate interactions at the P2 and P2′ subsites gave A-74704 (25) which was a potent competitive inhibitor with a Ki = 4.5 nM. More importantly, (25) also had antiviral activity in vitro (IC50 = 0.4 μM for inhibition of viral core antigen in H9 cells) with a therapeutic index of 500 (LD50 for cytotoxic effect in H9 cells = 200 μM). The design postulate was substantiated by the subsequent determination of the structure of recombinant HIV-1 protease co-crystallised with (25) which exhibited nearly exact crystallographic two-fold symmetry (Plate 4).

All three stereoisomers of the analogous derivative of the diol core (23) were generally about 10 times more potent than those derived from (21). In contrast to the traditional monohydroxyethyl transition state isosteres for both HIV protease and renin, all three isomers (26)-(28) (Table 6.5) were equi-active, suggesting some degree of conformational lability at the linking region (-CHOH-CHOH-). None of the symmetric compounds prepared had significant activity against renin at 10 μM. The improvement in activity against the isolated enzyme was transposed into whole cell activity with the three diols inhibiting HIV replication in H9 cells with an IC50 between 0.02 and 0.06 μM and a therapeutic ratio ranging from 500 to 5000.

[Structures (20)-(24) and (26)-(28); (26) 3R,4R; (27) 3S,4S; (28) 3R,4S]

Table 6.5 Antiviral activity of HIV protease inhibitors

Compound             Inhibition of HIV      Antiviral activity (a)       Reference
                     protease, IC50 (nM)    IC50 (μM)     TC50 (μM)
(21)                 >200                                                [65]
(25) (A-74704)       3.0                    0.4           200
(26)                 0.22                   0.02          10
(27)                 0.38                   0.06          >100
(28)                 0.22                   0.02          60
(29) (Ro 31-8959)    <0.4 (Ki = 0.12)       0.002         >10            [66]
(30)                 6500
(31)                 140                    0.30          >100

(a) Inhibition of HIV in vitro (IC50) and cytotoxicity (TC50) were measured in H9 cells (compounds (21) and (25)-(28)) or C8166 cells (compounds (29)-(31)) infected with HIV-1 3B and HIV-1 RF, respectively.
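The therapeutic index quoted in the text for (25) is the ratio of the cytotoxic concentration to the antiviral IC50 in the same cells. The sketch below applies the same ratio to the H9 cell entries of Table 6.5; the class and field names are illustrative, and a TC50 reported as a lower bound (">") yields only a lower bound on the ratio.

```python
from dataclasses import dataclass


@dataclass
class CellAssay:
    compound: str
    ic50_um: float                     # antiviral IC50 (uM), Table 6.5
    tc50_um: float                     # cytotoxic TC50 (uM), Table 6.5
    tc50_is_lower_bound: bool = False  # True where the table reports ">"

    def therapeutic_ratio(self) -> float:
        """TC50 / IC50; a lower bound when TC50 itself is a lower bound."""
        return self.tc50_um / self.ic50_um


rows = [
    CellAssay("(25) A-74704", 0.4, 200.0),
    CellAssay("(26)", 0.02, 10.0),
    CellAssay("(27)", 0.06, 100.0, tc50_is_lower_bound=True),
    CellAssay("(28)", 0.02, 60.0),
]

for row in rows:
    prefix = ">" if row.tc50_is_lower_bound else ""
    print(f"{row.compound}: TC50/IC50 = {prefix}{row.therapeutic_ratio():.0f}")
```

For (25) this reproduces the index of 500 given in the text (200 μM / 0.4 μM).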

To date nothing has been reported about the pharmacokinetic profile of these compounds although one suspects that progress towards clinical evaluation has not been rapid. Scientific elegance does not always imply success at the drug level. On the other hand, a more empirically derived inhibitor, Ro 31-8959 (29), has advanced to Phase-II trials [66]. Again, the unusual ability of HIV protease to cleave the Phe(Tyr)-Pro bond in the gag-pol gene product was used as the basis on which to achieve selectivity over other aspartate proteases. The hydroxyethylamine moiety was chosen as the scissile amide bond replacement to be incorporated into the dipeptide as the minimum sequence. Again the resulting compound (30) was only poorly active (Table 6.5) but N-terminal extension provided a 40-fold increase in activity (31) which was not improved by further N- or C-terminal extension. The most marked jump in potency was achieved by varying the imino acid at P1′, resulting in (29) with a Ki = 0.12 nM. This compound showed exquisite antiviral activity against HIV-1, inhibiting replication in C8166 cells infected with the RF strain of HIV with an IC50 = 2 nM and with a therapeutic ratio greater than 10000. In rats and cynomolgus monkeys, plasma levels of the drug in excess of 70 ng/ml were achieved following an oral dose of 10 mg/kg so that the concentration remained above the IC90 (for HIV protease) for over 6 h. The compound shows modest (but probably sufficient) oral availability (about 11%) in monkeys [67] and has now advanced into Phase-II trials.

[Structures (29) and (30)]

6.3.2 Emphysema and elastase

Pulmonary emphysema is one of several diseases in which human neutrophil (leukocyte) elastase (HNE) has been implicated as the pathogenic agent [68]. It is characterised by progressive enlargement of airspaces distal to the terminal bronchioles accompanied by destruction of their walls [69]. Physiologically, lung elasticity and elastic driving pressure are decreased and expiratory airways collapse, resulting in chronic airflow obstruction. Other disease states in which elastases may have a role include acute respiratory distress syndrome, glomerulonephritis, pancreatitis and rheumatoid arthritis. HNE and related elastases are the most destructive enzymes in the body, having the ability to cleave the important connective tissue protein, elastin. Elastin [70] has the unique property of elastic recoil and is particularly abundant in the lungs, which owe their elastic properties to the presence of this protein. This flexible protein is highly cross-linked with unusual amino acids such as desmosine and isodesmosine and is rich in residues bearing small side-chains such as alanine, valine and serine; HNE has a P1 specificity pocket with this requirement.

When neutrophils invade inflamed areas such as the lung to remove dead or foreign material, they release a variety of hydrolytic proteases which are normally localised to the phagocytic vacuole and are therefore regulated by compartmentalisation. Any extracellular HNE that is released from polymorphonuclear leukocytes is normally carefully modulated by natural circulating protease inhibitors such as α1-protease inhibitor, and the complexes are subsequently cleared from the plasma. Either through a genetic defect leading to a decrease in α1-protease inhibitor or through an abnormally high level of HNE, the enzyme-inhibitor balance could be disturbed, leading to destruction of elastin-containing tissues such as lung and hence emphysema [71]. The biochemical efficacy of genetically engineered α1-protease inhibitor as replacement therapy in deficient humans has been demonstrated [72] and other naturally occurring high molecular weight protease inhibitors (e.g. eglin C) have been evaluated in experimental animals. However, treatment has required weekly intravenous infusions of the inhibitor and this must be continued throughout the lifetime of the patient.

A more attractive approach to the treatment of emphysema has been the design of small molecular weight inhibitors of HNE based on an understanding of its mechanism of action [73] and, more recently, its three-dimensional structure [74]. The active site machinery of HNE is typical of the serine protease family with the catalytic triad, Ser-195, His-57 and Asp-102, facilitating the formation of the enzyme-substrate tetrahedral intermediate (Figure 6.7) in the usual way. Crystallography shows that the buried portion of this triad is linked to the surface of the protein by a channel of seven water molecules. As anticipated from homologous enzymology, the highly nucleophilic serine residue is sensitive to phosphofluoridates and sulphonyl fluorides but these reagents lack the specificity demanded of therapeutically useful agents.

Figure 6.7 Catalytic triad involved in elastase action.

Although the structures of over 20 elastase derivatives have now been determined, much of the early success was with the porcine pancreatic enzyme (PPE) and it is only recently that a structure of HNE has become available. One major reason for this difference is simply that native PPE readily forms crystals that are amenable to complex formation through soaking experiments with a range of small inhibitors [73]. However, of all the serine proteases whose tertiary structures have been determined, PPE is topologically most similar to HNE, especially in the active site region. The crystal structure of HNE inactivated by the chloromethylketone (32) is consistent with the accepted mechanism of action of such compounds [74] and reveals two covalent bonds, the first between Ser-195 and the original ketone carbon of (32), which has become tetrahedral in the complex, and the second between His-57 and the terminal activated methylene group (33). Thus HNE is a conventional serine protease and inhibitors of related enzymes (e.g. chymotrypsin), correctly modified to take into account HNE specificity, should be applicable to this enzyme.

[Structures (32) and (33)]

The crucial specificity pocket, S1, is rather hydrophobic in character and is adapted to accommodate medium sized (e.g. leucine, valine) aliphatic side chains but not larger groups such as phenylalanine. The P1 residue of α1-protease inhibitor is Met-358 which can fit into S1 readily in a typical bent conformation. When this Met is oxidised to the sulphoxide (through cigarette smoke, for example), it is predicted by modelling experiments that a severe steric clash is created in this pocket, thereby impairing tight binding. Such predictions have been supported [75] by kinetic experiments on small peptide substrates. It has also been established [76] that interactions at remote sites can profoundly influence substrate specificity. For example, Phe can sometimes be tolerated as the P1 residue in minimal substrates but not in extended peptides. This is of relevance in inhibitor designs that need to utilise the extended substrate binding site to achieve high potency and selectivity.

Several different classes of HNE inhibitors are known [77], both reversible and irreversible, the relative merits of which may be debated. Only in exceptional circumstances do irreversible inhibitors make therapeutically useful agents since they are inevitably reactive compounds which alkylate or acylate the target protein, and other proteins besides! The so-called mechanism-based enzyme inactivators ('suicide inhibitors') whose functionality is unleashed only by the specific action of the target enzyme are of far more value [78] and one such example is already a well-established drug, the β-lactamase inhibitor, clavulanic acid (see Table 6.1). This example is not entirely fortuitous since it turns out that some β-lactams are also irreversible inhibitors of HNE [79]. Bacterial transpeptidases and carboxypeptidases are similar to classical serine proteases only in that they utilise serine as the nucleophilic entity. It was hypothesised [80], however, that acyl enzymes formed by nucleophilic attack on the β-lactam may also function as HNE inhibitors. The seminal observation that benzyl clavulanate (34) but not the free acid (35) (Table 6.6) inhibited HNE was consistent with the endopeptidase activity of HNE. SAR studies (Table 6.6) showed that activity was influenced by the nature and stereochemistry of substituents in the 7-position; 7α-substituents generally gave potent HNE inhibitors with the corresponding 7β-isomer being much weaker or inactive. This preference may be rationalised by remembering that HNE cleaves L,L-peptide bonds whereas the target enzymes for β-lactam antibiotics cleave D,D-peptides. Given the P1 specificity of HNE, it is not surprising that smaller 7α-substituents are preferred. Further modification and incorporation of a 3-substituent known in cephalosporin circles to confer long duration of action led to L-659286 (36) which has been selected for detailed preclinical evaluation. The compound was only modestly active against HNE in vitro (Ki = 0.4 μM) but inhibited HNE-induced haemorrhage when given as an aerosol although not orally.

[Structures (34) (R = Bn), (35) (R = H) and (36)]

Table 6.6 Elastase inhibitors

[General structure of the cephalosporin t-butyl esters, with substituents R1 and R2 and (O)n]

Compound    R1      R2      n        Inhibition of HNE, IC50 (μM)
(34)                                 4
(35)                                 >40
            H       OCH3    2        1
            OCH3    H       2        >40
            H       OCH3    0        10
            H       OCH3    1 (β)    4
            H       OCH3    1 (α)    >40
            H       OnPr    2        40
(37)        H       Cl      2        0.04

A crystal structure of PPE has been solved [81] with the irreversible cephalosporin inhibitor (37) bound at the active site (Figure 6.8(a)). The enzyme was relatively unperturbed by inhibitor binding even though two new covalent bonds were observed: the Ser-195 side chain had become attached to C-8 of the cephalosporin via an ester linkage formed by nucleophilic attack at the α-face of the β-lactam ring, and the 3-acetoxy substituent had been lost and replaced by a C-N bond to His-57 (Figure 6.8(b)). Loss of enzyme activity was time-dependent; initial inhibition was reversible by hydroxylamine, presumably at the level of intermediate (38). Formation of the second bond (to give (39)), which could not have been determined other than by direct observation in the crystal structure, inhibited the enzyme in a hydroxylamine-insensitive manner.

Figure 6.8 (a) Crystal structure of PPE with cephalosporin inhibitor bound [81]. (b) Mechanism of irreversible inactivation of PPE by (38).

[Structures (38) and (39)]

A more clinically advanced inhibitor, ICI-200880 (40), now in Phase-I trials, has been designed [82] by consideration of the transition state developed during the production of the acyl-enzyme intermediate. Compound (40) is a slow binding competitive HNE inhibitor (Ki = 0.5 nM) which protects in the hamster model of elastase-induced emphysema for up to 48 h following administration by aerosol [83]. The trifluoromethylketone (41) is the most potent synthetic inhibitor of HNE reported to date (Ki = 0.1 nM) and is presumed to facilitate the enzyme-catalysed addition of Ser-195 to form a stable hemi-ketal adduct (42). There is at least a 10 000-fold decrease in binding when the ketone (43) is compared with its alcohol derivative (44). Hemi-ketal formation has been shown to be a rapid process which is followed by a rate-limiting conformational isomerisation to give the final complex.

[Structures (40)-(44)]

6.4 Drug discovery by screening: concluding remarks

As a salutary postscript to the discussion that has been presented above, a final example of innovation is presented, an example which has caused a profound change in direction in the search for anti-AIDS drugs. Until a little over a year ago, all research into inhibitors of reverse transcriptase (RT, see above) relied upon mimicry of the nucleoside substrates in the polymerase reaction. Here is an enzyme for which the reaction pathway is known only in bald outline and for which there is no structural information. Janssen's approach was masterly [84]. A library of 600 molecules, each a prototype of a different chemical series and without activity in standard pharmacological assays, was screened for anti-HIV activity in vitro. It was discovered that (45) had modest but specific anti-HIV activity and lead optimisation eventually uncovered (46) and (47) as representatives of the 'TIBO' series. It was subsequently shown that these compounds act as non-competitive inhibitors of RT, an indication that even if structural data were available, they would be useless in developing the series towards a drug candidate. As if by coincidence, several other groups [85-87] have now reported structurally diverse agents (for example (48), (49) and (50)), all apparently interacting with RT in a similar manner to TIBO, possibly at an identical, or at least at an overlapping, allosteric site. In almost record time, two of these drugs, (46) and (50), have proceeded into clinical trials; the outcome is eagerly awaited.

[Structures (45)-(50)]

In conclusion, we are still in a position where the payoff from modern drug discovery techniques is still to be fully realised. However, it is my belief that when such techniques are combined with serendipity and the intuition of medicinal chemists, this represents as formidable a team as is possible. Given this as the current state of the art, the limiting obstacles to the design of new drug entities will continue to be pharmacokinetic affairs and it is in this area that a predictive 'black box' would be most beneficial.

Acknowledgements

I thank Dr Peter McMeekin for his modelling expertise and for producing Figures 6.7, 6.9(b) and 6.10(a). Co-ordinates were abstracted from Brookhaven file data [44] and visualised using Insight and Discover (Biosym.).

References

1. Scrip Yearbook (1993) PJB Publications, Surrey.
2. E. Uhlmann and A. Peyman (1990) Chem. Rev. 90, 544-584.
3. For a recent compilation see S. Watson and A. Abbott (1991) Trends Pharmacol. Sci., Receptor Nomenclature Suppl.
4. N.G. Morgan (1989) Cell Signalling, Open University Press, Milton Keynes.
5. S.R. Nahorski ed. (1990) Transmembrane Signalling, Wiley, Chichester, UK.
6. T.R. Jackson, L.A.C. Blair, J. Marshall, M. Goedert and M.R. Hanley (1988) Nature 335, 437-440.
7. Y. Masu, K. Nakayama, H. Tamaki, Y. Harada, M. Kuno and S. Nakanishi (1987) Nature 329, 836-838.
8. J.B.C. Findlay and E.E. Eliopoulos (1990) Trends Pharmacol. Sci. 11, 492-499.
9. L. Marsh and I. Herskowitz (1988) Proc. Natl. Acad. Sci. USA 85, 3855-3859.
10. G. Carpenter (1987) Annu. Rev. Biochem. 56, 881-914.
11. P.G. Strange (1988) Biochem. J. 249, 309-318.
12. C.F. Stevens (1987) Nature 328, 198-199.
13. E.K. Perry (1986) Br. Med. Bull. 42, 63-69.
14. W.H. Moos, R.E. Davis, R.D. Schwarz and E.R. Gamzu (1988) Med. Res. Rev. 8, 353-391.
15. J.H. Growden (1983) Med. Res. Rev. 3, 237-257.
16. E. Hollander, R.C. Mohs and K.L. Davis (1986) Br. Med. Bull. 42, 97-100.
17. See Scrip (1991), no. 1636, pp. 24-25.
18. J.M. Schulman, M.L. Sabia and R.L. Disch (1983) J. Med. Chem. 26, 817-823.
19. J. Saunders and S.B. Freedman (1989) Trends Pharmacol. Sci. (Suppl.) 10, 70-75.
20. T.L. Bonner, A.C. Young, M.R. Brann and N.J. Buckley (1988) Neuron 1, 403-410.
21. S.F. Freedman, E.A. Harley and L.L. Iverson (1988) Br. J. Pharmacol. 93, 437-445.
22. P. Krogsgaard-Larsen, E. Falch, P. Sauerberg, S.B. Freedman, H.L. Lembol and E. Meier (1988) Trends Pharmacol. Sci. (Suppl.) 9, 69-74.
23. R. Baker and J. Saunders (1989) Annu. Rep. Med. Chem. 24, 31-40.
24. J. Saunders, M. Cassidy, S.B. Freedman, E.A. Harley, L.L. Iversen, C. Kneen, A.M. MacLeod, K.J. Merchant, R.J. Snow and R. Baker (1990) J. Med. Chem. 33, 1128-1138.
25. L.J. Street, R. Baker, T. Book, C.O. Kneen, A.M. MacLeod, K.J. Merchant, G.A. Showell, J. Saunders, R.H. Herbert, S.B. Freedman and E.A. Harley (1990) J. Med. Chem. 33, 2690-2697.
26. A.M. MacLeod, R. Baker, S.B. Freedman, S. Patel, K.J. Merchant, M. Roe and J. Saunders (1990) J. Med. Chem. 33, 2052-2059.
27. R. Baker, L.J. Street, A.J. Reeve and J. Saunders (1991) J. Chem. Soc., Chem. Commun. 760-762.
28. J. Saunders and J.B.C. Findlay (1990) Biochem. Soc. Symp. 57, 81-90.
29. R.M. Snider, J.W. Constantine, J.A. Lowe, K.P. Longo, W.S. Lebel, H.A. Woody, S.E. Drozda, M.C. Desai, F.J. Vinick, R.W. Spencer and H-J. Hess (1991) Science 251, 435-439.
30. A.T. Chiu, D.E. McCall, T.T. Nguyen, D.J. Carini, J.V. Duncia, W.F. Herblin, R.T. Uyeda, P.C. Wong, R.R. Wexler, A.L. Johnson and P.B.M.W.M. Timmermans (1989) Eur. J. Pharmacol. 170, 117-118.
31. V.J. Dzau and R.E. Pratt (1986) in Handbook of Experimental Cardiology, eds. E. Haber, H. Morgan, A. Katz and H. Fozard, Raven Press, New York, pp. D1631-D1661.
32. M.A. Ondetti and D.W. Cushman (1981) J. Med. Chem. 24, 355-361.
33. W.J. Greenlee (1990) Med. Res. Rev. 10, 173-236.
34. D.M. Coulter and I.R. Edward (1987) Br. Med. J. 294, 1521-1523.
35. H-L. Chin and D.A. Buchan (1990) Ann. Intern. Med. 112, 312-313.
36. E.W. Petrilo, N.C. Trippodo and J.M. DeForrest (1989) Annu. Rep. Med. Chem. 25, 51-60.
37. P. Corvol (1989) Clin. Exp. Hypertens.-Theory Practice (Suppl. 2) A11, 463-470.
38. Y. Furukawa, personal communication.
39. K. Nishikawa and Y. Furukawa (1990) Exp. Med. 8, 214-220.
40. Y. Furukawa, S. Kishimoto and K. Nishikawa (1982) US Patent 4335040 (to Takeda).
41. J.V. Duncia, A.T. Chiu, D.J. Carini, G.R. Gregory, A.L. Johnson, W.A. Price, G.J. Wells, P.C. Wong, J.C. Calabrese and P.B.M.W.M. Timmermans (1990) J. Med. Chem. 33, 1312-1329.
42. R. Smeby and S. Fermandjian (1978) in Chemistry and Biochemistry of Amino Acids, Peptides and Proteins 5, ed. B. Weinstein, Marcel Dekker, New York, pp. 117-162.
43. J. Weinstock, R.M. Keenan, J. Samanen, J. Hempel, J.A. Finkelstein, R.G. Franz, D.E. Gaitanopoulos, G.R. Girard, J.G. Gleason, D.T. Hill, T.M. Morgan, C.E. Peishoff, N. Aiyar, D.P. Brooks, T.A. Fredrickson, E.H. Ohlstein, R.R. Ruffolo, E.J. Stack, A.C. Sulpizio, E.F. Weidley and R.M. Edwards (1991) J. Med. Chem. 34, 1514-1517.

44. See Protein Data Bank Newsletter, Protein Data Bank, Chemistry Department, Brookhaven National Laboratory, Upton, NY 11973, USA.

45. M.1. Page and A. Williams eds. (1987) Enzyme Mechanisms, Royal Soc. Chern., London. 46. A. Fersht (1984) Enzyme Structure and Mechanism, Freeman, New york. 47. J.F. Morrison and CT. Walsh (1988) Adv. Enzymol. 61, 201-301. 48. RC Gallo and L. Montagnier (1988) Sci. Am. 259, 25-112. 49. D.W. Norbeck (1989) Annu. Rep. Med. Chem. 25, 149-158. 50. R Yarchoan, H. Mitsuya and S. Broder (1989) in The Science of AIDS, Freeman, New York,

pp.85-99. 51. M.A. Fischl, D.D. Richman, M.H. Grieco, M.S. Gottlieb, P.A. Volberding, O.L. Laskin, J.M.

Leedom, J.E Groopman, D. Mildvan, R.T. Schooley, G.G. Jackson, D.T. Durack, D. King and the AZT Collaborative Working Group (1987) N. Engl. J. Med. 317,185-191.

52. B.A. Larder, G. Darby and D.D. Richmann (1989) Science 243, 1731-1734. 53. J. Saunders and R. Storer (1992) Drug News Perspectives, 5,153-169. 54. W.A. Haseltine and F. Wong-Staal (1989) The Science of AIDS, Freeman, New York,

pp.13-25. 55. B.M. Dunn and J. Kay (1990) Antiviral Chem. Chemother. 1,3-8. 56. L.H. Pearl (1990) in Retroviral Proteases: Maturation and Control of Morphogenesis,

Macmillan, London. 57. N.E Kohl. E.A. Emini, W.A. Schleif, LJ. Davis, J.C Heimbach, R.A.F. Dixon, E.M. Scolnick

and I.S. Sigal (1988) Proc. Natl. Acad. Sci. USA 85, 4686-4690. 58. C Peng, B.K. Ho, T.W. Chang and N.T. Chang, (1989) J. Virol63, 2550-2556. 59. T.L. Blundell, R. Lapatto, A.F. Wilderspin, A.M. Hemmings, P.M. Hobart, D.E Danley and

P.I. Whittle (1990) Trends Biochem. Sci. 15,4225-4230. 60. S.R. Petteway, D.M. Lambert and B.W. Metcalf(1991) Trends Pharmacol. Sci. 12,28-34. 61. L.H. Pearl and W.R Taylor (1987) Nature 329, 351-354.

Page 180: Principles of Molecular Recognition

DRUG DISCOVERY 167

62. For example see: A. Wlodawer, M. Miller, M. Jaskolski, B.K. Sathyanarayana, E. Baldwin, LT. Weber, L.M. Se\k, L. Clawson, 1. Schneider and S.B.H. Kent. (1989) Science 245, 616-621.

63. A.I.. Swain, M.M. Miller, 1. Green, D.H. Rich, 1. Schneider, S.B.H. Kent and A. Wlodawer (1990) Proc. Natl. Acad. Sci. USA 87, 8805-8809.

64. 1. Erikson, D.1. Neidhart, 1. Vandrie, D.1. Kempf, X.e. Wang, D.W. Norbeck, 1.1. Plattner, J.W. Rittenhouse, M. Turon, N. Wideburg, W.E. Kohlbrenner, R. Simmer, R. Helfrich, D.A. Paul and M. Knigge (1990) Science 249, 527-533.

65. D.1. Kempf, D.W. Norbeck, I..M. Codacovi, X.e. Wang, W.E. Kohlbrenner, N.E. Wideburg, D.A. Paul, M.F. Knigge, S. Vasavononda, A. Craig-Kennard, A. Saldivar, W. Rosenbrook, 1.1. Clement, J.1. Plattner and 1. Erikson (1990) J. Med. Chem. 33, 2687-2689.

66. N.A. Roberts, 1.A. martin, D. Kinchington, A.V. Broadburst, J.e. Craig, I.B. Duncan, S.A. Galpin, B.K. Handa, J. Kay, A. Krohn, R.W. Lambert, 1.H. Merrett, J.S. Mills, K.E.B. Parkes, S. Redshaw, A.1. Ritchie, D.L. Taylor, G.1. Thomas and P.1. Machin (1990) Science 248, 358-361.

67. J.A. Martin, presented at the IVth International Conference on Antiviral Research, New Orleans, 1991.

68. A. Janoff (1985) Am. Rev. Respir. Dis. 132417-433. 69. G.I.. Snider, 1. Kleinerman, W.M. Thurlbeck and Z.H. Bengali (1985) Am. Rev. Resp. Dis. 132,

182-185. 70. J. Rosenbloom (1987) Methods Enzymol. 144, 172-196. 71. G.I.. Snider, E.e. Lucey and P.1. Stone (1986) Am. Rev. Resp. Dis. 133, 149-169. 72. M.D. Wewers, M.A. Casolaro, S.E. Sellars, S.e. Swayze, K.M. McPhaul, J.T. Wittes and R.G.

Crystal. (1987) N. Engl. J. Med. 316,1055-1062. 73. W. Bode, E. Meyer and J.e. Powers, (1989) Biochemistry 28, 1951-1963. 74. M.A. Navia, B.M. McKeever, J.P. Springer, T-Y. Lin, H.R. Williams, E.M. Fluder, e.P. Dorn,

K. Hoogsteen (1989) Proc. Natl. Acad. Sci. USA. 86, 7-11. 75. B. McRae, K. Nakajima, J. Travis and 1.e. Powers (1980) Biochemistry 19, 3973-3978. 76. R.I.. Stein, A.M. Strimpler, H. Hori and 1.e. Powers (1987) Biochemistry 26,1301-1305. 77. D.A. Trainor (1987) Trends Pharmacol. Sci. 8,303-307. 78. C.T. Walsh (1984) Annu. Rev. Biochem. 53, 493-535. 79. J.B. Doherty, B.M. Ashe, L.W. Argenbright, G.O. Chandler, M.E. Dahlgreen, e.P. Dorn, P.E.

Finke, R.A. Firestone, D. Fletcher, W.K. Hagmann, R. Mumford, L. O'Grady, A.L. Maycock, J.M. Pisano, S.K. Shah, K.R. Thompson and M. Zimmerman (1987) Nature 322,192-194.

80. 1.B. Doherty, B.M. Ashe, P.L. Barker, T.1. Blacklock, 1.W. Butcher, G.O. Chandler, M.E. Dahlgreen, P. Davies, e.P. Dorn, P.E. Finke, R.A. Firestone, W.K. Hagmann, T. Halgren, W.B. Knight, A.I.. Maycock, M.A. Navia, L. O'Grady, 1.M. Pisano, S.K. Shah, K.R. Thompson, H. Weston and M. Zimmerman (1990) J. Med. Chem. 33,2513-2521.

81. M.A. Navia, J.P. Springer, T-Y. Lin, H.R. Williams, R.A. Firestone, lM. Pisano, J.B. Doherty, P.E. Finke and K. Hoogsteen (1987) Nature 327, 79-82.

82. R.I.. Stein, A.M. Strimpler, P.D. Edwards, J.1. Lewis, R.e. Mauger, 1.A. Schwartz, M.M. Stein, D.A. Trainor, R.A. Wildonger and M.A. Zottola (1987) Biochemistry 26, 2682-2689.

83. 1.e. Williams, R.L. Stein, e. Knee, 1. Egan, R. Falcone, D. Trainor, P. Edwards, D. Wolanin, R. Wildonger, 1. Schwarts, B. Hesp, R.E. Giles and R.D. Krell (1988) F ASEB J. 2, A346.

84. R. Pauwels, K. Andries, 1. Desmyter, D. Schols, M.1. Kukla, H.1. Breslin, A. Raeymaeckers, 1.V. Gelder, R. Woestenburghs, 1. Heykants, K. Schellekens, M.A.e. Janssen, E. DeClercq and P.A.H. Janssen (1990) Nature 343, 470-474.

85. V.I. Merluzzi, K.D. Hargrave, M. Labadia, K. Grozinger, M. Skoog, J.e. Wu, C-K. Shih, K. Eckner, S. Hattox, J. Adams, A.S. Rosenthal, R. Faanes, R.I. Eckner, R.A. Koup, 1.L. Sullivan (1990) Science 250,1411-1413.

86. M. Baba, E. De Clercq, H. Tanaka, M. Ubasawa, H. Takashima, K. Sekiya, I. Nitta, K. Umezu, H. Nakashima, S. Mori, S. Shigeta, R.T. Walker and T. Miyasaka (1991) Proc. Natl. Acad. Sci. USA 88, 2356-2360.

87. M. Goldman, presented at 32nd Annual Med. Chern. Symposium, Buffalo, 1991.

Page 181: Principles of Molecular Recognition

7 Time scales and fluctuations of protein dynamics: metmyoglobin in aqueous solution

L.A. FINDSEN, S. SUBRAMANIAN, V. LOUNNAS and B.M. PETTITT

7.1 Introduction

Dynamic motions in biological systems are thought to play a very important role in determining their functionality and material properties [1]. Both theoretical and experimental methods of studying kinetics and molecular dynamics have become valuable tools for understanding biomolecular function [2]. The fundamental fluctuations in both main-chain and side-chain positions provide a wealth of detail about the structures and mechanisms responsible for the various functional roles that proteins play.

While current computer simulations give a level of detail that is not presently available from experiments, discrepancies exist both among various simulations [3,4] and with observable properties. In some cases the disagreement may be due to the treatment of environmental influences around the protein [5]. Other differences between theoretical studies may arise from parameter differences [6] and sample preparations.

Since a number of relatively complete reviews of protein molecular dynamics exist [1,2], it seems appropriate to illustrate a single example in some detail to highlight the successes and weaknesses of this sort of theoretical investigation. To this end, in this contribution, we present a representative 150 ps molecular dynamics simulation of metmyoglobin in water. Myoglobin has been the object of extensive experimental [7] and theoretical [2] studies and hence serves as a useful case for comparing molecular simulation methodologies and experiments.

Myoglobin is a globular protein (Figure 7.1) which is involved in the respiratory cycle in vertebrates. This protein stores oxygen in muscles for use in times of physical effort. The active site in myoglobin involves the prosthetic heme group which is linked via the iron to Histidine 93 of the protein. Functionally, oxygen bonds to iron as a sixth ligand in myoglobin, while a water molecule serves this role in metmyoglobin. Myoglobin consists of eight α-helices packed together. Additional stability is imparted to the tertiary structure through salt links. Myoglobin was the earliest protein structure to be solved crystallographically [8] and serves as a model for understanding aspects of the structure and function of hemoglobin, which is made up of four myoglobin-like units.

Figure 7.1 Stereo plot of X-ray versus calculated average structure. The heavy line is the calculated average structure and the light line represents the X-ray data.

A 300 ps molecular dynamics simulation has been carried out on metmyoglobin in a vacuum by Levy et al. [3] using the CHARMM program and parameters. Shorter simulations were done by Tilton et al. [4] using the AMBER program and parameters, as was done for this work, and by Kuczera et al. [9], again using CHARMM. These simulations depict a number of interesting facets of protein dynamics but lack the environmental influence of the water that abounds in both crystal and solvent settings. As well as usually being of functional importance, aqueous solvent is also crucial in enabling the protein to preserve its compact native structure by solvating the polar residues on the surface. Several studies in water have appeared [1,2], including Levitt and Sharon's recent simulation of bovine pancreatic trypsin inhibitor (BPTI) in aqueous solution [10]. In those studies, water was found to be important structurally as well as in the dynamics. The current study also demonstrates that water plays a key role in protein dynamics. In addition, the work by Levitt and Sharon provides a comparison of BPTI in a vacuum and in solution, including a number of benchmarks for technical comparison of protein dynamics [10].

Experimental observations [11] show that the mobility of the protein in solution differs from that in the solid state environment. Crystallographic experiments show mobility differences between different crystal forms, presumably due to differing protein-protein contacts [5]. In view of these facts, it is desirable, when feasible, to carry out molecular dynamics simulations of proteins in a condensed phase environment, such as in solution, to consider structural fluctuations and their time scales. The time scales accessible by simulations range from femtoseconds to around a nanosecond, with the upper limit determined by current computer technology and the availability of such resources. This wide range provides a great deal of information; however, it still severely limits the biologically related phenomena that can be observed. The reviews in [1,2] provide detailed catalogs of the various time scales of motions relevant to proteins.

In section 7.2, we describe the general methods used in this dynamic simulation of metmyoglobin in water. Since the principle of solving Newton's equations of motion for a given force field is commonly appreciated, the reader interested in more details is referred to [2]. Following this, we present comparisons of our results with experimental data and with previous computer simulations which do not include explicit water molecules. The final section presents a discussion about the present study.

7.2 Methods

The initial coordinates for the simulation were chosen to be the 2.0 Å resolution metmyoglobin structure obtained by Takano [12] and available from the Protein Data Bank [13]. These coordinates are not the highest resolution coordinates known for metmyoglobin but were used so that comparison with the previous simulations [3,4] might be facilitated. The molecular dynamics simulation was carried out with the AMBER suite of programs [14]. The protein originally contained 1261 heavy atoms and 83 oxygens from waters found in the crystal structure. The hydrogens bonded to polar atoms and water oxygens were added using the standard geometry and stereochemistry requirements. The protein was then solvated using the standard algorithm in AMBER [6,14] with a pre-equilibrated box of water to yield a simulation box of dimensions 56.32 Å by 56.32 Å by 44.45 Å. The solvent in the box comprised the 3045 water molecules generated and the 83 internal waters found in the crystal structure, making a total of 3128 solvent molecules. The system so obtained is about 12 mM in the protein.
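
The quoted concentration can be checked from the box dimensions alone. The short sketch below is an illustration added here, not part of the original protocol; it assumes one protein molecule per periodic box and takes the full box volume, including the volume occupied by the protein itself. The function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molar_concentration(n_molecules, box_lengths_angstrom):
    """Concentration (mol/L) of n_molecules in a rectangular box given in angstroms."""
    lx, ly, lz = box_lengths_angstrom
    volume_litres = lx * ly * lz * 1.0e-27  # 1 cubic angstrom = 1e-27 litre
    return n_molecules / (AVOGADRO * volume_litres)

box = (56.32, 56.32, 44.45)      # box edges quoted in the text, in angstroms
conc_mM = 1000.0 * molar_concentration(1, box)
n_waters = 3045 + 83             # generated waters plus crystal waters
print(f"protein concentration: {conc_mM:.1f} mM, solvent molecules: {n_waters}")
```

Under these assumptions the result falls just under 12 mM, in line with the figure given in the text.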

The SPC water model was used for the solvent molecules [15]. The simulation process was preceded by 500 steps of steepest descent minimization to relieve bad van der Waals contacts and form solvent-protein hydrogen bonds. The leap-frog algorithm was used to propagate the equations of motion in the canonical ensemble (constant number, N, volume, V, and temperature, T) [16]. A time step of 2 fs was used in the integration, as previous simulations found that the results were largely unaffected by this step size when bonds to hydrogen were constrained with SHAKE [1,2,17]. A sharp cut-off at 7 Å was used for the calculation of non-bonded interactions. Since the algorithm used simulated constant temperature during production, any energy drift incurred by this technique would not be measurable. The non-bonded neighbor list was updated every 25 steps. The coordinates and velocities were stored every 20 steps. The system was thermalized by gradual heating for 5 ps at each of 100 K, 200 K and 300 K, with velocity rescaling to the appropriate temperature every 200 fs. Subsequent to this heating period, an NVT equilibration was carried out for 14 ps to let the system approach equilibrium at 300 K. A 150 ps NVT simulation was then carried out on the system and, as we will describe in the later sections, the system actually achieved equilibrium-like properties considerably later in the simulation. While temperature and energy fluctuations occur with this algorithm, the average of the temperature of the system holds constant without drift.
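
The two ingredients named above, leap-frog propagation and velocity rescaling during heating, can be sketched in a few lines. This is a toy illustration in reduced units on a one-dimensional harmonic oscillator, not the AMBER implementation; the function names, the unit Boltzmann constant and the choice of one degree of freedom per velocity entry are all assumptions made for the example.

```python
import math

def leapfrog_step(x, v_half, force, mass, dt):
    """Advance one leap-frog step.

    v_half is the velocity at t - dt/2; the returned velocity is at t + dt/2.
    """
    v_half = v_half + dt * force(x) / mass   # kick with the force at time t
    x = x + dt * v_half                      # drift with the half-step velocity
    return x, v_half

def kinetic_temperature(velocities, masses, k_b=1.0):
    """Instantaneous temperature from equipartition, one degree of freedom per entry."""
    twice_ke = sum(m * v * v for m, v in zip(masses, velocities))
    return twice_ke / (len(velocities) * k_b)

def rescale_velocities(velocities, masses, target_temperature, k_b=1.0):
    """Scale all velocities uniformly so the kinetic temperature equals the target."""
    current = kinetic_temperature(velocities, masses, k_b)
    scale = math.sqrt(target_temperature / current)
    return [scale * v for v in velocities]

# Toy check: unit-mass, unit-frequency oscillator with exact solution x(t) = cos(t).
dt = 0.01
n_steps = 628                                # roughly one period
x, v_half = 1.0, math.sin(dt / 2.0)          # exact velocity at t = -dt/2
for _ in range(n_steps):
    x, v_half = leapfrog_step(x, v_half, lambda q: -q, 1.0, dt)

toy_masses = [1.0, 1.0, 1.0]
rescaled = rescale_velocities([1.0, -2.0, 0.5], toy_masses, 2.0)
```

In a heating schedule of the kind described, a rescaling call like the last one would be applied at fixed intervals (every 200 fs in the text, that is every 100 steps of 2 fs) with the target temperature raised in stages.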

7.3 Spatial and temporal fluctuations

We now consider the motions found and their time scales in some detail. Atomic motions, group motions and structure are considered. The groups for this demonstration are taken to be either amino acid residues, entire helices or the loops that connect the helices. Clearly, other choices exist; these were chosen for convenience in this system.

7.3.1 The approach to equilibrium

A key prerequisite for obtaining statistically meaningful averages and fluctuations from molecular dynamics simulations is a proper equilibration of the system. Several measures have been suggested to test if the system is near equilibrium [1]. First, the overall dynamically averaged structure may be compared with the X-ray structure as shown in Figure 7.1. A system far removed from equilibrium is likely to display considerable structural deviation (e.g. denaturation) and, as is evident from the figure, this is not the case. The overall three-dimensional structure and tertiary motif are essentially preserved in the simulations.

A quantitative estimate of the structural deviation is the mean square (MS) unidirectional deviation defined through the expression

$$\mathrm{MS} = \tfrac{1}{3}\left[\left(\langle \mathbf{r}_i \rangle - \mathbf{r}_j\right)^2\right]$$

where $\mathbf{r}_i$ is the current position vector, $\mathbf{r}_j$ is taken to be the reference configuration position vector, $\langle\ \rangle$ indicates time average and $[\ ]$ structure average over the index $i$. Two independent MS deviations can be computed by using the X-ray structure and the average structure over the first picosecond of dynamics, respectively, as reference structures. The MS deviation is plotted as a function of time in Figure 7.2, where the plot in (a) uses the X-ray structure for the reference coordinates while that in (b) uses the first picosecond dynamics average. The deviation of the protein backbone atoms (the dashed lines in Figure 7.2) is considerably less than that obtained when all atoms (solid lines in Figure 7.2) are considered. The MS deviations plotted were obtained as averages over 10 ps intervals. Smaller and larger time intervals for obtaining averages were tested and yielded similar behavior. X-ray structure coordinates, as opposed to energy-minimized coordinates, were used as a reference, since the backbone changed only slightly from the X-ray structure in the 500-step minimization. The all-atom deviation of that energy-minimized structure as compared to the X-ray coordinates was 0.70 Å.

Figure 7.2 Mean square deviation, MS = $\tfrac{1}{3}\{[\langle \mathbf{r}_1 \rangle - \langle \mathbf{r}_2 \rangle]^2\}$, between initial and successive 10 ps average structures, plotted against time (ps). The calculations, including all of the atoms except explicit hydrogens, are shown in the solid line in each plot and the dashed line represents the backbone atoms (Cα, C and N). (a) The reference structure is the X-ray configuration and (b) the reference structure is the 0-1 ps average configuration.
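
As a concrete reading of the MS deviation, the sketch below averages a set of frames and compares the result with a reference structure. It is a minimal illustration rather than the analysis code used for this work: the function names and the toy coordinates are hypothetical, and the frames are assumed to have been superimposed on the reference already, so that overall translation and rotation do not contribute.

```python
def mean_structure(frames):
    """Time-average a list of frames; each frame is a list of (x, y, z) tuples."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    return [
        tuple(sum(frame[i][k] for frame in frames) / n_frames for k in range(3))
        for i in range(n_atoms)
    ]

def ms_deviation(frames, reference):
    """One third of the atom-averaged squared distance between the
    time-averaged structure and the reference structure."""
    avg = mean_structure(frames)
    total = 0.0
    for (ax, ay, az), (rx, ry, rz) in zip(avg, reference):
        total += (ax - rx) ** 2 + (ay - ry) ** 2 + (az - rz) ** 2
    return total / (3.0 * len(reference))

# Toy data: two atoms, two frames (arbitrary length units).
toy_reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_frames = [
    [(0.5, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(1.5, 0.0, 0.0), (1.0, 3.0, 0.0)],
]
toy_ms = ms_deviation(toy_frames, toy_reference)
```

Restricting the atom list to Cα, C and N before calling `ms_deviation` would give the backbone curve; passing either the X-ray coordinates or an early-time average as `reference` corresponds to the two choices of reference structure described in the text.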

An interesting feature that emerges from our simulation is the apparent lack of equilibration for the first 50 ps of dynamics simulation. In fact, significant drift in the MS deviation is apparent even up to 100 ps (see Figure 7.2). The drift seen in the vacuum simulation of Levy et al. [3] lasted even longer. The zero time depicted in the graph actually represents the end-point of the heating and a 14 ps 'equilibration' period. Clearly, some averages obtained with shorter time scale simulations could be less reliable. However, MS fluctuations and their drift in time are known to be dependent on the details of the simulation; in particular, the use of short-ranged interaction cut-off switches has been shown to change the natural fluctuations of the atoms in vacuum simulations of myoglobin [18]. We do not deal explicitly with this technical aspect here, but [18] provides a framework for cautious interpretation of any protein simulation results.
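
One simple way to look for the kind of drift discussed here is to average a property over consecutive windows and compare early windows with late ones. The following sketch is only an illustration of that idea, with hypothetical function names and an artificial, linearly drifting series; it is not the procedure used to produce Figure 7.2.

```python
def block_averages(values, block_size):
    """Average consecutive, non-overlapping blocks; a trailing partial block is dropped."""
    n_blocks = len(values) // block_size
    return [
        sum(values[b * block_size:(b + 1) * block_size]) / block_size
        for b in range(n_blocks)
    ]

def drift_between_halves(block_means):
    """Mean of the later half of the blocks minus the mean of the earlier half."""
    half = len(block_means) // 2
    early = sum(block_means[:half]) / half
    late = sum(block_means[half:]) / (len(block_means) - half)
    return late - early

# Artificial series that rises steadily, and a flat one for comparison.
drifting = [0.1 * i for i in range(12)]
flat = [2.0] * 10
drifting_blocks = block_averages(drifting, 3)
drift = drift_between_halves(drifting_blocks)
flat_drift = drift_between_halves(block_averages(flat, 5))
```

With coordinates stored every 20 steps of 2 fs, as in section 7.2, a 10 ps window would contain 250 stored frames, so `block_size=250` would mimic the 10 ps averages used in the text.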

It is interesting and useful to compare our simulation results with the previous results of Tilton et al. [4] using a similar potential surface, and the 300 ps vacuum simulation results of Levy et al. [3] using another potential energy function. Table 7.1 presents selected aspects of this comparison. We first note that the overall RMS deviation between structures in our simulations of myoglobin in solution is comparable to that of Levy et al. [3] in vacuo for similar time scales. Second, the movement of the protein in solution is only about 15% less in a shorter time window average as opposed to the vacuum simulation results. It appears that the aqueous solvent helps keep the protein compact and preserves its three-dimensional structure as given by crystallography, while intrinsic surface tension aids the protein in a vacuum. While there are also some protein-protein contacts in the crystal forms, the agreement with the aqueous simulation is reasonable considering the proportion of water in the crystals of globular proteins [8,12,19].

Table 7.1 Root mean square difference between structures (a)

Paired entries are vacuum (c) / solution (d); single entries are available for one simulation only.

Compared structures        All atoms      Backbone (b)   Backbone (α-helices) (b)
X (e) vs. 000-050          2.04           1.49           1.28
X vs. 000-100              2.12 / 2.27    1.81 / 1.69    1.45 / 1.43
X vs. 050-100              2.63           2.01           1.71
X vs. 100-200              2.58           2.23           1.18
X vs. 100-150              2.79           2.08           1.78
X vs. 200-300              2.82           2.45           1.95
X vs. 000-200              2.29           1.98           1.58
X vs. 000-150              2.41           1.80           1.52
X vs. 050-150              2.68           2.02           1.873
X vs. 000-300              2.41           2.09           1.66
X vs. 000-000              2.03 / 1.80    1.76 / 1.42    1.44 / 1.63
000-001 vs. 000-050        1.51           1.67           0.91
000-001 vs. 000-100        1.19 / 1.89    0.94 / 1.38    0.85 / 1.21
000-001 vs. 050-100        2.39           1.76           1.55
000-001 vs. 100-200        1.84           1.51           1.38
000-001 vs. 100-150        2.51           1.85           1.70
000-001 vs. 200-300        2.25           1.91           1.74
000-001 vs. 000-200        1.44           1.16           1.07
000-001 vs. 000-150        2.04           1.52           1.35
000-001 vs. 050-150        2.40           1.78           1.61
000-001 vs. 000-300        1.65           1.37           1.26
000-050 vs. 050-100        1.33           0.97           0.88
000-100 vs. 100-200        1.01           0.95           0.82
050-100 vs. 100-150        0.98           0.55           0.52
000-100 vs. 200-300        1.67           1.43           1.23
000-025 vs. 025-050        0.83           0.56           0.53
250-275 vs. 275-300        0.69           0.54           0.40
000-010 vs. 010-020        1.12           0.88           0.74
130-140 vs. 140-150        0.62           0.43           0.38
000-030 vs. 030-060        1.23           0.86           0.79
050-080 vs. 120-150        1.21           0.67           0.58
090-120 vs. 120-150        0.81           0.55           0.52

(a) The root mean square difference, $[\langle \Delta x^2 + \Delta y^2 + \Delta z^2 \rangle]^{1/2}$. (b) Backbone atoms are Cα, C and N. (c) Data from [3]. (d) Present study. (e) X indicates X-ray coordinates. Calculated coordinates are averaged over the indicated intervals.

Figure 7.3 Mean square fluctuation (in Å²), where $\langle \Delta r^2 \rangle = \tfrac{1}{3}\langle \Delta x^2 + \Delta y^2 + \Delta z^2 \rangle$, as a function of residue number; the helices A to H are marked along the top of each panel. (a) 0-50 ps average is dotted line, 50-100 ps average is dashed line, and 100-150 ps average is solid line. (b) Average for 50-150 ps is solid line, X-ray data of Phillips [19] is dotted line. (c) 0-150 ps average is solid line, Phillips X-ray data is dotted line. (d) 75-150 ps average is solid line, Phillips X-ray data is dotted line.

Figure 7.3(a, b) depicts the mean square fluctuations as a function of residue number. The dotted curve in Figure 7.3(a) is the average of the 0-50 ps simulation period, the dashed curve that of the 50-100 ps period and the solid curve that of the 100-150 ps period. An inspection of the curves reveals a quantitative difference between the 0-50 ps simulation period and the 50-100 ps and 100-150 ps periods, indicating that the system had not achieved equilibrium by this measure after the first 50 ps of simulation time. Figure 7.3(b), where the dotted line curve is the average over the entire 150 ps time period and the solid line curve is the 50-150 ps average, highlights this point. An additional experimental comparison is given in Figure 7.3(c) (see section 7.3.2). This is also apparent from Figure 7.4(a), (b), where the mean square fluctuation is plotted as a function of distance from the center of mass of the protein. Figure 7.4(a) displays the average structure obtained over the entire 150 ps simulation period, while Figure 7.4(b) displays the average over the 50-150 ps simulation period. The data in Figure 7.4(a), (b) reflect the typical packing features of globular proteins. Given the dimensions of myoglobin, the core of the protein can be considered to be of dimension less than 15 Å across and this is evident in the dynamical evolution and fluctuations of the protein. As seen in Figure 7.4(a), (b), there are only small fluctuations and deviations from the average structure in the packed core of the protein.
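
The per-atom mean square fluctuation plotted in Figures 7.3 and 7.4 can be written down directly from its definition. The sketch below is a minimal illustration with a hypothetical function name and toy coordinates; it assumes the frames have already been superimposed so that overall translation and rotation of the protein do not contribute.

```python
def ms_fluctuation(frames):
    """Per-atom (1/3)<dx^2 + dy^2 + dz^2> about each atom's time-averaged position.

    Each frame is a list of (x, y, z) tuples, one per atom.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        mean_sq = sum(
            (f[i][k] - mean[k]) ** 2 for f in frames for k in range(3)
        ) / n_frames
        result.append(mean_sq / 3.0)
    return result

# Toy data: atom 0 moves along x, atom 1 stays fixed.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
toy_fluctuations = ms_fluctuation(toy_frames)
```

Averaging the returned values over the atoms of each residue would give a per-residue profile, and pairing each value with the atom's distance from the center of mass would give a scatter of the kind shown in Figure 7.4; both steps need an atom-to-residue map that is not shown here.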

Figure 7.4 Mean square fluctuation (in Å²), where $\langle (\Delta r)^2 \rangle = \tfrac{1}{3}\langle \Delta x^2 + \Delta y^2 + \Delta z^2 \rangle$, as a function of distance from the center of mass (Å). The circles are backbone (Cα, N) atoms of helices, the crosses are the backbone atoms of the loop regions. (a) The 0-150 ps average structure and (b) the 50-150 ps average structures are shown. Region 1 is composed of atoms in helices A, B and C. Region 2 is composed of atoms in D, E and H.

Table 7.2 Radius of gyration (a), in Å

Structure (b)   Vacuum (c)   Solution (d)
X-ray           15.04        15.17 (15.38)
000-001         14.15        14.77 (14.71)
000-050         -            14.88 (14.84)
000-100         14.07        15.04 (15.09)
050-100         -            15.14 (15.14)
100-200         13.94        -
100-150         -            15.37 (15.39)
200-300         13.88        -
000-150         -            15.15 (15.14)
050-150         -            15.28 (15.28)
000-300         14.00        -

(a) The radius of gyration (see text). (b) Calculated coordinates are averaged over the indicated intervals. Experimental data for the previous study [3] is different from that used in present study. (c) Data from [3]. Atoms included are Cα, C, N and O. (d) Present study. Atoms included are Cα, C, N and O; the value for all protein atoms included in the simulation is given in parentheses.

The radius of gyration (Rg) of the protein can also be used to assess the state of equilibration of the system. It is a measure of the effective size of the protein and is given by the expression

$$R_g^2 = \frac{\sum_j m_j\,[\mathbf{r}_j - \mathbf{r}_{\mathrm{CM}}]^2}{\sum_j m_j}$$

where $\mathbf{r}_j$ is the position of the $j$th atom, $m_j$ is its mass and $\mathbf{r}_{\mathrm{CM}}$ is the pre-averaged center of mass vector. The value obtained from the crystal structure data [10] is 15.17 Å. The values obtained during the simulation are presented in Table 7.2. Figure 7.5 shows the variation of the radius of gyration with respect to time during the simulation. The changes in the radius of gyration can be viewed as an indication of the approach of the system toward equilibrium. Analogous to the results of the 300 ps in vacuo simulation of Levy et al. [3], our result shows an initial slight contraction of the protein as evidenced by the decreased radius of gyration. However, the radius subsequently recovers to nearly its experimental value. This recovery, which took nearly 100 ps, indicates two important points. One is that the preparation steps of minimization and slow heating from a lower temperature may have been good for preserving local structure but contracted the overall global structure. The other point is that the force field was sufficiently robust that it could recover from this method of preparation. While this is a standard method of preparing the system [2], this measurement seems to indicate that other methods should be explored. Analyses of individual residue motions in the exterior of the protein show that this decrease is due, in part, to the folding up of the side-chains which would have formed inter-protein contacts in the crystal structure and partly due to rearranging the interior free volume. However, unlike the vacuum simulation with scaled electrostatics [3,4], our simulation averages level off at a somewhat higher Rg value, very close to the experimentally determined number, reflecting the balance between the thermal motions and solvent-protein interactions. This is an important aspect of the simulation since the dynamical motions on the exterior can propagate to the interior of the protein, as shown in more detail below.

Figure 7.5 (a) Radius of gyration, in Å, as a function of time (ps). The experimental value is 15.2 Å. (b) The corresponding power spectrum over two different time intervals (0-150 ps, dotted line; 75-150 ps, solid line).
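
For readers who want to reproduce a radius of gyration of this kind from a set of coordinates, a mass-weighted version can be sketched as follows. The function name and toy inputs are hypothetical, and the center of mass is computed here from the same coordinates that are passed in, which is an assumption; the text refers to a pre-averaged center of mass.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of (x, y, z) coordinates."""
    total_mass = sum(masses)
    # Center of mass of the supplied coordinates.
    com = [
        sum(m * r[k] for m, r in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted mean squared distance from the center of mass.
    weighted_sq = sum(
        m * sum((r[k] - com[k]) ** 2 for k in range(3))
        for m, r in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Toy checks: two equal masses two units apart, then unequal masses four units apart.
rg_equal = radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0])
rg_unequal = radius_of_gyration([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)], [1.0, 3.0])
```

Applying the function to successive window-averaged structures, restricted to the atom sets listed under Table 7.2, would give a time series comparable in spirit to Figure 7.5(a).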

The analysis thus far presented reveals that this model of myoglobin in a solution of water approaches equilibrium, as measured by the fluctuations using the chosen algorithm, only after at least 75-100 ps of uninterrupted simulation after thermalization and a brief initial attempt to equilibrate. Hence, our most reliable time averages will be calculated during the last 75-100 ps of simulation.

7.3.2 Structure and dynamics

All averages of this analysis, unless stated otherwise, are obtained from the last 100 ps of the trajectory. The comparison of the X-ray structure and the averaged structure of backbone α carbons in Figure 7.1 showed that the integrity of the three-dimensional structure of the protein is preserved in the simulation. This substantiates the notion that protein structures in solution are similar to crystalline structures overall [20,21] but tend to explore a wider range of structures. Analysis reveals that the EF and GH loops, which are more accessible to the solvent, are somewhat more extended in our solution structure. The helices A, B and C are also slightly displaced, on average, in the simulated solution structure after equilibration, while the helix that encompasses the active site porphyrin retains a structure remarkably close to the crystal configuration. The proximal histidine (His 93), however, shows significant torsional motion and has rotated on average by approximately 45°.

Mean square (MS) fluctuations of atoms have been used as a measure of the internal motions in a protein [13]. The total RMS deviation for various significant structures is given in Table 7.3. A plot of the MS deviation as a function of the residue number is shown in Figure 7.3. From a comparison of the starting X-ray structure and the simulated solution data (Figure 7.2(a) dotted and solid lines), one can draw several conclusions. There is a greater apparent degree of movement in the model solution than in the experimental [12] crystalline system. While all simulations show displacement from the X-ray data, some of this movement has also been inferred from solution experiments [2,21]. This can be attributed to several factors.

The nature of forces experienced by exposed residues depends on the nature of the environment, e.g. crystalline, aqueous, or membrane-bound. The effective potential surface or work surface is known to be significantly altered for atoms with solvent exposure [1,2]. The environment in a crystal consists of the mother liquor in a compact regular lattice of myoglobin protein molecules in contact with each other. In our solution studies, there is one myoglobin in a box of water which is large enough that there are no constraining forces due to packing contacts in a crystal. The particular geometry of our box and the protein ensures at least three layers of water between the protein and the edge of the box and, therefore, at least six layers until the next periodic image of the protein. The significance of the environment is best seen by comparison of three separate parts of the protein. The first part is the C-D region of the protein. These two helices and the interconnecting loop are on the exterior of the protein (Figure 7.1). The second part is in the N-terminal region. Here the simulated solution atoms move somewhat less than the experimental B factors for our starting coordinates [12] indicate for the crystal. The third part, which behaves differently, comprises the E-H helices. This constitutes a portion of the active site of the protein because the metalloporphyrin is connected to His 93 in the F helix and has non-bonded interactions with the E and G helices. This region is seen to have little large scale motion in the simulation, supporting the view that active sites in proteins are less susceptible to large deformations [2]. The only region that shows large deviations is the GH loop, which is on the exterior of the protein. Previous vacuum simulations on myoglobin [3,4] show qualitatively similar results to those shown in Figure 7.3, although explicit water simulations seem to permit somewhat greater mobility of the outer regions of the protein [1,2].

Table 7.3 Root mean square atomic displacements, ⟨(ΔR)²⟩^(1/2) ᵃ

Time intervalᵇ    Backboneᶜ                 All atomsᵈ
                  Vacuumᵉ     Solutionᶠ     Solutionᶠ
000-050           -           0.64          1.09
000-100           0.829       0.71          1.19
050-100           -           0.80          1.16
100-200           0.829       -             -
100-150           -           0.48          0.83
200-300           0.707       -             -
0-100             0.828       0.71          1.19
0-150             -           0.70          1.22
50-150            -           0.63          1.06
0-150ᵍ            -           0.40          0.67
50-150ʰ           -           0.39          0.65
0-200             0.960       -             -
0-300             1.071       -             -
0-300ⁱ            0.590       -             -
Experimentʲ       0.650       0.63          -

ᵃ The root mean square atomic displacement.
ᵇ Picoseconds.
ᶜ Backbone atoms are Cα, C and N.
ᵈ All atoms of the protein that are included in the simulation.
ᵉ Data from [3].
ᶠ Present study.
ᵍ Value obtained by averaging 15 10-ps averages of ⟨(ΔR)²⟩^(1/2).
ʰ Value obtained by averaging 10 10-ps averages of ⟨(ΔR)²⟩^(1/2).
ⁱ Value obtained by averaging 12 25-ps averages of ⟨(ΔR)²⟩^(1/2).
ʲ Experimental data for the previous study [3] are different from those used in the present study [19]; Phillips (temperature factor)^(1/2) × √3 = 0.63.

[Plot not reproduced. Panel (a): number of protein-protein H-bonds versus time (ps). Panel (b): intensity versus μ (ps⁻¹).]

The problem of crystal contacts may be analyzed in a straightforward manner. It is instructive to use both the recent crystallographic data of Phillips et al. [19], in which myoglobin was crystallized in a P6 space group, and the ansatz by Phillips [5] that the B factors in regions unaffected by crystal contacts are more representative of the 'natural' motion in the proteins, to make comparisons with our simulation data. Following Phillips, by taking the maximum of the B factors for a given residue from either the previous P2₁ structure or the P6 structure and comparing with the fluctuations from our last 50 ps of simulation, the majority of the discrepancies noted in the single crystal comparison above are alleviated. In particular, comparing our results with those of Phillips, we find considerable improvement in the agreement in the B and C helix region and between the G and H helices (see Figure 7.3(c)). While this is qualitatively similar to what was previously found [5] for a comparison with the longer vacuum trajectory of Levy et al. [3], it is much better quantitatively. Such a comparison is not exact, owing to experimental differences in proteins, refinements and mother liquors [11,19] and, on the simulation side, in parameters and environment. Nevertheless, the major bias, due to contacts in a particular crystal form, is significantly reduced in this comparison method.
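The comparison described above can be sketched in a few lines. The sketch assumes the standard isotropic relation B = (8π²/3)⟨(ΔR)²⟩ between a crystallographic B factor and the mean square displacement, and takes the larger B factor for each residue from two crystal forms. The function names and the B-factor values are made up for illustration and are not data from either structure.

```python
import math

def b_factor_to_ms(b):
    """Mean square displacement <(dR)^2> (A^2) from an isotropic B factor (A^2).

    Uses B = (8 * pi^2 / 3) * <(dR)^2>.
    """
    return 3.0 * b / (8.0 * math.pi ** 2)

def max_b_per_residue(b_form_1, b_form_2):
    """Take the larger B factor for each residue from two crystal forms."""
    if len(b_form_1) != len(b_form_2):
        raise ValueError("both crystal forms must list the same residues")
    return [max(b1, b2) for b1, b2 in zip(b_form_1, b_form_2)]

# Made-up B factors (A^2) for three residues in two crystal forms.
b_p21 = [10.0, 25.0, 12.0]
b_p6 = [14.0, 18.0, 12.0]

b_max = max_b_per_residue(b_p21, b_p6)
ms_expt = [b_factor_to_ms(b) for b in b_max]
```

The resulting `ms_expt` values are what one would set against the simulated mean square fluctuations, residue by residue.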

Another way to quantify the atomic mobility is to look at the MS deviation as a function of distance from the center of mass of the protein (Figure 7.4(a), (b)). Here the deviation of all the backbone atoms is plotted. The statistics obtained from Figure 7.4(b) are believed to be more accurate (see section 7.3.1). As can be seen, in general, atoms farther away from the center of the protein have a larger degree of motion. This reflects the behavior characteristic of all compact globular proteins. In fact, significant solvent penetration is found in the exterior regions. Also, we note that even though all of the loop sections (indicated by crosses) are on the outside of the protein (15-22 Å), they are not uniformly flexible. There are two labeled regions, indicated by '1' and '2' in Figure 7.4(b). These are two backbone regions which have fluctuations of more than ≈ 0.50 Å². Region 1 contains atoms from the helices B and C. It has been stated previously that these regions are far from the active site of the protein and are more accessible to the solvent. The other region, 2, contains some backbone atoms, from helices D, E and H, that are not close to the active site region.

Figure 7.6 (a) Number of protein-protein hydrogen bonds (based on geometry) as a function of time. This is not restricted to backbone atoms but includes all of the polarizable hydrogens in the protein. Notice the oscillation period of about 50 ps. The 'noise' of the system is actually statistical thermal fluctuations. (b) The corresponding power spectrum.

[Plot not reproduced. Panel (a): number of solvent-protein H-bonds versus time (ps). Panel (b): intensity.]

Figure 7.8 Surface area (scaled by π) of the protein vs. time.

The above analysis shows the movement of individual atoms or residues. These movements do not occur in isolation but form a part of the movement of the protein as a whole. There are several ways in which this correlated motion can be analyzed [2]. In an effort to uncover the mechanical driving force, we have chosen to focus on the strong stabilizing force that holds most aqueous phase proteins in their native configuration, the solution hydrogen bonds. We consider the number of protein-protein hydrogen bonds, using a conservative geometrical definition, as a function of time over the simulation period (Figure 7.6(a)). This plot reveals the presence of a periodicity of approximately 50-60 ps. This periodicity is also revealed by the radius of gyration power spectrum computed over the last half of the simulation (Figure 7.5(b), solid line). While times beyond the time for a sound wave to traverse the box and re-enter via the periodic boundary conditions (3-4 ps) are generally affected by the system's artificial periodicity, these motions are far beyond that time scale and not likely to be a simple artifact of the boundary conditions.

Figure 7.7 (a) Number of solvent-protein hydrogen bonds (based on geometry) as a function of time. The beginning of the curve has a different slope from the rest of the curve, which indicates that the solvent conformation is not in equilibrium at the beginning of the simulation. Also notice the oscillation period of about 100 ps. The 'noise' of the system is actually statistical thermal fluctuations. (b) The corresponding power spectrum.

[Plot not reproduced. Panels (a) and (b): surface area (Å²) versus residue number, with helices A-H marked.]
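A geometrical hydrogen bond definition of the kind mentioned above can be sketched as follows. The text does not state the criteria used, so the cutoffs here (donor-acceptor distance of at most 3.5 Å and a donor-hydrogen-acceptor angle of at least 120°) are assumptions chosen only for illustration, as are the function names and the toy coordinates.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _norm(v):
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Geometric test on the D...A distance and the D-H...A angle.

    The default cutoffs are assumed values, not those of the study.
    """
    if _norm(_sub(acceptor, donor)) > max_dist:
        return False
    hd = _sub(donor, hydrogen)      # vector from H to donor
    ha = _sub(acceptor, hydrogen)   # vector from H to acceptor
    cos_angle = (hd[0] * ha[0] + hd[1] * ha[1] + hd[2] * ha[2]) / (_norm(hd) * _norm(ha))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) >= min_angle

def count_hydrogen_bonds(donor_pairs, acceptors):
    """Count donor-hydrogen...acceptor combinations that pass the test in one frame."""
    count = 0
    for donor, hydrogen in donor_pairs:
        for acceptor in acceptors:
            if is_hydrogen_bond(donor, hydrogen, acceptor):
                count += 1
    return count

# Toy frame: one donor-hydrogen pair and three candidate acceptors.
donors = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))]
acceptors = [(2.9, 0.0, 0.0), (0.0, 3.0, 0.0), (5.0, 0.0, 0.0)]
n_bonds = count_hydrogen_bonds(donors, acceptors)
```

Applying `count_hydrogen_bonds` to every saved frame would give a time series of the kind whose periodicity is discussed in the text. Only the first acceptor passes here: the second fails the angle test and the third the distance test.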

To obtain the frequencies related to the hydrogen bonding, we computed the power spectrum of the number of protein-protein hydrogen bonds versus time (Figure 7.6(b)). The dominant period is around 60 ps, as the cursory inspection above indicated, with contributions at both lower and higher frequencies evident. As the solvent not only competes for the hydrogen bonds of the protein but also serves as the dynamic and thermodynamic bath for the system, we consider also the hydrogen bonds that the protein makes with the proximal solvent as a function of time. A plot is shown (Figure 7.7(a)) for the solvent-protein hydrogen bonds. After the initial equilibration, this plot shows a periodicity in excess of 100 ps. Once again, the power spectrum was computed and the dominant periods are near 200 ps and 50 ps. The lower frequency is near the limit of detection for our method. There is a visible overlap in the peaks shown in Figure 7.6(b) with those of Figures 7.5(b) and 7.7(b), especially near 50-60 ps (0.10-0.12 ps⁻¹). Thus, energy would transfer between those modes of motion in a facile manner, and so the protein is seen to be effectively mechanically driven by the solvent. This motion is reflected in the radius of gyration (Figure 7.5(a)) and, therefore, in the solvent accessible surface area of the protein (Figure 7.8).
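The step from a time series to its dominant period can be illustrated with a plain discrete Fourier transform. The method actually used to obtain the power spectra is not described in the text, so this is only a sketch under assumptions: the series is equally spaced, its mean is removed first, and the frequency axis is taken to be an angular frequency 2π/T. With that convention a 50-60 ps period corresponds to roughly 0.10-0.13 ps⁻¹, close to the range quoted above, which suggests but does not confirm that μ is an angular frequency. The function names and the synthetic series are illustrative.

```python
import math

def power_spectrum(series, dt):
    """Return (angular_frequencies, intensities) for a mean-removed series.

    series: equally spaced samples; dt: sample spacing in ps.
    The angular frequency of bin k is 2*pi*k / (n*dt), in ps^-1.
    """
    n = len(series)
    mean = sum(series) / n
    x = [s - mean for s in series]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        re = sum(x[j] * math.cos(2.0 * math.pi * k * j / n) for j in range(n))
        im = -sum(x[j] * math.sin(2.0 * math.pi * k * j / n) for j in range(n))
        freqs.append(2.0 * math.pi * k / (n * dt))
        power.append((re * re + im * im) / n)
    return freqs, power

def dominant_period(series, dt):
    """Period (ps) of the strongest non-zero-frequency component."""
    freqs, power = power_spectrum(series, dt)
    k = max(range(len(power)), key=power.__getitem__)
    return 2.0 * math.pi / freqs[k]

# Synthetic series: 150 ps sampled every 1 ps with a 50 ps oscillation.
dt = 1.0
series = [180.0 + 10.0 * math.cos(2.0 * math.pi * t * dt / 50.0) for t in range(150)]
period = dominant_period(series, dt)
```

For a 150 ps series the lowest resolvable bins correspond to periods of 150, 75 and 50 ps, which is one way to see why periods near or beyond the trajectory length sit at the limit of detection.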

There is a complex relationship between the accessible volume of the protein and the movement and correlations of the solvent surrounding the protein. Solvent accessible surface areas of the protein, calculated from the Lee and Richards [22] algorithm, are plotted as a function of time in Figure 7.8, and this qualitatively supports the notion of low frequency collective motions, possibly driven by solvent fluctuations. The time-averaged surface area, as a function of residue number, is also presented in Figure 7.9(a) for the first 50 ps and in Figure 7.9(b) for the last 100 ps for comparison. While residues in Figure 7.9(a) with a large solvent exposure generally also have a large exposure in Figure 7.9(b), the relative intensities are seen to change, in some cases significantly.

Figure 7.9 Surface area vs. residue number. (a) An average over the first 50 ps. (b) An average over the last 100 ps.

The final method of analysis of overall motion of the protein presented here probes how the relative angles and positions of the various helices in myoglobin move with respect to themselves and with respect to other helices in the protein. The defined position of each helix is obtained by averaging the positions of all the Cα atoms in that helix. The angle between pairs of helices is calculated as the angle between the helix axes. The axis for each helix is obtained by joining the points that are the averaged positions of the first three and last three Cα atoms, respectively. A similar analysis of a vacuum simulation of myohemerythrin helices has been reported [23].

[Plot not reproduced. Panels (a) and (b): curves labeled by helix (C, H2, D, A, G2, G1, B), plotted against time (ps).]
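The helix position and axis definitions given above translate directly into code. The following sketch implements them as stated (center as the mean of all Cα positions, axis as the vector joining the mean of the first three and the mean of the last three Cα positions); the function names, the data layout and the toy coordinates are assumptions for illustration.

```python
import math

def _mean(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def helix_center(ca_positions):
    """Helix position: average of all C-alpha positions in the helix."""
    return _mean(ca_positions)

def helix_axis(ca_positions):
    """Axis vector from the mean of the first three to the mean of the last three C-alphas."""
    start = _mean(ca_positions[:3])
    end = _mean(ca_positions[-3:])
    return tuple(end[k] - start[k] for k in range(3))

def angle_between(u, v):
    """Angle in degrees between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    cos_t = max(-1.0, min(1.0, dot / (nu * nv)))  # guard against rounding
    return math.degrees(math.acos(cos_t))

def distance(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Toy "helices": six points along x, and six points along y at height z = 5.
helix_1 = [(float(i), 0.0, 0.0) for i in range(6)]
helix_2 = [(0.0, float(i), 5.0) for i in range(6)]

angle = angle_between(helix_axis(helix_1), helix_axis(helix_2))
separation = distance(helix_center(helix_1), helix_center(helix_2))
```

Evaluating `angle` and `separation` for each frame, with one helix held as the reference, would give time series of interhelix angles and distances.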

Figure 7.10 shows how the distance between each of the helices in myoglobin and helix F changes as a function of time. As can be seen, there are fluctuations of up to 2 Å in the distances between the helices. The relative motions of the helices can be divided into three types, based on their extent and time scales. Most of the relative distances between helix F and the other helices (A, B, D, G and H2) display a similar pattern which indicates two time periods of approximately 70 ps and 25 ps, respectively, each having the same amplitude. The other three helices show different patterns. Helix E has approximately the same pattern, but shorter period fluctuations occur at an earlier time; H1 has a larger amplitude in the initial part and then has approximately the same pattern as A, B, D, G and H2; finally, C shows greater movement in the beginning of the equilibration run and then shows relatively little change after ≈ 85 ps. Thus, these plots again reveal the difficulty in assessing when the trajectory can be called representative of equilibrium.

In addition to considering the change of position as a function of time, one can also analyze the changes in the direction of the axes. Figure 7.11 explores the change in relative angles between helix F and the other helices. The periodicity is even more striking than in the case of helix positions, suggesting motion of the helices as intact structural units [2]. The internal motions must be preserved to within the overall RMS deviations but the angles may be correlated with each other, allowing for compensating internal counter-rotations which preserve the total angular momentum. Once again, three types of curves are seen. The first type of movement is relatively flat, i.e., the two helices stay at the same orientation with respect to each other during the simulation. This occurs in the relative F-B angle. The second type of movement displays a period in excess of 100 ps. This occurs for the F-G1 angle. The third, which is the most common, has a period of approximately 100 ps. The trajectory in this simulation shows two different motions having this periodicity, with opposite phases. The first contains the F-E, F-D, F-H1 and F-H2 angles, and the second contains the F-A, F-C and F-G2 angles. The relative amplitudes of the movements are similar. This regular movement appears to be related to the collective modes already discussed (see also Figures 7.5-7.7). However, motions with periods at or greater than 100 ps may be affected by the approach of the simulated average to equilibrium and should properly only be calculated from a considerably longer trajectory or a genuine ensemble of trajectories.

Figure 7.10 Distance between the center points (see text) of two helices as a function of time. Only the data for the last 100 ps are shown. All distances are to helix F (the active site of the protein). The helices closer to F are shown in (a), from top to bottom C, E, G1, H2, and the helices further away from F are shown in (b), from top to bottom D, A, G2, H1, B. Changes of distances up to ≈ 2 Å occur in this system. Note that helices G and H have been split into two parts because of bends in those α-helices.

[Plot not reproduced. Panels (a) and (b) versus time (ps); curves labeled E, D, B, C in (a) and G1, G2, A, H1, H2 in (b).]

Some general statements concerning the global movement of myoglobin can be made by analyzing the plots of the radius of gyration (Figure 7.5), the number of solvent-protein hydrogen bonds (Figure 7.7), the number of protein-protein hydrogen bonds (Figure 7.6), and the distances (Figure 7.10) and relative angles (Figure 7.11) between helix F and the other seven helices in the protein. From the information on the relative interhelix distance (Figure 7.10) and the radius of gyration, it seems that there is a large scale breathing motion consisting of both contraction and expansion of the protein with periods longer than 50 ps. Since helix C is solvent-exposed, it is not surprising that its motion is damped relative to the rest of the protein. Helices E and H1 are similar but show differences, presumably due to the presence of the heme group near E. It is interesting to note that the number of solvent-protein hydrogen bonds (Figure 7.7) displays a periodicity akin to that seen in the correlation of relative helix positions (Figure 7.10). This suggests a possible correlation between intra-protein and protein-solvent motions.
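The radius of gyration used above as a signature of the breathing motion is simple to compute per frame. The following is a minimal sketch assuming the usual mass-weighted definition; whether the study weighted by mass is not stated, and the function names, data layout and toy frames are illustrative only.

```python
import math

def radius_of_gyration(positions, masses=None):
    """Mass-weighted radius of gyration of a set of atoms (unit masses by default)."""
    n = len(positions)
    if masses is None:
        masses = [1.0] * n
    total = sum(masses)
    # Center of mass.
    com = [sum(m * p[k] for m, p in zip(masses, positions)) / total for k in range(3)]
    # Mass-weighted mean squared distance from the center of mass.
    sq = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for m, p in zip(masses, positions)
    )
    return math.sqrt(sq / total)

def rg_time_series(frames, masses=None):
    """Radius of gyration for each frame of a trajectory."""
    return [radius_of_gyration(f, masses) for f in frames]

def square(half_width):
    """Four atoms on the corners of a square in the xy plane (toy geometry)."""
    h = half_width
    return [(h, h, 0.0), (h, -h, 0.0), (-h, h, 0.0), (-h, -h, 0.0)]

# Toy "breathing" motion: the square expands and then contracts.
frames = [square(1.0), square(1.5), square(1.0)]
rg = rg_time_series(frames)
```

An expansion followed by a contraction shows up as a rise and fall in `rg`, which is the kind of signal that a power spectrum of the radius of gyration would pick up as a low frequency mode.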

The relative angles that the helices make with helix F are more complicated to analyze. Here, a period of at least 100 ps is seen. However, the motion of the helices changes relative to the phase of the angle, essentially in opposing fashion. The two exceptions to this are helices B and G1, which show a slower change than the rest of the protein. This suggests the presence of a complicated twisting or multihinge movement in the protein where the hinges are at the solvent-exposed CD loop, around the active site, and the exposed GH loop. This internal motion shows a relationship with the number of protein-protein hydrogen bonds (Figure 7.6). Although this similarity is not as striking as the previous comparison, it is tempting to compare the above to the overall collective modes seen in the protein in aqueous solvent.

7.4 Conclusions

The example study we have presented emphasizes a number of interesting correlations between the fluctuations of the protein and various modes within the solvent. In particular, the fluctuations of hydrogen bonding in the solvent and to the protein may be responsible for maintaining a number of motions with periods below 50 ps.

The gross features of the present solution study are in general agreement with previous vacuum simulations [3,4] and crystallographic structure data [10,15]. However, the present study does indicate that the relative movements of the protein can, in specific regions, be different in solution than in other environments. More specifically, as compared to previous vacuum simulations, our calculations show a greater degree of overall mobility in the solvent-exposed regions and a smaller degree of motion in the active site regions. Also, as expected, non-helical exposed loop regions display greater dynamical mobility than helical regions.

Figure 7.11 Relative angle (see text) between helix F and the other seven helices as a function of time for the equilibrated data. The helices from top to bottom are (a) E, D, B and C and (b) G1, G2, A, H1 and H2. Changes of relative angles up to ≈ 25° occur in this system. Note that helices G and H have been split into two parts because of bends in those α-helices.

A technical point that emerges from the simulations is the long time that is necessary to equilibrate proteins in solution simulations. This was anticipated from previous vacuum simulations, where there were some legitimate questions about whether equilibrium was ever truly achieved [3]. Statistics obtained from shorter simulations are likely to yield quantitatively incorrect results. The key role played by the solvent in modulating the motions was also examined and our results, in conjunction with other recent observations [5], point to the need to consider the importance of the environment in theoretical studies. Simulations of the length studied here, and longer, provide a unique means of obtaining detailed atomic model information on long wavelength motions that are accessible to modern experiments.

Acknowledgements

The authors would like to thank the Robert A. Welch Foundation and NIH for support. Professor G. Phillips and Drs. P.E. Smith and R. Loncharich are thanked for a careful reading of the manuscript and many helpful conversations. S. Subramanian was partially supported by funds from the Institute for Molecular Design of the Chemistry Department at the University of Houston. Calculational support was provided, in part, by an instrumentation grant from NSF and by the NCSA. B.M. Pettitt was an Alfred P. Sloan fellow.

References

1. M. Karplus and J.A. McCammon (1983) Annu. Rev. Biochem. 52, 263-300.

2. C.L. Brooks, M. Karplus and B.M. Pettitt (1988) in Advances in Chemical Physics, Vol. 71, eds. S. Rice and I. Prigogine, Wiley, New York; J.A. McCammon and S. Harvey (1987) Dynamics of Proteins and Nucleic Acids, Cambridge University Press, Cambridge.

3. R.M. Levy, R.P. Sheridan, J.W. Keepers, G.S. Dubey, S. Swaminathan and M. Karplus (1985) Biophys J. 48, 509-518.

4. R.F. Tilton, U.C. Singh, I.D. Kuntz and P.A. Kollman (1988) J. Mol. Biol. 199, 195-211.

5. G.N. Phillips (1990) Biophys J. 57, 318-383.

6. See for instance: S.J. Weiner, P.A. Kollman, D.A. Case, U.C. Singh, C. Ghio, G. Alagona, S. Profeta and P. Weiner (1984) J. Am. Chem. Soc. 106, 825-833.

7. L.J. Kagen (1973) Myoglobin: Biochemical, Physiological and Clinical Aspects, Columbia Press, New York.

8. J.C. Kendrew, R.E. Dickerson, B.E. Strandberg, R.G. Hart, D.R. Davies, D.C. Phillips and V.C. Shore (1960) Nature 185, 422-427.

9. K. Kuczera, J. Kuriyan and M. Karplus (1990) J. Mol. Biol. 213, 351.

10. M. Levitt and R. Sharon (1988) Proc. Natl. Acad. Sci. USA 85, 7557.

11. W.A. Gilbert, J. Kuriyan, G.A. Petsko and D.R. Ponzi in Structure and Dynamics: Nucleic Acids and Proteins, eds. E. Clementi and R.H. Sarma, Adenine Press, New York, 405-420; F. Parak, E.N. Frolov, R.L. Mossbauer and V.I. Goldanskii (1981) J. Mol. Biol. 145, 825-833.

12. T. Takano (1977) J. Mol. Biol. 110, 537-568.

13. F.C. Bernstein, T.F. Koetzle, G.T.B. Williams, E.F. Meyer, M.D. Brice, J.R. Rodgers, O. Kennard, T. Shimanouchi and M. Tasumi (1977) J. Mol. Biol. 112, 535.

14. P.K. Weiner and P.A. Kollman (1981) J. Comp. Chem. 2, 287; S.J. Weiner, P.A. Kollman, D.T. Nguyen and D.A. Case (1986) J. Comp. Chem. 7, 230.

15. H.J.C. Berendsen, J.P.M. Postma, W.F. van Gunsteren and J. Hermans in Intermolecular Forces, ed. B. Pullman, D. Reidel, Dordrecht, p. 331.

16. L. Verlet (1967) Phys. Rev. 159, 98.

17. D.A. Pearlman and P.A. Kollman in Computer Simulations of Biomolecular Systems, eds. W.V. Gunsteren and P. Weiner, Escom, Leiden, p. 101.

18. R.J. Loncharich and B.R. Brooks (1989) Proteins 6, 32.

19. G.N. Phillips, R.M. Arduini, B.A. Springer and S.G. Sligar (1990) Proteins: Structure, Function, and Genetics (in press).

20. M.W. Makinen and A.L. Fink (1977) Annu. Rev. Biophys. Bioeng. 6, 301-343.

21. K. Wuthrich (1989) Science 243, 45-50.

22. B.K. Lee and F.M. Richards (1971) J. Mol. Biol. 55, 379.

23. D. Rojewska and R. Elber (1990) Proteins 7, 265.

Page 207: Principles of Molecular Recognition

Index

acceptor 14 molecules 24 numbers 61

ace inhibitor 145 acetylcholine esterase (ACHE) 141 achiral rhodium catalyst 84 active electrons 112 active site

myoglobin 168 protein 181-182 structure 121

affinity 142 agonists, partial 142-147 alcohols 47 alkaline earth metals 124 allylic acetate 103 allylic alkylation 79, 103 allyl ligand 104 Alzheimer's disease 141 amides 47 amines 47, 50 amino acids

histidine 93, 168 ligands 97 modifications 109 mutations 108

amino alkyl ferrocenyl phosphine complexes 97

ammonia 50 angiotensin II 145 angular

dependence 35 distortion 21 geometry

rules 17, 22-24 solid state 41

antagonists 142-144 anti-cooperativity 45, 56, 63, 71 anti-drugs

AIDS 152 aprotic solvents 48 aromatic nitroanions 54 aryl sulphonyl methyl perchlorates 70 aspartate proteases 154 asymmetric hydroformylation 79 atomic mobility 183 atropisomerically chiralligand, BINAP 87 attractive force

long/short range 6 stretched rubber band 4

attractive interaction 14 axial chirality 97

bacterial transpeptidases 161 basic probe 64 bending modes 21 binding energy 3 binding ratio 142 binding strength, relative 38 bio-solvation 75 blood pressure 145-146 Boltzmann average 116 Born-Oppenheimer approximation 2-3 bovine pancreatic trypsin inhibitor (BPTI)

169 BPTI 169 Buckingham-Fowler model 25-26,40

cage pairing 46 cage sharing 46 calcium ion

concentration 121 electrostatic field 121 enlarged ligand 123 SNase enzyme 116 stabilisation 121

captopril 145 carbon-carbon bond formation 96, 103 carboxylic acids 93

unsaturated 89 carboxypeptidases 161 catalysis

general base 133 orientation 109 palladium 103 rhodium 84 ruthenium 96

catalytic ion 108 catalytic kinetic resolution 84 catalytic power 108 cation hyperfine splitting 53 cation solvator 48 charge transfer to solvent (CTTS) 49 chelating phosphinamine ligand 100 chiral biphosphines 96-97 chiral reactants 84 CHIRAPHOS, palladium catalyst 83, 103 cholinergic hypothesis 141 cholinergic transmission 141 clathrate cages 46, 65, 76

Page 208: Principles of Molecular Recognition

196

clavulanic acid 138, 161 clinical trials 147 'closed' chelate transition state 86 CNDO 111 component molecules 39 concentration shifts 57 conformational ability 157 constraint potential 116 cooperativity 45 cough syndrome 145 counterpoise technique 9 cross-coupling 96 crucial specificity pocket Sl 161 crystal contacts 183 crystal field stabilisation energies 130 CTTS transition 49 cyanomethane 64 cytosolic receptors 141

denature 171 desolvation 109 destructive enzymes 159 diagonal energies 115 diastereomeric transition states 86 diffusion 70 dimers

formic acid 40 hydrogen-bonded 17-41

dimethyl sulphoxide 48 dinitrobenzene, m- 54 dinitrobenzene anions 52 DIOP 80 dioxygen 56 DIPAMP 80, 86 dipole induction energy 8 dipole moment 3 directed hydrogenation 84 dispersion energy 8, 12-15, 17, 21 dispersion force 14 dissociation constants 148 dissociation energy 17, 21 distributed multi pole analysis 25, 32-33 diuretics 145-148 DMA 25, 32-33 donor 14 down-field shifts 44 drug design 137 dynamic kinetic resolution 96 dynainic proteins 180-192

effective concentration 109 effectofmedium 12-14 efficacy 141-144 elastases 159 elastin 159 electron spin resonance spectroscopy (ESR) 43, 51, 72, 75

electron transfer 54

INDEX

electron pairs non-bonding 22-36 n-type 22 pseudo-n type 22

electrophilicity 37-40 electrostatic

complementarity 108 energy 6, 8, 25 model 26-31 potential 27-31 potential mapping 144

emphysema, pulmonary 159 empirical valence bond method (EVB)

109-114 enamide complexes 80-81 enantiomerically pure catalyst 86 enantioselectivity 81

rhodium reactions 82-83 entropic contribution 4 enzyme

acetyl choline esterase 141 carboxypeptidases 161 catalysis 109 destructive 159 elastases 159 reverse transcriptase 152

enzyme inhibitor complex 121 design 152

enzymes as targets 151-152 error range 121 epidermal growth factors 141 epimerisation 95 epoxidation, Sharpless 88 ESR 43, 51, 72, 75 esters 47 eq ui -active 157 equilibrium 171 ethers 47 EVB 109-114 exchange processes 54

FEP mapping vector 119 ferrocene-based diphosphinamine 103 ferrocenyl phosphine-palladium catalyst

precursor 99-100 ferrocenyl side chain 105 first order nucleophilic substitution 70 fluctuations 191 force 3-4,8 forces between macroscopic bodies 12 formaldehyde 29 free energy

metal ion catalysts 125 perturbation 114-115 profile 119 proton 114 surface in solution 115

Page 209: Principles of Molecular Recognition

free group postulate 68 free hydroxyl groups 66 frequency, hydrogen bonding 187 functionality 46

gag 152-158 gag-pol gene 158 gas-phase

estimates ll4 potential 113

Gaussian band 65 gene duplication 154, 155 general base

catalysts 133 strength 131

genetic engineering 109 Gillespie-Nyholm VSERR mode 33 global

minimum 25 movement 191

globular protein 168 Grignard reagent

chiral 98 5-coordinated 10 1 cross coupling 96 racemic 79 silyl 102

ground-state potential 116 ground-state potential surface ll5 growth factor receptors 141 G-protein linked receptors 140-141

Hamiltonian 111-113 helix ex

dynamics 180-192 myoglobin 168 structure 14

Hellmann-Feynman theorem 6 hemiketal formation 164 hemoglobin 169 heterolytic activation 93 Histidine 93 168 homodimer 155 homogeneous hydrogenation 79 homologous enzymology 160 homolytic bond cleavage 112 human immunodeficiency virus (HIV) protease 154-158

human immunodeficiency virus (HIV) protease inhibitors 152

hydration, ionic 72 hydroboration 85 hydrogenation

directed 84 homogeneous 79 unsaturated carboxylic acids 89

hydrogen bonding 14-15, 17-41,44,57, 123

INDEX

fluctuations 191 frequencies 187 intermolecular 14 intramolecular 14 solvation 45 Van der Waals 15

hydroformylation, asymmetric 79 hydrolysis 116 hydrolytic proteases 159 hydrophobic bonding 46 hydrophobic effect 4-5 hydroxyethylamine 158 hydroxyl groups, free 66 hypertension 145

induction energy 7 infrared (IR) 66 inhibitors

ACE 145

197

bovine pancreatic trypsin 169 irreversible 161 suicide 161 symmetrical 155-156

in-plane double minimum potential function 35

incipient molecular recognition 36-39 interaction 17 interaction energy 9-12 intermolecular binding strength 18-21 intermolecular energy 1 intermolecular forces

force theory long range 4-6 intermolecular hydrogen bonds 14 potentials 6 stretching force constant 17-21, 37

internal diastereoseIectivity 84 internal motions 189 intramolecular hydrogen bonds 14 iodoacetoxylation 85 ionic association 51 ionic hydration 72 ion multiplet 51 ion-pair fluctuations 52 ion pairing 51 ion solvation 71 ions, NMR 57 irreversible inhibitors 161 isomerism 36-39 isopharmacophoric replacement 150 isotropic hyperfine splitting 55

keto-esters, p, reduction 96 ketones 95 kidney-compromised 146

lactamase, p, inhibitor 138, 161 Lennard-Jones terms 124

Page 210: Principles of Molecular Recognition

198

ligand allyl 104 amino acid 97 BINAP 87 chelating phosphinamine 100

ligand field theory 130 ligand PN, reaction pathway 102 linewidth effects 59 localised model 50 London forces 8, 12 long-range intermolecular force theory 4 Lorentzian band width 65 low-frequency shift 11, 14, 66 lysozyme 134

magnesium salts 59 magnetic interactions 9 magnetic moment, rotational 3 magnetic resonance 72 magnitudes of contributions 11-12 manganese 130 mapping potentials 115 mean square (MS) 171-192 mean square fluctuation 173 mechanical driving force 185 mechanistic divergence 88 medium effect 12-14 metal ion exchange 129 metalloenzyme 110 metalloporphyrin 181 metal substitutions 123-124, 108 metmyoglobin 169-192 MOllO mobility of protein 169 molecular dynamics

simulation 169-170 timescales 169

molecular orbital (MO) 110 molecule 3-4

attraction 1 repulsion 1

Morse potential 113 MR 141-144 MS 171-192 muscarine receptor (MR) 141-144 myoglobin 168-192

nature of environments 180-181 near infrared (NIR) 66 neutral nitroxides 55 neutral solutes 48, 57 neutron diffraction 63 neutrophils 156 nickel biphosphine complexes 96 nickel haloalkyl complex 98 nickel phosphinoether complex 97 NIR 66 nitroxide

NMR 57, 59, 72

INDEX

non-bonding electron pairs 26-31 radicals 48, 75

non-rigid molecules 4 n-pair model 41 nuclear magnetic resonance (NMR) 57,59,

72 nucleophilicity 37-40 nucleophilic substitution (SN) 70

off-diagonal matrix elements 114 olefins 79 open transition states 85 oral activity, drugs 147 orientation 109 oscillator strength 66 overtone infrared 71 oxadiazoles 143 oxygen 76

palladium complexes in cross-coupling 96-98 haloallyl complexes 98 Pd(0) complexes 98, 103 Pd(II) dihalide complex 98

PAMP-P chiral phosphine 80 paramagnetic 51 partial agonist 143, 147 partial positive charge 28 pathogenic agent 159 PDTP inhibitor 116-120 penta-2,4-diene 95 pharmacokinetics 147 phosphinamines 97 phospholipids 75 photoionisation 47 planar arrangement 18-23 point-charge models 31-36 pol 152 potential energy 2 potential energy function

defined 2 H2CO···HF 30 H2O···HF 20, 28, 32 H2S···HF 29 SO2···HF 31

primary interactions 39 primary separation 6 primary solvation numbers 63 prosthetic heme group 168 protein

dynamics 169-192 globular 168 mobility 169 structure 168-192

proton NMR 59 proton transfer 112-120 pseudo-π-bond 24 pulmonary emphysema 159 pyramidal arrangement 18-23

Page 211: Principles of Molecular Recognition

quantitative angular geometries 25-26 quantum charge distribution 111 quaternary ammonium group 141

radioligand binding assay 148 radius of gyration 176-192 reaction free energy barrier 108 receptors as targets 139 relative binding strengths 38 relaxation effects 59 relaxation time studies 59 renin 145-151, 154 resonance energy 8-9, 11 respiratory cycle 168 reverse-transcriptase (RT) 152 rhodium

achiral catalyst 84 asymmetric hydrogenation mechanism 82 cis delivery of hydrogen 81 complexes homogeneous hydrogenation 80

rotational

constants 18 magnetic moment 3 symmetry 155

ruthenium BINAP complexes 95 catalytic hydrogenation mechanism 96 complexes asymmetric hydrogenation 96 dideuteride 93 hydride 93 hydrogenation 87

saralasin 147 secondary interactions 39 secondary solvent shell 72 self-consistent field theory 10, 111 self-ionisation 69 semiquinones 52 shape complementarity 108 short-range interactions 9-10 silver perchlorate 56 'slow, tight binding' inhibitors 152 spin exchange 54 spin labels 51-55 soft nucleophiles 103 solute shifts 57 solvated silver ions 56 solvation

changes 76 energy changes 126 hydrogen bonding 45 in biological systems 75 numbers

solvent barrier 49 effects, NMR 43

flip 55 systems 47

solvent-separated ion pairs 52 spectroscopy

ESR 43, 51, 72, 75 IR 66 NMR 76 rotational 18 UV 49 vibrational 72

stability force 185 stable radical ions 54

staphylococcal nuclease (SNase) 116, 108 SNase

activation barrier for Ca2+ 133 active site 116 catalytic factors 135 free energy profile 118 mutants 117 rate constant 117 reaction mechanism 116 substrate binding 117 transition state recognition 123

stereochemical strain 109 strength of intermolecular binding 18-21 structural deviation 171 structural fluctuations 169 structure of proteins 180-192 style 137 substrate binding 117 substrate-induced diastereoselectivity 84 suicide inhibitors 161 supermolecule 9-10 superoxide ion 76 symmetrical inhibitor 155-156

temperature-dependent forces 8 tiglic acid 89 time-averaged shifts 59 timescales 169 trans-acrylic system 151 transition metals

empirical formula 129 replacements for Ca2+ 124

transition states 85, 109 triethylphosphine oxide 61 triple ions 53 triplet state 54 two-fold rotational symmetry 155

unsaturated carboxylic acid hydrogenation 89

UV spectra of halides 43, 49 UV spectroscopy 49

valence bond 110 valence bond wave functions 112 valence shell electron pair repulsion model (VSEPR) 34

Page 212: Principles of Molecular Recognition

Van der Waals dimer 3-4 hydrogen bonding 15 interactions 113 molecule 3-4, 15

vasoconstriction 145 vibrational chromophoric probes 60-66 vibrational contributions 10-11

vibrational spectroscopy 72 virions 154 viscosity 70

zero shifts 63 zero time 173 zidovudine (AZT) 152 Z-value scale 48