[G. C. Barrett, D. T. Elmore] Amino Acids and Pept(BookFi.org)


description

Biochemistry, amino acid synthesis


  • The authors' objective has been to concentrate on amino acids and peptides without detailed discussions of proteins, although the book gives all the essential background chemistry, including sequence determination, synthesis and spectroscopic methods, to allow the reader to appreciate protein behaviour at the molecular level. The approach is intended to encourage the reader to cross classical boundaries, such as in the later chapter on the biological roles of amino acids and the design of peptide-based drugs. For example, there is a section on enzyme-catalysed synthesis of peptides, an area often neglected in texts describing peptide synthesis.

    This modern text will be of value to advanced undergraduates, graduate students and research workers in the amino acid, peptide and protein field.

  • Amino Acids and Peptides

    G. C. BARRETT

    D. T. ELMORE

  • The Pitt Building, Trumpington Street, Cambridge, United Kingdom

    The Edinburgh Building, Cambridge CB2 2RU, UK
    40 West 20th Street, New York, NY 10011-4211, USA
    477 Williamstown Road, Port Melbourne, VIC 3207, Australia
    Ruiz de Alarcón 13, 28014 Madrid, Spain
    Dock House, The Waterfront, Cape Town 8001, South Africa

    http://www.cambridge.org

    First published in printed format 1998

    ISBN 0-521-46292-4 hardback
    ISBN 0-521-46827-2 paperback
    ISBN 0-511-03770-8 eBook (Adobe Reader)

    © Cambridge University Press 2004

  • Contents

    Foreword

    1 Introduction
        1.1 Sources and roles of amino acids and peptides
        1.2 Definitions
        1.3 Protein amino acids, alias the coded amino acids
        1.4 Nomenclature for the protein amino acids, alias the coded amino acids
        1.5 Abbreviations for names of amino acids and the use of these abbreviations to give names to polypeptides
        1.6 Post-translational processing: modification of amino-acid residues within polypeptides
        1.7 Post-translational processing: in vivo cleavages of the amide backbone of polypeptides
        1.8 Non-protein amino acids, alias non-proteinogenic amino acids or non-coded amino acids
        1.9 Coded amino acids, non-natural amino acids and peptides in nutrition and food science and in human physiology
        1.10 The geological and extra-terrestrial distribution of amino acids
        1.11 Amino acids in archaeology and in forensic science
        1.12 Roles for amino acids in chemistry and in the life sciences
            1.12.1 Amino acids in chemistry
            1.12.2 Amino acids in the life sciences
        1.13 β- and higher amino acids
        1.14 References

    2 Conformations of amino acids and peptides
        2.1 Introduction: the main conformational features of amino acids and peptides
        2.2 Configurational isomerism within the peptide bond
        2.3 Dipeptides
        2.4 Cyclic oligopeptides
        2.5 Acyclic oligopeptides
        2.6 Longer oligopeptides: primary, secondary and tertiary structure
        2.7 Polypeptides and proteins: quaternary structure and aggregation
        2.8 Examples of conformational behaviour; ordered and disordered states and transitions between them
            2.8.1 The main categories of polypeptide conformation
                2.8.1.1 One extreme situation
                2.8.1.2 The other extreme situation
                2.8.1.3 The general case
        2.9 Conformational transitions for amino acids and peptides
        2.10 References

    3 Physicochemical properties of amino acids and peptides
        3.1 Acid–base properties
        3.2 Metal-binding properties of amino acids and peptides
        3.3 An introduction to the routine aspects and the specialised aspects of the spectra of amino acids and peptides
        3.4 Infrared (IR) spectrometry
        3.5 General aspects of ultraviolet (UV) spectrometry, circular dichroism (CD) and UV fluorescence spectrometry
        3.6 Circular dichroism
        3.7 Nuclear magnetic resonance (NMR) spectroscopy
        3.8 Examples of assignments of structures to peptides from NMR spectra and other data
        3.9 References

    4 Reactions and analytical methods for amino acids and peptides

        Part 1 Reactions of amino acids and peptides
        4.1 Introduction
        4.2 General survey
            4.2.1 Pyrolysis of amino acids and peptides
            4.2.2 Reactions of the amino group
            4.2.3 Reactions of the carboxy group
            4.2.4 Reactions involving both amino and carboxy groups
        4.3 A more detailed survey of reactions of the amino group
            4.3.1 N-Acylation
            4.3.2 Reactions with aldehydes
            4.3.3 N-Alkylation
        4.4 A survey of reactions of the carboxy group
            4.4.1 Esterification
            4.4.2 Oxidative decarboxylation
            4.4.3 Reduction
            4.4.4 Halogenation
            4.4.5 Reactions involving amino and carboxy groups of α-amino acids and their N-acyl derivatives
            4.4.6 Reactions at the α-carbon atom and racemisation of α-amino acids
            4.4.7 Reactions of the amide group in acylamino acids and peptides
        4.5 Derivatisation of amino acids for analysis
            4.5.1 Preparation of N-acylamino acid esters and similar derivatives for analysis
        4.6 References

        Part 2 Mass spectrometry in amino-acid and peptide analysis and in peptide-sequence determination
        4.7 General considerations
            4.7.1 Mass spectra of free amino acids
            4.7.2 Mass spectra of free peptides
            4.7.3 Negative-ion mass spectrometry
        4.8 Examples of mass spectra of peptides
            4.8.1 Electron-impact mass spectra (EIMS) of peptide derivatives
            4.8.2 Finer details of mass spectra of peptides
            4.8.3 Difficulties and ambiguities
        4.9 The general status of mass spectrometry in peptide analysis
            4.9.1 Specific advantages of mass spectrometry in peptide sequencing
        4.10 Early methodology: peptide derivatisation
            4.10.1 N-Terminal acylation and C-terminal esterification
            4.10.2 N-Acylation and N-alkylation of the peptide bond
            4.10.3 Reduction of peptides to polyamino-polyalcohols
        4.11 Current methodology: sequencing by partial acid hydrolysis, followed by direct MS analysis of peptide hydrolysates
            4.11.1 Current methodology: instrumental variations
        4.12 Conclusions
        4.13 References

        Part 3 Chromatographic and related methods for the separation of mixtures of amino acids, mixtures of peptides and mixtures of amino acids and peptides
        4.14 Separation of amino-acid and peptide mixtures
            4.14.1 Separation principles
        4.15 Partition chromatography; HPLC and GLC
        4.16 Molecular exclusion chromatography (gel chromatography)
        4.17 Electrophoretic separation and ion-exchange chromatography
            4.17.1 Capillary zone electrophoresis (CZE)
        4.18 Detection of separated amino acids and peptides
            4.18.1 Detection of amino acids and peptides separated by HPLC and by other liquid-based techniques
            4.18.2 Detection of amino acids and peptides separated by GLC
        4.19 Thin-layer chromatography (planar chromatography; HPTLC)
        4.20 Quantitative amino-acid analysis
        4.21 References

        Part 4 Immunoassays for peptides
        4.22 Radioimmunoassays
        4.23 Enzyme-linked immunosorbent assays (ELISAs)
        4.24 References

        Part 5 Enzyme-based methods for amino acids
        4.25 Biosensors
        4.26 References

    5 Determination of the primary structure of peptides and proteins
        5.1 Introduction
        5.2 Strategy
        5.3 Cleavage of disulphide bonds
        5.4 Identification of the N-terminus and stepwise degradation
        5.5 Enzymic methods for determining N-terminal sequences
        5.6 Identification of C-terminal sequences
        5.7 Enzymic determination of C-terminal sequences
        5.8 Selective chemical methods for cleaving peptide bonds
        5.9 Selective enzymic methods for cleaving peptide bonds
        5.10 Determination of the positions of disulphide bonds
        5.11 Location of post-translational modifications and prosthetic groups
        5.12 Determination of the sequence of DNA
        5.13 References

    6 Synthesis of amino acids
        6.1 General
        6.2 Commercial and research uses for amino acids
        6.3 Biosynthesis: isolation of amino acids from natural sources
            6.3.1 Isolation of amino acids from proteins
            6.3.2 Biotechnological and industrial synthesis of coded amino acids
        6.4 Synthesis of amino acids starting from coded amino acids other than glycine
        6.5 General methods of synthesis of amino acids starting with a glycine derivative
        6.6 Other general methods of amino acid synthesis
        6.7 Resolution of DL-amino acids
        6.8 Asymmetric synthesis of amino acids
        6.9 References

    7 Methods for the synthesis of peptides
        7.1 Basic principles of peptide synthesis and strategy
        7.2 Chemical synthesis and genetic engineering
        7.3 Protection of α-amino groups
        7.4 Protection of carboxy groups
        7.5 Protection of functional side-chains
            7.5.1 Protection of ε-amino groups
            7.5.2 Protection of thiol groups
            7.5.3 Protection of hydroxy groups
            7.5.4 Protection of the guanidino group of arginine
            7.5.5 Protection of the imidazole ring of histidine
            7.5.6 Protection of amide groups
            7.5.7 Protection of the thioether side-chain of methionine
            7.5.8 Protection of the indole ring of tryptophan
        7.6 Deprotection procedures
        7.7 Enantiomerisation during peptide synthesis
        7.8 Methods for forming peptide bonds
            7.8.1 The acyl azide method
            7.8.2 The use of acid chlorides and acid fluorides
            7.8.3 The use of acid anhydrides
            7.8.4 The use of carbodiimides
            7.8.5 The use of reactive esters
            7.8.6 The use of phosphonium and isouronium derivatives
        7.9 Solid-phase peptide synthesis (SPPS)
        7.10 Soluble-handle techniques
        7.11 Enzyme-catalysed peptide synthesis and partial synthesis
        7.12 Cyclic peptides
            7.12.1 Homodetic cyclic peptides
            7.12.2 Heterodetic cyclic peptides
        7.13 The formation of disulphide bonds
        7.14 References
            7.14.1 References cited in the text
            7.14.2 References for background reading

    8 Biological roles of amino acids and peptides
        8.1 Introduction
        8.2 The role of amino acids in protein biosynthesis
        8.3 Post-translational modification of protein structures
        8.4 Conjugation of amino acids with other compounds
        8.5 Other examples of synthetic uses of amino acids
        8.6 Important products of amino-acid metabolism
        8.7 Glutathione
        8.8 The biosynthesis of penicillins and cephalosporins
        8.9 References
            8.9.1 References cited in the text
            8.9.2 References for background reading

    9 Some aspects of amino-acid and peptide drug design
        9.1 Amino-acid antimetabolites
        9.2 Fundamental aspects of peptide drug design
        9.3 The need for peptide-based drugs
        9.4 The mechanism of action of proteinases and design of inhibitors
        9.5 Some biologically active analogues of peptide hormones
        9.6 The production of antibodies and vaccines
        9.7 The combinatorial synthesis of peptides
        9.8 The design of pro-drugs based on peptides
        9.9 Peptide antibiotics
        9.10 References
            9.10.1 References cited in the text
            9.10.2 References for background reading

    Subject index

  • Foreword

    This is an undergraduate and introductory postgraduate textbook that gives information on amino acids and peptides, and is intended to be self-sufficient in all the organic and analytical chemistry fundamentals. It is aimed at students of chemistry and allied areas. Suggestions for supplementary reading are provided, so that topic areas that are not covered in depth in this book may be followed up by readers with particular study interests.

    A particular objective has been to concentrate on amino acids and peptides, as the title of the book implies; the exclusion of detailed discussion of proteins is deliberate, but the book gives all the essential background chemistry so that protein behaviour at the molecular level can be appreciated.

    There is an emphasis on the uses of amino acids and peptides, and on their biological roles and, while Chapter 8 concentrates on this, a scattering of items of information of this type will be found throughout the book. Important pharmaceutical developments in recent years underline the continuing importance and potency of amino acids and peptides in medicine, and the flavour of current research themes in this area can be gained from Chapter 9.

    Supplementary reading (see also lists at the end of each Chapter)

    Standard Student Texts

    Standard undergraduate Biochemistry textbooks relate the general field to the coverage of this book. Several such topic areas are covered in

    Zubay, G. (1993) Biochemistry, Third Edition, Wm. C. Brown Communications Inc., Dubuque, IA

    and

    Voet, D. and Voet, J. G. (1995) Biochemistry, Second Edition, Wiley, New York

    Typically, these topic areas as covered by Zubay are
    Chapter 3: The building blocks of proteins: amino acids, peptides and proteins
    Chapter 4: The three-dimensional structure of proteins
    Chapter 5: Functional diversity of proteins

    Removed more towards biochemical themes, are
    Chapter 18: Biosynthesis of amino acids
    Chapter 19: The metabolic fate of amino acids
    Chapter 29: Protein synthesis, targeting, and turnover

    Voet and Voet give similar coverage in
    Chapter 24: Amino acid metabolism
    Chapter 30: Translation (i.e. protein biosynthesis)
    Chapter 34: Molecular physiology (of particular relevance to coverage in this book of blood clotting, peptide hormones and neurotransmitters)

    Supplementary reading: suggestions for further reading

    (a) Protein structure

    Branden, C., and Tooze, J. (1991) Introduction to Protein Structure, Garland Publishing Inc., New York

    (b) Protein chemistry

    Hugli, T. E. (1989) Techniques of Protein Chemistry, Academic Press, San Diego, California

    Cherry, J. P. and Barford, R. A. (1988) Methods for Protein Analysis, American Oil Chemists' Society, Champaign, Illinois

    (c) Amino acids

    Barrett, G. C., Ed. (1985) Chemistry and Biochemistry of the Amino Acids, Chapman and Hall, London

    Barrett, G. C. (1993) in Second Supplements to the 2nd Edition of Rodd's Chemistry of Carbon Compounds, Volume 1, Part D: Dihydric alcohols, their oxidation products and derivatives, Ed. Sainsbury, M., Elsevier, Amsterdam, pp. 117–66

    Barrett, G. C. (1995) in Amino Acids, Peptides, and Proteins, A Specialist Periodical Report of The Royal Society of Chemistry, Vol. 26, Ed. Davies, J. S., Royal Society of Chemistry, London (preceding volumes cover the literature on amino acids, back to 1969 (Volume 1))

    Coppola, G. M. and Schuster, H. F. (1987) Asymmetric Synthesis: Construction of Chiral Molecules using Amino Acids, Wiley, New York

    Dawson, R. M. C., Elliott, D. C., Elliott, W. H., and Jones, K. M. (1986) Data for Biochemical Research, Oxford University Press, Oxford

    Greenstein, J. P., and Winitz, M. (1961) Chemistry of the Amino Acids, Wiley, New York (a facsimile version (1986) of this three-volume set has been made available by Robert E. Krieger Publishing Inc., Malabar, Florida)

    Williams, R. M. (1989) Synthesis of Optically Active α-Amino Acids, Pergamon Press, Oxford

    (d) Peptides

    Bailey, P. D. (1990) An Introduction to Peptide Chemistry, Wiley, Chichester

    Bodanszky, M. (1988) Peptide Chemistry: A Practical Handbook, Springer-Verlag, Berlin

    Bodanszky, M. (1993) Principles of Peptide Synthesis, Second Edition, Springer-Verlag, Heidelberg

    Elmore, D. T. (1993) in Second Supplements to the 2nd Edition of Rodd's Chemistry of Carbon Compounds, Volume 1, Part D: Dihydric alcohols, their oxidation products and derivatives, Ed. Sainsbury, M., Elsevier, Amsterdam, pp. 167–211

    Elmore, D. T. (1995) in Amino Acids, Peptides, and Proteins, A Specialist Periodical Report of The Royal Society of Chemistry, Vol. 26, Ed. Davies, J. S., Royal Society of Chemistry, London (preceding volumes cover the literature of peptide chemistry back to 1969 (Volume 1))

    Jones, J. H. (1991) The Chemical Synthesis of Peptides, Clarendon Press, Oxford

  • 1 Introduction

    1.1 Sources and roles of amino acids and peptides

    More than 700 amino acids have been discovered in Nature and most of them are α-amino acids. Bacteria, fungi and algae and other plants provide nearly all these, which exist either in the free form or bound up into larger molecules (as constituents of peptides and proteins and other types of amide, and of alkylated and esterified structures).

    The twenty amino acids (actually, nineteen α-amino acids and one α-imino acid) that are utilised in living cells for protein synthesis under the control of genes are in a special category since they are fundamental to all life forms as building blocks for peptides and proteins. However, the reasons why all the other natural amino acids are located where they are, are rarely known, although this is an area of much speculation. For example, some unusual amino acids are present in many seeds and are not needed by the mature plant. They deter predators through their toxic or otherwise unpleasant characteristics and in this way are thought to provide a defence strategy to improve the chances of survival for the seed and therefore help to ensure the survival of the plant species.

    Peptides and proteins play a wide variety of roles in living organisms and display a range of properties (from the potent hormonal activity of some small peptides to the structural support and protection for the organism shown by insoluble proteins). Some of these roles are illustrated in this book.

    1.2 Definitions

    The term 'amino acids' is generally understood to refer to the aminoalkanoic acids, H₃N⁺–(CR¹R²)ₙ–CO₂⁻, with n = 1 for the series of α-amino acids, n = 2 for β-amino acids, etc. The term 'dehydro-amino acids' specifically describes 2,3-unsaturated (or αβ-unsaturated)-2-aminoalkanoic acids, H₃N⁺–(C=CR¹R²)–CO₂⁻.

    However, the term 'amino acids' would include all structures carrying amine and acid functional groups, including simple aromatic compounds, e.g. anthranilic acid, o-H₃N⁺–C₆H₄–CO₂⁻, and would also cover other types of acidic functional groups (such as phosphorus and sulphur oxy-acids, H₃N⁺–(R¹R²C)ₙ–HPO₃⁻ and R₃N⁺–(R¹R²C)ₙ–SO₃⁻, etc.). The family of boron analogues R₃N→BHR¹–CO₂R² (→ denotes a dative bond) has recently been opened up through the synthesis of some examples (Sutton et al., 1993); it would take only the substitution of the carboxy group in these organoboron amino acids (R = R¹ = R² = H) by phosphorus or sulphur equivalents to obtain an amino acid that contains no carbon! However, unlike the amino acids containing sulphonic and phosphonic acid groupings, naturally occurring examples of organoboron-based amino acids are not known.

    The term 'peptides' has a more restricted meaning and is therefore a less ambiguous term, since it covers polymers formed by the condensation of the respective amino and carboxy groups of α-, β-, γ-, ... amino acids. For the structure with m = 2 in Figure 1.1 (i.e., for a dipeptide) up to values of m = 20 (an eicosapeptide), the term 'oligopeptide' is used and a prefix di-, tri-, tetra-, penta- (see Leu-enkephalin, a linear pentapeptide, in Figure 1.1), ... undeca- (see cyclosporin A, a cyclic undecapeptide, in Figure 1.4 later), dodeca-, ... etc. is used to indicate the number of amino-acid residues contained in the compound. Homodetic and heterodetic peptides are illustrated in Chapter 7.

    Isopeptides are isomers in which amide bonds are present that involve the side-chain amino group of an αω-di-amino acid (e.g. lysine) or of a poly-amino acid and/or the side-chain carboxy-group of an α-amino-di- or -poly-acid (e.g. aspartic acid or glutamic acid). Glutathione (Chapter 8) is a simple example. Longer polymers are termed 'polypeptides' or 'proteins' and the term 'polypeptides' is becoming the most commonly used general family name (though 'proteins' remains the preferred term for particular examples of large polypeptides located in precise biological contexts). Nonetheless, the relationship between these terms is a little more contentious, since the change-over from 'polypeptide' to 'protein' needs definition. The figure 'roughly fifty amino acid residues' is widely accepted for this. Insulin (a polymer of fifty-one α-amino acids but consisting of two crosslinked oligopeptide chains; see Figure 1.4 later) is on the borderline and has been referred to both as a small protein and as a large polypeptide.

    Figure 1.1. Peptides as condensation polymers of α-amino acids.
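    The naming rules described above can be sketched in a few lines of Python. This is only an illustration: the function name `describe_peptide` is hypothetical, the prefixes not quoted in the text are filled in from ordinary chemical usage, and the hard cut-off at 50 residues is an assumption standing in for the text's 'roughly fifty'.

```python
# Numerical prefixes for 2..20 residues. The text quotes di-, tri-, tetra-,
# penta-, undeca-, dodeca- and eicosa-; the others are assumed from
# ordinary chemical usage.
PREFIXES = {
    2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa", 7: "hepta",
    8: "octa", 9: "nona", 10: "deca", 11: "undeca", 12: "dodeca",
    13: "trideca", 14: "tetradeca", 15: "pentadeca", 16: "hexadeca",
    17: "heptadeca", 18: "octadeca", 19: "nonadeca", 20: "eicosa",
}

OLIGOPEPTIDE_MAX = 20    # m = 20, an eicosapeptide
PROTEIN_THRESHOLD = 50   # assumption: a sharp stand-in for "roughly fifty"


def describe_peptide(m: int) -> tuple:
    """Return (name, family) for a peptide of m amino-acid residues."""
    if m < 2:
        raise ValueError("a peptide needs at least two residues")
    if m <= OLIGOPEPTIDE_MAX:
        return f"{PREFIXES[m]}peptide", "oligopeptide"
    # Beyond 20 residues only the family name changes; the boundary between
    # 'polypeptide' and 'protein' is a convention, not a sharp line.
    family = "protein" if m > PROTEIN_THRESHOLD else "polypeptide"
    return f"{m}-residue polypeptide", family


print(describe_peptide(5))   # Leu-enkephalin has five residues
print(describe_peptide(11))  # cyclosporin A has eleven residues
print(describe_peptide(51))  # insulin: a borderline case in practice
```

    With this cut-off insulin (51 residues) is labelled a protein, although the text notes that it has been described both as a small protein and as a large polypeptide, so the threshold should be read as a convention rather than a rule.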

    'Poly(α-amino acid)s' is a better term for peptides formed by the self-condensation of one amino acid; natural examples exist, such as poly(D-glutamic acid), the protein coat of the anthrax spore (Hanby and Rydon, 1946). In early research in the textile industry, poly(α-amino acid)s showed promise as synthetic fibres, but the synthesis methodology required for the polymerisation of amino acids was complex and uneconomic.

    Polymers of controlled structures made from N-alkyl-α-amino acids (Figure 1.1; NRⁿ instead of NH, R¹ = R² = H; n = 1), i.e. H₂N⁺Rⁿ–CH₂–CO–[NRⁿ–CH₂–CO]ₘ–NRⁿ–CH₂–CO₂⁻, which are poly(N-alkylglycine)s of defined sequence (various Rⁿ at chosen points along the chain), have been synthesised as peptide mimetics (see Chapter 9) and have been given the name 'peptoids'. These can be viewed as peptides with side-chains shifted from carbon to nitrogen; they will therefore have a very different conformational flexibility (see Chapter 2) from that of peptides and will also be incapable of hydrogen bonding. This is a simple enough way of providing all the correct side-chains on a flexible chain of atoms, in order to mimic a biologically active peptide, but the mimic can avoid enzymic breakdown before it reaches the site in the body where it is needed.

    Using the language of polymer chemistry, polypeptides made from two or more different α-amino acids are copolymers or irregular poly(amide)s, whereas poly(α-amino acid)s, H–[NH–CR¹R²–CO]ₘ–OH, are homopolymers that could be described as members of the nylon[2] family.
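    The polymer-chemistry distinction drawn here depends only on how many different residue types a chain contains, which a short Python sketch can make concrete. The function name `polymer_class` and the representation of a chain as a list of three-letter residue codes are assumptions made for this illustration.

```python
def polymer_class(residues):
    """Classify a peptide chain, given as a list of residue codes."""
    if len(residues) < 2:
        raise ValueError("a peptide needs at least two residues")
    # One residue type throughout: a poly(amino acid), i.e. a homopolymer.
    # Two or more different residue types: a copolymer.
    return "homopolymer" if len(set(residues)) == 1 else "copolymer"


print(polymer_class(["Glu"] * 10))                        # a poly(glutamic acid) chain
print(polymer_class(["Tyr", "Gly", "Gly", "Phe", "Leu"])) # a mixed-sequence peptide
```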

    Depsipeptides are near-relatives of peptides, with one or more amide bonds replaced by ester bonds; in other words, they are formed by condensing α-amino acids with α-hydroxy-acids in various proportions. There are several important natural examples of these, of defined sequence; for example the antibiotic valinomycin and the family of enniatin antibiotics. Structures of other examples of depsipeptides are given in Section 4.8.

    Nomenclature for conformational features of peptide structure is covered in Chapter 2.

    1.3 Protein amino acids, alias the coded amino acids

    The twenty α-amino acids (actually, nineteen α-amino acids and one α-imino acid (Table 1.1)) which, in preparation for their role in protein synthesis, are joined in vivo through their carboxy group to tRNA to form α-aminoacyl-tRNAs, are organised by ribosomal action into specific sequences in accordance with the genetic code (Chapter 8).
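    The three-letter and single-letter abbreviations listed in Table 1.1 lend themselves to a simple lookup table. The Python sketch below is illustrative only: the names `CODED_AMINO_ACIDS` and `to_one_letter` are hypothetical, and the hyphen-separated input format is an assumption. It also checks the counting statements made in Note 1 of the table.

```python
# Full name -> (three-letter, single-letter) abbreviation, as in Table 1.1.
CODED_AMINO_ACIDS = {
    "glycine": ("Gly", "G"), "alanine": ("Ala", "A"),
    "leucine": ("Leu", "L"), "valine": ("Val", "V"),
    "isoleucine": ("Ile", "I"), "arginine": ("Arg", "R"),
    "aspartic acid": ("Asp", "D"), "asparagine": ("Asn", "N"),
    "glutamic acid": ("Glu", "E"), "glutamine": ("Gln", "Q"),
    "lysine": ("Lys", "K"), "methionine": ("Met", "M"),
    "cysteine": ("Cys", "C"), "serine": ("Ser", "S"),
    "threonine": ("Thr", "T"), "phenylalanine": ("Phe", "F"),
    "tyrosine": ("Tyr", "Y"), "histidine": ("His", "H"),
    "tryptophan": ("Trp", "W"), "proline": ("Pro", "P"),
}

THREE_TO_ONE = {three: one for three, one in CODED_AMINO_ACIDS.values()}


def to_one_letter(sequence: str) -> str:
    """Convert e.g. 'Tyr-Gly-Gly-Phe-Leu' to single-letter notation."""
    return "".join(THREE_TO_ONE[residue] for residue in sequence.split("-"))


# Names whose three-letter code is NOT simply their first three letters.
three_letter_exceptions = sorted(
    name for name, (three, _) in CODED_AMINO_ACIDS.items()
    if name[:3].capitalize() != three
)

# Names whose single-letter code is NOT simply their first letter.
one_letter_exceptions = sorted(
    name for name, (_, one) in CODED_AMINO_ACIDS.items()
    if name[0].upper() != one
)

print(to_one_letter("Tyr-Gly-Gly-Phe-Leu"))  # the Leu-enkephalin sequence
print(three_letter_exceptions)
print(len(one_letter_exceptions))
```

    Keeping the full names as dictionary keys makes the exceptions in Note 1 fall out of a one-line comparison instead of being hard-coded.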

    'Coded amino acids' is a better name for these twenty amino acids, rather than 'protein amino acids' or 'primary protein amino acids' (the term 'coded amino acids' is increasingly used), because changes can occur to amino-acid residues after they have been laid in place in a polypeptide by ribosomal synthesis. Greenstein and


  • Stru

    ctur

    esH

    ydro

    phob

    icit

    yH

    ydro

    phili

    city

    Nam

    e of

    Thr

    ee-l

    ette

    rSi

    ngle

    -let

    ter

    amin

    o ac

    idab

    brev

    iati

    onab

    brev

    iati

    on

    *Am

    ino

    acid

    sid

    e-ch

    ain,

    R

    Hig

    hH

    igh

    One

    wit

    h no

    Gly

    cine

    Gly

    GH

    *si

    de-c

    hain

    * (i

    .e. w

    ith

    a hy

    drog

    en a

    tom

    )

    Fou

    r w

    ith

    satu

    rate

    dA

    lani

    neA

    laA

    CH

    3*

    alip

    hati

    c si

    de-

    Leu

    cine

    Leu

    LC

    H2C

    H(C

    H3)

    2*

    chai

    ns*

    (hyd

    roph

    obic

    Val

    ine

    Val

    VC

    H(C

    H3)

    2*

    side

    -cha

    ins)

    Isol

    euci

    neIl

    eI

    (S)-

    CH

    (CH

    3)C

    2H5

    *

    Tab

    le 1

    .1. T

    he t

    wen

    ty c

    oded

    am

    ino

    acid

    s (n

    inet

    een

    cod

    ed

    --a

    min

    o ac

    ids,

    and

    one

    cod

    ed

    --i

    min

    o ac

    id):

    stru

    ctur

    es a

    ndde

    nit

    ions

    a

    Stru

    ctur

    e co

    nven

    tion

    s fo

    r th

    e -

    -am

    ino

    acid

    s ar

    e

    Fis

    cher

    pro

    ject

    ion

    of a

    n -

    -am

    ino

    acid

    , req

    uiri

    ng t

    he c

    arbo

    n ch

    ain

    to b

    e ar

    rang

    ed v

    erti

    cally

    , wit

    h th

    e ca

    rbox

    ygr

    oup

    at t

    he t

    op

    One

    of

    the

    com

    mon

    ly-u

    sed

    thre

    e-di

    men

    sion

    al r

    epre

    sent

    atio

    ns o

    f an

    -

    -am

    ino

    acid

    Bar

    rett

    rep

    rese

    ntat

    ion

    of a

    n -

    -am

    ino

    acid

    iseq

    uival

    ent

    toCO

    2H

    3N

    HR

    +

    whi

    ch iseq

    uival

    ent

    toCO

    2H

    3N

    R

    +

    CO2

    H3N

    CH

    R

    +

    CO2

    H3N

    R

    +

  • Ten

    wit

    h fu

    ncti

    onal

    ised

    Arg

    inin

    eA

    rgR

    CH

    2CH

    2CH

    2NH

    C(

    NH

    )NH

    2*

    alip

    hati

    c si

    de-c

    hain

    s*A

    spar

    tic

    acid

    Asp

    DC

    H2C

    O2H

    *(m

    ostl

    y hy

    drop

    hilic

    Asp

    arag

    ine

    Asn

    NC

    H2C

    ON

    H2

    *si

    de-c

    hain

    s)G

    luta

    mic

    aci

    dG

    luE

    CH

    2CH

    2CO

    2H*

    Glu

    tam

    ine

    Gln

    QC

    H2C

    H2C

    ON

    H2

    *L

    ysin

    eL

    ysK

    CH

    2CH

    2CH

    2CH

    2NH

    2*

    Met

    hion

    ine

    Met

    MC

    H2C

    H2S

    CH

    3*

    Cys

    tein

    eC

    ysC

    CH

    2SH

    *Se

    rine

    Ser

    SC

    H2O

    H*

    Thr

    eoni

    neT

    hrT

    (R)-

    CH

    (CH

    3)O

    H*

    Fou

    r w

    ith

    arom

    atic

    Phe

    nyla

    lani

    neP

    heF

    CH

    2C6H

    5*

    or h

    eter

    oaro

    mat

    icT

    yros

    ine

    Tyr

    YC

    H2-

    (p-O

    H-C

    6H4)

    *si

    de-c

    hain

    s*H

    isti

    dine

    His

    HC

    H2-

    (im

    idaz

    ol-4

    -yl)

    *(m

    ost

    of t

    hese

    sid

    e-ch

    ains

    Try

    ptop

    han

    Trp

    WC

    H2-

    (ind

    ol-3

    -yl)

    *ar

    e hy

    drop

    hobi

    c)

    The

    cod

    ed

    -im

    ino

    Pro

    line

    Pro

    P*

    acid

    Not

    es:

    1.T

    he s

    truc

    ture

    of

    each

    sid

    e-ch

    ain,

    R, i

    s gi

    ven

    for

    the

    19 c

    oded

    -a

    min

    o ac

    ids

    , aft

    er e

    ach

    nam

    e. T

    he f

    ull s

    truc

    ture

    of

    the

    cod

    ed

    -im

    ino

    acid

    pro

    line

    is g

    iven

    . T

    hree

    -let

    ter

    and

    one

    -let

    ter

    abb

    revi

    atio

    ns a

    re g

    iven

    for

    the

    20. T

    he t

    hree

    -let

    ter

    abbr

    evia

    tion

    is t

    he

    rst

    thre

    e le

    tter

    s of

    the

    nam

    e fo

    r al

    l tw

    enty

    , exc

    ept

    for

    aspa

    ragi

    ne (

    Asn

    ), g

    luta

    min

    e (G

    ln),

    isol

    euci

    ne (

    Ile)

    and

    try

    ptop

    han

    (Trp

    ). T

    he s

    ingl

    e-le

    tter

    abb

    revi

    ated

    nam

    eis

    the

    rs

    t le

    tter

    of

    thei

    r fu

    ll na

    me

    for

    elev

    en o

    f th

    em. D

    iffer

    ent

    lett

    ers

    are

    need

    ed fo

    r th

    e ot

    her

    nine

    , to

    avoi

    d am

    bigu

    ity:

    arg

    inin

    e (R

    ),as

    para

    gine

    (N

    ), a

    spar

    tic

    acid

    (D

    ), g

    luta

    mic

    aci

    d (E

    ), g

    luta

    min

    e (Q

    ), ly

    sine

    (K

    ), p

    heny

    lala

    nine

    (F

    ), t

    rypt

    opha

    n (W

    ) an

    d ty

    rosi

    ne (

    Y).

    2.A

    ll fu

    ll na

    mes

    end

    in i

    ne e

    xcep

    t as

    part

    ic a

    cid,

    glu

    tam

    ic a

    cid

    and

    tryp

    toph

    an. A

    djec

    tive

    s ar

    e de

    rive

    d fr

    om t

    he n

    ames

    by

    drop

    ping

    the

    ine

    or

    its

    equi

    vale

    nt e

    ndin

    g an

    d ad

    ding

    yl;

    thu

    s, a

    lany

    l, gl

    utam

    yl, p

    roly

    l, tr

    ypto

    phyl

    , etc

    .3.

    Con

    gur

    atio

    ns.T

    he R

    /S c

    onve

    ntio

    n ca

    n ea

    sily

    be

    tran

    sfer

    red

    to r

    epla

    ce t

    he F

    isch

    er

    / s

    yste

    m, w

    hile

    ret

    aini

    ng t

    he t

    rivi

    al n

    ames

    : -

    enan

    tiom

    ers

    of all the coded amino acids are members of the S series except L-cysteine, which becomes R-cysteine through proper application of the R/S rules. Diastereoisomers (the isoleucine/allo-isoleucine and threonine/allothreonine pairs, allo indicating inversion of the side-chain configuration of the coded amino acid) are less ambiguously named through the R/S system, although the side-chain configuration can be indicated; for example, natural L-isoleucine is (2S,3S)-isoleucine, whereas L-alloisoleucine is (2S,3R)-isoleucine.

    [Structures not reproduced: three equivalent depictions each of (2S,3S)-isoleucine and (2S,3R)-isoleucine.]

    For the structures of natural L-threonine ((2S,3R)-threonine) and L-allothreonine ((2S,3S)-threonine), replace the side-chain ethyl group (C2H5) in isoleucine and alloisoleucine by OH.

    4. IUPAC-IUB nomenclature recommendations (1983), reproduced in full in Amino Acids, Peptides, and Proteins, 1985, Vol. 16, The Royal Society of Chemistry, p. 387; and in Eur. J. Biochem., 1984, 138, 9, encourage the retention of trivial names for the common α-amino acids, but systematic names are relatively straightforward; thus, L-alanine is 2S-aminopropanoic acid and L-histidine is 2S-amino-3-(imidazol-4-yl)-propanoic acid (the name for the predominant tautomer).

    5. Hydrophilic and hydrophobic are terms used to denote the relative water-attracting and water-repelling property, respectively, of the side-chain when the amino acid is condensed into a polypeptide (see Chapter 5). The term hydropathy index may be used to place the amino acids in order of their hydrophilicity (Kyte and Doolittle, 1985), and their relative positions are shown here on an arbitrary scale.

    a Selenocysteine (i.e. cysteine with the sulphur atom replaced by a selenium atom) has been found in certain proteins, e.g. formate dehydrogenase, an enzyme from Escherichia coli, and it has very recently been shown to be placed there through normal ribosomal synthesis (Stadtman, 1996). Thus selenocysteine can now be accepted as the twenty-first coded amino acid.

  • Winitz, in their 1961 book, listed the 26 protein amino acids, six of which were later found to be formed from among the other twenty protein amino acids in the list of Greenstein and Winitz, after the protein had left the gene (post-translational (sometimes called post-ribosomal) modification or post-translational processing). Because of these changes made to the polypeptide after ribosomal synthesis, amino acids that are not capable of being incorporated into proteins by genes (secondary protein amino acids, Table 1.2) can, nevertheless, be found in proteins.

    1.4 Nomenclature for the protein amino acids, alias the coded amino acids

    The common amino acids are referred to through trivial names (for example, glycine would not be named either 2-aminoethanoic acid or amino-acetic acid in the amino acid and peptide literature). Table 1.1 summarises conventions and gives structures. The rarer natural amino acids are usually named as derivatives of the common amino acids, if they do not have their own trivial names related to their natural source (Table 1.2), but apart from these, there are occasional examples of the use of systematic names for natural amino acids.

    1.5 Abbreviations for names of amino acids and the use of these abbreviations to give names to polypeptides

    To keep names of amino acids and peptides to manageable proportions, there are agreed conventions for nomenclature (see the footnotes to Table 1.1). The simplest α-amino acid, glycine, would be depicted H-Gly-OH in the standard three-letter system, the H and OH representing the H2O that is expelled when this amino acid undergoes condensation to form a peptide (Figure 1.2). The three-letter abbreviations therefore represent the amino-acid residues that make up peptides and proteins.

    The three-letter system was thus introduced more to save space in naming peptides than to simplify the names of the amino acids. A one-letter system (thus, glycine is G) is more widely used now for peptides (but is never used to refer to individual amino acids in other contexts) and is restricted to naming peptides synthesised from the coded amino acids (Figure 1.3).


    Figure 1.2. Polymerisation of glycine.

  • Table 1.2. Post-translational changes to proteins: the modified coded amino acids present in proteins, including crosslinking amino acids (secondary amino acids)

    Modifications to side-chain functional groups of coded amino acids

    1. The aliphatic and aromatic coded amino acids may exist in αβ-dehydrogenated forms and the β-hydroxy-α-amino acids may undergo post-translational dehydration, so as to introduce αβ-dehydroamino acid residues, –NH–C(=CR1R2)–CO–, into polypeptides.

    2. Side-chain OH, NH or NH2 proton(s) may be substituted by glycosyl, phosphate or sulphate. These substituent groups are lost during hydrolysis preceding analysis and during laboratory treatment of proteins by hydrolysis prior to chemical sequencing, which creates a problem that is usually solved through spectroscopic and other analytical techniques.

    3. Side-chain NH2 of lysine may be methylated or acylated (Nε-methylalanyl, Nε-diaminopimelyl).

    4. Side-chain NH2 of glutamine may be methylated, giving N5-methylglutamine, and the side-chain NH2 of asparagine may be glycosylated.

    5. Side-chain CH2 may be hydroxylated, e.g. hydroxylysine, hydroxyprolines (trans-4-hydroxyproline in particular), or carboxylated, e.g. to give α-aminomalonic acid, β-carboxyaspartic acid, γ-carboxyglutamic acid, β-hydroxyaspartic acid, etc.

    6. Side-chain aromatic or heteroaromatic moieties may be hydroxylated, halogenated or N-methylated.

    7. The side-chain of arginine may be modified (e.g. to give ornithine (Orn), R = CH2CH2CH2NH2, or citrulline (Cit), R = CH2CH2CH2NHCONH2).

    8. The side-chain of cysteine may be modified, as in 1 above; also selenocysteine (CH2SeH instead of CH2SH; see footnote a to Table 1.1) and lanthionine (see 10 below).

    9. The side-chain of methionine may be S-alkylated (see Table 1.3) or oxidised at S to give methionine sulphoxide.

    10. Crosslinks in proteins may be formed by condensation between nearby side-chains.

    (a) From lysine: e.g. lysinoalanine, as if from [lysine + serine − H2O]. [Structure not reproduced: H-Lys-OH joined to H-Ala-OH, with the label "dehydroalanine".]

    (b) From tyrosine: 3,3′-dityrosine, 3,3′,5′,3″-tertyrosine, etc.

    (c) From cysteine: oxidation of the thiol grouping (–SH + HS– → –S–S–) to give the disulphide, or to give cysteic acid (Cya): –SH → –SO3H; and alkylation leading to sulphide formation (e.g. alkylation as if by dehydroalanine to give lanthionine):

    (Further examples of crosslinking amino acids in peptides and proteins are given in Section 5.11.)

    Nomenclature of post-translationally modified amino acids

    Abbreviated names for close relatives of the coded amino acids can be based on the three-letter names when appropriate; thus, L-Pro after post-translational hydroxylation gives L-Hypro (trans-4-hydroxyproline, or (2S,4R)-hydroxyproline).

    Current nomenclature recommendations (see footnote to Table 1.1) allow a number of abbreviations to be used for some non-coded amino acids possessing trivial names (some of which are used above and elsewhere in this book): Dopa, β-Ala, Glp, Sar, Cya, Hcy (homocysteine) and Hse (homoserine) are among the more common.

    [Structure not reproduced: lanthionine formation, 2 H-Cys-OH giving two H-Ala-OH units joined through a sulphide (S) bridge.]

  • The three-letter system has some advantages and has gradually been extended (Figure 1.4) to encompass several amino acids other than the coded amino acids. It is usually used to display schemes of laboratory peptide synthesis (Chapter 7) since it allows protecting groups and other structural details to be added, something that is very difficult and often confusing if attempted with the one-letter system.

    The one-letter abbreviation (like its three-letter equivalent) represents an amino-acid residue and the system allows the structure of a peptide or protein to be conveniently stated as a string of letters, written as a line of text, incorporating the long-used convention that the amino terminus (the N-terminus) is to the LEFT and the carboxy terminus (the C-terminus) is to the RIGHT. This convention originates in the Fischer projection formula for an L-α-amino acid or a peptide made up of L-α-amino acids; the L-configuration places the amino group to the left and the carboxy group to the right in a structural formula, as in Figure 1.3.

    There are increasing numbers of violations of these rules; N-acetylalanine, for example, is likely to be abbreviated Ac-Ala in the research literature rather than by its correct abbreviation Ac-Ala-OH (but never AcA). This does not usually lead to ambiguity, on the basis of the rule that peptide structures are written with the N-terminus to the left and the C-terminus to the right. Thus, Ac-Ala should still be correctly interpreted by a reader to mean CH3CONHCH(CH3)CO2H when this rule is kept in mind, since Ala-OAc (more correctly, H-Ala-OAc) would represent the mixed anhydride NH2CH(CH3)COOCOCH3 (there is a footnote about the term "mixed anhydride" on p. 152).
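    The left-to-right convention described above lends itself to a mechanical translation between the two notations. The following is a minimal sketch, not part of the original text: the function name to_three_letter and the choice to write free termini as H- and -OH are assumptions made for illustration, and only the twenty coded amino acids are covered (an unknown letter raises KeyError).

```python
# Standard three-letter abbreviations for the twenty coded amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(sequence: str) -> str:
    """Expand a one-letter peptide string into three-letter notation.

    The input is read N-terminus first (left) to C-terminus last (right);
    the free termini are written explicitly as H- and -OH.
    """
    residues = [ONE_TO_THREE[letter] for letter in sequence.upper()]
    return "-".join(["H", *residues, "OH"])


print(to_three_letter("G"))   # H-Gly-OH
print(to_three_letter("FS"))  # H-Phe-Ser-OH
```

    Because the string is read left to right, reversing the input letters describes a different peptide, which is the point of the N-terminus-to-the-left rule.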


    Figure 1.3. (a) The dipeptide L-phenylalanyl-L-serine in the Fischer depiction. (b) The schematic structure of a hexapeptide in the Fischer depiction, resulting in inefficient use of space. (c) The three-letter and one-letter conventions for a representative peptide, GGA---FP.

  • Links through functional groups in side-chains of the amino-acid residues can be indicated in abbreviated structures of peptides (Figure 1.4). Cyclisation between the C- and N-termini to give a cyclic oligopeptide can also be shown in abbreviated structural formulae. Insulin (Figure 1.4) provides an example of the relatively common disulphide bridge (there are three of these in the molecule), whereas cyclosporin A (a cyclic undecapeptide from Trichoderma inflatum, which is valuable for its immunosuppressant property that is exploited in organ-transplant surgery) is a product of post-translational cyclisation (Figure 1.4).


    Figure 1.4. Post-translationally modified peptides: (a) Human proinsulin. (b) Cyclosporin A (Me is CH3). As well as the post-translationally modified threonine derivative (residue 1, called MeBmt), cyclosporin A contains one D-amino acid, four N-methyl-L-leucine residues, one non-natural amino acid, Abu (butyrine, side-chain C2H5), Sar (sarcosine, N-methylglycine) and N-methyl-L-valine, but only two of the eleven residues are coded L-amino acids, valine and alanine.

  • 1.6 Post-translational processing: modifications of amino-acid residues within polypeptides

    The major classes of structurally altered amino-acid side-chains within ribosomally synthesised polypeptides, which are achieved by intracellular reactions, are listed in Table 1.2 (see also Chapter 8).

    1.7 Post-translational processing: in vivo cleavages of the amide backbone of polypeptides

    Changes to the amide backbone of the polypeptide through enzymatic cleavages transform the inactive polypeptide into its fully active shortened form. The polypeptide may be transported to the site of action after ribosomal synthesis and then processed there. Standard terminology has emerged for the extended polypeptides: pre-, pro- and prepro-peptides for the inactive N-terminal-extended, C-terminal-extended and N- and C-terminal-extended forms, respectively, of the active compound. Figure 1.5 shows schematically the post-translational stages from human proinsulin (Figure 1.4) to insulin.

    1.8 Non-protein amino acids, alias non-proteinogenic amino acids or non-coded amino acids

    This further term is needed since there are several examples of higher organisms that utilise non-protein α-amino acids that are available in cells in the free form (α-amino acids that are normally incapable of being used in ribosomal synthesis). Some of these free amino acids (Table 1.3) play important roles, one example being S-adenosyl-L-methionine, which is a supplier of cellular methyl groups; for example, for the biosynthesis of neuroactive amines (and also for the biosynthesis of many


    Figure 1.5. Generation of the active hormone, insulin, from the translated peptide, proinsulin (Chan and Steiner, 1977).

  • other methylated species). Another physiologically important α-amino acid in this category is L-Dopa, the precursor of dopamine in the brain, which is used for treatment of afflictions such as Parkinson's disease and to bring about the return from certain comatose states (described in the book Awakenings by Oliver Sacks) that may be induced by L-Dopa.


    Table 1.3. Some non-protein amino acids with biological roles, that are excluded from ribosomal protein synthesis

    Notes:
    1. γ-Carboxyglutamic acid, a constituent of calcium-modulating proteins (introduced through post-translational processing of glutamic acid residues).
    2. Treatment for Parkinson's disease.
    3. Potent excitatory effects, parent of a family of toxic natural kainoids present in fungi. Domoic acid, which has trans,trans-CHMeCHCHCHCHCHMeCO2H in place of the isopropenyl side-chain of kainic acid, is extraordinarily toxic, with fatalities ensuing through eating contaminated shellfish (Baldwin et al., 1990).
    4. Exhibits potent NMDA receptor activity.
    5. One of a number of protein crosslinks.
    6. Widely distributed in cells.

    [Structures not reproduced. Compounds shown, with the number of the relevant note in parentheses: L-Dopa (2); γ-Carboxy-L-glutamic acid (1); Quisqualic acid (4); Di-L-tyrosine (5); S-Adenosyl-L-methionine (6); Kainic acid (3), i.e. (3S)-carboxymethyl-(4S)-isopropenyl-(S)-proline.]

  • Of course, most of the 700 or so natural amino acids mentioned at the start of this chapter will be non-protein amino acids. All these were, until recently, thought to be rigorously excluded from protein synthesis and other cellular events that are crucial to life processes, but a very few of these that are structurally related to the coded amino acids may be incorporated into proteins under laboratory conditions. This has been achieved by biosynthesising proteins in media that lack the required coded amino acid, but which contain a close analogue. For example, incorporation of the four-membered-ring α-imino acid azetidine-2-carboxylic acid instead of the five-membered-ring proline and incorporation of norleucine (side-chain CH2CH2CH2CH3) instead of methionine (side-chain CH2CH2SCH3) have been demonstrated (Richmond, 1972) and β-(3-thienyl)-alanine has been assimilated into protein synthesis by E. coli (Kothakota et al., 1995). Unusual amino acids that are not such close structural relatives of the coded amino acids have been coupled in the laboratory to tRNAs, then shown to be utilised for ribosomal peptide synthesis in vivo (Noren et al., 1989).

    Ways have been found, in the laboratory, of broadening the specificity of some enzymes (particularly the proteinases, but also certain lipases that can be used in laboratory peptide synthesis; see Chapter 7), for example by employing organic solvents, so that these enzymes catalyse some of the reactions of non-protein amino acid derivatives and some of the reactions of peptides that incorporate unusual amino acids. It has proved possible to involve D-enantiomers of the coded amino acids and D- and L-isomers of non-protein amino acids in peptide synthesis, to generate non-natural peptides.

    1.9 Coded amino acids, non-natural amino acids and peptides in nutrition and food science and in human physiology

    The nutritional labels for some of the protein amino acids, such as "essential amino acids", are an indication of their roles in this context. The meaning of the term "essential" differs from species to species and reflects the dependence of the organism on certain ingested amino acids that it cannot synthesise for itself, but which it needs in order to be able to generate its life-sustaining proteins. For the human species, the essential amino acids are the L-enantiomers of leucine, valine, isoleucine, lysine, methionine, threonine, phenylalanine, histidine and tryptophan. This implies that the other coded amino acids can be obtained from these essential amino acids, if not through other routes. There are some surprising pathways. For example, cysteine can be generated from methionine, the loss of the side-chain carbon atoms being achieved through passage via cystathionine (Finkelstein, 1990); but homocysteine, the presence of which has been implicated as a causal factor in vascular disease, is also formed first in this route by demethylation of methionine. The D- and L-enantiomers of coded amino acids generally have different tastes and it has recently been appreciated that many fermented foods, such as yoghourt and shellfish (amongst many other food sources), contain substantial amounts of the D-enantiomers of the coded amino acids.

    The contribution of the D-enantiomers to the characteristic taste of foods is currently being evaluated, but it is clear that the D-enantiomers generally taste sweeter, or at least less bitter, than do their L-isomers. Of course, kitchen preparation can involve many subtle chemical changes that enhance the attractiveness of natural foodstuffs, including racemisation (Man and Bada, 1987); therefore D-enantiomers may be introduced in this way. Peptides are taste contributors, for example the bitter-tasting dipeptides Trp-Phe and Trp-Pro and the tripeptide Leu-Pro-Trp that are formed in beer yeast residues (Matsusita and Ozaki, 1993).

    Some coded amino acids are acceptable as food additives and some are widely used in this way (e.g. L-glutamic acid and its monosodium salt). Addition of amino acids to the diet is unnecessary for people already eating an adequate and balanced food supply and the toxicity of even the essential amino acids (methionine is the most toxic of all the coded amino acids (Food and Drugs Administration, Washington, USA, 1992)) should be better publicised, because some coded amino acids are easily available (for use in specialised diets by body-builders, for example) and are sometimes used unwisely. The use of L-tryptophan for its putative anti-depressant and other health properties was responsible for the outbreak of eosinophilia-myalgia syndrome that affected more than 1500 persons (with more than 30 fatalities) in the USA during 1989–90, though the problem was ascribed not to the amino acid itself but rather to an impurity introduced into the amino acid during manufacture (Smith et al., 1991). At the other end of the scale, some amino acids have more trivial uses, e.g. L-tyrosine in sun-tan lotion for cosmetic browning of the skin.

    Methionine is included in some proprietary paracetamol products (Pameton; Smith Kline Beecham), since it counteracts some serious side-effects that are encountered with paracetamol overdosing through helping to restore glutathione levels that are the body's natural defence against products of oxidised paracetamol. However, the recommended antidote (bearing in mind the toxicity of methionine) is intravenous N-acetyl-L-cysteine, which, in any case, reaches the liver of the overdosed patient faster.

    Derivatives of aspartic acid have special importance in neurological research; the N-acetyl derivative is a putative marker of neurones and N-methyl-D-aspartic acid (NMDA) is creating interest for its possible links with Alzheimer's disease. NMDA is a potent excitant of spinal neurones; there are receptors in the brain for this α-imino acid, for which agonists/antagonists are being sought. A particular interaction being studied is that between ethanol and NMDA receptors (Collingridge and Watkins, 1994; see also Meldrum, 1991).

    The industrial production base that has been developed to meet these demands (see Chapter 6) makes many amino acids cheaply available for other purposes such as laboratory use and has contributed in no small measure to the development of the biotechnological sector of the chemical industry.


  • 1.10 The geological and extra-terrestrial distribution of amino acids

    The development of sensitive analytical methods for amino acids became an essential support for the study of geological specimens (terrestrial ones and lunar and Martian samples) from the 1970s. Some of the primary and secondary protein amino acids (and some non-protein amino acids) were established to exist in meteorites (certainly in one of the largest known, the Murchison meteorite from Western Australia) though they have not been found in lunar samples. The scepticism that greeted an inference from this discovery (the inference that life as we know it exists, or once existed, on other planetary bodies) has also boosted interest in the chemistry of the amino acids to try to support alternative explanations for their presence in meteorites. The possibility that such relatively sensitive compounds could have survived the trauma experienced by meteorites penetrating the Earth's atmosphere was soon rejected. They must have been synthesised in the meteorites during the final traumatic stage of their journey. This conclusion was obtained bearing in mind the relevant amino-acid chemistry (Chapter 4); even the common, relatively much more gentle, laboratory practice of ultrasonic treatment of geological and biological samples prior to amino-acid analysis was hastily discouraged when it was found that this causes chemical structural changes to certain common amino acids (e.g. conversion of glutamic acid into glycine); and the injection of energy into mixtures of certain simple compounds also causes the formation of amino acids (Chapter 6).

    The use of telescopic spectroscopy has revealed the existence of glycine in interstellar dust clouds. Since these clouds amount to huge masses of matter (greater than the total mass of condensed objects such as stars and planets), there must be universal availability of amino acids, even though they are dispersed thinly in the vast volume of space.

    1.11 Amino acids in archaeology and in forensic science

    Amino-acid analysis of relatively young fossils and of other archaeological samples has provided information on their age and on the average temperature profiles that characterised the Earth at the time of life of these samples. Samples from living organisms containing protein that has ceased turnover, i.e. proteins in metabolic culs-de-sac such as tooth and eye materials, can be analysed for their degree of racemisation of particular amino acids (Asp and Ser particularly; Leu for older specimens) in order to provide this sort of information. The D:L ratio for the aspartic acid present in these sources can be interpreted to assign an age to the organism, since racemisation of this amino acid is relatively rapid on the geological time scale and even in terms of life-span of a human being. The D:L ratio is easily measured through standard amino-acid analysis techniques (see Chapter 4; Bada, 1984).

    D-Aspartic acid is introduced through racemisation into eye-lens protein in the


  • living organism at the rate of 0.14% per year, so that a 30-year-old person has accumulated 4.2% D-aspartic acid in this particular protein. It is in age determination of recently deceased corpses (and other scene-of-the-crime artefacts, for which 14C-dating is inaccurate), that forensic science interest in reliable amino-acid dating is centred. For older specimens, the method is wildly unreliable: thus, Ötztal Ice Man (the corpse found at Hauslabjoch, high in the Austrian Tyrol, in 1991) was dated to 4550 ± 27 BC by radiocarbon dating, but would have a grossly inaccurate assignment of birthday on the basis of amino-acid racemisation data (Bonani et al., 1994). In unrelated areas, amino-acid racemisation has given useful information on the age of art specimens (e.g. dating of oil paintings through study of the egg-protein content).
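    The eye-lens figures quoted above (0.14% per year, hence 4.2% at age 30) amount to a simple linear accumulation. The sketch below reproduces that arithmetic and inverts it to estimate an age from a measured D-aspartic acid content. The function names are hypothetical, and the linear model is an assumption that can only be expected to hold at small extents of racemisation; it ignores the sequence and environment effects that make the method controversial.

```python
# Rate quoted in the text for D-aspartic acid in eye-lens protein.
RATE_PERCENT_PER_YEAR = 0.14


def d_asp_percent(age_years: float) -> float:
    """Percentage of D-aspartic acid accumulated after age_years (linear model)."""
    return RATE_PERCENT_PER_YEAR * age_years


def estimated_age(measured_d_asp_percent: float) -> float:
    """Age in years implied by a measured D-aspartic acid percentage (linear model)."""
    return measured_d_asp_percent / RATE_PERCENT_PER_YEAR


print(round(d_asp_percent(30), 2))    # 4.2
print(round(estimated_age(4.2), 1))   # 30.0
```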

    Such inferences derive from data on the kinetics of racemisation, measured in the laboratory (described in Section 4.18.2) and there is a good deal of controversy surrounding the dating method since no account is taken of the catalytic influence on racemisation rates of molecular structures that surrounded the amino-acid residue for some or all the years. It is, for example, now known that the rate of racemisation of an amino acid, when it is a residue in a protein, is strongly dependent on the nature of the adjacent amino acids in the sequence; the particular amino acid on which measurement is made might have been located in a racemisation-promoting environment for many years after the death of the organism.

    1.12 Roles for amino acids in chemistry and in the life sciences

    1.12.1 Amino acids in chemistry

    The physiological importance of α-amino acids ensures a sustained interest in their chemistry, particularly in pharmaceutical exploration for new drugs, and in their synthesis, reactions and physical properties. As is often the case when the chemistry of a biologically important class of compounds is being vigorously developed, an increasing range of uses has been identified for α-amino acids in the wider context of stereoselective laboratory synthesis (including studies of biomimetic synthetic routes).

    1.12.2 Amino acids in the life sciences

    Apart from their main roles, particularly their use as building blocks for condensation into peptides and proteins, α-amino acids are used by plants, fungi and bacteria as biosynthetic building blocks. Many alkaloids are derived from phenylalanine and tyrosine (e.g. Figure 1.6), and penicillins and cephalosporins are biosynthesised from tripeptides (Chapter 8).


  • 1.13 β- and higher amino acids

    There are relatively few examples, but increasing numbers of amino acids with greater separation of the amino and carboxy functions have been found to play important biological roles (Drey, 1985; Smith, 1995). The coded amino acid, aspartic acid, could be classified either as an α- or as a β-amino acid. Glutamic acid (which can be classified either as an α-amino acid or as a γ-amino acid) is the biological source, through decarboxylation, of γ-aminobutyric acid (known as GABA; see Table 1.4), which functions as a neurotransmitter (as do glycine and L-glutamic acid and, probably, three other coded L-α-amino acids). The simple tripeptide glutathione (actually, an isopeptide; see Section 1.2 and Chapter 8) is constructed using the side-chain carboxy group rather than the α-carboxy group of glutamic acid and therefore could be said to be a peptide formed by the condensation of a γ-amino acid and two α-amino acids.
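    The dual classification of aspartic and glutamic acids follows from counting along the carbon chain from whichever carboxy group is taken as the reference. The sketch below is illustrative only: the function name greek_class is hypothetical, and it assumes an unbranched chain described by systematic locants, with the carbon next to the chosen carboxy carbon labelled alpha.

```python
# Greek-letter labels by position, counting the chosen carboxy carbon as position 1.
GREEK = {2: "alpha", 3: "beta", 4: "gamma", 5: "delta", 6: "epsilon"}


def greek_class(amino_carbon: int, carboxy_carbon: int) -> str:
    """Classify an amino acid by the separation of its amino and carboxy groups.

    amino_carbon and carboxy_carbon are chain locants in the systematic name,
    e.g. glutamic acid is 2-aminopentanedioic acid (carboxy carbons 1 and 5).
    """
    position = abs(amino_carbon - carboxy_carbon) + 1
    return GREEK[position]


print(greek_class(2, 1))  # glutamic acid from C-1: alpha
print(greek_class(2, 5))  # glutamic acid from its side-chain carboxy C-5: gamma
print(greek_class(2, 4))  # aspartic acid from its side-chain carboxy C-4: beta
print(greek_class(4, 1))  # 4-aminobutanoic acid (GABA): gamma
```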

    Numerous natural peptides with antibiotic activity and other intensely potent physiological actions incorporate β- and higher amino acids, as well as highly processed coded amino acids. The microcystins, which act as hepatotoxins, provide one


    Figure 1.6. Routes from L-tyrosine to alkaloids. Alkaloid biosynthesis is often grouped into categories based on the initiating amino acid; i.e. the ornithine/cysteine route (e.g. nicotine); the phenylalanine/tyrosine/tryptophan route (e.g. the isoquinoline alkaloids, such as pellotine); etc.

  • example. They are represented by the family structure cyclo[-D-Ala-X-D-MeAsp-Z-Adda-D-Glu-Mdha-], where X and Z are various coded L-amino acids and D-MeAsp is D-erythro-β-methylaspartic acid, found in the water bloom-forming cyanobacterium Oscillatoria agardhii. The structure of one of these, [D-Asp3,DHb7]microcystin-RR (Sano and Kaya, 1995), is displayed in Chapter 3 (Figure 3.6).

    Moving away from the simpler α-amino acids as constituents of peptides, the γ-amino acid (R)-carnitine, Me3N+CH2CH(OH)CH2CO2−, is a rare example of a free amino-acid derivative with an important physiological role. This amino acid betaine is sometimes called vitamin BT and plays a part in the conversion of stored body fat into energy, through transport of fat molecules of high relative molecular mass to the sites of their conversion.

    The (2R,3S)-phenylisoserine side-chain at position 13 of the taxane skeleton in the anti-cancer drug taxol (from the yew tree) is essential to its action.


    Table 1.4. Some β-amino acids and higher amino acids found in biological sources

    Mentioned elsewhere in this chapter, as examples that are α-amino acids and also γ-, β- and δ-amino acids, respectively, are:

    Glutamic acid [b]: H3N+CH(CO2−)CH2CH2CO2H
    Aspartic acid [b]: H3N+CH(CO2−)CH2CO2H
    α-Amino-adipic acid [c]: H3N+CH(CO2−)CH2CH2CH2CO2H

    Other examples:

    β-Alanine [b] (β-Ala): H3N+CH2CH2CO2−

    γ-Aminobutyric acid [a] (GABA): H3N+CH2CH2CH2CO2−

    Statine [c]: (3S,4S)-3-hydroxy-4-amino-6-methylheptanoic acid (structure not reproduced)

    Phenylisoserine [c] [(2R,3S)-3-amino-2-hydroxy-3-phenylpropanoic acid; AHPA]: C6H5CH(NH3+)CH(OH)CO2−, present in taxol (a potent anti-cancer agent) and present in bestatin, H3N+CH(CH2C6H5)CH(OH)CONHCH[CH2CH(CH3)2]CO2− (an immunological response-modifying agent)

    δ-Aminolaevulinic acid [a]: H3N+CH2COCH2CH2CO2− (an analogue with a C=C grouping is the active constituent of light-activated ointments for the treatment of skin cancer)

    Notes: Some of these naturally occurring amino acids are:
    [a] found only in the free state and not found in peptides;
    [b] found in the free state and also found in peptides; and
    [c] found only in peptides and other derivatised forms.

  • 1.14 References

    Reviews providing information on all aspects of amino-acid science (Barrett, 1985; Greenstein and Winitz, 1961; Williams, 1989) and peptide chemistry (Jones, 1991) are listed at the end of the Foreword. References cited in the text of this chapter are the following.

    Bada, J. L. (1984) Methods Enzymol., 106, 98.
    Baldwin, J. E., Maloney, M. G. and Parsons, A. F. (1990) Tetrahedron, 46, 7263.
    Bonani, G., Ivy, S. D., Hajdas, I., Niklaus, T. R. and Suter, M. (1994) Radiocarbon, 36, 247.
    Chan, S. J. and Steiner, D. F. (1977) Trends Biochem. Sci., 2, 254.
    Collingridge, G. L. and Watkins, J. C. (1994) The NMDA Receptor, Second Edition, Oxford University Press, Oxford.
    Drey, C. N. C., in Barrett, G. C., Ed. (1985) Chemistry and Biochemistry of the Amino Acids, Chapman & Hall, London, p. 25.
    Finkelstein, J. D. (1990) J. Nutr. Biochem., 1, 228.
    Food and Drugs Administration, Washington, USA (1992) Safety of Amino Acids used as Dietary Supplements.
    Hanby, W. S. and Rydon, H. N. (1946) Biochem. J., 40, 297.
    Kothakota, S., Mason, T. L., Tirrell, D. A. and Fournier, M. J. (1995) J. Am. Chem. Soc., 117, 536.
    Kyte, J. and Doolittle, R. F. (1985) J. Mol. Biol., 157, 105.
    Man, E. H. and Bada, J. L. (1987) Ann. Rev. Nutr., 7, 209.
    Matsusita, I. and Ozaki, S. (1993) Peptide Chemistry; Proceedings of the 31st International Conference, pp. 165–8 (Chem. Abs. 121, 77 934).
    Meldrum, B. S. (1991) Excitatory Amino Acid Antagonists, Blackwell, Oxford.
    Noren, C. J., Anthony-Cahill, S. J., Griffiths, M. C. and Schultz, P. G. (1989) Science, 244, 182.
    Richmond, M. H. (1972) Bacteriol. Rev., 26, 398.
    Sano, T. and Kaya, K. (1995) Tetrahedron Lett., 36, 8603.
    Smith, B. (1995) Methods of Non-α-Amino Acid Synthesis, Dekker, New York.
    Smith, M. J., Mazzola, E. P., Farrell, T. J., Sphon, J. A., Page, S. W., Ashley, D., Sirimanne, S. R., Hill, R. H. and Needham, L. L. (1991) Tetrahedron Lett., 32, 991.
    Stadtman, T. C. (1996) Ann. Rev. Biochem., 65, 83.
    Sutton, C. H., Baize, M. W. and Todd, L. J. (1993) Inorg. Chem., 33, 4221.


  • 2 Conformations of amino acids and peptides

    2.1 Introduction: the main conformational features of amino acids and peptides

    This topic has been thoroughly developed insofar as the conformational behaviour of amino acids and peptides in aqueous solutions is concerned. The main driving force for conformational studies has been the pharmaceutical interest in the interactions of biologically active amino acids and peptides with tissue, particularly with cell receptors. The solid-state behaviour of amino acids and peptides, though less relevant in the pharmaceutical context, has not escaped investigation, because X-ray crystallography equipment is now more widely distributed and easier to use.

    The conformational behaviour of N- and C-terminal-derivatised amino acids and peptides in organic solvents has also been studied, particularly by nuclear magnetic resonance and circular dichroism spectrometric techniques (in which advances in instrumentation have been very considerable; see Chapter 3).

    2.2 Configurational isomerism within the peptide bond

    The amide group shows restricted flexibility because its central NH–CO bond has some double-bond character due to resonance stabilisation [–NH–CO– ↔ –N⁺H=C(O⁻)–]. The energy barrier that this creates is insufficient to prevent rotation, but sufficient to ensure that geometrical isomers exist under normal physiological conditions of temperature and solvent. A particular peptide can therefore exist in a variety of conformations in solution, often as an equilibrium mixture of several of them.

    Planar cis and trans isomers (Figure 2.1(a)) are the most stable configurations, because the planar structure involves maximum orbital overlap. For the majority of peptides built up from α-amino acids, the amide bond adopts the trans geometry. α-Imino acids (notably proline but also N-methylamino acids), as well as α-methyl-α-amino acids, assist the adoption of more flexible structures by peptides when they are built into peptides, because mixtures of cis and trans configurations are more likely. Cis-amide bonds are rare in natural polypeptides that contain no proline residues (there are three cis-amide bonds in the enzyme carboxypeptidase A and one in the smaller polypeptide concanavalin A), though current re-investigations by NMR methods (Chapter 3) are revealing more distorted trans-amide bonds in structures established without such details in the early days of X-ray crystallography.

    Regions of a peptide can exist either in a random conformation (i.e., the denatured state) or in one of a number of stereoregular conformations: the extended conformation (Figure 2.2), the α-helix (either right-handed or left-handed) and the β-sheet (Figure 2.3). Two of the stereoregular conformations are stabilised by intramolecular hydrogen bonding and intermolecular hydrogen bonding accounts for

    Figure 2.1. (a) Amide bonds in the trans and cis configurations. (b) Torsion angles for an amino-acid residue in a peptide.

    Figure 2.2. A representative dipeptide made up from L-α-amino acids, in the extended conformation with the amide bond in the trans configuration.

  • Table 2.1. The tendency of coded amino acids for α-helix-promoting/breaking and β-structure promoting/breaking, when part of a peptide (neutral: 0.8–1.00, no tendency either way) (a)

    α-Helix-promoting (helicogenic) ---------- Neutral ---------- α-Helix-breaking

    Glu⁻  Ala   Leu   His⁺  Met   Gln   Trp   Val   Phe   Lys⁺
    1.53  1.45  1.34  1.24  1.20  1.17  1.14  1.14  1.12  1.07

    Ile   Asp⁻  Thr   Ser   Arg⁺  Cys   Asn   Tyr   Pro   Gly
    1.00  0.98  0.82  0.79  0.79  0.77  0.73  0.61  0.59  0.53

    β-Structure promoting ---------- Neutral ---------- β-Structure breaking

    Met   Val   Ile   Cys   Tyr   Phe   Gln   Leu   Thr   Trp
    1.67  1.65  1.60  1.30  1.29  1.28  1.23  1.22  1.20  1.19

    Ala   Arg⁺  Gly   Asp⁻  Lys⁺  Ser   His⁺  Asn   Pro   Glu⁻
    0.97  0.90  0.81  0.80  0.75  0.72  0.71  0.65  0.62  0.26

    Notes:
    (a) On the basis of a scrutiny of crystal structures of 15 polypeptides/proteins (Chou and Fasman, 1974); see Fasman (1989, 1996).

    Further details:
    1. An early term, s-values (Zimm and Bragg, 1959) for these tendencies, is rarely used now.
    2. Glu⁻, His⁺, Lys⁺, Asp⁻, Arg⁺ indicate the presence of a charge on the side-chain as a result of protonation or of deprotonation.
    3. The Chou–Fasman rules in brief (Fasman, 1985) are as follows. A cluster of FOUR α-helix-promoting amino-acid residues (Glu⁻, Ala, Leu, His⁺, Met, Gln, Trp, Val, or Phe) in a run of SIX amino acids will initiate an α-helix to form, until sets of α-helix-breaking amino acids (those with s-values less than 1.00 in this table) are encountered. Proline cannot occur in an α-helix in the inner part of a polypeptide chain, or in a helix in the C-terminal end of a polypeptide chain, but can occur within the last three residues in the N-terminal end of an α-helix. A cluster of THREE β-structure-forming amino-acid residues out of a run of FIVE amino acids will initiate the formation of a β-sheet, which will end when a set of four β-sheet-breaking amino acids is reached.
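    The nucleation rules in note 3 of Table 2.1 can be sketched in a few lines of Python. This is a minimal illustration only, not the full Chou–Fasman procedure: the function name nucleation_windows and the one-letter residue codes are introduced here for convenience, and the set of β-structure formers is an assumption (taken to be the residues whose values in the table exceed 1.00).

```python
# Helix formers as listed in note 3 (Glu, Ala, Leu, His, Met, Gln, Trp, Val, Phe).
HELIX_FORMERS = set("EALHMQWVF")
# Assumption: beta-structure formers are the residues with table values > 1.00
# (Met, Val, Ile, Cys, Tyr, Phe, Gln, Leu, Thr, Trp).
SHEET_FORMERS = set("MVICYFQLTW")


def nucleation_windows(seq, formers, need, window):
    """Return start indices of windows holding at least `need` former residues."""
    hits = []
    for start in range(len(seq) - window + 1):
        count = sum(1 for res in seq[start:start + window] if res in formers)
        if count >= need:
            hits.append(start)
    return hits


# Hexadecapeptide quoted in Section 2.7 (one-letter codes).
peptide = "EAALEAALELAAELAA"
helix_starts = nucleation_windows(peptide, HELIX_FORMERS, need=4, window=6)
sheet_starts = nucleation_windows(peptide, SHEET_FORMERS, need=3, window=5)
print("helix nucleation windows start at:", helix_starts)
print("sheet nucleation windows start at:", sheet_starts)
```

    The sketch only locates candidate nucleation sites; extension of a helix or sheet until breaking residues are met, and the special treatment of proline, are left out.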

  • numerous physical phenomena (gel formation as gelatin solutions are cooled, which is mimicked by the behaviour of some synthetic oligopeptide solutions) and more subtle aspects of protein behaviour of physiological importance.

    The α-helix is one of the best-known regular conformational features as a sub-heading within the secondary structure of polypeptides and is frequently adopted in chains of six or more helicogenic amino acids (see Table 2.1 for a definition of terms and examples). The β-sheet is another classic conformational structure that has been detected from the earliest days of X-ray crystallography of proteins. Local


    Figure 2.3. Representative peptides made up from L-α-amino acids. (a) The structural formula of a parallel β-sheet showing H-bonds. (b) The structural formula of a right-handed α-helix showing H-bonds. (c) The standard tape depictions of the right-handed α-helix and antiparallel β-sheet. All amide bonds are in the trans configuration.

  • regions within peptides can show the onset of β-sheet structuring and the term β-bend or β-turn is used (Figure 2.4). A tighter turn is fairly common, the γ-turn (Figure 2.4).

    The conformational details within an amino acid residue of a polypeptide are most clearly defined as torsion angles (Figure 2.1(b)) for the backbone single bonds (ω for the C(O)–N bond, φ for the Cα–N bond, ψ for the Cα–C(O) bond) and χ1 for the Cα–Cβ (side-chain) bond, χ2 for the Cβ–Cγ bond, etc. Torsion angles are positive for right-handed helicity, looking along the axis of a peptide from the N-terminus to the C-terminus (i.e., viewing the Newman projection of the backbone from the N-terminus to the C-terminus) and backbone torsion angles are all 180° for the fully extended conformation (Figure 2.2). Trans peptide bonds have ω = 180° (cis peptide bonds have ω = 0°), and the trans value is the torsion angle seen in all flexible peptides constructed from α-amino acids. For poly(L-alanine), in its usual right-handed α-helix, ψ is −47° and φ is −57°.
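    A torsion angle of this kind can be computed from the Cartesian coordinates of the four atoms that define it. The sketch below is an illustration rather than part of the original text: the function name torsion_angle and the toy coordinates are hypothetical, and the sign convention is assumed to be the usual one (clockwise rotation of the front bond onto the rear bond, viewed along the central bond, is positive).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p1, p2, p3, p4):
    """Signed torsion angle (degrees) about the p2-p3 bond for atoms p1..p4."""
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)          # normal to the plane of p1, p2, p3
    n2 = _cross(b2, b3)          # normal to the plane of p2, p3, p4
    b2_len = math.sqrt(_dot(b2, b2))
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates: a cis (eclipsed) and a trans (fully extended) arrangement.
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, trans)
```

    Applied to the backbone atoms C(O), N, Cα and C(O) of successive residues, the same function would give φ; the other backbone angles follow by shifting the four-atom window along the chain.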

    There are several types of α-helix and of β-bends, all differing slightly in torsion angles within the residues of the actual turn and, in the case of the α-helix, also determined by the pattern of intramolecular and intermolecular hydrogen bonding. Thus the type I 3₁₀ β-bend includes a hydrogen bond linking three amino acid residues into a ten-membered ring; the type II variant mentioned in older literature is the same except for the helicity within the turn. There are actually thirteen distinct types of β-bend, differing only in dihedral angles φ and ψ (Hermkens et al., 1994). Research into creating synthetic peptide mimetic drugs (Chapter 9) has focussed on copying the general disposition of functional groups around turns in biologically active peptides, to create analogues that often lack the peptide backbone and are therefore capable of reaching the target site in the living organism by virtue of surviving enzymatic destruction.
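    The "ten-membered ring" can be checked by counting atoms. Assuming the hydrogen bond runs from the C=O of residue i to the N–H of residue i+n, the ring contains the carbonyl O and C of residue i, the three backbone atoms (N, Cα, C) of each intervening residue, and the N and H of residue i+n. The helper below (a hypothetical name, hbond_ring_size) simply encodes that count.

```python
def hbond_ring_size(n):
    """Atoms in the ring closed by a C=O(i) ... H-N(i+n) hydrogen bond."""
    if n < 1:
        raise ValueError("n must be at least 1")
    carbonyl_atoms = 2             # O and C of residue i
    intervening = 3 * (n - 1)      # N, C-alpha, C of each residue between i and i+n
    amide_atoms = 2                # N and H of residue i+n
    return carbonyl_atoms + intervening + amide_atoms


for n in (2, 3, 4):
    print(n, hbond_ring_size(n))
```

    With n = 3 the count is ten, matching the 3₁₀ designation; n = 2 and n = 4 give the seven- and thirteen-membered rings usually associated with the γ-turn and the α-helix respectively.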

    The currently interesting motif RGDS (i.e. –Arg–Gly–Asp–Ser–) present in some cell-adhesion proteins that are of crucial importance in growth and other

    Figure 2.4. (a) A type I 3₁₀ β-bend. (b) A γ-turn.

  • aggregation phenomena has stimulated synthetic efforts with a view to obtaining smaller molecules with similar properties. The glycine and aspartic acid residues form part of a turn when the RGDS sequence is part of a small peptide (see also Section 2.4).

    The α-helix can enclose more than the average 3.6 amino-acid residues per turn that is its standard feature and can exist in left-handed helicity as well as in its most common right-handed form (the right-handed form is normally induced by the L-configured chiral centres).

    Imino-acid residues modify these torsion angles considerably; replacement of the amide proton, for example by methylation, induces conformational changes, since cis–trans isomerisation is more easily brought about (because there is a lower energy barrier). However, hydrogen-bonding ability and, of course, other types of close-packing interactions, are substantially modified at α-imino-acid residues.

    The conformations of natural polypeptides have arisen as a consequence of the sequences of amino acids that they contain. In some cases, their biological function does not depend directly on the presence of particular amino acids (e.g., collagen and other connective tissue) whereas, in other cases, certain functional side-chains are mandatory at particular sites in the molecule in order for physiological activity to be shown (enzymes). The correct three-dimensional disposition of these side-chains is obtained through conformational features in their vicinity within the polypeptide and often through features some distance away (Chapter 8).

    The conservatism implied in protein structure is revealed with the effect of N-methylation at one amino-acid residue in insulin (conversion of an α-amino-acid residue into an α-imino-acid residue). Residue 2 in the A-chain is L-isoleucine (the structure is given in Figure 1.4) and replacement of this by its N-methyl analogue, or similar replacement of residue 3 (L-valine), leads to distorted helical sequences from residues 2–8, as shown by circular dichroism (CD) measurements, which leads in turn to the change-over from the natural quaternary structure (dimeric for bovine zinc insulin) to the monomeric form. It is not surprising, bearing in mind the substantial conformational changes accompanying apparently insignificant structural changes, that these analogues show only about 12% of the potency of the natural hormone, according to radioimmunoassay evidence (Chapter 4).

    The effect of a trans→cis change at just one position in a biologically active oligopeptide can drastically change the properties of that peptide. The antibiotic Gramicidin S, with its alanyl residue replaced by α-amino-isobutyric acid (Ala→Aib, causing a trans→cis configurational change at the amide bond at that location according to CD measurements) is inactive against Gram-positive bacteria. Insertion of Aib into a peptide generally restricts the local conformation to a left-handed or right-handed 3₁₀ β-bend or α-helix, the outcome depending on precisely what other residues are nearby.

    However, structural changes of this nature need not have an effect if the changes are remote from the part of the peptide that responds to receptors; thus bradykinin


  • (Section 4.11) with the Pro→Aib substitution at position 7 retains high biological activity, probably because bradykinin itself has a cis-amide grouping at the N-terminal side of this proline residue (Cann et al., 1987). The conformational nature of natural bradykinin is now well understood and, under normal physiological conditions, an individual molecule spends its time interconverting between two types of conformation, one disordered and the other partially ordered, with a right-handed 3₁₀ β-bend in the N-terminal region and a turn towards the C-terminus (Cann et al., 1987).

    2.3 Dipeptides

    The flexibility of acyclic structures such as peptides has been established convincingly in the broad picture that has emerged from conformational studies. The solution conformation of a dipeptide is sensitive towards (a) the nature of the solvent, (b) the concentration of the solution, (c) the temperature of the solution and (d) the presence and nature of other solutes. The sensitivity is much greater for dipeptides than it is for other acyclic solute molecules, owing to the proximity of the terminal amino and carboxy groups. A further controlling factor is configuration; there are four possible stereoisomers for a representative dipeptide, namely two diastereoisomers and their enantiomers. An L,L-dipeptide (Figure 2.2) adopts a quite different solution conformation from that of its L,D-diastereoisomer (Figure 2.5).
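    The count of four stereoisomers follows from two chiral centres, each of which can be L or D. As a minimal sketch (the function names are hypothetical, and each residue is assumed to carry a single chiral centre, which excludes glycine and residues such as isoleucine and threonine), the configurations and their enantiomeric pairs can be enumerated as follows.

```python
from itertools import product


def stereoisomers(n_residues):
    """All L/D configuration labels for a peptide of n_residues chiral residues."""
    return ["".join(labels) for labels in product("LD", repeat=n_residues)]


def enantiomer(config):
    """Mirror image: every centre inverted (L <-> D)."""
    return config.translate(str.maketrans("LD", "DL"))


def enantiomeric_pairs(n_residues):
    """Group the stereoisomers into pairs of mirror images."""
    pairs = set()
    for config in stereoisomers(n_residues):
        pairs.add(tuple(sorted((config, enantiomer(config)))))
    return sorted(pairs)


print(stereoisomers(2))
print(enantiomeric_pairs(2))
```

    Each pair is one diastereoisomer together with its enantiomer, so a dipeptide gives two pairs; members of different pairs (for example LL and LD) are diastereoisomers of one another.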

    2.4 Cyclic oligopeptides

    Beyond the relatively rigid cyclic dipeptides (dioxopiperazines; see Chapter 6), macrocyclic peptides show considerable flexibility, though less than that of their acyclic analogues. The biological importance of cyclic peptides has

    Figure 2.5. The conformation of an L,D-dipeptide in water.

  • been attested by numerous examples (antibiotics such as Gramicidin S and immunosuppressant agents such as cyclosporin; Figure 1.4).

    Many attempts to restrict conformational freedom of amino-acid and peptide analogues synthesised for pharmaceutical studies have been described in reports of current drug research, one such approach being the synthesis of cyclic analogues of biologically active short peptides. For example, the importance of the cell-adhesion sequence RGDS (i.e., –Arg–Gly–Asp–Ser–; see also Section 2.2) has stimulated research in which cyclic peptides enclosing it in such a manner as to constrain its conformation have been synthesised for biological testing (the fact that there are naturally occurring cyclic peptides possessing biological activity provided the inspiration for this research).

    2.5 Acyclic oligopeptides

    As the distance of separation of the terminal functional groups increases, the sensitivity of the conformation towards solution parameters decreases. Attractions between side-chains become more dominant in determining conformations of longer peptides; these interacting side-chains do not necessarily have to be close together if they are polar, but may be bridged by water molecules.

    Oligopeptides are generally water-soluble, the hydrophilic amide backbone groupings contributing substantially to this property, and intramolecular attractions within short oligopeptides are practically nonexistent in water. N- and C-derivatised dipeptides are insoluble in water, but derivatised peptides longer than three or four amino-acid residues are usually appreciably soluble in water, owing to their greater number of backbone amide groups.

    2.6 Longer oligopeptides: primary, secondary and tertiary structure

    As the chain length increases further, other factors come into play to determine the overall conformation of the peptide and the electrically charged groups at the ends of the chain are less important in this respect. Interactions, repulsive and attractive, between side-chains are dominant and the primary structure (the sequence of the peptide and the stereochemistry at each chiral centre) determines the run of the peptide chain through the molecule (the secondary structure) and the overall shape of a single polypeptide chain (globular, extended, etc.; the tertiary structure).

    The term 'domain' is applied to describe regions within a protein molecule. Particular associations can arise between distinct secondary structures within the tertiary structure (an α-helix in one part of the polypeptide molecule can pack with a particular region elsewhere in the sequence; a β-sheet can be stabilised by a nearby structural feature from a distinct part of the polypeptide sequence). These domains


  • are usually dominant stabilising features, characteristically leading to the functional conformation of a protein (Figure 2.6).

    2.7 Polypeptides and proteins: quaternary structure and aggregation

    The association of two or more identical polypeptide molecules to form an overall globular protein is often found to occur for enzymes and other proteins, the resulting overall entity representing their quaternary structure. This association behaviour can be demonstrated using synthetic peptides. Spectacular examples have been worked out by reasoning out the structural requirements needed, on the basis of the behaviour of individual amino acids. Thus, derivatised amphiphilic oligopeptides (peptides with alternating hydrophilic and hydrophobic residues, e.g. the N-acetylhexadecapeptide amide Ac–EAALEAALELAAELAA–NH2) form aggregates of α-helices in aqueous solution at pH 7 (0.5 mg ml⁻¹) which change to aggregated β-sheets when the pH of the solution is lowered to 4 (Mutter et al., 1991). The evidence for this association behaviour is easily obtained through circular dichroism

    Figure 2.6. Egg lysozyme (a globular protein), illustrating the method of representing the conformation of a polypeptide, in which the run of the polypeptide through the molecule is made clear by depicting it as a ribbon. There are three disulphide bridges in this enzyme. Parts of the molecule are shown in greater detail, to reveal that a globular protein is typically made up of a collection of the various conformational features (α-helix and β-sheet with β-bends determining the number and length of each stretch of extended conformation and random regions).

  • spectroscopy (Chapter 3). The side-chains of the glutamic acid residues (E) and of the leucine residues (L) are on opposite sides of the peptide backbone in its extended conformation (Figure 2.2) and the helicogenic nature of the constituent amino acids at pH 7 causes the formation of α-helices that present a hydrophilic face on one side of the helix and their hydrophobic face on the other (Figure 2.7).
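    The two faces of such a helix can be made visible with a simple helical-wheel calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn, i.e. 100° of rotation per residue about the helix axis; the function name wheel_angles is hypothetical and the sequence is the hexadecapeptide quoted in the text, written in one-letter codes.

```python
DEGREES_PER_RESIDUE = 100  # 360 degrees / 3.6 residues per turn (ideal alpha-helix)


def wheel_angles(sequence, residue):
    """Angular positions (degrees, 0-359) of `residue` on a helical wheel."""
    return [(index * DEGREES_PER_RESIDUE) % 360
            for index, amino_acid in enumerate(sequence)
            if amino_acid == residue]


peptide = "EAALEAALELAAELAA"
glu_angles = wheel_angles(peptide, "E")   # hydrophilic glutamic acid residues
leu_angles = wheel_angles(peptide, "L")   # hydrophobic leucine residues
print("Glu:", glu_angles)
print("Leu:", leu_angles)
```

    Under these assumptions the glutamic acid side-chains all fall within one arc of the wheel and the leucine side-chains within the opposite arc, which is the amphiphilic arrangement described above.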

    2.8 Examples of conformational behaviour; ordered and disordered states and transitions between them

    There are direct consequences of the amino-acid make-up and sequence of amino-acid residues within a peptide (its primary structure) on its overall conformational behaviour. In spite of the regularity of the backbone of the molecule, a wide variety of conformations is adopted, indicating the role of the side-chains in determining the overall conformation of the molecule.

    2.8.1 The main categories of polypeptide conformation

    Sequence-determined features are summarised in the following sub-sections.

    2.8.1.1 One extreme situation

    Simple aliphatic hydrocarbon side-chains tend to lead to adoption of a regular conformation. The side-chains are hydrophobic and repel each other but repel water to a much greater extent; the molecule organises itself so that amide groups undertake stabilising hydrogen bonding with each other, leading to the α-helix, or