Protein.pdf

Protein

This article is about a class of molecules. For protein as a nutrient, see Protein (nutrient). For other uses, see Protein (disambiguation).

A representation of the 3D structure of the protein myoglobin showing turquoise alpha helices. This protein was the first to have its structure solved by X-ray crystallography. Towards the right-center among the coils is a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).

Proteins (/ˈproʊˌtiːnz/ or /ˈproʊti.ɨnz/) are large biological molecules, or macromolecules, consisting of one or more long chains of amino acid residues. Proteins perform a vast array of functions within living organisms, including catalyzing metabolic reactions, replicating DNA, responding to stimuli, and transporting molecules from one location to another. Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes, and which usually results in folding of the protein into a specific three-dimensional structure that determines its activity.

A linear chain of amino acid residues is called a polypeptide. A protein contains at least one long polypeptide. Short polypeptides, containing fewer than about 20–30 residues, are rarely considered to be proteins and are commonly called peptides, or sometimes oligopeptides. Adjacent amino acid residues are bonded together by peptide bonds. The sequence of amino acid residues in a protein is defined by the sequence of a gene, which is encoded in the genetic code. In general, the genetic code specifies 20 standard amino acids; however, in certain organisms the genetic code can include selenocysteine and, in certain archaea, pyrrolysine. Shortly after or even during synthesis, the residues in a protein are often chemically modified by posttranslational modification, which alters the physical and chemical properties, folding, stability, activity, and ultimately, the function of the proteins. Sometimes proteins have non-peptide groups attached, which can be called prosthetic groups or cofactors. Proteins can also work together to achieve a particular function, and they often associate to form stable protein complexes.

Once formed, proteins only exist for a certain period of time and are then degraded and recycled by the cell’s machinery through the process of protein turnover. A protein’s lifespan is measured in terms of its half-life and covers a wide range. Proteins can exist for minutes or years, with an average lifespan of 1–2 days in mammalian cells. Abnormal or misfolded proteins are degraded more rapidly, either because they are targeted for destruction or because they are unstable.

Like other biological macromolecules such as polysaccharides and nucleic acids, proteins are essential parts of organisms and participate in virtually every process within cells. Many proteins are enzymes that catalyze biochemical reactions and are vital to metabolism. Proteins also have structural or mechanical functions, such as actin and myosin in muscle and the proteins in the cytoskeleton, which form a system of scaffolding that maintains cell shape. Other proteins are important in cell signaling, immune responses, cell adhesion, and the cell cycle. Proteins are also necessary in animals’ diets, since animals cannot synthesize all the amino acids they need and must obtain essential amino acids from food. Through the process of digestion, animals break down ingested protein into free amino acids that are then used in metabolism.

Proteins may be purified from other cellular components using a variety of techniques such as ultracentrifugation, precipitation, electrophoresis, and chromatography; the advent of genetic engineering has made possible a number of methods to facilitate purification. Methods commonly used to study protein structure and function include immunohistochemistry, site-directed mutagenesis, X-ray crystallography, nuclear magnetic resonance and mass spectrometry.



    1 Biochemistry

    Main articles: Biochemistry, Amino acid and peptide bond

    Chemical structure of the peptide bond (bottom) and the three-dimensional structure of a peptide bond between an alanine and an adjacent amino acid (top/inset)

    Resonance structures of the peptide bond that links individual amino acids to form a protein polymer

    Most proteins consist of linear polymers built from series of up to 20 different L-α-amino acids. All proteinogenic amino acids possess common structural features, including an α-carbon to which an amino group, a carboxyl group, and a variable side chain are bonded. Only proline differs from this basic structure, as it contains an unusual ring to the N-end amine group, which forces the CO–NH amide moiety into a fixed conformation.[1] The side chains of the standard amino acids, detailed in the list of standard amino acids, have a great variety of chemical structures and properties; it is the combined effect of all of the amino acid side chains in a protein that ultimately determines its three-dimensional structure and its chemical reactivity.[2] The amino acids in a polypeptide chain are linked by peptide bonds. Once linked in the protein chain, an individual amino acid is called a residue, and the linked series of carbon, nitrogen, and oxygen atoms are known as the main chain or protein backbone.[3]

    The peptide bond has two resonance forms that contribute some double-bond character and inhibit rotation around its axis, so that the alpha carbons are roughly coplanar. The other two dihedral angles in the peptide bond determine the local shape assumed by the protein backbone.[4] The end of the protein with a free carboxyl group is known as the C-terminus or carboxy terminus, whereas the end with a free amino group is known as the N-terminus or amino terminus.

    The words protein, polypeptide, and peptide are a little ambiguous and can overlap in meaning. Protein is generally used to refer to the complete biological molecule in a stable conformation, whereas peptide is generally reserved for short amino acid oligomers often lacking a stable three-dimensional structure. However, the boundary between the two is not well defined and usually lies near 20–30 residues.[5] Polypeptide can refer to any single linear chain of amino acids, usually regardless of length, but often implies an absence of a defined conformation.

    2 Synthesis

    2.1 Biosynthesis

    Main article: Protein biosynthesis

    A ribosome produces a protein using mRNA as template.

    [Figure: DNA (coding strand GTGCATCTGACTCCTGAGGAGAAG, complementary strand CACGTAGACTGAGGACTCCTCTTC) is transcribed into RNA (GUGCAUCUGACUCCUGAGGAGAAG), which is translated into protein (V H L T P E E K).] The DNA sequence of a gene encodes the amino acid sequence of a protein.

    Proteins are assembled from amino acids using information encoded in genes. Each protein has its own unique amino acid sequence that is specified by the nucleotide sequence of the gene encoding this protein. The genetic code is a set of three-nucleotide sets called codons, and each three-nucleotide combination designates an amino acid; for example, AUG (adenine-uracil-guanine) is the code for methionine. Because DNA contains four nucleotides, the total number of possible codons is 64 (4 × 4 × 4); hence, there is some redundancy in the genetic code, with some amino acids specified by more than one codon.[6]
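The codon lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete implementation: the table holds only the codons needed for the example sequence from the figure (plus AUG and the three stop codons), and the names `CODON_TABLE`, `transcribe` and `translate` are chosen here for illustration.

```python
# Partial codon table: only the codons needed for this example.
# A complete table would cover all 64 codons.
CODON_TABLE = {
    "AUG": "M",  # methionine
    "GUG": "V",  # valine
    "CAU": "H",  # histidine
    "CUG": "L",  # leucine
    "ACU": "T",  # threonine
    "CCU": "P",  # proline
    "GAG": "E",  # glutamate
    "AAG": "K",  # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three nucleotides at a time and look up each codon."""
    residues = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError if the codon is not in the partial table
        if amino_acid == "*":
            break  # stop codon ends the chain
        residues.append(amino_acid)
    return "".join(residues)


mrna = transcribe("GTGCATCTGACTCCTGAGGAGAAG")
print(mrna)             # GUGCAUCUGACUCCUGAGGAGAAG
print(translate(mrna))  # VHLTPEEK
print(4 ** 3)           # 64 possible codons
```

The one-letter output matches the protein row of the figure (V H L T P E E K). Residues are appended in reading order, so the first codon corresponds to the N-terminal end of the chain.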

    Genes encoded in DNA are first transcribed into pre-messenger RNA (pre-mRNA) by proteins such as RNA polymerase. Most organisms then process the pre-mRNA (also known as a primary transcript) using various forms of post-transcriptional modification to form the mature mRNA, which is then used as a template for protein synthesis by the ribosome. In prokaryotes the mRNA may either be used as soon as it is produced, or be bound by a ribosome after having moved away from the nucleoid. In contrast, eukaryotes make mRNA in the cell nucleus and then translocate it across the nuclear membrane into the cytoplasm, where protein synthesis then takes place. The rate of protein synthesis is higher in prokaryotes than eukaryotes and can reach up to 20 amino acids per second.[7]

    The process of synthesizing a protein from an mRNA template is known as translation. The mRNA is loaded onto the ribosome and is read three nucleotides at a time by matching each codon to its base-pairing anticodon located on a transfer RNA molecule, which carries the amino acid corresponding to the codon it recognizes. The enzyme aminoacyl tRNA synthetase charges the tRNA molecules with the correct amino acids. The growing polypeptide is often termed the nascent chain. Proteins are always biosynthesized from N-terminus to C-terminus.[6]

    The size of a synthesized protein can be measured by the number of amino acids it contains and by its total molecular mass, which is normally reported in units of daltons (synonymous with atomic mass units), or the derivative unit kilodalton (kDa). Yeast proteins are on average 466 amino acids long and 53 kDa in mass.[5] The largest known proteins are the titins, a component of the muscle sarcomere, with a molecular mass of almost 3,000 kDa and a total length of almost 27,000 amino acids.[8]
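As a rough arithmetic check on the figures quoted above, dividing molecular mass by chain length gives the average mass per residue. The sketch below uses only the numbers in the text; the function name is hypothetical, and the result is an average, not a property of any particular amino acid.

```python
def mean_residue_mass_da(mass_kda: float, n_residues: int) -> float:
    """Average mass per amino acid residue, in daltons."""
    return mass_kda * 1000.0 / n_residues


yeast = mean_residue_mass_da(53, 466)      # average yeast protein
titin = mean_residue_mass_da(3000, 27000)  # approximate upper figures quoted for titin

print(f"yeast average: {yeast:.1f} Da per residue")
print(f"titin:         {titin:.1f} Da per residue")
```

Both values come out a little above 110 Da per residue, which is why multiplying chain length by roughly 110 Da is a common back-of-the-envelope estimate of protein mass.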

    2.2 Chemical synthesis

    Short proteins can also be synthesized chemically by a family of methods known as peptide synthesis, which rely on organic synthesis techniques such as chemical ligation to produce peptides in high yield.[9] Chemical synthesis allows for the introduction of non-natural amino acids into polypeptide chains, such as attachment of fluorescent probes to amino acid side chains.[10] These methods are useful in laboratory biochemistry and cell biology, though generally not for commercial applications. Chemical synthesis is inefficient for polypeptides longer than about 300 amino acids, and the synthesized proteins may not readily assume their native tertiary structure. Most chemical synthesis methods proceed from C-terminus to N-terminus, opposite the biological reaction.[11]
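One way to see why stepwise synthesis becomes inefficient for long chains is to compound a per-step yield over every coupling. The sketch below is an illustration under assumptions not stated in the text: it treats the synthesis as one coupling per added residue with a constant yield, and the 99% figure is a hypothetical value chosen only to show the trend.

```python
def overall_yield(per_step_yield: float, n_residues: int) -> float:
    """Fraction of full-length product after n_residues - 1 coupling steps.

    Assumes every coupling step has the same, independent yield.
    """
    return per_step_yield ** (n_residues - 1)


PER_STEP = 0.99  # hypothetical coupling yield, for illustration only

for n in (10, 50, 100, 300):
    print(f"{n:>3} residues: {overall_yield(PER_STEP, n):.3f} of chains are full length")
```

Even with a high per-step yield, the fraction of full-length product falls off geometrically with chain length, which is consistent with the practical length limit described above.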

    3 Structure

    Main article: Protein structure

    Further information: Protein structure prediction

    The crystal structure of the chaperonin. Chaperonins assist protein folding.

    Three possible representations of the three-dimensional structure of the protein triose phosphate isomerase. Left: all-atom representation colored by atom type. Middle: Simplified representation illustrating the backbone conformation, colored by secondary structure. Right: Solvent-accessible surface representation colored by residue type (acidic residues red, basic residues blue, polar residues green, nonpolar residues white)

    Most proteins fold into unique 3-dimensional structures. The shape into which a protein naturally folds is known as its native conformation.[12] Although many proteins can fold unassisted, simply through the chemical properties of their amino acids, others require the aid of molecular chaperones to fold into their native states.[13] Biochemists often refer to four distinct aspects of a protein’s structure:[14]

    Primary structure: the amino acid sequence. A protein is a polyamide.

    Secondary structure: regularly repeating local structures stabilized by hydrogen bonds. The most common examples are the alpha helix, beta sheet and turns. Because secondary structures are local, many regions of different secondary structure can be present in the same protein molecule.

    Tertiary structure: the overall shape of a single protein molecule; the spatial relationship of the secondary structures to one another. Tertiary structure is generally stabilized by nonlocal interactions, most commonly the formation of a hydrophobic core, but also through salt bridges, hydrogen bonds, disulfide bonds, and even posttranslational modifications. The term tertiary structure is often used as synonymous with the term fold. The tertiary structure is what controls the basic function of the protein.

    Quaternary structure: the structure formed by several protein molecules (polypeptide chains), usually called protein subunits in this context, which function as a single protein complex.

    Proteins are not entirely rigid molecules. In addition to these levels of structure, proteins may shift between several related structures while they perform their functions. In the context of these functional rearrangements, these tertiary or quaternary structures are usually referred to as "conformations", and transitions between them are called conformational changes. Such changes are often induced by the binding of a substrate molecule to an enzyme’s active site, or the physical region of the protein that participates in chemical catalysis. In solution, proteins also undergo variation in structure through thermal vibration and the collision with other molecules.[15]

    Molecular surface of several proteins showing their comparative sizes. From left to right are: immunoglobulin G (IgG, an antibody), hemoglobin, insulin (a hormone), adenylate kinase (an enzyme), and glutamine synthetase (an enzyme).

    Proteins can be informally divided into three main classes, which correlate with typical tertiary structures: globular proteins, fibrous proteins, and membrane proteins. Almost all globular proteins are soluble and many are enzymes. Fibrous proteins are often structural, such as collagen, the major component of connective tissue, or keratin, the protein component of hair and nails. Membrane proteins often serve as receptors or provide channels for polar or charged molecules to pass through the cell membrane.[16]

    Intramolecular hydrogen bonds within proteins that are poorly shielded from water attack, and hence promote their own dehydration, are a special case called dehydrons.[17]

    3.1 Structure determination

    Discovering the tertiary structure of a protein, or the quaternary structure of its complexes, can provide important clues about how the protein performs its function. Common experimental methods of structure determination include X-ray crystallography and NMR spectroscopy, both of which can produce information at atomic resolution. However, NMR experiments are able to provide information from which a subset of distances between pairs of atoms can be estimated, and the final possible conformations for a protein are determined by solving a distance geometry problem. Dual polarisation interferometry is a quantitative analytical method for measuring the overall protein conformation and conformational changes due to interactions or other stimulus. Circular dichroism is another laboratory technique for determining internal beta sheet/helical composition of proteins. Cryoelectron microscopy is used to produce lower-resolution structural information about very large protein complexes, including assembled viruses;[18] a variant known as electron crystallography can also produce high-resolution information in some cases, especially for two-dimensional crystals of membrane proteins.[19] Solved structures are usually deposited in the Protein Data Bank (PDB), a freely available resource from which structural data about thousands of proteins can be obtained in the form of Cartesian coordinates for each atom in the protein.[20]
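To make "Cartesian coordinates for each atom" concrete, the sketch below reads one atom record in the fixed-column layout of the legacy PDB text format (atom name in columns 13–16, residue name in 18–20, chain in 22, residue number in 23–26, and x, y, z in columns 31–54). The sample record is invented for illustration and is not taken from a real PDB entry; the `Atom` type and `parse_atom_record` function are hypothetical names, and only a few of the record's fields are extracted.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float


def parse_atom_record(line: str) -> Atom:
    """Extract selected fields from one fixed-column ATOM/HETATM record."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        name=line[12:16].strip(),          # columns 13-16
        residue=line[17:20].strip(),       # columns 18-20
        chain=line[21],                    # column 22
        residue_number=int(line[22:26]),   # columns 23-26
        x=float(line[30:38]),              # columns 31-38, in angstroms
        y=float(line[38:46]),              # columns 39-46
        z=float(line[46:54]),              # columns 47-54
    )


# Invented sample record, assembled field by field so the column widths are explicit.
SAMPLE = (
    "ATOM  "    # columns 1-6: record name
    "    1"     # columns 7-11: atom serial number
    " "         # column 12: unused
    " N  "      # columns 13-16: atom name
    " "         # column 17: alternate location indicator
    "MET"       # columns 18-20: residue name
    " "         # column 21: unused
    "A"         # column 22: chain identifier
    "   1"      # columns 23-26: residue sequence number
    "    "      # columns 27-30: insertion code and padding
    "  27.340"  # columns 31-38: x
    "  24.430"  # columns 39-46: y
    "   2.614"  # columns 47-54: z
)

atom = parse_atom_record(SAMPLE)
print(atom)
```

Slicing by column, not splitting on whitespace, is the safer choice for this format because adjacent fields can run together when values fill their full width.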

    Many more gene sequences are known than protein structures. Further, the set of solved structures is biased toward proteins that can be easily subjected to the conditions required in X-ray crystallography, one of the major structure determination methods. In particular, globular proteins are comparatively easy to crystallize in preparation for X-ray crystallography. Membrane proteins, by contrast, are difficult to crystallize and are underrepresented in the PDB.[21] Structural genomics initiatives have attempted to remedy these deficiencies by systematically solving representative structures of major fold classes. Protein structure prediction methods attempt to provide a means of generating a plausible structure for proteins whose structures have not been experimentally determined.[22]

    4 Cellular functions

    Proteins are the chief actors within the cell, said to be carrying out the duties specified by the information encoded in genes.[5] With the exception of certain types of RNA, most other biological molecules are relatively inert elements upon which proteins act. Proteins make up half the dry weight of an Escherichia coli cell, whereas other macromolecules such as DNA and RNA make up only 3% and 20%, respectively.[23] The set of proteins expressed in a particular cell or cell type is known as its proteome.

    The enzyme hexokinase is shown as a conventional ball-and-stick molecular model. To scale in the top right-hand corner are two of its substrates, ATP and glucose.

    The chief characteristic of proteins that also allows their diverse set of functions is their ability to bind other molecules specifically and tightly. The region of the protein responsible for binding another molecule is known as the binding site and is often a depression or pocket on the molecular surface. This binding ability is mediated by the tertiary structure of the protein, which defines the binding site pocket, and by the chemical properties of the surrounding amino acids’ side chains. Protein binding can be extraordinarily tight and specific; for example, the ribonuclease inhibitor protein binds to human angiogenin with a sub-femtomolar dissociation constant (<10⁻¹⁵ M). Extremely minor chemical changes such as the addition of a single methyl group to a binding partner can sometimes suffice to nearly eliminate binding; for example, the aminoacyl tRNA synthetase specific to the amino acid valine discriminates against the very similar side chain of the amino acid isoleucine.[24]
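To give a feel for what a sub-femtomolar dissociation constant means, the sketch below evaluates the standard equilibrium expression for simple one-to-one binding, fraction bound = [L] / (Kd + [L]). The single-site model is an assumption made here for illustration and is not taken from the text, and 10⁻¹⁵ M is used only as the upper bound implied by "sub-femtomolar".

```python
def fraction_bound(free_ligand_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of protein occupied, assuming simple 1:1 binding."""
    return free_ligand_molar / (kd_molar + free_ligand_molar)


KD = 1e-15  # molar; upper bound implied by "sub-femtomolar"

for concentration in (1e-16, 1e-15, 1e-12, 1e-9):
    occupied = fraction_bound(concentration, KD)
    print(f"[L] = {concentration:.0e} M -> fraction bound = {occupied:.4f}")
```

By definition half of the protein is occupied when the free ligand concentration equals Kd, so with a Kd this small even picomolar concentrations of partner leave the protein almost fully bound.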

    Proteins can bind to other proteins as well as to small-molecule substrates. When proteins bind specifically to other copies of the same molecule, they can oligomerize to form fibrils; this process occurs often in structural proteins that consist of globular monomers that self-associate to form rigid fibers. Protein–protein interactions also regulate enzymatic activity, control progression through the cell cycle, and allow the assembly of large protein complexes that carry out many closely related reactions with a common biological function. Proteins can also bind to, or even be integrated into, cell membranes. The ability of binding partners to induce conformational changes in proteins allows the construction of enormously complex signaling networks.[25] Importantly, because interactions between proteins are reversible and depend heavily on the availability of different groups of partner proteins to form aggregates that are capable of carrying out discrete sets of functions, the study of interactions between specific proteins is key to understanding important aspects of cellular function, and ultimately the properties that distinguish particular cell types.[26][27]

    4.1 Enzymes

    Main article: Enzyme

    The best-known role of proteins in the cell is as enzymes, which catalyze chemical reactions. Enzymes are usually highly specific and accelerate only one or a few chemical reactions. Enzymes carry out most of the reactions involved in metabolism, as well as manipulating DNA in processes such as DNA replication, DNA repair, and transcription. Some enzymes act on other proteins to add or remove chemical groups in a process known as posttranslational modification. About 4,000 reactions are known to be catalyzed by enzymes.[28] The rate acceleration conferred by enzymatic catalysis is often enormous: as much as a 10¹⁷-fold increase in rate over the uncatalyzed reaction in the case of orotate decarboxylase (78 million years without the enzyme, 18 milliseconds with the enzyme).[29]
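The two times quoted for orotate decarboxylase can be checked against the stated rate enhancement with a short calculation. This sketch uses only the figures from the text; the length of a year (365.25 days) is the one assumption added here.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length (Julian year)

uncatalyzed_s = 78e6 * SECONDS_PER_YEAR  # 78 million years, in seconds
catalyzed_s = 18e-3                      # 18 milliseconds, in seconds

enhancement = uncatalyzed_s / catalyzed_s
print(f"rate enhancement: about {enhancement:.1e}-fold")
```

The ratio lands just above 10¹⁷, in agreement with the order of magnitude given for the enzyme.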

    The molecules bound and acted upon by enzymes are called substrates. Although enzymes can consist of hundreds of amino acids, it is usually only a small fraction of the residues that come in contact with the substrate, and an even smaller fraction (three to four residues on average) that are directly involved in catalysis.[30] The region of the enzyme that binds the substrate and contains the catalytic residues is known as the active site. Dirigent proteins are members of a class of proteins which dictate the stereochemistry of a compound synthesized by other enzymes.

    4.2 Cell signaling and ligand binding

    Ribbon diagram of a mouse antibody against cholera that binds a carbohydrate antigen

    Many proteins are involved in the process of cell signaling and signal transduction. Some proteins, such as insulin, are extracellular proteins that transmit a signal from the cell in which they were synthesized to other cells in distant tissues. Others are membrane proteins that act as receptors whose main function is to bind a signaling molecule and induce a biochemical response in the cell. Many receptors have a binding site exposed on the cell surface and an effector domain within the cell, which may have enzymatic activity or may undergo a conformational change detected by other proteins within the cell.[31]

    Antibodies are protein components of an adaptive immune system whose main function is to bind antigens, or foreign substances in the body, and target them for destruction. Antibodies can be secreted into the extracellular environment or anchored in the membranes of specialized B cells known as plasma cells. Whereas enzymes are limited in their binding affinity for their substrates by the necessity of conducting their reaction, antibodies have no such constraints. An antibody’s binding affinity to its target is extraordinarily high.[32]

    Many ligand transport proteins bind particular small biomolecules and transport them to other locations in the body of a multicellular organism. These proteins must have a high binding affinity when their ligand is present in high concentrations, but must also release the ligand when it is present at low concentrations in the target tissues. The canonical example of a ligand-binding protein is haemoglobin, which transports oxygen from the lungs to other organs and tissues in all vertebrates and has close homologs in every biological kingdom.[33] Lectins are sugar-binding proteins which are highly specific for their sugar moieties. Lectins typically play a role in biological recognition phenomena involving cells and proteins.[34] Receptors and hormones are highly specific binding proteins.

    Transmembrane proteins can also serve as ligand transport proteins that alter the permeability of the cell membrane to small molecules and ions. The membrane alone has a hydrophobic core through which polar or charged molecules cannot diffuse. Membrane proteins contain internal channels that allow such molecules to enter and exit the cell. Many ion channel proteins are specialized to select for only a particular ion; for example, potassium and sodium channels often discriminate for only one of the two ions.[35]

    4.3 Structural proteins

    Structural proteins confer stiffness and rigidity to otherwise-fluid biological components. Most structural proteins are fibrous proteins; for example, collagen and elastin are critical components of connective tissue such as cartilage, and keratin is found in hard or filamentous structures such as hair, nails, feathers, hooves, and some animal shells.[36] Some globular proteins can also play structural functions; for example, actin and tubulin are globular and soluble as monomers, but polymerize to form long, stiff fibers that make up the cytoskeleton, which allows the cell to maintain its shape and size.

    Other proteins that serve structural functions are motor proteins such as myosin, kinesin, and dynein, which are capable of generating mechanical forces. These proteins are crucial for cellular motility of single-celled organisms and the sperm of many multicellular organisms which reproduce sexually. They also generate the forces exerted by contracting muscles[37] and play essential roles in intracellular transport.

    5 Methods of study

    Main article: Protein methods

    The activities and structures of proteins may be examined in vitro, in vivo, and in silico. In vitro studies of purified proteins in controlled environments are useful for learning how a protein carries out its function: for example, enzyme kinetics studies explore the chemical mechanism of an enzyme’s catalytic activity and its relative affinity for various possible substrate molecules. By contrast, in vivo experiments can provide information about the physiological role of a protein in the context of a cell or even a whole organism. In silico studies use computational methods to study proteins.

    5.1 Protein purification

    Main article: Protein purification

    To perform in vitro analysis, a protein must be purified away from other cellular components. This process usually begins with cell lysis, in which a cell’s membrane is disrupted and its internal contents released into a solution known as a crude lysate. The resulting mixture can be purified using ultracentrifugation, which fractionates the various cellular components into fractions containing soluble proteins; membrane lipids and proteins; cellular organelles; and nucleic acids. Precipitation by a method known as salting out can concentrate the proteins from this lysate. Various types of chromatography are then used to isolate the protein or proteins of interest based on properties such as molecular weight, net charge and binding affinity.[38] The level of purification can be monitored using various types of gel electrophoresis if the desired protein’s molecular weight and isoelectric point are known, by spectroscopy if the protein has distinguishable spectroscopic features, or by enzyme assays if the protein has enzymatic activity. Additionally, proteins can be isolated according to their charge using electrofocusing.[39]

    For natural proteins, a series of purification steps may be necessary to obtain protein sufficiently pure for laboratory applications. To simplify this process, genetic engineering is often used to add chemical features to proteins that make them easier to purify without affecting their structure or activity. Here, a tag consisting of a specific amino acid sequence, often a series of histidine residues (a "His-tag"), is attached to one terminus of the protein. As a result, when the lysate is passed over a chromatography column containing nickel, the histidine residues ligate the nickel and attach to the column while the untagged components of the lysate pass unimpeded. A number of different tags have been developed to help researchers purify specific proteins from complex mixtures.[40]

    5.2 Cellular localization

    Proteins in different cellular compartments and structures tagged with green fluorescent protein (here, white)

    The study of proteins in vivo is often concerned with the synthesis and localization of the protein within the cell. Although many intracellular proteins are synthesized in the cytoplasm and membrane-bound or secreted proteins in the endoplasmic reticulum, the specifics of how proteins are targeted to specific organelles or cellular structures are often unclear. A useful technique for assessing cellular localization uses genetic engineering to express in a cell a fusion protein or chimera consisting of the natural protein of interest linked to a "reporter" such as green fluorescent protein (GFP).[41] The fused protein’s position within the cell can be cleanly and efficiently visualized using microscopy,[42] as shown in the figure opposite.

    Other methods for elucidating the cellular location of proteins require the use of known compartmental markers for regions such as the ER, the Golgi, lysosomes or vacuoles, mitochondria, chloroplasts, plasma membrane, etc. With the use of fluorescently tagged versions of these markers or of antibodies to known markers, it becomes much simpler to identify the localization of a protein of interest. For example, indirect immunofluorescence will allow for fluorescence colocalization and demonstration of location. Fluorescent dyes are used to label cellular compartments for a similar purpose.[43]

    Other possibilities exist as well. For example, immunohistochemistry usually utilizes an antibody to one or more proteins of interest that is conjugated to enzymes yielding either luminescent or chromogenic signals that can be compared between samples, allowing for localization information. Another applicable technique is cofractionation in sucrose (or other material) gradients using isopycnic centrifugation.[44] While this technique does not prove colocalization of a compartment of known density and the protein of interest, it does increase the likelihood, and is more amenable to large-scale studies.

    Finally, the gold-standard method of cellular localization is immunoelectron microscopy. This technique also uses an antibody to the protein of interest, along with classical electron microscopy techniques. The sample is prepared for normal electron microscopic examination, and then treated with an antibody to the protein of interest that is conjugated to an extremely electron-dense material, usually gold. This allows for the localization of both ultrastructural details and the protein of interest.[45]

    Through another genetic engineering application known as site-directed mutagenesis, researchers can alter the protein sequence and hence its structure, cellular localization, and susceptibility to regulation. This technique even allows the incorporation of unnatural amino acids into proteins, using modified tRNAs,[46] and may allow the rational design of new proteins with novel properties.[47]

    5.3 Proteomics

    Main article: Proteomics

    The total complement of proteins present at a time in a cell or cell type is known as its proteome, and the study of such large-scale data sets defines the field of proteomics, named by analogy to the related field of genomics. Key experimental techniques in proteomics include 2D electrophoresis,[48] which allows the separation of a large number of proteins; mass spectrometry,[49] which allows rapid high-throughput identification of proteins and sequencing of peptides (most often after in-gel digestion); protein microarrays,[50] which allow the detection of the relative levels of a large number of proteins present in a cell; and two-hybrid screening, which allows the systematic exploration of protein–protein interactions.[51] The total complement of biologically possible such interactions is known as the interactome.[52] A systematic attempt to determine the structures of proteins representing every possible fold is known as structural genomics.[53]
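
    The digestion step that precedes mass-spectrometric peptide identification can be illustrated in silico. The following is a minimal sketch only: the text does not name the protease, so trypsin is assumed here (cleavage after lysine or arginine unless the next residue is proline), and the function name tryptic_digest and the example sequence are hypothetical.

```python
import re
from typing import List


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> List[str]:
    """Split a protein sequence into peptides using the assumed trypsin rule:
    cleave after K or R, except when the following residue is P."""
    seq = sequence.upper()
    # Zero-width split points: preceded by K/R and not followed by P.
    # Empty fragments (e.g. after a C-terminal K/R) are discarded.
    fragments = [f for f in re.split(r"(?<=[KR])(?!P)", seq) if f]

    peptides = []
    for i in range(len(fragments)):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


if __name__ == "__main__":
    for peptide in tryptic_digest("MKWVTFISLLRPAKGE", missed_cleavages=1):
        print(peptide)
```

    In a real workflow the masses of such predicted peptides would be compared with the measured spectrum; that matching step is not shown here.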

    5.4 Bioinformatics

    Main article: Bioinformatics

    A vast array of computational methods have been developed to analyze the structure, function, and evolution of proteins.

    The development of such tools has been driven by the large amount of genomic and proteomic data available for a variety of organisms, including the human genome. It is not feasible to study every protein experimentally, so only a few are subjected to laboratory experiments while computational tools are used to extrapolate to similar proteins. Such homologous proteins can be efficiently identified in distantly related organisms by sequence alignment. Genome and gene sequences can be searched by a variety of tools for certain properties. Sequence profiling tools can find restriction enzyme sites, locate open reading frames in nucleotide sequences, and predict secondary structures. Using software such as ClustalW, phylogenetic trees can be constructed and evolutionary hypotheses developed regarding the ancestry of modern organisms and the genes they express. The field of bioinformatics is now indispensable for the analysis of genes and proteins.
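
    Two of the sequence-profiling tasks mentioned above, finding open reading frames and locating restriction enzyme sites, can be sketched in a few lines. This is an illustrative simplification rather than how any particular tool works: it scans only the forward strand, treats ATG as the sole start codon, and the function names find_orfs and find_sites and the example sequence are hypothetical. GAATTC is the recognition site of the enzyme EcoRI.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop open reading frames
    on the forward strand. `end` is exclusive and includes the stop codon;
    `min_codons` counts coding codons (start included, stop excluded)."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def find_sites(dna, site):
    """Return 0-based start positions of a recognition site (overlaps allowed)."""
    pattern = "(?=" + re.escape(site.upper()) + ")"
    return [m.start() for m in re.finditer(pattern, dna.upper())]


if __name__ == "__main__":
    example = "CCATGAAACCCGGGTAAGAATTCATG"
    print(find_orfs(example))
    print(find_sites(example, "GAATTC"))
```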

    5.4.1 Structure prediction and simulation

    Constituent amino acids can be analyzed to predict secondary, tertiary and quaternary protein structure, in this case hemoglobin containing heme units.

    Main articles: Protein structure prediction and List of protein structure prediction software

    Complementary to the field of structural genomics, protein structure prediction seeks to develop efficient ways to provide plausible models for proteins whose structures have not yet been determined experimentally.[54] The most successful type of structure prediction, known as homology modeling, relies on the existence of a "template" structure with sequence similarity to the protein being modeled; structural genomics' goal is to provide sufficient representation in solved structures to model most of those that remain.[55] Although producing accurate models remains a challenge when only distantly related template structures are available, it has been suggested that sequence alignment is the bottleneck in this process, as quite accurate models can be produced if a "perfect" sequence alignment is known.[56] Many structure prediction methods have served to inform the emerging field of protein engineering, in which novel protein folds have already been designed.[57] A more complex computational problem is the prediction of intermolecular interactions, such as in molecular docking and protein–protein interaction prediction.[58]
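
    Because homology modeling depends on aligning the target sequence to a template, a small global-alignment example may help make the idea concrete. The sketch below uses the Needleman–Wunsch dynamic-programming algorithm with a deliberately simple scoring scheme (match +1, mismatch -1, linear gap -2); these scores and the function names are assumptions for illustration, and practical tools use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical non-gap columns as a percentage of alignment length."""
    same = sum(1 for p, q in zip(aligned_a, aligned_b) if p == q and p != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    x, y, s = needleman_wunsch("ACGT", "AGT")
    print(x, y, s, percent_identity(x, y))
```

    The percent identity of the resulting alignment is one common, if crude, indicator of whether a template is close enough to be useful.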

    The processes of protein folding and binding can be simulated using techniques such as molecular mechanics, in particular molecular dynamics and Monte Carlo, which increasingly take advantage of parallel and distributed computing (Folding@home project;[59] molecular modeling on GPU). The folding of small alpha-helical protein domains such as the villin headpiece[60] and the HIV accessory protein[61] has been successfully simulated in silico, and hybrid methods that combine standard molecular dynamics with quantum mechanics calculations have allowed exploration of the electronic states of rhodopsins.[62]
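
    The Monte Carlo approach mentioned above can be illustrated with the Metropolis acceptance rule on a toy system. This sketch is not a protein simulation: the one-dimensional double-well energy function, the step size, and all names are assumptions chosen only to show how trial moves are accepted or rejected according to their energy change.

```python
import math
import random


def energy(x):
    """Toy double-well 'energy landscape' with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def metropolis_accept(delta_e, kT, u):
    """Metropolis criterion: always accept downhill moves; accept an uphill
    move with probability exp(-delta_e / kT). `u` is uniform on [0, 1)."""
    return delta_e <= 0.0 or u < math.exp(-delta_e / kT)


def run_monte_carlo(x0, kT, steps, max_step=0.1, seed=0):
    """Return the trajectory (including the start point) of a Metropolis walk."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    x = x0
    trajectory = [x]
    for _ in range(steps):
        trial = x + rng.uniform(-max_step, max_step)
        if metropolis_accept(energy(trial) - energy(x), kT, rng.random()):
            x = trial
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = run_monte_carlo(x0=1.0, kT=0.5, steps=5000)
    mean_energy = sum(energy(x) for x in traj) / len(traj)
    print("mean energy:", mean_energy)
```

    At low temperature the walker stays near one minimum, while at higher temperature it can cross the barrier at x = 0; sampling rare barrier crossings of this kind is one reason large simulations are often run on parallel or distributed hardware.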

    6 Nutrition

    Further information: Protein (nutrient)

    Most microorganisms and plants can biosynthesize all 20 standard amino acids, while animals (including humans) must obtain some of the amino acids from the diet.[23] The amino acids that an organism cannot synthesize on its own are referred to as essential amino acids. Key enzymes that synthesize certain amino acids are not present in animals; one example is aspartokinase, which catalyzes the first step in the synthesis of lysine, methionine, and threonine from aspartate. If amino acids are present in the environment, microorganisms can conserve energy by taking up the amino acids from their surroundings and downregulating their biosynthetic pathways.

    In animals, amino acids are obtained through the consumption of foods containing protein. Ingested proteins are then broken down into amino acids through digestion, which typically involves denaturation of the protein through exposure to acid and hydrolysis by enzymes called proteases. Some ingested amino acids are used for protein biosynthesis, while others are converted to glucose through gluconeogenesis, or fed into the citric acid cycle. This use of protein as a fuel is particularly important under starvation conditions as it allows the body's own proteins to be used to support life, particularly those found in muscle.[63] Amino acids are also an important dietary source of nitrogen.

    7 History and etymology

    Further information: History of molecular biology

    Proteins were recognized as a distinct class of biological molecules in the eighteenth century by Antoine Fourcroy and others, distinguished by the molecules' ability to coagulate or flocculate under treatments with heat or acid.[64] Noted examples at the time included albumin from egg whites, blood serum albumin, fibrin, and wheat gluten.

    Proteins were first described by the Dutch chemist Gerardus Johannes Mulder and named by the Swedish chemist Jöns Jacob Berzelius in 1838.[65][66] Mulder carried out elemental analysis of common proteins and found that nearly all proteins had the same empirical formula, C400H620N100O120P1S1.[67] He came to the erroneous conclusion that they might be composed of a single type of (very large) molecule. The term "protein" to describe these molecules was proposed by Mulder's associate Berzelius; protein is derived from the Greek word πρώτειος (proteios), meaning "primary",[68] "in the lead", or "standing in front".[69] Mulder went on to identify the products of protein degradation such as the amino acid leucine, for which he found a (nearly correct) molecular weight of 131 Da.[67]
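
    The figure of 131 Da for leucine can be checked with simple arithmetic. The sketch below assumes the modern molecular formula of leucine, C6H13NO2, and rounded standard atomic weights, neither of which is given in the text; the helper names are hypothetical, and the parser handles only flat formulas without parentheses.

```python
import re

# Rounded standard atomic weights in daltons (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007,
               "O": 15.999, "P": 30.974, "S": 32.06}


def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'C6H13NO2'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def molar_mass(formula):
    """Sum atomic masses weighted by the atom counts in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())


if __name__ == "__main__":
    print("leucine:", round(molar_mass("C6H13NO2"), 3))
    print("Mulder's empirical formula:",
          round(molar_mass("C400H620N100O120P1S1"), 1))
```

    With these weights leucine comes to roughly 131.2 Da, consistent with the value attributed to Mulder, while his empirical formula corresponds to a formula mass of roughly 8.8 kDa.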

    Early nutritional scientists such as the German Carl von Voit believed that protein was the most important nutrient for maintaining the structure of the body, because it was generally believed that "flesh makes flesh."[70] Karl Heinrich Ritthausen extended known protein forms with the identification of glutamic acid. At the Connecticut Agricultural Experiment Station a detailed review of the vegetable proteins was compiled by Thomas Burr Osborne. Working with Lafayette Mendel and applying Liebig's law of the minimum in feeding laboratory rats, Osborne established the nutritionally essential amino acids. The work was continued and communicated by William Cumming Rose. The understanding of proteins as polypeptides came through the work of Franz Hofmeister and Hermann Emil Fischer. The central role of proteins as enzymes in living organisms was not fully appreciated until 1926, when James B. Sumner showed that the enzyme urease was in fact a protein.[71]

    The difficulty in purifying proteins in large quantities made them very difficult for early protein biochemists to study. Hence, early studies focused on proteins that could be purified in large quantities, e.g., those of blood, egg white, various toxins, and digestive/metabolic enzymes obtained from slaughterhouses. In the 1950s, the Armour Hot Dog Co. purified 1 kg of pure bovine pancreatic ribonuclease A and made it freely available to scientists; this gesture helped ribonuclease A become a major target for biochemical study for the following decades.[67]

    John Kendrew with model of myoglobin in progress.

    Linus Pauling is credited with the successful prediction of regular protein secondary structures based on hydrogen bonding, an idea first put forth by William Astbury in 1933.[72] Later work by Walter Kauzmann on denaturation,[73][74] based partly on previous studies by Kaj Linderstrøm-Lang,[75] contributed an understanding of protein folding and structure mediated by hydrophobic interactions.

    The first protein to be sequenced was insulin, by Frederick Sanger, in 1949. Sanger correctly determined the amino acid sequence of insulin, thus conclusively demonstrating that proteins consisted of linear polymers of amino acids rather than branched chains, colloids, or cyclols.[76] He won the Nobel Prize for this achievement in 1958.

    The first protein structures to be solved were hemoglobin and myoglobin, by Max Perutz and Sir John Cowdery Kendrew, respectively, in 1958.[77][78] As of 2014, the Protein Data Bank has over 90,000 atomic-resolution structures of proteins.[79] In more recent times, cryo-electron microscopy of large macromolecular assemblies[80] and computational protein structure prediction of small protein domains[81] are two methods approaching atomic resolution.

    8 See also

    • DNA-binding protein
    • DNA, RNA and proteins: The three essential macromolecules of life
    • Intein
    • List of proteins
    • Protein design
    • Proteopathy
    • Proteopedia
    • Proteolysis
    • Intrinsically disordered proteins
    • Protein sequence space
    • Protein superfamily
    • Protein structure
    • Protein fold

    9 References

    [1] Nelson DL, Cox MM (2005). Lehninger's Principles of Biochemistry (4th ed.). New York, New York: W. H. Freeman and Company.

    [2] Gutteridge A, Thornton JM (2005). "Understanding nature's catalytic toolkit". Trends in Biochemical Sciences 30 (11): 622–29. doi:10.1016/j.tibs.2005.09.006. PMID 16214343.

    [3] Murray et al., p. 19.

    [4] Murray et al., p. 31.

    [5] Lodish H, Berk A, Matsudaira P, Kaiser CA, Krieger M, Scott MP, Zipursky SL, Darnell J (2004). Molecular Cell Biology (5th ed.). New York, New York: WH Freeman and Company.

    [6] van Holde and Mathews, pp. 1002–42.

    [7] Dobson CM (2000). "The nature and significance of protein folding". In Pain RH (ed.). Mechanisms of Protein Folding. Oxford, Oxfordshire: Oxford University Press. pp. 1–28. ISBN 0-19-963789-X.

    [8] Fulton A, Isaacs W (1991). "Titin, a huge, elastic sarcomeric protein with a probable role in morphogenesis". BioEssays 13 (4): 157–61. doi:10.1002/bies.950130403. PMID 1859393.

    [9] Bruckdorfer T, Marder O, Albericio F (2004). "From production of peptides in milligram amounts for research to multi-tons quantities for drugs of the future". Current Pharmaceutical Biotechnology 5 (1): 29–43. doi:10.2174/1389201043489620. PMID 14965208.

    [10] Schwarzer D, Cole P (2005). "Protein semisynthesis and expressed protein ligation: chasing a protein's tail". Current Opinion in Chemical Biology 9 (6): 561–69. doi:10.1016/j.cbpa.2005.09.018. PMID 16226484.

    [11] Kent SB (2009). "Total chemical synthesis of proteins". Chemical Society Reviews 38 (2): 338–51. doi:10.1039/b700141j. PMID 19169452.

    [12] Murray et al., p. 36.

    [13] Murray et al., p. 37.

    [14] Murray et al., pp. 30–34.

    [15] van Holde and Mathews, pp. 368–75.

    [16] van Holde and Mathews, pp. 165–85.

    [17] Fernández A, Scott R (2003). "Dehydron: a structurally encoded signal for protein interaction". Biophysical Journal 85 (3): 1914–28. Bibcode:2003BpJ....85.1914F. doi:10.1016/S0006-3495(03)74619-0. PMC 1303363. PMID 12944304.

    [18] Branden and Tooze, pp. 340–41.

    [19] Gonen T, Cheng Y, Sliz P, Hiroaki Y, Fujiyoshi Y, Harrison SC, Walz T (2005). "Lipid-protein interactions in double-layered two-dimensional AQP0 crystals". Nature 438 (7068): 633–38. Bibcode:2005Natur.438..633G. doi:10.1038/nature04321. PMC 1350984. PMID 16319884.

    [20] Standley DM, Kinjo AR, Kinoshita K, Nakamura H (2008). "Protein structure databases with new web services for structural biology and biomedical research". Briefings in Bioinformatics 9 (4): 276–85. doi:10.1093/bib/bbn015. PMID 18430752.

    [21] Walian P, Cross TA, Jap BK (2004). "Structural genomics of membrane proteins". Genome Biology 5 (4): 215. doi:10.1186/gb-2004-5-4-215. PMC 395774. PMID 15059248.

    [22] Sleator RD (2012). "Prediction of protein functions". Methods in Molecular Biology 815: 15–24. doi:10.1007/978-1-61779-424-7_2. ISBN 978-1-61779-423-0. PMID 22130980.

    [23] Voet D, Voet JG (2004). Biochemistry Vol 1 3rd ed. Wiley: Hoboken, NJ.

    [24] Sankaranarayanan R, Moras D (2001). "The fidelity of the translation of the genetic code". Acta Biochimica Polonica 48 (2): 323–35. PMID 11732604.

    [25] van Holde and Mathews, pp. 830–49.

    [26] Copland JA, Sheffield-Moore M, Koldzic-Zivanovic N, Gentry S, Lamprou G, Tzortzatou-Stathopoulou F, Zoumpourlis V, Urban RJ, Vlahopoulos SA (2009). "Sex steroid receptors in skeletal differentiation and epithelial neoplasia: is tissue-specific intervention possible?". BioEssays: news and reviews in molecular, cellular and developmental biology 31 (6): 629–41. doi:10.1002/bies.200800138. PMID 19382224.

    [27] Samarin S, Nusrat A (2009). "Regulation of epithelial apical junctional complex by Rho family GTPases". Frontiers in bioscience: a journal and virtual library 14 (14): 1129–42. doi:10.2741/3298. PMID 19273120.

    [28] Bairoch A (2000). "The ENZYME database in 2000". Nucleic Acids Research 28 (1): 304–305. doi:10.1093/nar/28.1.304. PMC 102465. PMID 10592255.


    [29] Radzicka A, Wolfenden R (1995). "A proficient enzyme". Science 267 (5194): 90–93. Bibcode:1995Sci...267...90R. doi:10.1126/science.7809611. PMID 7809611.

    [30] EBI External Services (2010-01-20). "The Catalytic Site Atlas at The European Bioinformatics Institute". Ebi.ac.uk. Retrieved 2011-01-16.

    [31] Branden and Tooze, pp. 251–81.

    [32] van Holde and Mathews, pp. 247–50.

    [33] van Holde and Mathews, pp. 220–29.

    [34] Rüdiger H, Siebert HC, Solís D, Jiménez-Barbero J, Romero A, von der Lieth CW, Diaz-Mariño T, Gabius HJ (2000). "Medicinal chemistry based on the sugar code: fundamentals of lectinology and experimental strategies with lectins as targets". Current Medicinal Chemistry 7 (4): 389–416. doi:10.2174/0929867003375164. PMID 10702616.

    [35] Branden and Tooze, pp. 232–34.

    [36] van Holde and Mathews, pp. 178–81.

    [37] van Holde and Mathews, pp. 258–64; 272.

    [38] Murray et al., pp. 21–24.

    [39] Hey J, Posch A, Cohen A, Liu N, Harbers A (2008). "Fractionation of complex protein mixtures by liquid-phase isoelectric focusing". Methods in Molecular Biology 424: 225–39. doi:10.1007/978-1-60327-064-9_19. ISBN 978-1-58829-722-8. PMID 18369866.

    [40] Terpe K (2003). "Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems". Applied Microbiology and Biotechnology 60 (5): 523–33. doi:10.1007/s00253-002-1158-6. PMID 12536251.

    [41] Stepanenko OV, Verkhusha VV, Kuznetsova IM, Uversky VN, Turoverov KK (2008). "Fluorescent proteins as biomarkers and biosensors: throwing color lights on molecular and cellular processes". Current Protein & Peptide Science 9 (4): 338–69. doi:10.2174/138920308785132668. PMC 2904242. PMID 18691124.

    [42] Yuste R (2005). "Fluorescence microscopy today". Nature Methods 2 (12): 902–904. doi:10.1038/nmeth1205-902. PMID 16299474.

    [43] Margolin W (2000). "Green fluorescent protein as a reporter for macromolecular localization in bacterial cells". Methods (San Diego, Calif.) 20 (1): 62–72. doi:10.1006/meth.1999.0906. PMID 10610805.

    [44] Walker JH, Wilson K (2000). Principles and Techniques of Practical Biochemistry. Cambridge, UK: Cambridge University Press. pp. 287–89. ISBN 0-521-65873-X.

    [45] Mayhew TM, Lucocq JM (2008). "Developments in cell biology for quantitative immunoelectron microscopy based on thin sections: a review". Histochemistry and Cell Biology 130 (2): 299–313. doi:10.1007/s00418-008-0451-6. PMC 2491712. PMID 18553098.

    [46] Hohsaka T, Sisido M (2002). "Incorporation of non-natural amino acids into proteins". Current Opinion in Chemical Biology 6 (6): 809–15. doi:10.1016/S1367-5931(02)00376-9. PMID 12470735.

    [47] Cedrone F, Ménez A, Quéméneur E (2000). "Tailoring new enzyme functions by rational redesign". Current Opinion in Structural Biology 10 (4): 405–10. doi:10.1016/S0959-440X(00)00106-8. PMID 10981626.

    [48] Görg A, Weiss W, Dunn MJ (2004). "Current two-dimensional electrophoresis technology for proteomics". Proteomics 4 (12): 3665–85. doi:10.1002/pmic.200401031. PMID 15543535.

    [49] Conrotto P, Souchelnytskyi S (2008). "Proteomic approaches in biological and medical sciences: principles and applications". Experimental Oncology 30 (3): 171–80. PMID 18806738.

    [50] Joos T, Bachmann J (2009). "Protein microarrays: potentials and limitations". Frontiers in Bioscience 14 (14): 4376–85. doi:10.2741/3534. PMID 19273356.

    [51] Koegl M, Uetz P (2007). "Improving yeast two-hybrid screening systems". Briefings in Functional Genomics & Proteomics 6 (4): 302–12. doi:10.1093/bfgp/elm035. PMID 18218650.

    [52] Plewczyński D, Ginalski K (2009). "The interactome: predicting the protein–protein interactions in cells". Cellular & Molecular Biology Letters 14 (1): 1–22. doi:10.2478/s11658-008-0024-7. PMID 18839074.

    [53] Zhang C, Kim SH (2003). "Overview of structural genomics: from structure to function". Current Opinion in Chemical Biology 7 (1): 28–32. doi:10.1016/S1367-5931(02)00015-7. PMID 12547423.

    [54] Zhang Y (2008). "Progress and challenges in protein structure prediction". Current Opinion in Structural Biology 18 (3): 342–48. doi:10.1016/j.sbi.2008.02.004. PMC 2680823. PMID 18436442.

    [55] Xiang Z (2006). "Advances in homology protein structure modeling". Current Protein and Peptide Science 7 (3): 217–27. doi:10.2174/138920306777452312. PMC 1839925. PMID 16787261.

    [56] Zhang Y, Skolnick J (2005). "The protein structure prediction problem could be solved using the current PDB library". Proceedings of the National Academy of Sciences U.S.A. 102 (4): 1029–34. Bibcode:2005PNAS..102.1029Z. doi:10.1073/pnas.0407152101. PMC 545829. PMID 15653774.

    [57] Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D (2003). "Design of a novel globular protein fold with atomic-level accuracy". Science 302 (5649): 1364–68. Bibcode:2003Sci...302.1364K. doi:10.1126/science.1089427. PMID 14631033.

    [58] Ritchie DW (2008). "Recent progress and future directions in protein–protein docking". Current Protein and Peptide Science 9 (1): 1–15. doi:10.2174/138920308783565741. PMID 18336319.


    [59] Scheraga HA, Khalili M, Liwo A (2007). "Protein-folding dynamics: overview of molecular simulation techniques". Annual Review of Physical Chemistry 58: 57–83. Bibcode:2007ARPC...58...57S. doi:10.1146/annurev.physchem.58.032806.104614. PMID 17034338.

    [60] Zagrovic B, Snow CD, Shirts MR, Pande VS (2002). "Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing". Journal of Molecular Biology 323 (5): 927–37. doi:10.1016/S0022-2836(02)00997-X. PMID 12417204.

    [61] Herges T, Wenzel W (2005). "In silico folding of a three helix protein and characterization of its free-energy landscape in an all-atom force field". Physical Review Letters 94 (1): 018101. Bibcode:2005PhRvL..94a8101H. doi:10.1103/PhysRevLett.94.018101. PMID 15698135.

    [62] Hoffmann M, Wanko M, Strodel P, König PH, Frauenheim T, Schulten K, Thiel W, Tajkhorshid E, Elstner M (2006). "Color tuning in rhodopsins: the mechanism for the spectral shift between bacteriorhodopsin and sensory rhodopsin II". Journal of the American Chemical Society 128 (33): 10808–18. doi:10.1021/ja062082i. PMID 16910676.

    [63] Brosnan J (June 2003). "Interorgan amino acid transport and its regulation". Journal of Nutrition 133 (6 Suppl 1): 2068S–72S. PMID 12771367.

    [64] Thomas Burr Osborne (1909): The Vegetable Proteins, History pp 1 to 6, from archive.org

    [65] Bulletin des Sciences Physiques et Naturelles en Néerlande (1838). pg 104. SUR LA COMPOSITION DE QUELQUES SUBSTANCES ANIMALES

    [66] Hartley, Harold. "Origin of the Word 'Protein'." Nature 168, no. 4267 (August 11, 1951): 244–244. doi:10.1038/168244a0.

    [67] Perrett D (2007). "From 'protein' to the beginnings of clinical proteomics". Proteomics: Clinical Applications 1 (8): 720–38. doi:10.1002/prca.200700525. PMID 21136729.

    [68] New Oxford Dictionary of English

    [69] Reynolds JA, Tanford C (2003). Nature's Robots: A History of Proteins (Oxford Paperbacks). New York, New York: Oxford University Press. p. 15. ISBN 0-19-860694-X.

    [70] Bischoff TLW, Voit, C (1860). Die Gesetze der Ernaehrung des Pflanzenfressers durch neue Untersuchungen festgestellt (in German). Leipzig, Heidelberg.

    [71] Sumner JB (1926). "The isolation and crystallization of the enzyme urease. Preliminary paper" (PDF). Journal of Biological Chemistry 69 (2): 435–41.

    [72] Pauling L, Corey RB, Branson HR (1951). "The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain" (PDF). Proceedings of the National Academy of Sciences U.S.A. 37 (5): 235–40. Bibcode:1951PNAS...37..235P. doi:10.1073/pnas.37.5.235. PMC 1063348. PMID 14834145.

    [73] Kauzmann W (1956). "Structural factors in protein denaturation". Journal of Cellular Physiology. Supplement 47 (Suppl 1): 113–31. doi:10.1002/jcp.1030470410. PMID 13332017.

    [74] Kauzmann W (1959). "Some factors in the interpretation of protein denaturation". Advances in Protein Chemistry 14: 1–63. doi:10.1016/S0065-3233(08)60608-7. ISBN 978-0-12-034214-3. PMID 14404936.

    [75] Kalman SM, Linderstrom-Lang K, Ottesen M, Richards FM (1955). "Degradation of ribonuclease by subtilisin". Biochimica et Biophysica Acta 16 (2): 297–99. doi:10.1016/0006-3002(55)90224-9. PMID 14363272.

    [76] Sanger F (1949). "The terminal peptides of insulin". Biochemical Journal 45 (5): 563–74. PMC 1275055. PMID 15396627.

    [77] Muirhead H, Perutz M (1963). "Structure of hemoglobin. A three-dimensional fourier synthesis of reduced human hemoglobin at 5.5 Å resolution". Nature 199 (4894): 633–38. Bibcode:1963Natur.199..633M. doi:10.1038/199633a0. PMID 14074546.

    [78] Kendrew J, Bodo G, Dintzis H, Parrish R, Wyckoff H, Phillips D (1958). "A three-dimensional model of the myoglobin molecule obtained by X-ray analysis". Nature 181 (4610): 662–66. Bibcode:1958Natur.181..662K. doi:10.1038/181662a0. PMID 13517261.

    [79] "RCSB Protein Data Bank". Retrieved 2014-04-05.

    [80] Zhou ZH (2008). "Towards atomic resolution structural determination by single-particle cryo-electron microscopy". Current Opinion in Structural Biology 18 (2): 218–28. doi:10.1016/j.sbi.2008.03.004. PMC 2714865. PMID 18403197.

    [81] Keskin O, Tuncbag N, Gursoy A (2008). "Characterization and prediction of protein interfaces to infer protein-protein interaction networks". Current Pharmaceutical Biotechnology 9 (2): 67–76. doi:10.2174/138920108783955191. PMID 18393863.

    10 Textbooks

    • Branden C, Tooze J (1999). Introduction to Protein Structure. New York: Garland Pub. ISBN 0-8153-2305-0.

    • Murray RF, Harper HW, Granner DK, Mayes PA, Rodwell VW (2006). Harper's Illustrated Biochemistry. New York: Lange Medical Books/McGraw-Hill. ISBN 0-07-146197-3.

    • Van Holde KE, Mathews CK (1996). Biochemistry. Menlo Park, California: Benjamin/Cummings Pub. Co., Inc. ISBN 0-8053-3931-0.


    11 External links

    11.1 Databases and projects

    • The Protein Naming Utility
    • Human Protein Atlas
    • NCBI Entrez Protein database
    • NCBI Protein Structure database
    • Human Protein Reference Database
    • Human Proteinpedia
    • Folding@Home (Stanford University)
    • Comparative Toxicogenomics Database: curates protein–chemical interactions, as well as gene/protein–disease relationships and chemical-disease relationships.
    • Bioinformatic Harvester: a meta search engine (29 databases) for gene and protein information.
    • Protein Databank in Europe (see also PDBeQuips, short articles and tutorials on interesting PDB structures)
    • Research Collaboratory for Structural Bioinformatics (see also Molecule of the Month, presenting short accounts on selected proteins from the PDB)
    • Proteopedia, Life in 3D: rotatable, zoomable 3D model with wiki annotations for every known protein molecular structure.
    • UniProt, the Universal Protein Resource
    • neXtProt, Exploring the universe of human proteins: human-centric protein knowledge resource
    • Multi-Omics Profiling Expression Database (MOPED): human and model organism protein/gene knowledge and expression data

    11.2 Tutorials and educational websites

    • "An Introduction to Proteins" from HOPES (Huntington's Disease Outreach Project for Education at Stanford)
    • Proteins: Biogenesis to Degradation, The Virtual Library of Biochemistry and Cell Biology


    12 Text and image sources, contributors, and licenses

    12.1 Text

    Protein Source: http://en.wikipedia.org/wiki/Protein?oldid=646143808 Contributors: AxelBoldt, Magnus Manske, Marj Tiefert, Mav,Bryan Derksen, Tarquin, Taw, Malcolm Farmer, Andre Engels, Toby Bartels, PierreAbbat, Karen Johnson, Ben-Zin, Robert Foley, Adam-Retchless, Netesq, Mbecker, DennisDaniels, Spi, Dwmyers, Lir, Michael Hardy, Erik Zachte, Lexor, Ixfd64, Fruge, Cyde, 168...,Ams80, Ahoerstemeier, Mac, Snoyes, Msablic, JWSchmidt, Darkwind, Glenn, Whkoh, Aragorn2, Mxn, Zarius, Lfh, Ike9898, StAkArKarnak, Tpbradbury, Taxman, Taoster, Samsara, Shizhao, Jecar, Bloodshedder, Carl Caputo, Jusjih, Donarreiskoer, Robbot, JoshCherry, Altenmann, Peak, Romanm, Arkuat, Stewartadcock, Merovingian, Sunray, Hadal, Isopropyl, Anthony, Holeung, Lupo, Dina,Mattwolf7, Centrx, Giftlite, Christopher Parham, Andries, Mikez, Wolfkeeper, Netoholic, Doctorcherokee, Peruvianllama, Everyking,Subsolar, Curps, Alison, Dmb000006, Bensaccount, Eequor, Alvestrand, Jackol, Delta G, Neilc, OldakQuill, Adenosine, ChicXulub, Doc-Sigma, Jonathan Grynspan, Knutux, Antandrus, Onco p53, G3pro, PDH, Karol Langner, H Padleckas, Bumm13, Tsemii, JohnArmagh,Jh51681, I b pip, Jake11, Ivo, Adashiel, Corti, Sysy, Indosauros, A-giau, Johan Elisson, Diagonalsh, Discospinster, Rich Farmbrough,Guanabot, Cacycle, Wk muriithi, Bishonen, Bibble, Bender235, ESkog, Cyclopia, Kbh3rd, Richard Taylor, Eric Forste, MBisanz, El C,Rgdboer, Gilgamesh he, Susvolans, RoyBoy, Perfecto, RTucker, Bobo192, Jasonzhuocn, Cmdrjameson, R. S. 
Shaw, Polluks, Password,Arcadian, Joe Jarvis, Jerryseinfeld, Jojit fb, NickSchweitzer, Banks, BW52, Haham hanuka, Hangjian, Benbread, Espoo, Siim, Alan-sohn, JYolkowski, Chino, Tpikonen, Interiot, Andrewpmk, SlimVirgin, Kocio, Mailer diablo, ClockworkSoul, Super-Magician, Knowl-edge Seeker, Cburnett, LFaraone, Gene Nygaard, Redvers, Bookandcoee, BadSeed, Kznf, RyanGerbil10, Ron Ritzman, Megan1967,Roland2, Angr, Richard Arthur Norton (1958- ), Pekinensis, Woohookitty, TigerShark, Jpers36, Benbest, JeremyA, Miss Madeline, SirLewk, Fenteany, Steinbach, M412k, Wayward, Essjay, Turnstep, Dysepsion, Matturn, Magister Mathematicae, V8rik, BorisTM, RxS,Sjakkalle, Rjwilmsi, Tizio, Tawker, Yamamoto Ichiro, FuelWagon, Titoxd, FlaBot, Ageo020, Yanggers, RexNL, Gurch, Alexjohnc3, Fresh-eneesz, Alphachimp, Chobot, Moocha, Bornhj, Korg, Bubbachuck, The Rambling Man, YurikBot, Wavelength, Sceptre, Reo On, Jtkiefer,Zaroblue05, Chuck Carroll, Splette, SpuriousQ, Rada, Derezo, CanadianCaesar, Stephenb, Gaius Cornelius, CambridgeBayWeather,Yyy, Ihope127, Cryptic, Cpuwhiz11, Wimt, The Hokkaido Crow, Annabel, Sentausa, Shanel, NawlinWiki, Wiki alf, UCaetano, BigCow,Exir Kamalabadi, Dureo, Mccready, Irishguy, Shinmawa, Banes, Matticus78, Rmky87, Raven4x4x, Khooly59, Neil.steiner, Misza13,Tony1, Bucketsofg, Dbrs, Aaron Schulz, BOT-Superzerocool, DeadEyeArrow, Bota47, Kkmurray, Brisvegas, Mr.Bip, Wknight94, Trig-ger hippie77, Astrojan, Tetracube, Holderca1, Phgao, Zzuuzz, Closedmouth, Dspradau, Pookythegreat, GraemeL, JoanneB, CWenger,Anclation, Staxringold, Banus, AssistantX, GrinBot, 8472, DVD R W, CIreland, That Guy, From That Show!, Eog1916, Wikimerce-nary, AndrewWTaylor, Sardanaphalus, Twilight Realm, Crystallina, Scolaire, SmackBot, FocalPoint, Zenchu, Paranthaman, Slashme,TestPilot, Pgk, Bomac, Davewild, Delldot, AnOddName, RobotJcb, Edgar181, Zephyris, Xaosux, Gilliam, Ppntori, Richfe, Malatesta,NickGarvey, ERcheck, JSpudeman, Tyciol, Bluebot, Dbarker348, Persian Poet Gal, 
Ben.c.roberts, Stubblyhead, Elagatis, MalafayaBot,Moshe Constantine Hassan Al-Silverburg, Deli nk, Ramas Arrow, Zsinj, Can't sleep, clown will eat me, Keith Lehwald, DRahier, Fjool,Onorem, Snowmanradio, Yidisheryid, EvelinaB, Rrburke, Mr.Z-man, SundarBot, NewtN, Nibuod, Nakon, Drdozer, MEJ119, Smokefoot,Drphilharmonic, Wisco, Wybot, DMacks, PandaDB, Jls043, Mikewall, Kukini, Clicketyclack, Derekwriter, EMan32x, Rockvee, Akubra,J. Finkelstein, Euchiasmus, Timdownie, Soumyasch, AstroChemist, Hemmingsen, Accurizer, Kyawtun, Mr. Lefty, IronGargoyle, TheMan in Question, MarkSutton, Bhulsepga, Special-T, Munita Prasad, Beetstra, Muadd, Rickert, Bendzh, Aarktica, Johnchiu, Shella, SijoRipa, Citicat, Jose77, Sasata, ShakingSpirit, BranStark, Iridescent, Electried mocha chinchilla, Lakers, JoeBot, Wjejskenewr, Tawker-bot2, Dlohcierekim, Bioinformin, MightyWarrior, Jman5, Fvasconcellos, SkyWalker, JForget, GeneralIroh, Porterjoh, Ale jrb, DreadSpecter, Insanephantom, Satyrium, Dycedarg, Scohoust, Makeemlighter, Picaroon, KyraVixen, DSachan, CWY2190, Nadyes, THF, Out-riggr, ONUnicorn, Icek, WillowW, MC10, Michaelas10, Gogo Dodo, Eric Martz, Ttiotsw, Chasingsol, Studerby, Tawkerbot4, Carstensen,Christian75, Narayanese, Btharper1221, Omicronpersei8, Nugneant, Gimmetrow, Thijs!bot, Epbr123, Barticus88, StuartF, Opabinia re-galis, Sid 3050, Mojo Hand, Headbomb, Marek69, John254, Folantin, James086, Tellyaddict, Miller17CU94, Dgies, Tim2027uk, Escar-bot, Ileresolu, AntiVandalBot, Galilee12, Luna Santin, Settersr, BenJWoodcroft, AaronY, Jj137, TimVickers, Priscus, Sprite89, MECU,JAnDbot, Deective, Husond, MER-C, Janejellyroll, Andonic, OllyG, Lawilkin, Acroterion, Magioladitis, Henning Blatt, Karlhahn, Bong-warrior, VoABot II, AuburnPilot, Hasek is the best, Think outside the box, Roadsoap, Srice13, Stelligent, Eldumpo, Mkdw, Emw, UserA1, Lafw, Glen, Rajpaj, DerHexer, JaGa, Megalodon99, Khalid Mahmood, WLU, Wayne Miller, Squidonius, Seba5618, Hdt83, Martin-Bot, Kenshealth, Meduban, 
Dan Gagnon, R'n'B, Player 03, LedgendGamer, Paranomia, J.delanoy, Pharaoh of the Wizards, CFCF, HansDunkelberg, Boghog, Uncle Dick, Public Menace, Pipe34, WarthogDemon, Hodja Nasreddin, G. Campbell, Lantonov, Tylerhammond2, LordAnubisBOT, Ignatzmice, Dthzip, Coppertwig, Pyrospirit, Belovedfreak, GBoran, SJP, AA, Touch Of Light, Shoessss, Juliancolton, Bogdan, Burzmali, Jamesontai, Natl1, WinterSpw, Pdcook, Kalyandchakravarthy, Mlsquirrel, CardinalDan, Idioma-bot, Montchav, Carljr., King Lopez, VolkovBot, IWhisky, Mstislavl, ABF, Ashdog137, Leebo, AlnoktaBOT, VasilievVV, Tiberti, Philip Trueman, TXiKiBoT, Oshwah, Tameeria, JesseOjala, Sam1001, A4bot, Guillaume2303, Nitin77, Xavierschmit, Qxz, Clarince63, Melsaran, Martin451, Leafyplant, LeaveSleaves, Mkv22, Seb az86556, Fitnesseducation, Mkubica, Sirsanjuro, Hannes Rst, Naturedude858, Spiral5800, Hedgehog33, Pierpunk, Amb sib, Synthebot, Wilbur2012, Falcon8765, Duckttape17, Enviroboy, Vector Potential, Kingjalis3, Brigand dog, DocJames, AlleborgoBot, Kehrbykid, Gangsta4lif, Frank 212121, Zoeiscow, EmxBot, ThinkerThoughts, Masterofsuspense3, Christoph.gille, SieBot, Graham Beards, Moonriddengirl, Work permit, Sharpvisuals, Winchelsea, Caltas, Iwearsox21, Jipan, Triwbe, Manojdhawade, Yintan, Agesworth, Mckes, Eganio, Keilana, Shura58, Radon210, Editore99, Oda Mari, Grimey109, Danizdeman, Paolo.dL, Thishumorcake, Jlaudiow713, Oxymoron83, Nuttycoconut, Chenmengen, Poindexter Propellerhead, Lordfeepness, Fratrep, Maelgwnbot, N96, StaticGull, Mike2vil, Wuhwuzdat, Mygerardromance, Hamiltondaniel, Ascidian, Dabomb87, Superbeecat, Agilemolecule, Ayleuss, Twinned-Chimera, AutoFire, Asher196, Lascorz, Tattery, Naturespace, Amotz, WikipedianMarlith, Seaniemaster, De728631, ClueBot, The Thing That Should Not Be, Rodhullandemu, Plankwalking, Arakunem, Mild Bill Hiccup, Coookeee crisp, CounterVandalismBot, Niceguyedc, Spritegenie, Peteruetz, ChandlerMapBot, NClement, Wardface, DragonBot, Douglasmtaylor, Alexbot, Jusdafax, Jordell
000, Mike713, OpinionPerson, CupOfRoses, NuclearWarfare, Cenarium, Lunchscale, Jotterbot, Achilles.g, Aitias, Versus22, NERIC-Security, Thinking Stone, ClanCC, Tprentice, XLinkBot, Ivan Akira, Rror, Capitana, Mifter, Samwebstah, Michael.J.Goydich, Noctibus, Jbeans, ZooFari, Frictionary, Sgpsaros, Doctor Knoooow, Champ0815, Lemchesvej, HexaChord, Addbot, Some jerk on the Internet, DOI bot, Landon1980, SunDragon34, Ronhjones, TutterMouse, Fieldday-sunday, Cuaxdon, Nick Van Ni, Cst17, Mentisock, CarsracBot, EconoPhysicist, Mai-tai-guy, Glane23, AndersBot, Omnipedian, Debresser, LinkFA-Bot, RoadieRich, Tassedethe, Numbo3-bot, OsBlink, Tide rolls, James42.5, Gail, Ettrig, Swarm, Javanbakht, A K AnkushKumar, Luckas-bot, Yobot, II MusLiM HyBRiD II, Minorxer, Eric-Wester, Tempodivalse, AnomieBOT, Itln boy, Xelixed, Galoubet, Piano non troppo, AdjustShift, VK3, Kingpin13, RandomAct, Sonic h, Materialscientist, Yurilinda, Citation bot, Knightdude99, Neurolysis, Aspenandgwen, LilHelpa, Xqbot, Timir2, Capricorn42, Bihco, Tchussle, Tad Lincoln, Flavonoid, P99am, Tennispro427, Almabot, 200itlove, GrouchoBot, Laxman12, Abce2, Aznnerd123, Omnipaedista, HernanQB, RibotBOT, Pravinhiwale, Altruistic Egotist, GhalyBot, Miyagawa, Captain-n00dle, FrescoBot, Tobby72, This doesnt help at all, Citation bot 1,

    Pinethicket, Bernarddb, HRoestBot, Katherine Folsom, Jstraining, Curehd, My very best wishes, Jauhienij, ActivExpression, Cnwilliams, Tim1357, Missellyah, FoxBot, TobeBot, Sweet xx, Lotje, Ndkartik, January, Max Janu, Duckyinc2, Specs112, Brian the Editor, Suffusion of Yellow, Tbhotch, Jesse V., Minimac, DARTH SIDIOUS 2, Usmanmyname, Ripchip Bot, Agent Smith (The Matrix), DASHBot, EmausBot, Cpeditorial, WikitanvirBot, Immunize, Snow storm in Eastern Asia, 021-adilk, JonReader, Clarker1, ScottyBerg, Joeywallace9, GoingBatty, Minimacs Clone, RenamedUser01302013, Hous21, Wikipelli, Commanderigg, SRMN2, Meruem, JSquish, HiW-Bot, Traxs7, Heshamscience, Futureworldconqueror, Alpha Quadrant, John Mackenzie Burke, AvicAWB, Elektrik Shoos, Bamyers99, H3llBot, Zap Rowsdower, RODSHEL, Imnoteditinganything, Rajatgaur, Richardmnewton, Codahawk, Wstraub, Hylian Auree, Petrb, Teaktl17, ClueBot NG, Eastewart2010, Pisaadvocate, Helpful Pixie Bot, Bibcode Bot, BG19bot, Gautehuus, Edward Gordon Gey, Gupta.udatha, Protein Chemist, NotWith, Smettems, Prof. Squirrel, Smileguy91, Fma69, Dexbot, Wimblecf, Joeinwiki, David P Minde, Evolution and evolvability, Prokaryotes, Kirtimaansyal, Anrnusna, Monkbot and Anonymous: 1184

    12.2 Images

    File:225_Peptide_Bond-01.jpg Source: http://upload.wikimedia.org/wikipedia/commons/2/26/225_Peptide_Bond-01.jpg License: CC BY 3.0 Contributors: Anatomy & Physiology, Connexions Web site. http://cnx.org/content/col11496/1.6/, Jun 19, 2013. Original artist: OpenStax College

    File:3d_tRNA.png Source: http://upload.wikimedia.org/wikipedia/commons/f/f1/3d_tRNA.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Vossman

    File:Chaperonin_1AON.png Source: http://upload.wikimedia.org/wikipedia/commons/6/6c/Chaperonin_1AON.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Thomas Splettstoesser

    File:Commons-logo.svg Source: http://upload.wikimedia.org/wikipedia/en/4/4a/Commons-logo.svg License: ? Contributors: ? Original artist: ?

    File:Genetic_code.svg Source: http://upload.wikimedia.org/wikipedia/commons/3/37/Genetic_code.svg License: CC-BY-SA-3.0 Contributors: Own work Original artist: Madprime

    File:Hexokinase_ball_and_stick_model,_with_substrates_to_scale_copy.png Source: http://upload.wikimedia.org/wikipedia/commons/e/e7/Hexokinase_ball_and_stick_model%2C_with_substrates_to_scale_copy.png License: Public domain Contributors: Transferred from en.wikipedia; transferred to Commons by User:Leptictidium using CommonsHelper. Original artist: Original uploader was TimVickers at en.wikipedia

    File:Issoria_lathonia.jpg Source: http://upload.wikimedia.org/wikipedia/commons/2/2d/Issoria_lathonia.jpg License: CC-BY-SA-3.0 Contributors: ? Original artist: ?

    File:KendrewMyoglobin.jpg Source: http://upload.wikimedia.org/wikipedia/commons/e/ec/KendrewMyoglobin.jpg License: CC BY 2.5 Contributors: MRC Laboratory of Molecular Biology (Transferred from en.wikipedia by User:Mormegil). Original artist: MRC Laboratory of Molecular Biology, requests for higher resolution images should be sent to [email protected] (original uploader to the English Wikipedia was Dumarest)

    File:Localisations02eng.jpg Source: http://upload.wikimedia.org/wikipedia/commons/6/6e/Localisations02eng.jpg License: CC BY-SA 2.5 Contributors: http://gfp-cdna.embl.de/ Original artist: Jeremy Simpson and Rainer Pepperkok

    File:Mouse_cholera_antibody.png Source: http://upload.wikimedia.org/wikipedia/commons/a/a2/Mouse_cholera_antibody.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Thomas Splettstoesser (www.scistyle.com)

    File:Myoglobin.png Source: http://upload.wikimedia.org/wikipedia/commons/6/60/Myoglobin.png License: Public domain Contributors: self made based on PDB entry Original artist: AT

    File:Peptide-Figure-Revised.png Source: http://upload.wikimedia.org/wikipedia/commons/c/c9/Peptide-Figure-Revised.png License: CC BY-SA 3.0 Contributors: Own work Original artist: Chemistry-grad-student

    File:Peptide_group_resonance.png Source: http://upload.wikimedia.org/wikipedia/commons/1/17/Peptide_group_resonance.png License: CC-BY-SA-3.0 Contributors: Transferred from en.wikipedia Original artist: Original uploader was WillowW at en.wikipedia

    File:Protein_composite.png Source: http://upload.wikimedia.org/wikipedia/commons/5/54/Protein_composite.png License: CC BY-SA 3.0 Contributors: Own work (rendered with Cinema 4D) Original artist: Thomas Splettstoesser (www.scistyle.com)

    File:Proteinviews-1tim.png Source: http://upload.wikimedia.org/wikipedia/commons/6/6e/Proteinviews-1tim.png License: CC-BY-SA-3.0 Contributors: Self created from PDB entry 1TIM using the freely available visualization and analysis package VMD Original artist: Opabinia regalis

    File:Ribosome_mRNA_translation_en.svg Source: http://upload.wikimedia.org/wikipedia/commons/b/b1/Ribosome_mRNA_translation_en.svg License: Public domain Contributors: Own work based on: [1], [2], [3], [4], [5], [6], [7], and [8]. Original artist: LadyofHats

    File:TPI1_structure.png Source: http://upload.wikimedia.org/wikipedia/commons/1/1c/TPI1_structure.png License: Public domain Contributors: based on 1wyi (http://www.pdb.org/pdb/explore/explore.do?structureId=1WYI), made in pymol Original artist: AT

    File:Wiktionary-logo-en.svg Source: http://upload.wikimedia.org/wikipedia/commons/f/f8/Wiktionary-logo-en.svg License: Public domain Contributors: Vector version of Image:Wiktionary-logo-en.png. Original artist: Vectorized by Fvasconcellos (talk contribs), based on original logo tossed together by Brion Vibber

    12.3 Content license

    Creative Commons Attribution-Share Alike 3.0
