Centre for Integrative Bioinformatics VU (IBIVU)
Bioinformatics master course: DNA/Protein structure-function analysis and prediction
Lecture 13: Protein Function
Faculty of Sciences / Faculty of Earth & Life Sciences
Sequence-Structure-Function
[Diagram: Sequence, Structure, Function; method labels: Threading, Ab initio, BLAST]
Folding: impossible but for the smallest structures
Function prediction from structure – very difficult
Experimental
• Structural genomics
• Functional genomics
• Protein-protein interaction
• Metabolic pathways
• Expression data
Protein function categories
• Catalysis (enzymes)
• Binding – transport (active/passive)
– Protein-DNA/RNA binding (e.g. histones, transcription factors)
– Protein-protein interactions (e.g. antibody-lysozyme), determined experimentally by yeast two-hybrid (Y2H) or bacterial two-hybrid (B2H) screening
– Protein-fatty acid binding (e.g. apolipoproteins)
– Protein – small molecules (drug interaction, structure decoding)
• Structural component (e.g. crystallin)
• Regulation
• Signalling
• Transcription regulation
• Immune system
• Motor proteins (actin/myosin)
Catalytic properties of enzymes
• E + S ⇌ ES → E + P (first step labelled Km, second step labelled kcat)
– E = enzyme
– S = substrate
– ES = enzyme-substrate complex (transition state)
– P = product
– Km = Michaelis constant
– kcat = catalytic rate constant (turnover number)
– kcat/Km = specificity constant (useful for comparison)
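[Plot: reaction rate V (moles/s) versus substrate concentration [S]; V approaches Vmax at high [S], and V = Vmax/2 at [S] = Km]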
Michaelis-Menten equation: V = (Vmax × [S]) / (Km + [S])
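The saturation behaviour described by the Michaelis-Menten equation can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the function names and the example values (Vmax = 10, Km = 2, in arbitrary but consistent units) are illustrative assumptions.

```python
def michaelis_menten_rate(s, vmax, km):
    """Reaction rate V = Vmax * [S] / (Km + [S])."""
    if s < 0 or km <= 0:
        raise ValueError("[S] must be >= 0 and Km must be > 0")
    return vmax * s / (km + s)


def specificity_constant(kcat, km):
    """kcat/Km, the ratio used to compare enzymes or substrates."""
    if km <= 0:
        raise ValueError("Km must be > 0")
    return kcat / km


if __name__ == "__main__":
    vmax, km = 10.0, 2.0  # illustrative values, not measured data
    # Sample the curve at zero, at Km, and well above Km.
    for s in (0.0, km, 10 * km, 1000 * km):
        rate = michaelis_menten_rate(s, vmax, km)
        print(f"[S] = {s:8.1f}  V = {rate:.3f}")
```

At [S] = Km the function returns exactly Vmax/2, and at very high [S] the rate approaches Vmax without exceeding it.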
Protein interaction domains
http://pawsonlab.mshri.on.ca/html/domains.html
Energy difference upon binding
Examples of protein interactions (and functional importance) include:
• Protein – protein (pathway analysis)
• Protein – small molecules (drug interaction, structure decoding)
• Protein – peptides, DNA/RNA (function analysis)
The change in Gibbs free energy upon protein-ligand binding can be monitored and is expressed as:
ΔG = ΔH – TΔS (G = Gibbs free energy, H = enthalpy, S = entropy, T = temperature)
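As a worked illustration of the free-energy relation, the sketch below computes ΔG from ΔH, T and ΔS. It also converts ΔG into a dissociation constant using the standard thermodynamic relation ΔG = RT ln(Kd), which is an addition not stated in the lecture. The function names and the numerical values are hypothetical, chosen only to show the arithmetic.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def binding_free_energy(delta_h, temperature, delta_s):
    """Delta G = Delta H - T * Delta S (J/mol, with T in kelvin)."""
    return delta_h - temperature * delta_s


def dissociation_constant(delta_g, temperature):
    """Kd relative to a 1 M standard state, from Delta G = R*T*ln(Kd).

    This relation is standard thermodynamics, not taken from the lecture.
    """
    return math.exp(delta_g / (R * temperature))


if __name__ == "__main__":
    # Hypothetical binding parameters, for illustration only.
    delta_h = -50_000.0  # J/mol (favourable enthalpy)
    delta_s = -50.0      # J/(mol*K) (unfavourable entropy)
    temperature = 298.15  # K

    delta_g = binding_free_energy(delta_h, temperature, delta_s)
    print(f"Delta G = {delta_g:.1f} J/mol")
    print(f"Kd      = {dissociation_constant(delta_g, temperature):.2e} M")
```

With these inputs the enthalpy term outweighs the entropy penalty, so ΔG is negative and binding is favourable; a more negative ΔG corresponds to a smaller Kd, that is, tighter binding.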
Protein function
• Many proteins combine functions
• Some immunoglobulin structures are thought to have more than 100 different functions (and active/binding sites)
• Alternative splicing can generate (partially) alternative structures
Protein function
Active site / binding cleft
Protein-protein interaction
Shape complementarity
Protein function evolution
Chymotrypsin
How to infer function
• Experiment
• Deduction from sequence
– Multiple sequence alignment – conservation patterns
– Homology searching
• Deduction from structure
– Threading
– Structure-structure comparison
– Homology modelling
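To make "conservation patterns" in a multiple sequence alignment concrete, the sketch below scores each alignment column by normalised Shannon entropy. This is one common choice among many and is not necessarily the measure used in the course; the function name, the gap handling (gaps are ignored) and the three toy sequences are illustrative assumptions.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Return one score per column in [0, 1]; 1 means fully conserved.

    Score = 1 - H / log2(20), where H is the Shannon entropy of the
    residues in the column. Gap characters ('-') are ignored.
    """
    if not alignment or len({len(seq) for seq in alignment}) != 1:
        raise ValueError("alignment must be non-empty with equal-length rows")
    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no signal
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


if __name__ == "__main__":
    # Toy alignment, invented for illustration.
    toy_alignment = ["GDSGGP", "GDSGAP", "GDSGSP"]
    for position, score in enumerate(column_conservation(toy_alignment), 1):
        print(f"column {position}: {score:.2f}")
```

Columns where every sequence carries the same residue score 1, while the variable fifth column scores lower. Highly conserved columns are candidates for functionally important positions such as active-site residues.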
Mevalonate plays a role in epithelial cancers: it can inhibit EGFR
Cholesterol biosynthesis primarily occurs in eukaryotic cells. Cholesterol is necessary for membrane synthesis and is a precursor of steroid hormones and vitamin D. While the pathway had previously been assumed to be localized in the cytosol and ER, more recent evidence suggests that many of the enzymes in the pathway exist largely, if not exclusively, in the peroxisome (the enzymes listed in blue in the pathway to the left are thought to be at least partly peroxisomal). Patients with peroxisome biogenesis disorders (PBDs) have a variable deficiency in cholesterol biosynthesis.
Epidermal Growth Factor as a Clinical Target in Cancer
Introduction:
A malignant tumour is the product of uncontrolled cell proliferation. Cell growth is controlled by a delicate balance between growth-promoting and growth-inhibiting factors. In normal tissue the production and activity of these factors result in differentiated cells growing in a controlled and regulated manner that maintains the normal integrity and functioning of the organ. The malignant cell has evaded this control; the natural balance is disturbed (via a variety of mechanisms) and unregulated, aberrant cell growth occurs. A key driver for growth is the epidermal growth factor (EGF). Its receptor (the EGFR) has been implicated in the development and progression of a number of human solid tumours, including those of the lung, breast, prostate, colon, ovary, head and neck.
Energy housekeeping: Adenosine diphosphate (ADP) – Adenosine triphosphate (ATP)
Metabolic networks
Glycolysis and Gluconeogenesis
KEGG database (Japan)
Gene Ontology (GO)
• Not a genome sequence database
• Developing three structured, controlled vocabularies (ontologies) to describe gene products in a species-independent manner, in terms of:
– biological process
– cellular component
– molecular function
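The idea of describing one gene product along three independent vocabularies can be sketched as a small data structure. The code below is only an illustration under stated assumptions: the class and function names are hypothetical, the gene product is a placeholder, and the term labels were chosen for the example rather than looked up in the GO database.

```python
from dataclasses import dataclass

# The three GO vocabularies named in the text.
NAMESPACES = ("biological_process", "cellular_component", "molecular_function")


@dataclass(frozen=True)
class Annotation:
    """One (gene product, vocabulary, term) assignment."""
    gene_product: str
    namespace: str
    term: str


def group_by_namespace(annotations):
    """Collect term labels under each of the three vocabularies."""
    grouped = {ns: [] for ns in NAMESPACES}
    for ann in annotations:
        if ann.namespace not in grouped:
            raise ValueError(f"unknown namespace: {ann.namespace}")
        grouped[ann.namespace].append(ann.term)
    return grouped


if __name__ == "__main__":
    # Placeholder gene product with illustrative term labels.
    annotations = [
        Annotation("exampleProtein", "biological_process", "glycolysis"),
        Annotation("exampleProtein", "cellular_component", "cytosol"),
        Annotation("exampleProtein", "molecular_function", "catalytic activity"),
    ]
    for namespace, terms in group_by_namespace(annotations).items():
        print(f"{namespace}: {', '.join(terms) or '(none)'}")
```

Because the vocabularies are species-independent, the same three-way description can be attached to gene products from any organism, which is what allows annotations from different model-organism databases to be compared.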
The GO ontology
Gene Ontology Members
• FlyBase - database for the fruitfly Drosophila melanogaster
• Berkeley Drosophila Genome Project (BDGP) - Drosophila informatics; GO database & software, Sequence Ontology development
• Saccharomyces Genome Database (SGD) - database for the budding yeast Saccharomyces cerevisiae
• Mouse Genome Database (MGD) & Gene Expression Database (GXD) - databases for the mouse Mus musculus
• The Arabidopsis Information Resource (TAIR) - database for the brassica family plant Arabidopsis thaliana
• WormBase - database for the nematode Caenorhabditis elegans
• EBI GOA project: annotation of UniProt (Swiss-Prot/TrEMBL/PIR) and InterPro databases
• Rat Genome Database (RGD) - database for the rat Rattus norvegicus
• DictyBase - informatics resource for the slime mold Dictyostelium discoideum
• GeneDB S. pombe - database for the fission yeast Schizosaccharomyces pombe (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)
• GeneDB for protozoa - databases for Plasmodium falciparum, Leishmania major, Trypanosoma brucei, and several other protozoan parasites (part of the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute)
• Genome Knowledge Base (GK) - a collaboration between Cold Spring Harbor Laboratory and EBI
• TIGR - The Institute for Genomic Research
• Gramene - A Comparative Mapping Resource for Monocots
• Compugen (with its Internet Research Engine)
• The Zebrafish Information Network (ZFIN) - reference datasets and information on Danio rerio