Bioinformatics Master Course: Sequence Alignment
Lecture 9a: Pattern matching, part I
Sequence Patterns vs. Protein Structure
I. Protein-Protein interaction
1. enzyme (protein) - substrate: serine protease trypsin
2. receptor (protein) - ligand: growth hormone receptor
3. antibody (protein) - antigen: immunoglobulin (Ig)
II. Protein-Ion and small molecule interaction
1. protein - ion (Ca2+, Mg2+, Na+, K+, Cl–, HCO3–, SO42–): calmodulin
2. pump - ion, coupled to enzymatic function: ATPase
3. channel - water: aquaporin
III. Protein-DNA/RNA interaction
1. enzyme - DNA: Eco-RI, ribozyme
2. binder - DNA groove: leucine zipper, zinc finger
3. regulator - RNA: KH domain
Reactions and Interactions
• What is the difference between a reaction and an interaction?
– change in chemical bonding
• Which one of these is a chemical bond?
1. H3C-CH2-O-H
2. Na+ Cl–
3. H-O-H···OH2
4. H-O-CH2-CH3···H3C-CH2-O-H
Bond Strength
• Bond strength and lifetime are a function of temperature
– vibration (bond stretching), thermal background
• Non-covalent interactions depend very much on the medium
– compare salt crystal with salt solution
• Interaction strength has a strong distance dependence
– ion-ion ~ r^-2, dipole-dipole ~ r^-4, quadrupole-quadrupole ~ r^-6
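To get a feel for these power laws, the sketch below compares how much each interaction weakens when the separation doubles. It assumes a pure power law with the exponents listed on the slide and ignores prefactors and medium effects; the function and dictionary names are illustrative only.

```python
# Relative weakening of an interaction that scales as r**(-n)
# when the separation grows by a given factor (prefactors ignored).
EXPONENTS = {
    "ion-ion": 2,
    "dipole-dipole": 4,
    "quadrupole-quadrupole": 6,
}

def relative_strength(exponent: int, distance_factor: float) -> float:
    """Strength at distance_factor * r divided by strength at r."""
    return distance_factor ** (-exponent)

if __name__ == "__main__":
    for name, n in EXPONENTS.items():
        factor = relative_strength(n, 2.0)
        print(f"{name}: factor {factor:.4f} at twice the distance")
```

Under these assumptions, doubling the distance leaves 1/4, 1/16 and 1/64 of the original strength, respectively.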
Binding: Complementary Interfaces
Binding requires complementary interfaces:
Interfaces have characteristic, conserved residue patterns or motifs
Sequence Patterns and Profiles
• Comparison between sequence pattern matching and similarity scoring

| PATTERN | SCORE |
|---|---|
| exact word | identity |
| regular expression | weight matrix |
| Hidden Markov Model | profile |
| generalized profile | general Hidden Markov Model |
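A minimal sketch of the contrast between the two columns: a pattern either matches or does not, while a score grades how well a sequence fits. The toy sequences and the weight-matrix numbers are made up for illustration, and all function names are hypothetical.

```python
import re

def exact_word(seq: str, word: str) -> bool:
    """Pattern side, simplest case: the word occurs verbatim."""
    return word in seq

def regex_match(seq: str, pattern: str) -> bool:
    """Pattern side, more general: a regular expression matches or it does not."""
    return re.search(pattern, seq) is not None

def identity_score(a: str, b: str) -> int:
    """Score side, simplest case: number of identical positions in two words."""
    return sum(x == y for x, y in zip(a, b))

def weight_matrix_score(window: str, matrix: list, default: float = -1.0) -> float:
    """Score side, more general: sum of position-specific residue weights."""
    return sum(column.get(res, default) for column, res in zip(matrix, window))

# Hypothetical 3-column weight matrix (made-up numbers, one dict per position).
TOY_MATRIX = [{"N": 2.0, "S": 0.5}, {"P": 2.0}, {"A": 2.0, "G": 1.0}]

if __name__ == "__main__":
    print(exact_word("MKNPAL", "NPA"), regex_match("MKNPGL", "NP[AG]"))
    print(identity_score("NPA", "NPG"), weight_matrix_score("SPG", TOY_MATRIX))
```

The pattern functions return a yes/no answer, whereas the scoring functions return a number that can be ranked or thresholded.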
Resources
• PROSITE: biologically significant sites, patterns and profiles
– www.ebi.ac.uk/ppsearch/
• PFAM: large collection of multiple sequence alignments
– www.sanger.ac.uk/Software/Pfam/
• DIP: interacting proteins
– dip.doe-mbi.ucla.edu/
• Specialized Databases
– Immunoglobulins: imgt.cines.fr/
– Ca2+-binding proteins: structbio.vanderbilt.edu/cabp_database/
• Molecular visualisation packages
– VMD: www.ks.uiuc.edu/Research/vmd/
– MOLMOL: www.mol.biol.ethz.ch/wuthrich/software/molmol/
– Rasmol: www.umass.edu/microbio/rasmol/
Protein-Protein Interactions
Protein Interaction Networks
Most proteins are functionally linked to other proteins
H Jeong, SP Mason, A-L Barabási & ZN Oltvai "Lethality and centrality in protein networks" Nature 2001;411(6833):41
I.1 Enzyme: Serine Protease Trypsin
• Proteases: a specific class of hydrolases
– cleave peptide bonds at specific residue positions
– aspartate proteases, cysteine proteases, serine proteases
• Trypsin is a serine protease
– cleaves on the C-terminal side of the basic residues Lys and Arg
– one of the three principal digestive proteases (the other two are pepsin and chymotrypsin)
– produced in an inactive form by the pancreas
• Pattern: His57, Asp102 and Ser195 (H-D-S)
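The cleavage rule above (cut after Lys or Arg) can be sketched as a small in-silico digest. The optional `skip_before_proline` switch reflects a commonly used refinement that is not on the slide and is an assumption here; the function name and test sequence are hypothetical.

```python
def trypsin_digest(protein: str, skip_before_proline: bool = False) -> list:
    """Split a protein after every Lys (K) or Arg (R).

    skip_before_proline=True suppresses cleavage when the next residue
    is Pro (an assumed refinement, not taken from the slide).
    """
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            next_is_pro = i + 1 < len(protein) and protein[i + 1] == "P"
            if skip_before_proline and next_is_pro:
                continue
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Trailing peptide that does not end in K or R.
        peptides.append(protein[start:])
    return peptides

if __name__ == "__main__":
    print(trypsin_digest("MAKRPLGRSK"))
    print(trypsin_digest("MAKRPLGRSK", skip_before_proline=True))
```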
[Figure: reaction scheme of peptide bond cleavage by trypsin, involving H2O]
Serine Protease: Trypsin
• Pattern: His57, Asp102 and Ser195 (H-D-S)
Principle of Catalysis
http://www.chemguide.co.uk/physical/basicrates/catalyst.html
Trypsin Complex with Inhibitor
1btc.pdb
I.2 Receptor: Growth Hormone Receptor
• Membrane-bound receptors:
– extracellular domain: ligand-binding site
– transmembrane domain: anchoring in the cell membrane
– intracellular domain: kinase or another signalling module (typically)
• Receptor for growth hormone
– member of the cytokine receptor superfamily
– dimerizes upon binding growth hormone as ligand
– activates intracellular kinase, triggers cellular signalling cascade
• Most structures contain only the extracellular or intracellular domain
– transmembrane domain is difficult to crystallize
• Patterns:
– YGEFS (growth hormone receptor)
– WSxWS (cytokine receptor family)
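The two patterns can be searched with regular expressions, reading `x` as "any residue". This is a sketch only; the dictionary keys, function name and the toy sequence in the usage line are hypothetical.

```python
import re

# "x" in the slide notation stands for any residue, written "." in a regex.
PATTERNS = {
    "growth hormone receptor": re.compile(r"YGEFS"),
    "cytokine receptor family": re.compile(r"WS.WS"),
}

def find_motifs(seq: str) -> dict:
    """Return {pattern name: [(1-based start, matched text), ...]}."""
    return {
        name: [(m.start() + 1, m.group()) for m in rx.finditer(seq)]
        for name, rx in PATTERNS.items()
    }

if __name__ == "__main__":
    # Made-up sequence containing both motifs.
    print(find_motifs("AAYGEFSEVLHGWSEWSPP"))
```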
Growth Hormone Receptor Complex with Growth Hormone
1a22.pdb
I.3 Immune System: Antibody
• Antibodies (immunoglobulins, or Ig)
– immune system: bind ‘foreign’ (non-self) characteristic structures, e.g. protein surfaces
• Heavy Chain and Light Chain
• Constant part (Fc) and Variable part (Fv)
– Fv: specific recognition of target molecule (‘antigen’)
• structure called ‘Ig fold’:
– two β-sheets face-to-face, with ‘Greek-key’ motif
– binding site between two Ig folds
– hypervariable loops participate in binding: H1, H2, H3 and L1, L2, L3
– loop composition characteristic for antigen
Pfam Ig Family Alignment
Patterns of Hypervariable Loops

| Loop | Before | After | Length |
|---|---|---|---|
| CDR-L1 | always Cys | always Trp | 10 to 17 |
| CDR-L2 | generally Ile-Tyr, also Val-Tyr, Ile-Lys, Ile-Phe | - | always 7 |
| CDR-L3 | always Cys | always Phe-Gly-xxx-Gly | 7 to 11 |
| CDR-H1 | always Cys-xxx-xxx-xxx | always Trp | 10 to 12 |
| CDR-H2 | typically Leu-Glu-Trp-Ile-Gly | [Lys, Arg]-[Leu, Ile, Val, Phe, Thr, Ala]-[Thr, Ser, Ile, Ala] | 16 to 19 |
| CDR-H3 | always Cys-xxx-xxx | always Trp-Gly-xxx-Gly | 3 to 25 |
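As an illustration of how such a rule turns into a search pattern, the sketch below encodes only the CDR-L3 row: a Cys before the loop, 7 to 11 loop residues, and Phe-Gly-x-Gly after it. The regular expression, function name and example light-chain fragment are assumptions for demonstration; real CDR assignment uses numbering schemes and more context than this.

```python
import re

# CDR-L3 row: Cys before, 7 to 11 loop residues, Phe-Gly-x-Gly after.
# The lookahead keeps the flanking FGxG out of the reported loop.
CDR_L3 = re.compile(r"C(?P<loop>[A-Z]{7,11})(?=FG[A-Z]G)")

def find_cdr_l3(seq: str) -> list:
    """Return candidate CDR-L3 loops as (1-based start, loop) tuples."""
    return [(m.start("loop") + 1, m.group("loop")) for m in CDR_L3.finditer(seq)]

if __name__ == "__main__":
    # Hypothetical light-chain fragment used only to exercise the pattern.
    print(find_cdr_l3("DFATYYCQQYNSYPLTFGGGTKLEIK"))
```

Every Cys that happens to satisfy the rule is reported, so hits are candidates rather than confirmed loops.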
Antibody Structure
Kontou et al. Eur J Biochem 2000 267 2389
1F3R.pdb
Antibody Diversity
• Gene translocation
• heavy chain – multiple VH genes join with one DH and one JH
• light chain – multiple VL genes join with one JL gene
www.cat.cc.md.us/courses/bio141/lecguide/unit3/humoral/antibodies/abydiversity/abydiversity.html
Protein-Ion and Protein-‘small molecule’ Interactions
II.1 Ion Binding: Calmodulin
• Two domains, each with two ‘EF-hands’:
– helix-loop-helix structure
– loop contains Ca2+-binding motif
• Ca2+ ion: 6-fold coordinated:
– oxygens from residues 1, 3, 5, 7, 9, and 12 in EF loop: D-K-D-G-D-G-T-I-T-T-K-Q
– one water molecule
– three are negatively charged
• Ca2+ binding changes conformation of entire protein from closed to open
– open conformation exposes hydrophobic surface area
– binding site for calmodulin target proteins
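The coordinating positions can be read off the loop sequence directly. The sketch below takes the loop and the position list exactly as given on the slide; treating Asp and Glu as the negatively charged residues is an assumption of the example, and the names are illustrative.

```python
EF_LOOP = "DKDGDGTITTKQ"                # loop sequence as given on the slide
LIGAND_POSITIONS = (1, 3, 5, 7, 9, 12)  # 1-based positions listed on the slide

def coordinating_residues(loop: str, positions=LIGAND_POSITIONS) -> list:
    """Return (position, residue) pairs for the listed loop positions."""
    return [(p, loop[p - 1]) for p in positions]

def count_acidic(residues) -> int:
    """Count Asp/Glu among (position, residue) pairs (assumed negatively charged)."""
    return sum(res in "DE" for _, res in residues)

if __name__ == "__main__":
    ligands = coordinating_residues(EF_LOOP)
    print(ligands, count_acidic(ligands))
```

For the loop as written, three of the six listed positions carry an acidic side chain, which agrees with the "three are negatively charged" bullet.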
Calmodulin Complex with Calcium Ions
1exr.pdb
II.2 Ion Pump: Calcium ATPase (ATP synthase)
• protein complex
– links electrical potential to ATP hydrolysis/synthesis
– interconversion between mechanical and electrochemical energy in molecular motors
• F1F0 ATPase: reversible proton pump/motor
• P-type ATPases: transport ions across membrane against a concentration gradient
– Pattern: D-K-T-G-T-[LIVM]-[TIS]
– next to aspartate which is phosphorylated during reaction cycle
• Na+/K+-ATPase: ubiquitous membrane transport protein in mammalian cells
– maintains high K+ and low Na+ in cytoplasm for normal membrane potentials and cellular activities
• Ca2+-ATPases: transport Ca2+ from cytoplasm to organelles (mammalian)
– e.g. sarcoplasmic reticulum, endoplasmic reticulum
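A pattern written in this dash-separated style can be translated mechanically into a regular expression. The converter below is a minimal sketch that handles only single residues, `x`, bracketed alternatives and an optional `(n)` or `(n,m)` repeat; it is not a full PROSITE parser, and the function name and test sequence are hypothetical.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supported elements: a single residue, x (any residue), [..] alternatives,
    each optionally followed by (n) or (n,m).
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        core, low, high = m.groups()
        core = "." if core == "x" else core
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)

P_TYPE_ATPASE = "D-K-T-G-T-[LIVM]-[TIS]"  # pattern from the slide

if __name__ == "__main__":
    regex = prosite_to_regex(P_TYPE_ATPASE)
    # Made-up sequence fragment containing the motif.
    print(regex, re.search(regex, "AAICSDKTGTLTQNRM"))
```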
ATPases
F1Fo-ATPase: www.rpi.edu/dept/bcbp/molbiochem/MBWeb/mb1/part2/f1fo.htm
Ca2+-ATPase: www.utoronto.ca/maclennan/rint1.htm
ATPase: Calcium Ions in Active Site
1eul.pdb
II.3 Membrane Channel: Aquaporin
Conserved NPA motifs: Asn, Pro and Ala stabilise loops through multiple hydrogen bonds
Bert de Groot: www.mpibpc.mpg.de/groups/de_groot/bgroot.html
Aquaporin: Motifs
• NPA: stabilizes loops B and E
• G(a)xxxG(a)xxG(a):
– crossing of right-handed helical bundles
Andreas Engel and Henning Stahlberg, in: Current Topics in Membranes (2001), Hohmann, Agre & Nielsen (Eds.) Academic Press
Aquaporin Subunit
Bert de Groot: www.mpibpc.mpg.de/groups/de_groot/bgroot.html
1j4n.pdb
Protein-DNA/RNA Interactions
III.1 Enzyme: Eco-RI
• Restriction enzyme:
– cuts palindromic sequences
– complex of one DNA molecule with two Eco-RI molecules with inversion symmetry
www.accessexcellence.org/RC/VL/GG/restriction.html
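"Palindromic" here means that a site equals its own reverse complement, so both strands read the same 5' to 3'. The sketch below checks this property and locates sites in a toy sequence. The Eco-RI recognition sequence GAATTC is not stated on the slide and is supplied here as background; function names and the test DNA are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ECORI_SITE = "GAATTC"  # Eco-RI recognition site (background fact, not on the slide)

def reverse_complement(dna: str) -> str:
    """Complement each base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]

def is_palindromic(dna: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    return dna == reverse_complement(dna)

def find_sites(dna: str, site: str = ECORI_SITE) -> list:
    """0-based start positions of every occurrence of site (overlaps allowed)."""
    return [i for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i)]

if __name__ == "__main__":
    print(is_palindromic(ECORI_SITE), find_sites("TTGAATTCAAGAATTCGG"))
```

The two-fold symmetry of the site mirrors the inversion symmetry of the complex with two Eco-RI molecules.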
Eco-RI
1qrh.pdb
III.2a DNA recognition: Leucine Zipper
• Dimer
– Leu interactions
– binds DNA by a fork-shaped structure
• ‘coiled-coil’ structure:
– leucines on one side of helix
– 7-residue repeat; one helix turn is 3.6 residues
a b c d e f g (position)
256 KVEELLSKNYHLENEVARLKKLVG 279
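The 7-residue repeat can be found by counting: for each of the seven possible registers, collect the leucines that recur every seventh residue and pick the register with the most. This is a sketch on the 24-residue segment shown above; it makes no claim about which heptad letter (a to g) the winning register corresponds to, and the function names are illustrative.

```python
ZIPPER = "KVEELLSKNYHLENEVARLKKLVG"  # residues 256-279 as shown on the slide
FIRST_RESIDUE = 256

def leucines_by_register(seq: str, period: int = 7) -> dict:
    """For each offset 0..period-1, list indices of Leu sampled every `period` residues."""
    return {
        offset: [i for i in range(offset, len(seq), period) if seq[i] == "L"]
        for offset in range(period)
    }

def best_register(seq: str, period: int = 7):
    """Return (offset, residue numbers) for the register with the most leucines."""
    registers = leucines_by_register(seq, period)
    offset = max(registers, key=lambda k: len(registers[k]))
    return offset, [FIRST_RESIDUE + i for i in registers[offset]]

if __name__ == "__main__":
    print(best_register(ZIPPER))
```

For this segment the winning register holds Leu260, Leu267 and Leu274, spaced exactly seven residues apart.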
Leucine Zipper: Complex with DNA
1an2.pdb
Leucine Zipper: 7-Residue Repeat
III.2b DNA Recognition: Zinc Finger Proteins
• zinc coordinates several side chains
– pulls them together to form ‘finger’ loops
• Pattern: C-x2-4-C-x12-15-H-x3-5-H or C-x2-4-C-x12-15-C
• recognize nucleic acids (DNA or RNA)
– modulate genes (also proteins can be targeted)
• modulate important functions:
– gene expression
– reverse transcription and virus assembly
• drug discovery targets:
– pathogen-specific 3D structures
– different from endogenous (cellular) zinc finger proteins
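The first pattern, with its variable-length gaps, maps onto bounded repeats in a regular expression. The sketch below is one possible encoding; the function name and the toy sequence (modelled on a C2H2-type finger) are hypothetical and serve only to exercise the pattern.

```python
import re

# C-x2-4-C-x12-15-H-x3-5-H as a regular expression ("." = any residue).
C2H2 = re.compile(r"C.{2,4}C.{12,15}H.{3,5}H")

def find_zinc_fingers(seq: str) -> list:
    """Return (1-based start, matched segment) for every non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in C2H2.finditer(seq)]

if __name__ == "__main__":
    # Made-up sequence: C-x4-C-x12-H-x3-H.
    print(find_zinc_fingers("YACPVESCDRRFSRSDELTRHIRIHTG"))
```

Because the gaps are flexible, the regex engine backtracks over the allowed gap lengths until all four anchoring residues line up.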
Zinc Finger Complex with DNA
1a1h.pdb
III.3 RNA Regulation: KH Domain
• bind to specific DNA/RNA locations
– regulation of RNA synthesis and metabolism
– combination with other domains
– Pattern: G-x-x-G
• ribonucleoprotein (RNP) domain
• double stranded RNA binding domain (dsRBD)
• K Homology (KH) domain
– recognize tetranucleotide motifs
– high affinity/specificity: RNA secondary structure, repeated sequence elements
• alpha/beta fold similar to ribosomal proteins
KH Domain Complex with RNA
1k1g.pdb
Hammerhead Motif of Ribozyme
The HHRz
Przybilski, R., et al. Plant Cell 2005;17:1877-1885
Copyright ©2005 American Society of Plant Biologists
Hammerhead Motif of Ribozyme
• three base-paired helices (I-III)
• core of 11 highly conserved, non-complementary nucleotides
– necessary for the catalysis
• catalytic motif discovered by sequence comparison of plant viroids
– site-specific, self-catalyzed cleavage
(Birikh, 1997)
academic.brooklyn.cuny.edu/chem/zhuang/QD/toppage1.htm
Hammerhead Ribozyme Action
488d.pdb
Copyright ©2005 American Society of Plant Biologists
Przybilski, R., et al. Plant Cell 2005;17:1877-1885
Modeling of the Arabidopsis HHRz Ara2