Taking Geometry to its Edge: Fast Rigid (and Hinge-Bent) Docking Algorithms.
Haim Wolfson1, Dina Duhovny1, Yuval Inbar1, Vladimir Polak1, Ruth Nussinov2,3
1 School of Computer Science, Tel Aviv University, Israel
2 School of Medicine, Tel Aviv University, Israel
3 NCI-Frederick, USA
CAPRI: Critical Assessment of PRediction of Interactions
• First docking contest: 19 groups from all over the world.
• Round 1 – 3 targets.
• Round 2 – 4 targets.
• Only 5 predictions per target can be submitted.
Targets
Round 1, Round 2 Any good prediction?
Our group
Revision
HPr kinase / HPr
Rotavirus VP6 / Fab (antibody)
Hemagglutinin (virus capsid) / Fab HC63 (antibody)
Amylase / camelid antibody VH1
Amylase / camel antibody VH2
Amylase / camel antibody VH3
T-Cell Receptor/exotoxin
Molecular Surface Representation
Local Critical Feature Selection
Geometric Matching of Critical Features
Filtering and Scoring
Active site knowledge
Candidate Transformations
PDB files
Geometric Docking Algorithms
PPD – Norel et al. 1994
• Surface representation – Connolly MS.
• Critical features – local extrema of surface curvature: ‘knobs’ / ‘holes’.
• Matching – pairs of critical points + associated normals are matched using Geometric Hashing.
• Scoring – shape complementarity (allowing moderate penetration), electrostatics, aromatic residues.
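The matching step above can be illustrated with a small sketch. It is not the PPD code: it assumes each critical feature is a (point, unit normal) tuple, builds a hash table keyed by a rigid-motion-invariant signature of feature pairs (distance plus three angles), and looks up ligand pairs with flipped normals, since complementary surfaces face each other. The names pair_signature, build_table and match_pairs, and the bin sizes, are hypothetical. A full Geometric Hashing implementation would also derive and vote for candidate transformations, which is omitted here.

```python
import math
from collections import defaultdict
from itertools import permutations

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _angle(a, b):
    # Angle in degrees between two unit vectors, clamped for safety.
    return math.degrees(math.acos(max(-1.0, min(1.0, _dot(a, b)))))

def pair_signature(p1, n1, p2, n2, dist_bin=1.0, angle_bin=15.0):
    """Quantized key for an ordered pair of (point, normal) features.

    The key is unchanged by any rotation + translation of the molecule.
    Bin sizes are illustrative assumptions.
    """
    d = _sub(p2, p1)
    length = math.sqrt(_dot(d, d))
    u = tuple(c / length for c in d)
    return (int(length // dist_bin),
            int(_angle(n1, u) // angle_bin),
            int(_angle(n2, u) // angle_bin),
            int(_angle(n1, n2) // angle_bin))

def build_table(features, **bins):
    """Hash every ordered feature pair of one molecule by its signature."""
    table = defaultdict(list)
    for i, j in permutations(range(len(features)), 2):
        (p1, n1), (p2, n2) = features[i], features[j]
        table[pair_signature(p1, n1, p2, n2, **bins)].append((i, j))
    return table

def match_pairs(receptor, ligand, **bins):
    """Return ((receptor pair), (ligand pair)) index matches.

    Ligand normals are flipped before lookup because a ligand 'knob'
    should sit in a receptor 'hole' with opposing normals.
    """
    table = build_table(receptor, **bins)
    flipped = [(p, tuple(-c for c in n)) for p, n in ligand]
    hits = []
    for i, j in permutations(range(len(flipped)), 2):
        (p1, n1), (p2, n2) = flipped[i], flipped[j]
        key = pair_signature(p1, n1, p2, n2, **bins)
        for rec_pair in table.get(key, []):
            hits.append((rec_pair, (i, j)))
    return hits
```

Each hit pairs two receptor features with two ligand features; in a real pipeline each hit would yield one candidate transformation for the filtering and scoring stage.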
BUDDA – V. Polak (M.Sc. Thesis 2002)
• Surface representation – Connolly patch centers (caps/pits/belts), distance transform grid.
• Critical features – ‘knobs’ / ‘holes’ + caps/pits/belts. Option to focus on backbone-residue-related points.
• Matching – a knob/hole + a pair of neighboring caps/pits are matched using Geometric Hashing.
• Scoring – shape complementarity, allowing moderate penetration.
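As a rough illustration of scoring by shape complementarity with moderate penetration allowed, the sketch below scores ligand surface points by their signed distance to the receptor surface. It is an assumption-laden stand-in, not the BUDDA scoring function: the receptor is modeled as a set of spheres, the signed distance is computed analytically instead of being read from a distance transform grid, and the thresholds (shell, max_penetration, penalty) are hypothetical.

```python
import math

def signed_distance(point, atoms):
    """Approximate signed distance to the receptor surface.

    atoms is a list of (center, radius); negative values mean the point
    lies inside the receptor. A distance transform grid would normally
    provide this value by lookup.
    """
    return min(math.dist(point, center) - radius for center, radius in atoms)

def shape_score(ligand_points, atoms, shell=1.0, max_penetration=2.0, penalty=3.0):
    """Score one candidate pose; returns None if the pose is rejected.

    - points within `shell` of the surface count as contacts (+1)
    - points deeper than `shell` but not deeper than `max_penetration`
      are tolerated with a penalty (moderate penetration)
    - any point deeper than `max_penetration` rejects the pose
    All thresholds are illustrative assumptions.
    """
    score = 0.0
    for p in ligand_points:
        d = signed_distance(p, atoms)
        if d < -max_penetration:
            return None
        if d < -shell:
            score -= penalty
        elif d <= shell:
            score += 1.0
    return score
```

Poses would then be ranked by this score, with rejected poses (None) discarded before ranking.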
PatchDock – Duhovny et al. 2002
• Surface representation – distance transform grid, multi-resolution surface.
• Critical features – three types of surface patches: convex, concave and flat.
• Focus on active site: hot spot rich patches.
• Matching – patch points are matched by Geometric Hashing.
• Scoring – shape complementarity, allowing moderate penetration.
Our Docking Algorithms: PPD, BUDDA, PatchDock
Surface representation
• PPD: Connolly’s MS
• BUDDA: caps/pits/belts, distance transform grid
• PatchDock: distance transform grid, multi-resolution surface
Critical features
• PPD: ‘knobs’/‘holes’, point+normal pairs
• BUDDA: backbone ‘knob/hole’ + pair of ‘caps/pits’
• PatchDock: surface patches: convex, concave and flat (point+normal pairs in a patch)
Matching algorithm
• PPD: Geometric Hashing
• BUDDA: Geometric Hashing
• PatchDock: Geometric Hashing
Filtering and scoring
• PPD: shape complementarity, electrostatics, aromatic residues
• BUDDA: shape complementarity
• PatchDock: shape complementarity
Active Site Focusing
in the matching step
in matching and scoring steps
Automatic CDR detection
• The light and heavy chains of antibodies have conserved sequence patterns around the CDRs, which enable us to align a given sequence to a consensus sequence derived from statistical data.
• This alignment is then used to locate the CDR regions.
Round 1 Docking Algorithms:
• PPD (Norel)
• BUDDA (Polak)
Target 1 – HPr Kinase
• HPr kinase / phosphatase (HprK/P) is a key regulatory enzyme controlling carbon metabolism in bacteria.
• The protein is a hexamer.
• HprK/P contains the Walker motif, characteristic of nucleotide-binding proteins.
• It catalyses the ATP-dependent phosphorylation/dephosphorylation of Ser46 in HPr.
Target 1 – HPr Kinase/HPr
• What was done:
– Distance constraint of 10.0 Å between the oxygen atom of Ser(Asp)-46 of the HPr and the closest phosphate oxygen.
• Results:
– Best result within the top 10: rank 7, RMSD from native ~8.0 Å.
– Explanation: a considerable part of the interface surface area is between the HPr and the flexible helix of the enzyme.
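The distance constraint used for Target 1 can be sketched as a simple post-filter on candidate poses. This is a minimal illustration, not the code used in the contest: it assumes each candidate is given as a (name, xyz) tuple holding the position of the Ser(Asp)-46 oxygen after the docking transformation, and the function names are hypothetical. Only the 10.0 Å threshold comes from the slide.

```python
import math

def closest_distance(atom, others):
    """Distance in Å from one atom to the nearest atom in `others`."""
    return min(math.dist(atom, o) for o in others)

def filter_by_distance(candidates, phosphate_oxygens, max_dist=10.0):
    """Keep the candidates whose Ser(Asp)-46 oxygen lies within `max_dist`
    of the closest phosphate oxygen.

    candidates: list of (name, (x, y, z)) with the oxygen position
    already transformed by that candidate's rigid motion.
    """
    return [name for name, oxygen in candidates
            if closest_distance(oxygen, phosphate_oxygens) <= max_dist]
```

For example, with phosphate oxygens at (0, 0, 0) and (1, 0, 0), a candidate placing the oxygen at (9, 0, 0) is kept (8 Å from the nearest one) and a candidate at (20, 0, 0) is dropped.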
Target 1 – Lessons Learned
• Flexible hinge-bent docking:
– Two rigid parts of the enzyme: (i) the helix of chain C, (ii) the body of chain A without the helix.
• New results:
– 2nd best scoring result: RMSD ~3.0 Å, run-time: 2 min.
– In our solution the phosphocarrier protein is in red and the helix of the kinase is in orange.
– These results were achieved without using the distance constraint.
The position of the helix in the uncomplexed structure (dark green)
The position of the helix in the solution obtained by flexible docking (orange)
The position of the helix in the structure of the complex (purple)
Target 2 – Biological Background
• VP6 protein of rotavirus, which causes gastroenteritis in children.
• Trimer (symmetry).
• The surface of the B (helices) domain is buried in the rotavirus capsid.
• The H-domain interacts with the antibody.
• A ‘hint’ was given to use the trimer in the docking, meaning that the active site extends over more than one chain.
Target 2 – what we did
• The antibody potential binding site was restricted to the CDRs.
• The antigen VP6 potential binding site was restricted to the β domain.
• We selected solutions with interfaces that include:
1. at least 4 CDRs of the antibody with high TYR, TRP concentration.
2. at least 2 chains of the antigen.
• Clustering of the solutions obtained for the different chains of the trimer.
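The selection rule above (at least 4 CDRs and at least 2 antigen chains in the interface) can be sketched as follows. The sketch makes several assumptions that are not on the slides: atoms are given as (label, xyz) tuples, the interface is defined by a 4.0 Å atom-atom cutoff, and the TYR/TRP concentration criterion is left out because the slide does not define it. Function names are hypothetical.

```python
import math

def interface_summary(antibody_atoms, antigen_atoms, cutoff=4.0):
    """Collect the CDRs and antigen chains that take part in the interface.

    antibody_atoms: list of (cdr_label_or_None, (x, y, z))
    antigen_atoms:  list of (chain_id, (x, y, z))
    An atom pair closer than `cutoff` Å (an assumed value) is a contact.
    """
    cdrs, chains = set(), set()
    for cdr, ab_xyz in antibody_atoms:
        for chain, ag_xyz in antigen_atoms:
            if math.dist(ab_xyz, ag_xyz) <= cutoff:
                if cdr is not None:
                    cdrs.add(cdr)
                chains.add(chain)
    return cdrs, chains

def keep_solution(antibody_atoms, antigen_atoms, min_cdrs=4, min_chains=2):
    """Apply the two interface criteria to one docking solution."""
    cdrs, chains = interface_summary(antibody_atoms, antigen_atoms)
    return len(cdrs) >= min_cdrs and len(chains) >= min_chains
```

The double loop is quadratic in the number of atoms; a spatial grid would be the usual choice for real structures.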
Target 2 – our best hit
Our solution in blue vs. original complex (RMSD 15 Å, rank 7)
(within 5 results that were not submitted due to technical problems)
Target 2 – Lessons Learned 1
• Search only loop regions of the antigen
• Restrict even further the antigen to the exposed part of the virus capsid
Loops of the “cap” region are in spacefill
Side view
Top view
Target 2 – Lessons Learned 2
• Filter out results that cause steric clash of the 3 (symmetric) antibodies binding to the antigen trimer.
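One way to sketch this symmetry filter: generate the symmetry-related copies of a docked antibody and reject the pose if any copy comes too close to the original. The sketch assumes the trimer's 3-fold axis is the z axis through the origin and uses an assumed 3.0 Å clash distance; both are illustrative choices, and the function names are hypothetical.

```python
import math

def rotate_z(p, degrees):
    """Rotate point p about the z axis by the given angle."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1], p[2])

def symmetric_copies_clash(antibody_atoms, fold=3, clash_dist=3.0):
    """True if the docked antibody clashes with any of its symmetry mates.

    antibody_atoms: list of (x, y, z) in a frame where the symmetry axis
    of the antigen oligomer is the z axis (an assumption of this sketch).
    """
    for k in range(1, fold):
        mate = [rotate_z(p, 360.0 * k / fold) for p in antibody_atoms]
        for p in antibody_atoms:
            for q in mate:
                if math.dist(p, q) < clash_dist:
                    return True
    return False
```

An antibody docked far from the axis passes, while one docked on the axis overlaps its own copies and is filtered out.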
Target 2 : a-posteriori best hits
• Our best hit: RMSD 3.08 Å, rank 76
• Best within the first 10: RMSD 5.54 Å, rank 9
• Run-time: 7 min
VP6 molecule in spacefill, original complex Fab is in blue superimposed on our solution (rank 9) in yellow.
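The "best hit" and "best within the first 10" figures combine two quantities: the RMSD of each solution from the native complex and its rank by score. A minimal sketch of both is given below. It assumes RMSD is computed directly between corresponding ligand atoms with the receptor held in a common frame (no extra superposition); the slides do not say which atoms were used. Function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def best_in_top(ranked_rmsds, k):
    """Return (rank, rmsd) of the lowest-RMSD solution among the top k.

    ranked_rmsds lists the RMSD of each solution in score order, best
    score first; ranks are 1-based as on the slides.
    """
    top = ranked_rmsds[:k]
    best = min(range(len(top)), key=lambda i: top[i])
    return best + 1, top[best]
```

Calling best_in_top with k equal to the full list length gives the overall best hit; k=10 gives the best solution a top-10 submission could have contained.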
Target 2 – analysis of geometric shape complementarity
• The area of the interface of the original complex (blue): ~400 Å².
• The area of the interface in our highest ranked solution (yellow): ~600 Å².
• In this result the light chain of the antibody is shifted towards the center of the virus capsid, increasing shape complementarity. The heavy chain is very close to its original location.
Heavy chains
Light chains
Target 2 – shape complementarity
Heavy chains only
Light chains
Target 3 – influenza hemagglutinin
• Hexamer: 3 dimers (symmetry).
• One chain of each dimer is buried in the capsid.
• Other antibody-antigen complexes of this antigen also imply that the epitope is on the ‘external’ chains (A, C and E).
Target 3 – what we did
• The antibody potential binding site was restricted to the CDRs.
• We selected solutions with an interface that includes:
1. at least 4 CDRs of the antibody with high TYR, TRP concentration.
2. only 1 chain of the antigen (the main reason for failure!).
• Clustering of the solutions obtained for the different chains of the trimer.
Target 3 – Lessons Learned
• Restrict antigen potential binding site to the exposed domain of the virus capsid
• Filter out results that cause steric clash of the 3 antibodies (symmetry constraint).
• Filter out results that include only one chain of the virus capsid in the interface.
• Detect structurally conserved regions of Influenza virus Hemagglutinin to reduce the effective protein surface.
• MultiProt – a tool for multiple alignment and detection of structurally conserved patterns. Applied to 25 structures of Hemagglutinin from the PDB.
Target 3 – MultiProt Results
138 structurally conserved residues out of 320 residues in domain HA1. Some of those residues exhibit significant sequence variability.
HA1 domain
HA2 domain
Structurally conserved residues
Three sites of antibody binding:
• Capri Target 3
• 1QFU
• 2VIR
Hemagglutinin molecule
Structurally conserved residues
Target 3 – MultiProt Results
1qfu, capri3, 2vir
Target 3 – final results
RMSD 3.10 Å, rank 6, run-time: 5 min
The original complex antibody is in red and our solution is in green.
Virus capsid protein hemagglutinin
antibody from complex
docking solution
Round 2 Docking Algorithms:
• BUDDA (Polak)
• PatchDock (Duhovny)
Targets 4,5,6 – alpha-amylase
3 catalytic residues in the largest cavity: Asp 197, Asp 300, Glu 233
Ca (stabilizer ion)
Cl (activator ion)
Gly-rich flexible loop (304-309)
Targets 4,5,6 – what we did
• Non conserved regions, based on multiple sequence alignment of mammalian amylase, were extracted.
• These regions were marked as the amylase potential binding site for the camel antibody.
• Favor results with a wider interface area for CDR loop H3.
• Favor results with a wider interface area for the variable regions (the reason for failure in all 3 targets).
Why non-conserved?
amylase, antibody
The camel has its own amylase.
It can only produce antibodies against the residues that differ between the two amylases, so the interface must include some of those residues.
We don’t know the sequence of camel amylase, so we simply consider variable regions.
Targets 4,5,6 – non-conserved regions in the interfaces
• Target 4: 15% of the interface
• Target 5: 13% of the interface
• Target 6: 20% of the interface
ConSurf output:
Targets 4,5,6 – automatic CDR detection
• Target 4: 89% of the interface
• Target 5: 88% of the interface
• Target 6: 83% of the interface
Target 4 Target 5
amylase
antibody
CDRs
Targets 4,5,6 – new results
Only restriction – at least 70% of the interface in the candidate complexes belongs to CDRs.
Average running time: 25 min
Target | Best RMSD (Å) | Rank | Interface area of the correct solution | Interface area of the highest ranked solution
4 | 2.67 | 169 | ~405 Å² | ~765 Å²
5 | 1.82 | 156 | ~435 Å² | ~700 Å²
6 | 1.90 | 4 | ~570 Å² | ~600 Å²
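The single restriction used for these results (at least 70% of the interface belongs to CDRs) can be sketched as a filter over scored solutions. The sketch approximates "share of the interface" by the fraction of antibody interface atoms that carry a CDR label, which is an assumption; the slides report interface area and may have used an area-based measure. The function names and the (score, labels) input layout are hypothetical.

```python
def cdr_fraction(interface_labels):
    """Fraction of antibody interface atoms that belong to a CDR.

    interface_labels: one entry per antibody interface atom, either a CDR
    name such as "H3" or None for framework atoms.
    """
    if not interface_labels:
        return 0.0
    in_cdr = sum(1 for label in interface_labels if label is not None)
    return in_cdr / len(interface_labels)

def filter_solutions(solutions, min_fraction=0.7):
    """Keep solutions meeting the CDR restriction, best score first.

    solutions: list of (score, interface_labels) tuples.
    Returns the surviving scores in descending order.
    """
    kept = [score for score, labels in solutions
            if cdr_fraction(labels) >= min_fraction]
    return sorted(kept, reverse=True)
```

The rank of the correct solution in the table above is then its position in the filtered, score-sorted list.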
Target 7: T-cell receptor with streptococcal pyrogenic exotoxin
TCR-SAG complex in blue, streptococcal toxin in yellow
• In the PDB search we found a complex of TCR with staphylococcal enterotoxin.
• The toxins have high structural similarity, so the structural alignment is our first solution.
Target 7: Docking
Active site focusing:
• TCR: only loops that are relevant for SAG binding were selected.
• Toxin: loops from the interface of the complex computed by alignment were selected.
Target 7: Docking Results
• Active site focusing for TCR and SAG: best result RMSD 3.37 Å, rank 3, running time: 1 min
• Active site focusing for TCR only: best result RMSD 3.37 Å, rank 36, running time: 7 min
Conclusions 1
• We have presented results of fast rigid docking algorithms, which are based on geometric shape complementarity only.
• The algorithms can be easily extended to include main-chain flexibility (hinge bending).
• Successful approximate focusing on the binding sites of the proteins “almost” ensures ranking of a “correct” solution at the top.
Conclusions 2
• Despite the heuristic nature of the algorithms, which are based on local shape complementarity and not on exhaustive search of the transformation space, “correct solutions” are not lost.
• A “correct solution” always appears among the first few hundred, yet the “best solution” might exhibit significantly higher shape complementarity than the “native” one.
Conclusions 3
• Re-ranking the top few hundred geometric solutions with a GOOD energy function would result in obtaining a correct solution.
• Biological knowledge of “similar” interactions can assist in the focusing on the binding sites.
• A fully automatic prediction can help to evaluate the relative merit of the various algorithms employed.
Acknowledgements
• The CAPRI organizers and evaluators.
• Maxim Shatsky, Hadar Benyamini, Inbal Halperin, Adi Barzilay, Snait Tamir.
• Raquel Norel.