Predicting PDZ domain protein-protein interactions from the genome
Gary Bader
Donnelly Centre for Cellular and Biomolecular Research
University of Toronto
VanBUG, Vancouver, Jan. 8, 2009
http://baderlab.org
Computational Cell Map
Cary MP et al. Pathway information… FEBS Lett. 2005
Bader GD et al. Functional genomics and proteomics Trends Cell Biol. 2003
Map the cell
• Predict map from genome
• Multiple perturbation mapping
• Active cell map
• Map visualization and analysis software
Read map to understand
• Cell processes
• Gene function
• Disease effects
• Map evolution
How are biological networks in the cell encoded in the genome?
Can we accurately predict biologically relevant interactions from a genome?
How do genome sequence changes underlying disease affect the molecular network in the cell?
Can we predict how well model pathways or phenotypes will translate to human?
Can we design new networks de novo?
Predicting Protein Interaction Networks From the Genome
• Ideally: accurately predict
• Reality:
  – Not currently possible
  – Signaling pathways too divergent to accurately map by orthology
  – Protein interaction prediction likely as hard as protein folding in general, e.g. induced fit
Predicting Networks
• Map via orthology relationships
  – Metabolic pathways
    • E.g. KEGG, BioCyc, metaSHARK
  – Protein-protein interactions
    • E.g. OPHID, HomoMINT
  – Signaling pathways
    • E.g. Reactome
• Infer using functional associations
  – Phylogenetic profile, Rosetta Stone
• Infer from molecular profiles
  – Gene expression → gene regulatory network
  – E.g. ARACNE, MEDUSA, MatrixREDUCE
Bader & Enright
Pinney et al. NAR 2005
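The "map via orthology" idea for protein-protein interactions can be sketched in a few lines of Python. This is only an illustration of the general principle: the protein names and the ortholog table below are hypothetical placeholders, not data from OPHID or HomoMINT, and real resources apply additional filtering.

```python
from itertools import product

def map_interactions(interactions, orthologs):
    """Transfer interactions from a source species to a target species.

    interactions: iterable of (a, b) protein pairs in the source species.
    orthologs: dict mapping a source protein to a list of target orthologs.
    Returns a set of predicted target-species pairs (unordered frozensets).
    """
    predicted = set()
    for a, b in interactions:
        # Every pair of orthologs of an interacting pair is a candidate.
        for x, y in product(orthologs.get(a, []), orthologs.get(b, [])):
            if x != y:
                predicted.add(frozenset((x, y)))
    return predicted

# Hypothetical example data, for illustration only.
worm_interactions = [("wA", "wB"), ("wB", "wC")]
worm_to_human = {"wA": ["hA1", "hA2"], "wB": ["hB"]}  # wC has no ortholog

predicted = map_interactions(worm_interactions, worm_to_human)
for pair in sorted(tuple(sorted(p)) for p in predicted):
    print(pair)
```

Interactions whose partners lack an ortholog are silently dropped, which is one reason divergent signaling pathways map poorly with this approach.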
Peptide Recognition Domains
• Simple binding sites
• Well studied
• Numerous
• Biologically important
  – Eukaryotic signaling systems often involve modular protein-protein interaction domains
http://pawsonlab.mshri.on.ca/
http://nashlab.uchicago.edu/domains/
Protein Domain Interaction Network Prediction
Genome
Gene and protein prediction
Domain prediction
Specificity prediction
Protein-protein interaction prediction
PDZ Domains
• 80-90 aa's, 5-6 beta strands, 2 alpha helices
• Recognize hydrophobic C-termini
• Membrane localization of signaling components
• Neuronal development, cell polarity, ion channel regulation
Figure: Par-6 PDZ domain with peptide VKESLV-COOH (1RZX, fly); C-terminus labeled C
Tonikian et al. PLoS Biology Sep. 2008
Dev Sidhu
~250 Human PDZ Domains
Multiple sequence alignment
C-Terminus
PDZ Binding Motifs
Color key: polar, basic, acidic, hydrophobic
Class 1: X[T/S]X
Class 2: XX
Sequence Logo
Peptides:
SWWPDSWV
NAFEETWV
NPFWDVWV
NPFWDVWV
SVDVDTWV
-AYFDTWV
STFLETWV
KGVFESWV
ESWHDSWV
-GDQDTWV
GRWMDTWV
KFWRDTWL
…
Profile (rows: amino acid; columns: position, 0 = C-terminal residue)
     -3     -2     -1      0
A     0      0      0      0
C     0      0      0      0
D     0.7    0.1    0      0
E     0.3    0.05   0      0
F     0      0      0.05   0
G     0      0      0      0
H     0      0      0      0
I     0      0      0      0
K     0      0      0      0
L     0      0      0      0.1
M     0      0      0      0
N     0      0      0      0
P     0      0      0      0
Q     0      0      0      0
R     0      0      0      0
S     0      0.15   0      0
T     0      0.7    0      0
V     0      0      0      0.9
W     0      0      0.95   0
Y     0      0      0      0
Logo
polar=green, basic=blue, acidic=red, hydrophobic=black
http://weblogo.berkeley.edu/
Schneider TD, Stephens RM. 1990. Nucleic Acids Res. 18:6097-6100
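As a rough sketch of the peptides → profile → logo step, the Python below counts residues per C-terminal position and computes a per-column information content, the quantity a sequence logo uses for column height. It uses only the twelve peptides listed on the slide, so the frequencies will not match the profile table, which presumably came from a larger set. The function names are illustrative, and the small-sample correction that logo tools can apply is omitted.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# The twelve peptides listed on the slide ('-' marks an alignment gap).
PEPTIDES = [
    "SWWPDSWV", "NAFEETWV", "NPFWDVWV", "NPFWDVWV",
    "SVDVDTWV", "-AYFDTWV", "STFLETWV", "KGVFESWV",
    "ESWHDSWV", "-GDQDTWV", "GRWMDTWV", "KFWRDTWL",
]

def build_profile(peptides, width=4):
    """Amino acid frequencies for the last `width` positions.

    Positions are numbered from the C-terminus: 0 is the last residue,
    -1 the one before it, and so on. Gap characters are skipped.
    """
    profile = {}
    for offset in range(width):
        column = [p[-1 - offset] for p in peptides if p[-1 - offset] != "-"]
        counts = Counter(column)
        profile[-offset] = {aa: counts[aa] / len(column) for aa in AMINO_ACIDS}
    return profile

def information_content(freqs):
    """Column height in bits: log2(20) minus the Shannon entropy."""
    entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
    return math.log2(len(AMINO_ACIDS)) - entropy

profile = build_profile(PEPTIDES)
for pos in sorted(profile):
    top = max(profile[pos], key=profile[pos].get)
    bits = information_content(profile[pos])
    print(f"position {pos}: most frequent {top}, {bits:.2f} bits")
```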
82 worm and human PDZ specificities mapped by phage display
~3100 peptides
PDZ Specificity Map
Class 1: X[T/S]X
Class 2: XX
Class 3: X[D/X]X
Class 4: XGX
16 Classes
Specificity at Most Positions
Position Versatile
Many Distinct Specificities
Versatile and Robust
91 Erbin mutants profiled by phage display, 3400 peptides
Mutations cause specificity switch, not function loss
Conserved Specificity, Expanded Use
PDZ domains are versatile, but only ~16 classes used from worm to human
One billion years of evolution
Model: specificities arose early, domains expanded under evolutionary constraints
Raffi Tonikian
Predicting PDZ Specificity
>ERBB2IP-1
RVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPEGPASKLLQPGDKIIQANGYSFINIEHGQAVSLLKTFQNTVELII
Tonikian et al. PDZ specificity map
Sequence Predicts Specificity
50 mapped PDZ domains >70% similar to 69 unmapped PDZ domains
Double coverage to 45% of worm/human
33 more PDZ groups, 110 singletons
Legend: Mapped, Unmapped, Worm, Human
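One simple way to picture this kind of sequence-based transfer is a nearest-neighbour lookup: an unmapped domain inherits the specificity of the most similar mapped domain if the similarity exceeds 70%. The sketch below assumes pre-aligned sequences and uses percent identity as a stand-in for the similarity measure, which the slide does not specify. The sequences and names are made-up toy strings.

```python
def percent_identity(a, b):
    """Percent identity over aligned, non-gap columns of two sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def transfer_specificity(unmapped, mapped, threshold=70.0):
    """Assign each unmapped domain its most similar mapped domain.

    Only matches strictly above `threshold` percent are kept.
    Returns {unmapped_name: (mapped_name, percent_identity)}.
    """
    assignments = {}
    for name, seq in unmapped.items():
        best_name, best_score = None, threshold
        for mapped_name, mapped_seq in mapped.items():
            score = percent_identity(seq, mapped_seq)
            if score > best_score:
                best_name, best_score = mapped_name, score
        if best_name is not None:
            assignments[name] = (best_name, best_score)
    return assignments

# Hypothetical toy sequences, for illustration only.
mapped = {"mapped_1": "GFSISGGVGG", "mapped_2": "KLLQPGDKII"}
unmapped = {"query_1": "GFSITGGVGG", "query_2": "AAAAAAAAAA"}

assignments = transfer_specificity(unmapped, mapped)
print(assignments)
```

Domains that fall below the threshold for every mapped domain stay unassigned, corresponding to the groups and singletons that still need experimental mapping.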
Are Residues Correlated?
~80
~3000
Boris Reva, Chris Sander
Top 10 1-1 Rules
Domain Position  Peptide Position  Joint Freq  Domain Freq  Peptide Freq  Mutual Information
(H@105)          (T@7)             886         1367         913           0.166384111
(P@53)           (T@7)             373         411          913           0.130328629
(Q@67)           (W@8)             366         377          1037          0.117349366
(V@109)          (T@7)             836         1430         913           0.115598151
(S@64)           (E@6)             218         386          414           0.109298916
(V@9)            (W@4)             150         202          340           0.109096478
(A@102)          (E@6)             228         429          414           0.107661006
(L@30)           (S@6)             207         383          384           0.106889284
(P@53)           (E@6)             219         411          414           0.103683514
(L@26)           (E@6)             391         1138         414           0.10274842
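For the top rule, (H@105) with (T@7):
$$
p_{\text{joint}} \,\ln \frac{p_{\text{joint}}}{p_{\text{domain}}\; p_{\text{peptide}}}
= \frac{886}{2083} \,\ln \frac{886/2083}{(1367/2083)\,(913/2083)}
\approx 0.17
$$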
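The arithmetic behind the Mutual Information column can be checked with a short Python sketch. It computes the single term p_joint · ln(p_joint / (p_domain · p_peptide)) from the three counts and the total of 2083 shown on the slide. The function name is illustrative, and only the first three table rows are included.

```python
import math

TOTAL = 2083  # total count used as the denominator on the slide

def mi_term(joint, domain, peptide, total=TOTAL):
    """One term of mutual information, from raw co-occurrence counts."""
    p_joint = joint / total
    p_domain = domain / total
    p_peptide = peptide / total
    return p_joint * math.log(p_joint / (p_domain * p_peptide))

# (domain residue, peptide residue, joint, domain, peptide) from the table.
rules = [
    ("H@105", "T@7", 886, 1367, 913),
    ("P@53", "T@7", 373, 411, 913),
    ("Q@67", "W@8", 366, 377, 1037),
]

for dom, pep, joint, d_count, p_count in rules:
    print(f"({dom}) ({pep}) {mi_term(joint, d_count, p_count):.6f}")
```

A term is large when the two residues co-occur much more often than their individual frequencies would predict under independence.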
Correlation Validation
Prediction Can Be Accurate
Experiment
Prediction
Challenge: But Not Always
Experiment
Prediction
Shirley Hui
Predicting PDZ Specificity
Machine Learning
Training Examples (Binding and Non-binding PDZ-Peptide Pairs)
  Positive: YES YES YES YES …
  Negative: NO NO NO NO …
Test Examples (PDZ-Peptide Pairs): ? ? …
Predictions: YES NO …
Shirley Hui, Xiaojian Shao
Consider sequence and physicochemical properties
High accuracy at matching known domains to peptides
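The train-then-predict workflow on this slide can be illustrated with a very small binary classifier. The sketch below is not the method used in the talk, which the slide does not name: it is a plain logistic regression trained by stochastic gradient descent, with hypothetical features (residue identities, a hydrophobicity flag, and domain-peptide residue pairs) and entirely made-up training pairs.

```python
import math

HYDROPHOBIC = set("AVILMFWC")

def encode(domain_site, peptide):
    """Sparse binary features for one PDZ-peptide pair (hypothetical encoding)."""
    feats = {"bias": 1.0}
    for i, d in enumerate(domain_site):
        feats[f"dom{i}={d}"] = 1.0
    for j, p in enumerate(peptide):
        feats[f"pep{j}={p}"] = 1.0
        if p in HYDROPHOBIC:
            feats[f"pep{j}:hydrophobic"] = 1.0
        # Pair features let the model learn domain-specific preferences.
        for i, d in enumerate(domain_site):
            feats[f"dom{i}={d}|pep{j}={p}"] = 1.0
    return feats

def predict_proba(weights, feats):
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, learning_rate=0.1):
    """Logistic regression fitted by stochastic gradient descent."""
    weights = {}
    for _ in range(epochs):
        for domain_site, peptide, label in examples:
            feats = encode(domain_site, peptide)
            error = label - predict_proba(weights, feats)
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + learning_rate * error * value
    return weights

# Made-up training pairs: (domain binding-site residues, peptide, binds?).
TRAINING = [
    ("HV", "ETWV", 1), ("HV", "DSWV", 1), ("HV", "EFWV", 0), ("HV", "DLWL", 0),
    ("QL", "EFWV", 1), ("QL", "DLWL", 1), ("QL", "ETWV", 0), ("QL", "DSWV", 0),
]

weights = train(TRAINING)
for domain_site, peptide in [("HV", "DTWV"), ("QL", "DTWV")]:
    p = predict_proba(weights, encode(domain_site, peptide))
    print(domain_site, peptide, "YES" if p >= 0.5 else "NO")
```

The pair features matter here: in this toy set the same peptide binds one domain and not the other, so features of the peptide alone cannot separate the classes.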
Genome Search
Phage Results (ERBIN PDZ):
SWWPDSWV
NAFEETWV
NPFWDVWV
NPFWDVWV
SVDVDTWV
-AYFDTWV
STFLETWV
KGVFESWV
ESWHDSWV
-GDQDTWV
GRWMDTWV
KFWRDTWL
…
Profile
polar=green, basic=blue, acidic=red, hydrophobic=black
>Q86W91_HUMAN Plakophilin 4, isoform b
...LKSTTNYVDFYSTKRPSYRAEQYPGSPDSWV
C-terminus: QYPGSPDSWV
Genome Search
DSWV
PDZ ERBIN
C-Terminal Match
Score: 5.5
Predicted C-Terminal Motif
$$
\sum_{i=1}^{w} \log_{10} p_i
$$
Assumes: position independence, uniform input, good sampling
Physiological binder is similar to phage sequence
Known Interactor
High Score
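A position-independent profile search of protein C-termini might look like the Python sketch below. It scores the last four residues of a sequence by summing log10 frequencies from the profile table in the Sequence Logo slide. The floor value for unseen residues is an assumption, and the slide's score of 5.5 is on a scale that cannot be recovered from the text, so the numbers here are not comparable to it.

```python
import math

# Nonzero entries of the profile table from the Sequence Logo slide.
PROFILE = {
    -3: {"D": 0.7, "E": 0.3},
    -2: {"D": 0.1, "E": 0.05, "S": 0.15, "T": 0.7},
    -1: {"F": 0.05, "W": 0.95},
    0: {"L": 0.1, "V": 0.9},
}
FLOOR = 0.01  # assumed minimum frequency for residues never observed

def score_c_terminus(sequence, profile=PROFILE, floor=FLOOR):
    """Sum of log10 profile frequencies over the C-terminal residues.

    Treats positions as independent, as stated on the slide.
    """
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence shorter than the profile")
    tail = sequence[-width:]
    score = 0.0
    for position, residue in zip(range(-(width - 1), 1), tail):
        freq = profile[position].get(residue, 0.0)
        score += math.log10(max(freq, floor))
    return score

# C-terminal fragment of Q86W91_HUMAN (Plakophilin 4) from the slide.
plakophilin4_tail = "LKSTTNYVDFYSTKRPSYRAEQYPGSPDSWV"
print(round(score_c_terminus(plakophilin4_tail), 3))
```

Ranking every protein C-terminus in a proteome by this score gives a prioritized list of candidate binders, subject to the independence and sampling assumptions listed above.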
Prediction Can Be Accurate
ERBIN PDZ Interaction Prediction
10E-5 (High)
Probability of PDZ binding
10E-7 (Low)
ERBB2IP-1
…but requires further experimental support
...
p-value
Network of prioritized human PDZ interactions
336 interactions between 54 PDZ domains, 247 proteins
Matches known biology, significantly enriched in known interactors
8% overlap, p = 8.6 × 10^-18
In vitro, Biologically Relevant (In vivo)
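The slide does not say which test produced the enrichment p-value. One common choice for overlap between a predicted set and a known set is the hypergeometric tail probability, sketched below in Python. All counts in the example are hypothetical, since the sizes of the known-interaction set and of the background are not given.

```python
from math import comb

def hypergeom_pvalue(overlap, predicted, known, universe):
    """P(X >= overlap) under the hypergeometric distribution.

    universe:  number of candidate pairs considered
    known:     how many of those are known interactions
    predicted: how many pairs were predicted
    overlap:   how many predicted pairs are also known
    """
    upper = min(predicted, known)
    total = comb(universe, predicted)
    tail = sum(
        comb(known, k) * comb(universe - known, predicted - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts, for illustration only.
p = hypergeom_pvalue(overlap=3, predicted=10, known=5, universe=50)
print(f"p = {p:.4g}")
```

Exact integer binomials keep the calculation stable even when the resulting probability is very small.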
Future: In vivo Protein Interaction Prediction
PDZ
Phage Display
Genome
Protein Expression
Protein Function
Protein Location
Protein Structure
Evolutionary Context
Network Context
Bind
DLGs NMDAR
In silico Predictions
Peptides
PDZ Human-Virus Interactions
89 viral proteins matched better than any human protein (vs. 30 domains)
Affinities (ELISA)
Yingnan Zhang
Crtam peptide inhibitor blocks SCRIB-3 binding and polarization
Synthetic viral peptide promotes T cell proliferation
T cell
Jung-Hua Yeh and Andrew Chan
Crtam: Ig transmembrane protein important in late phase T cell activation
Figure labels: Non SCRIB binding, SCRIB Binding
Conclusions
• PDZ domains are highly specific, versatile and robust to mutation
• Many specificities possible, but only a few are used
• Specificity can be predicted from domain sequence
• Prioritize predictions for experimental follow up
• Use by pathogens
• PDZ specificity map useful for:
  – Novel protein interaction discovery
  – Peptidomimetic therapeutic design
  – PDZ design (synthetic biology)
Cell map exploration and analysis
Databases
Literature
Expert knowledge
Experimental Data
Can we accurately predict protein interactions?
Pathway Information
Pathway Analysis (Cytoscape)
http://pathguide.org
Vuk Pavlovic
~280 Pathway Databases!
Pathway Commons: A Public Library
• Books: Pathways
• Lingua Franca: BioPAX OWL
• Index: cPath pathway database software
• Translators: translators to BioPAX
• Open access, free software
• No competition: Author attribution
• Aggregate ~20 databases in BioPAX format
http://pathwaycommons.org
Sander Lab (MSKCC), Bader Lab
Network visualization and analysis
UCSD, ISB, Agilent, MSKCC, Pasteur, UCSF, Unilever, U of Toronto, U of Michigan
http://cytoscape.org
• Pathway comparison
• Literature mining
• Gene Ontology analysis
• Active modules
• Complex detection
• Network motif search
Gene Function Prediction
•Guilt-by-association principle
•Biological networks are combined intelligently to optimize prediction accuracy
•Algorithm is faster and more accurate than its peers
http://www.genemania.org
Quaid Morris (CCBR)
Rashad Badrawi, Ovi Comes, Sylva Donaldson, Christian Lopes, Jason Montojo, Khalid Zuberi
Canadian Bioinformatics Workshops 2009
Interpreting Gene Lists from -omics Studies
Date: July 9-10, 2009, Toronto
Faculty: Gary Bader, Quaid Morris & Wyeth Wasserman
Clinical Genomics and Biomarker Discovery
Date: July 16-17, 2009, Toronto
Faculty: Sohrab Shah
Informatics on High-Throughput Sequencing Data
Date: July 23-24, 2009, Toronto
Faculty: Michael Brudno, Asim Siddiqui & Francis Ouellette
Exploratory Data Analysis and Essential Statistics using R
October 2-3, 2009, Toronto
Faculty: Raphael Gottardo and Boris Steipe
Applications now being accepted at www.bioinformatics.ca
Limited registration
Registration Fee: $500
Acknowledgements
PDZ Work
Genentech: Dev Sidhu, Yingnan Zhang, Heike Held, Stephen Sazinsky, Yan Wu
University of Toronto: Charlie Boone, Raffi Tonikian, Xiaofeng Xin
MSKCC: Chris Sander, Boris Reva
Cytoscape
Trey Ideker (UCSD): Kei Ono, Mike Smoot, Peng Liang Wang (Ryan Kelley, Nerius Landys, Chris Workman, Mark Anderson, Nada Amin, Owen Ozier, Jonathan Wang)
Lee Hood (ISB): Sarah Killcoyne, John Boyle, Ilya Shmulevich (Iliana Avila-Campillo, Rowan Christmas, Andrew Markiel, Larissa Kamenkovich, Paul Shannon)
Benno Schwikowski (Pasteur): Mathieu Michaud (Melissa Cline, Tero Aittokallio)
Chris Sander (MSKCC): Ethan Cerami, Ben Gross (Robert Sheridan)
Annette Adler (Agilent): Allan Kuchinsky, Mike Creech (Aditya Vailaya)
Bruce Conklin (UCSF): Alex Pico, Kristina Hanspers
Bader Lab
G2N: Chris Tan, David Gfeller, Shirley Hui, Xiaojian Shao, Shobhit Jain
MP: Anastasija Baryshnikova, Iain Wallace, Laetitia Morrison, Ron Ammar
ACM: Daniele Merico, Ruth Isserlin, Vuk Pavlovic, Oliver Stueker
Pathway Commons
Chris Sander, Ethan Cerami, Ben Gross, Emek Demir, Robert Hoffmann, Igor Rodchenkov, Rashad Badrawi
Funding
CIHR, NSERC, NIH, Genome Canada, Canada Foundation for Innovation/ORF
http://baderlab.org