NMR in Structural Genomics: Strategic and Technical Challenges
Kurt Wüthrich
The Scripps Research Institute, La Jolla, CA, USA and
ETH Zürich, Zürich, Switzerland
Korea Advanced Institute of Science and Technology
Daejeon, Korea, March 20, 2009
JCSG Annual Meeting 2007
JCSG Annual Meeting 2008
The Expanding Protein Universe
Protein sequences (> 6,000,000)
Protein structures (~50,000)
Structural leverage (50-400 fold)
PSI Mission: To make the 3-dimensional atomic-level structures of most proteins easily obtainable from knowledge of their corresponding DNA sequences
Can HTP structural genomics provide broad structural coverage with experimental structures leveraged by homology models?
Protein universe can be subdivided into protein families
• Protein sequences (domains) can be grouped into ~32,000 families, but a very large number (~21,000) have only one member
So most of sequence space (>70%) can be covered with 3,000 well-chosen structures, and >87% with 6,000
Modified from John Moult & Jeremy Berg, NIGMS
PSI-2 goal: To determine representative structures for large families with no or insufficient structural coverage
Pan-genomic analyses
JCSG Cumulative Scoreboard (09/01/2000 – 12/07/2008)
Cloned: 22,676
Expressed: 22,357
Crystallized: 1,618
Solved: 810
Targets in PDB: 722
Structures in PDB: 748
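The cumulative counts above can be read as a funnel. The following is a minimal sketch of the stage-to-stage ratios, using only the four numbers quoted on the slide. The function name `stage_yields` is made up for this illustration. The counts are cumulative totals, not a tracked cohort of targets, so the ratios are only a rough indication of attrition.

```python
# Cumulative JCSG counts quoted on the scoreboard slide (09/01/2000 - 12/07/2008).
counts = [
    ("Cloned", 22676),
    ("Expressed", 22357),
    ("Crystallized", 1618),
    ("Solved", 810),
]


def stage_yields(stages):
    """Return (from_stage, to_stage, fraction) for each consecutive pair of stages."""
    return [
        (name_a, name_b, n_b / n_a)
        for (name_a, n_a), (name_b, n_b) in zip(stages, stages[1:])
    ]


yields = stage_yields(counts)
overall = counts[-1][1] / counts[0][1]  # solved structures per cloned target

for name_a, name_b, fraction in yields:
    print(f"{name_a} -> {name_b}: {fraction:.1%}")
print(f"Cloned -> Solved overall: {overall:.1%}")
```

Under this reading, the largest loss falls between expression and crystallization, which is the step where a complementary method such as NMR can contribute.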
Stage             Y01    Y02    Increase  Y03    Increase  Y04* current  Y04* projected  Increase
Selected          4816   6138   27%       7713   26%       2080          4576            -41%
Activated         4816   6138   27%       7713   26%       2080          4576            -41%
Cloning           4589   6093   33%       7615   25%       2080          4576            -40%
Expression        4222   5796   37%       7584   31%       2025          4455            -41%
Purification      484    851    76%       1102   29%       646           1421            29%
Crystal Setup     484    851    76%       1065   25%       646           1421            33%
Crystallization   463    772    67%       1004   30%       538           1184            18%
Mounted Crystals  214    400    87%       642    61%       372           818             27%
Screened          214    400    87%       642    61%       372           818             27%
Data Collected    111    185    67%       222    20%       144           288             30%
Solved            101    169    67%       207    22%       133           266             29%
Refined           99     166    68%       200    20%       133           293             46%
Deposited         98     154    57%       196    27%       93            223             14%
* Y04: projections based on 5 months
Production rates still increasing substantially in PSI-2 Y03
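The "Increase" columns of the table above appear to be year-over-year percentage changes. The following is a small sketch that checks this reading against the "Selected" row. The helper name `pct_change` is an assumption for illustration, and rounding to whole percent is inferred from the slide, not stated on it.

```python
def pct_change(previous, current):
    """Year-over-year change in percent, rounded to a whole number."""
    return round(100 * (current - previous) / previous)


# "Selected" row of the scoreboard table; the Y04 value is the projected one.
selected = {"Y01": 4816, "Y02": 6138, "Y03": 7713, "Y04 projected": 4576}

years = list(selected)
for prev_year, year in zip(years, years[1:]):
    change = pct_change(selected[prev_year], selected[year])
    print(f"{prev_year} -> {year}: {change:+d}%")
```

For this row the sketch gives +27%, +26% and -41%, matching the slide. The negative Y04 figure compares a projection with the full Y03 count.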
JCSG Protein Crystallography during the period 09/21 to 12/15, 2008
• 320 unique targets
• 7975 crystals sent for diffraction screening (CC→ SDC)
• 51 structures deposited in the PDB
Liquid sample Handler
Gilson
Crystallomics NMR
UCSD & Burnham
Bioinformatics Core
John Wooley
Adam Godzik
Lukasz Jaroszewski
Sri Krishna Subramanian
Andrew Morse
Tamara Astakhova
Lian Duan
Piotr Kozbial
Dana Weekes
Natasha Sefcovic
Konstantina Bakolitsa
Kyle Ellrott
Xiaohui Cai
Josie Alaoen
Cindy Cook
Scientific Advisory Board
Sir Tom Blundell, Univ. Cambridge
Homme Hellinga, Duke University Medical Center
James Naismith, The Scottish Structural Proteomics Facility, Univ. St. Andrews
James Paulson, Consortium for Functional Glycomics, The Scripps Research Institute
Robert Stroud, Center for Structure of Membrane Proteins, Membrane Protein Expression Center, UCSF
Soichi Wakatsuki, Photon Factory, KEK, Japan
James Wells, UC San Francisco
Todd Yeates, UCLA-DOE, Inst. for Genomics and Proteomics
TSRI, NMR Core
Kurt Wüthrich
Reto Horst
Margaret Johnson
Michael Geralt
Pedro Serrano
Amarnath Chatterjee
Bill Pedrini (ETH Zürich)
Biswaranjan Mohanty
Kristaps Jaudzems
TSRI
Administrative Core
Ian Wilson
Marc Elsliger
Gye Won Han
David Marciano
Henry Tien
Lisa van Veen
Stanford/SSRL Structure Determination Core
Keith Hodgson
Ashley Deacon
Mitchell Miller
Hsiu-Ju (Jessica) Chiu
Kevin Jin
Qingping Xu
Silvya Oommachen
Henry van den Bedem
Scott Talafuse
Ronald Reyes
Abhinav Kumar
Christine Trame
Debanu Das
Winnie Lam
Herbert Axelrod
Andrew Yeh
The JCSG is supported by the NIGMS Protein Structure Initiative Grant U54 GM074898
Ex officio founding members JCSG-1
Raymond Stevens, TSRI
Susan Taylor, UCSD
Peter Kuhn, SSRL/TSRI
Duncan McRee, TSRI/Syrrx
Peter Schultz, TSRI/GNF
GNF & TSRI
Crystallomics Core
Scott Lesley
Mark Knuth
Heath Klock
Marc Deller
Dennis Carlton
Polat Abdubek
Sanjay Agarwalla
Connie Chen
Michelle Chiu
Thomas Clayton
Carol Farr
Julie Feuerhelm
Anna Grzechnik
Joanna C. Hale
Thamara Janaratne
Sachin Kale
Edward Nigoghossian
Amanda Nopakun
Linda Okach
Christina Puckett
Sebastian Sudek
Tiffany Wooten
Jessica Canseco
Mimmi Brown
X-ray and NMR are highly complementary
Slabinski L, Jaroszewski L, Rodrigues AP, Rychlewski L, Wilson IA, Lesley SA, Godzik A. "The challenge of protein structure determination - lessons from structural genomics," Protein Science, 16: 2472-2482 (2007).
Structure coverage of the non-structural protein 3 from the SARS-CoV
Domains:
UB1
AC: Acidic rich, disordered
ADRP: ADP-ribose-1″-phosphatase
SUD-N
SUD-M: SARS unique domain, RNA binding
SUD-C
UB2: Ubiquitin-like
PLP: Papain-like protease, deubiquitination activity
NAB: New fold, nucleic acid binding
G2M: Coronavirus Group 2 marker
Further labels in the figure: Ubiquitin-like; RNA binding
Residue ranges shown: 16-110, 111-183, 184-351, 366-512, 528-648, 651-722, 723-1037, 1070-1179, 1203-1318
Legend: Structure determined by NMR; Structure determined by X-rays; Unknown structure; Flexibly disordered by NMR; Unstructured by X-rays
Residue scale: 1, 183, 365, 722, 1054, 1318
Segments: nsp3a, nsp3b, nsp3c, nsp3d, nsp3e
Automation in NMR structure determination of proteins in solution
Dr. Bill Pedrini
Dr. Torsten Herrmann
Illustrative result
TM1367 (124 aa)
RMSD to X-ray structure: 1.11 ± 0.10 Å (bb)
NMR data collection and automated NMR structure determination
Interactive NMR structure refinement
7 d
6 d
RMSD to X-ray structure: 0.92 ± 0.06 Å (bb), 1.54 ± 0.06 Å (ha)
Validation: 2 d
PDB
Illustrative result
TM1112 (89 aa)
RMSD to X-ray structure: 1.73 ± 0.12 Å (bb)
NMR data collection and automated NMR structure determination
Interactive NMR structure refinement
7 d
5 d
RMSD to X-ray structure: 1.09 ± 0.07 Å (bb), 1.65 ± 0.09 Å (ha)
Validation: 2 d
PDB
Illustrative result
TM1112 (89 aa)
RMSD to PDB NMR mean structure: 1.80 ± 0.08 Å (bb)
NMR data collection and automated NMR structure determination
Interactive NMR structure refinement
7 d
5 d
RMSD to PDB NMR mean structure: 1.45 ± 0.06 Å (bb), 2.13 ± 0.09 Å (ha)
Validation: 2 d
PDB
Fold Rating
A: folded globular protein (AVA481)
B: folded globular protein with broadened lines (CHUT1712)
C: aggregated or oligomerized protein (BH2160)
D: non-globular protein (BC032552)
[Figure: 1D 1H NMR spectra for the four examples; axis δ(1H) [ppm]]
From [15N,1H]-HSQC to NMR Profiles
1. NMR sample
[Figure: 2D [15N,1H]-HSQC spectra of TM1290, Asl1650 and STM3127; axes ω1(15N) [ppm] and ω2(1H) [ppm]]
1. NMR sample
HSQC S:N
APSY S:N
6
36
600 MHz, RT 5 mm probe
700 MHz, RT 5 mm probe
1. NMR sample
TM1290 NMR Profile: APSY quality
Asl1650 NMR Profile: limited APSY quality
STM3127 NMR Profile: APSY incomplete; assigned with conventional triple resonance experiments
[Plots: relative S:N (S:Nrel) versus peak number for each protein; marked value 36]
1. NMR sample
HSQC S:N
APSY S:N
6
16
600 MHz, RT 5 mm probe
700 MHz, RT 1.7 mm probe
1. NMR sample
HSQC S:N
APSY S:N
6
11
600 MHz, cryogenic 5 mm probe
700 MHz, RT 1.7 mm probe
Protocol for automated NMR structure determination
1. NMR sample
1D 1H NMR screening
Promising protein constructs and solvent conditions
2D [15N,1H]-HSQC screening
Structure-quality protein solution
NMR profile
2. NMR structure
Automated backbone assignment
Interactive validation of backbone assignments; chem. shift adaptation to NOESY spectra
Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation
NMR structure solved; accurate backbone fold: Yes / No
Interactive NMR structure refinement
NMR structure refined
NMR structure validation
PDB
NMR experiments
2. NMR structure
3 APSY-NMR experiments, 20-35 2D projections each
3D 15N-resolved [1H,1H]-NOESY
3D 13C-resolved [1H,1H]-NOESY (ali)
3D 13C-resolved [1H,1H]-NOESY (aro)
Automated backbone assignment
Interactive validation of backbone assignments; chem. shift adaptation to NOESY spectra
Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation
NMR structure solved; accurate backbone fold
Interactive NMR structure refinement
NMR structure refined
NMR assignments TM0212 (124 aa), 2.7 mM, 40 °C
M K M K K Y T K T H E W V S I E D K V A T V G I T N H A Q E Q L G D
V V Y V D L P E V G R E V K K G E V V A S I E S V K A A A D V Y A P
L S G K I V E V N E K L D T E P E L I N K D P E G E G W L F K M E I S D
E G E L E D L L D E Q A Y Q E F C A Q E
6D-HNCOCANH (25 Proj.)
Measurement time: 14 h
PROSA/GAPRO: 0.3 h
MATCH: 0.2 h
Software
2. NMR structure
GAPRO
MATCH
ASCAN
ATNOS/CANDID
(CYANA)
UNIO
Automated backbone assignment
Interactive validation of backbone assignments; chem. shift adaptation to NOESY spectra
Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation
NMR structure solved; accurate backbone fold
Interactive NMR structure refinement
NMR structure refined
Structure of TM0212 (1–124) after refinement
2. NMR structure
RMSD with respect to NMR mean structure (residues 1–124): 0.53 ± 0.08 Å (bb), 0.99 ± 0.08 Å (ha)
Residual DYANA target function: 2.63 ± 0.68 Å²
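An "RMSD with respect to the mean structure" of this kind is commonly reported as the average and standard deviation of per-conformer RMSD values over the bundle. The following is a minimal sketch of that calculation, assuming the conformers have already been superimposed (a real workflow would first do a least-squares fit over the selected atoms, which is omitted here). The function names and toy coordinates are hypothetical and are not taken from the software named in the talk. "bb" is read here as the backbone atom selection and "ha" as the heavy-atom selection.

```python
import math
from statistics import mean, stdev


def mean_structure(conformers):
    """Average the coordinates atom by atom over pre-superimposed conformers."""
    n = len(conformers)
    n_atoms = len(conformers[0])
    return [
        tuple(sum(conf[i][k] for conf in conformers) / n for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsd_to_mean(conformers):
    """Return (mean, standard deviation) of each conformer's RMSD to the mean structure."""
    reference = mean_structure(conformers)
    values = [rmsd(conf, reference) for conf in conformers]
    return mean(values), stdev(values)


# Toy bundle: three conformers with two atoms each (coordinates in Å, made up).
bundle = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.5, 0.3, 0.0)],
    [(0.0, 0.1, 0.0), (1.4, 0.0, 0.2)],
]
average, spread = rmsd_to_mean(bundle)
print(f"RMSD to mean structure: {average:.2f} ± {spread:.2f} Å")
```

Running the same function once on the backbone atoms and once on all heavy atoms would give a pair of values in the format quoted on the slide.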
Structure of TM0212 (1–124) after refinement
2. NMR structure
RMSD with respect to X-ray structure (residues 5–123): 1.21 ± 0.12 Å (bb), 1.92 ± 0.10 Å (ha)
Time used for structure determination of TM0212
2. NMR structure
Automated backbone assignment
Interactive validation of backbone assignments; chem. shift adaptation to NOESY spectra
Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation
NMR structure solved; accurate backbone fold: Yes
Interactive NMR structure refinement
NMR structure refined
NMR structure validation
PDB
Times shown: 1 h; 20 h, 1 h; 120 h, 5 h; 48 h; 10 h; 8 h; 1 h
Software: GAPRO, MATCH; ASCAN, ATNOS/CANDID, DYANA; ASCAN, ATNOS/CANDID, DYANA, OPALp
Legend: NMR time; Computation time; Interactive work