
NMR in Structural Genomics: Strategic and Technical Challenges

Kurt Wüthrich

The Scripps Research Institute, La Jolla, CA, USA and

ETH Zürich, Zürich, Switzerland

Korea Advanced Institute of Science and Technology

Daejeon, Korea, March 20, 2009

JCSG Annual Meeting 2007

JCSG Annual Meeting 2008

The Expanding Protein Universe

Protein Sequences (> 6,000,000)

Protein Structures (~50,000)

Structural Leverage (50-400 fold)

PSI Mission: To make the 3-dimensional atomic-level structures of most proteins easily obtainable from knowledge of their corresponding DNA sequences

Can HTP structural genomics provide broad structural coverage with experimental structures leveraged by homology models?

Protein universe can be subdivided into protein families

• Protein sequences (domains) can be grouped into ~32,000 families, but a very large number (~21,000) have only one member

So most of sequence space (>70%) can be covered with 3000 well-chosen structures; >87% with 6000

Modified from John Moult & Jeremy Berg, NIGMS

PSI-2 goal: To determine representative structures for large families with no or insufficient structural coverage

Pan-genomic analyses
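The coverage argument above can be sketched as a cumulative sum over family sizes. This is a minimal illustration, assuming that one experimental structure covers one whole family and that "well-chosen" means targeting the largest families first; both are assumptions, and the family sizes below are hypothetical toy numbers, not PSI data.

```python
def coverage_of_top_families(family_sizes, n_structures):
    """Fraction of all sequences that belong to the n largest families.

    Assumption: one solved structure gives structural coverage
    (via homology models) for every member of its family.
    """
    ranked = sorted(family_sizes, reverse=True)
    return sum(ranked[:n_structures]) / sum(ranked)


# Hypothetical toy universe: a few large families and many singletons.
toy_sizes = [500, 300, 100, 50, 20] + [1] * 30

for n in (2, 5, 35):
    print(n, "structures ->", coverage_of_top_families(toy_sizes, n))
```

With a long tail of single-member families, the first few structures cover most sequences, while the remaining coverage requires one structure per singleton.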

JCSG Cumulative Scoreboard (09/01/2000 – 12/07/2008)

Cloned: 22,676

Expressed: 22,357

Crystallized: 1,618

Solved: 810

Targets in PDB: 722

Structures in PDB: 748

STAGE              Y01    Y02    Increase   Y03    Increase   Y04* Current   Y04* Projected   Y04* Increase
Selected           4816   6138   27%        7713   26%        2080           4576             -41%
Activated          4816   6138   27%        7713   26%        2080           4576             -41%
Cloning            4589   6093   33%        7615   25%        2080           4576             -40%
Expression         4222   5796   37%        7584   31%        2025           4455             -41%
Purification       484    851    76%        1102   29%        646            1421             29%
Crystal Setup      484    851    76%        1065   25%        646            1421             33%
Crystallization    463    772    67%        1004   30%        538            1184             18%
Mounted Crystals   214    400    87%        642    61%        372            818              27%
Screened           214    400    87%        642    61%        372            818              27%
Data Collected     111    185    67%        222    20%        144            288              30%
Solved             101    169    67%        207    22%        133            266              29%
Refined            99     166    68%        200    20%        133            293              46%
Deposited          98     154    57%        196    27%        93             223              14%

* Y04: projections based on 5 months

Production rates still increasing substantially in PSI-2 Y03
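The "Increase" columns of the production table are consistent with a simple year-over-year percentage change. The sketch below recomputes a few of them; the function name is illustrative, and rounding to the nearest integer is an assumption about how the slide values were produced.

```python
def percent_increase(previous, current):
    """Year-over-year change in percent, rounded to the nearest integer."""
    return round(100 * (current - previous) / previous)


# (Y01, Y02, Y03) counts copied from the table for three stages.
stages = {
    "Selected": (4816, 6138, 7713),
    "Purification": (484, 851, 1102),
    "Deposited": (98, 154, 196),
}

for name, (y01, y02, y03) in stages.items():
    print(name, percent_increase(y01, y02), percent_increase(y02, y03))
```

The same function applied to the Y03 count and the projected Y04 count for "Selected" (7713 and 4576) gives the negative value shown in the last column.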

JCSG Protein Crystallography during the period 09/21 to 12/15, 2008

• 320 unique targets

• 7975 crystals sent for diffraction screening (CC → SDC)

• 51 structures deposited in the PDB

Liquid sample Handler

Gilson

Crystallomics NMR

UCSD & Burnham

Bioinformatics Core

John Wooley

Adam Godzik

Lukasz Jaroszewski

Sri Krishna Subramanian

Andrew Morse

Tamara Astakhova

Lian Duan

Piotr Kozbial

Dana Weekes

Natasha Sefcovic

Konstantina Bakolitsa

Kyle Ellrott

Xiaohui Cai

Josie Alaoen

Cindy Cook

Scientific Advisory Board

Sir Tom Blundell, Univ. Cambridge

Homme Hellinga, Duke University Medical Center

James Naismith, The Scottish Structural Proteomics Facility, Univ. St. Andrews

James Paulson, Consortium for Functional Glycomics, The Scripps Research Institute

Robert Stroud, Center for Structure of Membrane Proteins, Membrane Protein Expression Center, UCSF

Soichi Wakatsuki, Photon Factory, KEK, Japan

James Wells, UC San Francisco

Todd Yeates, UCLA-DOE, Inst. for Genomics and Proteomics

TSRI, NMR Core

Kurt Wüthrich

Reto Horst

Margaret Johnson

Michael Geralt

Pedro Serrano

Amarnath Chatterjee

Bill Pedrini (ETH Zürich)

Biswaranjan Mohanty

Kristaps Jaudzems

TSRI

Administrative Core

Ian Wilson

Marc Elsliger

Gye Won Han

David Marciano

Henry Tien

Lisa van Veen

Stanford/SSRL Structure Determination Core

Keith Hodgson

Ashley Deacon

Mitchell Miller

Hsiu-Ju (Jessica) Chiu

Kevin Jin

Qingping Xu

Silvya Oommachen

Henry van den Bedem

Scott Talafuse

Ronald Reyes

Abhinav Kumar

Christine Trame

Debanu Das

Winnie Lam

Herbert Axelrod

Andrew Yeh

The JCSG is supported by the NIGMS Protein Structure Initiative Grant U54 GM074898

Ex officio founding members JCSG-1

Raymond Stevens, TSRI

Susan Taylor, UCSD

Peter Kuhn, SSRL/TSRI

Duncan McRee, TSRI/Syrrx

Peter Schultz, TSRI/GNF

GNF & TSRI

Crystallomics Core

Scott Lesley

Mark Knuth

Heath Klock

Marc Deller

Dennis Carlton

Polat Abdubek

Sanjay Agarwalla

Connie Chen

Michelle Chiu

Thomas Clayton

Carol Farr

Julie Feuerhelm

Anna Grzechnik

Joanna C. Hale

Thamara Janaratne

Sachin Kale

Edward Nigoghossian

Amanda Nopakun

Linda Okach

Christina Puckett

Sebastian Sudek

Tiffany Wooten

Jessica Canseco

Mimmi Brown

X-ray and NMR are highly complementary

Slabinski L, Jaroszewski L, Rodrigues AP, Rychlewski L, Wilson IA, Lesley SA, Godzik A. "The challenge of protein structure determination - lessons from structural genomics," Protein Science, 16: 2472-2482 (2007).


Structure coverage of the nonstructural protein 3 (nsp3) from the SARS-CoV

[Figure: domain map of nsp3]

UB1: ubiquitin-like, RNA binding

AC: acidic rich, disordered

ADRP: ADP-ribose-1''-phosphatase

SUD-N, SUD-M, SUD-C: SARS unique domain (SUD-M: RNA binding)

UB2: ubiquitin-like

PLP: papain-like protease, deubiquitination activity

NAB: new fold, nucleic acid binding

G2M: coronavirus group 2 marker

Residue ranges: 16-110, 111-183, 184-351, 366-512, 528-648, 651-722, 723-1037, 1070-1179, 1203-1318

Segments: nsp3a, nsp3b, nsp3c, nsp3d, nsp3e (boundary positions 1, 183, 365, 722, 1054, 1318)

Legend: structure determined by NMR; structure determined by X-rays; unknown structure; flexibly disordered by NMR; unstructured by X-rays
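As a rough check on the domain map, the labeled residue ranges can be summed and the unlabeled gaps listed. This sketch only measures how much of the 1318-residue sequence falls inside a labeled range; it is not a measure of solved structure, because the legend indicates that some ranges are disordered or of unknown structure. The variable names are illustrative.

```python
# Residue ranges as listed on the slide (inclusive, 1-based).
LABELED_RANGES = [
    (16, 110), (111, 183), (184, 351), (366, 512), (528, 648),
    (651, 722), (723, 1037), (1070, 1179), (1203, 1318),
]
SEQUENCE_LENGTH = 1318  # last boundary position shown on the slide


def labeled_residues(ranges):
    """Total number of residues inside the (non-overlapping) ranges."""
    return sum(end - start + 1 for start, end in ranges)


def gaps(ranges, length):
    """Inclusive residue ranges that are not covered by any labeled range."""
    missing, position = [], 1
    for start, end in sorted(ranges):
        if start > position:
            missing.append((position, start - 1))
        position = max(position, end + 1)
    if position <= length:
        missing.append((position, length))
    return missing


covered = labeled_residues(LABELED_RANGES)
print(covered, "of", SEQUENCE_LENGTH, "residues lie in a labeled range")
print("unlabeled gaps:", gaps(LABELED_RANGES, SEQUENCE_LENGTH))
```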

Automation in NMR structure determination of proteins in solution

Dr. Bill Pedrini

Dr. Torsten Herrmann

Illustrative result

TM1367, 124 aa

NMR data collection and automated NMR structure determination (7 d)

RMSD to X-ray structure: 1.11 ± 0.10 Å (bb)

Interactive NMR structure refinement (6 d)

RMSD to X-ray structure: 0.92 ± 0.06 Å (bb), 1.54 ± 0.06 Å (ha)

Validation (2 d)

PDB

Illustrative result

TM1112, 89 aa

NMR data collection and automated NMR structure determination (7 d)

RMSD to X-ray structure: 1.73 ± 0.12 Å (bb)

Interactive NMR structure refinement (5 d)

RMSD to X-ray structure: 1.09 ± 0.07 Å (bb), 1.65 ± 0.09 Å (ha)

Validation (2 d)

PDB

Illustrative result

TM1112, 89 aa

NMR data collection and automated NMR structure determination (7 d)

RMSD to PDB NMR mean structure: 1.80 ± 0.08 Å (bb)

Interactive NMR structure refinement (5 d)

RMSD to PDB NMR mean structure: 1.45 ± 0.06 Å (bb), 2.13 ± 0.09 Å (ha)

Validation (2 d)

PDB
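The RMSD values quoted in these illustrative results have the form "mean ± spread" for a bundle of NMR conformers compared with a reference structure. The sketch below shows one way such numbers can be computed. It is a simplified illustration under stated assumptions: the coordinates are taken to be already superimposed on the reference (no optimal-fit step is performed), the "±" is taken to be the sample standard deviation over the conformers, and the atom selection (bb for backbone, ha for heavy atoms) is left to the caller. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumption: both coordinate sets are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def bundle_rmsd(conformers, reference):
    """Mean and sample standard deviation of the per-conformer RMSD values."""
    values = [rmsd(conformer, reference) for conformer in conformers]
    return mean(values), stdev(values)


# Toy data: two atoms, two conformers displaced by 1 Å and 3 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conformers = [
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],
]
average, spread = bundle_rmsd(conformers, reference)
print(f"{average:.2f} ± {spread:.2f} Å")
```

In practice each conformer would first be superimposed on the reference for the selected atoms, which this sketch deliberately leaves out.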

Fold Rating

A: folded globular protein (AVA481)

B: folded globular protein with broadened lines (CHUT1712)

C: aggregated or oligomerized protein (BH2160)

D: non-globular protein (BC032552)

[Figure: 1D 1H NMR spectra illustrating the four fold ratings; horizontal axis δ(1H) [ppm]]

From [15N,1H]-HSQC to NMR Profiles

1. NMR sample

[Figure: 2D [15N,1H]-HSQC spectra of TM1290, Asl1650 and STM3127; axes ω1(15N) [ppm] and ω2(1H) [ppm]]

1. NMR sample

HSQC S:N: 6 (600 MHz, RT 5 mm probe)

APSY S:N: 36 (700 MHz, RT 5 mm probe)

TM1290 NMR Profile

1. NMR sample

APSY quality

Asl1650 NMR Profile

1. NMR sample

Limited APSY quality

STM3127 NMR Profile

1. NMR sample

APSY incomplete; assigned with conventional triple resonance experiments

[Figure: each NMR profile plots S:Nrel against peak number; the value 36 is marked on the plot]

1. NMR sample

HSQC S:N: 6 (600 MHz, RT 5 mm probe)

APSY S:N: 16 (700 MHz, RT 1.7 mm probe)

1. NMR sample

HSQC S:N: 6 (600 MHz, cryogenic 5 mm probe)

APSY S:N: 11 (700 MHz, RT 1.7 mm probe)

Protocol for automated NMR structure determination

1. NMR sample

1D-1H-NMR screening

Promising protein constructs and solvent conditions

2D-[15N,1H]-HSQC screening

Structure-quality protein solution

NMR profile (decision: Yes / No)

2. NMR structure

NMR experiments

Automated backbone assignment

Interactive validation of backbone assignments; chem. shift adaptation to NOESY spectra

Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation

NMR structure solved: accurate backbone fold

Interactive NMR structure refinement

NMR structure refined

NMR structure validation

PDB

NMR experiments

2. NMR structure

3 APSY-NMR experiments, 20-35 2D-projections each

3D 15N-resolved [1H,1H]-NOESY

3D 13C-resolved [1H,1H]-NOESY (ali)

3D 13C-resolved [1H,1H]-NOESY (aro)

NMR assignments TM0212 (124 aa), 2.7 mM, 40 °C

M K M K K Y T K T H E W V S I E D K V A T V G I T N H A Q E Q L G D

V V Y V D L P E V G R E V K K G E V V A S I E S V K A A A D V Y A P

L S G K I V E V N E K L D T E P E L I N K D P E G E G W L F K M E I S D

E G E L E D L L D E Q A Y Q E F C A Q E

6D-HNCOCANH (25 Proj.)

Measurement time: 14 h

PROSA/GAPRO: 0.3 h

MATCH: 0.2 h
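The sequence above can be checked against the stated length of 124 aa, and it gives a rough upper bound on the number of backbone amide signals a backbone assignment has to account for. The estimate below assumes that the N-terminal residue and all prolines give no backbone amide peak; it ignores side-chain NH groups and any peaks lost to exchange or line broadening, so it is an approximation, not the number reported by the authors.

```python
# TM0212 sequence as printed on the slide, one string per slide line.
TM0212 = (
    "MKMKKYTKTHEWVSIEDKVATVGITNHAQEQLGD"
    "VVYVDLPEVGREVKKGEVVASIESVKAAADVYAP"
    "LSGKIVEVNEKLDTEPELINKDPEGEGWLFKMEISD"
    "EGELEDLLDEQAYQEFCAQE"
)


def expected_backbone_amide_peaks(sequence):
    """Residues expected to show a backbone 15N-1H correlation.

    Assumption: every residue except the N-terminal one and prolines
    contributes exactly one backbone amide peak.
    """
    return len(sequence) - 1 - sequence[1:].count("P")


print("length:", len(TM0212))
print("prolines:", TM0212.count("P"))
print("expected backbone amide peaks:", expected_backbone_amide_peaks(TM0212))
```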

Software

2. NMR structure

Automated backbone assignment: GAPRO, MATCH

Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation: ASCAN, ATNOS/CANDID (CYANA)

UNIO

Structure of TM0212 (1–124) after refinement

2. NMR structure

RMSD with respect to NMR mean structure (residues 1–124): 0.53 ± 0.08 Å (bb), 0.99 ± 0.08 Å (ha)

Residual DYANA target function: 2.63 ± 0.68 Å²

Structure of TM0212 (1–124) after refinement

2. NMR structure

RMSD with respect to X-ray structure (residues 5–123): 1.21 ± 0.12 Å (bb), 1.92 ± 0.10 Å (ha)

Time used for structure determination of TM0212

2. NMR structure

Automated backbone assignment

Interactive validation of backbone assignments; chem. shift adaptation to NOESY spectra

Automated [1H,1H]-NOESY-based sidechain assignment, constraint collection and structure calculation

NMR structure solved: accurate backbone fold

Interactive NMR structure refinement

NMR structure refined

NMR structure validation

PDB

Software: GAPRO, MATCH; ASCAN, ATNOS/CANDID, DYANA; ASCAN, ATNOS/CANDID, DYANA, OPALp

Times shown in the chart: 1 h, 20 h, 1 h, 120 h, 5 h, 48 h, 10 h, 8 h, 1 h

Legend: NMR time, computation time, interactive work