Tutorial: Rosetta tools for structure determination in cryoEM density

Brandon Frenz, Ray Y.-R. Wang, Frank DiMaio
Last updated: March 2019

This tutorial is intended to introduce users to several different ways Rosetta may be used to solve various structure determination tasks given 3-5Å cryoEM density data. It is not intended to replace the user’s guide, available at https://www.rosettacommons.org/manuals/latest/main/.

The tutorial is split up into four parts.

1. An introduction to Rosetta in general, showing how one may score structures and minimize structures guided by experimental density data
2. Our model rebuilding protocol (RosettaCM), where one wishes to recombine homologous structures, and rebuild small missing regions (<12 residues)
3. An advanced application of RosettaCM to determine the sequence threading of a model
4. Our model completion tools, where one wishes to complete a partial model built by the de novo tool or wishes to rebuild large missing regions (12 or more residues)

In each scenario, we present the most basic usage of Rosetta for the task, and then describe additional options that may be useful. Command-line flags and input scripts are provided in shaded boxes, with boldfaced text indicating parameters of note. These parameters are described in the text following the command line.

Note: in all sections, you will need to update the command scripts to point at your installation of Rosetta and the Rosetta database.




1) Rosetta and electron density basics

This section provides a brief introduction to using Rosetta, and an overview of using density data within Rosetta.

Overview of Rosetta

The Rosetta documentation is a good source of additional information on several of the tools described in this document. This is available at https://www.rosettacommons.org/docs/latest/Home.

The tools described in this document use the RosettaScripts framework, described at https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/RosettaScripts. Briefly, this allows protocols to be defined as a series of atomic "Movers" which manipulate a structure. The format is as follows:

    <ROSETTASCRIPTS>
        <SCOREFXNS>
        </SCOREFXNS>
        <MOVERS>
        </MOVERS>
        <PROTOCOLS>
        </PROTOCOLS>
    </ROSETTASCRIPTS>

Each block contains information used in running the protocol: <SCOREFXNS> and <MOVERS> are used to declare score functions and movers, while <PROTOCOLS> is where the steps of the protocol are enumerated.

Rosetta tools are run via the command line, with flags controlling general program behavior. Many of the flags specifically for density refinement are outlined in the sections following.

Density scoring in Rosetta

Agreement to density is implemented in Rosetta as an additional energy term. Rosetta assesses agreement to density by computing the density that one would expect to see, given a model, and measuring the agreement of the expected and experimental density.

elec_dens_fast
This score term is recommended for nearly all uses of density refinement in Rosetta. It uses interpolation on a precomputed grid of per-atom scores to approximate the density correlations. This version is significantly faster (~10x) than exact density scoring, and is very highly correlated with it.

This energy term may be provided to Rosetta in two ways. First, it may be provided in a RosettaScripts XML file as input:

    <Reweight scoretype="elec_dens_fast" weight="35.0"/>

For non-RosettaScripts applications, the following flag controls the density scoring function weight:

    -edensity::fastdens_wt 35.0

The recommended weight for this term varies depending on the density map resolution, starting model quality, and protocol. Section 2 describes how the weights may be tuned. However, the following are good rules of thumb for setting the density weight within Rosetta:

• At resolutions better than 2.5Å: an elec_dens_fast weight of 65.0 is generally reasonable.
• At resolutions between 2.5Å and 3.5Å: an elec_dens_fast weight of 50.0 is generally reasonable.
• At resolutions worse than 3.5Å: an elec_dens_fast weight of 35.0 is generally reasonable.
• In centroid mode: an elec_dens_fast weight of 10.0 is generally reasonable.

At very low resolutions (worse than 6Å), the weight may need to be further reduced. In general, if the Rosetta energies are positive (or significant outliers are flagged by Molprobity or other validation programs) then the weights need to be reduced.
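The rules of thumb above can be written down as a small lookup. The following is only a sketch: the function names are hypothetical, and treating the 2.5Å and 3.5Å boundaries as belonging to the middle band is an assumption, since the text does not say which band the boundary values fall in.

```python
def suggest_dens_weight(resolution_A: float, centroid: bool = False) -> float:
    """Return a starting elec_dens_fast weight from the tutorial's rules of thumb.

    Assumption: the boundary values 2.5 and 3.5 are assigned to the middle band.
    """
    if resolution_A <= 0:
        raise ValueError("resolution must be positive")
    if centroid:
        return 10.0          # centroid mode
    if resolution_A < 2.5:
        return 65.0          # better than 2.5 A
    if resolution_A <= 3.5:
        return 50.0          # between 2.5 A and 3.5 A
    return 35.0              # worse than 3.5 A


def may_need_lower_weight(resolution_A: float) -> bool:
    """Worse than 6 A, the text says the weight may need further reduction."""
    return resolution_A > 6.0


if __name__ == "__main__":
    for reso in (2.0, 3.0, 5.0, 7.0):
        print(reso, suggest_dens_weight(reso), may_need_lower_weight(reso))
```

The returned value is a starting point only; as noted above, positive Rosetta energies or validation outliers indicate the weight should be lowered.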

In addition to the score terms above, there are also several flags that control map scoring behavior. Maps are read into Rosetta using either the flag:

    -edensity::mapfile mapfile.mrc

Or from XML:

    <LoadDensityMap name="loaddens" mapfile="mapfile.mrc"/>

Maps may be in either CCP4 or MRC format (the map type is automatically detected from the header info).

The resolution of the map, used when comparing calculated to experimental density, is specified with the flag:

    -edensity::mapreso 5.0

Maps may also be resampled to reduce memory usage and runtime. This is done through the flag:

    -edensity::grid_spacing 2.0

Notice that this flag should never be more than half the given resolution, and if using the fast scoring function never more than a third of the resolution. For both parameters, the default is generally fine (don’t resample, and assume the resolution is ~3x the grid sampling).
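The grid-spacing limits just stated are easy to check before launching a job. This is a minimal sketch with hypothetical helper names; it encodes only the two limits given in the text (half the resolution in general, a third when the fast scoring term is used).

```python
def max_grid_spacing(mapreso: float, fast_scoring: bool = True) -> float:
    """Largest grid spacing allowed by the tutorial's rule for a given resolution."""
    divisor = 3.0 if fast_scoring else 2.0
    return mapreso / divisor


def grid_spacing_ok(grid_spacing: float, mapreso: float,
                    fast_scoring: bool = True) -> bool:
    """True if the requested spacing does not exceed the allowed maximum."""
    return 0.0 < grid_spacing <= max_grid_spacing(mapreso, fast_scoring)


if __name__ == "__main__":
    # Illustrative values only.
    print(max_grid_spacing(6.0))          # limit with the fast scoring term
    print(grid_spacing_ok(1.5, 6.0))      # within the limit
    print(grid_spacing_ok(2.5, 6.0))      # too coarse for the fast term
```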

Finally, one may choose to calculate density using either cryoEM or X-ray scattering factors. At low resolution, this probably makes little difference, but might at resolutions better than about 3.5Å. The default is to use X-ray scattering factors; to turn on cryoEM scattering factors instead, use the following flag:

-edensity::cryoem_scatterers

Example 1A: Scoring a PDB in Rosetta with density

Most simply, one may wish to score a model using Rosetta’s energy function including the density terms. This is easily accomplished using the score_jd2 application. A sample command line to rescore the structure in density is given in 1_rosetta_basics/A_run_rescore.sh. It illustrates the use of various density flags to provide Rosetta with experimental density information.

    $ROSETTA3/source/bin/score_jd2.macosclangrelease \
        -database $ROSETTA3/database/ \
        -in::file::s 1isrA.pdb 1issA.pdb \
        -ignore_unrecognized_res \
        -edensity::mapfile 1issA_6A.mrc \
        -edensity::mapreso 5.0 \
        -edensity::grid_spacing 2.0 \
        -edensity::fastdens_wt 35.0 \
        -edensity::cryoem_scatterers \
        -crystal_refine

Several of the flags above deserve comment. First, the input structure is provided with the flag -in::file::s. This is common to many Rosetta applications, and more than one input may be provided; each will be processed independently. The flags beginning with -edensity:: tell Rosetta about the density map into which the model is being fit: the name of the map file (in CCP4 or MRC format), the resolution of the map, the grid sampling of the map (which should never be more than half the resolution), and the weight on the fit-to-density score term. These same flags are reused by many other protocols, including relax. Finally, the flag -crystal_refine turns on several density-related options related to PDB reading and writing, and should always be used when refining against density data.

Note: The input PDB must be aligned to the density map using some external tool. Rosetta will optionally rigid-body minimize the structure into density before rescoring if the flag -edensity::realign min is provided to the application. If this is done, the flag -out::pdb will write the minimized structure to a PDB file.

This command line outputs a score file, score.sc, that gives, for each structure specified with -in::file::s, the score with respect to each term in Rosetta’s energy function. The meaning of individual score terms as well as an overview of the Rosetta energy function can be found in the paper:

Alford RF, Leaver-Fay A, Jeliazkov JR, O'Meara MJ, DiMaio FP, Park H, Shapovalov MV, Renfrew PD, Mulligan VK, Kappel K, Labonte JW, Pacella MS, Bonneau R, Bradley P, Dunbrack RL Jr, Das R, Baker D, Kuhlman B, Kortemme T, Gray JJ. The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design. J Chem Theory Comput. 2017 Jun 13;13(6):3031-3048.
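For readers who prefer to inspect score.sc programmatically, the sketch below reads it into a list of dictionaries. It assumes the usual score file layout, in which every data line starts with "SCORE:", the first such line names the columns, and the last column ("description") holds the structure tag; the function name and the numbers in the sample text are made up for illustration.

```python
def parse_scorefile(text: str) -> list:
    """Parse Rosetta score file text into a list of {column: value} dicts.

    Assumed layout: lines beginning with 'SCORE:'; the first one is the header.
    """
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue                      # skip SEQUENCE: and other lines
        fields = line.split()[1:]
        if header is None:
            header = fields               # column names
            continue
        if len(fields) != len(header):
            continue                      # skip malformed or truncated lines
        row = {}
        for name, value in zip(header, fields):
            if name == "description":
                row[name] = value         # structure tag stays a string
                continue
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value
        rows.append(row)
    return rows


# Made-up sample in the assumed layout (values are not real results).
SAMPLE = """SEQUENCE:
SCORE: total_score elec_dens_fast fa_atr description
SCORE: -250.5 -120.0 -300.25 1isrA_0001
SCORE: -310.0 -150.5 -320.00 1issA_0001
"""

if __name__ == "__main__":
    for row in parse_scorefile(SAMPLE):
        print(row["description"], row["total_score"], row["elec_dens_fast"])
```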

Examples 1B and 1C: Simple refinement into density using RosettaScripts and relax

In this section we introduce RosettaScripts by way of a very simple refinement-into-density example. RosettaScripts provides an XML scripting interface to Rosetta that allows fine-grained control of protocols. The syntax is fully described in the Rosetta documentation; however, a very brief introduction is provided here. The basic syntax for the XML is illustrated here (1_rosetta_basics/B_relax_density.xml):

    <ROSETTASCRIPTS>
        <SCOREFXNS>
            <ScoreFunction name="dens" weights="beta_cart">
                <Reweight scoretype="elec_dens_fast" weight="35.0"/>
                <Set scale_sc_dens_byres="R:0.76,K:0.76,E:0.76,D:0.76,M:0.76,C:0.81,Q:0.81,H:0.81,N:0.81,T:0.81,S:0.81,Y:0.88,W:0.88,A:0.88,F:0.88,P:0.88,I:0.88,L:0.88,V:0.88"/>
            </ScoreFunction>
        </SCOREFXNS>
        <MOVERS>
            <SetupForDensityScoring name="setupdens"/>
            <LoadDensityMap name="loaddens" mapfile="1issA_6A.mrc"/>
            <FastRelax name="relaxcart" scorefxn="dens" repeats="2" cartesian="1"/>
        </MOVERS>
        <PROTOCOLS>
            <Add mover="setupdens"/>
            <Add mover="loaddens"/>
            <Add mover="relaxcart"/>
        </PROTOCOLS>
        <OUTPUT scorefxn="dens"/>
    </ROSETTASCRIPTS>

There are three "blocks" of declarations in this script. In the first, <SCOREFXNS>…</SCOREFXNS>, the score functions to be used throughout the protocol are declared; in the second, <MOVERS>…</MOVERS>, movers (atomic operations that modify a structure) are declared; finally, in the third, <PROTOCOLS>…</PROTOCOLS>, a full protocol is declared as a sequence of movers.

In this particular example, we declare a single score function, dens, which uses the score function beta_cart (a standard score function whose details are not important here), and turns on elec_dens_fast, the fit-to-density score, with a weight of 35. We then declare three movers, SetupForDensityScoring, LoadDensityMap, and FastRelax, which respectively set up the loaded structure for density scoring, load a map into memory, and refine the structure using the FastRelax protocol. The declared score function, dens, is used as an input to the FastRelax mover.

Finally, note the additional block:

    <Set scale_sc_dens_byres="R:0.76,K:0.76,E:0.76,D:0.76,M:0.76,C:0.81,Q:0.81,H:0.81,N:0.81,T:0.81,S:0.81,Y:0.88,W:0.88,A:0.88,F:0.88,P:0.88,I:0.88,L:0.88,V:0.88"/>

This adjusts the per-residue sidechain density weights. It is recommended to always use these weights when refining against cryoEM density.

To run this script, we use the following command line (1_rosetta_basics/B_relax_density.sh):

    $ROSETTA3/source/bin/rosetta_scripts.macosclangrelease \
        -database $ROSETTA3/database/ \
        -in::file::s 1isrA.pdb \
        -parser::protocol ex_B1_run_RS_relax_density.xml \
        -ignore_unrecognized_res \
        -edensity::mapreso 5.0 \
        -edensity::cryoem_scatterers \
        -crystal_refine \
        -out::suffix _relax \
        -beta

Note: We do not have to specify the density weight or the map file on the command line, since they are handled within the XML file. However, other density options must be specified on the command line. When using RosettaScripts, the density weights must be specified in the XML; the input map may be specified either way.

Finally, in the previous XML file, the tag cartesian=1 appears, which refines the structure in Cartesian space. Rosetta also allows refinement in torsional space, which may be better for capturing domain motion, and for further reduction in model parameters against low-resolution data. To enable torsional refinement (1_rosetta_basics/C_relax_tors_density.xml), we make three small changes to the XML:

    <ROSETTASCRIPTS>
        <SCOREFXNS>
            <ScoreFunction name="dens" weights="beta">
                <Reweight scoretype="elec_dens_fast" weight="35.0"/>
                <Set scale_sc_dens_byres="R:0.76,K:0.76,E:0.76,D:0.76,M:0.76,C:0.81,Q:0.81,H:0.81,N:0.81,T:0.81,S:0.81,Y:0.88,W:0.88,A:0.88,F:0.88,P:0.88,I:0.88,L:0.88,V:0.88"/>
            </ScoreFunction>
        </SCOREFXNS>
        <MOVERS>
            <SetupForDensityScoring name="setupdens"/>
            <LoadDensityMap name="loaddens" mapfile="1issA_6A.mrc"/>
            <FastRelax name="relaxcart" scorefxn="dens" repeats="5" cartesian="0"/>
        </MOVERS>
        <PROTOCOLS>
            <Add mover="setupdens"/>
            <Add mover="loaddens"/>
            <Add mover="relaxcart"/>
        </PROTOCOLS>
        <OUTPUT scorefxn="dens"/>
    </ROSETTASCRIPTS>

Cartesian versus torsional refinement

One of the strengths of Rosetta is its ability to perform torsion-space refinement, which can be incredibly valuable in capturing things like domain motion of proteins which are simple moves in torsion space but can be complex in Cartesian space. The optimal type of refinement for a particular problem depends on the system itself, the map resolution, and the quality of the starting model. A few general tips:

• Several (2-4) repeats of torsion-space refinement followed by 1 repeat of Cartesian-space refinement is generally a good strategy.

• For very large (1000 residue+) systems or very poor quality input models (many clashes), Cartesian refinement alone is better behaved.
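As a rough illustration of these two tips, the sketch below picks which FastRelax declarations to place in the <MOVERS> block. The function name, the mover names, and the default of 3 torsion repeats are illustrative choices; only the FastRelax attributes (name, scorefxn, repeats, cartesian) are taken from the XML shown earlier, and the 1000-residue cutoff is read as "1000 or more".

```python
def relax_movers(n_residues: int, many_clashes: bool = False,
                 torsion_repeats: int = 3, scorefxn: str = "dens") -> list:
    """Return FastRelax mover declarations following the tutorial's tips.

    Assumptions: 'very large' means >= 1000 residues; mover names are arbitrary.
    """
    cart = ('<FastRelax name="relaxcart" scorefxn="%s" repeats="1" '
            'cartesian="1"/>' % scorefxn)
    if n_residues >= 1000 or many_clashes:
        return [cart]                     # Cartesian refinement alone
    tors = ('<FastRelax name="relaxtors" scorefxn="%s" repeats="%d" '
            'cartesian="0"/>' % (scorefxn, torsion_repeats))
    return [tors, cart]                   # torsion repeats, then one Cartesian


if __name__ == "__main__":
    for line in relax_movers(300):
        print(line)
```

Each returned mover still has to be listed, in order, in the <PROTOCOLS> block.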


2) Model rebuilding with RosettaCM

In this scenario, we introduce a tool, RosettaCM, for building missing portions of a model guided by density data. While primarily geared towards comparative modeling, it may also be useful for building portions of a protein that are disordered when crystallized or difficult regions in hand-built models. We introduce the basic rebuilding protocol, then show how the tool may also be used to:

• Combine pieces from multiple template models guided by density
• Rebuild with user-defined restraints
• Iteratively rebuild models in difficult cases

As a running example, we use horse spleen apoferritin and the deposited map (EMD-2788).

Example 2A: Preparing templates for use in RosettaCM

In many cases, much of the setup work is handled by a script, setup_RosettaCM.py, in Rosetta Tools (a separate repository available from rosettacommons.org). This script takes an input alignment in a variety of formats, and prepares the inputs automatically. It is executed by running the command:

    setup_RosettaCM.py \
        --fasta t20s.fasta \
        --alignment tmpl.fasta \
        --alignment_format fasta \
        --templates tmpl.pdb \
        --rosetta_bin ~/Rosetta/main/source/bin \
        --verbose

Inputs include the full-length fasta, an alignment file (in either fasta, ClustalW, or HHSearch format), and the corresponding template PDB files. This script will prepare all the necessary inputs in order to run RosettaCM.

Alternately, the setup may be performed manually. We will do so here, since we are using some nonstandard features (symmetry and density) and have two chains in the asymmetric unit. The inputs from the previous step may also be used as a starting point and subsequently modified.

Identifying alignments with hhpred

We first need to identify homologous sequences. To do this, we use the webserver hhpred (https://toolkit.tuebingen.mpg.de/). We enter our sequence (seq.fasta) into the web form and click submit. We get results:

In this case, there are many homologous ...

We need to convert this alignment to a format Rosetta can understand. We have included a script (scripts/prepare_hybridize_from_hhsearch.pl) that automates this, although it may be performed manually with a text editor as well.

Download the alignment by clicking “Raw Output” and then “Download” (or see the tutorial file seq.hhr).

Most of these hits are very high sequence identity, making the modelling problem trivial. We are going to focus on modelling starting from two distant structures of bacterioferritins:

    …
    60 3UOI_V Bacterioferritin (E.C.1 99.7 9E-18 1.9E-22 114.8 19.5 156 3-168 1-156
    …
    62 3GVY_C Bacterioferritin; bacte 99.7 3.6E-17 7.5E-22 112.0 19.3 154 6-169 2-155
    …

Using a text editor, edit the file seq.hhr, removing all but these two alignments (or see the file seq_edit.hhr).

Next, convert these alignments to Rosetta format using the given script (A_convert_hhr_file.sh). In addition to converting the alignment file, it will also download the template files necessary for the next step. Run this script without input arguments, and an output, alignment.filt, is produced:

    ## 1XXX_ 3uoiV_201
    # hhsearch
    scores_from_program: 0 1.00
    2 IRQNYSTEVEAAVNRLVNLYLRASYTYLSLGFYFDRDDVALEGVCHFFRELAEEKREGAERLLKMQNQRGGRALFQDLQKPSQDEWGTTLDAMKAAIVLEKSLNQALLDLHALGSAQADPHLCDFLESHFLDEEVKLIKKMGDHLTNIQRLVGSQAGLGEYLFERL
    0 --MQGDPDVLRLLNEQLTSELTAINQYFLHSKMQDN--WGFTELAAHTRAESFDEMRHAEEITDRILLLDGLPNYQRIGSLRI--GQTLREQFEADLAIEYDVLNRLKPGIVMCREKQDTTSAVLLE-KIVADEEEHIDYLETQLELMDK-----LGEELYSAQCV
    --
    ## 1XXX_ 3gvyC_202
    # hhsearch
    scores_from_program: 0 1.00
    5 NYSTEVEAAVNRLVNLYLRASYTYLSLGFYFDRDDVALEGVCHFFRELAEEKREGAERLLKMQNQRGGRALFQDLQKPSQDEWGTTLDAMKAAIVLEKSLNQALLDLHALGSAQADPHLCDFLESHFLDEEVKLIKKMGDHLTNIQRLVGSQAGLGEYLFERLT
    0 QGDAKVIEYLNAALRSELTAVSQYWLHYRLQED--WGFGSIAHKSRKESIEEMHHADKLIQRIIFLGGHPNLQRLNPLRI--GQTLRETLDADLAAEHDARTLYIEARDHCEKVRDYPSKMLFE-ELIADEEGHIDYLETQIDLMGS-----IGEQNYGMLNAK
    --

In this format, the first line is '##' followed by a code for the target and one for the template. The second line identifies the source of the alignment; the third line should be kept as it is. The fourth line is the target sequence and the fifth is the template; the number preceding each sequence is an 'offset', identifying where the sequence starts. However, the number doesn't use the PDB resid but just counts residues starting at 0. The sixth line is '--'. Multiple alignments may be concatenated in a single file, with the template code identifying the template corresponding to each alignment.
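To make the six-line layout concrete, here is a minimal reader for it. This is a sketch based only on the description above, not Rosetta's own parser; the class and function names are hypothetical, and the short sequences in the sample are made up.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    target: str
    template: str
    target_offset: int
    target_seq: str
    template_offset: int
    template_seq: str


def _parse_block(block):
    """Turn the five lines preceding '--' into an Alignment record."""
    if len(block) != 5 or not block[0].startswith("##"):
        raise ValueError("unexpected alignment block: %r" % (block,))
    target, template = block[0][2:].split()
    t_off, t_seq = block[3].split()       # fourth line: offset + target
    m_off, m_seq = block[4].split()       # fifth line: offset + template
    if len(t_seq) != len(m_seq):
        raise ValueError("aligned rows differ in length for " + template)
    return Alignment(target, template, int(t_off), t_seq, int(m_off), m_seq)


def parse_alignments(text):
    """Split concatenated alignments on the '--' terminator line."""
    records, block = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "--":
            records.append(_parse_block(block))
            block = []
        else:
            block.append(line)
    return records


# Made-up toy alignments in the layout described above.
SAMPLE = """## 1XXX_ tmplA_201
# hhsearch
scores_from_program: 0 1.00
2 IRQNYSTEVE
0 --MQGDPDVL
--
## 1XXX_ tmplB_202
# hhsearch
scores_from_program: 0 1.00
5 NYSTEVEAAV
0 QGDAKVIEYL
--
"""

if __name__ == "__main__":
    for aln in parse_alignments(SAMPLE):
        aligned = sum(a != "-" and b != "-"
                      for a, b in zip(aln.target_seq, aln.template_seq))
        print(aln.template, aln.target_offset, aligned)
```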

Example 2B: Run partial threading and dock models into density

RosettaCM takes as inputs partially threaded models, that is, models where aligned positions have their residue identities remapped, and unaligned residues are not present. To generate these models from an alignment file and template, we can run the Rosetta command (3_model_rebuilding/A_partialthread.sh):

    $ROSETTA3/source/bin/partial_thread.macosclangrelease \
        -database ~/Rosetta/main/database/ \
        -in::file::fasta seq.fasta \
        -in::file::alignment alignments.filt \
        -in::file::template_pdb 3uoiV.pdb 3gvyC.pdb pdb

This will output two partially threaded models, 3uoiV_201.pdb and 3gvyC_202.pdb, that will be used as input for RosettaCM.

The final step of the method is to align the partially threaded models into the density map. This can be done most easily using Chimera’s “fit into map” tool. It may be easiest to align one partial thread into the density and then align the other model to that. Aligned versions of the templates are included as 3uoiV_201_aln.pdb and 3gvyC_202_aln.pdb.

Example 2C: Running RosettaCM as a monomer.

For our first step, we will be modelling the monomer structure using RosettaCM. While the assembly is symmetric, and the next part will be carried out in the context of the assembly, it may be useful in some cases to model individual components of larger assemblies. Such modelling is much faster, allowing for much greater conformational sampling, and it is often useful to model individual subunits before modelling the entire complex.

Like the methods introduced in Scenario 1, RosettaCM is controlled through an XML script using RosettaScripts. The XML is as follows (2_model_rebuilding/C_rosettaCM_singletarget.xml):

    <ROSETTASCRIPTS>
        <TASKOPERATIONS>
        </TASKOPERATIONS>
        <SCOREFXNS>
            <ScoreFunction name="stage1" weights="score3" symmetric="1">
                <Reweight scoretype="atom_pair_constraint" weight="0.1"/>
                <Reweight scoretype="elec_dens_fast" weight="10"/>
            </ScoreFunction>
            <ScoreFunction name="stage2" weights="score4_smooth_cart" symmetric="1">
                <Reweight scoretype="atom_pair_constraint" weight="0.1"/>
                <Reweight scoretype="elec_dens_fast" weight="10"/>
            </ScoreFunction>
            <ScoreFunction name="fullatom" weights="beta_cart" symmetric="1">
                <Reweight scoretype="atom_pair_constraint" weight="0.1"/>
                <Reweight scoretype="elec_dens_fast" weight="35"/>
                <Set scale_sc_dens_byres="R:0.76,K:0.76,E:0.76,D:0.76,M:0.76,C:0.81,Q:0.81,H:0.81,N:0.81,T:0.81,S:0.81,Y:0.88,W:0.88,A:0.88,F:0.88,P:0.88,I:0.88,L:0.88,V:0.88"/>
            </ScoreFunction>
        </SCOREFXNS>
        <FILTERS>
        </FILTERS>
        <MOVERS>
            <Hybridize name="hybridize" stage1_scorefxn="stage1" stage2_scorefxn="stage2" fa_scorefxn="fullatom" batch="1">
                <Template pdb="3uoiV_201_aln.pdb" weight="1.0" cst_file="AUTO"/>
                <Template pdb="3gvyC_202_aln.pdb" weight="1.0" cst_file="AUTO"/>
            </Hybridize>
        </MOVERS>
        <PROTOCOLS>
            <Add mover="hybridize"/>
        </PROTOCOLS>
    </ROSETTASCRIPTS>

The main work is done through a single mover, Hybridize, which handles all stages of model-building. Input structures are specified via Template lines (in this case there are two). For each template line, we specify the pdb input, as well as a couple of other parameters: a weight (the relative frequency with which we sample each template) and a constraint file (setting this to "AUTO" sets up automatic constraints to the template, while setting this to "NONE" turns off all constraints; user-defined constraints are described later).

A few notes about using multiple models with hybridize:

• With density, we need to ensure that all input models are aligned to the density. This can be done using Chimera’s alignment tools. It may be easier to align a single model to the density and then align all other models to this model.

• In each trajectory, a starting model is chosen at random; the constraints and symmetry from this selected model are chosen at the start of each run. If we wish to use a portion of a model, but do not want to use its symmetry or constraints, we can assign it a weight of 0: backbone conformations from this model will be used in conformational sampling, but the symmetry and constraints will never be used.

• Similarly, gaps in the selected starting model are rebuilt before recombination occurs. If one of the templates has poor coverage, but provides valuable structural features, it should be used, but with weight 0.

Given this XML, RosettaCM is then run with the following command line (C_rosettaCM_singletarget.sh):

    $ROSETTA3/source/bin/rosetta_scripts.macosclangrelease \
        -database $ROSETTA3/database/ \
        -in:file:fasta t20s.fasta \
        -parser:protocol C_rosettaCM_singletarget.xml \
        -nstruct 5 \
        -relax:jump_move true \
        -relax:dualspace \
        -out::suffix _singletgt \
        -edensity::mapfile t20S_41A_half1.mrc \
        -edensity::mapreso 5.0 \
        -edensity::cryoem_scatterers \
        -beta \
        -default_max_cycles 200

The input command is similar to those seen before, but with a few key differences. First, the input to Rosetta is specified with -in:file:fasta rather than -in:file:s. Also note that the input argument -nstruct 5 is given, telling Rosetta to generate 5 models for each process. Generally, hundreds to thousands of models are necessary to sufficiently sample conformational space; more and longer regions to rebuild require more models.

Job distribution

It is generally useful to sample ~100 models from each starting point. For this purpose, it may be useful to run multiple jobs in parallel. To prevent output structures from clobbering one another, the flag -out::suffix may be useful, where each separate job is given a different suffix.

For example, on a 16-core machine, we may specify -out::suffix _$1, then (using GNU parallel) run the following:

    parallel -j16 ./C_rosettaCM_monomer.sh {} ::: {1..16}

Finally, GNU parallel allows launching of jobs remotely if SSH keys have been set up for passwordless login. To run:

    parallel -S 16/node1,16/node2,16/node3,16/node4 --workdir . ./C_rosettaCM_monomer.sh {} ::: {1..48}

This will instead launch 48 jobs spread across four machines. See the GNU parallel documentation (https://www.gnu.org/software/parallel/) for more information.
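When the desired number of models does not divide evenly among jobs, the bookkeeping can be sketched as follows. The helper name is hypothetical; it only computes, for each job, a unique value for -out::suffix and a model count for -nstruct, both of which are flags shown above.

```python
def plan_jobs(total_models: int, n_jobs: int) -> list:
    """Split total_models across n_jobs; return (suffix, nstruct) pairs.

    Each job gets a distinct suffix so outputs do not overwrite one another.
    """
    if total_models < 0 or n_jobs < 1:
        raise ValueError("need total_models >= 0 and n_jobs >= 1")
    base, extra = divmod(total_models, n_jobs)
    plan = []
    for job in range(1, n_jobs + 1):
        nstruct = base + (1 if job <= extra else 0)
        if nstruct > 0:
            plan.append(("_%d" % job, nstruct))
    return plan


if __name__ == "__main__":
    for suffix, nstruct in plan_jobs(100, 16):
        # These two values would be passed as -out::suffix and -nstruct.
        print("-out::suffix", suffix, "-nstruct", nstruct)
```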

Analyzing results and model selection

While this is an active topic of research, in general, once a density weight has been chosen, we select the best models from among the full set by optimizing both model geometry and fit to density. Model geometry may be evaluated using Rosetta energies after subtracting density energies, which may be done by inspecting the score*.sc files produced as output. Density fit may be evaluated using the density energy in Rosetta as well as FSCs using the ReportFSC mover (not covered in this tutorial; see part 2 of the main tutorial).

No matter the selection criteria, the top models (5-10) should be inspected for model convergence as well as visually inspected for density map agreement.
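One simple way to combine the two criteria is sketched below: keep the fraction of models with the best geometry (total score minus density energy), then order those by density energy. The two-stage filter, the 20% default, and the function name are assumptions for illustration rather than a recommended protocol; the input is a list of per-model dictionaries with total_score, elec_dens_fast, and description entries, and the numbers shown are made up.

```python
import math


def rank_models(rows, dens_term="elec_dens_fast", top_fraction=0.2, n_pick=10):
    """Return model tags: best-geometry subset, ordered by density energy."""
    scored = [
        (r["total_score"] - r[dens_term],   # geometry: energy without density
         r[dens_term],                      # fit to density (lower is better)
         r["description"])
        for r in rows
    ]
    n_keep = max(1, math.ceil(len(scored) * top_fraction))
    best_geometry = sorted(scored)[:n_keep]
    by_density = sorted(best_geometry, key=lambda item: item[1])
    return [tag for _, _, tag in by_density[:n_pick]]


# Made-up scores for four hypothetical models.
ROWS = [
    {"description": "a", "total_score": -300.0, "elec_dens_fast": -100.0},
    {"description": "b", "total_score": -320.0, "elec_dens_fast": -150.0},
    {"description": "c", "total_score": -280.0, "elec_dens_fast": -60.0},
    {"description": "d", "total_score": -250.0, "elec_dens_fast": -140.0},
]

if __name__ == "__main__":
    print(rank_models(ROWS, top_fraction=0.5))
```

Whatever the ranking, the selected models still need the convergence check and visual inspection described above.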

Example 2D: Running RosettaCM with symmetry.

Next, we need to set up symmetric modeling with RosettaCM. We use a script, make_symmdef_file.pl, to generate a symmetry definition file for use in Rosetta. A straightforward way to do so is to use Chimera to dock the necessary chains into density. This script’s required inputs depend on the underlying symmetry:

• For cyclic (C) and dihedral (D) symmetries, we only need a single "primary chain" and an adjacent chain in each point group.

• For helical symmetries, we need an adjacent chain in the layer (if there is one) and an adjacent chain up the helical axis.

• For other symmetries we need all chains adjacent to a single subunit.

Since this example falls into the last category (for examples with C and D symmetry see the main tutorial), we need to create a PDB file that contains one chain plus all adjacent chains docked into density. An example, 3gvyC_symm_r.pdb, is included.

Chimera can be useful here as well. From within Chimera, we can run the following command to generate symmetry for this case:

    sym #1 group O center 88.9,88.9,88.9

Either way, save the Chimera files in a single output file and then relabel chains using the included script:

    scripts/relabel_chains.pl 3gvyC_symm.pdb

To generate our Rosetta symmetry file from this input, we then simply have to run the command (D_make_symmdef.sh):

    $ROSETTA3/source/src/apps/public/symmetry/make_symmdef_file.pl \
        -m pseudo -a A \
        -p 3gvyC_symm_r.pdb > ferritin.symm

Since we have already created the input templates using the partial_thread application, we simply need to use the output of the partial threading together with the symmetry definition file.

We then need to make two small modifications to our inputs:

    ...
    <Hybridize name="hybridize" stage1_scorefxn="stage1" stage2_scorefxn="stage2" fa_scorefxn="fullatom" batch="1">
        <Template pdb="3uoiV_201_aln.pdb" weight="1.0" cst_file="AUTO" symmdef="ferritin.symm"/>
        <Template pdb="3gvyC_202_aln.pdb" weight="1.0" cst_file="AUTO" symmdef="ferritin.symm"/>
    </Hybridize>
    ...


3) Advanced modelling: using partial_thread and relax to determine sequence threading

In this section, we will use the same tools introduced in the previous sections to tackle a more challenging problem, determining the alignment of sequence to a backbone model. This is based on Egelman et al., Structure, 2015.

For this example we have a map (left) that clearly identifies helices in the density. However, the threading of sequence is ambiguous: it is not known which is the N- and which is the C-terminus, and there are only 24 resolved residues, compared to 29 amino acids in the sequence.

However, since the helix orientations are straightforward, we can brute-force this problem. We create three models:

1. polyA_symm.pdb, in which two helices are docked, from which we can get the symmetry definition file
2. polyA_ctermin.pdb, a monomer in one orientation
3. polyA_ntermin.pdb, a monomer in the other orientation

Example 3A: Build the symmetry definition file.

As in section two, we start by building the symmetry definition file:

    $ROSETTA3/source/src/apps/public/symmetry/make_symmdef_file.pl \
        -m HELIX -a J -b K \
        -p polyA_symm.pdb -r 1000 -t 8 > h.symm

Since we have helical symmetry, some of the options are a bit different. “-m HELIX” specifies we run in helical mode, and the arguments “-a J -b K” indicate the “primary chain” (J) and the chain up the helical axis (K). Finally, the argument “-t 8” indicates how many subunits to generate in each direction.

When running in this mode, note:

• Rosetta outputs a file, polyA_symm_model_JK.pdb, of the symmetry it identifies. You should ensure that this makes sense given the map.

• Rosetta outputs the helical parameters inferred from the model, including the helical rise and the subunits per turn. This should match what was determined experimentally.

[Figure: test map FSC (10-4.4Å), plotted from 0.30 to 0.50, versus helix placement and shift (-3 to 3), with one panel for "N-term in" and one for "C-term in".]


Running this command produces a symmetry definition file, “h.symm”, to be used as input in subsequent steps.

Finally, we need to make a small edit to this file for density refinement. Change:

    set_dof JUMP_0_0_0 z(2.20334489451302) angle_z
    set_dof JUMP_0_0_0_to_com x(17.509223919948)
    set_dof JUMP_0_0_0_to_subunit angle_x angle_y angle_z

To:

    set_dof JUMP_0_0_0_to_com x y z
    set_dof JUMP_0_0_0_to_subunit angle_x angle_y angle_z

That is, delete the first line (which allows refinement of the symmetry operators), and on the second line replace the fixed x value with free x, y, and z degrees of freedom.
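The same edit can be applied with a few lines of Python rather than by hand. This is a sketch that covers only the two changes shown above; the function name is hypothetical, and it assumes the set_dof lines have the jump names shown in this example.

```python
def edit_symmdef_for_density(symmdef_text: str) -> str:
    """Apply the set_dof edit described above to symmetry definition text.

    - drops the 'set_dof JUMP_0_0_0 ...' line
    - rewrites 'set_dof JUMP_0_0_0_to_com ...' as free x, y, z
    All other lines are passed through unchanged.
    """
    out = []
    for line in symmdef_text.splitlines():
        fields = line.split()
        if fields[:2] == ["set_dof", "JUMP_0_0_0"]:
            continue                                   # delete this line
        if fields[:2] == ["set_dof", "JUMP_0_0_0_to_com"]:
            out.append("set_dof JUMP_0_0_0_to_com x y z")
            continue
        out.append(line)
    return "\n".join(out) + "\n"


ORIGINAL = """set_dof JUMP_0_0_0 z(2.20334489451302) angle_z
set_dof JUMP_0_0_0_to_com x(17.509223919948)
set_dof JUMP_0_0_0_to_subunit angle_x angle_y angle_z
"""

if __name__ == "__main__":
    print(edit_symmdef_for_density(ORIGINAL), end="")
```

In practice the function would be applied to the full contents of the symmetry definition file and the result written to a new file, leaving the original untouched.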

Example 3B: Generate the partial threads.

In this case, we use the partial_thread tool introduced in the last section to generate all of the different sequence threadings we are going to model. The included script, scripts/generate_threadings.pl, will be used to generate the input alignment file, though it may also be done manually using a text editor.

The resulting alignment file (alignment.filt):

    ## 1XXX ctermin_0
    #
    scores_from_program: 0.0
    0 QARILEADAEILRAYARILEAHAEILRAQ
    0 AAAAAAAAAAAAAAAAAAAAAAAAA----
    --
    ## 1XXX ntermin_0
    #
    scores_from_program: 0.0
    0 QARILEADAEILRAYARILEAHAEILRAQ
    0 AAAAAAAAAAAAAAAAAAAAAAAAA----
    --
    …
    ## 1XXX ntermin_1
    #
    scores_from_program: 0.0
    0 QARILEADAEILRAYARILEAHAEILRAQ
    0 -AAAAAAAAAAAAAAAAAAAAAAAAA---
    --
    …
    ## 1XXX ntermin_2
    #
    scores_from_program: 0.0
    0 QARILEADAEILRAYARILEAHAEILRAQ
    0 --AAAAAAAAAAAAAAAAAAAAAAAAA--
    --
    …

The alignment file simply slides the sequence along the input poly-alanine model.
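The sliding can be reproduced in a few lines. The sketch below is not the included generate_threadings.pl; it is a hypothetical stand-in that writes one alignment block per shift in the layout shown above. The number of residues in the poly-alanine model is a parameter, set to 25 in the usage lines to match the alignment rows shown above.

```python
def sliding_alignments(target_seq: str, n_model_residues: int,
                       template_prefix: str, target_code: str = "1XXX") -> str:
    """Write one alignment block per possible shift of the model along the sequence."""
    max_shift = len(target_seq) - n_model_residues
    if max_shift < 0:
        raise ValueError("model has more residues than the sequence")
    blocks = []
    for shift in range(max_shift + 1):
        # Gap characters before and after the poly-alanine stretch.
        template_row = ("-" * shift + "A" * n_model_residues
                        + "-" * (max_shift - shift))
        blocks.append("\n".join([
            "## %s %s_%d" % (target_code, template_prefix, shift),
            "#",
            "scores_from_program: 0.0",
            "0 " + target_seq,
            "0 " + template_row,
            "--",
        ]))
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    seq = "QARILEADAEILRAYARILEAHAEILRAQ"
    text = (sliding_alignments(seq, 25, "ctermin")
            + sliding_alignments(seq, 25, "ntermin"))
    print(text, end="")
```

With a 29-residue sequence and a 25-residue model this gives five shifts per orientation, ten threadings in total.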

We then run the partial_thread application on this model, producing a total of 10 input models:

    $ROSETTA3/source/bin/partial_thread.macosclangrelease \
        -database ~/Rosetta/main/database/ \
        -in::file::fasta seq.fasta \
        -in::file::alignment alignments.filt \

Example3C:Refineallthemodels.

Inthefinalstep,werefineeachofthemodelsagainstthedensitymap,usingthesamerelaxscriptthatwasusedinpartoneofthetutorial(withsomemodificationsforsymmetry).

Thecommandline(C_relax_density.sh):

$ROSETTA3/source/bin/rosetta_scripts.macosclangrelease \ -database ~/Rosetta/main/database/ \ -render_density \ -in::file::s ntermin_*.pdb ctermin_*.pdb \ -parser::protocol C_relax_density.xml \ -ignore_unrecognized_res \ -edensity::mapreso 3.8 \ -edensity::cryoem_scatterers \ -crystal_refine \ -beta \ -out::suffix _relax \ -default_max_cycles 200

AndtheXMLfile:

<ROSETTASCRIPTS> <SCOREFXNS> <ScoreFunction name="dens" weights="beta_cart"> <Reweight scoretype="elec_dens_fast" weight="35.0"/> <Set scale_sc_dens_byres="R:0.76,K:0.76,E:0.76,D:0.76,M:0.76,C:0.81,Q:0.81,H:0.81,N:0.81,T:0.81,S:0.81,Y:0.88,W:0.88,A:0.88,F:0.88,P:0.88,I:0.88,L:0.88,V:0.88"/> </ScoreFunction> </SCOREFXNS> <MOVERS> <SetupForSymmetry name="setupdens" definition="h_edit.symm"/> <LoadDensityMap name="loaddens" mapfile="emd_6123.map"/> <FastRelax name="relaxtors" scorefxn="dens" repeats="1" cartesian="0"/> <FastRelax name="relaxcart" scorefxn="dens" repeats="1" cartesian="1"/> </MOVERS> <PROTOCOLS> <Add mover="setupdens"/> <Add mover="loaddens"/> <Add mover="relaxtors"/> <Add mover="relaxcart"/> </PROTOCOLS> <OUTPUT scorefxn="dens"/> </ROSETTASCRIPTS>

Note the way that symmetry information is loaded, with the SetupForSymmetry mover above. Additionally, looking at the protocol shows that we perform one cycle of torsion refinement, followed by one cycle of Cartesian refinement.

Finally, we can analyze results by looking at the output *.sc score files. For each threading, they show a score breakdown of each of the threaded models. We can evaluate these results using the command:

grep SCORE: *.sc | grep -v desc| sort -nk 2

This command sorts the outputs by total energy. What does this show? What if we sort by density energy instead?
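Sorting by a column other than the total energy is easier by column name than by column position. The following Python sketch is an illustration only: it assumes the score file has a header line beginning with "SCORE:" that names the columns, followed by one "SCORE:" line per model, and that the columns of interest are called total_score and elec_dens_fast. The three data rows are invented toy values, not results from this tutorial.

```python
def parse_scorefile(lines):
    """Turn 'SCORE:' lines into a list of {column_name: value} dicts."""
    header = None
    rows = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields          # first SCORE: line names the columns
        elif fields != header:       # skip repeated header lines
            rows.append(dict(zip(header, fields)))
    return rows


def rank_by(rows, column):
    """Sort models by a numeric column, lowest (best) energy first."""
    return sorted(rows, key=lambda row: float(row[column]))


# Toy input with made-up numbers, only to show the mechanics.
toy = [
    "SEQUENCE:",
    "SCORE: total_score elec_dens_fast description",
    "SCORE: -310.2 -95.1 ntermin_0_relax_0001",
    "SCORE: -325.7 -88.4 ntermin_1_relax_0001",
    "SCORE: -298.0 -101.3 ctermin_0_relax_0001",
]
rows = parse_scorefile(toy)
ranked_total = [r["description"] for r in rank_by(rows, "total_score")]
ranked_dens = [r["description"] for r in rank_by(rows, "elec_dens_fast")]
print("by total energy:  ", ranked_total)
print("by density energy:", ranked_dens)
```

In the toy data the two rankings disagree, which is the situation the questions above ask you to look for in the real score files.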

4) Building large segments using RosettaES

RosettaCM (section 2) is a powerful tool for rebuilding small segments guided by density. However, it deals poorly with model completion of large segments of protein. These may arise in several cases:

1. Homology models (particularly distant ones) may have large insertions, or even entire domains that are lacking.

2. The models produced from denovo_density may be missing significant fractions of the backbone.

3. It may be difficult to manually trace long stretches of low local resolution into density.

To address these issues, we have developed a tool called Rosetta Enumerative Sampling, which uses an ensemble search algorithm to determine a large number of conformations that are consistent with both the density and the Rosetta energy function. This tool can be used on partial models from the denovo_density application, an incomplete homology model, or any other starting structure.

RosettaES model building consists of three steps. Initially, a preparation step builds the fragments that are to be used in conformational sampling. Then a rebuilding step will identify each unassigned segment in the initial model and build an ensemble of possible solutions for each. Finally, a combination step finds all the consistent subsets of interactions, and refines all such models (if there is only one segment, the script simply refines all structures in the ensemble). In this combination step, if assembly fails to find a consistent set of solutions, an additional round of sampling will be carried out, forcing different solutions than the previous model.
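The idea of an "unassigned segment" in the rebuilding step can be made concrete with a short Python sketch. This is an illustration under stated assumptions, not the RosettaES code: it assumes a single chain whose residues are numbered 1 to N in the same order as the sequence, numbers the missing stretches from the N-terminus, and uses invented residue ranges. The function name missing_segments is hypothetical.

```python
def missing_segments(n_residues, present):
    """Return (segment_id, first, last) for each run of residues not in `present`.

    Segment ids start at 1 and increase from the N- to the C-terminus.
    """
    segments = []
    start = None
    for resnum in range(1, n_residues + 1):
        if resnum not in present:
            if start is None:
                start = resnum            # a new gap opens here
        elif start is not None:
            segments.append((len(segments) + 1, start, resnum - 1))
            start = None                  # the gap closed at the previous residue
    if start is not None:                 # gap that runs to the C-terminus
        segments.append((len(segments) + 1, start, n_residues))
    return segments


# Toy model: a 70-residue sequence with residues 1-20 and 38-60 already built.
present = set(range(1, 21)) | set(range(38, 61))
segments = missing_segments(70, present)
print(segments)
```

Here the internal gap (21-37) becomes segment 1 and the missing C-terminal stretch (61-70) becomes segment 2.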

Note that a full tutorial of RosettaES is given in section five of the main tutorial; in this “mini tutorial,” we will only be using this tool to rebuild a single missing segment from a model.

Compared to the other sections, the workflow is a bit more complicated when extended to multiple compute cores. To handle job distribution we have included a Python script, RunRosettaES.py, that manages this job distribution among available CPUs on a single machine. (The script is included as part of Rosetta, in /main/source/scripts/python/public/EnumerativeSampling, as well as in this tutorial.) For dealing with job schedulers or clusters incompatible with this script, section 5E gives an overview of job distribution with RosettaES.

Step 4A. Fragment Picking

The first step involves selection of "fragment files," which predict backbone conformation from local sequence. We have a custom algorithm for fragment picking in RosettaES. These fragments will need to be generated before running RosettaES; the following command will generate these files (A_PickFragments.sh):

$ROSETTA3/source/bin/grower_prep.default.macosclangrelease \
  -pdb input.pdb \
  -in::file::fasta t20sA.fasta \
  -fragsizes 3 9 \
  -fragamounts 100 20

This will generate 100 3-residue fragments and 20 9-residue fragments, in files named 100.3mers and 20.9mers, that are then used in subsequent steps of the rebuilding process.
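The relationship between the -fragsizes and -fragamounts values and the resulting file names can be sketched in Python. This is only an illustration of the naming pattern described above (amount, a dot, size, then "mers"); the helper names are hypothetical, the pattern is inferred from the two file names given in the text, and nothing here runs Rosetta.

```python
def fragment_file_names(sizes, amounts):
    """Pair each fragment size with its amount and build '<amount>.<size>mers'."""
    if len(sizes) != len(amounts):
        raise ValueError("need exactly one amount per fragment size")
    return [f"{amount}.{size}mers" for size, amount in zip(sizes, amounts)]


def grower_prep_args(pdb, fasta, sizes, amounts):
    """Assemble the argument list used in A_PickFragments.sh (flags from the text)."""
    return ["-pdb", pdb,
            "-in::file::fasta", fasta,
            "-fragsizes", *map(str, sizes),
            "-fragamounts", *map(str, amounts)]


names = fragment_file_names([3, 9], [100, 20])
args = grower_prep_args("input.pdb", "t20sA.fasta", [3, 9], [100, 20])
print(names)
print(" ".join(args))
```

The file names produced this way are the ones the Fragments tags in RosettaES.xml need to point at.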

Step 4B. Generate conformations for the missing segment

The grower considers assigning positions for each unassigned segment of density (that is, each stretch of amino acids present in the fasta file but missing from the input structure). Each segment is referred to using a segment id, in which each segment is numbered from N- to C-terminus (with multiple chains given in order in the input fasta file). The script is run in two parts: first, the script is run once for each segment to rebuild; then, the script is run in “assembly mode” given the outputs produced by rebuilding each segment individually. Thus, to rebuild a model with two missing segments, the script is called three times: once to build each segment, and once to assemble the results.

In the first step, we perform conformational sampling for a difficult segment in apoferritin, generating an ensemble of putative solutions. This can be done by calling the command (B_SampleSegment.sh):

python RunRosettaES.py \
  -rs runES.sh \
  -x RosettaES.xml \
  -f seq.fasta \
  -p difficult_loop.pdb \
  -d ../2_rosettaCM_apoferritin/emd_2788.map \
  -l 1 \
  -c 16 \
  -n loop_1

The arguments to this program are as follows:

• -rs runES.sh - the script that is launched on each core and contains Rosetta flags and inputs

• -x RosettaES.xml - the XML script describing parameters for conformational sampling (see below)

• -f seq.fasta - the input fasta file (with chain breaks specified by ‘/’)

• -p difficult_loop.pdb - the input pdb file. This needs to match the input sequence, and all residues present in the fasta but absent in the PDB will get built.

• -d ../2_rosettaCM_apoferritin/emd_2788.map - the input density map

• -l 1 - the segment id of the segment to rebuild. This command should be called once for each segment to rebuild, varying this argument from 1 to N

• -c 16 - the number of compute cores to use

• -n loop_1 - the output tag for this job (results will be placed in a folder with this name). Tags should be unique for each segment.

The input XML file exposes key parameters for conformational sampling. In the tutorial, this file, RosettaES.xml, contains a block:

...
<FragmentExtension name="ext" fasta="full.fasta" scorefxn="dens" censcorefxn="cendens"
    beamwidth="32" dumpbeam="0" samplesheets="1" read_from_file="0"
    continuous_weight="0.3" looporder="1" comparatorrounds="100" windowdensweight="30"
    readbeams="%%readbeams%%" storedbeams="%%beams%%" steps="%%steps%%" pcount="%%pcount%%"
    filterprevious="%%filterprevious%%" filterbeams="%%filterbeams%%">
  <Fragments fragfile="100.3mers"/>
  <Fragments fragfile="20.9mers"/>
</FragmentExtension>
...

The sampling behavior of RosettaES is controlled by the block above. Many of the tags in this block (fasta, dumpbeam, read_from_file, storedbeams, steps, pcount, filterprevious, comparatorrounds, and filterbeams) are used by the job distribution script to pass results from one step to the next, and they should be left as-is. Others are user-specified, and can be modified based on the size of the loop and resolution of the data:

• beamwidth: controls the maximum number of solutions to be held at each step. Setting the value higher will increase runtime but may improve accuracy.

• windowdensweight: the relative contribution of density in model selection
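The trade-off behind beamwidth can be illustrated with a generic beam search, in which only a fixed number of partial solutions survive each extension step. The Python sketch below is a toy illustration of that general technique and is not the RosettaES implementation: the extend and score functions, the two-letter "moves", and the energies are all invented for the example.

```python
import heapq


def beam_search(start, extend, score, steps, beamwidth):
    """Grow solutions step by step, keeping only the `beamwidth` best each time."""
    beam = [start]
    for _ in range(steps):
        candidates = [child for partial in beam for child in extend(partial)]
        if not candidates:
            break
        beam = heapq.nsmallest(beamwidth, candidates, key=score)  # lower is better
    return beam


def extend(partial):
    """Toy extension: append one of two possible moves."""
    return [partial + (move,) for move in "ab"]


def score(partial):
    """Toy energy: starting with 'a' is cheap at first but costly at every later step."""
    if not partial:
        return 0
    first = 1 if partial[0] == "a" else 2
    later = (10 if partial[0] == "a" else 0) * (len(partial) - 1)
    return first + later


narrow = beam_search((), extend, score, steps=3, beamwidth=1)
wide = beam_search((), extend, score, steps=3, beamwidth=2)
print("beamwidth=1 best energy:", score(narrow[0]))
print("beamwidth=2 best energy:", score(wide[0]))
```

With a beam of one, the search commits to the move that looks best early and cannot recover; a wider beam keeps the alternative alive at the cost of scoring more candidates per step, which mirrors the runtime versus accuracy trade-off described for beamwidth.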

For many cases, the default parameters are sufficient. However, if the segment to grow is long (50+ residues), you may need to increase beamwidth; if the density is low resolution, you might need to decrease windowdensweight to 15 or 20.

Several options should rarely be modified, but may need to be in specific cases:

• samplesheets: Controls whether or not beta sheet sampling should be performed. It is recommended to use this except when working with symmetric systems.

• continuous_weight: Controls the penalty on discontinuous density. Setting this value to 1 will completely remove any penalty on discontinuous density; setting it closer to 0 will increase the penalty. You may wish to raise this value to 0.7 (or more) if you anticipate that the segment you are trying to model does not follow a continuous path of density.

Finally, the option comparatorrounds is used in multi-segment assembly (see section 5C).

After running the script with this XML, there are two important intermediate output files, placed in the folder loop_1 (the argument to -n):

• .lps (for loop partial solution) files, which are then combined in step 5C, in cases where there are multiple segments to model

• loop_1/beam_X.txt files, where X corresponds to the number of residues added to the segment. These are generated as the search adds residues, and are used to pass information from one step to the next (as additional residues are added in a single segment).
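When a model has several segments, the sampling command is called once per segment with a different segment id and a unique output tag. The Python sketch below only assembles those command lines and does not launch anything. It uses the RunRosettaES.py flags described above; the function name sampling_commands and the tag pattern loop_<id> (generalizing the tutorial's loop_1) are assumptions made for illustration.

```python
import shlex


def sampling_commands(n_segments, fasta, pdb, density, cores,
                      launcher="runES.sh", xml="RosettaES.xml"):
    """Build one RunRosettaES.py command (as an argument list) per segment id."""
    commands = []
    for segment_id in range(1, n_segments + 1):
        commands.append([
            "python", "RunRosettaES.py",
            "-rs", launcher,
            "-x", xml,
            "-f", fasta,
            "-p", pdb,
            "-d", density,
            "-l", str(segment_id),        # segment to rebuild
            "-c", str(cores),
            "-n", f"loop_{segment_id}",   # ASSUMED tag pattern, unique per segment
        ])
    return commands


cmds = sampling_commands(2, "seq.fasta", "difficult_loop.pdb",
                         "../2_rosettaCM_apoferritin/emd_2788.map", cores=16)
for cmd in cmds:
    print(shlex.join(cmd))
```

Keeping each command as a list makes it straightforward to hand to a job scheduler or to subprocess.run if you decide to launch the jobs from Python.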

Finally, while in most cases users will want to wait for a run to finish before inspecting the beam, the sampling results can also be inspected while the code is running. The final output ensemble can be saved as PDB files with the command (B2_InspectIntermediates.sh):

python RunRosettaES.py \
  -rs runES.sh \
  -x RosettaES.xml \
  -f seq.fasta \
  -p difficult_loop.pdb \
  -d ../2_rosettaCM_apoferritin/emd_2788.map \
  -l 1 \
  -db loop_1/beam_17.txt

Note that the number of the beam file (17) corresponds to the total number of residues built. Intermediate results (after growing N residues) can be inspected by changing this to a lower number (e.g., beam_14.txt shows solutions after 14 residues have been rebuilt).
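While a run is in progress it can be convenient to find out which beam files exist so far before choosing one to pass to -db. The following Python sketch is a small helper written for illustration, not part of RunRosettaES.py: it assumes the files follow the beam_X.txt naming described above, and the demonstration creates empty placeholder files in a temporary folder rather than reading real output.

```python
import re
import tempfile
from pathlib import Path

BEAM_NAME = re.compile(r"beam_(\d+)\.txt")


def beam_files(folder):
    """Map residues-built (the X in beam_X.txt) to the path of each beam file."""
    found = {}
    for path in Path(folder).iterdir():
        match = BEAM_NAME.fullmatch(path.name)
        if match:
            found[int(match.group(1))] = path
    return found


def latest_beam(folder):
    """Return the beam file with the most residues built, or None if there is none."""
    found = beam_files(folder)
    return found[max(found)] if found else None


# Demonstration with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for n in (3, 14, 17):
        (Path(tmp) / f"beam_{n}.txt").touch()
    (Path(tmp) / "notes.txt").touch()        # ignored: does not match the pattern
    demo_counts = sorted(beam_files(tmp))
    demo_name = latest_beam(tmp).name
print(demo_counts, demo_name)
```

Sorting numerically matters here: a plain alphabetical listing would place beam_3.txt after beam_17.txt.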