Post on 12-Mar-2020
Nature Structural and Molecular Biology: doi:10.1038/nsmb.2938
Supplementary Figure 1
Characterization of designed leucine-rich-repeat proteins.
(a) A water-mediated hydrogen-bond network is frequently visible in the convex region of LRR crystal structures. Examples are shown for the idealized L24 (DLRR_B) and L24→L28 fusion (DLRR_G3) structures. Water molecules participating in the hydrogen-bond network (yellow dots) are represented by spheres. (b) Super-helical shapes of the three idealized building block repeats. For clear visualization, dots tracing the global super-helix defined by the fitted parameters are overlaid with the LRR structures (rotation angle < 720°). The highly conserved leucine residues used for the parameter fitting are represented by spheres. See Supplementary Table 1 for the helical parameter estimation. (c) Structural alignments of the partial Ncap-L24₅ structure in DLRR_B (top) and the L22₅ structure in DLRR_A (bottom) onto the crystal structure of DLRR_E. Cα r.m.s. deviations for the alignments of DLRR_B and DLRR_A are 0.4 Å and 0.3 Å, respectively. (d) Structural defects in the initial fusion model of DLRR_G3. The crystal structure (magenta) of the junction module in DLRR_G3 is aligned with the initial model structure before design (gray) and the final model structure after design (green). The initial model contains a large cavity and side chain clashes in the junction module, which are resolved in the subsequent design procedure, as shown in the final model structure (green). (e) SEC-MALS experiments for DLRR_D, DLRR_E, DLRR_I, DLRR_J, DLRR_K, and DLRR_L. Most of the designs are monomeric, although some soluble aggregates/oligomers are observed for DLRR_I and DLRR_K.
Supplementary Figure 2
Experimental characterization of six L22→L28 designs (DLRR_F).
In the top row, the structure alignment (left) and sequence alignment (right) of the six junction module designs are shown. The building block sequences (L22 + L28) are shown in the first row of the sequence alignment for comparison. Far-UV CD spectra, thermal denaturation at 218 nm, and SEC-MALS are shown from left to right for each design.
Supplementary Figure 3
Experimental characterization of six L24→L28 designs (DLRR_G).
In the top row, the structure alignment (left) and sequence alignment (right) of the six junction module designs are shown. The building block sequences (L24 + L28) are shown in the first row of the sequence alignment for comparison. Far-UV CD spectra, thermal denaturation at 218 nm, and SEC-MALS are shown from left to right for each design. DLRR_G6 has one fewer {L28→L29} module than the others. The crystal structure of DLRR_G3 is shown in Figure 3d.
Supplementary Figure 4
Experimental characterization of four L24→L32→L24 designs (DLRR_H).
In the top row, the structural alignment of the four wedge module designs is shown. The sequence alignment of the four wedge module designs is shown with the building block and native L32 module sequences (L24 + L32 + L24) in the first row of the alignment for comparison. Far-UV CD spectra, thermal denaturation at 218 nm, and SEC-MALS are shown from left to right for each design. Design DLRR_I has two identical L32 modules derived from DLRR_H1 (Supplementary Table 2). In SEC-MALS experiments, some soluble aggregates/oligomers are observed in addition to the monomeric species. The crystal structure of DLRR_H2 is shown in Figure 3e.
Supplementary Figure 5
Characterization of designed junction modules.
(a) Sequence alignments between the designed junction modules and the top three naturally occurring sequences (square block) found in a BLAST (ref. 1) search of the non-redundant (NR) database. There are numerous sequence differences between the designed modules and the closest sequence in NR. Indeed, BLAST fails to find full-length alignments for most of the junction sequences. (b) Comparison of structures of designed and naturally occurring junctions between LRR modules. Left: designed junction modules. Middle: the closest structural matches found in the PDB using TM-align (ref. 2). Right: structural alignment. The TM-align searches were carried out with the two-unit junction module structures (green); one or two module structures next to the junction module are shown for both designed and natural structures (yellow) to make the ideality (or lack of ideality) of the different structures clearer. Most junctions between LRR modules of different lengths in the native structures occur near the caps, where the structure becomes much less regular. This irregularity, evident on the right side of the images of native structures, makes it impossible to generate novel LRRs with controlled curvature by combining multiple different types of modules simply using junctions already existing in the PDB. (c) Structural comparison between crystal structures and model structures generated by the iterative module assembly protocol described in Methods. All model structures show high consistency with the crystal structures (r.m.s. deviations in Table 2). (d) Native LRR proteins, internalin A (InlA, PDB ID: 1O6S, top left) and ribonuclease inhibitor (RI, PDB ID: 1A4Y, bottom left), achieve high affinity and specificity by having shapes closely conforming to the surfaces of the target proteins (human E-cadherin and ribonuclease A, respectively). Each protein has a curvature optimized for its target, resulting in well-packed complementary protein-protein interfaces with hot-spot clusters (shown by red sticks) at both the N and C termini. In contrast, swapping the respective targets of the LRR proteins (i.e. RI:E-cadherin, orange-cyan complex in the top right, and InlA:ribonuclease, green-yellow complex in the bottom right) produces clashes and large gaps in the binding interface.
Supplementary Table 1 Super-helical parameters of building block modules
| LRR type | Rise (Å) | Radius (Å) | Rotation angle (radian) | Number of repeat units used for fitting | Fitted RMSD (Å) |
|---|---|---|---|---|---|
| L22 | 2.34 | 18.67 | 0.24 | 8 | 0.09 |
| L24 | 1.41 | 24.62 | 0.20 | 9 | 0.13 |
| {L28→L29} | 0.82 | 16.52 | 0.31 | 10 | 0.17 |
The L22, L24 and {L28→L29} repeats form unique solenoid shapes, each of which can be described
by three super-helical parameters (radius: distance to the helical axis; rise: projected
distance along the helical axis between adjacent units; and rotation angle: rotation
about the helical axis between adjacent units). The global helical shapes and parameters are
estimated by fitting the three parameters to the repeat protein structures. For the parameter
fitting, one of the highly conserved positions, the second Leu in the LxxLxLxxN/C motif, is used
as the representative for each repeat module. The Cα coordinates of the representative
positions are obtained from the crystal structures of DLRR_A (L22) and DLRR_B (L24), and
from the model structure of DLRR_C ({L28→L29}). Eight to ten Cα coordinates are fitted to
the same number of coordinates generated from the three helical parameters.
The RMSD between the two coordinate sets is minimized using a non-linear optimization
algorithm (constrOptim.nl) in the alabama R package (refs. 3, 4). The initial helical parameters,
the input of the optimization procedure, are inferred from the transformation matrix between
the first two modules of the building block structures. After the optimization procedure, the
parameter set with the lowest RMSD is used to represent the global helical shape of the idealized
building block structures (Supplementary Fig. 1b).
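The geometry behind the three parameters can be illustrated with a short Python sketch. It is not the fitting procedure itself, which used constrOptim.nl in R: it only generates ideal unit positions from a radius, rise and rotation angle, and evaluates an RMSD objective of the kind that is minimized. The function names are hypothetical, the helix is assumed to lie along the z axis, and the two coordinate sets passed to `rmsd` are assumed to be already expressed in the same frame (a real fit would also have to superimpose them).

```python
import math

def helix_points(radius, rise, angle, n_units):
    """Ideal positions of n_units repeat units on a super-helix around the z axis."""
    return [
        (radius * math.cos(i * angle), radius * math.sin(i * angle), i * rise)
        for i in range(n_units)
    ]

def rmsd(coords_a, coords_b):
    """RMSD between two equally long coordinate sets that share one frame."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def adjacent_distance(radius, rise, angle):
    """Closed-form distance between neighbouring units on the helix."""
    chord = 2.0 * radius * math.sin(angle / 2.0)  # in-plane chord length
    return math.hypot(chord, rise)                # combined with the axial rise

# Fitted values from Supplementary Table 1:
# (rise in angstrom, radius in angstrom, rotation angle in radian, units used)
TABLE_1 = {
    "L22": (2.34, 18.67, 0.24, 8),
    "L24": (1.41, 24.62, 0.20, 9),
    "{L28->L29}": (0.82, 16.52, 0.31, 10),
}

for name, (rise, radius, angle, n_units) in TABLE_1.items():
    units_per_turn = 2.0 * math.pi / angle
    pitch = rise * units_per_turn
    spacing = adjacent_distance(radius, rise, angle)
    print(f"{name}: {units_per_turn:.1f} units/turn, "
          f"pitch {pitch:.1f} A, neighbour spacing {spacing:.2f} A")
```

In an actual fit, `rmsd(observed, helix_points(...))` would be the quantity handed to the optimizer, with the observed Cα coordinates first placed in the helical frame.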
Supplementary Table 2 Module organization and module origins of the multiple fusion designs in Figure 4c.

| Design name | Module organization | Individual modules | Original design |
|---|---|---|---|
| DLRR_I | Ncap–L24₂–JNL24→L32→L24–JNL24→L32→L24–L24₂ | Ncap–L24₂ | DLRR_B |
| | | JNL24→L32→L24 | DLRR_H1 |
| | | JNL24→L32→L24 | DLRR_H1 |
| | | L24₂ | DLRR_B |
| DLRR_J | Ncap–L22₄→L24₂–JNL24→L28→L29→[L28→L29]₂ | Ncap–L22₄ | DLRR_A |
| | | L24₂ | DLRR_B |
| | | JNL24→L28 | DLRR_G3 |
| | | L29 | DLRR_G3 |
| | | [L28→L29]₂ | DLRR_G3 |
| DLRR_K | Ncap–L24₂–JNL24→L32→L24–L24₃–JNL24→L28→L29→[L28→L29]₂ | Ncap–L24₂ | DLRR_B |
| | | JNL24→L32→L24 | DLRR_H2 |
| | | L24₃ | DLRR_B |
| | | JNL24→L28 | DLRR_G6 |
| | | L29 | DLRR_G6 |
| | | [L28→L29]₂ | DLRR_G6 |
| DLRR_L | Ncap–L22₃→L24₃–JNL24→L32→L24–L24₃–JNL24→L28→L29→[L28→L29]₂ | Ncap–L22₃ | DLRR_A |
| | | L24₃ | DLRR_B |
| | | JNL24→L32→L24 | DLRR_H2 |
| | | L24₃ | DLRR_B |
| | | JNL24→L28 | DLRR_G6 |
| | | L29 | DLRR_G6 |
| | | [L28→L29]₂ | DLRR_G6 |
Supplementary Table 3 Number of possible fusion LRR structures with respect to the number of repeat units.

| Number of repeat units | Number of possible LRR structures | Fold change (i) → (i+1) |
|---|---|---|
| 5 | 64 | |
| 6 | 145 | 2.266 |
| 7 | 327 | 2.255 |
| 8 | 736 | 2.251 |
| 9 | 1,655 | 2.249 |
| 10 | 3,720 | 2.248 |
| 11 | 8,360 | 2.247 |
| 12 | 18,786 | 2.247 |
| 13 | 42,213 | 2.247 |
| 14 | 94,853 | 2.247 |
| 15 | 213,134 | 2.247 |
| 16 | 478,909 | 2.247 |
| 17 | 1,076,100 | 2.247 |
| 18 | 2,417,996 | 2.247 |
| 19 | 5,433,237 | 2.247 |
LRR structures are generated by recursively following the edges of the network in Figure 4a.
The general module assembly starts from Ncap-L22 or Ncap-L24 in the network, except for
{L28→L29}ₙ, and each assembly step (transition in the network) adds one repeat unit to the
structure. The number of repeat units in the table counts only the internal repeat units,
excluding the N-terminal capping domain.
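The fold-change column can be recomputed directly from the structure counts. The Python sketch below does only that arithmetic: it does not enumerate paths through the network in Figure 4a, which is not reproduced here. The names `COUNTS` and `fold_changes` are hypothetical.

```python
# Number of possible LRR structures per number of internal repeat units,
# copied from Supplementary Table 3.
COUNTS = {
    5: 64, 6: 145, 7: 327, 8: 736, 9: 1655,
    10: 3720, 11: 8360, 12: 18786, 13: 42213, 14: 94853,
    15: 213134, 16: 478909, 17: 1076100, 18: 2417996, 19: 5433237,
}

def fold_changes(counts):
    """Ratio count(n) / count(n - 1) for every n whose predecessor is listed."""
    return {
        n: counts[n] / counts[n - 1]
        for n in sorted(counts)
        if n - 1 in counts
    }

for n, ratio in fold_changes(COUNTS).items():
    print(f"{n - 1} -> {n} repeat units: {ratio:.3f}-fold")
```

The ratios settle quickly, so under these counts each additional repeat unit multiplies the number of accessible structures by roughly 2.25.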
Supplementary Table 4 Crystallization conditions
Design names Crystallization conditions
DLRR_A 22% PEG 3350 w/v, 0.1 M MES pH 6.0, 0.2 M NaCl
DLRR_E 20% PEG 1000 v/v, 0.1 M Na/K phosphate pH 6.2
DLRR_G3 2 M ammonium sulfate, 0.1 M Bis-Tris pH 5.5
DLRR_H2 22% PEG 3350 w/v, 300 mM ammonium sulfate, unbuffered
DLRR_I 24% PEG 3350 w/v, 0.2 M ammonium sulfate, 0.1 M HEPES pH 7.5, 0.1 M proline
DLRR_K 20% PEG-3000, 0.1 M Tris pH 7.0, 0.2 M Ca(OAc)2
Supplementary Table 5 Designed sequences
>DLRR_A
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGITDISALKNLDNLETLDLRNNGITDKSALKNLNNLKgslehhhhhh
>DLRR_B
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPQGVFERLTNLKTL
NLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_C
LDLSNQNKTKEDCREIARELKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARAL
KQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEG
AAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDL
SNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_D
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLP
QGVFERLTSLTTLNLSNNQLTSLPQGVFERLTNLKTLNLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_E
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLHTLDLSNNGITDISALKNLDNLHTLDL
SNNGITDISALKNLDNLHTLDLSNNGITDISALKNLTSLHTLDLSNNGITDISALKNLDNLETLDLRNNGITDKSALKN
LNNLKgslehhhhhh
>DLRR_F1
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGIENFSAMSNLENLKTLNLSNNRVTKEACKAIAKALKRATSLHELHLSNNNIGE
EGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETL
DLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALK
QATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_F2
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGIENFNALRNLENLKTLNLSNNRVTKDACEAIAEALKRATSLHELHLSNNNIGE
EGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETL
DLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALK
QATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_F3
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGIENFEAMRNLENLKTLNLSNNRLTKEACKAVAEALKRATSLHELHLSNNNIG
EEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLET
LDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARAL
KQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_F4
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGITNVSALKNLENLKTLNLSNNNITKEACKAIAEALKRATSLHELHLSNNNIGEE
GAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLD
LSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQ
ATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_F5
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGIRNLEAMRNLENLKTLNLSNNNVTKEACSALAEALKRATSLHELHLSNNNIG
EEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLET
LDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARAL
KQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_F6
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLHTLDLSNNGIRNFEAMRNLENLKTLNLSNNNFTKEACSALAEALKRATSLHELHLSNNNIG
EEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLET
LDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARAL
KQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_G1
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPDGVLERLTNLKTL
NLSNNQITKEVCRHVAKILKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALK
QATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGA
AELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshh
hhhh
>DLRR_G2
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPDGVFERLTNLKTL
NLSNNQLTKEACRIVAKMLKQLASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALK
QATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGA
AELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshh
hhhh
>DLRR_G3
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPDGVFERLTNLKTL
NLSNNQLTKEACRAVANALKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARAL
KQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEG
AAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgsh
hhhhh
>DLRR_G4
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPDGVLERLTNLKTL
NLSNNQITKEVCRLVAKFLKQLASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALK
QATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEGA
AELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshh
hhhh
>DLRR_G5
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPDGVFERLTNLKTL
NLSNNQITKEVCRMVAKVLKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARAL
KQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLHELHLSNNNIGEEG
AAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGKAWLEEARRHPGSTLETgsh
hhhhh
>DLRR_G6
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNQLTSLPQGVFERLT
NLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPKGVLERLTNLKTL
NLSNNQITKEVCRHVAELLKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALK
QATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIGEEGK
AWLEEARRHPGSTLETgshhhhhh
>DLRR_H1
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNNIANINDQMLEGLT
NLTTLNLSHNNLARLWKHANPGGPIYFLKGLTNLTTLNLSSNGFDEIPREVFKDLTSLTTLNLSNNQLTSLPQGVFE
RLTNLKTLNLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_H2
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNNLANLNDKVFEGLT
NLTTLNLSNNNLARLWKHANPGGPIYFLKGLTNLTTLNLSNNGFDEFPKEVFKDLTSLTTLNLSNNQLTSLPQGVF
ERLTNLKTLNLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_H3
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNNLANLNDRLLEGLT
NLTTLNLSNNNLARLWKHANPGGPIYFLKGLTNLTTLNLSNNGFDEFPREVFKDLTSLTTLNLSNNQLTSLPQGVF
ERLTNLKTLNLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_H4
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNNLANLNDRVFEGLT
NLTTLNLSNNNLARLWKHANPGGPIYFLKGLTNLTTLNLSNNGFDELPKEVFKDLTSLTTLNLSNNQLTSLPQGVF
ERLTNLKTLNLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_I
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNNIANINDQMLEGLT
NLTTLNLSHNNLARLWKHANPGGPIYFLKGLTNLTTLNLSSNGFDEIPREVFKDLTSLTTLNLSNNNIANINDQMLE
GLTNLTTLNLSHNNLARLWKHANPGGPIYFLKGLTNLTTLNLSSNGFDEIPREVFKDLTSLTTLNLSNNQLTSLPQG
VFERLTNLKTLNLSNNQLQSLPTGVDEKLTQLTgshhhhhh
>DLRR_J
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLDNLHTLDLS
NNGITDISALKNLTSLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLP
QGVFERLTNLKTLNLSNNQLTKEACRAVANALKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCN
LTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATTLH
ELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_K
TITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNVRYLALGGNKLHDISAL
KELTNLGWLNLSNNQLETLPQGVFEKLTNLTTLNLSNNQLTSLPQGVFERLASLTTLNLSNNNLANLNDRVFEGLT
NLTTLNLSNNNLARLWKHANPGGPIYFLKGLTNLTTLNLSNNGFDELPKEVFKDLTSLTTLNLSNNQLTSLPQGVF
ERLTNLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLTSLPKGVLERLTN
LKTLNLSNNQITKEVCRHVAELLKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIA
RALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATSLHELHLSNNNIG
EEGKAWLEEARRHPGSTLETgshhhhhh
>DLRR_L
ETITVSTPIKQIFPDDAFAETIKANLKKKSVTDAVTQNELNSIDQIIANNSDIKSVQGIQYLPNLKTLKLSNNKITDISAL
KQLNNLGWLDLSNNGITDISALKNLASLHTLDLSNNGITDISALKNLDNLHTLDLSNNGITDISALKNLTSLTTLNLSN
NQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNNLA
NLNDRVFEGLTNLTTLNLSNNNLARLWKHANPGGPIYFLKGLTNLTTLNLSNNGFDELPKEVFKDLTSLTTLNLSN
NQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTNLTTLNLSNNQLTSLPQGVFERLTSLTTLNLSNNQLT
SLPKGVLERLTNLKTLNLSNNQITKEVCRHVAELLKQAASLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSN
CNLTKEACREIARALKQATTLHELHLSNNNIGEEGAAELVEALLHPGSTLETLDLSNCNLTKEACREIARALKQATS
LHELHLSNNNIGEEGKAWLEEARRHPGSTLETgshhhhhh
*C-terminal linkers and 6x His tags are shown in lower case.
*AS or TS in the regular repeat sequences are for inserting the restriction sites (NheI and SpeI).
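When working with the sequences above, the lower-case convention makes it straightforward to separate the designed residues from the C-terminal linker and His tag. The Python sketch below is one possible way to do this, not part of the original work: it reads FASTA-style records whose sequences are wrapped over several lines, splits off the trailing lower-case residues, and locates the second Leu of each LxxLxLxxN/C motif (the representative position used for Supplementary Table 1). All function names and the record name are hypothetical, and the example record is only the C-terminal fragment of DLRR_B, re-wrapped for illustration.

```python
import re

def parse_fasta(text):
    """Read FASTA-style records; wrapped sequence lines are joined per record."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()  # tolerates "> DLRR_A" as well as ">DLRR_A"
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def split_tag(sequence):
    """Return (designed residues, trailing lower-case linker and His tag)."""
    match = re.search(r"[a-z]+$", sequence)
    if match is None:
        return sequence, ""
    return sequence[:match.start()], match.group()

# Lookahead so that overlapping motif matches are all reported.
LRR_MOTIF = re.compile(r"(?=L..L.L..[NC])")

def second_leu_positions(mature):
    """0-based indices of the second Leu in every LxxLxLxxN/C match."""
    return [m.start() + 3 for m in LRR_MOTIF.finditer(mature)]

# C-terminal fragment of DLRR_B, used here only as a small example.
EXAMPLE = """> fragment_of_DLRR_B
LKTLNLSNNQLQSLPTG
VDEKLTQLTgshhhhhh
"""

for record_name, construct in parse_fasta(EXAMPLE).items():
    mature, tag = split_tag(construct)
    print(record_name, len(mature), tag, second_leu_positions(mature))
```

The same `split_tag` call also handles the constructs that end in `gslehhhhhh`, because it keys on case rather than on a fixed tag string.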
Supplementary References
1. Altschul, S.F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs. Nucleic Acids Res 25, 3389-402 (1997).
2. Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score.
Nucleic Acids Res 33, 2302-09 (2005).
3. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical
Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org/ (2012).
4. Varadhan, R. alabama: Constrained nonlinear optimization. R package version 2011.9-1.
http://CRAN.R-project.org/package=alabama (2012).