Fig. S1 Figure S1. Multiple alignment of the selected OSC sequences from Barbarea vulgaris and...

Fig. S1

Figure S1. Multiple alignment of the selected OSC sequences from Barbarea vulgaris and Arabidopsis thaliana, generated with the software MUSCLE; this alignment was used to construct the phylogenetic tree in Fig. 2.



[Figure S2 image: linkage map of B. vulgaris linkage groups 1-17 drawn beside A. thaliana chromosomes 1-5, with marker names, genetic distances and Mb positions. Gene labels legible in the extracted text: LUP2, LUP5, CYP716A, CYP72A8 (possibly CYP72A at position 8) and UGT73C. Legend: QTL for P-type saponins (1-4); QTL for glucosinolates; QTL for G-type saponins (A, hederagenin cellobioside; B, oleanolic acid cellobioside; C, gypsogenin cellobioside; D, 4-epihederagenin cellobioside); QTL for flea beetle resistance; dominance labels. Individual marker positions are not recoverable from the extracted text.]

Fig. S2

Figure S2. B. vulgaris QTL map showing 17 linkage groups as vertical bars, aligned to the five A. thaliana chromosomes. QTLs for flea beetle resistance, G- and P-type saponins, and glucosinolates are indicated by vertical colored bars, and the positions of the six mapped genes for saponin biosynthesis by circles. Genetic distances (cM) are listed to the left of each linkage group and genetic markers (AFLP, SSR) to the right; corresponding marker positions on the A. thaliana chromosomes are given in millions of base pairs (Mb). QTLs for the four unknown P-type saponins are marked 1 to 4 (see Kuzina et al. (2011) for more details); two of these were dominant, as were two QTLs for glucosinolates. Confidence intervals, explained variation and the position of the maximal LOD score are given in Table S2. AFLP markers are designated by the PstI (p) and MseI (m) primer combination followed by the fragment size (bp).
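The AFLP naming rule in the caption can be sketched as a small parser. This is only a minimal sketch, assuming that marker labels follow the pattern p<PstI primer>m<MseI primer>-<fragment size in bp>, for example p40m49-220. That example label, the exact pattern, and the names AflpMarker and parse_aflp_marker are illustrative assumptions rather than anything defined in the original.

```python
import re
from typing import NamedTuple


class AflpMarker(NamedTuple):
    """Components of an AFLP marker label (hypothetical helper type)."""
    pst_primer: int   # number following "p" (PstI primer)
    mse_primer: int   # number following "m" (MseI primer)
    size_bp: int      # fragment size in base pairs


# Assumed label layout: p<digits>m<digits>-<digits>
_AFLP_PATTERN = re.compile(r"p(\d+)m(\d+)-(\d+)")


def parse_aflp_marker(name: str) -> AflpMarker:
    """Split an AFLP marker label into primer numbers and fragment size."""
    match = _AFLP_PATTERN.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not an AFLP marker label: {name!r}")
    pst, mse, size = (int(group) for group in match.groups())
    return AflpMarker(pst, mse, size)
```

Labels that do not match the assumed pattern, such as SSR marker names, raise a ValueError instead of being guessed at.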

[Figure S3, panels a and b: EI-MS spectra (relative abundance, % vs. m/z 100-700) of (a) β-amyrin and (b) α-amyrin. Retention times as extracted, in order: 22.50 and 22.69 min.]

Fig. S3

Figure S3. EI-MS fragmentation patterns and retention times (RT) of trimethylsilyl (TMS) derivatives of 13 authentic triterpene standards and five tentatively identified triterpenoids (unknowns 1-5). The tentatively identified triterpenoids correspond to the peak numbers shown in Fig. 3. EI-MS patterns, RTs and structures of (a) β-amyrin, (b) α-amyrin, (c) lupeol, (d) erythrodiol, (e) uvaol, (f) betulin, (g) oleanolic acid, (h) oleanolic aldehyde, (i) betulinic acid, (j) betulinic aldehyde, (k) ursolic acid, (l) ursolic aldehyde and (m) hederagenin were compared against authentic standards. Tentative identification of unknowns 1-5 (n-u) was based on EI-MS fragmentation patterns common to most trimethylsilylated triterpenes and on mass spectral comparison with authentic standards. The characteristic fragmentation pattern (v), loss of a methyl group (-15 m/z) followed by loss of dimethylsilyl oxonium (-75 m/z) and loss of a second methyl group (-15 m/z) from the trimethylsilylated product, allowed estimation of the molecular masses of unknowns 1-5 (Table S8).
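The loss sequence described in the caption (-15, then -75, then -15 m/z) can be turned into a simple search for the parent ion of a TMS-derivatized triterpene. The sketch below is illustrative only and is not the authors' procedure. The function names and the 0.5 m/z tolerance are assumptions, and the peak list is the set of labels that can be read from the zoomed β-amyrin-TMS spectrum in panel (v).

```python
from typing import Iterable, List

LOSS_METHYL = 15.0                 # loss of a methyl group, per the caption
LOSS_DIMETHYLSILYL_OXONIUM = 75.0  # loss of dimethylsilyl oxonium, per the caption


def expected_ladder(parent_mz: float) -> List[float]:
    """Return the parent ion followed by the three expected fragment ions."""
    first = parent_mz - LOSS_METHYL
    second = first - LOSS_DIMETHYLSILYL_OXONIUM
    third = second - LOSS_METHYL
    return [parent_mz, first, second, third]


def candidate_parent_ions(peaks: Iterable[float], tolerance: float = 0.5) -> List[float]:
    """Return peaks whose full -15/-75/-15 ladder is present in the peak list.

    The tolerance (in m/z units) is an assumed value, not taken from the source.
    """
    ordered = sorted(peaks, reverse=True)

    def has_peak(target: float) -> bool:
        return any(abs(peak - target) <= tolerance for peak in ordered)

    return [
        peak for peak in ordered
        if all(has_peak(fragment) for fragment in expected_ladder(peak)[1:])
    ]


# Peak labels read from the zoomed view of panel (v), β-amyrin-TMS.
beta_amyrin_tms_peaks = [499.0, 483.9, 469.6, 455.5, 441.3,
                         425.4, 408.9, 393.9, 377.3, 365.6]
print(candidate_parent_ions(beta_amyrin_tms_peaks))
```

For this peak list only 499.0 has all three expected fragments (near 484, 409 and 394), which matches the scheme drawn in panel (v).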

[Figure S3, continued. Panels c-f: EI-MS spectra (relative abundance, % vs. m/z 100-700) of (c) lupeol, RT 22.73 min; (d) erythrodiol, RT 23.3 min; (e) uvaol, RT 23.5 min; (f) betulin, RT 23.6 min.]

[Figure S3, continued. Panels g-j: EI-MS spectra (relative abundance, % vs. m/z 100-700) of (g) oleanolic acid, RT 23.76 min; (h) oleanolic aldehyde, RT 23.8 min; (i) betulinic acid, RT 23.9 min; (j) betulinic aldehyde, RT 23.9 min.]

[Figure S3, continued. Panels k-m: EI-MS spectra (relative abundance, % vs. m/z 100-700) of (k) ursolic acid, RT 24.0 min; (l) ursolic aldehyde, RT 24.1 min; (m) hederagenin, RT 24.5 min.]

[Figure S3, continued. Panels n-q: EI-MS spectra (relative abundance, % vs. m/z). Panel titles as extracted: unknown-1; unknown-4; unknown-2; unknown-2 identified in the GC-MS analysis of saponified leaf extracts of the G-type B. vulgaris. Retention times as extracted, in order: 23.09, 23.67, 23.15 and 23.70 min. The extracted text does not preserve which title or retention time belongs to which panel.]

[Figure S3, continued. Panels r-u: EI-MS spectra (relative abundance, % vs. m/z). Panel titles as extracted: unknown-3; unknown-5; unknown-3 identified in the GC-MS analysis of saponified leaf extracts of the G-type B. vulgaris; unknown-5 identified in the GC-MS analysis of saponified leaf extracts of the G-type B. vulgaris. Retention times as extracted, in order: 23.29, 23.90, 24.20 and 24.90 min. The extracted text does not preserve which title or retention time belongs to which panel.]

[Figure S3, continued. Panel v: characteristic EI-MS of β-amyrin-TMS (relative abundance, % vs. m/z 100-700) with a zoomed-in view of m/z 360-500. The fragmentation scheme shows 499 m/z, loss of CH3 (-15 m/z) to 484 m/z, loss of 75 m/z to 409 m/z, and loss of CH3 (-15 m/z) to 394 m/z. Labeled peaks in the zoomed view: 499.0, 483.9, 469.6, 455.5, 441.3, 425.4, 408.9, 393.9, 377.3 and 365.6.]

[Figure S4 (a), peaks 1-4: negative-mode MS, MS2 and MS3 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 14.3 min #974; -MS2(1268.6), 14.3 min #975; -MS3(1269.0->1106.7), 14.3 min #976
-MS, 14.1 min #978; -MS2(1429.1), 14.1 min #979; -MS3(1428.8->780.6), 14.2 min #980
-MS, 14.3 min #990; -MS2(1282.7), 14.3 min #991; -MS3(1283.0->796.7), 14.3 min #992
-MS, 13.8 min #938; -MS2(1268.1), 13.8 min #939; -MS3(1268.6->1106.6), 13.9 min #940]

Fig. S4 (a)

Figure S4. LC-MS/MS spectra of tentatively identified saponins produced by expressing (a) LUP5G + CYP716A80 + UGT73C11 and (b) LUP2P + CYP716A80 + UGT73C11 in N. benthamiana, as illustrated in the left (LUP5G) and right (LUP2P) panels of Fig. 4, respectively. Mass spectrum numbers correspond to peak numbers in Fig. 5. Mass spectra were recorded in negative mode; the proposed aglycone masses therefore correspond to [M - 1]-.
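The [M - 1]- relation in the caption can be made concrete with a short sketch that converts a negative-mode fragment ion back to a neutral mass and counts sugar losses along a fragment series. It is illustrative only and is not the authors' analysis. The function names and the 1.0 m/z tolerance are assumptions, as is the use of 162 nominal mass units for one hexose unit, a standard value that the caption does not state. The candidate aglycone masses 456, 458 and 472 are those named in the legend of Fig. S5.

```python
from typing import Iterable, Optional, Sequence


def neutral_mass_from_negative_ion(mz: float) -> float:
    """Apply the caption's [M - 1]- relation: neutral mass = observed m/z + 1."""
    return mz + 1.0


def match_aglycone(mz: float,
                   candidates: Sequence[int] = (456, 458, 472),
                   tolerance: float = 1.0) -> Optional[int]:
    """Return the candidate aglycone mass closest to the ion, or None if none is near."""
    neutral = neutral_mass_from_negative_ion(mz)
    best = min(candidates, key=lambda mass: abs(mass - neutral))
    return best if abs(best - neutral) <= tolerance else None


def count_hexose_losses(ions: Iterable[float],
                        loss: float = 162.0,
                        tolerance: float = 1.0) -> int:
    """Count consecutive ~162 steps, starting from the heaviest ion."""
    ordered = sorted(ions, reverse=True)
    count = 0
    for higher, lower in zip(ordered, ordered[1:]):
        if abs((higher - lower) - loss) <= tolerance:
            count += 1
        else:
            break
    return count


# Ion series read from one MS/MS2/MS3 stack of Fig. S4 (a).
series = [1268.6, 1106.1, 943.6, 781.6, 619.5, 457.5]
print(count_hexose_losses(series), match_aglycone(series[-1]))
```

Under these assumptions the series shows five successive losses of about 162 and ends at an ion consistent with an aglycone of mass 458.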

[Figure S4 (a), continued, peaks 5-8: negative-mode MS, MS2 and MS3 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 14.8 min #1005; -MS2(1266.3), 14.8 min #1006; -MS3(1266.8->618.1), 14.8 min #1007
-MS, 14.9 min #1018; -MS2(1428.3), 15.0 min #1019; -MS3(1428.8->780.0), 15.0 min #1020
-MS, 15.2 min #1035; -MS2(1267.9), 15.2 min #1036; -MS3(1268.3->779.1), 15.2 min #1037
-MS, 15.5 min #1054; -MS2(1267.3), 15.5 min #1055; -MS3(1267.1->778.7), 15.5 min #1056]

[Figure S4 (a), continued, peaks 9-12: negative-mode MS, MS2 and MS3 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 16.0 min #1086; -MS2(1352.9), 16.0 min #1087; -MS3(1353.2->1026.4), 16.0 min #1088
-MS, 16.2 min #1098; -MS2(1108.7), 16.2 min #1099; -MS3(1108.6->945.9), 16.2 min #1100
-MS, 16.5 min #1113; -MS2(1309.0), 16.5 min #1114; -MS3(1308.7->617.5), 16.5 min #1115
-MS, 16.9 min #1142; -MS2(1191.3), 16.9 min #1143; -MS3(1191.2->659.9), 16.9 min #1144]

[Figure S4 (a), continued, peaks 13-16: negative-mode MS, MS2 and MS3 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 21.0 min #1464; -MS2(919.4), 21.1 min #1465; -MS3(919.2->782.5), 21.1 min #1466
-MS, 21.3 min #1483; -MS2(825.5), 21.3 min #1484; -MS3(825.4->781.3), 21.4 min #1485
-MS, 21.5 min #1492; -MS2(1565.3), 21.5 min #1493; -MS3(1565.1->780.9), 21.5 min #1494
-MS, 21.7 min #1510; -MS2(1565.5), 21.7 min #1511; -MS3(1565.2->781.5), 21.8 min #1512]

[Figure S4 (a), continued, peaks 17-20: negative-mode MS, MS2 and MS3 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 22.0 min #1525; -MS2(1564.9), 22.0 min #1526; -MS3(1565.3->781.0), 22.0 min #1527
-MS, 22.5 min #1559; -MS2(1561.0), 22.5 min #1560; -MS3(1561.3->778.9), 22.5 min #1561
-MS, 24.0 min #1673; -MS2(1172.1), 24.0 min #1674; -MS3(1171.7->778.7), 24.1 min #1675
-MS, 27.1 min #1875; -MS2(1237.5), 27.1 min #1876; -MS3(1237.2->618.7), 27.1 min #1877]

[Figure S4 (a), continued, peaks 21 and 22: negative-mode MS, MS2 and MS3 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 28.2 min #1955; -MS2(600.1), 28.3 min #1956; -MS3(600.3->497.8), 28.3 min #1957
-MS, 29.8 min #2064; -MS2(1408.4), 29.8 min #2065; -MS3(1408.9->598.2), 29.9 min #2066]

[Figure S4 (b), peaks 1-4: negative-mode MS and MS2 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 14.2 min #774; -MS2(1269.1), 14.2 min #775
-MS, 14.5 min #792; -MS2(1107.1), 14.5 min #793
-MS, 13.2 min #716
-MS, 13.7 min #746; -MS2(1269.1), 13.7 min #747]

[Figure S4 (b), continued, peaks 5-8: negative-mode MS and MS2 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 15.8 min #865; -MS2(1107.1), 15.9 min #866
-MS, 16.0 min #876; -MS2(1107.5), 16.1 min #877
-MS, 15.3 min #835; -MS2(1269.1), 15.3 min #836
-MS, 15.6 min #853; -MS2(1107.1), 15.6 min #854]

[Figure S4 (b), continued, peaks 9-12: negative-mode MS and MS2 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 16.4 min #896; -MS2(945.3), 16.4 min #897
-MS, 19.9 min #1096; -MS2(865.6), 19.9 min #1097
-MS, 20.9 min #1153; -MS2(864.2), 20.9 min #1154
-MS, 21.3 min #1176; -MS2(1565.4), 21.3 min #1177]

[Figure S4 (b), continued, peaks 13-16: negative-mode MS and MS2 spectra (intensity, % vs. m/z 200-1400). Scan titles as extracted, one stack per line; the extracted text does not preserve which stack belongs to which peak number.
-MS, 22.0 min #1214; -MS2(1291.4), 22.0 min #1215
-MS, 22.4 min #1235; -MS2(823.9), 22.4 min #1236
-MS, 22.5 min #1243; -MS2(666.1), 22.5 min #1244
-MS, 23.1 min #1276; -MS2(618.8), 23.1 min #1277]

Fig. S5

[Figure S5 image: LC-MS chromatograms (RT 12-30 min) in three panels: (a) LUP2G + CYP716A80 + UGT73C11; (b) LUP2P + CYP716A81 + UGT73C13; (c) LUP2P + CYP716A80 + UGT73C11. Traces are labeled 1 (TIC), 2 (TIC) and 2A-2D (EIC). Line styles in the legend distinguish saponin aglycone molecular masses of 456, 458 and 472. The numbered peaks, some marked with an asterisk (*), cannot be assigned to individual traces from the extracted text.]

Figure S5. Saponins produced by expressing different combinations of B. vulgaris genes coding for OSCs, P450s and UGTs in N. benthamiana leaves. Panels (a-c) show LC-MS profiles for (1) (a) LUP2G, (b) and (c) LUP2P; (2) LUPs in combination with (a) CYP716A80, (b) CYP716A81 and (c) CYP716A80; (3) LUPs and CYPs in combination with (a) UGT73C11, (b) UGT73C13 and (c) UGT73C11. (2A)-(2D) are EICs of (2) representing saponins with (2A) five sugar moieties (m/z 1266-1268, 1308-1310, 1380-1382, 1418-1420); (2B) four moieties (m/z 1088-1091, 1105-1110, 1145-1147, 1190-1192); (2C) two moieties (m/z 829-831, 901-903, 1564-1566); and (2D) one moiety (m/z 617-619, 666-668, 1236-1238, 1413-1415, 1408-1411). The y axis (ion count) of each chromatogram is scaled to the highest peak. Peaks marked with an asterisk (*) correspond to saponins that significantly decreased or disappeared after sodium hydroxide-based saponification, indicating the presence of sugar moieties at the C28 position of the triterpene aglycones. See Tables S5-S7 for more detailed information on the saponins detected from these enzyme combinations.
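The EIC windows listed in the caption amount to a lookup from an observed m/z to a trace and its number of sugar moieties. The sketch below encodes the windows exactly as the caption gives them. The names EIC_WINDOWS, SUGAR_MOIETIES and classify_eic are hypothetical, and treating the window bounds as inclusive is an assumption.

```python
from typing import Dict, List, Optional, Tuple

# m/z windows per EIC trace, copied from the Figure S5 caption.
EIC_WINDOWS: Dict[str, List[Tuple[float, float]]] = {
    "2A": [(1266, 1268), (1308, 1310), (1380, 1382), (1418, 1420)],
    "2B": [(1088, 1091), (1105, 1110), (1145, 1147), (1190, 1192)],
    "2C": [(829, 831), (901, 903), (1564, 1566)],
    "2D": [(617, 619), (666, 668), (1236, 1238), (1413, 1415), (1408, 1411)],
}

# Number of sugar moieties that the caption assigns to each trace.
SUGAR_MOIETIES: Dict[str, int] = {"2A": 5, "2B": 4, "2C": 2, "2D": 1}


def classify_eic(mz: float) -> Optional[str]:
    """Return the EIC trace whose window contains mz, or None if there is none.

    Window bounds are treated as inclusive (an assumption).
    """
    for trace, windows in EIC_WINDOWS.items():
        if any(low <= mz <= high for low, high in windows):
            return trace
    return None


for mz in (1267.9, 1107.1, 1565.4, 618.8, 1428.3):
    print(mz, classify_eic(mz))
```

Ions that fall outside every listed window, such as m/z 1428.3, are returned as None rather than being forced into the nearest trace.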