Avena strigosa – from metabolic gene clusters to genomics
Anne Osbourn
Metabolic diversity – a snapshot
[Figure: structures of representative triterpenes and saponins: chromosaponin I, QS21, oleanolic acid, β-amyrin, synthetic oleananes (R = CO2H, CO2Me, CO.NHMe, CN), betulinic acid, ginsenoside Rg1, glycyrrhizin, avenacin A-1, soyasaponin I, avicin D]
Sweetener
Bitterness/antifeedant
Antifungal
Adjuvant
Anti-cancer
Plant growth stimulant
“Adaptogen”
Avenacins – Antimicrobial defence compounds produced by oat (Avena spp.)
[Figure: structure of avenacin A-1, derived from β-amyrin, with a sugar chain at C-3: α-L-Ara(1→)3, β-D-Glc(1→2), β-D-Glc(1→4)]
Papadopoulou et al. (1999) PNAS 96:12923
The genes for avenacin synthesis are clustered
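[Figure: map of the avenacin gene cluster. Loci: Sad1, Sad2, Sad7, Sad9, Sad10, Sad6, Sad8; Sad3 (~3.5 cM); scale: 400 kb]
[Figure: expression of Sad1, Sad2, Sad7, Sad9 and Sad10 in roots, shoots, leaves and flowers, alongside GAPDH]
[Figure: panels for Sad1, Sad2, Sad7 and Sad9; labels: OSC, CYP51]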
Qi et al (2004) PNAS 101:8233; Qi et al (2006) PNAS 103:18848; Mylona et al (2008) Plant Cell 20:201; Mugford et al (2009)
Plant Cell 21:2473; Wegel et al (2009) Plant Cell 21:3926; Qin et al (2010) Phytochemistry 71:1245; Owatworakit et al
(2012) JBC, 288:3696; Mugford et al Plant Cell (2013) 25:1078; Geisler et al. (2013) PNAS 110:E3360-67
[Figure: subcellular organisation of avenacin A-1 biosynthesis]
Compartments: cytosol, plastid, ER, ER-derived vesicles, vacuole
Precursor pathways: MVA pathway, shikimate pathway
Enzymes: Sad1 (AsβAS1), Sad2 (AsCYP51H10), Sad7 (AsSCPL1), Sad9 (AsMT1), Sad10 (AsUGT74H5)
Metabolites: 2,3-oxidosqualene, β-amyrin, des-acyl avenacin A, Anth, NMA, NMA-Glc, avenacin A-1
Owatworakit et al (2012) JBC, 288:3696; Mugford et al (2013) Plant Cell 25:1078
Metabolic engineering for disease resistance
Take-all disease of wheat
[Figure: triterpene biosynthesis and diversification]
Mevalonate → farnesyl pyrophosphate → squalene → (squalene epoxidase) → 2,3-oxidosqualene
2,3-Oxidosqualene → (cycloartenol synthase) → cycloartenol → plant steroids
2,3-Oxidosqualene → (oxidosqualene cyclases) → β-amyrin, α-amyrin, lupeol; over 100 different scaffolds discovered so far
Oxidation, glycosylation and acylation → avenacin A1, soyasapogenol B, eclabatin etc.; lupane, betulinic acid etc.
Multiple hydroxylation patterns; greatest diversity introduced by functionalisation of hydroxyl groups
A) Scaffolds (OSCs): AsbAS1, LjAMY2
B) Scaffold oxygenation (CYP450s): AsCYPA, AsCYPB, GmCYP93E1, GgCYP72A154; positions labelled on the scaffold: 12, 13, 16, 21, 24, 30
C) Further modification: glycosylation (GTs): GmUGT73P2, GmUGT91H4, AsGTs; acylation: AsMT1, AsGT2, AsAT1
Other label: BvC10,C12
The oat promoters are active in other species
[Figure: oat promoter activity in oat, Arabidopsis, rice and Medicago truncatula]
Kemen et al. (2014) PNAS 111:8679
The oat promoters are active in other species
[Figure: lines AB 36.12, AB 36.11 and AB 36.7; construct pEW286 (GG construct, pSad1-GUS with 2 introns)]
Aymeric Leveau, Gemma Farres, with National Institute of Agricultural Botany, Cambridge
The HyperTrans system – Rapid high level
transient expression of protein in plants
Sainsbury F and Lomonossoff G. (2008) Plant Phys. 148:1212
Production of a functionalised triterpene
[Figure: traces for control, Sad1, and Sad1 + Sad2. Labelled peaks: stigmasterol, β-amyrin, and a new product, 12,13-epoxy-16-hydroxy-β-amyrin]
Geisler et al. (2013) PNAS 110:E3360-67
Standardized assembly of gene parts from a library, using Golden Gate technology
[Figure: Level 2 Golden Gate binary vector. Labels: OSC1, CYP1, CYP4, GT3, GT1, MT6, AT5, KanR, LB, RB]
Pathway steps: precursor; scaffolds (OSCs); scaffold oxygenation (CYPs); further modification: glycosylation (GTs), acylation (MTs, GTs, ATs)
Parts library:
Promoters: P1 P2 P3 P4 P5 P6 P7
CDS: OSC1 to OSC7; CYP1 to CYP7; GT1 to GT7; MT1 to MT7; AT1 to AT7
Terminators: T1 T2 T3 T4 T5 T6 T7
Golden Gate: Weber et al. (2011) PLoS ONE 6(2): e16765. doi:10.1371/journal.pone.0016765; Thomas Louveau unpub.
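The size of the design space implied by such a parts library can be sketched in a few lines. This is a minimal illustration, not the group's assembly software: it assumes a hypothetical library with seven coding sequences per enzyme class (matching the labels on the slide) and, as a simplification, exactly one part per class per construct. The names `LIBRARY`, `PART_CLASSES` and `enumerate_designs` are invented for this example.

```python
from itertools import product

# Hypothetical parts library mirroring the slide labels: seven CDS per class.
PART_CLASSES = ["OSC", "CYP", "GT", "MT", "AT"]
LIBRARY = {cls: [f"{cls}{i}" for i in range(1, 8)] for cls in PART_CLASSES}


def enumerate_designs(library, classes):
    """Yield every design that takes exactly one part from each class, in order."""
    for combo in product(*(library[cls] for cls in classes)):
        yield combo


designs = list(enumerate_designs(LIBRARY, PART_CLASSES))
n_designs = len(designs)  # 7 choices for each of 5 classes

print(n_designs)
print(designs[0])
```

The vector shown on the slide carries two CYPs and two GTs, so real constructs are not limited to one part per class; relaxing that assumption, or also varying promoters and terminators, enlarges the space further.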
Functional assembly of a terpene pathway – Proof of concept
[Figure: stepwise addition (+) of TS, CYP450, CYP450, MT, GTs and AT; accumulation of acylated terpene in the vacuole]
The thalianol pathway – The first example of an
“operon-like” gene cluster from Arabidopsis
Field & Osbourn (2008) Science 320:543-547
[Figure: the thalianol pathway. 2,3-Oxidosqualene → (TS) → thalianol → (CYP708A2) → thalian-diol → (CYP705A5) → desaturated thalian-diol]
[Figure: single-ion chromatograms (SIC 229 amu, SIC 227 amu) for wild type, TS KO, CYP708A2 KO and CYP705A5 KO; peaks: thalianol, thalian-diol, desaturated thalian-diol]
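The logic of the knockout comparison can be sketched as a toy model. It treats the pathway as strictly linear, so that knocking out an enzyme is expected to stop the pathway at that enzyme's substrate. This is an illustrative simplification only: it ignores flux, side reactions and further metabolism, and the names `STEPS` and `end_product` are invented for this example.

```python
# Linear pathway as (enzyme, product) steps, starting from 2,3-oxidosqualene.
PRECURSOR = "2,3-oxidosqualene"
STEPS = [
    ("TS", "thalianol"),
    ("CYP708A2", "thalian-diol"),
    ("CYP705A5", "desaturated thalian-diol"),
]


def end_product(knocked_out=frozenset()):
    """Return the last metabolite reached before the first blocked step."""
    current = PRECURSOR
    for enzyme, product_name in STEPS:
        if enzyme in knocked_out:
            break  # the pathway stops at the substrate of the missing enzyme
        current = product_name
    return current


for genotype in [set(), {"TS"}, {"CYP708A2"}, {"CYP705A5"}]:
    label = ", ".join(sorted(genotype)) + " KO" if genotype else "wild type"
    print(label, "->", end_product(genotype))
```

Under this model the CYP708A2 knockout stops at thalianol and the CYP705A5 knockout at thalian-diol.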
Examples of gene clusters for specialised metabolic pathways in a range of plants
Nutzmann & Osbourn (2014) Curr Opin. Biotechnol. 26:91
Arabidopsis thaliana: Thalianol, Marneral
Lotus japonicus: Lotaustralin, Linamarin
Zea mays: DIMBOA-Glc
Solanum lycopersicum: α-Tomatine
Sorghum bicolor: Dhurrin
Papaver somniferum: Noscapine
Oryza sativa: Momilactone A, Phytocassanes A-E
Avena strigosa: Avenacin A-1
Mining the terpenome
Species: Arabidopsis thaliana, Brassica rapa, Cucumis sativus, Glycine max, Lotus japonicus, Mimulus guttatus, Medicago truncatula, Populus trichocarpa, Ricinus communis, Solanum lycopersicon, Solanum tuberosum, Vitis vinifera, Brachypodium distachyon, Oryza sativa, Sorghum bicolor, Setaria italica, Zea mays
Boutanaev et al., (2015) PNAS
Functional analysis of candidate pathways
Boutanaev et al., (2015) PNAS
Chromatin modifications at plant metabolic gene clusters
[Figure: avenacin cluster (A. strigosa) and thalianol cluster (A. thaliana)]
Clustering may allow a ‘belts and braces’ approach to pathway regulation
Nutzmann and Osbourn (2015); Yu, Nutzmann et al., submitted.
Genome mining for new pathways and chemistries
Nutzmann & Osbourn, Curr Opin. Biotechnol. 2014
Avena strigosa (S75) – A diploid model
Genome Size ~3.92 Gb (1C=4.00)
Diploid: AA
2n=2x=14
Chromosome number : 7
Photosynthetic pathway: C3
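The genome size and chromosome figures above can be cross-checked with a short calculation. This sketch assumes that "1C=4.00" is a C-value in picograms (the slide gives no unit) and uses the commonly quoted conversion of roughly 978 Mbp per picogram of DNA; the function name `c_value_to_gb` is invented for this example.

```python
# Assumption: 1 pg of DNA corresponds to roughly 978 Mbp.
PG_TO_MBP = 978.0


def c_value_to_gb(c_value_pg):
    """Convert a 1C DNA content in picograms to an approximate size in Gb."""
    return c_value_pg * PG_TO_MBP / 1000.0


genome_gb = c_value_to_gb(4.00)      # assumes 1C = 4.00 pg
haploid_chromosomes = 14 // 2        # 2n = 2x = 14, so n = 7
mean_chromosome_gb = genome_gb / haploid_chromosomes

print(f"estimated genome size: {genome_gb:.2f} Gb")
print(f"mean chromosome size: {mean_chromosome_gb:.2f} Gb")
```

Under these assumptions the estimate comes out close to the ~3.92 Gb quoted on the slide, with an average of a little over half a gigabase per chromosome.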
Bin Han
Qiang Zhao
Ying Lu
Yan Li
Hengyun Lu
Xuehui Huang
Qi Feng
Danling Fan
Wenjun Li
Qilin Tian
Aymeric Leveau
Thomas Louveau
Daisy Orme
Preparation of genomic DNA and samples for RNA-seq,
454 resources, BAC libraries, genome mining, functional
analysis, metabolic biology
Anne Osbourn
Project leader (JIC)
Bin Han
Genome assembly and annotation
Annotation and comparative genomics
Annotation
Genome assembly
Material and Genome sequencing
Genome and transcriptome sequencing
Genome sequencing
Transcriptome sequencing
Director & project leader
Annotation
Genome sequencing of Avena strigosa S75, April 2013-
Assembly strategy (Bin Han)
[Figure: assembly workflow. Labels: S75 raw reads; contigs; scaffolds; super scaffolds; mate pair (3k, 8k, 20k); fosmid ends (40k); fosmid raw reads, scaffolds and super scaffolds]
v6 – 90% coverage
RNAseq samples: head emergence; seedling and root; root tips; boot stage
Characterization of sad1 mutants
Mutant | Mutation event | Predicted amino acid change | Predicted mutation type
A1 | G1912A (a) | Tyr165 → STOP | Premature termination of translation
B1 | G1912A (a) | Tyr165 → STOP | Premature termination of translation
109 | G3417A | Tyr380 → STOP | Premature termination of translation
610 | G1912A (a) | Tyr165 → STOP | Premature termination of translation
1146 | G4169A | Tyr471 → STOP | Premature termination of translation
1293 | G39A | Tyr13 → STOP | Premature termination of translation
110 | G6689A | - | Splicing error
225 | G3302A | - | Splicing error
589 | G3914A | - | Splicing error
1001 | G4365A | - | Splicing error
297 | G3939A | Glu419 → Lys | Amino acid substitution
358 | G5234A | Cys563 → Tyr | Amino acid substitution
384 | C7249T (b) | Ser728 → Phe | Amino acid substitution
532 | G549A | Gly121 → Glu | Amino acid substitution
599 | G2809A | Gly277 → Glu | Amino acid substitution
1023 | C7249T (b) | Ser728 → Phe | Amino acid substitution
1217 | G2025A | Gly203 → Glu | Amino acid substitution
(a) Identical mutations (G1912 → A)
(b) Identical mutations (G7249 → A)
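The mutation events listed for the sad1 mutants follow a simple reference base, position, new base notation, which makes them easy to check programmatically. The sketch below parses each event, tests whether it is a transition (G↔A or C↔T), and groups mutants that share an identical event. The helper names (`parse_event`, `is_transition`, `EVENTS`) are invented for this example, and the event strings are transcribed from the slide.

```python
import re
from collections import defaultdict

MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")
TRANSITIONS = {("G", "A"), ("A", "G"), ("C", "T"), ("T", "C")}

# Mutation events transcribed from the slide (mutant -> event).
EVENTS = {
    "A1": "G1912A", "B1": "G1912A", "109": "G3417A", "610": "G1912A",
    "1146": "G4169A", "1293": "G39A", "110": "G6689A", "225": "G3302A",
    "589": "G3914A", "1001": "G4365A", "297": "G3939A", "358": "G5234A",
    "384": "C7249T", "532": "G549A", "599": "G2809A", "1023": "C7249T",
    "1217": "G2025A",
}


def parse_event(event):
    """Split an event such as 'G1912A' into (reference, position, alternate)."""
    match = MUTATION_RE.match(event)
    if match is None:
        raise ValueError(f"unrecognised mutation event: {event!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def is_transition(event):
    """True if the event swaps purine for purine or pyrimidine for pyrimidine."""
    ref, _, alt = parse_event(event)
    return (ref, alt) in TRANSITIONS


# Group mutants that carry the same event.
shared = defaultdict(list)
for mutant, event in EVENTS.items():
    shared[event].append(mutant)
repeated = {event: mutants for event, mutants in shared.items() if len(mutants) > 1}

print(all(is_transition(e) for e in EVENTS.values()))
print(repeated)
```

The grouping recovers the two sets of identical mutations flagged by the (a) and (b) footnotes.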
Characterization of sad1 mutants
[Figure: Sad1 transcript (with GAPDH) and protein (size markers 188, 98, 62, 49, 38, 28, 17, 14 kDa) in S75 and sad1 mutants, panels A to C]
Stop codons: S75, A1, B1, 109, 610, 1146, 1293
Splicing errors: S75, 110, 225, 589, 1001
Amino acid substitutions: S75, 297, 358, 384, 532, 599, 1023, 1217
[Figure: protein structure panels C and D. Labelled residues: D484, C485, C563; F259, T260, Y264, I554, F725, S728]
Areas for future work
• Avena strigosa genome annotation
• Comparative genomics with other species
– Why are oats different?
• Analysis of important metabolic pathways – Agriculture, health, industrial biotechnology
• Establish a reverse genetics platform for diploid oat
– Exome capture (validate with allelic series of characterised mutants; then test with 50 uncharacterised avenacin-deficient mutants)
• Longer term
– Resequencing of other Avena accessions, GWAS analysis…
Osbourn Group: Ramesh Thimmappa, Hans Neutzmann, Aymeric Leveau, Gemma Farres, Tessa Moses, Thomas Louveau, James Reed, Rachel Melton, Daisy Orme, Zheyong Xue, Diego Batista
OpenPlant Project Manager: Colette Matthewman
In collaboration with:
Lionel Hill, Paul Brett, Rob Field, Martin Rejzek, George Lomonossoff, Paul O’Maille, Martin Trick, Vinod Kumar (JIC)
Andy Greenland, Emma Wallingford (NIAB)
Bin Han and team (SIPPE)
Tim Langdon (IBERS)
The Syntegron team: Susan Rosser (Univ. Glasgow), Jay Keasling (UC Berkeley), Paul Freemont (Imperial), Declan Bates (Univ. Exeter), Josh Leonard (Northwestern) and co-workers
Unilever, Croda, Algenuity
Christina Smolke, Beth Sattely, Sue Rhee
Jenni Rant (The SAW Trust)
NIH