Caffeine Alkaline Hydrolysis and in Silico Anticipation ...

73
doi.org/10.26434/chemrxiv.12110862.v1 Caffeine Alkaline Hydrolysis and in Silico Anticipation Reveal the Origin of Camellimidazoles Pierre Le Pogam, Erwan POUPON, Grégory Genta-Jouve, Jean-François Gallard, Victor Turpin, Adam Skiredj, Karine Leblanc, Mehdi Beniddir Submitted date: 10/04/2020 Posted date: 13/04/2020 Licence: CC BY-NC-ND 4.0 Citation information: Le Pogam, Pierre; POUPON, Erwan; Genta-Jouve, Grégory; Gallard, Jean-François; Turpin, Victor; Skiredj, Adam; et al. (2020): Caffeine Alkaline Hydrolysis and in Silico Anticipation Reveal the Origin of Camellimidazoles. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.12110862.v1 - Context. Camellimidazoles A–C were recently reported as new natural substances arising from what was described as new caffeine degradation pathway in Keemun black tea. - Discoveries. Under alkaline hydrolysis conditions followed by spontaneous cascade reactions with formaldehyde or dichloromethane (as the key methylene group providers), we were able to achieve the synthesis of camellimidazoles B and C. A MetWork-based pipeline was also implemented highlighting a wealth of structurally diverse compounds formed in the course of the reaction and streamlining the isolation of the newly described camellimidazoles D-F, subsequently confirmed as anticipated in silico upon extensive spectroscopic analyses. Besides demonstrating the artefactual origin of camellimidazoles, the current investigation emphasizes the fitness of MetWork-tagging to illuminate the chemical diversity associated with seemingly simple reactive conditions. - Methods. Organic synthesis, phytochemical analysis, in silico anticipation, molecular networking dereplication File list (2) download file view on ChemRxiv Turpin et al. 10APR2020.pdf (1.51 MiB) download file view on ChemRxiv Turpin et al. SI 10APR2020.pdf (3.17 MiB)

Transcript of Caffeine Alkaline Hydrolysis and in Silico Anticipation ...

doi.org/10.26434/chemrxiv.12110862.v1

Caffeine Alkaline Hydrolysis and in Silico Anticipation Reveal the Originof CamellimidazolesPierre Le Pogam, Erwan POUPON, Grégory Genta-Jouve, Jean-François Gallard, Victor Turpin, AdamSkiredj, Karine Leblanc, Mehdi Beniddir

Submitted date: 10/04/2020 • Posted date: 13/04/2020Licence: CC BY-NC-ND 4.0Citation information: Le Pogam, Pierre; POUPON, Erwan; Genta-Jouve, Grégory; Gallard, Jean-François;Turpin, Victor; Skiredj, Adam; et al. (2020): Caffeine Alkaline Hydrolysis and in Silico Anticipation Reveal theOrigin of Camellimidazoles. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.12110862.v1

- Context. Camellimidazoles A–C were recently reported as new natural substances arising from what wasdescribed as new caffeine degradation pathway in Keemun black tea.- Discoveries. Under alkaline hydrolysis conditions followed by spontaneous cascade reactions withformaldehyde or dichloromethane (as the key methylene group providers), we were able to achieve thesynthesis of camellimidazoles B and C. A MetWork-based pipeline was also implemented highlighting awealth of structurally diverse compounds formed in the course of the reaction and streamlining the isolation ofthe newly described camellimidazoles D-F, subsequently confirmed as anticipated in silico upon extensivespectroscopic analyses. Besides demonstrating the artefactual origin of camellimidazoles, the currentinvestigation emphasizes the fitness of MetWork-tagging to illuminate the chemical diversity associated withseemingly simple reactive conditions.- Methods. Organic synthesis, phytochemical analysis, in silico anticipation, molecular networkingdereplication

File list (2)

download fileview on ChemRxivTurpin et al. 10APR2020.pdf (1.51 MiB)

download fileview on ChemRxivTurpin et al. SI 10APR2020.pdf (3.17 MiB)

1

Caffeine Alkaline Hydrolysis and in Silico Anticipation Reveal the Origin of

Camellimidazoles

Victor Turpin,† Mehdi A. Beniddir,† Grégory Genta-Jouve,‡§ Adam Skiredj,† Jean-François

Gallard,# Karine Leblanc,† Pierre Le Pogam,†* Erwan Poupon†*

† Université Paris-Saclay, CNRS, BioCIS, 92290, Châtenay-Malabry, France

‡ Laboratoire de Chimie-Toxicologie Analytique et Cellulaire (C-TAC), UMR CNRS 8038

CiTCoM, Université de Paris, 4, Avenue de l’Observatoire, 75006, Paris, France

§ Laboratoire Ecologie, Evolution, Interactions des Systèmes Amazoniens (LEEISA), USR

3456, Université De Guyane, CNRS Guyane, 275 Route de Montabo, 97334 Cayenne, French

Guiana

# Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, Université Paris-

Saclay, 21 Avenue de la Terrasse, 91198 Gif-sur-Yvette, France

*Corresponding authors: [email protected]; pierre.le-pogam-

[email protected]

2

Abstract

Camellimidazoles A–C (2–4) were recently reported as new natural substances arising from

what was described as new caffeine (1) degradation pathway in Keemun black tea. Under

alkaline hydrolysis conditions followed by spontaneous cascade reactions with formaldehyde

(as the key methylene group provider in the “biosynthetic” hypothesis), we were able to achieve

the synthesis of 3 and 4. Nonetheless, to reach as wide coverage of the obtained products as

possible, a MetWork-based pipeline was implemented highlighting a wealth of structurally

diverse compounds formed in the course of the reaction and streamlining the isolation of the

newly described camellimidazoles D−F (7–9), subsequently confirmed as anticipated in silico

upon extensive spectroscopic analyses. A mere stirring of caffeine (1) in a biphasic solvent

system consisting of basic water and dichloromethane as an alternative methylene donor also

granted these structures probably accounting for their occurrence in Wang et al. phytochemical

pipeline. Besides demonstrating the artefactual origin of camellimidazoles, the current

investigation emphasizes the fitness of MetWork-tagging to illuminate the chemical diversity

associated with seemingly simple reactive conditions.

3

Introduction

Caffeine (1) content of tea leaves is notably stable all along fermentation process (i.e. “green”

to “black” tea)1,2 in opposition to polyphenols which are allowed to oxidize under controlled

humidity and temperatures (theaflavin and thearubigin formation for example).3,4 In that

context, camellimidazoles A−C (2–4) were recently reported as new methylene-bridged dimeric

imidazoles from Keemun black tea.5 These structures, along with some trimethylallantoine

derivatives later described by the same research group,6 were suggested to be generated through

an alternative caffeine catabolic pathway. Even though a detailed “biosynthetic” pathway was

proposed by the authors, the convoluted phytochemical processing having led to their isolation

questions their genuine natural origin, especially in view of the minute amounts that were

isolated.5 In the present work, we wanted to unveil the origin of camellimidazoles.

Results and discussion

“Biosynthetic” or other origin? At first, a retro(bio)synthetic analysis of camellimidazoles

reveals, indeed, the need for four building blocks, i.e.:

- imidazoles 5 (aka caffeidine) and 6 that may both arise from hydrolytic decarboxylation of

caffeine (1);

- methylamine that may be provided through hydrolysis of imidazoles 5 and 6;

- and a bridging methylene source such as formaldehyde, which origin will be discussed in

the following.

Isohypsic cascades of Mannich-type reactions and involving cross condensations of 5 and 6

may then easily explain the formation of camellimidazoles A-C (Figure 1).

4

Figure 1. The (bio)chemical logic of camellimidazoles formation.

Taking into account the basic conditions of some extraction steps of the seminal paper, a mere

chemical reactivity can rather be envisaged to afford the newly reported species, especially

when having in mind the huge quantity of starting material (200 kg). This chemical reactivity-

driven scenario would rely on repeatedly described purine alkaline hydrolysis reactions.7–9

Quite strikingly, the putative biosynthetic scheme proposed by the authors relies on a

surprisingly similar way as it proceeds via a caffeidine (5) intermediate which was formerly

chemically obtained from caffeine (1) upon alkaline hydrolysis.

The current investigation aims at assessing the chemical diversity generated under accelerated

caffeine hydrolytic degradation conditions that were deemed suitable for camellimidazoles

formation (i.e. in the presence of formaldehyde in protic medium). Reaction crude mixtures

can, in this particular situation, be seen as “artificial metabolomes” and may advantageously

enter a prioritization rational workflow. For this purpose, untargeted mass spectrometric data

5

of the crude reaction batches were generated by LC-HRMS/MS prior to being organized as a

molecular network (MN).10 The resulting MN was then further labelled by computer-generated

molecules obtained with MetWork.11 From an applied standpoint, these putative structures were

obtained based on foreseen chemical reactions that were first implemented into MetWork server

and their MS/MS spectra were further CFM-ID-predicted. As a last step, a similarity

comparison between theoretical and experimental MS/MS spectra is performed to annotate the

MN (Figure 2).

6

GNPS

A

B

C

7

Figure 2. Overview of the analytical workflow developed in this study along with the structures

of the obtained compounds.

Preparation of “artificial metabolomes”. Wang et al. biosynthetic hypothesis was

mimicked at the bench. Caffeine (1) was reacted under basic conditions (NaOH) and hydrolysis

was accelerated by heating (80°C). Mild acidic conditions were then imposed to favor the key

decarboxylating steps presumably leading to imidazoles 5 and 6 (Figure 1). Finally,

formaldehyde was added in the mixture which was allowed to slowly react at room temperature

for a dozen days. To get a wide insight into the generated species, the crude batch was profiled

by HPLC-HRMS/MS to be later organized as a molecular network. The later was further

labelled by in silico generated compounds obtained through MetWork pipeline as described in

the following section.

Analysis workflow of “artificial metabolomes”. As an implementation of the metabolome

consistency concept,12 MetWork was developed to anticipate new natural products from

untargeted tandem mass spectrometric analyses organized as molecular networks.10 For this

purpose, MetWork web server features (i) a collaborative library of (bio)chemical

transformations and (ii) a CFM-ID based MS/MS spectra prediction module. From a

dereplicated hit in the molecular network, in silico metabolization algorithms generate putative

structures prior to predicting the associated MS/MS spectra. At last, a similarity comparison

between the MS/MS spectra is performed to annotate the node. Based on aforementioned

literature reports on purines and imidazoles reactivity, the foreseen chemical reactions were

uploaded into MetWork web server: amide hydrolysis, urea hydrolysis, alcohol dehydration,

amine hydroxymethylation, enamine hydroxymethylation, carbamic acid decarboxylation,

imidazole decarboxylation, iminium amination, hemiaminal dehydration, aza-Michael addition

and Michael addition with enamine (Figure 3A) and are made available to users. This reaction

panel was applied in silico on caffeine (1), methylamine, and formaldehyde. This analytical

8

pipeline instantly tagged the seminal camellimidazoles B and C (3 and 4) along with most

intermediates proposed to intercede in camellimidazole formations and some putative

unprecedented structures such as methylene-bridged imidazole derivatives (Figure 3B).

9

Figure 3. A: Putative chemical scenario to camellimidazoles with detailed mechanisms. B:

MetWork-tagged cluster of the organic crude batch highlighting inferred intermediates to the

10

camellimidazole series and some anticipated new structures, highlighting prioritized nodes for

isolation and NMR structure elucidation.

Isolated compounds. Camellimidazoles B and C (3 and 4) were instantly MetWork-tagged

along with caffeidine (5), already proposed as a “biosynthetic” pillar en route to these former.

Mass spectrometric-streamlined isolation of these compounds validated their tentative identities

through the thorough analysis of the complete set of NMR spectra.5 Under our conditions,

camellimidazole A (2) seems not to have been obtained, the methylene bridge formation in this

latter may require a series of slower reactions (enamine versus nitrogen nucleophilic attack to

anchor the second imidazole unit).

Besides having pinpointed these expected structures, MetWork facilitated the annotation of a

further dozen nodes. Some of these putative structures were prioritized for isolation based on i)

the degree of scaffold originality ii) the possibility to be isolated for subsequent NMR structure

elucidation. Three methylene-containing compounds were indeed isolated and characterized,

they were subsequently named camellimidazoles D–F (7–9). The structure elucidation of these

new compounds is detailed in the Supporting Information and further validate the CANPA

approach proposed last year.13

Questions about the origin of camellimidazoles. As mentioned before, camellimidazoles

were isolated in minute amounts from tea following an unconventional and poorly rational

protocol.5 Efforts were made by the first authors to address this query, but the analytical

evidences to support the claimed genuineness of camellimidazoles A−C as natural products

(viz. UPLC-HR-ESI-MS of fresh leaves and processed materials) are poorly convincing (level

of confidence 4/5 according to Schymanski et al. i.e. “insufficient evidence to propose possible

structure”14). To benefit from a higher degree of analytical confidence, the extraction process

11

having led to the dubious characterization of camellimidazoles A−C in this seminal report were

repeated by us from two different Keemun black tea samples for subsequent HPLC-HRMS2

chemical profiling. The obtained tandem-mass spectrometric data were later organized as MNs

and dereplicated against (i) the experimental MS/MS spectra of all isolated camellimidazole

derivatives and (ii) the CFM-ID predicted fragmentation patterns of all intermediates formerly

shown to step in camellimidazole synthesis. No annotation in the obtained MNs confirmed the

lack of any camellimidazole formation and demonstrated a possible artefactual origin arising

from alkaline hydrolysis scheme as the (sole) plausible scenario to reach the camellimidazole

series (Figures S2-S5, Supporting Information).

Experimental clues for an artefactual origin. A last point to mitigate was the nature of the

mechanisms having led to camellimidazole frameworks in the specific phytochemical

workflow followed by Wang and coll., that did not use formaldehyde nor methylamine. The in

situ generation of methylamine can be related to the alkaline hydrolysis of caffeine,7 as

reasonably assumed by the authors. The claimed endogenous content of HCHO in Camelia

sinensis is, conversely questionable, especially when taking into account black tea

manufacturing process. This led us to infer that methylene chloride should have served as the

real methylene donor instead.15 With this in mind, liquid-liquid extraction conditions were

mimicked by stirring caffeine (1) in a biphasic system constituted of a basic aqueous phase

(NaOH, 2 eq.) and dichloromethane at room temperature for 3 days. Untargeted LC-MS²

analysis of the crude batch and subsequent MN revealed the occurrence of the camellimidazole

derivatives reported from our former reactive conditions (Figure S6, Supporting Information),

firmly establishing their artefactual origin.

12

Conclusion

Besides unveiling in a straightforward manner the likely occurrence of formerly reported

camellimidazoles and its postulated “biosynthetic” intermediates, the MetWork pipeline also

shed light on some unprecedented appendages. Mass spectrometric approaches offer interesting

perspectives owing to their high sensitivity. As such, tandem mass spectrometry-based

dereplicative strategies lie at the forefront of modern natural product discovery research and

proved their ability to provide a sharp glance into the chemical space encompassed by an

analyzed sample to further pinpoint and streamline new products from various natural sources.16

Conversely, as of 2020, the field of organic synthesis does not yet benefit from such advanced

mass spectrometric platforms and the degree of structural information conveyed by these

techniques remains most often limited to a poorly informative molecular formula. To overcome

this bottleneck, the CFM-ID predicted mass spectrometric fragmentation of in silico derivatized

reactants is herein demonstrated as a valuable strategy to straightforwardly pinpoint tentative

structures from the crude batch without the need for any standard nor even any former literature

report. The specific example of camellimidazoles might be ideally fitted to retrieve the full

extent of advantages offered by the MetWork pipeline since it involves (i) a set of limited and

simple organic molecules and (ii) a reduced number of possible chemical processes to afford a

diversity of more complex compounds through iterative metabolization steps. Accordingly,

MetWork workflow can be deemed of peculiar interest where chemical diversity stems from

such « minimalistic » conditions, e.g. bio-inspired cascade reactions17 or prebiotic chemistry-

related area.18

13

Supporting Information

Copies of 1D (1H and 13C) and 2D NMR spectra and structure determination for compounds 3,

4, 7−9 (PDF).

Corresponding authors

*E-mail: [email protected]; pierre.le-pogam-alluard@universite-

paris-saclay.fr

Author’s identification

ORCID:

Victor Turpin: 0000-0001-8141-275X

Grégory Genta-Jouve: 0000-0002-9239-4371

Mehdi A. Beniddir: 0000-0003-2153-4290

Adam Skiredj: 0000-0003-2160-5757

Pierre Le Pogam: 0000-0002-3351-4708

Erwan Poupon: 0000-0002-1178-6083

The authors declare no competing financial interest.

Acknowledgements

We thank the French “Agence Nationale pour la Recherche" (grant ANR-19-CE07-0002-01)

for funding this study.

14

References

(1) Sang, S.; Lambert, J. D.; Ho, C.-T.; Yang, C. S. The Chemistry and Biotransformation

of Tea Constituents. Pharmacol. Res. 2011, 64 (2), 87–99.

https://doi.org/10.1016/j.phrs.2011.02.007.

(2) Li, S.; Lo, C.-Y.; Pan, M.-H.; Lai, C.-S.; Ho, C.-T. Black Tea: Chemical Analysis and

Stability. Food Funct 2013, 4 (1), 10–18. https://doi.org/10.1039/C2FO30093A.

(3) Robertson, A.; Bendall, D. S. Production and HPLC Analysis of Black Tea Theaflavins

and Thearubigins during in Vitro Oxidation. Phytochemistry 1983, 22 (4), 883–887.

https://doi.org/10.1016/0031-9422(83)85016-X.

(4) Matsuo, Y.; Tanaka, T.; Kouno, I. Production Mechanism of Proepitheaflagallin, a

Precursor of Benzotropolone-Type Black Tea Pigment, Derived from Epigallocatechin

via a Bicyclo[3.2.1]Octane-Type Intermediate. Tetrahedron Lett. 2009, 50 (12), 1348–

1351. https://doi.org/10.1016/j.tetlet.2009.01.030.

(5) Wang, W.; Tang, X.; Hua, F.; Ling, T.-J.; Wan, X.-C.; Bao, G.-H. Camellimidazole A–

C, Three Methylene-Bridged Dimeric Imidazole Alkaloids from Keemun Black Tea.

Org. Lett. 2018, 20 (9), 2672–2675. https://doi.org/10.1021/acs.orglett.8b00878.

(6) Wang, W.; Zhu, B.-Y.; Wang, P.; Zhang, P.; Deng, W.-W.; Wu, F.-H.; Ho, C.-T.; Ling,

T.-J.; Zhang, Z.-Z.; Wan, X.-C.; Bao, G.-H. Enantiomeric Trimethylallantoin

Monomers, Dimers, and Trimethyltriuret: Evidence for an Alternative Catabolic

Pathway of Caffeine in Tea Plant. Org. Lett. 2019, 21 (13), 5147–5151.

https://doi.org/10.1021/acs.orglett.9b01750.

(7) Kigasawa, K.; Ohkubo, K.; Shimizu, H.; Kohagizawa, T.; Shoji, R. Decomposition and

Stabilization of Drugs. X. A New Decomposition Route of Caffeine in Alkaline

Solution. Chem. Pharm. Bull. 1974, 22 (10), 2448–2451.

https://doi.org/10.1248/cpb.22.2448.

(8) Itaya, T.; Ogawa, K. 3, 9-Dialkylhypoxanthines. Chem. Pharm. Bull. (Tokyo) 1985, 33

(5), 1906–1913. https://doi.org/10.1248/cpb.33.1906.

(9) Itaya, T.; Matsumoto, H. 3-Methylinosine. Chem. Pharm. Bull. (Tokyo) 1985, 33 (6),

2213–2219. https://doi.org/10.1248/cpb.33.2213.

(10) Yang, J. Y.; Sanchez, L. M.; Rath, C. M.; Liu, X.; Boudreau, P. D.; Bruns, N.;

Glukhov, E.; Wodtke, A.; de Felicio, R.; Fenner, A.; Wong, W. R.; Linington, R. G.;

Zhang, L.; Debonsi, H. M.; Gerwick, W. H.; Dorrestein, P. C. Molecular Networking

as a Dereplication Strategy. J. Nat. Prod. 2013, 76 (9), 1686–1699.

https://doi.org/10.1021/np400413s.

(11) Beauxis, Y.; Genta-Jouve, G. Metwork: A Web Server for Natural Products

Anticipation. Bioinformatics 2018. https://doi.org/10.1093/bioinformatics/bty864.

(12) Audoin, C.; Cocandeau, V.; Thomas, O.; Bruschini, A.; Holderith, S.; Genta-Jouve, G.

Metabolome Consistency: Additional Parazoanthines from the Mediterranean Zoanthid

Parazoanthus Axinellae. Metabolites 2014, 4 (2), 421–432.

https://doi.org/10.3390/metabo4020421.

(13) Fox Ramos, A. E.; Pavesi, C.; Litaudon, M.; Dumontet, V.; Poupon, E.; Champy, P.;

Genta-Jouve, G.; Beniddir, M. A. CANPA: Computer-Assisted Natural Products

Anticipation. Anal. Chem. 2019, 91 (17), 11247–11252.

(14) Schymanski, E. L.; Jeon, J.; Gulde, R.; Fenner, K.; Ruff, M.; Singer, H. P.; Hollender,

J. Identifying Small Molecules via High Resolution Mass Spectrometry:

Communicating Confidence. Environ. Sci. Technol. 2014, 48 (4), 2097–2098.

https://doi.org/10.1021/es5002105.

(15) Besselièvre, R.; Langlois, N.; Potier, P. Chlorure de Méthylène, Solvant Ou Réactif.

Bull Soc Chim Fr 1972, 4, 1477.

15

(16) Fox Ramos, A. E.; Evanno, L.; Poupon, E.; Champy, P.; Beniddir, M. A. Natural

Products Targeting Strategies Involving Molecular Networking: Different Manners,

One Goal. Nat. Prod. Rep. 2019, 36 (7), 960–980.

(17) See for example: Poupon, E.; Gravel, E. Manipulating simple reactive chemical units:

fishing for alkaloids from complex mixtures. Chem. Eur. J. 2015, 21 (30), 10604–

10615.

(18) Ruiz-Mirazo, K.; Briones, C.; de la Escosura, A. Prebiotic Systems Chemistry: New

Perspectives for the Origins of Life. Chem. Rev. 2014, 114 (1), 285–366.

https://doi.org/10.1021/cr2004844.

16

Table of content graphical abstract

SI-1

Caffeine Alkaline Hydrolysis and in Silico Anticipation Reveal the Origin of

Camellimidazoles

Victor Turpin,† Mehdi A. Beniddir,† Grégory Genta-Jouve,‡§ Adam Skiredj,† Jean-

François Gallard,# Karine Leblanc † Pierre Le Pogam,†* Erwan Poupon†*

† Université Paris-Saclay, CNRS, BioCIS, 92290, Châtenay-Malabry, France

‡ Laboratoire de Chimie-Toxicologie Analytique et Cellulaire (C-TAC), UMR CNRS

8038 CiTCoM, Université de Paris, 4, Avenue de l’Observatoire, 75006, Paris, France

§ Laboratoire Ecologie, Evolution, Interactions des Systèmes Amazoniens (LEEISA),

USR 3456, Université De Guyane, CNRS Guyane, 275 Route de Montabo, 97334

Cayenne, French Guiana

# Institut de Chimie des Substances Naturelles, CNRS, ICSN UPR 2301, Université

Paris-Saclay, 21 Avenue de la Terrasse, 91198 Gif-sur-Yvette, France

*Corresponding authors: [email protected]; pierre.le-pogam-

[email protected]

SI-2

Table of Content

I. General Information ....................................................................................................................... 3

II. Structure Elucidation of Camellimidazoles D–F (7–9). ................................................................... 5

III. Experimental Procedures and Spectroscopic Data for Compounds 3–5 and 7–9 ..................... 7

IV. NMR spectra of isolated compounds 3–5 and 7–9 .................................................................. 13

V. Comparison of the Spectra and Data of Synthetic and Natural Products 3 and 4 ....................... 42

VI. Screening For Camellimidazoles and Its Synthetic Intermediates in Keemun Black Teas ....... 48

VII. Molecular Network Generated Using MS/MS Data From The Crude Batch Obtained From

Caffeine in Conditions Mimicking Wang Phytochemical Workflow ...................................................... 52

VIII. GNPS Library Spectrum Accession Codes of the Isolated Compounds .................................... 53

IX. References ................................................................................................................................ 54

SI-3

I. General Information

General Experimental Procedures

1D and 2D NMR spectra were recorded using either a Bruker AM-300 (300 MHz), a Bruker

AM-400 (400 MHz) fitted with a BBFO probe or a AM-600 (600 MHz) fitted with a TCI cryogenic

probe. Chemical shifts were referenced to the solvent residual signals. IR spectra were

recorded with a PerkinElmer type 257 spectrometer. An Armen Instrument spot liquid

chromatography flash device equipped with a Silica 120 g Grace cartridge was used for flash

chromatography. TLC analyses were carried out on precoated silicagel 60 F254 plates (Merck).

The preparative TLC plates were glass backed (20 x 20 cm) and normal phase silica gel plates

(250 µm thickness) purchased from Macherey-Nagel (Düren, Germany). The preparative TLC

separations were performed in presaturated development chambers, and the plates were rerun

in the same solvent system until correct separation of the closely related imidazole derivatives

could be achieved. Subsequently, the silica was scraped off the plate, and the bands of interest

were desorbed in DCM/MeOH (95/5) for 30 min. Chemicals and solvents were purchased from

Sigma-Aldrich. Qimen Da Bie and Keemun Mao Feng black tea samples were obtained from

“Palais des Thés” purchased at “Le Temps d’un Rêve”, Antony, France.

Mass Spectrometry and Instrument Settings

Samples were analyzed using an Agilent LC-MS (Liquid Chromatography Mass Spectrometry)

system composed of an Agilent 6530 ESI-Q-TOF-MS (Electrospray Ionization Quadrupole

Time of Flight Mass Spectrometry), fitted with a Sunfire analytical C18 column (150 x 2.1 mm,

i. d. 3.5 µm, Waters) operating in positive ionization mode. HPLC analyses were performed by

gradient elution using the following parameters : A (0.1% formic acid in water) and B

(acetonitrile) ; T, 0-2 min, 5% B ; 2-30 min, 100% B linear ; 30-40 min, 100 %B ; 40-41 min,

5% B. ESI conditions were set with the capillary temperature at 320°C, source voltage at 3.5

kV, and a sheath gas flow rate at 10 L/min. Injection volume was set at 5 µL. The mass

spectrometer was operated in Extended Dynamic Range (2 GHz). There were four scan

events: positive MS, window from m/z 100−1200, then three data-dependent MS/MS scans of

the first, second, and third most intense ions from the first scan event. MS/MS settings were

the following: three collision energies (10, 25 and 40 eV), default charge of 1, isolation width

of m/z 2. Purine (C5H4N4, m/z 121.050873), and HP-0921 (hexakis(1 H, 1 H, 3 H-

tetrafluoropropoxy)-phosphazene C18H18F24N3O6P3, m/z 922.009798 were used as internal

lock masses. Full scans were acquired at a resolution of 10 000 (m/z 922) and 4000 (m/z 121).

A permanent MS/MS exclusion list criterion was established to prevent oversampling of the

internal calibrant.

Molecular Networking Parameters

The molecular network MN was generated from the crude batch obtained throughout the

experimental conditions. Data were converted from the standard Agilent data-format .d to the

.mgf format (Mascot Generic Format) using the MassHunter Workstation software from Agilent.

The converted data files were processed following the MN strategy developed by Dorrestein

et al 1. The MN was obtained following the GNPS pipeline (http://gnps.ucsd.edu). Analyses

results and parameters for the network can be accessed via

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=59a823e19b954be48f1e4a4fd9af2a10.

SI-4

The following settings were used for MN generation; minimum pairs cos, 0.6 ; precursor ion

mass tolerance, 0.02 Da ; fragment ion mass tolerance, 0.02 Da ; network topK, 10 ; minimum

matched peaks, 6 ; minimum cluster size, 2; Run MSCluster On.

MNs obtained from black tea samples were generated through the same workflow and are

made accessible at the URL provided in section VI. The retained settings were identical except

that the clustered spectra were searched against the standards with a decreased cosine

threshold value of 0.2 given that the experimental MS2 spectra were partly compared to CFM-

ID predicted fragment patterns that only approximate experimental one.2,3 Regarding our

specific investigation, this threshold is known to enable an efficient MN-tagging for having

correctly annotated the alkaline hydrolysis-derived MN (Figure 3B). To better outline possible

matches between the investigated phytochemical material and the standards, the Run

MSCluster option was turned off.

The molecular networking data were analyzed and visualized using Cytoscape 3.5.1.4

Metwork Annotation Parameters

MetWork tagging was performed using the online webserver available at

https://metwork.pharmacie.parisdescartes.fr. Computed structures were metabolized 15 times

in a row (depth limit of 15) following an implemented set of envisaged reaction schemes

inspired from purine alkaline hydrolysis conditions, namely, amide hydrolysis, urea hydrolysis,

alcohol dehydration, amine hydroxymethylation, enamine hydroxymethylation, carbamic acid

decarboxylation, imidazole decarboxylation, iminium amination, hemiaminal dehydration, aza-

Michael addition and Michael addition with enamine. Metabolization networks were then

created using a m/z tolerance of 0.02, with a cosine threshold at 0.20 and more than 1 matched

peak. Derived structures were obtained from caffeine with formaldehyde as readily available

reactants. Some in situ generated reactants illuminated throughout this first MetWork run were

then unlocked as readily available reactants in further in silico metabolizing step.

SI-5

II. Structure Elucidation of Camellimidazoles D–F (7–9).

Camellimidazole D (7) was isolated as a white amorphous solid. A molecular formula of

C10H17N5O was assigned based on the [M+H]+ signal at m/z 224.1506 (calcd for C10H18N5O,

224.1506, 0 ppm) corresponding to an hydrogen deficiency index of 5. The 1H NMR spectrum

displayed one aromatic/olefinic proton at H 7.12 (1H, s, H-2), and four methyl groups at H

3.42 (3H, s, H3-6), 2.86 (3H, s, H3-15), 2.80 (3H, s, H3-13) and 2.41 (3H, s, H3-14). Some

further resonances arose as extremely broad, low intensity signals spanning from δH 3.87 to

4.27, probably due to an intramolecular mobility. To remedy, compound 7 was subjected to

variable temperature NMR experiments (VT-NMR) with the temperature being gradually

decreased from 300 K to a final temperature of 243 K. These latter conditions revealed two

sharp sets of diastereotopic methylenic protons at H 3.48 and 4.33 (1 H each, d, J = 12.7 Hz)

and at H 3.50 and 4.16 (1 H each, d, J = 14.0 Hz). The 13C NMR and HSQC spectra revealed

10 carbons resonances comprising one carbonyl carbon (C 163.8) and two sp2 quaternary

carbons (C 154.0, and 105.4). These spectroscopic landmarks were consistent with the

tentatively MetWork-generated compound. The HMBC crosspeaks from the aromatic proton

at H 7.12 (H-2) with C 154.0 (C-5) and 105.4 (C-4) and from the methyl protons at H 3.69

(H3-6) to C 137.8 (C-2) and 105.4 (C-4) defined a 1-methylimidazole moiety. The structural

features elucidated so far stand for 4 indices of hydrogen deficiency requiring a supplementary

ring to be introduced to account for the last remaining degree of unsaturation. According to the

HMBC correlation of H 3.15 with both C 154.0 and C 70.5 (C-11), an -NHCH3 group was

determined to connect the imidazole ring with a first diastereotopic methylene group. A second

-NHCH3 moiety was determined through the HMBC crosspeaks from the methyl protons

resonating at H 2.41 to C-11 and a second diastereotopic methylene at C 68.9 (C-9). A third

and last -NHCH3 group was determined to connect C-9 and a carbonyl carbon based on the

HMBC correlations from the proton signal at H 2.41 to C-9 and to the carbon occurring at C

163.8 (C-7). Owing to connectivity constraints and molecular formula requirements, the C-4/C-

7 linkage could be established. This deduction was further backed up by the chemical shift

value of N-8 (N 106.9) that was diagnostic of an amide moiety, and further supported by the

full set of 1H/15N HMBC crosspeaks outlined in Figure S1.5 The determined structure was

further supported by the excellent agreement between its 13C NMR spectral data and those of

structurally related 1-methyl-2-formamide imidazole analogues.6,7 These spectroscopic

evidence confirmed the MetWork-anticipated structure of camellimidazole D (7), as indicated

in Figure S1.

Figure S1. Key HMBC correlations of camellimidazoles D–F (7−9)

SI-6

Camellimidazole E (8) was obtained, along with a few impurities, as a faint yellow amorphous

solid with a molecular formula of C8H12N4O, established by HRESIMS m/z 181.1093 [M+H]+

(calcd for C8H13N4O, 181.10839, Δ 5.02 ppm) differing from 7 by the loss of -CH2NCH3.

Accordingly, both the 13C and 1H NMR spectroscopic data were highly reminiscent of those of

8 except for the lack of these signals. Long range heteronuclear correlations confirmed the

structure of 8 as a 6-membered ring analogue of camellimidazole D as outlined in Figure 2, in

line with MetWork’s assumption.

Camellimidazole F (9). The targeted isolation of m/z 349.2 resulted in obtaining 9 occurring

within a fraction with caffeidine (5) as the major compound. Its molecular formula, from the

protonated molecular peak at m/z 349.2092 (calcd. for C15H25N8O2, 349.2095, Δ 0.86 ppm) in

HRESIMS, determined it to be an isomer of camellimidazoles A and C (2 and 4). Both the 1H

and 13C NMR spectroscopic data of camellimidazole F were very similar to those of caffeidine,

with the only prominent difference being the arising of an additional methylenic signal at H

4.38 (2H, s). Jointly with its molecular formula, this led us to surmise that camellimidazole F

consisted of two symmetrical structural units tethered with a methylene group. The 13C and 1H

NMR chemical shift values of this methylene were consistent with its being entangled between

two aromatic ring-linked -NHCH3 groups.8 This deduction was further confirmed by the HMBC

correlations originating from the methylenic protons at H 4.38 to C 39.2 (C-11/11′) and 151.4

(C-5/5′) (Figure S1).

SI-7

III. Experimental Procedures and Spectroscopic Data for

Compounds 3–5 and 7–9

To a solution of 1 M aq NaOH (102 mmol, 102 mL), was added caffeine (1) (51.5 mmol, 10

g). After 10 minutes stirring at 80°C, cloudy solution became clear. The mixture stirred at 80°C

for 3.3 hours and was cooled in an ice bath. Then 37% aq solution of formaldehyde (136 mmol

10.2 mL,) was added. After 1 hour of stirring, a solution of 2 M aq HCl (80 mmol, 40 mL) was

added until pH became ~5. Color mixture turned from colorless to slight yellow and gaz

production was observed. After stirring 1 hour with strong agitation, 40% aq solution of

methylamine (348 mmol, 30 mL) was added until pH ~9. After stirring open flask at room

temperature for 12 days, Na2CO3 sat aq solution was added until pH~10. Mixture was extracted

with 3 x 180 mL of chloroform. Combined organic layers were dried over Na2SO4, filtered and

concentrated under reduced pressure to afford a yellow lacquer crude.

Keemun black tea extraction for camellimidazole screening

To confirm or infirm the genuine occurrence based on a more reliable analytical workflow, a

sample of Keemun black tea was extracted exactly as reported by Wang et al.6

Purification of compounds 3–5 and 7–9

Crude batch dry residue was submitted to flash chromatography using a silica 120g Grace cartridge with a gradient of DCM-MeOH (100:0 to 20:80) at 100 mL/min to afford 6 fractions (F1-6), including camellimidazole C (4) (59.2 mg, 0.2% yield) and D (7) (1.52 g, 7% yield). Fraction F4 (21.8 mg) was separated by preparative TLC in a DCM/MeOH (9/1) solvent system to afford camellimidazole E (8) (1.9 mg, 0.01% yield) and a mixture of camellimidazole F (9) /caffeidine (5) (2.1 mg, 0.006% yield). Fraction F6 (331.5 mg) was also selected for further purification. A part of the residue (50 mg) was submitted to preparative TLC using the same solvent system. Three consecutive runs were required to achieve a convenient separation of the compounds, leading to the isolation of camellimidazole B (3) (1.7 mg, 0.004% yield).

SI-8

Camellimidazole B (3):

* IUPAC name: 1,1'-(((methylazanediyl)bis(methylene))bis(1-methyl-1H-imidazole-5,4-

diyl))bis(1,3-dimethylurea).

* Physical state: slight yellow lacquer

* Rf: 0.1 (CH2Cl2/MeOH 9/1).

* HRMS (ESI+) m/z: calculated for C17H30N9O2+ [M+H]+ 392.2517, found 392.2503 ( ppm

3.57).

* 1H NMR (400 MHz, methanol-d4): 7.35 (s, 2H), 4.90 (d, J = 4.5 Hz, 2H), 3.58 (s, 6H), 3.39

(s, 4H), 3.17 (s, 6H), 2.69 (d, J = 4.6 Hz, 6H), 2.05 (s, 3H).

* 13C NMR (100 MHz, methanol-d4): 158.5 (2C), 140.6 (2C), 135.8 (2C), 121.5 (2C), 49.2

(2C), 41.6, 36.4 (2C), 32.2 (2C), 27.4 (2C).

Camellimidazole C (4):

* IUPAC name: 4-(((4-(1,3-dimethylureido)-1-methyl-1H-imidazol-5-yl)methyl)(methyl)amino)-

N,1-dimethyl-1H-imidazole-5-carboxamide.

* Physical state: yellow lacquer

* Rf: 0.25 (CH2Cl2/MeOH 9/1).

* HRMS (ESI+) m/z: calculated for C15H25N8O2+ [M+H]+ 349.2095, found 349.2084 ( ppm

3.15).

* 1H NMR (400 MHz, methanol-d4): 7.59 (s, 1H), 7.56 (s, 1H), 4.07 (s, 2H), 3.84 (s, 3H), 3.66

(s, 3H), 2.93 (s, 3H), 2.86 (s, 3H), 2.73 (s, 3H), 2.64 (s, 3H).

* 13C NMR (100 MHz, methanol-d4): 162.7, 160.5, 152.6, 140.2, 139.7, 137.8, 123.6, 117.6,

48.8, 43.8, 36.6, 35.4, 32.6, 27.6, 25.7.

Camellimidazole D (7):

* IUPAC name: 1,4,6,8-tetramethyl-1,4,5,6,7,8-hexahydro-9H-imidazo[4,5-f][1,3,5]triazocin-

9-one.

* IR (υmax): 3365, 2961, 2927, 2901, 2818, 1495, 1034 cm-1

* Physical state: slight yellow amorphous solid.

* Rf: 0.55 (CH2Cl2/MeOH 9/1).

* HRMS (ESI+) m/z: calculated for C10H18N5O+ [M+H]+ 224.1506, found 224.1506 ( ppm 0).

* 1H NMR (600 MHz, chloroform-d) at 243 K: 6.91 (s, 1H), 4.33 (d, J = 12.6 Hz, 1H), 4.16 (d,

J = 14.0 Hz, 1H), 3.50 (d, J = 14.1 Hz, 1H), 3.48 (d, J = 12.8 Hz, 1H), 3.42 (s, 3H), 2.86 (s,

3H), 2.80, (s, 3H), 2.15 (s, 3H).

* 13C NMR (150 MHz, chloroform-d) at 243 K: 163.1, 153.2, 137,2 104.4, 69.8, 68.2, 39.2,

38.7, 34.6, 34.3.

* 15N NMR (60 MHz, chloroform-d) at 300 K: 231.5, 159.7, 106.9, 55.4, 42.2.

SI-9

NMR Data Assignments (chloroform-d) 243 K:

Camellimidazole D (7) (CDCl3)

No. δH (J, Hz)a,b δCc δN

d,e

1 - - 231.5

2 6.91, s 137.2 -

3 - - 159.7

4 - 104.4 -

5 - 153.2 -

6 3.42, s 34.6 -

7 - 163.1 -

8 - - 106.9

9 3.48, d (12.7) 68.2 -

4.33, d (12.7)

10 - - 42.2

11 3.50, d (14.0) 69.8 -

4.16, d (14.0)

12 - - 55.4

13 2.80, s 34.3 -

14 2.15, s 38.7 -

15 2.86, s 39.2 -

a Recorded at 300 MHz, b Recorded at 243 K c Recorded at 75 MHz, d Recorded at 60 MHz, e Inverse detection from heteronuclear sequences

SI-10

Camellimidazole E (8):

* IUPAC name: 1,3,7-trimethyl-1,2,3,7-tetrahydro-6H-purin-6-one.

* Physical state: yellow lacquer

* Rf: 0.45 (CH2Cl2/MeOH 9/1).

* HRMS (ESI+) m/z: calculated for C8H13N4O+ [M+H]+ 181.1084, found 181.1096 ( ppm 6.68).

* 1H NMR (400 MHz, methanol-d4): 7.53 (s, 1H), 4.70 (s, 2H), 3.87 (s, 3H), 2.98 (s, 3H), 2.74

(s, 3H).

* 13C NMR (100 MHz, methanol-d4): 166.3, 151.3, 139.7, 117.3, 70.9, 40.7, 35.4, 35.6.

NMR Data Assignments (methanol-d4, 400/100 MHz)):

No. δH (J, Hz) δC

1 - -

2 7.53, s 139.7

3 - -

4 - 117.3

5 - 151.3

6 3.87, s 35.4

7 - 166.3

8 - -

9 4.70, s 70.9

10 - -

11 2.98, s 35.6

12 2.74, s 40.7

13 - -

14 - -

15 - -

SI-11

Camellimidazole F (9):

* IUPAC name: 4,4'-(methylenebis(methylazanediyl))bis(N,1-dimethyl-1H-imidazole-5-

carboxamide).

* Physical state: yellow lacquer

* Rf: 0.45 (CH2Cl2/MeOH 9/1).

* HRMS (ESI+) m/z: calculated for C15H25N8O2+ [M+H]+ 349.2095, found 349.2092 ( ppm

0.86).

* 1H NMR (400 MHz, methanol-d4): 7.53 (s, 2H), 4.38 (s, 2H), 3.87 (s, 6H), 2.89 (s, 6H), 2.83

(s, 6H).

* 13C NMR (100 MHz, methanol-d4): 163.0 (2C), 151.4 (2C), 139.2 (2C), 116.4 (2C), 89.6,

39.2 (2C), 35.4 (2C), 25.7 (2C).

NMR Data Assignments (methanol-d4, 400/100 MHz):

No. δH (J, Hz) δC

2,2′ 7.53, s 139.2

4,4′ - 116.4

5,5′ - 151.4

6,6′ 3.87, s 35.4

7,7′ - 163.0

9,9′ 2.89, s 25.7

11,11′ 2.83, s 39.2

12 4.38, s 89.6

SI-12

Caffeidine (5):

* IUPAC name: N,1-dimethyl-4-(methylamino)-1H-imidazole-5-carboxamide

* Physical state: yellow lacquer

* Rf: 0.45 (CH2Cl2/MeOH 9/1).

* HRMS (ESI+) m/z: calculated for C7H13N4O+ [M+H]+ 169.1084, found 169.1086 ( ppm 1.24).

* 1H NMR (400 MHz, methanol-d4): 7.36 (s, 1H), 3.79 (s, 3H), 2.87 (s, 3H), 2.86 (s, 3H).

* 13C NMR (100 MHz, methanol-d4): 164.4, 154.6, 139.2, 108.1, 34.9, 31.0, 26.2.

NMR Data Assignments (methanol-d4, 400/100 MHz):

No. 1H δ (ppm) Integration Multiplicity J (Hz) 13C δ (ppm)

9 2.86 3 s - 26.2

11 2.87 3 s - 31.0

6 3.79 3 s - 34.9

4 - 108.1

2 7.36 1 s - 139.2

5 - 154.6

7 164.4

SI-13

IV. NMR spectra of isolated compounds 3–5 and 7–9

1H NMR spectrum of 3 (400 MHz, methanol-d4)

SI-14

13C NMR spectrum of 3 (100 MHz, methanol-d4)

HRESIMS of 3

SI-15

HSQC NMR spectrum of 3 (400/100 MHz, methanol-d4).

SI-16

HMBC NMR spectrum of 3 (400/100 MHz, methanol-d4).

SI-17

Expanded view of the HMBC NMR spectrum of 3 (400/100 MHz, methanol-d4)

12

and

12′

12 and 12′

SI-18

1H NMR spectrum of 4 (400 MHz, methanol-d4).

SI-19

13C NMR spectrum of 4 (100 MHz, methanol-d4)

HRESIMS of 4

SI-20

HSQC NMR spectrum of 4 (400/100 MHz, methanol-d4)

HMBC NMR spectrum of 4 (400/100 MHz, methanol-d4)

SI-21

1H NMR spectrum of 7 (600 MHz, chloroform-d) at 243 K

SI-22

VT-NMR experiments of 7 showing the sharpening of the diastereotopic methylenic signals as

a consequence of temperature decrease

243 K

273 K

298 K

SI-23

13C NMR spectrum of 7 (100 MHz, chloroform-d)

HRESIMS of 7

SI-24

HSQC NMR spectrum of 7 (600/100 MHz, chloroform-d)

SI-25

HSQC NMR spectrum of 7 (600/150 MHz, chloroform-d) at 243 K.

SI-26

HMBC NMR spectrum of 7 (600/150 MHz, chloroform-d)

SI-27

15N HMBC NMR spectrum of 7 (600/60 MHz, chloroform-d)

SI-28

1H NMR spectrum of 8 (400 MHz, methanol-d4).

SI-29

13C NMR spectrum of 8 (100 MHz, methanol-d4)

HRESIMS of 8

SI-30

HSQC NMR spectrum of 8 (400/100 MHz, methanol-d4)

SI-31

HMBC NMR spectrum of 8 (400/100 MHz, methanol-d4)

SI-32

Expanded view of HMBC NMR spectrum of 8 (400/100 MHz, methanol-d4)

9 2

12

5

SI-33

Expanded view of HMBC NMR spectrum of 8 (400/100 MHz, methanol-d4)

6

2

4

SI-34

1H NMR spectrum of 9 (400 MHz, methanol-d4)

SI-35

13C NMR spectrum of 9 (100 MHz, methanol-d4)

HRESIMS of 9

SI-36

HSQC NMR spectrum of 9 (400/100 MHz, methanol-d4)

HMBC NMR spectrum of 9 (400/100 MHz, methanol-d4)

SI-37

Expanded view of the HMBC NMR spectrum of 9 (400/100 MHz, methanol-d4).

11 9

5

7

11

and

11′ 12

5 and 5′

2 and 2′

SI-38

1H NMR spectrum of 5 (400 MHz, methanol-d4)

SI-39

13C NMR spectrum of 5 (100 MHz, methanol-d4)

HRESIMS of 5

SI-40

HSQC NMR spectrum of 5 (400/100 MHz, methanol-d4).

HMBC NMR spectrum of 5 (400/100 MHz, methanol-d4).

SI-41

Expanded view of the HMBC NMR spectrum of 5 (400/100 MHz, methanol-d4).

2 9

4

6

SI-42

V. Comparison of the Spectra and Data of Synthetic and

Natural Products 3 and 4

The spectra and data of synthetic camellimidazole B (3) and camellimidazole C (4) were

compared to those of the natural products isolated by Wang et al.6 from Keemum black tea.

1. Camellimidazole B (3)

1H NMR spectrum of 3 (400 MHz, chloroform-d)

Synthetic mixture

Natural (Wang et al.)

600 MHz, chloroform-d

SI-43

13C NMR spectrum of 3 (100 MHz, chloroform-d)

Natural (Wang et al.)

125 MHz, chloroform-d

Synthetic mixture

SI-44

Comparison of natural/synthetic camellimidazole B (chloroform-d).6

Assignment Natural from Wang et al.6

Synthetic Natural from

Wang et al.6 Synthetic

1H δ (ppm) Δ ppm 13C δ (ppm) Δ ppm

2, 2′ 7.34 7.34 - 135.8 135.8 0

4, 4′

140.6 140.6 0

5, 5′

121.5 121.5 0

6, 6′ 3.57 3.58 +0.01 32.2 32.2 0

8, 8′ 3.16 3.17 +0.01 36.4 36.4 0

9, 9′

158.5 158.5 0

10, 10′ 4.89 4.90 +0.01

11, 11′ 2.68 2.69 +0.01 27.4 27.4 0

12, 12′ 3.38 3.39 +0.01 49.2 49.2 0

14 2.04 2.05 +0.01 41.6 41.6 0

SI-45

2. Camellimidazole C (4)

1H NMR spectrum of 4 (400 MHz, DMSO-d6)

Natural (Wang et al.)

600 MHz, DMSO-d6

Synthetic

SI-46

13C NMR spectrum of 4 (100 MHz, DMSO-d6)

Natural (Wang et al.)

125 MHz, DMSO-d6

Synthetic

SI-47

Comparison of natural/synthetic camellimidazole C (DMSO-d6).6

Assignment Natural from Wang et al.6

Synthetic Natural from

Wang et al.6 Synthetic

1H δ (ppm) Δ ppm 13C δ (ppm) Δ ppm

2 7.49 7.51 +0.02 136.4 136.4 0

4

139.9 140.0 +0.1

5

121.5 121.5 0

6 3.53 3.55 +0.02 32.0 32.1 +0.1

8 2.81 2.85 +0.04 36.0 36.1 +0.1

9

158.1 158.1 0

10 5.66 5.66

11 2.47 2.49 +0.02 27.6 27.6 0

12 3.90 3.92 +0.02 47.9 47.9 0

14 2.51 2.53 +0.02 43.4 43.5 +0.1

15

151.9 152.0 +0.1

17 7.56 7.58 +0.02 138.3 138.4 +0.1

19

115.4 115.4 0

20 3.72 3.74 +0.02 34.7 34.8 +0.1

21

160.8 160.8 0

22 8.04 8.05 +0.01

23 2.72 2.74 +0.02 25.6 25.6 0

SI-48

VI. Screening For Camellimidazoles and Its Synthetic

Intermediates in Keemun Black Teas

Figure S2. Full MN realized using MS/MS data from the organic fraction of the Qimen Da Bie

black tea sample as phytochemically processed by Wang et al. (section ‘UHPLC-HR-ESI-MS

detection of 3 and 4 in the extract of the fresh leaves and the processed materials of Keemun

black tea, p. S-5), the reference compounds 3, 4, 7−9, and the CFM-ID-predicted tandem mass

spectrometric data for the synthetic intermediates disclosed in Figure 3A. The merging of

seemingly similar MS/MS spectra across different samples was precluded by switching off the

Run MS Cluster option and the cosine similarity cutoff for the molecular network was set at

0.2, owing to some experimental vs theoretical MS/MS data comparison. In these conditions,

the lack of edges instigated between reference ions and the ions detected from the

phytochemically processed material straightforwardly determined their absence in Qimen Da

Bie Black Tea sample.

For validation, a second tea sample was considered in the frame of this study, viz. Keemun

Mao Feng. The MN of its organic phase, obtained through the same molecular networking

parameters is displayed in Figure S3, also revealing the lack of all camellimidazoles and their

putative precursors.

SI-49

Figure S3. Full MN realized using MS/MS data from the organic fraction of the Keemun Mao

Feng Black Tea sample. Standards and MN parameters are identical to those reported in

Figure S2.

To further ascertain the absence of these compounds, these substances were screened for in

the aqueous phase and in the crude ethanolic extract, with MNs being subsequently generated

against the same standards. Two different brands of Keemun Black Tea were considered.

Although not being displayed in the supporting material, the obtained MN can be accessed at

the URL links collated in Table S1.

Table S1. URL Links related to the MNs generated from the various extracts and fractions of

interest for Qimen Da Bie and Keemun Mao Feng Black Tea.

Qimen Da Bie Keemun Mao Feng

Organic Phase 03616eb367354949b45537708eed24d4 e2bc843567d2463babe3de3e523236a7

Aqueous Phase 5d977a769d5645648b3ff534f44b048a 439268d799f8454e8f1b64e4e297ec94

Crude Ethanolic Extract

e03e5da4eaa14d2794d2dcc1a4a39777 16f2779e6c8d4302a81082c33b85c637

Assuming that the sought-after camellimidazoles may have occurred in too scarce amounts to be selected for MS/MS fragmentation, the signals corresponding to their protonated molecules were searched in all Keemun Black tea-derived extracts/fractions. The resulting Reconstructed Ion Chromatograms (RIC), garnered in Figures S4 and S5 reveal the lack of any compatible [M+H]+ signal in all of them.

SI-50

Figure S4. Extracted Ion Chromatograms (EICs) for the protonated molecular signal of camellimidazole B (3) (m/z 392.2) in a reference sample, and in Keemun Black Tea-derived extracts/fractions.

SI-51

Figure S5. Extracted Ion Chromatograms (EICs) for the protonated molecular signal of camellimidazole C (4) (m/z 349.2) in a reference sample, and in Keemun Black Tea-derived extracts/fractions.

SI-52

VII. Molecular Network Generated Using MS/MS Data From

The Crude Batch Obtained From Caffeine in Conditions

Mimicking Wang Phytochemical Workflow

Figure S6. Full MN realized using MS/MS data from the crude organic batch obtained from

caffeine (1) processing in a biphasic system consisting of a basic aqueous

phase/dichloromethane, after 3 days of stirring at room temperature. MN parameters: Min Pair

Cos 0.6 ; NetWork TopK 10 ; Minimum Matched Fragment Ions 6 ; MSCluster On. Library

Search Option: Minimum Matched Peaks 6 ; Score Threshold 0.7. The MN can be accessed

at the following ID: 70383feb0d154ef89cadf6feb59de085.

SI-53

VIII. GNPS Library Spectrum Accession Codes of the Isolated

Compounds

Compound GNPS Accession Code

Camellimidazole B (3) CCMSLIB00005464623 Camellimidazole C (4) CCMSLIB00005464624 Camellimidazole D (7) CCMSLIB00005691871 Camellimidazole E (8) CCMSLIB00005691872 Camellimidazole F (9) CCMSLIB00005691920

Caffeidine (5) CCMSLIB00005464625

SI-54

IX. References

(1) Yang, J. Y.; Sanchez, L. M.; Rath, C. M.; Liu, X.; Boudreau, P. D.; Bruns, N.; Glukhov,

E.; Wodtke, A.; de Felicio, R.; Fenner, A.; Wong, W. R.; Linington, R. G.; Zhang, L.; Debonsi,

H. M.; Gerwick, W. H.; Dorrestein, P. C. Molecular Networking as a Dereplication Strategy.

Journal of Natural Products 2013, 76 (9), 1686–1699. https://doi.org/10.1021/np400413s.

(2) Cabral, R. S.; Allard, P.-M.; Marcourt, L.; Young, M. C. M.; Queiroz, E. F.; Wolfender,

J.-L. Targeted Isolation of Indolopyridoquinazoline Alkaloids from Conchocarpus

Fontanesianus Based on Molecular Networks. J. Nat. Prod. 2016, 79 (9), 2270–2278.

https://doi.org/10.1021/acs.jnatprod.6b00379.

(3) Ramos, A. E. F.; Evanno, L.; Poupon, E.; Champy, P.; Beniddir, M. A. Natural Products

Targeting Strategies Involving Molecular Networking: Different Manners, One Goal. Nat. Prod.

Rep. 2019, 36 (7), 960–980. https://doi.org/10.1039/C9NP00006B.

(4) Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N. S.; Wang, J. T.; Ramage, D.; Amin, N.;

Schwikowski, B.; Ideker, T. Cytoscape: A Software Environment for Integrated Models of

Biomolecular Interaction Networks. Genome Res. 2003, 13 (11), 2498–2504.

https://doi.org/10.1101/gr.1239303.

(5) Marek, R.; Lycka, A. 15N NMR Spectroscopy in Structural Analysis. COC 2002, 6 (1),

35–66. https://doi.org/10.2174/1385272023374643.

(6) Wang, W.; Tang, X.; Hua, F.; Ling, T.-J.; Wan, X.-C.; Bao, G.-H. Camellimidazole A–

C, Three Methylene-Bridged Dimeric Imidazole Alkaloids from Keemun Black Tea. Org. Lett.

2018, 20 (9), 2672–2675. https://doi.org/10.1021/acs.orglett.8b00878.

(7) Kumar, R.; Wacker, C. D.; Mende, P.; Spiegelhalder, B.; Preussmann, R.; Siddiqi, M.

Caffeine-Derived N-Nitroso Compounds. II. Synthesis and Characterization of Nitrosation

Products from Caffeidine and Caffeidine Acid. Chemical Research in Toxicology 1993, 6 (1),

50–58. https://doi.org/10.1021/tx00031a008.

(8) Frogneux, X.; Blondiaux, E.; Thuéry, P.; Cantat, T. Bridging Amines with CO2:

Organocatalyzed Reduction of CO2 to Aminals. ACS Catal. 2015, 5 (7), 3983–3987.

https://doi.org/10.1021/acscatal.5b00734.