
Page 1: Computer-Assisted Applications for the Practicing Chemist (acscinf.org/docs/meetings/222nm/presentations/222nm60.pdf)

9/19/2001, Page 1

2001 Herman Skolnik Award Symposium

Computer-Assisted Applications for the Practicing Chemist

Thirty years of computer-assisted applications for the synthetic chemist: Experiences of a non-programmer

222nd National Meeting of the American Chemical Society
August 26 – 30, Chicago, IL

Guenter Grethe

Page 2

It all started here...

1961

…in the basement of the chemistry building at the Technical University Braunschweig

[Figure: chemical structure drawing; substituent labels shown: Cl, OCH3, CH3, OH, CONH2]

Page 3

Princeton Computer Chemistry Laboratory (1972)

DEC rules!

W. Todd Wipke and Robert Langridge

Entering the world of computer-assisted synthesis planning and 3D modeling of small and large molecules

…and seeing the light…

Page 4

NATO Advanced Study Institute, Noordwijkerhout, Netherlands - 1973

Organizers: W. Todd Wipke, Stephen R. Heller, Richard J. Feldmann, Ernest Hyde

The education continues…

Page 5

The twilight zone

Chemistry, wet or dry?

…but it has to be synthetic chemistry!

ca. 1975

Page 6

CASP at ROCHE

Page 7

Convincing management – preaching the gospel

CHEMIST

CAS, ISI, etc. ONLINE

SCIENTIFIC LIBRARY

REACTION DATABASES

SYNTHESIS PLANNING

COMMERCIAL CHEMICALS

CORPORATE DATABASES

Page 8

The result: MACCS and REACCS at ROCHE and…

Applications

“You want to use a HONEYWELL? – No problem, the programs were developed on a PRIME”

Yeah, sure. REACCS never ran on the HONEYWELL!!!

The solution: ROCHE bought a VAX!

Page 9

…our own little computer room

…developing MACCS-based applications for the scientists.

User friendly???

Page 10

1985 The Big Change!
  move to California
  working for a young and exciting company
  working in an area that interested me most
  traveling, seminars and ‘preaching the gospel’ to my peers
  being closer to the fulfillment of my dreams

The challenge:

Eliminate the obstacles that prevent synthetic chemists from using available tools AND from using them effectively.

Add:
  Re-define ‘User-friendly’
  Seamless integration of data

Example: Managing reaction information

ca. 1980

Page 11

What are the major problems and potential solutions?

Major problems
  Query formulation process
  Evaluation of search results
  Limited involvement of the synthetic chemist

Potential solutions
  Tools that simulate the problem solving process
  User interfaces based on users’ tasks and capabilities
  Simplification of the querying process
  Effective indexing of databases
  Efficient post-search management tools
  Seamless integration of various information sources
  Improved non-structural searches, e.g. hierarchical thesauri for keywords

Most importantly: Recognition of the vast knowledge of synthetic chemists

Page 12

Problems associated with structural searches

Synthetic Problem:

[Figure: reaction scheme for the synthetic problem, with the searched fragment colored]

Full Structure Search: No hits
Reaction Substructure Search (colored fragment): 188 hits!

Data Source: MDL’s combined reaction databases (ca. 950K reactions)

Keyword Search “Michael Addition”: 3338 hits!!

Solution: Indexing of reactions

Page 13

Reaction Classification Based on Reaction Centers (Reaction Type)

RCP program from InfoChem, Munich

The indexing is based on changes occurring at the atoms and bonds involved in the reaction (the reaction center) and in their immediate vicinity (alpha and beta atoms), and is expressed as a hashcode.
It is an important tool to advance the involvement of chemists in the retrieval process.
Its uses include:
  Clustering reactions of the same type
  Post-search management of large hitlists
  Facilitating query formulation (Transformation Searches)
  Linking of reaction information from different sources

MDL licenses InfoChem’s RCP program to classify all reaction databases.

Page 14

Definitions of RCP Classification

0-Sphere (Broad)
  Reaction centers only, similar to a broadly based substructure search
  large-sized cluster or hitlist

1-Sphere (Medium)
  Reaction centers plus alpha atoms, excluding hydrogens
  medium-sized cluster or hitlist

2-Sphere (Narrow)
  Reaction centers plus beta atoms, excluding consecutive sp3-atoms
  small-sized cluster or hitlist

[Figure: example reaction fragments drawn for each classification level]

Number of hits from CIRX97 (70060 rxns) for identical transformation at different classification levels

[Figure: example transformation with classification codes ...655778, ...151297 and ...077692; plot of number of hits (axis ticks 700, 300, 50) against topological specificity (broad, medium, narrow)]
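The three classification levels on this page can be illustrated with a small sketch. It is not InfoChem's RCP algorithm: the toy graph format, the function names `sphere_atoms` and `classification_code`, the SHA-1 hashing, and the reading of "excluding consecutive sp3-atoms" (an sp3 beta atom attached to an sp3 alpha atom is skipped) are all assumptions made for illustration. A real classifier would also encode the bond changes at the reaction center; this sketch hashes only the atom environment.

```python
import hashlib

def sphere_atoms(atoms, bonds, centers, level):
    """Atom ids covered at level 0 (broad), 1 (medium) or 2 (narrow)."""
    neighbors = {a: set() for a in atoms}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    selected = set(centers)                      # 0-sphere: reaction centers only
    if level >= 1:
        # alpha atoms: direct neighbors of the centers, hydrogens excluded
        alpha = {n for c in centers for n in neighbors[c]
                 if atoms[n]["element"] != "H"} - selected
        selected |= alpha
    if level >= 2:
        # beta atoms: neighbors of alpha atoms (hydrogens dropped here too,
        # which is an assumption); an sp3 beta atom on an sp3 alpha atom is
        # skipped, our assumed reading of "excluding consecutive sp3-atoms"
        beta = set()
        for a in alpha:
            for n in neighbors[a]:
                if n in selected or atoms[n]["element"] == "H":
                    continue
                if atoms[a]["hyb"] == "sp3" and atoms[n]["hyb"] == "sp3":
                    continue
                beta.add(n)
        selected |= beta
    return selected

def classification_code(atoms, bonds, centers, level):
    """Hash the selected environment into a short hexadecimal code."""
    keep = sphere_atoms(atoms, bonds, centers, level)
    atom_part = sorted((atoms[a]["element"], atoms[a]["hyb"], a in centers)
                       for a in keep)
    bond_part = sorted(tuple(sorted((atoms[a]["element"], atoms[b]["element"])))
                       for a, b in bonds if a in keep and b in keep)
    text = repr((atom_part, bond_part))
    return hashlib.sha1(text.encode()).hexdigest()[:8]

def atom(element, hyb):
    return {"element": element, "hyb": hyb}

# Three hypothetical environments around the same C=O reaction center (atoms 1, 2).
CENTERS = {1, 2}
r1 = ({1: atom("C", "sp2"), 2: atom("O", "sp2"), 3: atom("C", "sp3"),
       4: atom("C", "sp3"), 5: atom("C", "sp3")},
      [(1, 2), (1, 3), (1, 4), (3, 5)], CENTERS)
r2 = ({1: atom("C", "sp2"), 2: atom("O", "sp2"), 3: atom("C", "sp3"),
       4: atom("C", "sp2"), 5: atom("C", "sp2")},
      [(1, 2), (1, 3), (1, 4), (4, 5)], CENTERS)
r3 = ({1: atom("C", "sp2"), 2: atom("O", "sp2"), 3: atom("C", "sp3"),
       4: atom("C", "sp3"), 5: atom("N", "sp2")},
      [(1, 2), (1, 3), (1, 4), (3, 5)], CENTERS)

for name, rxn in (("r1", r1), ("r2", r2), ("r3", r3)):
    print(name, [classification_code(*rxn, level) for level in (0, 1, 2)])
```

In this toy data all three environments share one code at level 0, r1 and r2 separate at level 1, and r1 and r3 separate only at level 2, which mirrors the shrinking hitlists from broad to narrow described above.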

Page 15

Reaction Classification as Post-Search Management Tool

Classification codes are data
  stored in the database
  usable for sorting (clustering)

RSS-Search Query (in red):

[Figure: reaction substructure search query, with the query fragment drawn in red]

Result: 188 hits

Clustered by Classification Code “MEDIUM”: 90 clusters

1st cluster (21 rxns)
[Figure: representative reaction of the 1st cluster]

2nd cluster (15 rxns)
[Figure: representative reaction of the 2nd cluster]

4th cluster (8 rxns)
[Figure: representative reaction of the 4th cluster]

Result: Large hitlist reduced to 21 relevant reactions in the 1st cluster - elimination of noise
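Because the classification codes are stored with each reaction, clustering a hitlist amounts to grouping the hits by code. The sketch below shows that idea under stated assumptions: the record layout, the reaction ids, and the code values ("M1", "M2", ...) are hypothetical placeholders, not data from MDL's databases.

```python
from collections import defaultdict

def cluster_hitlist(hits, level="medium"):
    """Group hit ids by their stored classification code, largest cluster first."""
    clusters = defaultdict(list)
    for hit in hits:
        clusters[hit["codes"][level]].append(hit["id"])
    # sorted() is stable, so clusters of equal size keep their first-seen order
    return sorted(clusters.items(), key=lambda item: len(item[1]), reverse=True)

# Hypothetical hitlist: each record carries its precomputed codes.
hits = [
    {"id": "rxn-1", "codes": {"broad": "B1", "medium": "M1", "narrow": "N1"}},
    {"id": "rxn-2", "codes": {"broad": "B1", "medium": "M2", "narrow": "N2"}},
    {"id": "rxn-3", "codes": {"broad": "B1", "medium": "M1", "narrow": "N3"}},
    {"id": "rxn-4", "codes": {"broad": "B2", "medium": "M3", "narrow": "N4"}},
    {"id": "rxn-5", "codes": {"broad": "B1", "medium": "M1", "narrow": "N1"}},
    {"id": "rxn-6", "codes": {"broad": "B1", "medium": "M2", "narrow": "N5"}},
]

clusters = cluster_hitlist(hits)
for code, members in clusters:
    print(code, len(members), members)
# M1 3 ['rxn-1', 'rxn-3', 'rxn-5']
# M2 2 ['rxn-2', 'rxn-6']
# M3 1 ['rxn-4']
```

Switching `level` to "broad" or "narrow" gives fewer, larger clusters or more, smaller ones, so the chemist can choose how finely to split a long hitlist.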

Page 16

Reaction Classification as an Effective Querying Tool

Query Form of MDL’s Reaction Browser:

Result of ‘Same Transformation’ Search: 24 hits

Examples:

[Figure: two example reactions from the hitlist, both drawn with “Chiral” labels; reagents shown: DBU, Pyrrolidine]

  Eliminates problem of drawing efficient RSS-queries, easier to understand by end-user chemist
  Calculates classification code of drawn reactions on-the-fly
  Retrieves all reactions of the same reaction type without the noise of normal RSS searches

Result: Chemists do not have to struggle with formulating the most efficient RSS query to find relevant examples
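A 'Same Transformation' search can be pictured as an index lookup: compute the code of the drawn reaction on the fly, then fetch every stored reaction with the same code. The sketch below is an illustration only, not the Reaction Browser's implementation. The change-list format, the function names, and the SHA-1 hashing are assumptions, and the code here covers only the reaction center changes.

```python
import hashlib
from collections import defaultdict

def transformation_code(changes):
    """Order-independent code for a list of (atom, atom, change) bond changes."""
    canonical = sorted((tuple(sorted((a, b))), change) for a, b, change in changes)
    return hashlib.sha1(repr(canonical).encode()).hexdigest()[:8]

def build_index(database):
    """Map each code to the ids of the stored reactions that carry it."""
    index = defaultdict(list)
    for rxn_id, changes in database.items():
        index[transformation_code(changes)].append(rxn_id)
    return index

def same_transformation_search(drawn_changes, index):
    """Code the drawn reaction on the fly and return all reactions of that type."""
    return index.get(transformation_code(drawn_changes), [])

# Hypothetical database: reaction id -> bond changes at the reaction center.
database = {
    "rxn-A": [("C", "C", "formed"), ("C", "C", "double->single")],
    "rxn-B": [("C", "C", "double->single"), ("C", "C", "formed")],
    "rxn-C": [("C", "O", "formed")],
}
index = build_index(database)

query = [("C", "C", "formed"), ("C", "C", "double->single")]
print(same_transformation_search(query, index))
# ['rxn-A', 'rxn-B']
```

The chemist only draws an example reaction; no substructure query has to be tuned by hand, which is the point made in the result line above.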

Page 17

Strategy Meeting in Munich

Where else can reaction classification be used to benefit the synthetic chemist?

Facilitating access to and linking of information sources, of course!

Page 18

Access to Information Sources

Requirements
  Access must simulate chemists’ information gathering process
  Access must be multi-directional, fast and seamless to
    primary sources (journals, laboratory notebooks, etc.)
    secondary sources (databases)
    tertiary sources (major reference works, review articles, etc.)
    other data (catalogues, spectral data, etc.)
  All sources must be interlinked and accessible from any source
  Intra- or internet as the primary medium

Information Triangle: primary sources, secondary sources, tertiary sources

Page 19

The integrated Major Reference Works (iMRW) Project

[Diagram: three linked information sources]
  Reaction Databases (ISIS/Host) (MDL, Third Party, Proprietary, etc.)
  Tertiary Sources (COFGT, EROS, CAC, etc.)
  Primary Journals

Link labels in the diagram:
  Reaction Classification Codes
  LitLink (citations)
  Rxn Class. Codes, citations, structures
  Future links

Page 20

iMRW – Linking of reaction information

The beginnings of a personal electronic library?

Information from “Comprehensive Asymmetric Catalysis” (Springer Verlag)

stereochemistry, mechanism

Information from “Encyclopedia of Reagents for Organic Synthesis” (Wiley & Sons)

Page 21

Computers will never be a substitute for the creativity and experience of synthetic chemists; they will just make them more efficient…

…provided they are being given the right tools

Much has been achieved over the last three decades, but we still have a long way to go!