Unified Markush Search on New STN® - stn-international.com


Agenda

• Overview
• Content
• Tips for structure queries
• Search examples

Markush structures represent the broadest aspect of the invention being claimed

• Markush structures are also referred to as generic structures.

• Markush structures have both fixed and variable portions (A1 and B1 in the structure shown).

• Specific structures have no variability: all atoms and locations are known.

[Figure: specific structures compared with a Markush structure]

Markush databases contain the generic structures from patents

Partial display of MARPAT® 2014:1785534-1-1

The two premier Markush databases, MARPAT and Derwent Markush Resource (DWPIM), are available to search on a single platform

• True Markush databases of generic structures from patents
• Variable parts of the structures may be represented by
‒ Specific atoms or groups of atoms, e.g., fluorine or cyano
‒ Nomenclature terms representing specific rings or functional groups, e.g., morpholino, cyclohexyl
‒ Generic terminology to represent structural classes, e.g., halogen, alkyl, aryl
‒ Terminology describing a function, property or activity, e.g., “protecting group,” “spacer,” or “electron withdrawing group”

New STN® allows simultaneous search with a single query structure

STN search conventions preserved:
• Bond types & values
• Generic nodes
• Attributes
• Match Level
• Element Count Level

Databases searched with the query structure:
• CAS Registry℠: ~ 107 M structures
• DCR (Derwent Chemistry Resource): ~ 2.5 M structures
• MARPAT: ~ 1.1 M structures
• DWPIM: ~ 1.9 M structures

REFX simplifies Cross File Search from MARPAT and DWPIM

• MARPAT and DWPIM are both implemented as structure databases in new STN

• Interactions with CAplusSM and DWPI work similarly to REGISTRY and DCR

• Use REFX to find corresponding patent references from a structure search

• Use Automatic Cross File Search for easy results retrieval

Options for searching are the same as classic STN

• Substructure (SSS) or Closed Substructure (CSS)
• Search scope - FULL
• Precision tools available
‒ Match level
‒ Generic definitions (saturation, ring or chain type, etc.)
‒ Element Count
‒ Element Count level
• Ring/Chain bonds are not allowed in MARPAT, but are allowed in DWPIM

Precision search tools provided for both Markush databases on new STN

• Match level (ATOM, CLASS, ANY)

• Element Count Level (Limited, Unlimited)

[Figure: two query/result pairs]
‒ Match Level CLASS: generic node match for a heterocyclic ring, e.g. Hy, HEA, HEF
‒ Match Level CLASS + Element Count Level Limited: generic node match for a heterocyclic ring, e.g. Hy, HEA, HEF, with a range overlap N=1

New STN structure queries have the same definitions as STN Express

Attributes can be changed using a right-click.

The Cy node has been changed to Class match.

Changes from defaults are noted with an asterisk.

Default Match Level by node type:
• Ring atoms; Cb, Cy, Hy: Atom
• Chain atoms; Ak: Class
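The default match levels and the asterisk convention described on this slide can be sketched as a small lookup. This is only an illustration: the function name `describe_node` and the label format are hypothetical and are not part of STN; only the defaults and the asterisk rule come from the slide.

```python
# Default match levels as listed on the slide.
DEFAULT_MATCH_LEVEL = {
    "ring atom": "Atom",
    "Cb": "Atom",
    "Cy": "Atom",
    "Hy": "Atom",
    "chain atom": "Class",
    "Ak": "Class",
}


def describe_node(node, match_level):
    """Return a label for a node, adding '*' when the match level
    differs from the default (the label format itself is hypothetical)."""
    default = DEFAULT_MATCH_LEVEL[node]
    marker = "" if match_level == default else "*"
    return f"{node}: {match_level}{marker}"


print(describe_node("Cy", "Class"))  # Cy changed from its default of Atom
print(describe_node("Ak", "Class"))  # Ak left at its default of Class
```

A node left at its default is labelled without a marker, so the asterisk alone shows which attributes were edited.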

Structure attributes will be greyed out if they do not affect the query

New warning messages help with query building

Assigned attributes clarify what is being searched

MARPAT coverage

• Over 1.1M Markush structures from patents
• Updated daily
• 63 patent authorities 1988 - present
‒ Russian patents from Jan 2000 - present
‒ Korean patents from 2008 - present
• Selective coverage of:
‒ English language patents from 1984-1987
‒ French and German patents from 1986-1987
‒ Japanese patents from 1987
‒ INPI data (1961-1987)

Improved MARPAT assembled displays in new STN

[Figure: a query shown with its new STN assembled answer and its classic STN assembled answer]

Example: CAplus, REGISTRY, and MARPAT search for ledipasvir analogs

The non-Java editor allows CAS Registry Number® modeling using the Add to editor tool.

The entire query is set to CLASS match.

You can search all three files at the same time

Automatic Cross File Search is ON under General Search settings.

Hit structure information appears in the indexing in CAplus

REGISTRY hit structures can be viewed in the indexing.

Hit Markush information appears in the indexing in CAplus

MARPAT structure hit highlighting indicates the Markush structure information for this record.

Markush structures can be toggled to show or hide.

Click the Registry tab to view structure results

Click the MARPAT tab to view the Assembled hit structures

Click on any structure to zoom

Detailed displays allow you to choose a preferred view

• Full - detailed Markush structures with all associated additional information
‒ Hit G-groups are highlighted

• Assembled Hit+ - assembled hit Markush structures with hit G-group variables, plus all non-hit G-groups in the assembled structure shown at the bottom of the hit list

• Unassembled Hit+ - unassembled hit Markush base structures with complete hit G-groups, plus all non-hit G-groups

Complete and Incomplete MARPAT iterations can be filtered, sorted or separated

An alternative approach if you want to isolate unique MARPAT answers

1. Run query in REGISTRY/CAplus
2. Run query in MARPAT/CAplus
3. NOT out bibliographic answers of first set from second set

Use REFX for each L-number to NOT out bibliographic answers.
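Conceptually, NOT-ing out one answer set from another is a set difference. The sketch below models the idea in Python with made-up placeholder identifiers. It is not STN command syntax, and the function name `unique_markush_answers` is hypothetical.

```python
def unique_markush_answers(registry_refs, marpat_refs):
    """Return bibliographic answers reached only via the MARPAT search.

    Both arguments are sets of bibliographic record identifiers, i.e. what
    REFX would return for the REGISTRY and MARPAT structure answer sets.
    """
    return marpat_refs - registry_refs


# Placeholder identifiers, invented for illustration only.
registry_refs = {"REC-0001", "REC-0002", "REC-0003"}
marpat_refs = {"REC-0002", "REC-0003", "REC-0004", "REC-0005"}

unique = unique_markush_answers(registry_refs, marpat_refs)
print(sorted(unique))
```

Records found by both searches drop out, leaving only those where the Markush structure was the sole route to the patent.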

Unique answer from MARPAT search

Get Substances from CAplus always retrieves substances from REGISTRY - use SUBX to see MARPAT structures

The MARPAT Accession Number ends in 1-2, reflecting the CAplus Markush hit indexing Structure 1, Diagram 2 with the patent location indicated.

Derwent Markush Resource (DWPIM) on new STN

• Markush chemical compounds indexed by Thomson Reuters from basic patents in DWPI℠

• DWPI patents classified in pharmaceutical (B), agrochemical (C), and/or general chemical (E) sections

• Updated with DWPI – 82 times per year
• >1.9 million Markush structures
• 33 patent issuing authorities
• French patent office (INPI) backfile 1961-1998
• US, EP and WO coverage from 1978 onwards

DWPI structure databases on STN

• DWPI: > 3.2 M patents
• DWPIM: > 1.9 M structures
• DCR: > 2.5 M structures

[Figure: DWPI linked to DWPIM and DCR via REFX and SUBX]

Specific chemical structures indexed by Thomson Reuters are available in the Derwent Chemistry Resource (DCR) database.

DWPIM authority coverage

Types of structures indexed

Chemical compound coverage
‒ Non-polymeric organic molecules
‒ Organometallic compounds
‒ Inorganic structures
  • Simple inorganic molecules
  • Extended structures such as clays, zeolites and heteropolyacids
‒ Partially defined structures
‒ Polymeric structures
  • Only for pharmaceutical and agrochemical patents
  • Includes synthetic polymers, polysaccharides, polypeptides, etc.

DWPIM generic nodes

• Markush structures in DWPIM are indexed using a variety of generic nodes (Superatoms)

Acyclic / Cyclic                  Other
CHK (Alkyl, Alkylene)             MX (Any metal)
CHE (Alkenyl, Alkenylene)         A35 (Group III A - V A metal)
CHY (Alkynyl, Alkynylene)         ACT (Actinide)
ARY (Aryl)                        AMX (Alkali(ne) earth metal)
CYC (Cycloaliphatic)              LAN (Lanthanide)
HEA (Monocyclic heteroaryl)       TRM (Transition metal)
HET (Monocyclic nonaromatic)      HAL (Halogen)
HEF (Fused heterocyclic)          XX (Unspecified, except H)

STN variable query nodes retrieve DWPIM generic nodes

STN variable query node (ML = Class)    DWPIM generic nodes retrieved
Hy                                      HET, HEF, HEA
Cb                                      CYC, ARY
Ak                                      CHE, CHY, CHK
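The correspondence on this slide can be written as a lookup table. The sketch below is a simplified model, not STN behaviour: the function name `retrieved_generic_nodes` is hypothetical, and returning an empty set for any match level other than Class is an assumption made for illustration.

```python
# STN variable query node -> DWPIM generic nodes it retrieves (from the slide).
DWPIM_NODES_FOR = {
    "Hy": {"HET", "HEF", "HEA"},
    "Cb": {"CYC", "ARY"},
    "Ak": {"CHE", "CHY", "CHK"},
}


def retrieved_generic_nodes(query_node, match_level):
    """Return the DWPIM generic nodes a query node can retrieve.

    Assumption: generic nodes are only retrieved at match level Class;
    any other match level yields an empty set in this simplified model.
    """
    if match_level.lower() != "class":
        return set()
    return set(DWPIM_NODES_FOR.get(query_node, set()))


print(sorted(retrieved_generic_nodes("Hy", "Class")))
print(sorted(retrieved_generic_nodes("Cb", "Atom")))
```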

DWPIM generic node attributes

Ring type
‒ MON: Monocyclic
‒ FU: Fused

Degree of ring saturation
‒ SAT: Fully saturated
‒ UNS: Unsaturated

Chain length
‒ LO: Low (up to 6 carbons)
‒ MID: Middle (7-10 carbons)
‒ HI: High (11 or more carbons)

Chain type
‒ STR: Straight
‒ BRA: Branched
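The chain length ranges above translate directly into a small classifier. The function name `chain_length_attribute` is hypothetical; only the three codes and their carbon ranges come from the slide.

```python
def chain_length_attribute(carbon_count):
    """Map a carbon count to the DWPIM chain length attribute code."""
    if carbon_count < 1:
        raise ValueError("carbon_count must be at least 1")
    if carbon_count <= 6:
        return "LO"   # Low: up to 6 carbons
    if carbon_count <= 10:
        return "MID"  # Middle: 7-10 carbons
    return "HI"       # High: 11 or more carbons


for n in (4, 8, 12):
    print(n, chain_length_attribute(n))
```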

STN node attributes retrieve DWPIM indexed attributes

STN node attributes, e.g. Ak, and the DWPIM attributes they retrieve (ML = Class in each case)

[Figure: Ak queries set to alkyl (no limitation), alkyl (straight), and alkyl (low), with retrieved DWPIM nodes labelled CHK, CHE, CHY, CHKSTR, and CHKLOW]

Search example

Search query, default settings:
‒ Thiophene: ML = Atom
‒ Carbocycle (Cb): ML = Atom
‒ Alkyl (Ak): ML = Class
‒ Heterocycle (Hy): ML = Atom

[Figure: search query with nodes numbered 1-4; two nodes are annotated "Class"; the Ak node is marked as locked (no further substitution on Ak)]

STN variable query nodes retrieve DWPIM generic nodes

[Figure: STN search query and typical DWPIM assembled hits, with nodes numbered 1-4]

STN query nodes with Match Level Class retrieve corresponding generic and specific nodes in DWPIM.

DWPIM attributes are also accessible, e.g. MON = monocyclic, FUS = Fused.

Prepare structure queries using the structure editor

Click OK to add the query to the structures tab of the history panel.

Cb and Hy nodes have been set to Class match. Changes from defaults are indicated with an asterisk. This has no effect on DCR.

Right-click on a node to change Attributes, e.g. Match Level.

Block substitution with the lock atoms tool.

Search the structure query and review structures

Automatic Cross File Search is set ON.

Assembled structures with hit highlighting.

Click on any structure to enlarge (zoom).

Click on a Markush compound number of interest for detailed display views (next).

DWPIM detailed display – Brief view

Hit fragments are combined to form the assembled structure.

Query relevant G-groups (G2, etc.).

Hit fragments are highlighted.

Unassembled DWPIM Markush base structure.

Detailed display allows you to choose a preferred view

• Brief – unassembled hit Markush base structure with complete hit G-groups related to the query
‒ Hit fragments within hit G-groups are highlighted

• Full – unassembled hit Markush base structure with all G-groups, including those not related to the query
‒ Hit fragments within hit G-groups are highlighted

Crossover with REFX and review hit structures in DWPI

Use the REFX operator to retrieve corresponding DWPI references (L2).

The structure search (L1) is combined with terms for antiviral in DWPI (L2).

Multi-database search example – Diazepam

Click OK to add the query to the structures tab of the history panel.

Note: Since the focus of this example is multi-database searching, default settings are used with a simple Closed Substructure Search (CSS) query.

To handle possible tautomeric issues, unspecified bonds are used here.

Search the structure query and review structures

Query relevant G-group hit fragments are represented as an assembled structure.

Closed Substructure Search (CSS).

Click on any structure to enlarge (zoom).

Crossover using REFX to retrieve corresponding DWPI and CAplus℠ bibliographic records

Use the REFX operator to retrieve corresponding DWPI and CAplus references (L2).

Use Create Term List to identify unique hits

Use Create Term List to identify unique hits (cont.)

L2 = CAplus and DWPI combined search results.

Q111 = patent numbers + kind codes taken from CAplus (L2).

L3 = L2 NOT Q111: 383 patent records found only in DWPI.
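The term list step can be pictured as building a set of "patent number + kind code" strings from one database's records and then filtering the combined results against it. The sketch below uses invented placeholder records and a hypothetical `make_term` helper. It models the logic only and is not the Create Term List feature itself.

```python
def make_term(patent_number, kind_code):
    """Join a patent number and kind code into one normalized search term."""
    return f"{patent_number}{kind_code}".upper().replace(" ", "")


# Placeholder records (patent number, kind code), invented for illustration.
caplus_records = [("XX0000001", "B2"), ("YY0000002", "A1")]
combined_records = [
    ("XX0000001", "B2"),
    ("YY0000002", "A1"),
    ("ZZ0000003", "A1"),
]

# The term list: every patent number + kind code seen in the first database.
term_list = {make_term(number, kind) for number, kind in caplus_records}

# Equivalent of NOT-ing the term list out of the combined answer set.
only_in_other = [
    record for record in combined_records
    if make_term(*record) not in term_list
]
print(only_in_other)
```

Including the kind code in each term means that different publication stages of the same patent number are treated as distinct entries.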

DWPIM resources

• DWPIM Reference Manual (new STN Sign In required) https://www.stn.org/help/stn/en/dwpim_manual.pdf
• Recorded Events http://www.stn-international.com/recorded_events.html

‒ Unified Markush Search on new STN

‒ Derwent Markush Resource (DWPIM) now available on STN

Summary of features for unified Markush searching

• Structure search key features
‒ Single structure
‒ STN structure conventions
• Search and retrieval
‒ Generic node concept
‒ Match level
‒ Variety of attributes
• Display formats for Markush structures
‒ Various formats
‒ Assembled display
‒ Hit structure highlighting
• Integrated environment for search and display
‒ Crossfile search
‒ Structure displays in bibliographic databases

Learn more via the STN help menu

What’s New – lists enhancements from recent releases.

Quick Tour – an overview video of the new platform that enables users to get started easily.

Help – a detailed reference tool to all features and content, including videos, examples, and tutorials.

For more information …

Log in to new STN: www.stn.org

CAS [email protected] Support and Training: www.cas.org

FIZ Karlsruhe [email protected] Support and Training: www.stn-international.de