Chemical Information Retrieval - Wofford...


Source: Chemical Information Retrieval, Wofford College, webs.wofford.edu/whisnantdm/Courses/360/CA2/LES06_SciFinder_ S…

Chemical Information Retrieval Page 1

David Whisnant

Chemical Information Retrieval

CAS & SciFinder

Searching for Substances

In the past two lessons you worked with SciFinder to search for articles from the chemistry

literature. In this exercise we will look for information about substances rather than research

topics.

CAS Registry Numbers

In the Chemical Abstracts databases, substances are uniquely identified by their CAS Registry

Numbers. Registry numbers are much better than names or formulas for identifying a substance,

because they are unique. If you know the Registry Number of a specific compound, you can

search for that number in the CA database and be certain of finding only information about that

particular compound.

The first thing we need to learn is how to find the Registry Number of a substance. There are

over 26,000,000 Registry Numbers for organic and inorganic compounds, so guessing one would

be tough! Luckily, Registry Numbers are so commonly used to identify substances that you will

find them in most chemistry databases. Although you can use SciFinder to search for Registry

Numbers, it is worthwhile to look for them in other databases first. You may not always have

SciFinder available when you need it.

Finding Registry Numbers: Wolfram|Alpha

Let’s find the Registry Number of the compound shown at the

right. We don’t know its name, but can start with its molecular

formula, C4H6N4O3S2.
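A molecular formula is simply a list of element counts, so it can be checked by hand or by a short script before you type it into a database. The following is a minimal Python sketch, not part of any of the databases discussed here. The function names `parse_formula` and `molar_mass` are our own, the atomic weights are rounded values for only the five elements in this formula, and the parser assumes a simple formula with no parentheses or charges.

```python
import re
from collections import Counter

# Rounded atomic weights (g/mol); only the elements needed for this example.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula: str) -> Counter:
    """Count the atoms of each element in a simple formula such as C4H6N4O3S2."""
    counts = Counter()
    # An element symbol is one capital letter plus an optional lowercase letter,
    # followed by an optional count (a missing count means 1).
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def molar_mass(formula: str) -> float:
    """Sum the atomic weights of all atoms in the formula."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in parse_formula(formula).items())


print(dict(parse_formula("C4H6N4O3S2")))
print(f"{molar_mass('C4H6N4O3S2'):.1f} g/mol")
```

Comparing the computed molar mass (about 222.2 g/mol here) with the value a database reports is a quick way to confirm that a search hit really has the formula you entered.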

A convenient source of information about organic compounds is Wolfram|Alpha

http://www.wolframalpha.com/


Enter the molecular formula for the

compound, C4H6N4O3S2, in the query

space.

This finds the compound, acetazolamide.

Its CAS Registry Number is 59-66-5.
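A Registry Number carries a built-in check digit, which makes it possible to catch a typing mistake before you search. The sketch below is a minimal Python illustration and is not part of Wolfram|Alpha or SciFinder; the function name `is_valid_cas_rn` is our own. It assumes the published CAS rule: the final digit equals the weighted sum of the remaining digits (weights 1, 2, 3, ... counting from the right), modulo 10.

```python
import re

# Two to seven digits, a hyphen, two digits, a hyphen, one check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas_rn(rn: str) -> bool:
    """Return True if rn has the CAS format and its check digit matches."""
    match = CAS_PATTERN.match(rn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(w * int(d) for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


for rn in ["59-66-5", "22839-47-0", "865-21-4", "59-66-4"]:
    print(rn, is_valid_cas_rn(rn))
```

For 59-66-5 the weighted sum is 6*1 + 6*2 + 9*3 + 5*4 = 65, and 65 mod 10 = 5, which matches the final digit. The deliberately mistyped 59-66-4 fails the check. A passing check digit only shows that the number is well formed; it does not prove that the number has been assigned to a substance.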

Finding Registry Numbers: Sigma-Aldrich Database

Sigma-Aldrich is a chemical company with a significant database of compounds. Its web site is

another good place to find Registry Numbers.

http://www.sigmaaldrich.com

Enter the molecular formula in the

Search box.


In the information about the

compound, click on Properties. The

CAS Registry Number will be one of

the properties displayed.

SciFinder: Structure Search

In this problem, you are a research scientist in a

laboratory who is exploring the properties of

chemicals that are potential sweeteners. By

analogy with other compounds, you think that a

chemical with the structure at the right is a

potential sweetener. You want to search the

chemical literature to see

(1) if this compound has been prepared before;

(2) if it is a potential sweetener; and

(3) if there are English-language descriptions of nonpatented methods for its preparation.

For the sake of this exercise, suppose that you have looked at the Wolfram|Alpha and Sigma-Aldrich

databases and have not found the compound there. In this case, you will need to go to

SciFinder.

In SciFinder, click on Explore Substances at the

top. You should see the Chemical Structure

Search option displayed.

Click on “Click to Edit”


Draw the structure on the editing window.

Select Exact search in the lower right corner of the window. Then click on OK.

In the next window that appears, click on Search.


The search returns many substances. Sorted by

relevance, we see that many of them are optical

isomers of each other.

The L,L-isomer with Registry Number

22839-47-0 seems to be important. It is the

second substance in the list and a component of

the first.

Click on Experimental

Properties for this compound.

We see that the compound

is a sweetener.

Click on the check mark in

the Preparation row and

Nonpatents column to see

a list of references

involving the synthesis of

this compound.

You can refine the list of preparations to limit them to English-language publications.


Multicomponent Substances

When we were searching for this compound, you may have noticed

that some substances you found had our compound as a component.

About 10% of the CAS substances database consists of multicomponent substances.1 The above

substances (5910-52-1 and 106372-55-8) are two of them.

It is interesting that SciFinder lists salts as multicomponent systems. If for

example you search for sodium sulfate, you will find that the compound is

represented as the sodium salt of sulfuric acid.

1 D. R. Ridley. “Introduction to Structure Searching with SciFinder Scholar.” American Chemical Society.


Starting With a Compound Name

Dr. Bass has pointed out an interesting compound named "vinblastin," which is obtained from a

natural product.

From what plant is vinblastin extracted?

For what purpose(s) is vinblastin used in medicine?

In the “Explore

Substances” section,

select Substance

Identifier.

Enter “vinblastin” in

the text box.

SciFinder returns only one compound, with Registry Number

865-21-4. Its name is vincaleukoblastine.

Note the icons that send you to a list of references, reactions,

etc.


We are interested in locating references about how the compound

is obtained and used.

Select the compound by checking the box by its name. Click on

Get References.

In the window that appears, select

Preparation and then Get.

Use the list of references to identify the

plant from which vinblastin is isolated.

We also want to learn about how vinblastin is used in medicine.

Click on substances in the

breadcrumb trail to return to

the substance itself.

Click on Get References again. This time, select Uses in the “Limit results to” list. Read the

abstracts of the first dozen or so references. What is vinblastin used for in medicine?


Vinblastin appears to be biologically active. We can learn more in the

Substance Detail section. Click on this link.

In the “Substance Detail” section, you will find a

link for Bioactivity Indicators.

Which one of the bioactivity indicators for

vinblastin has a much larger number of references

than the others? Does this agree with the

application of vinblastin to medicine that you

found earlier?

Starting With a Registry Number

In the first few pages of this lesson, we searched Wolfram|Alpha

and Sigma-Aldrich for the CAS Registry Number of the

compound on the right. We found it to be 59-66-5.

Let’s suppose we need the C-13 NMR spectrum of this compound. We can use this Registry

Number and SciFinder to find the spectrum.

Go to Explore Substances and enter the Registry

Number as the Substance Identifier.


Click on the Spectra link under the structure and name of the

compound.

Three different sources of the C-13 NMR are available.

Refining by Structure

Suppose you are interested in the cyclohexylmethylium ion.

One way of finding this ion is to search for it

by molecular formula, C7H13.

This search returns over 120 structures – way too many to look through yourself. We need to

refine the search.


Under the Refine tab, click on the Chemical Structure image.

Draw the structure, select Exact search, and then click OK.

You will see the structure in the “Refine” section. Click on Refine at the

bottom of the section.

This narrows the list down to

around 11 substances, which

you can look through to find

the ion in which you are

interested.


Using Greek Letters

When you are exploring research topics or substances, you may have occasion to enter a Greek

letter. For example:

β-alanine

α-alkylation

π-bond

You can express a Greek letter by spelling out the name of the letter and surrounding it with

periods.

.beta.-alanine

.alpha.-alkylation

.pi.-bond
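If you prepare search terms in a script or spreadsheet, the substitution above is easy to automate. The following Python sketch is one possible way to do it; the function name `spell_out_greek` is our own, and the table covers only lowercase Greek letters, which is all the examples above require.

```python
# Lowercase Greek letters mapped to their spelled-out names.
GREEK_NAMES = {
    "α": "alpha", "β": "beta", "γ": "gamma", "δ": "delta", "ε": "epsilon",
    "ζ": "zeta", "η": "eta", "θ": "theta", "ι": "iota", "κ": "kappa",
    "λ": "lambda", "μ": "mu", "ν": "nu", "ξ": "xi", "ο": "omicron",
    "π": "pi", "ρ": "rho", "σ": "sigma", "τ": "tau", "υ": "upsilon",
    "φ": "phi", "χ": "chi", "ψ": "psi", "ω": "omega",
}


def spell_out_greek(term: str) -> str:
    """Replace each lowercase Greek letter with its name surrounded by periods."""
    return "".join(
        f".{GREEK_NAMES[ch]}." if ch in GREEK_NAMES else ch for ch in term
    )


for term in ["β-alanine", "α-alkylation", "π-bond"]:
    print(spell_out_greek(term))
```

Characters that are not in the table, including ordinary Latin letters and hyphens, pass through unchanged.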

Searching for Salts

As mentioned in the Multicomponent Substances section, the CAS databases express salts

rather archaically as multicomponent systems. For example,

the formula of sodium sulfate in the CAS databases is

H2O4S.2Na.

The best way to search for a salt is to find its Registry

Number in another database and search for the Registry

Number.
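To see how a multicomponent formula such as H2O4S.2Na breaks down, it can help to split it at the periods. The Python sketch below is only an illustration; the function name `split_components` is our own, and it assumes that each component is a simple formula with an optional whole-number multiplier in front. It does not handle other multiplier styles that may appear in the CAS databases.

```python
import re


def split_components(cas_formula: str) -> list:
    """Split a dot-separated formula into (multiplier, formula) pairs."""
    components = []
    for part in cas_formula.strip().split("."):
        # Optional leading integer, then a formula starting with a capital letter.
        match = re.fullmatch(r"(\d*)([A-Z].*)", part.strip())
        if match is None:
            raise ValueError(f"unrecognized component: {part!r}")
        multiplier = int(match.group(1)) if match.group(1) else 1
        components.append((multiplier, match.group(2)))
    return components


print(split_components("H2O4S.2Na"))  # [(1, 'H2O4S'), (2, 'Na')]
```

The result shows one unit of H2O4S (sulfuric acid) together with two sodium atoms, which is why typing Na2SO4 as a formula is less reliable than searching by Registry Number.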